Commit: 8 changed files with 280 additions and 288 deletions.
@@ -1 +1,2 @@
 arch/
+.DS_Store
# 04-cloud-deployment
## Overview

In this lesson, we will deploy our weather data pipeline to Google Cloud Platform (GCP) using Google Kubernetes Engine (GKE). We'll cover setting up a GKE cluster, building and pushing Docker images to Artifact Registry, and deploying our application components using Kubernetes.

By the end of this section, you will have:

1. Set up a GKE cluster
2. Built and pushed Docker images for our data pipeline and Flask app
3. Deployed PostgreSQL, our data pipeline jobs, and the Flask app to Kubernetes
4. Learned how to monitor and debug your deployment

This deployment process demonstrates how to take a locally developed data pipeline and deploy it to a cloud environment, showcasing the scalability and flexibility of containerized applications.
## Prerequisites

Before starting this lesson, please ensure that you have:

1. Completed the [05-cloud-deployment](../05-cloud-deployment/README.md) lesson
2. Google Cloud SDK installed
3. kubectl installed
4. Docker installed
5. A Google Cloud Platform account with billing enabled
## Lesson Content

### 4.1 Setup and GKE Cluster Creation

1. List and set your GCP project:

```bash
gcloud projects list
gcloud config set project <insert_name_of_project>
export PROJECT_ID=$(gcloud config get-value project)
echo $PROJECT_ID
```

2. Create a GKE cluster:

```bash
gcloud container clusters create weather-cluster --num-nodes=2 --zone=us-central1-a --quiet > /dev/null 2>&1 &
```

To check the status of the cluster deployment:

```bash
gcloud container clusters describe weather-cluster --zone=us-central1-a
```
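Because the create command is backgrounded and silenced, the shell prompt returns before the cluster exists. If you want a script to block until the cluster is up, a small polling helper (a sketch, not part of the lesson's files) can wrap the describe command:

```shell
#!/bin/sh
# wait_for TIMEOUT INTERVAL CMD...: rerun CMD every INTERVAL seconds
# until it succeeds; give up (return 1) after TIMEOUT seconds.
wait_for() {
  timeout=$1 interval=$2
  shift 2
  elapsed=0
  until "$@"; do
    elapsed=$((elapsed + interval))
    [ "$elapsed" -ge "$timeout" ] && return 1
    sleep "$interval"
  done
}

# Example (hypothetical usage): block until the cluster reports RUNNING.
# wait_for 1800 30 sh -c \
#   "gcloud container clusters describe weather-cluster --zone=us-central1-a \
#      --format='value(status)' 2>/dev/null | grep -q RUNNING"
```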
### 4.2 Create Container Repository on Artifact Registry

```bash
gcloud artifacts repositories create my-docker-repo --project=$PROJECT_ID --location=us --repository-format=docker
```

If the `docker push` commands in the next step fail with an authentication error, configure Docker to use your gcloud credentials for the registry host with `gcloud auth configure-docker us-docker.pkg.dev`.
### 4.3 Build and Push Docker Images

```bash
# Navigate to the data-pipeline directory
cd data-pipeline

# Build and push data pipeline images
docker build --target extract -t us-docker.pkg.dev/${PROJECT_ID}/my-docker-repo/data-pipeline-extract:latest .
docker build --target load -t us-docker.pkg.dev/${PROJECT_ID}/my-docker-repo/data-pipeline-load:latest .
docker build --target transform -t us-docker.pkg.dev/${PROJECT_ID}/my-docker-repo/data-pipeline-transform:latest .

docker push us-docker.pkg.dev/${PROJECT_ID}/my-docker-repo/data-pipeline-extract:latest
docker push us-docker.pkg.dev/${PROJECT_ID}/my-docker-repo/data-pipeline-load:latest
docker push us-docker.pkg.dev/${PROJECT_ID}/my-docker-repo/data-pipeline-transform:latest

# Navigate to the flask-app directory
cd ../flask-app

# Build and push Flask app image
docker build -t us-docker.pkg.dev/${PROJECT_ID}/my-docker-repo/flask-app:latest .
docker push us-docker.pkg.dev/${PROJECT_ID}/my-docker-repo/flask-app:latest
```
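All of these references follow Artifact Registry's `HOST/PROJECT_ID/REPOSITORY/IMAGE:TAG` naming scheme. A tiny helper (purely illustrative, not part of the lesson code) makes the pattern explicit:

```shell
#!/bin/sh
# Compose an Artifact Registry image reference for this lesson's repo.
image_ref() {
  project=$1 image=$2 tag=${3:-latest}
  printf 'us-docker.pkg.dev/%s/my-docker-repo/%s:%s\n' "$project" "$image" "$tag"
}

image_ref my-project data-pipeline-extract
# -> us-docker.pkg.dev/my-project/my-docker-repo/data-pipeline-extract:latest
```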
### 4.4 Kubernetes Deployment

1. Get cluster credentials:

```bash
gcloud container clusters get-credentials weather-cluster --zone=us-central1-a
```

2. Create a Kubernetes secret for database credentials:

```bash
kubectl create secret generic db-credentials \
  --from-literal=DB_NAME=your_db_name \
  --from-literal=DB_USER=your_db_user \
  --from-literal=DB_PASSWORD=your_db_password \
  --from-literal=DB_HOST=postgres \
  --from-literal=DB_PORT=5432
```
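The manifests decide how these values reach the containers. A common pattern, shown here as a hedged sketch rather than the lesson's actual manifest, is to inject the whole secret as environment variables with `envFrom`:

```yaml
# Hypothetical pod-template excerpt; names mirror the secret created above.
spec:
  containers:
    - name: flask-app
      image: us-docker.pkg.dev/PROJECT_ID/my-docker-repo/flask-app:latest
      envFrom:
        - secretRef:
            name: db-credentials  # exposes DB_NAME, DB_USER, DB_PASSWORD, DB_HOST, DB_PORT
```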
3. Deploy PostgreSQL:

```bash
cd ../gcp-deployment/k8s-artifacts
envsubst < postgres-deployment.yaml | kubectl apply -f -
envsubst < postgres-service.yaml | kubectl apply -f -
```
4. Wait for PostgreSQL to be ready:

```bash
kubectl wait --for=condition=ready pod -l app=postgres --timeout=300s
```

5. Deploy the data pipeline job:

```bash
envsubst < data-pipeline-job.yaml | kubectl apply -f -
kubectl create job --from=cronjob/data-pipeline-sequence data-pipeline-sequence
kubectl wait --for=condition=complete job/data-pipeline-sequence --timeout=600s
```

The `--from=cronjob/...` flag triggers a one-off run of the cronjob immediately instead of waiting for its next scheduled execution.

6. Deploy the Flask app:

```bash
envsubst < flask-app-deployment.yaml | kubectl apply -f -
envsubst < flask-app-service.yaml | kubectl apply -f -
```
### 4.5 Monitoring and Debugging

Here are some useful commands for monitoring and debugging your deployment:

1. List all pods:

```bash
kubectl get pods
```

2. View logs for all containers in a pod:

```bash
kubectl logs <pod-name> --all-containers=true
```

3. Describe a pod:

```bash
kubectl describe pod <pod-name>
```

4. Port forward to access services locally:

```bash
kubectl port-forward service/flask-app 8080:80
```

While the port-forward is running, the app is reachable at http://localhost:8080.

5. View cluster events:

```bash
kubectl get events --sort-by=.metadata.creationTimestamp
```
## Conclusion

In this lesson, you learned how to deploy your weather data pipeline to Google Cloud Platform using Google Kubernetes Engine. You created a GKE cluster, built and pushed Docker images to Artifact Registry, and deployed your application components using Kubernetes.

This deployment process demonstrates how to take a locally developed data pipeline and scale it in a cloud environment. The containerized approach ensures consistency across different environments and simplifies the deployment process.

## Key Points

- GKE provides a managed Kubernetes environment, simplifying cluster setup and management
- Building and pushing Docker images to Artifact Registry enables easy deployment to GKE
- Kubernetes secrets provide a secure way to manage sensitive information like database credentials
- Kubernetes jobs and cronjobs allow for scheduled and one-time execution of tasks
- Monitoring and debugging tools in Kubernetes help manage and troubleshoot deployments

## Further Reading

- [Google Kubernetes Engine documentation](https://cloud.google.com/kubernetes-engine/docs)
- [Kubernetes documentation](https://kubernetes.io/docs/home/)
- [Docker documentation](https://docs.docker.com/)
- [Google Container Registry documentation](https://cloud.google.com/container-registry/docs)
- [Kubernetes best practices](https://kubernetes.io/docs/concepts/configuration/overview/)
# 05-cloud-cleanup
## Overview

In this lesson, we will clean up the resources we created in Google Cloud Platform (GCP) during our weather data pipeline deployment. Proper cleanup is essential to avoid unnecessary costs and to keep your cloud environment tidy. We'll cover deleting Kubernetes resources, the GKE cluster, container images, and other associated resources.

By the end of this section, you will have:

1. Deleted all Kubernetes resources created during the deployment
2. Removed the GKE cluster
3. Cleaned up container images from Artifact Registry
4. Removed any other associated GCP resources

This cleanup process demonstrates responsible cloud resource management and helps you avoid unexpected charges on your GCP account.
## Prerequisites

Before starting this lesson, please ensure that you have:

1. Completed the [04-cloud-deployment](../04-cloud-deployment/README.md) lesson
2. Google Cloud SDK installed and configured
3. kubectl installed and configured to work with your GKE cluster
4. Access to the Google Cloud Console
## Lesson Content

### 5.1 Delete Kubernetes Resources

First, we'll remove all the Kubernetes resources we created:

```bash
kubectl delete cronjob data-pipeline-sequence
kubectl delete job,deployment,service --all
kubectl delete secret db-credentials
```

These commands delete the cronjob, all jobs, deployments, and services, and the database credentials secret we created.
### 5.2 Delete GKE Cluster

Now, let's delete the GKE cluster:

```bash
gcloud container clusters delete weather-cluster --zone=us-central1-a --quiet > /dev/null 2>&1 &
```

To check the status of the cluster deletion:

```bash
gcloud container clusters describe weather-cluster --zone=us-central1-a
```

If the cluster has been successfully deleted, this command should return an error indicating that the cluster doesn't exist.
### 5.3 Delete Container Images

Clean up the container images you pushed to Artifact Registry:

```bash
# List images
gcloud container images list

# Delete images (repeat for each image)
gcloud container images list-tags us-docker.pkg.dev/${PROJECT_ID}/my-docker-repo/IMAGE_NAME --format='get(digest)' | xargs -I {} gcloud container images delete us-docker.pkg.dev/${PROJECT_ID}/my-docker-repo/IMAGE_NAME@{} --force-delete-tags --quiet
```

If `$PROJECT_ID` is no longer set in your shell, re-export it first (`export PROJECT_ID=$(gcloud config get-value project)`). Replace `IMAGE_NAME` with each image name in turn: data-pipeline-extract, data-pipeline-load, data-pipeline-transform, flask-app.
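The delete pipeline works line by line: `list-tags` prints one digest per line, and `xargs -I {}` substitutes each line into a separate `delete` invocation. The mechanics, with `echo` standing in for the gcloud command:

```shell
# Each input line replaces {} once per invocation of the command template.
printf 'sha256:aaa\nsha256:bbb\n' | xargs -I {} echo "delete IMAGE@{}"
# -> delete IMAGE@sha256:aaa
# -> delete IMAGE@sha256:bbb
```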
### 5.4 Clean Up Other Resources

Check for and delete any persistent disks that might have been created:

```bash
# List disks
gcloud compute disks list

# Delete disks if any exist
gcloud compute disks delete DISK_NAME --zone=ZONE
```

Replace `DISK_NAME` and `ZONE` with the appropriate values if any disks are listed.
### 5.5 Final Verification

After running all the cleanup commands, it's good practice to double-check the Google Cloud Console to ensure all resources have been removed. Pay special attention to:

1. Kubernetes Engine
2. Artifact Registry
3. Compute Engine (for any lingering disks or instances)
4. VPC Network (for any created firewall rules or IP addresses)
## Conclusion

In this lesson, you learned how to properly clean up the resources created during the deployment of your weather data pipeline on Google Cloud Platform. This included deleting Kubernetes resources, removing the GKE cluster, cleaning up container images, and verifying that all associated resources were deleted.

Proper cleanup is crucial in cloud environments to avoid unnecessary costs and maintain a well-organized infrastructure. The steps you've learned here apply to other projects and deployments, ensuring you always leave your cloud environment in a clean state after completing your work.

## Key Points

- Always clean up cloud resources when they're no longer needed to avoid unnecessary costs
- Kubernetes resources should be deleted before deleting the cluster
- GKE cluster deletion may take some time; always verify its status
- Container images in Artifact Registry should be cleaned up to save storage costs
- Double-check the Google Cloud Console to ensure all resources are properly removed

## Further Reading

- [Google Kubernetes Engine: Deleting a cluster](https://cloud.google.com/kubernetes-engine/docs/how-to/deleting-a-cluster)
- [Cleaning up Container Registry images](https://cloud.google.com/container-registry/docs/managing#deleting_images)
- [Google Cloud resource clean-up best practices](https://cloud.google.com/blog/products/management-tools/google-cloud-resource-clean-up-best-practices)
- [Kubernetes resource management](https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/)
- [GCP billing and cost management](https://cloud.google.com/billing/docs)
File renamed without changes.
This file was deleted (binary file not shown).
gcp-deployment/weather-data-pipeline-deployment-guide.md: deleted (0 additions, 243 deletions).