Added additional guides

CLASS · Sep 1, 2024 · 8843e56
1 parent 3d99330
Showing 5 changed files with 198 additions and 1 deletion.
diff --git a/.DS_Store b/.DS_Store
diff --git a/.gitignore b/.gitignore
@@ -1,2 +1,3 @@
 arch/
-.DS_Store
+.DS_Store
+gcp-deployment/k8s-output.txt
diff --git a/08-supplementary-guides/kubernetes-guide.md b/08-supplementary-guides/kubernetes-guide.md
@@ -0,0 +1,94 @@
+# Kubernetes Guide for Docker Users
+
+This guide is designed to help developers familiar with Docker and Docker Compose transition to Kubernetes. It provides tips on creating Kubernetes YAML files, outlines the standard structure of these files, and highlights important considerations when working with Kubernetes.
+
+## Getting Started with Kubernetes YAML Files
+
+### What Files Are Needed?
+
+When transitioning from Docker Compose to Kubernetes, you'll typically need to create several YAML files to define your application's resources. Based on the structure in the `k8s-artifacts` directory, here are the common file types you might need:
+
+1. Deployment YAML (e.g., `postgres-deployment.yaml`, `flask-app-deployment.yaml`)
+2. Service YAML (e.g., `postgres-service.yaml`, `flask-app-service.yaml`)
+3. Job or CronJob YAML (e.g., `data-pipeline-job.yaml`)
+
+You may also need additional files depending on your application's requirements, such as:
+
+4. ConfigMap YAML (for configuration data)
+5. Secret YAML (for sensitive data)
+6. PersistentVolume and PersistentVolumeClaim YAML (for persistent storage)
+
+### Standard Structure of Kubernetes YAML Files
+
+Most Kubernetes YAML files follow a similar structure:
+
+```yaml
+apiVersion: <API version>
+kind: <Resource type>
+metadata:
+  name: <Resource name>
+  labels:
+    <key>: <value>
+spec:
+  <Resource-specific configuration>
+```
+
+Key components:
+- `apiVersion`: Specifies the Kubernetes API version being used
+- `kind`: Defines the type of resource (e.g., Deployment, Service, Job)
+- `metadata`: Contains information about the resource, including its name and labels
+- `spec`: Describes the desired state of the resource
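+
+As a concrete illustration of this structure, here is a minimal sketch of a Service manifest. The names `example-service` and `example-app` and the port numbers are placeholders chosen for illustration, not values from this project:
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: example-service      # hypothetical resource name
+  labels:
+    app: example-app         # hypothetical label
+spec:
+  selector:
+    app: example-app         # routes traffic to pods carrying this label
+  ports:
+    - port: 80               # port the Service exposes
+      targetPort: 8080       # port the container listens on
+```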
+
+## Tips for Creating Kubernetes YAML Files
+
+1. **Use a Consistent Naming Convention**: Name your resources consistently. For example, use the same prefix for related resources (e.g., `postgres-deployment`, `postgres-service`).
+
+2. **Leverage Labels and Selectors**: Use labels to organize your resources and selectors to create relationships between them. This is crucial for services to find the correct pods.
+
+3. **Define Resource Requests and Limits**: Specify CPU and memory requests and limits for your containers. The scheduler uses requests to decide which node a pod fits on, and limits cap what a running container may consume.
+
+4. **Use Environment Variables**: Store configuration in environment variables, either directly in the YAML or by referencing ConfigMaps and Secrets.
+
+5. **Create Separate Files for Different Resources**: Unlike Docker Compose, it's common in Kubernetes to have separate YAML files for different resources. This improves readability and maintainability.
+
+6. **Use Multi-Document YAML Files**: You can define multiple resources in a single file by separating them with `---`. This can be useful for closely related resources.
+
+7. **Utilize Kubernetes Secrets**: For sensitive information like database credentials, use Kubernetes Secrets instead of hardcoding values in your YAML files.
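+
+Several of these tips can be combined in one manifest. The following is a sketch only: the name `example-app`, the image, the Secret `example-credentials`, its key `password`, and the resource figures are all hypothetical and should be replaced with values that suit your application:
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: example-app
+  labels:
+    app: example-app
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: example-app               # must match the pod template labels below
+  template:
+    metadata:
+      labels:
+        app: example-app
+    spec:
+      containers:
+        - name: example-app
+          image: example/image:1.0   # placeholder image
+          resources:
+            requests:                # used by the scheduler for placement
+              cpu: 100m
+              memory: 128Mi
+            limits:                  # upper bound for the running container
+              cpu: 500m
+              memory: 256Mi
+          env:
+            - name: DB_PASSWORD
+              valueFrom:
+                secretKeyRef:
+                  name: example-credentials   # hypothetical Secret
+                  key: password               # hypothetical key in that Secret
+```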
+
+## Considerations When Working with Kubernetes
+
+1. **Stateful vs. Stateless Applications**: Kubernetes handles stateless applications differently from stateful ones. For stateful applications like databases, consider using StatefulSets instead of Deployments.
+
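+2. **Networking**: Understand how Kubernetes networking works, especially the differences between the Service types: ClusterIP is reachable only inside the cluster, NodePort opens a port on every node, and LoadBalancer asks the cloud provider for an external load balancer.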
+
+3. **Persistent Storage**: If your application needs persistent storage, learn about PersistentVolumes and PersistentVolumeClaims.
+
+4. **Health Checks**: Implement readiness and liveness probes to help Kubernetes manage your application's lifecycle effectively.
+
+5. **Rolling Updates**: Deployments replace pods incrementally during an update. Combined with readiness probes, this allows zero-downtime deployments.
+
+6. **Resource Management**: Be mindful of resource requests and limits to ensure efficient use of cluster resources.
+
+7. **Monitoring and Logging**: Set up proper monitoring and logging for your Kubernetes cluster and applications.
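+
+To make the health check and storage points more concrete, here is a hedged sketch. The first document is a fragment of a container entry (it belongs under `spec.template.spec.containers`), and the second is a standalone PersistentVolumeClaim. The `/healthz` path, the names, the timings, and the storage size are assumptions for illustration:
+
+```yaml
+# Fragment: one entry in a pod's containers list
+- name: example-app
+  image: example/image:1.0     # placeholder image
+  ports:
+    - containerPort: 5000
+  readinessProbe:              # pod receives traffic only while this succeeds
+    httpGet:
+      path: /healthz           # hypothetical health endpoint
+      port: 5000
+    initialDelaySeconds: 5
+    periodSeconds: 10
+  livenessProbe:               # container is restarted if this keeps failing
+    httpGet:
+      path: /healthz
+      port: 5000
+    initialDelaySeconds: 15
+    periodSeconds: 20
+---
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: example-data           # hypothetical claim name
+spec:
+  accessModes:
+    - ReadWriteOnce
+  resources:
+    requests:
+      storage: 1Gi
+```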
+
+## Differences from Docker Compose
+
+When transitioning from Docker Compose to Kubernetes, keep in mind:
+
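+1. **Service Discovery**: Kubernetes uses cluster DNS for service discovery. Pods reach each other through a Service name, much as Compose services resolve each other by service name on a shared network (or through legacy `links`).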
+
+2. **Volume Management**: Kubernetes has a more complex but powerful system for managing persistent storage.
+
+3. **Environment Variables**: While you can still use environment variables, Kubernetes offers ConfigMaps and Secrets for more flexible configuration management.
+
+4. **Scaling**: Kubernetes allows for more granular and dynamic scaling compared to Docker Compose's simple `scale` directive.
+
+5. **Networking**: Kubernetes networking is more complex but also more powerful, allowing for advanced features like network policies.
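+
+As a small example of DNS-based service discovery, a container can be pointed at another component by Service name. This fragment assumes a Service named `postgres-service` in the same namespace. The variable names `DB_HOST` and `DB_PORT` are hypothetical and depend on what your application reads:
+
+```yaml
+# Fragment: env section of a container spec
+env:
+  - name: DB_HOST
+    value: postgres-service   # resolved by cluster DNS within the namespace
+  - name: DB_PORT
+    value: "5432"             # env values must be strings, hence the quotes
+```
+
+From another namespace, the same Service would be addressed as `postgres-service.<namespace>.svc.cluster.local`, assuming the default cluster domain.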
+
+For more detailed information on the specific Kubernetes resources used in this project, refer to the [Kubernetes Deployment README](../gcp-deployment/README.md).
+
+## Conclusion
+
+Transitioning from Docker and Docker Compose to Kubernetes involves a learning curve, but it offers powerful features for deploying, scaling, and managing containerized applications. By understanding the structure of Kubernetes YAML files and following best practices, you can effectively leverage Kubernetes for your applications.
+
+Remember to consult the official Kubernetes documentation and use tools like `kubectl explain` to learn more about specific resource types and their configurations.
diff --git a/...oughts/production-workflow-explanation.md → ...guides/production-workflow-explanation.md b/...oughts/production-workflow-explanation.md → ...guides/production-workflow-explanation.md
diff --git a/gcp-deployment/README.md b/gcp-deployment/README.md
@@ -0,0 +1,102 @@
+# Kubernetes Deployment for Weather Data Pipeline
+
+## Overview
+
+This directory contains Kubernetes (k8s) YAML files for deploying the weather data pipeline and Flask application to a Kubernetes cluster. The deployment consists of the following components:
+
+1. PostgreSQL database
+2. Data pipeline job (Extract, Load, Transform)
+3. Flask web application
+
+The YAML files define the necessary Kubernetes resources to run these components in a scalable and manageable way.
+
+## Code Explanations
+
+### postgres-deployment.yaml
+This file defines a Deployment for the PostgreSQL database. 
+
+Key components:
+- `apiVersion` and `kind`: Specifies this is a Deployment resource
+- `metadata`: Names the deployment
+- `spec`: Defines the desired state of the deployment
+  - `replicas`: Sets the number of pod replicas
+  - `selector`: Determines which pods are managed by this deployment
+  - `template`: Defines the pod template
+    - `containers`: Specifies the container(s) to run in each pod
+      - Uses the `postgres:13` image
+      - Sets up environment variables from a Secret named `db-credentials`
+      - Exposes port 5432
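+
+The file itself is not reproduced in this README, so the following is only a sketch of a manifest that matches the description above. The resource name, the `app: postgres` label, the replica count, and the use of `envFrom` are assumptions. It also assumes the keys in `db-credentials` are named after the variables the image expects (for example `POSTGRES_USER` and `POSTGRES_PASSWORD`):
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: postgres                 # assumed name
+spec:
+  replicas: 1                    # assumed; the description gives no count
+  selector:
+    matchLabels:
+      app: postgres              # assumed label
+  template:
+    metadata:
+      labels:
+        app: postgres
+    spec:
+      containers:
+        - name: postgres
+          image: postgres:13
+          envFrom:
+            - secretRef:
+                name: db-credentials   # each Secret key becomes an env var
+          ports:
+            - containerPort: 5432
+```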
+
+### data-pipeline-job.yaml
+This file defines a CronJob for the data pipeline. 
+
+Key components:
+- `apiVersion` and `kind`: Specifies this is a CronJob resource
+- `metadata`: Names the job
+- `spec`: Defines the job's schedule and template
+  - `schedule`: Sets when the job should run (currently set to never run automatically)
+  - `jobTemplate`: Defines the job to be run
+    - `spec.template.spec`: Specifies the pod template for the job
+      - `volumes`: Defines a shared volume for data exchange
+      - `initContainers`: Specifies containers to run before the main container
+      - `containers`: Defines the main container to run
+        - Uses images from a Google Cloud Container Registry
+        - Sets environment variables from the `db-credentials` Secret
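+
+A rough sketch of a CronJob with this shape is shown below. It is not the project's actual file: the container names, the image paths (with `PROJECT_ID` as a placeholder), the mount path, and the `emptyDir` volume are assumptions. The description does not say how automatic runs are disabled. This sketch uses `suspend: true`, whereas the real file may instead use a schedule expression that never matches:
+
+```yaml
+apiVersion: batch/v1
+kind: CronJob
+metadata:
+  name: data-pipeline                  # assumed name
+spec:
+  schedule: "0 0 1 1 *"                # required field; ignored while suspended
+  suspend: true                        # assumption: one way to prevent automatic runs
+  jobTemplate:
+    spec:
+      template:
+        spec:
+          restartPolicy: Never         # Job pods need Never or OnFailure
+          volumes:
+            - name: shared-data
+              emptyDir: {}             # scratch space shared by all containers
+          initContainers:
+            - name: extract            # hypothetical step name
+              image: gcr.io/PROJECT_ID/extract:latest
+              volumeMounts:
+                - name: shared-data
+                  mountPath: /data
+          containers:
+            - name: transform          # hypothetical step name
+              image: gcr.io/PROJECT_ID/transform:latest
+              envFrom:
+                - secretRef:
+                    name: db-credentials
+              volumeMounts:
+                - name: shared-data
+                  mountPath: /data
+```
+
+With a CronJob defined this way, a one-off run can be started manually with `kubectl create job manual-run --from=cronjob/data-pipeline`, where `manual-run` is an arbitrary job name.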
+
+### flask-app-deployment.yaml
+This file defines a Deployment for the Flask web application. 
+
+Key components:
+- Similar structure to postgres-deployment.yaml
+- `spec.replicas`: Specifies 2 replicas for high availability
+- `spec.template.spec.containers`: 
+  - Uses an image from a Google Cloud Container Registry
+  - Sets environment variables from the `db-credentials` Secret
+  - Exposes port 5000
+
+
+### postgres-service.yaml
+This file defines a ClusterIP Service for the PostgreSQL database, making it accessible within the cluster.
+
+Key components:
+- `apiVersion` and `kind`: Specifies this is a Service resource
+- `spec.type`: Set to ClusterIP for internal cluster access
+- `spec.ports`: Maps the service port to the target port on the pod
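+
+A minimal sketch of such a Service follows. The name `postgres-service`, the `app: postgres` selector, and the service port are assumptions; only the ClusterIP type and the pod's port 5432 come from the description:
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: postgres-service   # assumed name
+spec:
+  type: ClusterIP
+  selector:
+    app: postgres          # assumed label on the database pods
+  ports:
+    - port: 5432           # assumed service port
+      targetPort: 5432     # port exposed by the PostgreSQL container
+```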
+
+### flask-app-service.yaml
+This file defines a LoadBalancer Service for the Flask application, making it accessible from outside the cluster.
+
+Key components:
+- Similar structure to postgres-service.yaml
+- `spec.type`: Set to LoadBalancer for external access
+- `spec.ports`: Maps port 80 to target port 5000 on the pod
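+
+For comparison, a LoadBalancer Service with this port mapping could look like the sketch below. The name and the `app: flask-app` selector are assumptions:
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: flask-app-service   # assumed name
+spec:
+  type: LoadBalancer
+  selector:
+    app: flask-app          # assumed label on the Flask pods
+  ports:
+    - port: 80              # external port
+      targetPort: 5000      # port the Flask container listens on
+```
+
+Once the cloud provider has provisioned the load balancer, `kubectl get service flask-app-service` shows its address in the `EXTERNAL-IP` column.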
+
+
+## Docker Compose vs. Kubernetes
+
+While both Docker Compose and Kubernetes can be used to deploy multi-container applications, they differ in several ways:
+
+1. **Scale**: Docker Compose is typically used for local development and small-scale deployments, while Kubernetes is designed for large-scale, production environments.
+
+2. **Orchestration**: Kubernetes provides more advanced orchestration features, such as automatic scaling, rolling updates, and self-healing.
+
+3. **Resource Definition**: Docker Compose typically describes the whole application in a single YAML file, while Kubernetes projects usually split resources across multiple YAML files, one per resource or component.
+
+4. **Networking**: Kubernetes provides more sophisticated networking options, including Services and Ingress controllers.
+
+5. **State Management**: Kubernetes has built-in primitives for managing stateful applications, such as StatefulSets and PersistentVolumes.
+
+In this deployment, we've translated the Docker Compose setup into Kubernetes resources, allowing for better scalability and management in a cloud environment.
+
+## Conclusion
+
+This Kubernetes deployment configuration provides a robust, scalable setup for the weather data pipeline and Flask application. It leverages Kubernetes' features to ensure high availability, ease of management, and efficient resource utilization. 
+
+The most important YAML files in this setup are:
+1. `postgres-deployment.yaml`: Ensures the database is running and properly configured.
+2. `data-pipeline-job.yaml`: Manages the extract, load, and transform process, crucial for data processing.
+3. `flask-app-deployment.yaml`: Deploys the web application that serves the processed data.
+
+These files form the core of the application, defining how the database, data processing job, and web application are deployed and managed within the Kubernetes cluster.
+
+By moving from Docker Compose to Kubernetes, the application gains the ability to scale more effectively and take advantage of cloud-native features, making it more suitable for production environments. The separation of concerns into different YAML files also improves maintainability and allows for more granular control over each component of the application.