Commit

Added diagram
tmanik committed Sep 17, 2024
1 parent 219a69a commit 1b22c77
Showing 1 changed file with 59 additions and 0 deletions.
59 changes: 59 additions & 0 deletions diagram.md
@@ -0,0 +1,59 @@
# Data Pipeline Flow

<div align="center">

```mermaid
graph TD
    subgraph AWS
        A[AWS Data Source]
    end
    subgraph "Container: Extract"
        B[extract.py]
    end
    subgraph "Container: Load"
        C[load.py]
    end
    subgraph "Container: Transform"
        D[transform.py]
    end
    subgraph "PostgreSQL Database"
        E[Loading Table]
        F[Final Table]
    end
    subgraph "Container: Visualize"
        G[Flask App]
    end
    A -->|Data| B
    B -->|Extracted Data| C
    C -->|Load Data| E
    E -->|Read Data| D
    D -->|Transformed Data| F
    F -->|Read Data| G
    classDef container fill:#e6f3ff,stroke:#333,stroke-width:2px,color:black;
    class B,C,D,G container;
```

</div>

## Flow Explanation

The entire process is orchestrated by shell scripts (`extract.sh`, `load.sh`, `transform.sh`), which manage the execution of each step in the pipeline:

1. **Extract**: Data is sourced from AWS and extracted using `extract.py` in the Extract container.

2. **Load**: The extracted data is then loaded into the Loading Table of the PostgreSQL database using `load.py` in the Load container.

3. **Transform**: Data from the Loading Table is read, transformed using `transform.py` in the Transform container, and then stored in the Final Table (a rough sketch of steps 1–3 follows this list).

4. **Visualize**: Finally, a Flask app in the Visualize container reads data from the Final Table to create visualizations or serve data via an API (sketched at the end of this section).
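
As a rough illustration of steps 1–3, the sketch below assumes the AWS source is an S3 bucket and that the Loading Table and Final Table map to tables named `loading_table` and `final_table`. The bucket, keys, columns, and connection settings are placeholders for illustration, not the repository's actual code.

```python
"""Hypothetical sketch of the extract -> load -> transform steps."""
import boto3
import psycopg2

S3_BUCKET = "example-pipeline-bucket"   # assumed bucket name
S3_KEY = "exports/data.csv"             # assumed object key
LOCAL_CSV = "/tmp/data.csv"


def extract() -> str:
    """Download the raw CSV from the AWS data source (extract.py's role)."""
    s3 = boto3.client("s3")
    s3.download_file(S3_BUCKET, S3_KEY, LOCAL_CSV)
    return LOCAL_CSV


def load(csv_path: str, conn) -> None:
    """Bulk-load the extracted CSV into the Loading Table (load.py's role)."""
    with conn.cursor() as cur, open(csv_path) as f:
        cur.copy_expert("COPY loading_table FROM STDIN WITH CSV HEADER", f)
    conn.commit()


def transform(conn) -> None:
    """Read from the Loading Table and populate the Final Table
    (transform.py's role); here the transformation is a simple SQL cleanup."""
    with conn.cursor() as cur:
        cur.execute(
            """
            INSERT INTO final_table (id, value)
            SELECT id, TRIM(value)
            FROM loading_table
            WHERE value IS NOT NULL;
            """
        )
    conn.commit()


if __name__ == "__main__":
    # Assumed connection settings; in practice these would likely come from
    # environment variables or container configuration.
    connection = psycopg2.connect(
        host="db", dbname="pipeline", user="postgres", password="postgres"
    )
    try:
        load(extract(), connection)
        transform(connection)
    finally:
        connection.close()
```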

This pipeline ensures a structured flow of data from its source to a visualized or API-accessible format, with clear separation of concerns at each stage. The use of shell scripts for orchestration allows for flexible and controllable execution of the pipeline steps.
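
Corresponding to the Visualize step, here is a minimal sketch of what the Flask app might look like if it exposes the Final Table as JSON. The `/data` route, table and column names, and connection settings are assumptions for illustration; the actual app may render charts or expose different endpoints.

```python
"""Hypothetical sketch of the Visualize step: a Flask app reading final_table."""
from flask import Flask, jsonify
import psycopg2

app = Flask(__name__)


def get_connection():
    # Assumed connection settings, matching the sketch above.
    return psycopg2.connect(
        host="db", dbname="pipeline", user="postgres", password="postgres"
    )


@app.route("/data")
def data():
    """Serve rows from the Final Table as JSON for a chart or API consumer."""
    conn = get_connection()
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT id, value FROM final_table ORDER BY id;")
            rows = [{"id": r[0], "value": r[1]} for r in cur.fetchall()]
    finally:
        conn.close()
    return jsonify(rows)


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```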
