This repository was archived by the owner on Dec 12, 2025. It is now read-only.

Commit

GCP: Cleanup and Speedup.
tmiddelkoop committed Feb 15, 2022
1 parent 90f95e3 commit c1318ad
Showing 8 changed files with 91 additions and 125 deletions.
2 changes: 1 addition & 1 deletion content/GCP/01_intro_to_cloud_console.ipynb
@@ -12,7 +12,7 @@
"```{admonition} Overview\n",
":class: tip\n",
"\n",
"**Teaching:** 10 min.\n",
"**Teaching:** 15 min.\n",
"\n",
"**Exercises:** 6 min.\n",
"\n",
4 changes: 2 additions & 2 deletions content/GCP/02_intro_to_compute.ipynb
@@ -12,7 +12,7 @@
"```{admonition} Overview\n",
":class: tip\n",
"\n",
"**Teaching:** 45 min.\n",
"**Teaching:** 30 min.\n",
"\n",
"**Exercises:** 6 min\n",
"\n",
@@ -147,7 +147,7 @@
"tags": []
},
"source": [
"## Find the VM Instances\n",
"## Find the VM Instance\n",
"\n",
"Now let's find and connect to the *VM Instance*.\n",
" * Navigate to the Google Compute Engine page by clicking **Navigation Menu** -> **Compute Engine** (under Compute) -> **Instances**.\n",
70 changes: 10 additions & 60 deletions content/GCP/03_intro_to_cloud_storage.ipynb
@@ -10,7 +10,7 @@
"```{admonition} Overview\n",
":class: tip\n",
"\n",
"**Teaching:** 40 min\n",
"**Teaching:** 20 min\n",
"\n",
"**Exercises:** 5 min\n",
"\n",
@@ -19,7 +19,6 @@
"\n",
"**Objectives:**\n",
"* Navigate the Google Cloud Storage service and terminology\n",
"* Understand the roles and permissions needed to use Google Cloud Storage in projects\n",
"* Allocate storage in Google Cloud Storage\n",
"* Find the cost estimator for Google Cloud Storage\n",
"* Recognize that resources have a \"location\"\n",
@@ -50,41 +49,6 @@
"We now take Drew through the process of creating a Google Cloud Storage bucket."
]
},
{
"cell_type": "markdown",
"id": "07fb9096-2b40-4995-a742-be7bd9b2797c",
"metadata": {
"tags": []
},
"source": [
"## Security\n",
"\n",
"Everything in the cloud requires permission (authorization). Let's first verify that we have the permissions to create a bucket. A Bucket (a resource) is created within a project and inheres permissions from it.\n",
"\n",
"We are interested in what permissions that *your* account has for *your* project. To do this navigate to the IAM page (**Navigation Menu -> IAM & Admin -> IAM -> Permissions -> View By: Principals**). This shows the permissions for the project.\n",
"\n",
"*Note: There is a powerful filter box to limit the permissions shown.*\n",
"\n",
"You should see a row with your account shown in the Principal column. Here you should see the \"Editor\" Role in the Role column. A *role* is a collection of permissions managed by Google or someone else. The **Editor**, **Owner**, or the **Storage Admin** role for a project will *allow* *you* to create, access, and delete Buckets *in* the project.\n",
"\n",
"There are three important pieces of information that work together to form the **IAM policy**. The permission (role), the identity (principal), and the resource (project). This is another who (identity), what (permission), and where (resource)."
]
},
{
"cell_type": "markdown",
"id": "9acf29cf-660b-4922-bcb8-89fd9080fdea",
"metadata": {
"tags": []
},
"source": [
"```{admonition} Exercise\n",
"\n",
"Answer the following questions:\n",
" * What is the \"Who, What, Where\" of the IAM policy that allows you to use your project?\n",
" * What else has permissions to do things in your project and state the \"Who, What, Where\"?\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "c5430b40-1a5f-40df-9e13-529ef3ece4ce",
@@ -94,12 +58,12 @@
"source": [
"## Allocate Google Cloud Storage\n",
"\n",
"Now that we have verified the permissions we can now create a bucket. Buckets are where objects are stored and have a globally unique name.\n",
"Buckets are where objects are stored and have a globally unique name.\n",
"\n",
"To create a bucket we do the following:\n",
" * Click **Navigation Menu** -> **Cloud Storage** (under Storage) -> **Browser** -> **+Create Bucket** (just under the blue bar) to open the *Create a bucket* page.\n",
" * In *Name your bucket*:\n",
" * For **Name**, enter a globally unique name for the bucket (example \"essentials-test-myname-2021-01-01\")\n",
" * For **Name**, enter a globally unique name for the bucket (example \"**essentials-test-myname-2022-01-01**\")\n",
" * Click **Continue**\n",
" * In *Choose where to store your data*:\n",
" * For *Location Type* select **Region** (cheapest and fastest)\n",
@@ -129,9 +93,9 @@
"tags": []
},
"source": [
"## Follow the Storage Allocation\n",
"## Track the Storage Allocation\n",
"\n",
"Just as with compute, we will audit (follow) the bucket creation by examining the project *activity*.\n",
"Just as with compute, we will track (follow) the bucket creation by examining the project *activity*.\n",
"\n",
"To view the project activity we do the following:\n",
"\n",
@@ -150,15 +114,13 @@
"tags": []
},
"source": [
"## Enumerate the Buckets\n",
"## List the Buckets\n",
"\n",
"Now let's find and examine the bucket. To view a bucket we do the following:\n",
"\n",
" * Navigate to the Google Storage page by clicking **Navigation Menu** -> **Cloud Storage** (under Storage) -> **Browser**. \n",
" * **Find** the bucket you just created. You can use the filter to find a bucket if there are a lot of them.\n",
" * Click on the bucket name to open the **bucket details** (it will display as a hyperlink when you hover over the bucket name).\n",
"\n",
"Navigate to the **dashboard** and you will now see \"Storage\" in the *resources* card under. You can click on this to quickly navigate to the storage page."
" * Click on the bucket name to open the **bucket details** (it will display as a hyperlink when you hover over the bucket name).\n"
]
},
{
@@ -170,12 +132,12 @@
"source": [
"## Review what is Important\n",
"\n",
"It is always important to review what is important to you. It may be cost, or keeping the data secure. Later on we will show how to monitor overall costs.\n",
"It is always important to review what is important to you. It may be cost, or keeping the data secure. Later on we will show how to monitor overall costs. We will also learn how to use the \"info panel\" to show more information about a bucket.\n",
"\n",
"For Drew, we will review that the bucket **public access** is *not public* by doing the following:\n",
" * Go to **Navigation Menu -> Cloud Storage -> Browser**\n",
" * Select the bucket of interest by **checking the box** next to the Bucket name.\n",
" * In the Right Side Bar (open if necessary) in the **Permissions** tab in the **Public Access** card you should see **Not Public**. This means that public access prevention is turned on.\n",
" * In the **Info Panel** (click show \"Info Panel\" if necessary) in the **Permissions** tab in the **Public Access** card you should see **Not Public**. This means that public access prevention is turned on.\n",
" * You can also see the **permissions** for the bucket in the bottom of the bar."
]
},
@@ -225,23 +187,11 @@
"\n",
"![storage-delete-bucket](img/storage-delete-bucket.png)\n",
"\n",
"Did you \"Follow\" the bucket by looking at the **activity** page as discussed above?\n",
"Did you \"Track\" the bucket by looking at the **activity** page as discussed above?\n",
"\n",
"Since we care about not paying for resources we are not using, we review our project by visiting the *Cloud Storage* service and confirming that we no longer have any *Buckets* allocated. "
]
},
{
"cell_type": "markdown",
"id": "3a28e28d-1d70-44fa-a952-4f3506ea85ec",
"metadata": {},
"source": [
"## Discussion (Optional)\n",
"\n",
"* What does the words \"Secure\", \"Allocate\", \"Follow\", and \"Enumerate\" spell?\n",
"* What happens when you add the \"R\" in \"Review?\"\n",
"* Is this useful?"
]
},
{
"cell_type": "markdown",
"id": "97d7ebc5-4a81-4f1a-aaf3-517adf70640a",
56 changes: 5 additions & 51 deletions content/GCP/04_intro_to_cli.ipynb
@@ -10,7 +10,7 @@
"```{admonition} Overview\n",
":class: tip\n",
"\n",
"**Teaching:** 40 min\n",
"**Teaching:** 20 min\n",
"\n",
"**Exercises:** 5 min\n",
"\n",
@@ -23,8 +23,7 @@
"* Use basic cloud CLI commands (`gcloud` and `gsutil`).\n",
"* Verify basic settings.\n",
"* Use environment variables for configuration.\n",
"* Understand the importance of using variables for configuration.\n",
"* Recognize the value of reproducibility and automation.\n",
"* Understand the importance of using variables for reproducibility and automation.\n",
"``` "
]
},
@@ -50,7 +49,7 @@
"\n",
"The cloud can be controlled using a Command Line Interface (CLI) or a programming language such as Python. Collectively these tools interact with the cloud over an Application Programming Interface (API) and this capability forms the basis of the cloud, the ability to control infrastructure programmatically.\n",
"\n",
"Just as with navigating the web console it is important to know the **who**, **what**, and **where** of CLI access to reduce the possibility of access mistakes. We will first verify the tools are installed and configured correctly. Next we get the Account being used (who) and the Project ID of the active project (where) using the `gcloud` command. We will then use the `gcloud` and `gsutil` commands to list some public Buckets (what).\n",
"Just as with navigating the web console it is important to know the **who**, **where**, and **what** of CLI access to reduce the possibility of access mistakes. We will first verify the tools are installed and configured correctly. Next we get the Account being used (who) and the Project ID of the active project (where) using the `gcloud` command. We will then use the `gcloud` and `gsutil` commands to list some public Buckets (what).\n",
"\n",
"The `gcloud` command is used to control most aspects of GCP and the `gsutil` command is used to control Google Cloud Storage Buckets. To access the manual pages for a command just add `--help` to the end of the command or run `gcloud help` for more information.\n",
"\n",
@@ -97,7 +96,7 @@
"tags": []
},
"source": [
"## Verify the Configuration (Who, What, Where)\n",
"## Verify the Configuration (Who, Where, What)\n",
"\n",
"First, let's verify that the Account being used for access (who) is what we expect."
]
@@ -148,57 +147,12 @@
"gcloud config get-value project"
]
},
{
"cell_type": "markdown",
"id": "c2972d7b-f393-42b5-8330-cf8292d28afb",
"metadata": {},
"source": [
"Now we will use `gcloud` to list a well known public bucket (what). "
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "617325c9-d853-4291-a1db-938ab9439fee",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"gs://gcp-public-data-landsat/index.csv.gz\n",
"gs://gcp-public-data-landsat/LC08/\n",
"gs://gcp-public-data-landsat/LE07/\n",
"gs://gcp-public-data-landsat/LM01/\n",
"gs://gcp-public-data-landsat/LM02/\n",
"gs://gcp-public-data-landsat/LM03/\n",
"gs://gcp-public-data-landsat/LM04/\n",
"gs://gcp-public-data-landsat/LM05/\n",
"gs://gcp-public-data-landsat/LO08/\n",
"gs://gcp-public-data-landsat/LT04/\n",
"gs://gcp-public-data-landsat/LT05/\n",
"gs://gcp-public-data-landsat/LT08/\n"
]
}
],
"source": [
"gcloud alpha storage ls gs://gcp-public-data-landsat"
]
},
{
"cell_type": "markdown",
"id": "42e6ad5b-186d-4cd1-ba03-e7c85ad40e38",
"metadata": {},
"source": [
"*Advanced Callout: The `alpha` (and `beta`) command allows us to access commands that have not been released for production and care should be taken when using these in a production environment. At this time this is not the recommended way to access storage buckets, but it does help verify that everything is working correctly.*"
]
},
{
"cell_type": "markdown",
"id": "1389ca4f-7234-4ea3-9ad8-85914d88ede5",
"metadata": {},
"source": [
"Finally, we will verify that the separate and preferred `gsutil` command is installed and working by listing the same well known public bucket. "
"Now we will use `gsutil` to list a well-known public bucket (what). The `gsutil` command is how we access Google Cloud Storage; most other services use the `gcloud` command."
]
},
{
45 changes: 36 additions & 9 deletions content/GCP/06_running_analysis.ipynb
@@ -10,7 +10,7 @@
"```{admonition} Overview\n",
":class: tip\n",
"\n",
"**Teaching:** 80 min\n",
"**Teaching:** 60 min\n",
"\n",
"**Exercises:** 8 min\n",
"\n",
@@ -377,7 +377,7 @@
"id": "2b8b3144-7453-4350-a1cf-7fa74af2bcbf",
"metadata": {},
"source": [
"## Access the bucket\n",
"## Access the Bucket\n",
"\n",
"Now we need to verify that Drew has access to the analysis data. \n",
"\n",
@@ -418,7 +418,7 @@
"id": "9e16e8b5-a178-492a-aa80-5affe721b6ca",
"metadata": {},
"source": [
"## Getting the data"
"## Getting the Metadata"
]
},
{
@@ -552,11 +552,25 @@
},
{
"cell_type": "markdown",
"id": "f98c38de-87fa-4e66-9d4b-186fbf81b3b2",
"id": "588ddd30-af8f-4378-b543-290c4c6f0840",
"metadata": {},
"source": [
"````{admonition} Tip\n",
":class: Tip\n",
":class: tip\n",
"To run the above commands in one step run\n",
"\n",
"```\n",
"bash get-index.sh\n",
"```\n",
"````"
]
},
{
"cell_type": "markdown",
"id": "f98c38de-87fa-4e66-9d4b-186fbf81b3b2",
"metadata": {},
"source": [
"````{admonition} Break (Optional)\n",
"\n",
"Now our virtual machine instance is ready and we can access the code and data. Now is a great time to take a short break.\n",
"````"
@@ -567,7 +581,7 @@
"id": "532e6da3-302a-4e8a-8570-752995f30f1d",
"metadata": {},
"source": [
"## Search for Data\n",
"## Getting the Data\n",
"\n",
"We can see the data is well formed and what we expect. We will now use this data to download data related to a specific point and for the Landsat 8. The following script does a simple filter."
]
@@ -672,8 +686,6 @@
"id": "a76f24f8-3b2d-4c0d-880b-f2911b9d9b84",
"metadata": {},
"source": [
"## Download the Data\n",
"\n",
"Now that we have a list of folders we are interested in, we will download them with a simple script that takes bucket addresses (URLs) and downloads them with the `gsutil` program."
]
},
@@ -805,6 +817,21 @@
"ls -l data"
]
},
{
"cell_type": "markdown",
"id": "0ca2e2ff-8276-4092-8a9e-ca754db5078e",
"metadata": {},
"source": [
"````{admonition} Tip\n",
":class: tip\n",
"To run the above analysis in one step run\n",
"\n",
"```\n",
"bash get-data.sh\n",
"```\n",
"````"
]
},
{
"cell_type": "markdown",
"id": "6073d5f6-68ef-41df-8044-73f221ce8780",
@@ -1396,7 +1423,7 @@
}
],
"source": [
"/usr/bin/python3 process_sat.py"
"python3 process_sat.py"
]
},
{

0 comments on commit c1318ad
