diff --git a/content/GCP/03_intro_to_cloud_storage.ipynb b/content/GCP/03_intro_to_cloud_storage.ipynb index d079607..77e50a8 100644 --- a/content/GCP/03_intro_to_cloud_storage.ipynb +++ b/content/GCP/03_intro_to_cloud_storage.ipynb @@ -33,7 +33,7 @@ "id": "9897048a-6aa8-4d85-a557-d85b802f3f1d", "metadata": {}, "source": [ - "One of the most common and economic ways to store data in the cloud is to use object storage. In GCP object storage is called *Google Cloud Storage*, which is similar to the Simple Storage Service, also known as S3, on Amazon Web Services (AWS). For object storage, information is stored as a collection of key-value pairs." + "There are many storage services in the cloud. One of the most common and economic ways to store data in the cloud is to use object storage. In GCP object storage is called *Google Cloud Storage*, which is similar to the Simple Storage Service, also known as S3, on Amazon Web Services (AWS). For object storage, information is stored as a collection of key-value pairs. This is different to how data is commonly stored on laptops and high performance computing clusters (supercomputers)." ] }, { @@ -67,7 +67,13 @@ "\n", "You should see a row with your account shown in the Principal column. Here you should see the \"Editor\" Role in the Role column. A *role* is a collection of permissions managed by Google or someone else. The **Editor**, **Owner**, or the **Storage Admin** role for a project will *allow* *you* to create, access, and delete Buckets *in* the project.\n", "\n", - "There are three important pieces of information that work together to form the **IAM policy**. The permission (role), the identity (principal or member), and the resource (project)." + "There are three important pieces of information that work together to form the **IAM policy**. The permission (role), the identity (principal), and the resource (project). 
This is another who (identity), what (permission), and where (resource).\n", + "\n", + "### Exercise\n", + "\n", + "Answer the following questions:\n", + " * What is the \"Who, What, Where\" of the IAM policy that allows you to use your project?\n", + " * What else has permissions to do things in your project? State the \"Who, What, Where\" for it.\n" ] }, { @@ -183,7 +189,7 @@ "id": "3a28e28d-1d70-44fa-a952-4f3506ea85ec", "metadata": {}, "source": [ - "## Discussion\n", + "## Discussion (Optional)\n", "\n", "* What does the words \"Secure\", \"Allocate\", \"Follow\", and \"Enumerate\" spell?\n", "* What happens when you add the \"R\" in \"Review?\"\n", @@ -195,7 +201,7 @@ "id": "97d7ebc5-4a81-4f1a-aaf3-517adf70640a", "metadata": {}, "source": [ - "## Resources in Google Cloud Platform - Review\n", + "## Resources in Google Cloud Platform (Optional)\n", "\n", "Even though we only covered the Google Cloud Storage service in this episode, this process can be used for other *resources* allocated in the cloud. The term *resource* is used for the \"things\" that live in a Project, such as compute, storage, and networking and other services. 
Resources have the following characteristics:\n", "\n", diff --git a/content/GCP/04_intro_to_cli.ipynb b/content/GCP/04_intro_to_cli.ipynb index dd4d0df..de1e44d 100644 --- a/content/GCP/04_intro_to_cli.ipynb +++ b/content/GCP/04_intro_to_cli.ipynb @@ -5,7 +5,7 @@ "id": "5439e525-b985-495f-85a6-e4c8d7452956", "metadata": {}, "source": [ - "# Introduction to the gcloud CLI\n", + "# Introduction to the Command Line Interface (CLI)\n", "\n", "\n", "```{admonition} Overview\n", @@ -63,17 +63,34 @@ }, { "cell_type": "markdown", - "id": "e4fa29a6-7d8e-4591-af3f-8539b94b3bef", + "id": "90229d38-69d3-4bf2-a0ec-c7b2e6c452e4", "metadata": {}, "source": [ - "## Verify the Configuration (Who, What, Where)" + "## Start a Cloud Shell Session\n", + "\n", + "Open up a Cloud shell session by clicking on the terminal icon on the blue bar on the top right of the page (labeled 5 below).\n", + "\n", + "![blue-bar](img/blue-bar.png)\n", + "\n", + "You may wish to push the maximize button (hover over the icons to see the names) on the terminal to make it full screen and change the font size (found in the gear icon) to your liking.\n", + "\n", + "Now test that the `gcloud` command works and you can get help. This is done by running the following:\n", + "```bash\n", + "gcloud help\n", + "```\n", + "\n", + "Use the arrows, page up and page down (also space), to navigate the help screen and press `q` to exit the help screen. (Pressing `h` will give you more information about how to navigate)" ] }, { "cell_type": "markdown", - "id": "f786f92c-7127-4f26-b0c0-fc27a364aca4", - "metadata": {}, + "id": "b91e0142-a26d-4bfd-9d6d-c455fdbf49f4", + "metadata": { + "tags": [] + }, "source": [ + "## Verify the Configuration (Who, What, Where)\n", + "\n", "First, let's verify that the Account being used for access (who) is what we expect." 
] }, diff --git a/content/GCP/06_running_analysis.ipynb b/content/GCP/06_running_analysis.ipynb index 730eef5..2d64955 100644 --- a/content/GCP/06_running_analysis.ipynb +++ b/content/GCP/06_running_analysis.ipynb @@ -53,6 +53,8 @@ " * Allow the VM \"Full\" access to \"Storage\". This can be found under \"Identity and API\" on the \"create an instance\" page and then selecting \"Set access for each API\" and change \"Storage\" to \"Full\". **This will allow the VM to create, read, write, and delete all storage buckets in the project\"**\n", " * Feel free to select a bit larger VM by changing the machine type to something larger, for example an \"e2-standard-2\".\n", "\n", + "*Instructor: place these instructions on the screen*\n", + "\n", "*When you are done feel free to connect to the virtual machine on your own for additional practice. Once everyone has created their VM we will connect to the machine as described below.*" ] }, @@ -133,6 +135,93 @@ "sudo unattended-upgrades" ] }, + { + "cell_type": "markdown", + "id": "cd0c7010-ef68-4648-9766-ab26e9bd6ecc", + "metadata": { + "tags": [] + }, + "source": [ + "## Setup Storage\n", + "\n", + "Before we do any work we will first create a bucket, with a reasonable set of options, to hold the results. We do this first to make sure we can store the results when we are done; it is easier to fix problems now than later. \n", + "\n", + "We first store the bucket name in the `BUCKET` environment variable for future use. This time we will specify a realistic set of options for a private bucket used for computation.\n", + "\n", + "Options (run `gsutil mb --help` for more information):\n", + " * `-b on` specifies uniform bucket-level access.\n", + " * `-l $REGION` puts the data in a specific region for lower cost and lower latency.\n", + " * `--pap enforced` turns on public access prevention to help keep data private. 
\n", + " \n", + "The uniform bucket level access (Bucket Policy Only enabled: true) puts the data access permissions (ACL) on the entire bucket, not on each object in the bucket. This makes the permissions obvious and makes security much more predictable.\n", + " \n", + "As usual, we must set our environment. In this case we also set a `REGION` environment variable to indicate where in the world we want the data to be stored.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "f36cb8c5-f305-4cb2-a5cc-0c9fd8592fb4", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "bucket: essentials-learner-2022-01-11 region: us-west2\n" + ] + } + ], + "source": [ + "BUCKET=\"essentials-${USER}-$(date +%F)\"\n", + "REGION=\"us-west2\"\n", + "echo \"bucket: $BUCKET region: $REGION\"" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "c2ae2b74-5e93-4c55-8bd7-63337f7dcbb8", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Creating gs://essentials-learner-2022-01-11/...\n" + ] + } + ], + "source": [ + "gsutil mb -b on -l $REGION --pap enforced \"gs://$BUCKET\"" + ] + }, + { + "cell_type": "markdown", + "id": "2c71d0de-8bdb-476f-922f-62dc19e8bbeb", + "metadata": {}, + "source": [ + "And verify the bucket was created" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "10f8c773-97fc-46af-a6b5-1bb832472b33", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "gs://essentials-learner-2022-01-11/\n" + ] + } + ], + "source": [ + "gsutil ls -b gs://$BUCKET" + ] + }, { "cell_type": "markdown", "id": "b7c4db9e-f098-41bc-80d2-b524444eec7f", @@ -359,7 +448,7 @@ "id": "76b905b4-1c2a-4960-a14f-974b77f671cd", "metadata": {}, "source": [ - "We will now uncompress the index file to make it easier to use. This may take some time depending on the machine type you are using." 
+ "We will now uncompress the index file to make it easier to use. This may take some time depending on the machine type you are using. (This is also why it is good to write scripts to do the entire process)." ] }, { @@ -1280,67 +1369,56 @@ }, { "cell_type": "markdown", - "id": "c0257075-537c-4510-bafd-72e9756db17b", + "id": "6c63e0c4-1476-4f0b-93b0-e5bd2b406e60", "metadata": {}, "source": [ - "## Exporting the Results\n", + "## Saving the Results\n", "\n", - "Now that we have the output data we will create a bucket to place the results. We will first create a bucket with a reasonable set of options.\n", - "\n", - "We fisrt store the bucket name in the `BUCKET` environment variable for future use. This time we will specify a realistic set of options for a private bucket used for computation.\n", - "\n", - "Options (run `gsutil mb --help` for more information):\n", - " * `-b on` specifies uniform bucket-level access.\n", - " * `-l $REGION` puts the data in a specific region for lower cost and lower latency.\n", - " * `--pap enforced` turns on public access prevention to help keep data private. \n", - " \n", - "The uniform bucket level access (Bucket Policy Only enabled: true) puts the data access permissions (ACL) on the entire bucket, not on each object in the bucket. This makes the permissions obvious and makes security much more predictable.\n", - " \n", - "As usual, we must set our environment. In this case we also set a `REGION` environment variable to indicate where in the world we want the data to be stored.\n" + "We now will store the data in the bucket we created in the beginning of the episode. 
First we verify the environment variable and that it exists.\n" ] }, { "cell_type": "code", - "execution_count": 24, - "id": "f36cb8c5-f305-4cb2-a5cc-0c9fd8592fb4", + "execution_count": 7, + "id": "9345472f-4ef3-490b-a80e-2462cd534c89", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "bucket: essentials-learner-2021-12-17 region: us-west2\n" + "essentials-learner-2022-01-11\n" ] } ], "source": [ - "BUCKET=\"essentials-${USER}-$(date +%F)\"\n", - "REGION=\"us-west2\"\n", - "echo \"bucket: $BUCKET region: $REGION\"" + "echo $BUCKET" ] }, { "cell_type": "code", - "execution_count": 25, - "id": "c2ae2b74-5e93-4c55-8bd7-63337f7dcbb8", + "execution_count": 9, + "id": "27dfae96-faf2-4d5d-8a78-97781841f172", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "Creating gs://essentials-learner-2021-12-17/...\n" + "gs://essentials-learner-2022-01-11/\n" ] } ], "source": [ - "gsutil mb -b on -l $REGION --pap enforced \"gs://$BUCKET\"" + "gsutil ls -b gs://$BUCKET" ] }, { "cell_type": "markdown", - "id": "22cc1043-386b-4990-b3dc-62cbdd7ba133", - "metadata": {}, + "id": "939bcbaa-14fe-479f-a394-087f116ec7cc", + "metadata": { + "tags": [] + }, "source": [ "Now copy the output data to the bucket. The `-r` flag recursively copies the output directory and `-m` copies the files in parallel. Note the locations of the `-m` and `-r` switches as they apply globally and to the `cp` command respectively." ] @@ -1439,11 +1517,13 @@ "id": "a1c11268-f389-405f-8d96-3c319a49b882", "metadata": {}, "source": [ - "## Sharing Results\n", + "## Sharing Results (Optional)\n", "\n", "In order to share resources outside a project we must use the Identity Access Management service. This is a powerful tool to grant and restrict access to resources, and if not done correctly It can have serious consequences. 
**Incorrect permissions can lead to exposure of sensitive data, destruction of data, and authorized use of resources that can result in huge bills.** When in doubt, seek help.\n", "\n", - "The question \"What access is really needed?\" is the **Principal of Least Privilege** and is a major cornerstone of security. We need to determine the lowest set of permissions or roles that is needed. In our case we wish to grant the \"Collaborator\" the \"Viewer\" access to the \"results bucket\". This will allow them to view, list, and download all objects in the bucket. This illustrates that for a **resource** a **member** (identity) is granted a **permission** (think of it in this order). Together this is called a **policy**. Google also uses \"Roles\" as a collection of predefined and managed permissions." + "The question \"What access is really needed?\" is the **Principle of Least Privilege** and is a major cornerstone of security. We need to determine the lowest set of permissions or roles that is needed. In our case we wish to grant the \"Collaborator\" the \"Viewer\" access to the \"results bucket\". This will allow them to view, list, and download all objects in the bucket. This illustrates that for a **resource** a **member** (identity) is granted a **permission** (think of it in this order). Together this is called a **policy**. Google also uses \"Roles\" as a collection of predefined and managed permissions.\n", + "\n", + "What we do not want to do is add the collaborator to the project! This would give them access to all storage buckets and all resources." ] }, { @@ -1454,7 +1534,7 @@ }, "source": [ "We will now add Members to a Bucket using the Web Console. 
We will use the Web Console to interactively build the policy binding by doing the following:\n", - " * Navigation Menu -> Storage/Cloud Storage -> Browser -> Click on the Bucket Name (**Bucket Details**) -> Select the **Permissions** tab -> Click **Add** next to \"Permissions\" above the permissions list.\n", + " * Navigation Menu -> **Storage/Cloud Storage** -> Browser -> Click on the Bucket Name (**Bucket Details**) -> Select the **Permissions** tab -> Click **Add** next to \"Permissions\" above the permissions list.\n", " * In the \"New Principals\" box add the Identity for the collaborator (another individual) as directed by the instructor.\n", " * Select the \"Storage Object Viewer\" by typing \"Storage Object Viewer\" in the filter and then selecting \"Storage Object Viewer\". Do not use any \"Legacy Storage\" roles.\n", " * Click \"Save\" to save the policy.\n", @@ -1471,8 +1551,8 @@ "tags": [] }, "source": [ - "Collaborators should now be able to see the contents of the bucket by explicitly naming it as shown below.\n", - "```\n", + "Collaborators should now be able to see the contents of the bucket by explicitly naming the bucket. The example below shows student231 accessing the bucket (note the prompt).\n", + "```bash\n", "student231@cloudshell:~ (t-monument-315019)$ gsutil ls gs://essentials-learner-2021-12-17\n", "gs://essentials-learner-2021-12-17/output/\n", "\n", @@ -1482,6 +1562,22 @@ "```" ] }, + { + "cell_type": "markdown", + "id": "ce0909e5-f889-4e96-9947-8706416a5511", + "metadata": {}, + "source": [ + "Now remove the access by selecting the checkbox on the row with the principal identity and click `remove`. 
Now verify that the collaborator does not have access:\n", + "\n", + "```bash\n", + "student231@cloudshell:~ (t-monument-315019)$ gsutil ls gs://essentials-tmiddelkoop-$(date +%F)/output\n", + "AccessDeniedException: 403 student231@class.internet2.edu does not have storage.objects.list access to the Google Cloud Storage bucket.\n", + "```\n", + "\n", + "\n", + "*Instructors: You may want to have students share these buckets for the example to reduce screen flipping and involve the students.*" + ] + }, { "cell_type": "markdown", "id": "40d9ab41-8920-45f2-8218-550baac5b069", @@ -1489,7 +1585,7 @@ "source": [ "## Cleanup\n", "\n", - "We will now leave the resources running in order to learn more about monitoring costs and will clean up all the resources as the end of Lesson. **Don't forget to do this!**" + "We will now leave the resources running in order to learn more about monitoring costs and will clean up all the resources at the end of the lesson. **Don't forget to remove the Cloud Storage Bucket and the Compute Engine Instance (Virtual Machine) when you are done!**" ] } ], diff --git a/content/GCP/07_monitoring_costs.ipynb b/content/GCP/07_monitoring_costs.ipynb index d41ec1d..e020d11 100644 --- a/content/GCP/07_monitoring_costs.ipynb +++ b/content/GCP/07_monitoring_costs.ipynb @@ -85,9 +85,9 @@ " * **Save** view\n", " * If you cannot save the view you can also click \"Share\" and use the URL to create a bookmark for easy access.\n", " \n", - "These charts should be monitored daily for active projects and weekly for ongoing projects. For projects that are inactive the billing should be disabled or the project should be deleted.\n", + "This is the most reliable and direct way to monitor costs and these charts should be monitored daily for active projects and weekly for ongoing projects. For projects that are inactive the billing should be disabled or the project should be deleted.\n", "\n", - "There reports do not show consumption in real time. 
In order to get an idea of what is generating costs (and you can estimate these costs based on past expereince), we will next show how to *enumerate* running resources. " + "These reports do not show consumption in real time. In order to get an idea of what is generating costs (and you can estimate these costs based on past experience), later we will show how to *enumerate* running resources. " ] }, { @@ -101,7 +101,9 @@ "\n", "Due to the diversity in the way Billing Accounts are setup at institutions we will not show you have to setup billing alerts. They are simple and straight forward to setup once you have access and navigate to it. If you do not have access to Billing Alerts you can accomplish similar results by navigating to the Billing Reports every day.\n", "\n", - "You can find the Billing Alerts under **Billing** -> **Budgets & Alerts**." + "You can find the Billing Alerts under **Billing** -> **Budgets & Alerts**.\n", + "\n", + "There are also third-party cost monitoring services that your institution may use that can set billing alerts; contact your research computing and data professional for more information." ] }, { diff --git a/content/GCP/08_cleaning_up_resources.ipynb b/content/GCP/08_cleaning_up_resources.ipynb index b5049bd..aee3e3d 100644 --- a/content/GCP/08_cleaning_up_resources.ipynb +++ b/content/GCP/08_cleaning_up_resources.ipynb @@ -274,7 +274,7 @@ "id": "74fd626a-6102-4815-a8c2-27aade6ebc3b", "metadata": {}, "source": [ - "## CLeanup VMs\n", + "## Cleanup VMs\n", "We can also create and delete VMs using the `gcloud` command. Note, the `--quiet` disables conformation. **WARNING: This command will immediately delete the VM called essentials**" ] },