This repository contains a workshop on containerizing research workflows, covering Docker, data processing, and Kubernetes deployment. It provides a complete guide from setup to cloud deployment, with practical examples.

Container Orchestration for Research Workflows

Workshop Overview

In this workshop, you will learn how to create, set up, and manage containerized research workflows for scalability and reproducibility. You will work with the NOAA Global Surface Summary of the Day (GSOD) dataset from the Registry of Open Data on AWS. Within the container, you will clean and manipulate the dataset for use with databases, and create a web-based user interface (UI) to view and analyze outputs. Along the way, you will learn best practices and standards for creating containers that are replicable across multiple computing systems.
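As a rough illustration of the cleaning step described above, here is a minimal sketch in Python using only the standard library. It is not the workshop's own code. The column names (`STATION`, `DATE`, `TEMP`, `PRCP`), the sentinel values treated as missing, and the sample rows are assumptions made for illustration; check them against the GSOD documentation and the actual files before relying on them.

```python
import csv
import io

# Assumption: GSOD-style sentinel values that mark a missing measurement.
MISSING = {"TEMP": 9999.9, "PRCP": 99.99}


def clean_row(row):
    """Return a cleaned copy of one record.

    Sentinel values become None, and a Celsius temperature is added.
    """
    cleaned = {"STATION": row["STATION"], "DATE": row["DATE"]}
    for field, sentinel in MISSING.items():
        value = float(row[field])
        cleaned[field] = None if value == sentinel else value
    if cleaned["TEMP"] is not None:
        # Assumption: the source temperature is in degrees Fahrenheit.
        cleaned["TEMP_C"] = round((cleaned["TEMP"] - 32) * 5 / 9, 1)
    else:
        cleaned["TEMP_C"] = None
    return cleaned


def clean_csv(text):
    """Parse CSV text and clean every record."""
    return [clean_row(r) for r in csv.DictReader(io.StringIO(text))]


# Made-up sample data in a simplified GSOD-like layout.
SAMPLE = """STATION,DATE,TEMP,PRCP
01001099999,2023-01-01,23.0,0.05
01001099999,2023-01-02,9999.9,99.99
"""

rows = clean_csv(SAMPLE)
for r in rows:
    print(r)
```

Keeping missing values as `None` rather than as sentinel numbers means they map naturally to `NULL` when the rows are later loaded into a database.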

Course Level

  • Intermediate

Intended Audience

This workshop is intended for:

  • Primary Audience: Researchers and data scientists interested in streamlining their workflows.
  • Secondary Audience: IT professionals and developers looking to implement containerized solutions for data processing and analysis.

Workshop Objectives

By the end of this workshop, you will be able to:

  • Understand the process of implementing containers for scalable research workflows
  • Set up a containerized environment for reproducible and replicable science
  • Implement large-scale data processing within containers
  • Create a user-friendly interface for accessing output and visualization within the container
  • Use cloud tools to orchestrate and scale your containers
  • Share your containers on free or paid container registries

Delivery Method

This workshop is delivered through:

  • Online via Zoom

Duration

  • 4 Hours

Workshop Outline

This workshop covers the following concepts:

  1. Introduction to Containerized Research Workflows

    • Benefits of containerizing research workflows
    • Tools and technologies: Docker, Kubernetes, etc.
  2. Setting Up the Containerized Environment

    • Creating Dockerfiles and Docker Compose files
  3. Data Processing Fundamentals

    • Understanding data processing
    • Implementing a containerized environment
  4. Developing the User Interface

    • UI development using Flask
    • Containerized UI
  5. Visualizing and Analyzing Output

    • Creating visualization dashboards
  6. Deploying to a Public Cloud as a Kubernetes Cluster

    • Setting up a Kubernetes cluster on a public cloud (this course uses Google Cloud Platform, GCP)
    • Deploying containerized applications to Kubernetes
    • Managing and scaling applications in Kubernetes
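To make the final outline item more concrete, the sketch below builds a Kubernetes Deployment manifest as a Python dictionary and prints it as JSON. It is an illustration under stated assumptions, not the workshop's deployment configuration: the application name `gsod-ui`, the image reference `example.registry/gsod-ui:0.1`, and port 5000 are hypothetical placeholders.

```python
import json


def deployment_manifest(name, image, replicas=2, port=5000):
    """Build a minimal apps/v1 Deployment manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Number of identical pods Kubernetes should keep running.
            "replicas": replicas,
            # The selector must match the pod template labels below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


# Hypothetical name and image, used only for illustration.
manifest = deployment_manifest(
    "gsod-ui", "example.registry/gsod-ui:0.1", replicas=3
)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, which accepts JSON as well as YAML. Scaling then amounts to changing `replicas` and applying the manifest again.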

End of Workshop Assessment

  • Q&A session
  • Feedback

Hands-On Lab Exercises

  1. Setting up Docker containers
  2. Creating a web-based UI using Flask
  3. Deploying and managing the complete workflow
  4. Deploying the workflow to a Kubernetes cluster on a public cloud

By the end of this workshop, participants will have a comprehensive understanding of how to build and manage a containerized research workflow using Docker and/or Kubernetes.
