# CLASS-HPC-GCP
Internet2 CLASS Capstone Project, HPC-GCP Team
An HPC cluster can be deployed on GCP conveniently using the Terraform
scripts in the `slurm-gcp` directory. You need to edit two files:
`basic.tfvars` and `main.tf`. For a basic deployment, it is sufficient
to redefine a few variables, such as the project name and the cluster
name; beyond that, you can fine-tune the variables and scripts to fit
your needs.
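As a sketch, a minimal `basic.tfvars` might look like the following. The `project` and `cluster_name` variables come from the steps below; the other names and values are assumptions about the example file's layout, so verify them against your copy of `basic.tfvars.example`:

```hcl
# Hypothetical minimal basic.tfvars -- variable names follow the basic
# example shipped with slurm-gcp, but check your checkout before applying.
project      = "class-capstone"          # your GCP project ID
cluster_name = "class-capstone-cluster"  # name prefix for cluster VMs
zone         = "us-central1-a"           # assumed zone; pick one near you
```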
* Create a GCP Project (for example, class-capstone)
* Go to APIs & Services and enable the Compute Engine API and the Deployment Manager API
* Start Cloud Shell (if you have set up the gcloud SDK on your laptop, you can use your local environment in place of Cloud Shell)
* Clone the slurm-gcp repo: `git clone https://github.com/SchedMD/slurm-gcp.git`
* Go to the directory `slurm-gcp/tf/examples`
* Make a copy of the basic example: `cp basic.tfvars.example basic.tfvars`
* Edit the `basic.tfvars` file. Set `project = "class-capstone"` (or whatever you named your project)
* Open `main.tf` and make sure that each module's `source` argument refers to the correct path
* Initialize Terraform: `terraform init`
* Start the HPC cluster: `terraform apply -var-file=basic.tfvars`
* Go to your GCP dashboard and check the Compute Engine instances. You should see the controller and the login node up and running.
* SSH into the login node and check the Slurm status (`sinfo`) or run some test jobs.
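The `main.tf` step above refers to Terraform module blocks whose `source` arguments point at module directories via relative paths. A sketch of what such a block might look like (the module name, path, and inputs are illustrative, not copied from the slurm-gcp repo):

```hcl
# Illustrative Terraform module block -- the actual module names, paths,
# and inputs in slurm-gcp's examples may differ; check your checkout.
module "slurm_cluster_controller" {
  source       = "../../modules/controller"  # relative path to verify
  project      = var.project
  cluster_name = var.cluster_name
  zone         = var.zone
}
```

If `terraform init` fails with a module-not-found error, an incorrect `source` path is the first thing to check.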
Note that the minimum disk size is 20 GB, to accommodate the VM image; larger disks are fine.
The above steps are explained in the documentation page: https://cloud.google.com/architecture/deploying-slurm-cluster-compute-engine