From 1313d8ab2e179351f6e898cb14d426df149f553d Mon Sep 17 00:00:00 2001
From: Anton Lydike
Date: Mon, 18 Nov 2024 15:24:31 +0000
Subject: [PATCH] update google notes to be what I think they should be

---
 notes/google_cloud_setup.md | 33 ++++++++++++++-------------------
 1 file changed, 14 insertions(+), 19 deletions(-)

diff --git a/notes/google_cloud_setup.md b/notes/google_cloud_setup.md
index f80b04f..37a2bd3 100644
--- a/notes/google_cloud_setup.md
+++ b/notes/google_cloud_setup.md
@@ -69,19 +69,14 @@ You only have $50 dollars worth of credit, which should be about 6 days of GPU u
 
 ### To login into your instance via terminal:
 
-1. In a DICE terminal window (Or your local environment) ```conda activate mlp```
-2. Download the `gcloud` toolkit using ```curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-linux-x86_64.tar.gz```
-3. Install the `gcloud` toolkit using ```tar zxvf google-cloud-cli-linux-x86_64.tar.gz; bash google-cloud-sdk/install.sh```.
-**Note**: You might be asked to provide a passphrase to generate your local key, simply use a password of your choice. There might be some Yes/No style questions as well, choose yes, when that happens.
-
-4. Reset your terminal using ```reset; source ~/.bashrc```. Then authorize the current machine to access your nodes run ```gcloud auth login```. This will authenticate your google account login.
-5. Follow the prompts to get a token for your current machine.
-6. Run ```gcloud config set project PROJECT_ID``` where you replace `PROJECT-ID` with your project ID. You can find that in the projects drop down menu on the top of the Google Compute Engine window; this sets the current project as the active one. If you followed the above instructions, your project ID should be `sxxxxxxx-mlpractical`, where `sxxxxxxx` is your student number.
-7. In your compute engine window, in the line for the instance that you have started (`mlpractical-1`), click on the downward arrow next to ```SSH```. Choose ```View gcloud command```. Copy the command to your terminal and press enter. Make sure your VM is up and running before doing this.
-8. Add a password for your ssh-key (and remember it!).
-9. Re-enter password (which will unlock your ssh-key) when prompted.
-10. On your first login, you will be asked if you want to install nvidia drivers, **DO NOT AGREE** and follow the nvidia drivers installation below.
-11. Install the R470 Nvidia driver by running the following commands:
+1. Install `google-cloud-sdk` (or similarly named) package using your OS package manager
+2. Authorize the current machine to access your nodes run ```gcloud auth login```. This will authenticate your google account login.
+3. Follow the prompts to get a token for your current machine.
+4. Run ```gcloud config set project PROJECT_ID``` where you replace `PROJECT-ID` with your project ID. You can find that in the projects drop down menu on the top of the Google Compute Engine window; this sets the current project as the active one. If you followed the above instructions, your project ID should be `sxxxxxxx-mlpractical`, where `sxxxxxxx` is your student number.
+5. In your compute engine window, in the line for the instance that you have started (`mlpractical-1`), click on the downward arrow next to ```SSH```. Choose ```View gcloud command```. Copy the command to your terminal and press enter. Make sure your VM is up and running before doing this.
+6. Don't add a password to the SSH key.
+7. On your first login, you will be asked if you want to install nvidia drivers, **DO NOT AGREE** and follow the nvidia drivers installation below.
+8. Install the R470 Nvidia driver by running the following commands:
     * Add "contrib" and "non-free" components to /etc/apt/sources.list
     ```bash
     sudo tee -a /etc/apt/sources.list >/dev/null <<'EOF'
@@ -91,15 +86,15 @@ You only have $50 dollars worth of credit, which should be about 6 days of GPU u
     ```
     * Check that the lines were well added by running:
     ```bash
-    sudo -e /etc/apt/sources.list
+    cat /etc/apt/sources.list
     ```
     * Update the list of available packages and install the nvidia-driver package:
     ```bash
     sudo apt update
     sudo apt install nvidia-driver firmware-misc-nonfree
     ```
-12. Run ```nvidia-smi``` to confirm that the GPU can be found. This should report 1 Tesla T4 GPU. if not, the driver might have failed to install.
-13. To test that PyTorch has access to the GPU you can type the commands below in your terminal. You should see `torch.cuda_is_available()` return `True`.
+9. Run ```nvidia-smi``` to confirm that the GPU can be found. This should report 1 Tesla T4 GPU. if not, the driver might have failed to install.
+10. To test that PyTorch has access to the GPU you can type the commands below in your terminal. You should see `torch.cuda_is_available()` return `True`.
     ```
     python
     ```
@@ -110,8 +105,8 @@ You only have $50 dollars worth of credit, which should be about 6 days of GPU u
     ```
     exit()
     ```
-14. Well done, you are now in your instance and ready to use it for your coursework.
-15. Clone a fresh mlpractical repository, and checkout branch `mlp2024-25/mlp_compute_engines`:
+11. Well done, you are now in your instance and ready to use it for your coursework.
+12. Clone a fresh mlpractical repository, and checkout branch `mlp2024-25/mlp_compute_engines`:
 
     ```
     git clone https://github.com/VICO-UoE/mlpractical.git ~/mlpractical
@@ -125,7 +120,7 @@ You only have $50 dollars worth of credit, which should be about 6 days of GPU u
     python train_evaluate_emnist_classification_system.py --filepath_to_arguments_json_file experiment_configs/emnist_tutorial_config.json
     ```
 
-    You should be able to see an experiment running, using the GPU. It should be doing about 26-30 it/s (iterations per second). You can stop it when ever you like using `ctrl-c`.
+    You should be able to see an experiment running, using the GPU. It should be doing about 260-300 it/s (iterations per second). You can stop it when ever you like using `ctrl-c`.
 
 If all the above matches what’s stated then you should be ready to run your experiments.
 
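
Reviewer note: the PyTorch GPU check described in the notes could also be run as a small script instead of an interactive session. The following is a minimal sketch, not part of the patch or of the mlpractical repository. The helper name `gpu_report` is hypothetical. It assumes the PyTorch call is spelled `torch.cuda.is_available()`, whereas the notes write it as `torch.cuda_is_available()`. The sketch also copes with PyTorch not being installed.

```python
import importlib.util


def gpu_report():
    """Return a one-line summary of the GPUs that PyTorch can see.

    Hypothetical helper for illustration only; it is not part of mlpractical.
    """
    # Avoid an ImportError if the active environment has no PyTorch at all.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"

    import torch

    # False here usually points at a missing or broken Nvidia driver.
    if not torch.cuda.is_available():
        return "PyTorch cannot see a GPU; the driver might have failed to install"

    # List every visible device by name (the notes expect a single Tesla T4).
    names = [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]
    return "PyTorch sees {} GPU(s): {}".format(len(names), ", ".join(names))


if __name__ == "__main__":
    print(gpu_report())
```

Run on the instance after the driver installation, the script should print a line naming the GPU. Either of the other two messages suggests revisiting the driver or environment setup steps.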