update google notes to be what I think they should be

Anton Lydike 2024-11-18 15:24:31 +00:00
parent 56f22b8ac1
commit 1313d8ab2e


@ -69,19 +69,14 @@ You only have $50 dollars worth of credit, which should be about 6 days of GPU u
### To log in to your instance via terminal:
1. In a DICE terminal window (Or your local environment) ```conda activate mlp```
2. Download the `gcloud` toolkit using ```curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-linux-x86_64.tar.gz```
3. Install the `gcloud` toolkit using ```tar zxvf google-cloud-cli-linux-x86_64.tar.gz; bash google-cloud-sdk/install.sh```.
**Note**: You might be asked to provide a passphrase to generate your local key, simply use a password of your choice. There might be some Yes/No style questions as well, choose yes, when that happens.
4. Reset your terminal using ```reset; source ~/.bashrc```. Then, to authorize the current machine to access your nodes, run ```gcloud auth login```. This will authenticate your Google account login.
5. Follow the prompts to get a token for your current machine.
6. Run ```gcloud config set project PROJECT_ID``` where you replace `PROJECT_ID` with your project ID. You can find that in the projects drop-down menu at the top of the Google Compute Engine window; this sets the current project as the active one. If you followed the above instructions, your project ID should be `sxxxxxxx-mlpractical`, where `sxxxxxxx` is your student number.
7. In your compute engine window, in the line for the instance that you have started (`mlpractical-1`), click on the downward arrow next to ```SSH```. Choose ```View gcloud command```. Copy the command to your terminal and press enter. Make sure your VM is up and running before doing this.
8. Add a password for your ssh-key (and remember it!).
9. Re-enter password (which will unlock your ssh-key) when prompted.
10. On your first login, you will be asked if you want to install nvidia drivers, **DO NOT AGREE** and follow the nvidia drivers installation below.
11. Install the R470 Nvidia driver by running the following commands:
1. Install the `google-cloud-sdk` (or similarly named) package using your OS package manager.
2. To authorize the current machine to access your nodes, run ```gcloud auth login```. This will authenticate your Google account login.
3. Follow the prompts to get a token for your current machine.
4. Run ```gcloud config set project PROJECT_ID``` where you replace `PROJECT_ID` with your project ID. You can find that in the projects drop-down menu at the top of the Google Compute Engine window; this sets the current project as the active one. If you followed the above instructions, your project ID should be `sxxxxxxx-mlpractical`, where `sxxxxxxx` is your student number.
5. In your compute engine window, in the line for the instance that you have started (`mlpractical-1`), click on the downward arrow next to ```SSH```. Choose ```View gcloud command```. Copy the command to your terminal and press enter. Make sure your VM is up and running before doing this.
6. Don't add a password to the SSH key.
7. On your first login, you will be asked if you want to install nvidia drivers, **DO NOT AGREE** and follow the nvidia drivers installation below.
8. Install the R470 Nvidia driver by running the following commands:
* Add "contrib" and "non-free" components to /etc/apt/sources.list
```bash
sudo tee -a /etc/apt/sources.list >/dev/null <<'EOF'
@ -91,15 +86,15 @@ You only have $50 dollars worth of credit, which should be about 6 days of GPU u
```
* Check that the lines were added correctly by running:
```bash
sudo -e /etc/apt/sources.list
cat /etc/apt/sources.list
```
* Update the list of available packages and install the nvidia-driver package:
```bash
sudo apt update
sudo apt install nvidia-driver firmware-misc-nonfree
```
12. Run ```nvidia-smi``` to confirm that the GPU can be found. This should report 1 Tesla T4 GPU. If not, the driver might have failed to install.
13. To test that PyTorch has access to the GPU you can type the commands below in your terminal. You should see `torch.cuda.is_available()` return `True`.
9. Run ```nvidia-smi``` to confirm that the GPU can be found. This should report 1 Tesla T4 GPU. If not, the driver might have failed to install.
10. To test that PyTorch has access to the GPU you can type the commands below in your terminal. You should see `torch.cuda.is_available()` return `True`.
```
python
```
@ -110,8 +105,8 @@ You only have $50 dollars worth of credit, which should be about 6 days of GPU u
```
exit()
```
14. Well done, you are now in your instance and ready to use it for your coursework.
15. Clone a fresh mlpractical repository, and checkout branch `mlp2024-25/mlp_compute_engines`:
11. Well done, you are now in your instance and ready to use it for your coursework.
12. Clone a fresh mlpractical repository, and checkout branch `mlp2024-25/mlp_compute_engines`:
```
git clone https://github.com/VICO-UoE/mlpractical.git ~/mlpractical
@ -125,7 +120,7 @@ You only have $50 dollars worth of credit, which should be about 6 days of GPU u
python train_evaluate_emnist_classification_system.py --filepath_to_arguments_json_file experiment_configs/emnist_tutorial_config.json
```
You should be able to see an experiment running, using the GPU. It should be doing about 26-30 it/s (iterations per second). You can stop it whenever you like using `ctrl-c`.
You should be able to see an experiment running, using the GPU. It should be doing about 260-300 it/s (iterations per second). You can stop it whenever you like using `ctrl-c`.
If all of the above matches what is stated, then you should be ready to run your experiments.
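
The two GPU checks in the steps above (running `nvidia-smi` and calling `torch.cuda.is_available()`) can also be scripted, which may be handy when re-checking an instance after a restart. The following is only a minimal sketch, not part of the course repository: the function name `gpu_status` and the status strings are made up for illustration, and it assumes `nvidia-smi` is on the `PATH` whenever the driver installed correctly.

```python
import importlib.util
import shutil


def gpu_status() -> str:
    """Return a short, human-readable summary of GPU visibility.

    Hypothetical helper for illustration; not part of mlpractical.
    """
    # nvidia-smi is installed alongside the NVIDIA driver, so its absence
    # suggests the driver installation did not succeed.
    if shutil.which("nvidia-smi") is None:
        return "nvidia-smi not found"

    # Check that PyTorch is importable before actually importing it.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"

    import torch

    # Same check as the interactive step: does PyTorch see a CUDA device?
    if torch.cuda.is_available():
        return f"ok: {torch.cuda.get_device_name(0)}"
    return "torch cannot see a GPU"


if __name__ == "__main__":
    print(gpu_status())
```

Saving this as, for example, `check_gpu.py` and running it with `python check_gpu.py` inside the `mlp` environment prints a single status line. Anything other than a line starting with `ok:` suggests revisiting the driver installation step.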