Solutions
Fundamentals
Questions
Question 1
What does deep mean in deep learning?
Solution 1
The number of hidden layers.
Question 2
Activation functions help neural networks learn complex functions because they are:
Linear
Non-linear
Solution 2
Non-linear
Question 3
What is a tensor?
Solution 3
A multi-dimensional array.
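For example, a minimal sketch in NumPy (TensorFlow and PyTorch tensors behave similarly):

import numpy as np

scalar = np.array(5)                  # 0-D tensor (a scalar)
vector = np.array([1, 2, 3])          # 1-D tensor (a vector)
matrix = np.array([[1, 2], [3, 4]])   # 2-D tensor (a matrix)
print(matrix.ndim)                    # 2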
Question 4
I have labelled pictures of cats and dogs that I’d like a model to classify.
Is this a supervised or unsupervised problem?
Solution 4
Supervised
Question 5
I’d like a model to predict house prices from their features.
Is this a classification or regression problem?
Solution 5
Regression
Question 6
How many times can I use the test data?
Solution 6
Once
Question 7
I’ve decided on the number of hidden layers to use in my neural network.
Is this a parameter or hyperparameter?
Solution 7
Hyperparameter
Question 8
Do I want to minimise or maximise the loss?
Solution 8
Minimise
Question 9
A model underfits the data when it has:
High bias
High variance
Solution 9
High bias
Question 10
If my model underfits, what might help:
Adding more features
Adding more data
Solution 10
Adding more features
Question 11
If my model overfits, what might help:
Adding more complex features
Increasing regularisation
Solution 11
Increasing regularisation
Tools
Questions
Question 1
If you were looking to do classic machine learning, what tool is a good choice?
Solution 1
scikit-learn
Question 2
If you were looking to do deep learning using a high-level API, what tools are a good choice?
Solution 2
Keras or PyTorch Lightning.
Question 3
What are good reasons for choosing a high or low-level API?
Solution 3
High-level APIs (e.g., Keras, PyTorch Lightning) are often quicker for testing out ideas, more user-friendly, and have many high-level objects ready for you to use (e.g., layers, optimisers).
Low-level APIs (e.g., TensorFlow, PyTorch) are useful for creating your own custom objects.
Question 4
When creating a model, which API is simpler to use?
Sequential
Subclassing
Solution 4
The Sequential API is simpler.
The Subclassing API has a lot more flexibility.
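As a rough sketch of the difference in Keras (the layer sizes here are arbitrary):

import tensorflow as tf

# Sequential API: a simple linear stack of layers
sequential_model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Subclassing API: full flexibility over the forward pass
class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.hidden = tf.keras.layers.Dense(32, activation="relu")
        self.out = tf.keras.layers.Dense(1)

    def call(self, inputs):
        return self.out(self.hidden(inputs))

subclassed_model = MyModel()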
Question 5
Put these general steps in order:
Compile the model
Preprocess the data
Test the model
Fit the model to the training data
Create the model
Download the data
Solution 5
Download the data
Preprocess the data
Create the model
Compile the model
Fit the model to the training data
Test the model
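For instance, a minimal end-to-end sketch of these steps in Keras, using the built-in MNIST dataset as a stand-in for your own data:

import tensorflow as tf

# Download the data
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Preprocess the data (scale pixel values to [0, 1])
x_train, x_test = x_train / 255.0, x_test / 255.0

# Create the model
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
])

# Compile the model
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# Fit the model to the training data
model.fit(x_train, y_train, epochs=2, validation_split=0.1)

# Test the model
model.evaluate(x_test, y_test)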
Question 6
Which machine learning library is the best?
Solution 6
You decide.
Data
Questions
Question 1
Should I split my data into train and test subsets before or after pre-processing?
Solution 1
Before. Splitting the data should be done first, as this helps avoid leaking information from the test data into the pre-processing.
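A minimal sketch in scikit-learn, where the scaler is fitted on the training data only (X and y are placeholders for your features and labels):

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# split first, so the test data never influences pre-processing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# fit the scaler on the training data only, then apply it to both
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)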
Question 2
Before I use random functionality, what is a good practice for reproducibility?
Solution 2
Set a random seed.
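For example, a sketch covering the common libraries (use whichever apply to your code):

import random
import numpy as np
import tensorflow as tf

seed = 42
random.seed(seed)         # Python's built-in RNG
np.random.seed(seed)      # NumPy
tf.random.set_seed(seed)  # TensorFlow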
Question 3
What should I create if there are multiple steps to my data pre-processing?
Solution 3
A data pipeline.
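For classic machine learning, a sketch of a scikit-learn Pipeline chaining two pre-processing steps with a model (the steps here are illustrative):

from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),  # fill missing values
    ("scale", StandardScaler()),                 # standardise features
    ("model", LogisticRegression()),
])
# pipeline.fit(X_train, y_train) applies each step in turn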
Question 4
Name three ways to improve performance in a data pipeline.
Solution 4
Any of the following are often good ideas (see the sketch after this list):
Shuffle
Batch
Caching
Prefetch
Parallel data extraction
Data augmentation
Parallel data transformation
Vectorised mapping
Mixed precision
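Several of these are one-liners in tf.data; a hedged sketch, where features, labels, and preprocess are placeholders for your own data and transformation:

import tensorflow as tf

dataset = (
    tf.data.Dataset.from_tensor_slices((features, labels))  # your data
    .cache()                                         # caching
    .shuffle(buffer_size=1_000)                      # shuffle
    .batch(32)                                       # batch
    .map(preprocess,                                 # vectorised mapping (applied per batch)
         num_parallel_calls=tf.data.AUTOTUNE)        # parallel data transformation
    .prefetch(tf.data.AUTOTUNE)                      # prefetch
)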
Models
Questions
Question 1
What are possible hyperparameters that could be tuned?
Learning rate and the number of units.
Weights and biases.
Solution 1
Learning rate and the number of units.
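For example, these two could be searched over with KerasTuner (a sketch assuming the keras_tuner package; the ranges are arbitrary):

import keras_tuner as kt
import tensorflow as tf

def build_model(hp):
    # hyperparameters to tune
    units = hp.Int("units", min_value=32, max_value=256, step=32)
    learning_rate = hp.Float("learning_rate", min_value=1e-4, max_value=1e-2, sampling="log")
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(units, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    return model

tuner = kt.RandomSearch(build_model, objective="val_accuracy", max_trials=5)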
Question 2
What is transfer learning?
Solution 2
Transfer learning is where a model that has been pre-trained on one problem is transferred to another, similar problem. For example, a model that has learnt to classify images can be transferred to other image classification problems.
Question 3
Why is transfer learning useful?
Solution 3
Transfer learning is useful when:
You have a small dataset.
You want to take advantage of huge models without the costs of training them.
Question 4
What is a key step in transfer learning?
Solution 4
Freezing the layers of the base model; otherwise, the pre-trained model will forget what it has learned.
Note that this is different from fine-tuning, where, after transfer learning is complete, you unfreeze some of the layers and train them a little further.
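A sketch of freezing in Keras, using MobileNetV2 as an example base model:

import tensorflow as tf

# load a model pre-trained on ImageNet, without its classification head
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet"
)
base_model.trainable = False  # freeze: keep the pre-trained weights fixed

# add a new head for the new problem
model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1),  # e.g., a binary classification head
])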
Question 5
What are callbacks?
Solution 5
Callbacks are objects that get called by the model at different points during training, in particular after each batch or epoch.
Question 6
Name three examples of callbacks.
Solution 6
Checkpoints
Fault tolerance
Logging
Profiling
Early stopping
Learning rate decay
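For example, a few of these in Keras (the file paths are placeholders):

import tensorflow as tf

callbacks = [
    # checkpoints / fault tolerance: save the model during training
    tf.keras.callbacks.ModelCheckpoint(filepath="checkpoints/model.keras"),
    # early stopping: halt when the validation loss stops improving
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3),
    # logging / profiling: write logs for TensorBoard
    tf.keras.callbacks.TensorBoard(log_dir="logs"),
]
# model.fit(..., callbacks=callbacks)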
Distributed
Questions
Question 1
What are the two ways to parallelise machine learning, and which way is simpler?
Solution 1
Data parallelism and model parallelism, where data parallelism is often simpler.
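In TensorFlow, for example, data parallelism across the GPUs on one machine can be a small change to the code (a sketch):

import tensorflow as tf

# replicate the model on each available GPU; each replica
# processes a different slice of every batch (data parallelism)
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
# model.fit(...) then trains across the replicas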
Question 2
How can you check the efficiency of a CPU job?
Solution 2
You can check the CPU efficiency of a job using qacct -j <JOBID>. The output table provides information for cpu, ru_wallclock, and slots. You can use these to calculate the efficiency via:
Efficiency = 100 * cpu / (ru_wallclock * slots)
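For example, with hypothetical values of cpu = 7200 (seconds of CPU time), ru_wallclock = 3600 (seconds) and slots = 4, the efficiency would be 100 * 7200 / (3600 * 4) = 50%.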
Question 3
How can you check the efficiency of a GPU job?
Solution 3
You can use nvidia-smi.
In your job, you can use it to create an efficiency log via:
# start the efficiency log for the GPU
nvidia-smi dmon -d 10 -s um -i 0 > efficiency_log &
# run the GPU script
python script.py
# stop the efficiency log
kill %1
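Here, dmon samples GPU 0 (-i 0) every 10 seconds (-d 10) and, with -s um, records utilisation and memory usage columns, which you can inspect in efficiency_log after the job completes.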
Question 4
What tools can help distribute TensorFlow and PyTorch code?
Solution 4
Ray Train and PyTorch Lightning.
Question 5
In general, what should the batch size be for distributed work?
Solution 5
Generally, the batch size for distributed work should be the global batch size, which is the per-worker batch size multiplied by the number of workers.
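For example, with a per-worker batch size of 32 and 4 workers, the global batch size would be 32 * 4 = 128.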
Question 6
What are some good steps for moving Jupyter Notebook code to HPC?
Solution 6
Clean non-essential code
Refactor Jupyter Notebook code into functions
Create a Python script
Create submission script
Create tests
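For example, the refactored Python script might take the shape below (a sketch; the function bodies come from your own notebook, and train.py is a placeholder name):

# train.py: notebook code refactored into functions

def load_data():
    ...  # data download / loading code from the notebook

def preprocess(data):
    ...  # pre-processing code from the notebook

def train(data):
    ...  # model creation, compilation, and fitting

def main():
    data = preprocess(load_data())
    train(data)

if __name__ == "__main__":
    main()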