Correcting regularisation coefficient values to test.
parent cf6417f917
commit 9a182d4be1

@@ -452,10 +452,10 @@
"Using the `L1Penalty` and `L2Penalty` classes you implemented in the previous exercise, train models to classify MNIST digit images with\n",
"\n",
" * no regularisation\n",
" * an L1 penalty with coefficient 0.1 on the all of the weight matrix parameters\n",
|
||||
" * an L1 penalty with coefficient 1.0 on the all of the weight matrix parameters\n",
|
||||
" * an L2 penalty with coefficient 0.1 on the all of the weight matrix parameters\n",
|
||||
" * an L2 penalty with coefficient 1.0 on the all of the weight matrix parameters\n",
|
||||
" * an L1 penalty with coefficient $10^{-5}$ on the all of the weight matrix parameters\n",
|
||||
" * an L1 penalty with coefficient $10^{-3}$on the all of the weight matrix parameters\n",
|
||||
" * an L2 penalty with coefficient $10^{-4}$ on the all of the weight matrix parameters\n",
|
||||
" * an L2 penalty with coefficient $10^{-2}$ on the all of the weight matrix parameters\n",
|
||||
" \n",
|
||||
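A minimal sketch of what penalty classes with this role typically compute; the `L1Penalty`/`L2Penalty` names come from the exercise, but the `__call__`/`grad` interface shown here is an assumption and the course framework's classes may differ:

```python
import numpy as np


class L1Penalty:
    """Sketch of an L1 (lasso) weight penalty with coefficient > 0."""

    def __init__(self, coefficient):
        assert coefficient > 0.0
        self.coefficient = coefficient

    def __call__(self, parameter):
        # Penalty value: coefficient times the sum of absolute parameter values.
        return self.coefficient * np.abs(parameter).sum()

    def grad(self, parameter):
        # Gradient with respect to the parameter: coefficient * sign(parameter).
        return self.coefficient * np.sign(parameter)


class L2Penalty:
    """Sketch of an L2 (weight decay) penalty with coefficient > 0."""

    def __init__(self, coefficient):
        assert coefficient > 0.0
        self.coefficient = coefficient

    def __call__(self, parameter):
        # Penalty value: half the coefficient times the sum of squared values.
        return 0.5 * self.coefficient * (parameter ** 2).sum()

    def grad(self, parameter):
        # Gradient with respect to the parameter: coefficient * parameter.
        return self.coefficient * parameter
```

The `grad` method is what gets added to the error gradient of each penalised parameter during backpropagation.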
"The models should all have three affine layers interspersed with rectified linear layers (as implemented in the first exercise) and intermediate layers between the input and output should have dimensionalities of 100. The final output layer should be an `AffineLayer` (the model outputting the logarithms of the non-normalised class probabilties) and you should use the `CrossEntropySoftmaxError` as the error function (which calculates the softmax of the model outputs to convert to normalised class probabilities before calculating the corresponding multi-class cross entropy error). \n",
"\n",
@@ -482,7 +482,7 @@
"\n",
"This assumes all the relevant classes have been imported from their modules, a penalty object has been assigned to `weights_penalty` and a seeded random number generator assigned to `rng`.\n",
"\n",
"For each regularisation scheme, train the model for 100 epochs with a batch size of 50 and using a gradient descent with momentum learning rule with learning rate 0.05 and momentum coefficient 0.8. For each regularisation scheme you should store the run statistics (output of `Optimiser.train`) and the final values of the first layer weights for each of the trained models."
"For each regularisation scheme, train the model for 100 epochs with a batch size of 50 and using a gradient descent with momentum learning rule with learning rate 0.01 and momentum coefficient 0.9. For each regularisation scheme you should store the run statistics (output of `Optimiser.train`) and the final values of the first layer weights for each of the trained models."
]
},
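The gradient descent with momentum update referred to above can be sketched as below, using the corrected hyperparameters from this commit; the function name and signature are illustrative assumptions, not the framework's learning rule API:

```python
import numpy as np


def momentum_step(param, grad, velocity, learning_rate=0.01, mom_coeff=0.9):
    """One gradient descent with momentum update.

    Decays the previous velocity by the momentum coefficient, subtracts the
    scaled gradient, then moves the parameter by the new velocity.
    Returns the updated (param, velocity) pair.
    """
    velocity = mom_coeff * velocity - learning_rate * grad
    return param + velocity, velocity
```

With momentum coefficient 0.9 the velocity accumulates gradients over roughly the last 1 / (1 - 0.9) = 10 steps, smoothing the descent direction compared to plain gradient descent.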
{