coursework2, labs and code
parent 80661efae1
commit ed47b36873
07_MLP_Coursework2.ipynb (Normal file, 362 added lines)
@@ -0,0 +1,362 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Please don't edit this cell!**\n",
    "\n",
    "# Marks and Feedback\n",
    "\n",
    "**Total Marks:** XX/100\n",
    "\n",
    "**Overall comments:**\n",
    "\n",
    "\n",
    "## Part 1. Investigations into Neural Networks (35 marks)\n",
    "\n",
    "* **Task 1**: *Experiments with learning rate schedules* - XX/5\n",
    "    * learning rate schedulers implemented\n",
    "    * experiments carried out\n",
    "    * further comments\n",
    "\n",
    "\n",
    "* **Task 2**: *Experiments with regularisation* - XX/5\n",
    "    * L1 experiments\n",
    "    * L2 experiments\n",
    "    * dropout experiments\n",
    "    * annealed dropout implemented\n",
    "    * further experiments carried out\n",
    "    * further comments\n",
    "\n",
    "\n",
    "* **Task 3**: *Experiments with pretraining* - XX/15\n",
    "    * autoencoder pretraining implemented\n",
    "    * denoising autoencoder pretraining implemented\n",
    "    * CE layer-by-layer pretraining implemented\n",
    "    * experiments\n",
    "    * further comments\n",
    "\n",
    "\n",
    "* **Task 4**: *Experiments with data augmentation* - XX/5\n",
    "    * training data augmented using noise, rotation, ...\n",
    "    * any further augmentations\n",
    "    * experiments\n",
    "    * further comments\n",
    "\n",
    "\n",
    "* **Task 5**: *State of the art* - XX/5\n",
    "    * motivation for systems constructed\n",
    "    * experiments\n",
    "    * accuracy of best system\n",
    "    * further comments\n",
    "\n",
    "\n",
    "\n",
    "## Part 2. Convolutional Neural Networks (55 marks)\n",
    "\n",
    "* **Task 6**: *Implement convolutional layer* - XX/20\n",
    "    * linear conv layer\n",
    "    * sigmoid conv layer\n",
    "    * relu conv layer\n",
    "    * any checks for correctness\n",
    "    * loop-based or vectorised implementations\n",
    "    * timing comparisons\n",
    "\n",
    "\n",
    "* **Task 7**: *Implement max-pooling layer* - XX/10\n",
    "    * implementation of non-overlapping pooling\n",
    "    * generic implementation\n",
    "    * any checks for correctness\n",
    "\n",
    "\n",
    "* **Task 8**: *Experiments with convolutional networks* - XX/25\n",
    "    * 1 conv layer (1 fmap)\n",
    "    * 1 conv layer (5 fmaps)\n",
    "    * 2 conv layers\n",
    "    * further experiments\n",
    "\n",
    "\n",
    "\n",
    "## Presentation (10 marks)\n",
    "\n",
    "* **Marks:** XX/10\n",
    "    * Concise description of each system constructed\n",
    "    * Experiment design and motivations for different systems\n",
    "    * Presentation of results - graphs, tables, diagrams\n",
    "    * Conclusions\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Coursework #2\n",
    "\n",
    "## Introduction\n",
    "\n",
    "\n",
    "## Previous Tutorials\n",
    "\n",
    "Before starting this coursework make sure that you have completed the following labs:\n",
    "\n",
    "* [04_Regularisation.ipynb](https://github.com/CSTR-Edinburgh/mlpractical/blob/master/04_Regularisation.ipynb) - regularising the model\n",
    "* [05_Transfer_functions.ipynb](https://github.com/CSTR-Edinburgh/mlpractical/blob/master/05_Transfer_functions.ipynb) - building and training different activation functions\n",
    "* [06_MLP_Coursework2_Introduction.ipynb](https://github.com/CSTR-Edinburgh/mlpractical/blob/master/06_MLP_Coursework2_Introduction.ipynb) - notes on numpy and tensors\n",
    "\n",
    "\n",
    "## Submission\n",
    "**Submission Deadline: Thursday 14 January 2016, 16:00**\n",
    "\n",
    "Submit the coursework as an IPython notebook file, using the `submit` command in the terminal on a DICE machine. If your file is `07_MLP_Coursework2.ipynb` then you would enter:\n",
    "\n",
    "`submit mlp 2 07_MLP_Coursework2.ipynb`\n",
    "\n",
    "where `mlp 2` indicates this is the second coursework of MLP.\n",
    "\n",
    "After submitting, you should receive an email of acknowledgment from the system confirming that your submission has been received successfully. Keep the email as evidence of your coursework submission.\n",
    "\n",
    "**Please make sure you submit a single `ipynb` file (and nothing else)!**\n",
    "\n",
    "**Submission Deadline: Thursday 14 January 2016, 16:00**\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Getting Started\n",
    "Please enter your student number and the date in the next code cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "#MLP Coursework 2\n",
    "#Student number: <ENTER STUDENT NUMBER>\n",
    "#Date: <ENTER DATE>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Part 1. Investigations into Neural Networks (35 marks)\n",
    "\n",
    "In this part you may choose exactly what you implement. However, you are expected to express your motivations, observations, and findings in a clear and cohesive way. Try to make it clear why you decided to do certain things. Use graphs and/or tables of results to show trends and other characteristics you think are important.\n",
    "\n",
    "For example, in Task 1 you could experiment with different schedulers in order to compare their convergence properties. In Task 2 you could look into (and visualise) what happens to weights when applying L1 and/or L2 regularisation during training. For instance, you could create sorted histograms of weight magnitudes in each layer, etc.\n",
    "\n",
    "**Before submission, please collapse all the log entries into smaller boxes (by clicking on the bar on the left hand side)**\n",
    "\n",
    "### Task 1 - Experiments with learning rate schedules (5 marks)\n",
    "\n",
    "Investigate the effect of learning rate schedules on training and accuracy. Implement at least one additional learning rate scheduler mentioned in the lectures."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "#load the corresponding code here, and also attach scripts that run the experiments"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Task 2 - Experiments with regularisers (5 marks)\n",
    "\n",
    "Investigate the effect of different regularisation approaches (L1, L2, dropout). Implement the annealed dropout scheduler (mentioned in lecture 5). Do some further investigations and experiments with model structures (and regularisers) of your choice."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Task 3 - Experiments with pretraining (15 marks)\n",
    "\n",
    "Implement pretraining of multi-layer networks with autoencoders, denoising autoencoders, and using layer-by-layer cross-entropy training.\n",
    "\n",
    "Implementation tip: You could add the corresponding methods to `optimiser`, namely, `pretrain()` and `pretrain_epoch()`, for autoencoders. Similarly, add `pretrain_discriminative()` and `pretrain_epoch_discriminative()` for cross-entropy layer-by-layer pretraining. Of course, you can modify any other necessary pieces, but include all the modified fragments below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Task 4 - Experiments with data augmentation (5 marks)\n",
    "\n",
    "Using the standard MNIST training data, generate some augmented training examples (for example, using noise or rotation). Perform experiments using this expanded training data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Task 5 - State of the art (5 marks)\n",
    "\n",
    "Using any techniques you have learnt so far (combining any number of them), build and train the best model you can (no other constraints)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "# Part 2. Convolutional Neural Networks (55 marks)\n",
    "\n",
    "In this part of the coursework, you are required to implement deep convolutional networks. This includes code for forward prop, back prop, and weight updates for convolutional and max-pooling layers, and should support the stacking of convolutional + pooling layers. You should implement all the parts relating to the convolutional layer in the mlp/conv.py module (if you decide to implement some routines in cython, keep them in mlp/conv.pyx). Attach both files to this notebook.\n",
    "\n",
    "Implementation tips: Look at [lecture 7](http://www.inf.ed.ac.uk/teaching/courses/mlp/2015/mlp07-cnn.pdf) and [lecture 8](http://www.inf.ed.ac.uk/teaching/courses/mlp/2015/mlp08-cnn2.pdf), and the introductory tutorial, [06_MLP_Coursework2_Introduction.ipynb](https://github.com/CSTR-Edinburgh/mlpractical/blob/master/06_MLP_Coursework2_Introduction.ipynb).\n",
    "\n",
    "### Task 6 - Implement convolutional layer (20 marks)\n",
    "\n",
    "Implement a linear convolutional layer, and then extend it to sigmoid and ReLU transfer functions (in a similar way to the fully-connected layers). It is recommended that you first implement it in the naive way with nested loops (python and/or cython); optionally, you may then implement it in a vectorised way in numpy. Include logs for each implementation of the convolutional layer, as timings for the different implementations are of interest. Include all relevant code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "### Task 7 - Implement max-pooling layer (10 marks)\n",
    "\n",
    "Implement a max-pooling layer. Non-overlapping pooling (which was assumed in the lecture presentation) is required. You may also implement a more generic solution with striding. Include all relevant code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Task 8 - Experiments with convolutional networks (25 marks)\n",
    "\n",
    "Construct convolutional networks with a softmax output layer and a single fully connected hidden layer. Your first experiments should use one convolutional+pooling layer. As a default, use convolutional kernels of dimension 5x5 (stride 1) and pooling regions of 2x2 (stride 2, hence non-overlapping).\n",
    "\n",
    "* Implement and test a convolutional network with 1 feature map\n",
    "* Implement and test a convolutional network with 5 feature maps\n",
    "\n",
    "Explore convolutional networks with two convolutional layers by implementing, training, and evaluating a network with two convolutional+maxpooling layers, with 5 feature maps in the first convolutional layer and 10 feature maps in the second convolutional layer.\n",
    "\n",
    "Carry out further experiments to optimise the convolutional network architecture (you could explore kernel sizes and strides, number of feature maps, sizes and strides of the pooling operator, etc. - it is up to you)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "**This is the end of coursework 2.**\n",
    "\n",
    "Please remember to save your notebook, and submit it following the instructions at the top. Please make sure that you have executed all the code cells when you submit the notebook.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
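For Task 1 above, a learning rate schedule only has to map an epoch index to a learning rate. The snippet below is a minimal standalone sketch of two such schedules (exponential and reciprocal decay); the class names and the get_rate(epoch) interface are illustrative assumptions, not the mlp package's scheduler API.

# Hedged sketch of two learning rate schedules (illustration only; the
# get_rate(epoch) interface is an assumption, not the coursework API).

class ExponentialLearningRate(object):
    """eta_t = eta_0 * gamma**t for 0 < gamma < 1."""
    def __init__(self, start_rate=0.5, gamma=0.9, max_epochs=30):
        self.start_rate = start_rate
        self.gamma = gamma
        self.max_epochs = max_epochs

    def get_rate(self, epoch):
        if epoch >= self.max_epochs:
            return 0.0  # a zero rate can be used to signal "stop training"
        return self.start_rate * (self.gamma ** epoch)


class ReciprocalLearningRate(object):
    """eta_t = eta_0 / (1 + t/r), the classic 1/t decay."""
    def __init__(self, start_rate=0.5, scale=5.0, max_epochs=30):
        self.start_rate = start_rate
        self.scale = scale
        self.max_epochs = max_epochs

    def get_rate(self, epoch):
        if epoch >= self.max_epochs:
            return 0.0
        return self.start_rate / (1.0 + epoch / self.scale)


if __name__ == '__main__':
    sched = ExponentialLearningRate(start_rate=0.5, gamma=0.9)
    print([round(sched.get_rate(t), 4) for t in range(5)])
    # roughly [0.5, 0.45, 0.405, 0.3645, 0.3281]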
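For the annealed dropout part of Task 2, the schedule just has to move the keep-probabilities towards 1.0 as training progresses. Below is a minimal sketch with a linear anneal; the DropoutAnnealed name and the get_probs(epoch) interface are assumptions made for this illustration, not the course's scheduler classes.

# Hedged sketch: linearly annealed dropout keep-probabilities.

class DropoutAnnealed(object):
    """Moves the input/hidden keep-probabilities linearly from their initial
    values towards 1.0 over num_epochs, after which dropout is effectively off."""
    def __init__(self, p_inp_keep=0.8, p_hid_keep=0.5, num_epochs=20):
        self.p_inp_keep = p_inp_keep
        self.p_hid_keep = p_hid_keep
        self.num_epochs = num_epochs

    def get_probs(self, epoch):
        frac = min(float(epoch) / self.num_epochs, 1.0)
        p_inp = self.p_inp_keep + frac * (1.0 - self.p_inp_keep)
        p_hid = self.p_hid_keep + frac * (1.0 - self.p_hid_keep)
        return p_inp, p_hid


if __name__ == '__main__':
    sched = DropoutAnnealed(p_inp_keep=0.8, p_hid_keep=0.5, num_epochs=4)
    for t in range(5):
        print(sched.get_probs(t))  # from (0.8, 0.5) at epoch 0 to (1.0, 1.0) at epoch 4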
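Task 3's autoencoder pretraining is, at heart, a greedy loop over hidden layers: train an autoencoder that reconstructs the representation produced by the layers below, keep its encoder, and move up. The standalone numpy sketch below (sigmoid autoencoders trained with plain batch gradient descent on squared error, optional input noise for the denoising variant) only illustrates that control flow; every name in it is an assumption, and it is not a substitute for the pretrain()/pretrain_epoch() methods suggested in the implementation tip.

import numpy

def sigmoid(a):
    return 1.0 / (1.0 + numpy.exp(-a))

def train_autoencoder(X, n_hidden, epochs=10, lr=0.5, noise=0.0, rng=None):
    """Trains one sigmoid autoencoder with batch gradient descent on squared
    reconstruction error; returns the encoder parameters (W, b)."""
    rng = rng if rng is not None else numpy.random.RandomState(42)
    n_in = X.shape[1]
    W_enc = rng.uniform(-0.1, 0.1, (n_in, n_hidden))
    b_enc = numpy.zeros(n_hidden)
    W_dec = rng.uniform(-0.1, 0.1, (n_hidden, n_in))
    b_dec = numpy.zeros(n_in)
    for _ in range(epochs):
        # corrupt the input for the denoising variant (noise=0 gives a plain AE)
        X_in = X + noise * rng.randn(*X.shape) if noise > 0 else X
        H = sigmoid(X_in.dot(W_enc) + b_enc)          # encoder
        R = sigmoid(H.dot(W_dec) + b_dec)             # decoder / reconstruction
        d_out = (R - X) * R * (1 - R)                 # grad wrt decoder pre-activation
        d_hid = d_out.dot(W_dec.T) * H * (1 - H)      # grad wrt encoder pre-activation
        W_dec -= lr * H.T.dot(d_out) / X.shape[0]
        b_dec -= lr * d_out.mean(axis=0)
        W_enc -= lr * X_in.T.dot(d_hid) / X.shape[0]
        b_enc -= lr * d_hid.mean(axis=0)
    return W_enc, b_enc

def greedy_pretrain(X, layer_sizes, **kwargs):
    """Greedy layer-by-layer pretraining: each autoencoder is trained on the
    representation produced by the already-pretrained layers below it."""
    params, rep = [], X
    for n_hidden in layer_sizes:
        W, b = train_autoencoder(rep, n_hidden, **kwargs)
        params.append((W, b))
        rep = sigmoid(rep.dot(W) + b)
    return params

if __name__ == '__main__':
    rng = numpy.random.RandomState(0)
    X = rng.rand(100, 20)
    stack = greedy_pretrain(X, layer_sizes=[16, 8], epochs=5)
    print([W.shape for W, _ in stack])  # [(20, 16), (16, 8)]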
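Task 4's augmentation can be as simple as adding pixel noise and small random rotations to each digit. A possible standalone sketch using scipy.ndimage follows; it assumes flattened 28x28 MNIST inputs with pixel values in [0, 1] and is not tied to the course's data providers.

import numpy
from scipy.ndimage import rotate

def augment_batch(X, rng, max_angle=15.0, noise_std=0.1):
    """Returns a noisy, randomly rotated copy of a batch of flattened 28x28
    MNIST digits (X has shape (batch_size, 784))."""
    augmented = numpy.empty_like(X)
    for i in range(X.shape[0]):
        img = X[i].reshape(28, 28)
        angle = rng.uniform(-max_angle, max_angle)
        rotated = rotate(img, angle, reshape=False, order=1, mode='constant')
        noisy = rotated + rng.normal(0.0, noise_std, size=rotated.shape)
        augmented[i] = numpy.clip(noisy, 0.0, 1.0).ravel()
    return augmented

if __name__ == '__main__':
    rng = numpy.random.RandomState(0)
    X = rng.rand(4, 784)
    X_aug = augment_batch(X, rng)
    print(X_aug.shape)  # (4, 784); stack with the original set to enlarge it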
mlp/conv.py (Normal file, 126 added lines)
@@ -0,0 +1,126 @@

# Machine Learning Practical (INFR11119),
# Pawel Swietojanski, University of Edinburgh


import numpy
import logging
from mlp.layers import Layer


logger = logging.getLogger(__name__)

"""
You have been given a very initial skeleton below. Feel free to build on top of it and/or
modify it according to your needs. Just notice that you can factor the convolution code out of
the layer code, and just pass (possibly) different conv implementations for each of the stages
in the model where you are expected to apply the convolutional operator. This will allow you to
keep the layer implementation independent of the conv operator implementation, and you can easily
swap it later, for example, for a more efficient implementation if you come up with one, etc.
"""


def my1_conv2d(image, kernels, strides=(1, 1)):
    """
    Implements a 2d valid convolution of kernels with the image.
    Note: filter means the same as kernel, and convolution (correlation) of those with the input space
    produces feature maps (sometimes also referred to as receptive fields). Also note that
    feature maps are synonymous here with channels, and as such num_inp_channels == num_inp_feat_maps
    :param image: 4D tensor of sizes (batch_size, num_input_channels, img_shape_x, img_shape_y)
    :param kernels: 4D tensor of filters of size (num_inp_feat_maps, num_out_feat_maps, kernel_shape_x, kernel_shape_y)
    :param strides: a tuple (stride_x, stride_y), specifying the shift of the kernels in x and y dimensions
    :return: 4D tensor of size (batch_size, num_out_feature_maps, feature_map_shape_x, feature_map_shape_y)
    """
    raise NotImplementedError('Write me!')


class ConvLinear(Layer):
    def __init__(self,
                 num_inp_feat_maps,
                 num_out_feat_maps,
                 image_shape=(28, 28),
                 kernel_shape=(5, 5),
                 stride=(1, 1),
                 irange=0.2,
                 rng=None,
                 conv_fwd=my1_conv2d,
                 conv_bck=my1_conv2d,
                 conv_grad=my1_conv2d):
        """

        :param num_inp_feat_maps: int, a number of input feature maps (channels)
        :param num_out_feat_maps: int, a number of output feature maps (channels)
        :param image_shape: tuple, a shape of the image
        :param kernel_shape: tuple, a shape of the kernel
        :param stride: tuple, shift of kernels in both dimensions
        :param irange: float, initial range of the parameters
        :param rng: RandomState object, random number generator
        :param conv_fwd: handle to a convolution function used in fwd-prop
        :param conv_bck: handle to a convolution function used in backward-prop
        :param conv_grad: handle to a convolution function used in pgrads
        :return:
        """

        super(ConvLinear, self).__init__(rng=rng)

        raise NotImplementedError()

    def fprop(self, inputs):
        raise NotImplementedError()

    def bprop(self, h, igrads):
        raise NotImplementedError()

    def bprop_cost(self, h, igrads, cost):
        raise NotImplementedError('ConvLinear.bprop_cost method not implemented')

    def pgrads(self, inputs, deltas, l1_weight=0, l2_weight=0):
        raise NotImplementedError()

    def get_params(self):
        raise NotImplementedError()

    def set_params(self, params):
        raise NotImplementedError()

    def get_name(self):
        return 'convlinear'

# you can derive particular non-linear implementations here:
# class ConvSigmoid(ConvLinear):
# ...


class ConvMaxPool2D(Layer):
    def __init__(self,
                 num_feat_maps,
                 conv_shape,
                 pool_shape=(2, 2),
                 pool_stride=(2, 2)):
        """

        :param conv_shape: tuple, a shape of the lower convolutional feature maps output
        :param pool_shape: tuple, a shape of the pooling operator
        :param pool_stride: tuple, strides for the pooling operator
        :return:
        """

        super(ConvMaxPool2D, self).__init__(rng=None)
        raise NotImplementedError()

    def fprop(self, inputs):
        raise NotImplementedError()

    def bprop(self, h, igrads):
        raise NotImplementedError()

    def get_params(self):
        return []

    def pgrads(self, inputs, deltas, **kwargs):
        return []

    def set_params(self, params):
        pass

    def get_name(self):
        return 'convmaxpool2d'
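As a starting point for Task 6, the naive loop-based forward pass below follows the tensor layout documented in the my1_conv2d docstring above (valid borders, correlation rather than flipped-kernel convolution). It is an illustrative standalone sketch, not the mlp package implementation; a vectorised variant and a timing comparison against it are exactly what the task asks to add.

import numpy

def naive_conv2d_fprop(image, kernels, strides=(1, 1)):
    """Naive 'valid' convolution (correlation) with the layout of my1_conv2d:
    image is (batch_size, num_inp_feat_maps, img_x, img_y) and kernels is
    (num_inp_feat_maps, num_out_feat_maps, kernel_x, kernel_y)."""
    batch_size, num_inp_fmaps, img_x, img_y = image.shape
    _, num_out_fmaps, kernel_x, kernel_y = kernels.shape
    stride_x, stride_y = strides
    out_x = (img_x - kernel_x) // stride_x + 1
    out_y = (img_y - kernel_y) // stride_y + 1
    fmaps = numpy.zeros((batch_size, num_out_fmaps, out_x, out_y))
    for b in range(batch_size):
        for o in range(num_out_fmaps):
            for i in range(num_inp_fmaps):
                for x in range(out_x):
                    for y in range(out_y):
                        patch = image[b, i,
                                      x * stride_x:x * stride_x + kernel_x,
                                      y * stride_y:y * stride_y + kernel_y]
                        fmaps[b, o, x, y] += numpy.sum(patch * kernels[i, o])
    return fmaps

if __name__ == '__main__':
    rng = numpy.random.RandomState(0)
    image = rng.rand(2, 1, 28, 28)   # a batch of two 28x28 single-channel images
    kernels = rng.rand(1, 5, 5, 5)   # 1 input feature map -> 5 output feature maps
    print(naive_conv2d_fprop(image, kernels).shape)  # (2, 5, 24, 24)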
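Similarly for Task 7, non-overlapping max-pooling and the routing of gradients back to the maxima can be sketched in a few lines of numpy. Again, this is only an illustration of the idea, not the ConvMaxPool2D implementation; ties within a pooling region send the gradient to every tied position here, which is acceptable for a sketch.

import numpy

def maxpool2d_fprop(fmaps, pool_shape=(2, 2)):
    """Non-overlapping max-pooling over a (batch, fmaps, x, y) tensor.
    Returns the pooled tensor plus the mask of maxima needed for bprop."""
    p_x, p_y = pool_shape
    b, f, x, y = fmaps.shape
    assert x % p_x == 0 and y % p_y == 0, 'non-overlapping pooling only'
    # put each pooling region on its own pair of axes, then reduce over them
    blocks = fmaps.reshape(b, f, x // p_x, p_x, y // p_y, p_y)
    pooled = blocks.max(axis=(3, 5))
    upsampled = numpy.repeat(numpy.repeat(pooled, p_x, axis=2), p_y, axis=3)
    mask = (fmaps == upsampled)
    return pooled, mask

def maxpool2d_bprop(igrads, mask, pool_shape=(2, 2)):
    """Routes the incoming gradients back to the positions of the maxima."""
    p_x, p_y = pool_shape
    expanded = numpy.repeat(numpy.repeat(igrads, p_x, axis=2), p_y, axis=3)
    return expanded * mask

if __name__ == '__main__':
    rng = numpy.random.RandomState(0)
    fmaps = rng.rand(2, 5, 24, 24)
    pooled, mask = maxpool2d_fprop(fmaps)
    print(pooled.shape)                                          # (2, 5, 12, 12)
    print(maxpool2d_bprop(numpy.ones_like(pooled), mask).shape)  # (2, 5, 24, 24)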
mlp/utils.py (Normal file, 66 added lines)
@@ -0,0 +1,66 @@
# Machine Learning Practical (INFR11119),
# Pawel Swietojanski, University of Edinburgh

import numpy
from mlp.layers import Layer


def numerical_gradient(f, x, eps=1e-4, **kwargs):
    """
    Implements the following numerical gradient rule
    df(x)/dx = (f(x+eps)-f(x-eps))/(2eps)
    """

    xc = x.copy()
    g = numpy.zeros_like(xc)
    xf = xc.ravel()
    gf = g.ravel()

    for i in xrange(xf.shape[0]):
        xx = xf[i]
        xf[i] = xx + eps
        fp_eps, ___ = f(xc, **kwargs)
        xf[i] = xx - eps
        fm_eps, ___ = f(xc, **kwargs)
        xf[i] = xx
        gf[i] = (fp_eps - fm_eps)/(2*eps)

    return g


def verify_gradient(f, x, eps=1e-4, tol=1e-6, **kwargs):
    """
    Compares the numerical and analytical gradients.
    """
    fval, fgrad = f(x=x, **kwargs)
    ngrad = numerical_gradient(f=f, x=x, eps=eps, tol=tol, **kwargs)

    fgradnorm = numpy.sqrt(numpy.sum(fgrad**2))
    ngradnorm = numpy.sqrt(numpy.sum(ngrad**2))
    diffnorm = numpy.sqrt(numpy.sum((fgrad-ngrad)**2))

    if fgradnorm > 0 or ngradnorm > 0:
        norm = numpy.maximum(fgradnorm, ngradnorm)
        if not (diffnorm < tol or diffnorm/norm < tol):
            raise Exception("Numerical and analytical gradients "
                            "are different: %s != %s!" % (ngrad, fgrad))
    else:
        if not (diffnorm < tol):
            raise Exception("Numerical and analytical gradients "
                            "are different: %s != %s!" % (ngrad, fgrad))
    return True


def verify_layer_gradient(layer, x, eps=1e-4, tol=1e-6):

    assert isinstance(layer, Layer), (
        "Expected to get the instance of Layer class, got"
        " %s " % type(layer)
    )

    def grad_layer_wrapper(x, **kwargs):
        h = layer.fprop(x)
        deltas, ograds = layer.bprop(h=h, igrads=numpy.ones_like(h))
        return numpy.sum(h), ograds

    return verify_gradient(f=grad_layer_wrapper, x=x, eps=eps, tol=tol, layer=layer)
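verify_gradient and verify_layer_gradient above give a ready-made correctness check for the Task 6 and Task 7 layers. The hedged usage sketch below exercises the checker on a function with a known gradient (f(x) = sum(x**2), gradient 2x); once ConvLinear or ConvMaxPool2D have working fprop/bprop, the analogous call would be verify_layer_gradient(layer, x=some_input_batch).

import numpy
from mlp.utils import verify_gradient

def quadratic(x, **kwargs):
    # **kwargs absorbs the tol that verify_gradient forwards via numerical_gradient
    return numpy.sum(x**2), 2.0 * x

x = numpy.linspace(-1.0, 1.0, 6).reshape(2, 3)
print(verify_gradient(f=quadratic, x=x, eps=1e-4, tol=1e-6))  # prints True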