Adding details and example code for Kaggle submission and accessing files on AFS.
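
A minimal standalone sketch of the submission format this commit documents (an `Id,Class` header followed by one `index,label` row per test input). It assumes only NumPy; the random `probs` array and the `predictions_to_rows` helper are hypothetical stand-ins for illustration and are not part of the notebooks.

```python
import io

import numpy as np


def predictions_to_rows(predictions):
    """Map an (n_examples, n_classes) probability array to (id, class) rows."""
    predictions = np.asarray(predictions)
    pred_classes = predictions.argmax(-1)   # most probable class per example
    ids = np.arange(pred_classes.shape[0])  # row index is used as the Id
    return np.column_stack([ids, pred_classes])


# Hypothetical stand-in for real model outputs: 4 examples, 100 classes.
rng = np.random.RandomState(0)
logits = rng.normal(size=(4, 100))
exp_logits = np.exp(logits - logits.max(-1, keepdims=True))
probs = exp_logits / exp_logits.sum(-1, keepdims=True)  # softmax over classes

rows = predictions_to_rows(probs)

# Write to an in-memory buffer rather than a file so nothing is overwritten.
buffer = io.StringIO()
np.savetxt(buffer, rows, fmt='%d', delimiter=',', header='Id,Class', comments='')
submission_text = buffer.getvalue()
print(submission_text)
```

Passing `comments=''` stops `np.savetxt` from prefixing the header line with `# `, which would otherwise not match the required `Id,Class` header.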

2017-01-25 21:26:03 +00:00
commit 773c05eabe
parent 98c63547d1
2 changed files with 280 additions and 70 deletions
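
One of the two changed notebooks describes variable-length inputs that are bucketed by segment count and padded with `NaN`. The sketch below shows one way the true number of segments per track might be recovered from such padding. The array layout `(tracks, max_segments, features)` with 25 features per segment is an assumption, the synthetic `bucket` array stands in for something like `inputs_250`, and the helper names are hypothetical.

```python
import numpy as np


def segment_lengths(padded_inputs):
    """Count non-padding segments per track.

    A segment is treated as padding when all of its features are NaN.
    """
    is_padding = np.isnan(padded_inputs).all(axis=-1)
    return (~is_padding).sum(axis=-1)


def zero_fill_padding(padded_inputs):
    """Return a copy with NaN padding replaced by zeros."""
    return np.where(np.isnan(padded_inputs), 0.0, padded_inputs)


# Synthetic stand-in for one bucket: 3 tracks, at most 250 segments,
# 25 features per segment (assumed layout).
true_lengths = [120, 200, 250]
bucket = np.full((3, 250, 25), np.nan)
for i, n in enumerate(true_lengths):
    bucket[i, :n] = 1.0  # real (non-padding) segments

lengths = segment_lengths(bucket)
filled = zero_fill_padding(bucket)
print(lengths)
```

The recovered lengths could then be used to build a mask or passed as sequence lengths to a recurrent model, so that padded segments do not contribute to the loss.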
--- a/notebooks/09a_Object_recognition_with_CIFAR-10_and_CIFAR-100.ipynb
+++ b/notebooks/09a_Object_recognition_with_CIFAR-10_and_CIFAR-100.ipynb
@ -8,6 +8,7 @@
   },
   "outputs": [],
   "source": [
+    "import os\n",
    "import tensorflow as tf\n",
    "import numpy as np\n",
    "from mlp.data_providers import CIFAR10DataProvider, CIFAR100DataProvider\n",
@ -25,7 +26,16 @@
    "\n",
    "As the name suggests, CIFAR-10 has images in 10 classes:\n",
    "\n",
-    "> airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck\n",
+    "    airplane\n",
+    "    automobile\n",
+    "    bird \n",
+    "    cat\n",
+    "    deer\n",
+    "    dog\n",
+    "    frog\n",
+    "    horse\n",
+    "    ship\n",
+    "    truck\n",
    "\n",
    "with 6000 images per class for an overall dataset size of 60000. Each image has three (RGB) colour channels and pixel dimension 32×32, corresponding to a total dimension per input image of 3×32×32=3072. For each colour channel the input values have been normalised to the range [0, 1].\n",
    "\n",
@ -120,9 +130,13 @@
    "\n",
    "Each class has 600 examples in it, giving an overall dataset size of 60000 i.e. the same as CIFAR-10.\n",
    "\n",
-    "Both CIFAR-10 and CIFAR-100 have standard splits into 50000 training examples and 10000 test examples. To avoid accidental (or purposeful...) fitting to the test set, we have used a different assignation of examples to test and training sets and only provided the inputs (and not target labels) for the 10000 examples chosen for the test set. The remaining 50000 examples have been split in to a 40000 example training dataset and a 10000 example validation dataset, each with target labels provided. If you wish to use a more complex cross-fold validation scheme you may want to combine these two portions of the dataset and define your own functions for separating out a validation set.\n",
+    "Both CIFAR-10 and CIFAR-100 have standard splits into 50000 training examples and 10000 test examples. For CIFAR-100, as there is an optional Kaggle competition (see below) scored on predictions on the test set, we have used a non-standard assignment of examples to the test and training sets and only provided the inputs (and not target labels) for the 10000 examples chosen for the test set. \n",
    "\n",
-    "Data provider classes for both CIFAR-10 and CIFAR-100 are available in the `mlp.data_providers` module. Both have similar behaviour to the `MNISTDataProvider` used extensively last semester. A `which_set` argument can be used to specify whether to return a data provided for the training dataset (`which_set='train'`) or validation dataset (`which_set='valid'`). \n",
+    "For CIFAR-10 the 10000 test set examples have labels provided: to avoid any accidental over-fitting to the test set **you should only use these for the final evaluation of your model(s)**. If you repeatedly evaluate models on the test set during model development it is easy to end up indirectly fitting to the test labels - for those who have not already read it see this [excellent cautionary note from the MLPR notes by Iain Murray](http://www.inf.ed.ac.uk/teaching/courses/mlpr/2016/notes/w2a_train_test_val.html#fnref2). \n",
+    "\n",
+    "For both CIFAR-10 and CIFAR-100, the remaining 50000 non-test examples have been split into a 40000 example training dataset and a 10000 example validation dataset, each with target labels provided. If you wish to use a more complex cross-fold validation scheme you may want to combine these two portions of the dataset and define your own functions for separating out a validation set.\n",
+    "\n",
+    "Data provider classes for both CIFAR-10 and CIFAR-100 are available in the `mlp.data_providers` module. Both have similar behaviour to the `MNISTDataProvider` used extensively last semester. A `which_set` argument can be used to specify whether to return a data provider for the training dataset (`which_set='train'`) or validation dataset (`which_set='valid'`).\n",
    "\n",
    "The CIFAR-100 data provider also takes an optional `use_coarse_targets` argument in its constructor. By default this is set to `False` and the targets returned by the data provider correspond to 1-of-K encoded binary vectors for the 100 fine-grained object classes. If `use_coarse_targets=True` then instead the data provider will return 1-of-K encoded binary vector targets for the 20 coarse-grained superclasses associated with each input instead.\n",
    "\n",
@ -133,11 +147,23 @@
    "\n",
    "### Accessing the CIFAR-10 and CIFAR-100 data\n",
    "\n",
-    "Before using the data provider objects you will need to copy the associated data files in to your local `mlp/data` directory (or wherever your `MLP_DATA_DIR` environment variable points to if different). The data is available as six compressed NumPy `.npz` files, (`cifar-10-train.npz, cifar-10-valid.npz, cifar-10-test-inputs.npz` and `cifar-100-train.npz, cifar-100-valid.npz, cifar-100-test.npz`) in the AFS directory `/afs/inf.ed.ac.uk/group/teaching/mlp/data`. Assuming your local `mlpractical` repository is in your home directory you should be able to copy the required files by running\n",
+    "Before using the data provider objects you will need to copy the associated data files into your local `mlp/data` directory (or wherever your `MLP_DATA_DIR` environment variable points to if different). The data is available as seven compressed NumPy `.npz` files\n",
+    "\n",
+    "    cifar-10-train.npz           235MB\n",
+    "    cifar-10-valid.npz            59MB\n",
+    "    cifar-10-test-inputs.npz      59MB\n",
+    "    cifar-10-test-targets.npz     10KB\n",
+    "    cifar-100-train.npz          235MB\n",
+    "    cifar-100-valid.npz           59MB\n",
+    "    cifar-100-test-inputs.npz     59MB\n",
+    "\n",
+    "in the AFS directory `/afs/inf.ed.ac.uk/group/teaching/mlp/data`. Assuming your local `mlpractical` repository is in your home directory you should be able to copy the required files by running\n",
    "\n",
    "```\n",
    "cp /afs/inf.ed.ac.uk/group/teaching/mlp/data/cifar*.npz ~/mlpractical/data\n",
-    "```"
+    "```\n",
+    "\n",
+    "As some of the files are quite large you may wish to copy only those you are using currently (e.g. only the files for one of the two tasks) to your local filespace to avoid filling up your quota. The `cifar-100-test-inputs.npz` file will only be needed by those intending to enter the associated optional Kaggle competition."
   ]
  },
  {
@ -151,7 +177,7 @@
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
-    "collapsed": true
+    "collapsed": false
   },
   "outputs": [],
   "source": [
@ -173,7 +199,7 @@
    "            [input_dim, output_dim], stddev=2. / (input_dim + output_dim)**0.5), \n",
    "        'weights')\n",
    "    biases = tf.Variable(tf.zeros([output_dim]), 'biases')\n",
-    "    outputs = tf.matmul(inputs, weights) + biases\n",
+    "    outputs = nonlinearity(tf.matmul(inputs, weights) + biases)\n",
    "    return outputs"
   ]
  },
@ -218,7 +244,7 @@
   "source": [
    "with tf.Session() as sess:\n",
    "    sess.run(init)\n",
-    "    for e in range(25):\n",
+    "    for e in range(10):\n",
    "        running_error = 0.\n",
    "        running_accuracy = 0.\n",
    "        for input_batch, target_batch in train_data:\n",
@ -306,9 +332,9 @@
   },
   "outputs": [],
   "source": [
-    "with tf.Session() as sess:\n",
-    "    sess.run(init)\n",
-    "    for e in range(25):\n",
+    "sess = tf.Session()\n",
+    "sess.run(init)\n",
+    "for e in range(10):\n",
    "    running_error = 0.\n",
    "    running_accuracy = 0.\n",
    "    for input_batch, target_batch in train_data:\n",
@ -335,6 +361,74 @@
    "        print('                 err(valid)={0:.2f} acc(valid)={1:.2f}'\n",
    "               .format(valid_error, valid_accuracy))"
   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Predicting test data classes and creating a Kaggle submission file\n",
+    "\n",
+    "An optional [Kaggle in Class](https://inclass.kaggle.com/c/mlp2016-7-cifar-100) competition (see email for invite link; you will need to sign up with an `ed.ac.uk` email address to be able to enter) is being run on the CIFAR-100 (fine-grained) classification task. The competition score is the proportion of classes correctly predicted on the test set inputs (for which no class labels are provided). Half of the 10000 test inputs are used to calculate a public leaderboard score which will be visible while the competition is in progress and the other half are used to compute the private leaderboard score which will only be unveiled at the end of the competition. Each entrant can make up to two submissions of predictions each day during the competition.\n",
+    "\n",
+    "The code and helper function below illustrate how to use the predicted outputs of the TensorFlow network model we just trained to create a submission file which can be uploaded to Kaggle. The required format of the submission file is a `.csv` (Comma Separated Values) file with two columns: the first is the integer index of the test input in the array in the provided data file (i.e. first row 0, second row 1 and so on) and the second is the corresponding predicted class label as an integer between 0 and 99 inclusive. The predictions must be preceded by a header line as in the following example\n",
+    "\n",
+    "```\n",
+    "Id,Class\n",
+    "0,81\n",
+    "1,35\n",
+    "2,12\n",
+    "...\n",
+    "```\n",
+    "\n",
+    "Integer class label predictions can be computed from the class probability outputs of the model by performing an `argmax` operation along the last dimension."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "test_inputs = np.load(os.path.join(os.environ['MLP_DATA_DIR'], 'cifar-100-test-inputs.npz'))['inputs']\n",
+    "test_predictions = sess.run(tf.nn.softmax(outputs), feed_dict={inputs: test_inputs})"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "def create_kaggle_submission_file(predictions, output_file, overwrite=False):\n",
+    "    if predictions.shape != (10000, 100):\n",
+    "        raise ValueError('predictions should be an array of shape (10000, 100).')\n",
+    "    if not (np.all(predictions >= 0.) and \n",
+    "            np.all(predictions <= 1.)):\n",
+    "        raise ValueError('predictions should be an array of probabilities in [0, 1].')\n",
+    "    if not np.allclose(predictions.sum(-1), 1):\n",
+    "        raise ValueError('predictions rows should sum to one.')\n",
+    "    if os.path.exists(output_file) and not overwrite:\n",
+    "        raise ValueError('File already exists at {0}'.format(output_file))\n",
+    "    pred_classes = predictions.argmax(-1)\n",
+    "    ids = np.arange(pred_classes.shape[0])\n",
+    "    np.savetxt(output_file, np.column_stack([ids, pred_classes]), fmt='%d',\n",
+    "               delimiter=',', header='Id,Class', comments='')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "create_kaggle_submission_file(test_predictions, 'cifar-100-example-network-submission.csv', True)"
+   ]
  }
 ],
 "metadata": {
--- a/notebooks/09b_Music_genre_classification_with_the_Million_Song_Dataset.ipynb
+++ b/notebooks/09b_Music_genre_classification_with_the_Million_Song_Dataset.ipynb
@ -8,6 +8,7 @@
   },
   "outputs": [],
   "source": [
+    "import os\n",
    "import tensorflow as tf\n",
    "import numpy as np\n",
    "from mlp.data_providers import MSD10GenreDataProvider, MSD25GenreDataProvider\n",
@ -33,15 +34,18 @@
    "\n",
    "We provide data providers for the fixed length crops versions of the input features, with the inputs being returned in batches of 3000 dimensional vectors (these can be reshaped to (120, 25) to get the per-segment features). To allow for more complex variable-length sequence modelling with for example recurrent neural networks, we also provide a variable length version of the data. This is only provided as compressed NumPy (`.npz`) data files rather than data provider objects - you will need to write your own data provider if you wish to use this version of the data. As the inputs are of variable number of segments they have been ['bucketed'](https://www.tensorflow.org/tutorials/seq2seq/#bucketing_and_padding) into groups of similar maximum length, with the following binning scheme used:\n",
    "\n",
-    "       1 - 250  segments\n",
+    "     120 - 250  segments\n",
    "     251 - 500  segments\n",
    "     501 - 650  segments\n",
    "     651 - 800  segments\n",
    "     801 - 950  segments\n",
    "     951 - 1200 segments\n",
    "    1201 - 2000 segments\n",
+    "    2001 - 4000 segments\n",
    "    \n",
-    "For each bucket the NumPy data files include inputs and targets arrays with second dimension equal to the maximum sgement size in the bucket (e.g. 250 for the bucket) and first dimension equal to the number of tracks with number of segments in that bucket. These are named `inputs_{n}` and `targets_{n}` in the data file where `{n}` is the maximal number of segments in the bucket e.g. `inputs_250` and `targets_250` for the first bucket. For tracks with less segments than the maximum size in the bucket, the features for the track have been padded with `NaN` values.\n",
+    "For each bucket the NumPy data files include inputs and targets arrays with second dimension equal to the maximum segment size in the bucket (e.g. 250 for the first bucket) and first dimension equal to the number of tracks with a number of segments in that bucket. These are named `inputs_{n}` and `targets_{n}` in the data file where `{n}` is the maximal number of segments in the bucket e.g. `inputs_250` and `targets_250` for the first bucket. For tracks with fewer segments than the maximum size in the bucket, the features for the track have been padded with `NaN` values. For tracks with more segments than the maximum bucket size of 4000, only the first 4000 segments have been included.\n",
+    "\n",
+    "To allow you to match tracks between the fixed length and variable length datasets, the data files also include an array for each bucket giving the indices of the corresponding tracks in the fixed length input arrays. For example the array `indices_250` will be an array of the same size as the first dimension of `inputs_250` and `targets_250`, with the first element of `indices_250` giving the index into the `inputs` and `targets` arrays of the fixed length data corresponding to the first element of `inputs_250` and `targets_250`.\n",
    "\n",
    "The Million Song Dataset in its original form does not provide any genre labels, however various external groups have proposed genre labels for portions of the data by cross-referencing the track IDs against external music tagging databases. Analagously to the provision of both simpler and more complex classifications tasks for the CIFAR-10 / CIFAR-100 datasets, we provide two classification task datasets derived from the Million Song Dataset - one with 10 coarser level genre classes, and another with 25 finer-grained genre / style classifications.\n",
    "\n",
@ -58,7 +62,10 @@
    "    Country\n",
    "    Reggae\n",
    "\n",
-    "For each of these 10 classes, 5000 labelled examples have been collected for training (i.e. 50000 example in total) and a further 1000 example per class which you are provided inputs but not targets for for testing, with the exception of the Blues class for which only 991 testing examples are provided due to there being insufficient labelled tracks of the minimum required length (i.e. a total of 9991 test examples).\n",
+    "For each of these 10 classes, 5000 labelled examples have been collected for training / validation (i.e. 50000 examples in total) and a further 1000 examples per class for testing, with the exception of the `Blues` class for which only 991 testing examples are provided due to there being insufficient labelled tracks of the minimum required length (i.e. a total of 9991 test examples). \n",
+    "\n",
+    "The 9991 test set examples have labels provided: however to avoid any accidental over-fitting to the test set **you should only use these for the final evaluation of your model(s)**. If you repeatedly evaluate models on the test set during model development it is easy to end up indirectly fitting to the test labels - for those who have not already read it see this [excellent cautionary note in the MLPR notes by Iain Murray](http://www.inf.ed.ac.uk/teaching/courses/mlpr/2016/notes/w2a_train_test_val.html#fnref2). \n",
+    "\n",
    "\n",
    "The 25-genre classification tasks uses the [*MSD Allmusic Style Dataset*](http://www.ifs.tuwien.ac.at/mir/msd/MASD.html) labels derived from the [AllMusic.com](http://www.allmusic.com/) database by [Alexander Schindler, Rudolf Mayer and Andreas Rauber of Vienna University of Technology](http://www.ifs.tuwien.ac.at/~schindler/pubs/ISMIR2012.pdf). The 25 genre / style labels used are:\n",
    "\n",
@ -88,13 +95,53 @@
    "    Rock Hard\n",
    "    Rock Neo Psychedelia\n",
    "    \n",
-    "For each of these 25 classes, 2000 labelled examples have been collected for training (i.e. 50000 example in total) and a further 400 example per class which you are provided inputs but not targets for for testing (i.e. 10000 examples in total). The tracks used for the 25-genre classification task only partially overlap with those used for the 10-genre classification task and we do not provide any mapping between the two.\n",
+    "For each of these 25 classes, 2000 labelled examples have been collected for training / validation (i.e. 50000 examples in total). A further 400 examples per class have been collected for testing (i.e. 10000 examples in total), for which you are provided with inputs but not targets. The optional Kaggle competition being run for this dataset (see email) is scored based on the 25-genre class label predictions on these unlabelled test inputs. \n",
    "\n",
-    "The 50000 labelled examples provided for each of the two tasks have been split in to a 40000 example training dataset and a 10000 example validation dataset, each with target labels provided. If you wish to use a more complex cross-fold validation scheme you may want to combine these two portions of the dataset and define your own functions / classes for separating out a validation set.\n",
+    "The tracks used for the 25-genre classification task only partially overlap with those used for the 10-genre classification task and we do not provide any mapping between the two.\n",
+    "\n",
+    "For each of the two tasks, the 50000 examples collected for training have been pre-split into a 40000 example training dataset and a 10000 example validation dataset. If you wish to use a more complex cross-fold validation scheme you may want to combine these two portions of the dataset and define your own functions / classes for separating out a validation set.\n",
    "\n",
    "Data provider classes for both fixed length input data for the 10 and 25 genre classification tasks in the `mlp.data_providers` module as `MSD10GenreDataProvider` and `MSD25GenreDataProvider`. Both have similar behaviour to the `MNISTDataProvider` used extensively last semester. A `which_set` argument can be used to specify whether to return a data provided for the training dataset (`which_set='train'`) or validation dataset (`which_set='valid'`).  Both data provider classes provide a `label_map` attribute which is a list of strings which are the class labels corresponding to the integer targets (i.e. prior to conversion to a 1-of-K encoded binary vector).\n",
    "\n",
-    "Below example code is given for creating instances of the 10-genre and 25-genre fixed-length input data provider objects and using them to train simple two-layer feedforward network models with rectified linear activations in TensorFlow."
+    "The corresponding variable length input data are included as data files `msd-10-genre-train-var-length.npz`, `msd-10-genre-valid-var-length.npz`.\n",
+    "\n",
+    "Below example code is given for creating instances of the 10-genre and 25-genre fixed-length input data provider objects and using them to train simple two-layer feedforward network models with rectified linear activations in TensorFlow.\n",
+    "\n",
+    "### Accessing the Million Song Dataset data\n",
+    "\n",
+    "Before using the data provider objects you will need to copy the associated data files into your local `mlp/data` directory (or wherever your `MLP_DATA_DIR` environment variable points to if different). \n",
+    "\n",
+    "The fixed length input data and associated targets are available as compressed NumPy `.npz` files\n",
+    "\n",
+    "    msd-10-genre-train.npz          210MB\n",
+    "    msd-10-genre-valid.npz           53MB\n",
+    "    msd-10-genre-test-inputs.npz     53MB\n",
+    "    msd-10-genre-test-targets.npz   5.2KB\n",
+    "    msd-25-genre-train.npz          210MB\n",
+    "    msd-25-genre-valid.npz           53MB\n",
+    "    msd-25-genre-test-inputs.npz     53MB\n",
+    "\n",
+    "in the AFS directory `/afs/inf.ed.ac.uk/group/teaching/mlp/data`. Assuming your local `mlpractical` repository is in your home directory you should be able to copy the required files by running\n",
+    "\n",
+    "```\n",
+    "cp /afs/inf.ed.ac.uk/group/teaching/mlp/data/msd-*-train.npz ~/mlpractical/data\n",
+    "cp /afs/inf.ed.ac.uk/group/teaching/mlp/data/msd-*-valid.npz ~/mlpractical/data\n",
+    "cp /afs/inf.ed.ac.uk/group/teaching/mlp/data/msd-*-test-*.npz ~/mlpractical/data\n",
+    "```\n",
+    "\n",
+    "As some of the files are quite large you may wish to copy only those you are using (e.g. only the files for one of the two tasks) to your local filespace to avoid filling up your quota. The `msd-25-genre-test-inputs.npz` file will only be needed by those intending to enter the associated optional Kaggle competition.\n",
+    "\n",
+    "In addition to the fixed length input files there are also corresponding files with the variable length input data\n",
+    "\n",
+    "    msd-10-genre-train-var-length.npz          1.6GB\n",
+    "    msd-10-genre-valid-var-length.npz          403MB\n",
+    "    msd-10-genre-test-inputs-var-length.npz    403MB\n",
+    "    msd-10-genre-test-targets-var-length.npz   3.1KB\n",
+    "    msd-25-genre-train-var-length.npz          1.5GB\n",
+    "    msd-25-genre-valid-var-length.npz          367MB\n",
+    "    msd-25-genre-test-inputs-var-length.npz    363MB\n",
+    "    \n",
+    "As you can see, some of these files, particularly the training sets, are very large, so you will likely need to be careful when copying them to your filespace to make sure you have sufficient quota available."
   ]
  },
  {
@ -130,7 +177,7 @@
    "            [input_dim, output_dim], stddev=2. / (input_dim + output_dim)**0.5), \n",
    "        'weights')\n",
    "    biases = tf.Variable(tf.zeros([output_dim]), 'biases')\n",
-    "    outputs = tf.matmul(inputs, weights) + biases\n",
+    "    outputs = nonlinearity(tf.matmul(inputs, weights) + biases)\n",
    "    return outputs"
   ]
  },
@ -175,7 +222,7 @@
   "source": [
    "with tf.Session() as sess:\n",
    "    sess.run(init)\n",
-    "    for e in range(25):\n",
+    "    for e in range(10):\n",
    "        running_error = 0.\n",
    "        running_accuracy = 0.\n",
    "        for input_batch, target_batch in train_data:\n",
@ -263,9 +310,9 @@
   },
   "outputs": [],
   "source": [
-    "with tf.Session() as sess:\n",
-    "    sess.run(init)\n",
-    "    for e in range(25):\n",
+    "sess = tf.Session()\n",
+    "sess.run(init)\n",
+    "for e in range(10):\n",
    "    running_error = 0.\n",
    "    running_accuracy = 0.\n",
    "    for input_batch, target_batch in train_data:\n",
@ -292,6 +339,75 @@
    "        print('                 err(valid)={0:.2f} acc(valid)={1:.2f}'\n",
    "               .format(valid_error, valid_accuracy))"
   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Predicting test data classes and creating a Kaggle submission file\n",
+    "\n",
+    "An optional [Kaggle in Class](https://inclass.kaggle.com/c/mlp2016-7-msd-genre) competition (see email for invite link; you will need to sign up with an `ed.ac.uk` email address to be able to enter) is being run on the 25 genre classification task. The competition score is the proportion of classes correctly predicted on the test set inputs (for which no class labels are provided). Half of the 10000 test inputs are used to calculate a public leaderboard score which will be visible while the competition is in progress and the other half are used to compute the private leaderboard score which will only be unveiled at the end of the competition. Each entrant can make up to two submissions of predictions each day during the competition.\n",
+    "\n",
+    "The code and helper function below illustrate how to use the predicted outputs of the TensorFlow network model we just trained to create a submission file which can be uploaded to Kaggle. The required format of the submission file is a `.csv` (Comma Separated Values) file with two columns: the first is the integer index of the test input in the array in the provided data file (i.e. first row 0, second row 1 and so on) and the second is the corresponding predicted class label as an integer between 0 and 24 inclusive. The predictions must be preceded by a header line as in the following example\n",
+    "\n",
+    "```\n",
+    "Id,Class\n",
+    "0,12\n",
+    "1,24\n",
+    "2,9\n",
+    "...\n",
+    "```\n",
+    "\n",
+    "Integer class label predictions can be computed from the class probability outputs of the model by performing an `argmax` operation along the last dimension."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "test_inputs = np.load(os.path.join(os.environ['MLP_DATA_DIR'], 'msd-25-genre-test-inputs.npz'))['inputs']\n",
+    "test_inputs = test_inputs.reshape((test_inputs.shape[0], -1))\n",
+    "test_predictions = sess.run(tf.nn.softmax(outputs), feed_dict={inputs: test_inputs})"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "def create_kaggle_submission_file(predictions, output_file, overwrite=False):\n",
+    "    if predictions.shape != (10000, 25):\n",
+    "        raise ValueError('predictions should be an array of shape (10000, 25).')\n",
+    "    if not (np.all(predictions >= 0.) and \n",
+    "            np.all(predictions <= 1.)):\n",
+    "        raise ValueError('predictions should be an array of probabilities in [0, 1].')\n",
+    "    if not np.allclose(predictions.sum(-1), 1):\n",
+    "        raise ValueError('predictions rows should sum to one.')\n",
+    "    if os.path.exists(output_file) and not overwrite:\n",
+    "        raise ValueError('File already exists at {0}'.format(output_file))\n",
+    "    pred_classes = predictions.argmax(-1)\n",
+    "    ids = np.arange(pred_classes.shape[0])\n",
+    "    np.savetxt(output_file, np.column_stack([ids, pred_classes]), fmt='%d',\n",
+    "               delimiter=',', header='Id,Class', comments='')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "create_kaggle_submission_file(test_predictions, 'msd-25-example-network-submission.csv', True)"
+   ]
  }
 ],
 "metadata": {