From 1d4587bd421ddb593d3a8b59c1c15fa89bffc4e2 Mon Sep 17 00:00:00 2001
From: pswietojanski
Date: Mon, 5 Oct 2015 09:20:14 +0100
Subject: [PATCH] 2nd lab
---
01_Linear_Models.ipynb | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/01_Linear_Models.ipynb b/01_Linear_Models.ipynb
index 7fc5611..7eea371 100644
--- a/01_Linear_Models.ipynb
+++ b/01_Linear_Models.ipynb
@@ -432,7 +432,7 @@
"source": [
"# Iterative learning of linear models\n",
"\n",
- "We will learn the model with stochastic gradient descent using mean square error (MSE) loss function, which is defined as follows:\n",
+ "We will learn the model with stochastic gradient descent over $N$ data-points, using the mean square error (MSE) loss function, which is defined as follows:\n",
"\n",
"(5) $\n",
"E = \\frac{1}{2} \\sum_{n=1}^N ||\\mathbf{y}^n - \\mathbf{t}^n||^2 = \\sum_{n=1}^N E^n \\\\\n",
@@ -444,7 +444,8 @@
"Hence, the gradient w.r.t. (with respect to) the $r$-th output $y_r$ of the model is given by the so-called delta function $\\delta_r$: \n",
"\n",
"(8) $\\frac{\\partial{E^n}}{\\partial{y_{r}}} = (y^n_r - t^n_r) = \\delta^n_r \\quad ; \\quad\n",
- " \\delta^n_r = y^n_r - t^n_r \n",
+ " \\delta^n_r = y^n_r - t^n_r \\\\\n",
+ " \\frac{\\partial{E}}{\\partial{y_{r}}} = \\sum_{n=1}^N \\frac{\\partial{E^n}}{\\partial{y_{r}}} = \\sum_{n=1}^N \\delta^n_r\n",
"$\n",
"\n",
"Similarly, using the above $\\delta^n_r$ one can express the gradient of the weight $w_{sr}$ (from the $s$-th input to the $r$-th output) for the linear model and the MSE cost as follows:\n",
@@ -518,7 +519,6 @@
"\n",
"def fprop(x, W, b):\n",
" #code implementing eq. (3)\n",
- " #return: y\n",
" raise NotImplementedError('Write me!')\n",
"\n",
"def cost(y, t):\n",
@@ -571,8 +571,8 @@
" #4. Update the model: we update with the mean gradient\n",
" #   over the minibatch, rather than the sum of the individual gradients\n",
" #   in the minibatch; to do so we scale the learning rate by batch_size\n",
- " mb_size = x.shape[0]\n",
- " effect_learn_rate = learning_rate / mb_size\n",
+ " batch_size = x.shape[0]\n",
+ " effect_learn_rate = learning_rate / batch_size\n",
"\n",
" W = W - effect_learn_rate * grad_W\n",
" b = b - effect_learn_rate * grad_b\n",
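For reference, the training step that the hunks above edit can be sketched end to end. This is a minimal NumPy sketch, not the lab's solution: `fprop` and `cost` are the skeleton functions named in the notebook (here filled in with plain implementations of eqs. (3) and (5)), while `sgd_step` is a hypothetical helper that combines the delta of eq. (8), the weight gradient, and the mean-gradient update with the `batch_size`-scaled learning rate from the final hunk.

```python
import numpy as np

def fprop(x, W, b):
    # eq. (3): affine forward pass, y = xW + b
    # x: (batch_size, n_inputs), W: (n_inputs, n_outputs), b: (n_outputs,)
    return np.dot(x, W) + b

def cost(y, t):
    # eq. (5): MSE summed over the minibatch, E = 1/2 * sum_n ||y^n - t^n||^2
    return 0.5 * np.sum((y - t) ** 2)

def sgd_step(x, t, W, b, learning_rate):
    # hypothetical helper: one SGD update on a single minibatch
    y = fprop(x, W, b)
    delta = y - t                   # eq. (8): delta^n_r = y^n_r - t^n_r
    grad_W = np.dot(x.T, delta)     # gradient w.r.t. w_{sr}, summed over the batch
    grad_b = delta.sum(axis=0)
    # update with the mean gradient: scale the learning rate by batch_size
    batch_size = x.shape[0]
    effect_learn_rate = learning_rate / batch_size
    W = W - effect_learn_rate * grad_W
    b = b - effect_learn_rate * grad_b
    return W, b, cost(y, t)
```

On synthetic linear data, repeated calls to `sgd_step` should drive the cost down, which is a quick sanity check that the gradients match the derivation.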