minor modifications to lab 5
This commit is contained in:
parent
73f894493a
commit
d84341fbc5
@ -181,7 +181,7 @@
|
||||
"\n",
|
||||
"Implementation tips: To back-propagate through the maxout layer, one needs to keep track of which linear activation $a_{j}, a_{j+1}, \\ldots, a_{j+K}$ was the maximum in each pool. The convenient way to do so is by storing the indices of the maximum units in the fprop function and then in the backprop stage pass the gradient only through those (i.e. for example, one can build an auxiliary matrix where each element is either 1 (if unit was maximum, and passed forward through the max operator for a given data-point) or 0 otherwise. Then in the backward pass it suffices to upsample the maxout *igrads* signal to the linear layer dimension and element-wise multiply by the aforemenioned auxiliary matrix.\n",
|
||||
"\n",
|
||||
"*Optional:* Implement the generic pooling mechanism by introducing an additional *stride* hyper-parameter $0<S\\leq K$. It specifies how many units you move to build the next pool. For instance, for non-overlapping pooling with $S=K=3$ one would build the first two maxout units as: $h_1=\\max(a_1,a_2,a_3)$ and $h_2=\\max(a_4,a_5,a_6)$. However, when setting $S=1$ the pools will share some subset of activations: $h_1=\\max(a_1,a_2,a_3)$ and $h_2=\\max(a_2,a_3,a_4)$."
|
||||
"*Optional:* Implement the generic pooling mechanism by introducing an additional *stride* hyper-parameter $0<S\\leq K$. It specifies how many units you move to build the next pool. For instance, for non-overlapping pooling with $S=K=3$ one would build the first two maxout units as: $h_1=\\max(a_1,a_2,a_3)$ and $h_2=\\max(a_4,a_5,a_6)$. However, after setting $S=1$ the pools should share some subset of linear activations: $h_1=\\max(a_1,a_2,a_3)$ and $h_2=\\max(a_2,a_3,a_4)$."
|
||||
]
|
||||
},
|
||||
{
|
||||
|
Loading…
Reference in New Issue
Block a user