diff --git a/05_Transfer_functions.ipynb b/05_Transfer_functions.ipynb
index 98c4030..c9ec3b8 100644
--- a/05_Transfer_functions.ipynb
+++ b/05_Transfer_functions.ipynb
@@ -181,7 +181,7 @@
 "\n",
 "Implementation tips: To back-propagate through the maxout layer, one needs to keep track of which linear activation $a_{j}, a_{j+1}, \ldots, a_{j+K}$ was the maximum in each pool. A convenient way to do so is to store the indices of the maximum units in the fprop function and then, in the backprop stage, pass the gradient only through those (for example, one can build an auxiliary matrix in which each element is 1 if the corresponding unit was the maximum, and hence was passed forward through the max operator for a given data-point, and 0 otherwise). In the backward pass it then suffices to upsample the maxout *igrads* signal to the linear layer dimension and element-wise multiply it by the aforementioned auxiliary matrix.\n",
 "\n",
-"*Optional:* Implement the generic pooling mechanism by introducing an additional *stride* hyper-parameter $0
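
As a concrete illustration of the masking trick described in the implementation tips above, here is a minimal numpy sketch of a maxout forward/backward pair. It assumes non-overlapping pools of size `K` over the linear pre-activations; the function names `maxout_fprop`/`maxout_bprop` and the mask layout are illustrative and not taken from the notebook's own framework.

```python
import numpy as np

def maxout_fprop(a, K):
    """Forward pass: non-overlapping max over groups of K linear units.

    a : array of shape (batch, num_pools * K) -- linear pre-activations.
    Returns the maxout outputs (batch, num_pools) and a 0/1 mask with the
    same shape as `a` marking which unit won each pool.
    """
    batch, n = a.shape
    pools = a.reshape(batch, n // K, K)
    max_idx = pools.argmax(axis=2)            # index of the winner in each pool
    h = pools.max(axis=2)                     # maxout activations
    mask = np.zeros_like(pools)
    rows = np.arange(batch)[:, None]          # broadcast over pools
    cols = np.arange(n // K)[None, :]         # broadcast over batch
    mask[rows, cols, max_idx] = 1.0           # 1 where the unit was the maximum
    return h, mask.reshape(batch, n)

def maxout_bprop(igrads, mask, K):
    """Backward pass: route gradients only through the winning units.

    igrads : gradients w.r.t. the maxout outputs, shape (batch, num_pools).
    """
    upsampled = np.repeat(igrads, K, axis=1)  # upsample to the linear-layer dim
    return upsampled * mask                   # gate by the stored 0/1 mask

# Example: batch of 2, three pools of K=2 linear units each
a = np.random.randn(2, 6)
h, mask = maxout_fprop(a, K=2)
grads_wrt_a = maxout_bprop(np.ones_like(h), mask, K=2)
```

Storing `mask` during fprop avoids recomputing the argmax in bprop: `np.repeat` copies each pool's gradient to its `K` linear units, and the element-wise multiply zeroes every entry except the one belonging to the winning unit, exactly as the tips above describe.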