Lectures 6 & 7: Training Neural Networks
27 Sep 2018

Activation Functions

Sigmoid
Advantages: maps (-inf, inf) to (0, 1); a good model of the saturating "firing rate" of a neuron
Disadvantages: a saturated neuron's gradient is nearly 0: it stops gradient dis...
Convolutional Network: a NN using convolutional layers. We want to preserve the spatial structure of the image, but a fully connected NN cannot do that: it flattens the input rather than preserving the (horizontal * vertical * channel) structure. For a 32 * 32 * 3 image, <...
Chain rule of derivatives: get each partial derivative by multiplying local derivatives along the chain.
add gate: gradient distributor
max gate: gradient router
mul gate: gradient switcher (scaler)
1/x gate, exp gate, or l...
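The gate behaviors above can be traced on a tiny computational graph, f = (x + y) * z, plus a max gate (variable names are my own):

```python
# forward pass
x, y, z = -2.0, 5.0, -4.0
q = x + y          # add gate: q = 3.0
f = q * z          # mul gate: f = -12.0

# backward pass, starting from df/df = 1
df_dq = z          # mul gate "switches": gradient on q is the other input, z
df_dz = q          # ...and gradient on z is q (scaled by the upstream 1)
df_dx = df_dq * 1.0  # add gate distributes the upstream gradient unchanged
df_dy = df_dq * 1.0

print(df_dx, df_dy, df_dz)  # -4.0 -4.0 3.0

# max gate routes the gradient to the larger input only
a, b = 2.0, -1.0
m = max(a, b)
dm_da = 1.0 if a >= b else 0.0  # gets the full upstream gradient
dm_db = 1.0 if b > a else 0.0   # gets zero
print(dm_da, dm_db)  # 1.0 0.0
```

Note the mul gate's consequence: if one input is large, the other input's gradient is scaled up by it, which is one reason input normalization matters.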