
    Lecture 4: Introduction to Neural Networks

    Tags:   DeepLearning    CS231n   
    • Chain rule of derivatives: compute each partial derivative by multiplying local gradients along the chain of operations (see the scalar sketch after this list)
      • add gate: gradient distributor (passes the upstream gradient to both inputs unchanged)
      • max gate: gradient router (routes the full gradient to the larger input, zero to the other)
      • mul gate: gradient switcher/scaler (each input's gradient is the upstream gradient scaled by the other input's value)
      • 1/x gate
      • exp gate or log gate
      • sigmoid gate
      • when calculating in vectorized form, the gradient is a Jacobian; the gradient of the loss with respect to a variable always has the same shape as that variable (e.g., dW has the same shape as W)
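
    A minimal scalar sketch of how these gate rules play out, using the classic expression f(x, y, z) = (x + y) * z (the input values below are illustrative, not taken from the lecture):

    ```python
    # Backprop by hand through f(x, y, z) = (x + y) * z.
    x, y, z = -2.0, 5.0, -4.0

    # forward pass
    q = x + y              # add gate
    f = q * z              # mul gate

    # backward pass: chain rule, starting from df/df = 1.0
    df = 1.0
    dz = q * df            # mul gate: gradient = the OTHER input * upstream gradient
    dq = z * df
    dx = 1.0 * dq          # add gate: passes the upstream gradient unchanged
    dy = 1.0 * dq
    print(f, dx, dy, dz)   # -12.0 -4.0 -4.0 3.0

    # max gate: routes the full upstream gradient to the winning input only
    m = max(x, y)                    # y wins (5.0 > -2.0)
    dm = 2.0                         # some upstream gradient
    dx_max = dm if x > y else 0.0    # 0.0
    dy_max = dm if y >= x else 0.0   # 2.0
    ```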

    (Figures: a simple example of the chain rule, the chain rule in vector form, and the sigmoid function)
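
    As a sketch, the sigmoid gate can be backpropagated in one step using the identity dσ/dx = σ(x)(1 - σ(x)), rather than chaining the add, exp, and 1/x gates one at a time (the inputs below are illustrative):

    ```python
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    x = np.array([0.5, -1.0, 2.0])     # illustrative inputs
    s = sigmoid(x)                     # forward pass

    dupstream = np.ones_like(x)        # stand-in upstream gradient
    dx = (s * (1.0 - s)) * dupstream   # sigmoid's local gradient is s * (1 - s)
    ```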

    • Backpropagation: after the forward pass (prediction) and the loss computation, compute each partial derivative by applying the chain rule backward through the graph, then modify the parameters using these gradients. Iterate forward propagation and backward propagation to minimize the loss.

    (Figure: example of forward and backward propagation)
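
    A minimal sketch of that loop for a single linear layer with a squared-error loss (the data, sizes, and learning rate are made up for illustration):

    ```python
    import numpy as np

    np.random.seed(0)
    X = np.random.randn(5, 3)          # 5 toy examples, 3 features (made up)
    y = np.random.randn(5, 1)          # toy regression targets
    W = 0.01 * np.random.randn(3, 1)   # parameters to learn

    lr = 1e-2
    for step in range(200):
        # forward propagation: prediction and loss
        pred = X.dot(W)                      # (5, 1)
        loss = np.mean((pred - y) ** 2)

        # backward propagation: chain rule back to W
        dpred = 2.0 * (pred - y) / len(y)    # dloss/dpred
        dW = X.T.dot(dpred)                  # dloss/dW, same shape as W

        # modify the parameters with the gradient, then repeat
        W -= lr * dW
    ```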

    • Neural Network: stack multiple linear (matrix-multiplication) layers with activation layers in between, so that the model gains non-linearity and can represent a variety of templates.

    (Figures: how a neuron actually works, and common activation functions)
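
    A small sketch of stacking two linear layers with a ReLU activation in between (the layer sizes are arbitrary); without the activation, the two matrix multiplications would collapse into a single linear map:

    ```python
    import numpy as np

    np.random.seed(0)
    x = np.random.randn(1, 4)            # one input with 4 features (arbitrary)
    W1 = 0.01 * np.random.randn(4, 10)   # first linear layer
    W2 = 0.01 * np.random.randn(10, 3)   # second linear layer

    h = np.maximum(0, x.dot(W1))         # ReLU activation supplies the non-linearity
    scores = h.dot(W2)                   # without ReLU this equals x.dot(W1.dot(W2)): one linear layer
    ```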