BuildOurOwnRepublic blog rpblic

    Lecture 1 Introduction to Convolutional Neural Networks for Visual Recognition

    Tags:   DeepLearning    CS231n   

    Range of Computer Vision

    Brief history of computer vision

    • Biological vision: visual sensory systems first evolved about 540 million years ago.
      • In humans, about 50% of the neurons in the cortex are involved in visual processing.
    • Mechanical vision: the camera obscura, 16th century (Renaissance period)
      • Studying the mechanism of vision: Hubel & Wiesel, 1959, found that visual processing starts with simple structures of the visual world, and that the brain builds up complexity as information moves along the visual processing pathway
      • Larry Roberts, 1963, one of the first theses in computer vision: simplifies the visual world into simple geometric shapes in order to recognize and reconstruct their shape and structure
      • David Marr, late 1970s, defines the vision process as Input Image » Primal Sketch » 2.5-D Sketch » 3-D Model Representation
      • For early, primitive computers to understand visual structure, objects had to be reduced to simple structures.
        • Brooks & Binford, 1979, Generalized Cylinder Model
        • Fischler and Elschlager, 1973, Pictorial Structure Model
        • David Lowe, 1987, Lines & Edges Model
      • Shi & Malik, 1997, Normalized Cut: using a graph theory algorithm for image segmentation
      • Viola & Jones, 2001, face detection using AdaBoost
      • David Lowe, 1999, SIFT features: finding diagnostic, invariant features to recognize objects
      • Lazebnik, Schmid & Ponce, 2006, Spatial Pyramid Matching: feature descriptors give clues about which type of scene the image shows, and an SVM then classifies the scene
      • Recognizing the human body: Dalal & Triggs, 2005, Histogram of Oriented Gradients (HOG) / Felzenszwalb, McAllester, Ramanan, 2009, Deformable Part Model
      • LeCun et al., 1998, convolutional neural network (CNN) for MNIST digit classification
        • The amount of computation matters (10^6 transistors to 10^9 transistors, plus GPUs)
        • The amount of data matters (10^7 to 10^14 pixels of training data)
      • Benchmark datasets for evaluating visual recognition
        • PASCAL Visual Object Challenge
        • IMAGENET: labels ALL the categories we could see in the real world, with a HUGE amount of data. (With so many categories, overfitting could become a bottleneck for almost every ML algorithm. How, then, could we recognize every image?)
          [Figure: IMAGENET classification progress]