> Applied Deep Learning w/ PyImageSearch

March 25, 2021


Project Information
PyImageSearch (Applied Deep Learning)


  • Finished Deep Learning for Computer Vision with Python (Starter + Practitioner) by Adrian Rosebrock



I worked through this book mainly to get experience building tangible examples of applied machine learning techniques, and to better understand how to preprocess data and tune parameters. I've already taken many classes that introduced the theoretical concepts of machine learning/AI, but I always felt I was lacking on the applied side, so this book was a great opportunity to close that gap.

Scripted in Python. Models created via Keras.

Code executed on Google Colab (using GPU on larger datasets).

Scratch Notes

  1. Since I've already encountered most of the topics before, I only jotted down the ones that stood out to me. These notes are NOT detailed, nor do they cover all the topics introduced in the book. I would highly recommend supporting the author and purchasing the book yourself if you are interested in learning more about the content. I'm including the notes here mostly for personal documentation purposes (readability is not guaranteed). View my very rough scratch notes here (starter) and here (practitioner).
  2. Takeaways (Starter):
    • Use batch normalization (stabilizes loss, but longer training time)
    • Use a training monitor to note when the model starts diverging
    • CONTROL the generalization gap between training and testing loss!
    • Pooling is used for down-sampling; convolution filters are used for learning features.
    • Dropout, low learning rate, regularization = prevent overfitting.
    • Use a high learning rate in the beginning to quickly learn "good" weights, then decay/drop the learning rate towards the end of training to prevent overfitting.
    • Activation layer ALWAYS follows a CONV layer (even if it's not specified in the paper's NN diagram)
    • FC layers are usually only used AT THE END. Need to use CONV layers in the beginning/middle of the NN to learn local features.
    • Image classification uses cross-entropy loss with a softmax activation at the end (sigmoid for binary classification).
    • Most networks use ReLU nowadays (except for papers written before the ReLU boom, which tended to use sigmoid or tanh)
    • Use larger filters at the beginning of the network, and smaller filters at the end.
    • Understand the input/output size calculation between nodes + make sure you preprocess the data correctly.
    • Visualize the system architecture to make sure the network is outputting the dimensions you think it is.
    • Use weight checkpointing and validation loss monitoring to keep track of how your training is going.
    • Use weights from pretrained networks to classify your data, or as a base to build on top of to increase accuracy.
    • Obtaining and labeling a dataset can be half (if not more) of the battle! Try to use traditional CV techniques (e.g., OpenCV) to speed up the labeling process. Oftentimes, the DATA is more important than the NETWORK itself!
    • Keep datasets in the format: /root/classname/image_filename.jpg
  3. Takeaways (Practitioner):
    • Use data augmentation as a form of regularization to perturb image inputs while keeping the same label (especially good for smaller datasets)
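    • Rank-1 vs. Rank-5 accuracy: the fraction of data points whose ground-truth label is the top prediction (rank-1) or among the top 5 predictions (rank-5), i.e. # ground-truth labels within the top 1 or 5 / # data points.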
    • Transfer Learning: using a pre-trained model as a "shortcut" to learn patterns in data it was not originally trained on. Two types: (1) networks as feature extractors, (2) fine-tuning
    • Use CNNs as feature extractors (e.g., extract features from VGG16, ResNet, etc.) and feed those features to your own model, which increases accuracy. Usually better than hand-engineered features like SIFT, HOG, etc.
    • Fine-tuning = network surgery: remove the FC head of the trained network and attach a new FC head. Freeze the original layers and train the new head, then unfreeze and keep training the modified model. This leverages pre-existing network architectures to jump-start the learning process, and leads to higher-accuracy transfer learning.
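
For the starter note about understanding the input/output size calculation, here is a minimal sketch of the standard formula for one spatial dimension of a CONV or POOL layer: output = (input - kernel + 2 * padding) / stride + 1, rounded down. The function name `conv_output_size` and the toy layer stack are my own illustration, not code from the book.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output size along one spatial dimension of a CONV or POOL layer."""
    span = input_size - kernel_size + 2 * padding
    if span < 0:
        raise ValueError("kernel is larger than the padded input")
    return span // stride + 1


# Hypothetical layer stack: (name, kernel, stride, padding) on a 32x32 input.
layers = [
    ("conv3x3", 3, 1, 1),  # "same"-style padding keeps the size
    ("pool2x2", 2, 2, 0),  # halves the size
    ("conv3x3", 3, 1, 1),
    ("pool2x2", 2, 2, 0),
]

size = 32
sizes = []
for name, kernel, stride, padding in layers:
    size = conv_output_size(size, kernel, stride, padding)
    sizes.append(size)
    print(f"{name}: {size}x{size}")
```

Tracing the sizes by hand like this is a cheap sanity check before comparing against what the framework's model summary reports.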
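
For the starter note about starting with a high learning rate and dropping it later in training, one common way to do this is a step decay schedule. The sketch below is dependency-free; the name `step_decay` and the values (initial rate 0.01, halving every 5 epochs) are assumptions I picked for illustration, not settings from the book.

```python
def step_decay(epoch, *, initial_lr=0.01, factor=0.5, drop_every=5):
    """Learning rate for a given (0-indexed) epoch under step decay.

    The rate is multiplied by `factor` once every `drop_every` epochs.
    """
    num_drops = epoch // drop_every
    return initial_lr * (factor ** num_drops)


# Print the schedule for the first 15 epochs (illustrative values only).
schedule = [step_decay(epoch) for epoch in range(15)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch:2d}: lr = {lr:.5f}")
```

A function of this shape (epoch index in, learning rate out) is the kind of schedule that Keras's `LearningRateScheduler` callback accepts; wiring it into a training run is left out here to keep the sketch self-contained.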
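
For the starter note about keeping datasets in the /root/classname/image_filename.jpg layout, the payoff is that the label can be read straight off the path. A minimal sketch, assuming POSIX-style paths; the helper name `label_from_path` and the example file names are hypothetical.

```python
from pathlib import PurePosixPath


def label_from_path(image_path):
    """Return the class name, i.e. the directory that directly holds the image."""
    return PurePosixPath(image_path).parent.name


# Hypothetical image paths following /root/classname/image_filename.jpg
paths = [
    "/root/cats/cat_001.jpg",
    "/root/dogs/dog_017.jpg",
    "/root/cats/cat_042.jpg",
]

labels = [label_from_path(p) for p in paths]
classes = sorted(set(labels))  # stable class ordering for integer encoding
encoded = [classes.index(label) for label in labels]

print(labels)
print(classes)
print(encoded)
```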
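
For the practitioner note on rank-1 vs. rank-5 accuracy, here is a minimal pure-Python sketch of the calculation. The function name `rank_k_accuracy` and the toy scores are made up for illustration; with only three classes the example uses k=1 and k=2, but rank-5 is the same call with k=5.

```python
def rank_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-class score lists, one list per sample
    labels: list of ground-truth class indices, one per sample
    """
    if not labels or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and the same length")
    hits = 0
    for class_scores, truth in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(class_scores)),
                        key=lambda i: class_scores[i], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy predictions over 3 classes (made-up numbers).
scores = [
    [0.7, 0.2, 0.1],  # truth 0: top prediction
    [0.3, 0.5, 0.2],  # truth 0: second-best prediction
    [0.1, 0.3, 0.6],  # truth 0: last
    [0.2, 0.1, 0.7],  # truth 2: top prediction
]
labels = [0, 0, 0, 2]

rank1 = rank_k_accuracy(scores, labels, k=1)
rank2 = rank_k_accuracy(scores, labels, k=2)
print(f"rank-1: {rank1:.2f}, rank-2: {rank2:.2f}")
```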

Lessons Learned

  1. Machine learning is part SCIENCE, part ART:
    • Science = understanding the math/theory behind how NNs work.
    • Art = experience + intelligent guess and check.
  2. Don't feel demotivated if your neural network doesn't work out the first time! Researchers will usually spend HOURS training and tuning parameters to get the accuracies we see in papers (even for very simple networks).
  3. TAKE NOTES between training attempts to remember what worked and what didn't!