A Deep Dive into CNNs

  • Machine Learning

  • Computer Vision

Mentors:

  • Aakriti

  • Akshit Srivastava

Mentees:

  • 10-15 students

In this project, mentees will learn about four famous Convolutional Neural Network (CNN) architectures: AlexNet, VGGNet, ResNet and GoogLeNet.

All of these are classic deep neural networks that performed exceptionally well in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in different years (AlexNet in 2012, VGGNet and GoogLeNet in 2014, ResNet in 2015) and have served as backbones for many computer vision tasks.

The project will cover the fundamental concepts and algorithms of machine learning, which mentees will use to analyze the salient features of these architectures and to identify the cases where one model outperforms the others. After a basic implementation of these architectures in PyTorch, students will move on to using transfer learning to train deeper models on bigger datasets for classification and localization.
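To give a flavour of what a "basic implementation" in PyTorch might look like, here is a minimal sketch of one residual block, the building unit of ResNet. The class name `BasicResidualBlock`, the channel counts and the input size are illustrative assumptions, not part of the project material; the layout (two 3x3 convolutions plus a shortcut) follows the basic block described in the ResNet paper.

```python
import torch
from torch import nn


class BasicResidualBlock(nn.Module):
    """Two 3x3 convolutions whose output is added to a shortcut of the input.

    Hypothetical name, used for illustration only.
    """

    def __init__(self, in_channels: int, out_channels: int, stride: int = 1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU()

        # If the block changes the shape, project the input with a 1x1
        # convolution so it can still be added to the main path.
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )
        else:
            self.shortcut = nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # The skip connection: add the (possibly projected) input back in.
        return self.relu(out + self.shortcut(x))


# Example: halve the spatial size while doubling the channels.
block = BasicResidualBlock(16, 32, stride=2)
y = block(torch.randn(4, 16, 32, 32))
print(y.shape)  # torch.Size([4, 32, 16, 16])
```

Stacking several such blocks, preceded by an initial convolution and followed by global pooling and a linear layer, gives a small ResNet-style classifier that can be compared against the other architectures.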

Have a look at the case study presented in these slides - http://cs231n.stanford.edu/slides/2019/cs231n_2019_lecture09.pdf

Bonus points for comparing performance in a more complex CV task such as face or gesture recognition!

This project is ideal for students who want to get started with Deep Learning for Computer Vision.

Week Number | Tasks to be Completed
Week 1 | Learn or brush up on Python, PyTorch, Jupyter, NumPy and Unix commands
Week 2 | Learn about logistic regression, activation functions, gradient descent and neural networks
Week 3 | Learn about CNNs for image classification and read the original papers that present these architectures
Week 4 | Implement the models from scratch for classification and compare their performance
Week 5 | Read about transfer learning to train models for more complex tasks, and document the results