• Machine Learning

Mentors :

  • Utkarsh Agarwal

  • Gagan Jain

Mentees :

  • 4 students

Description: Reinforcement Learning (RL) is a field of Artificial Intelligence where an agent learns by interacting with its environment and receiving a reward/penalty for its actions. RL has recently started receiving a lot more attention, owing to the famous victory by an RL agent over the world champion in the game of “Go”.

This project aims to get people familiar with the field through a hands-on project that will require them to understand and implement some of the popular RL algorithms. We will start by building a simple GUI, and will then try out algorithms like SARSA and Q-learning for training agents in basic environments. We will then move on to Deep RL techniques, which use neural networks to train the agents, and apply them to Pac-Man.
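To make the SARSA and Q-learning step a little more concrete, here is a minimal sketch of both tabular updates. The six-cell corridor environment, the helper names (`step`, `epsilon_greedy`, `train`) and the hyperparameter values are assumptions made purely for illustration; they are not the project's actual code.

```python
import random

N_STATES = 6          # hypothetical corridor: cells 0..5, cell 5 is the goal
ACTIONS = (-1, +1)    # action 0 moves left, action 1 moves right


def step(state, action):
    """Assumed dynamics: walls clamp the move, reward 1 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily (ties broken randomly)."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def train(method, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Train a Q-table with method 'sarsa' or 'q-learning' and return it."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        action = epsilon_greedy(q, state, epsilon, rng)
        while not done:
            next_state, reward, done = step(state, ACTIONS[action])
            if method == "sarsa":
                # On-policy: bootstrap from the action that will actually be taken.
                next_action = epsilon_greedy(q, next_state, epsilon, rng)
                bootstrap = q[next_state][next_action]
            else:
                # Off-policy: bootstrap from the best action value in the next state.
                bootstrap = max(q[next_state])
            target = reward + (0.0 if done else gamma * bootstrap)
            q[state][action] += alpha * (target - q[state][action])
            if method != "sarsa":
                # Q-learning picks its next behaviour action after the update.
                next_action = epsilon_greedy(q, next_state, epsilon, rng)
            state, action = next_state, next_action
    return q


if __name__ == "__main__":
    for name in ("sarsa", "q-learning"):
        table = train(name)
        print(name, [round(max(row), 2) for row in table])
```

The only difference between the two methods in this sketch is the bootstrap term: SARSA uses the value of the action it will actually take next, while Q-learning uses the largest action value available in the next state.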


There are no prerequisites as such right now. If you’re interested, you might want to read the first chapter of the book “Reinforcement Learning: An Introduction” by Sutton and Barto to get a feel for RL. The book is freely available online and a simple Google search should do :) The video lectures of the Reinforcement Learning Specialisation on Coursera by the University of Alberta are also a good starting point for getting the hang of RL.


  • Sutton and Barto

  • Coursera

  • Enjoy watching this beautiful documentary about AlphaGo by DeepMind

Tentative Timeline :

Week Number | Tasks to be Completed
Week 1 | Basics of RL and some basic implementations. Python beginners should also start learning about the ML libraries. The project goal for this week is to have a simple GUI from OpenAI Gym up and running for visualization.
Week 2-3 | Understanding RL algorithms and implementing them on basic examples such as grid-world and mountain car. Training the agent with these algorithms and observing the results.
Week 4-5 | Understanding Deep Reinforcement Learning techniques, implementing them, and comparing the results with other available implementations.
Week 6 | Wrapping things up, documenting the work done, and exploring possible extensions to it.
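
As a rough idea of what the Week 1 and Week 2-3 environments involve, the sketch below imitates the reset/step/render pattern that classic OpenAI Gym environments follow, without using Gym itself. The `GridWorld` class, its 4x4 layout, the reward of -1 per step and the text rendering are all assumptions chosen for illustration; Gym's own API differs in details between releases.

```python
import random


class GridWorld:
    """Hypothetical grid: agent starts top-left, goal is bottom-right, -1 reward per step."""

    # Action ids map to (row change, column change): up, down, left, right.
    MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

    def __init__(self, size=4):
        self.size = size
        self.goal = (size - 1, size - 1)
        self.pos = (0, 0)

    def reset(self):
        """Put the agent back at the start and return the initial observation."""
        self.pos = (0, 0)
        return self.pos

    def step(self, action):
        """Apply an action; return (observation, reward, done, info)."""
        d_row, d_col = self.MOVES[action]
        row = min(max(self.pos[0] + d_row, 0), self.size - 1)
        col = min(max(self.pos[1] + d_col, 0), self.size - 1)
        self.pos = (row, col)
        done = self.pos == self.goal
        return self.pos, -1.0, done, {}

    def render(self):
        """Return a text picture: A is the agent, G the goal, '.' an empty cell."""
        rows = []
        for r in range(self.size):
            rows.append("".join(
                "A" if (r, c) == self.pos else "G" if (r, c) == self.goal else "."
                for c in range(self.size)
            ))
        return "\n".join(rows)


def run_random_episode(env, max_steps=200, seed=0):
    """Drive the environment with uniformly random actions; return (total reward, steps)."""
    rng = random.Random(seed)
    env.reset()
    total, steps = 0.0, 0
    for _ in range(max_steps):
        _, reward, done, _ = env.step(rng.randrange(4))
        total += reward
        steps += 1
        if done:
            break
    return total, steps


if __name__ == "__main__":
    env = GridWorld()
    env.reset()
    print(env.render())
    print(run_random_episode(env))
```

A learning agent would replace the random action choice in `run_random_episode` with a policy, while the environment loop itself stays the same.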

Checkpoints :

Checkpoint Number | Progress
1 | Getting started with the theory of RL. Getting the GUI ready
2 | Understanding simple RL concepts and algorithms like MDP planning, SARSA and Q-learning
3 | Implementing basic applications of RL algorithms to get the hang of them
4 | Understanding and implementing a Deep RL algorithm for learning to play Pac-Man
5 | Finishing up the implementation by fine-tuning the hyperparameters and removing bugs
6 | Documenting everything
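
For the MDP planning mentioned in checkpoint 2, one common approach is value iteration. The sketch below is a minimal version under stated assumptions: the three-state `MDP` dictionary, the helper names and the discount factor are made up for illustration, and the project may well use a different planning method.

```python
# Hypothetical MDP: transitions[state][action] = list of (probability, next_state, reward).
# A state with no actions is terminal.
MDP = {
    "A": {"stay": [(1.0, "A", 0.0)], "go": [(1.0, "B", 0.0)]},
    "B": {"back": [(1.0, "A", 0.0)], "go": [(0.8, "end", 1.0), (0.2, "B", 0.0)]},
    "end": {},
}


def q_value(outcomes, values, gamma):
    """Expected return of one action given the current state-value estimates."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions, gamma=0.9, tol=1e-8):
    """Repeat Bellman optimality backups until the largest change falls below tol."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for state, actions in transitions.items():
            if not actions:  # terminal state keeps value 0
                continue
            best = max(q_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break
    # Read off a greedy policy from the converged values.
    policy = {
        state: max(actions, key=lambda a: q_value(actions[a], values, gamma))
        for state, actions in transitions.items() if actions
    }
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration(MDP)
    print(values)
    print(policy)
```

Unlike SARSA and Q-learning, this planner needs the transition probabilities and rewards up front, which is why it is usually studied before the learning algorithms.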