WnCC - Seasons of Code
Seasons of Code is a programme launched by WnCC along the lines of the Google Summer of Code. It offers an opportunity to learn and participate in a variety of interesting projects under the mentorship of the very best in our institute.
List of Running Projects
- Browser Based PDF manager
- Resume Script Generator
- Physicc : A Simple Physics Engine
- Image Colorization
- Language Model Based Syntax Autocompletion in a Text Editor
- Computer vision based web app
- Cribbit Cribbit (Open for PGs Only)
- Techster Texter
- Language Detection
- Book Tracker
- ResoBin - Not the bin we deserve but the bin we need!
- Agree to disagree
- Watson (World's smartest assistant in your pocket)
- Meta Learning - Learning to Learn
- Break free of the matrix, by building one!
- Procedurally Generated Infinite Open World
- Introduction to App Development
- PAC MAN
- Introduction to Web Development
- Goal ICPC
- Traffic congestion modelling and rendering
- Tools for Data Science
- Machine Learning Based Metropolitan Air Pollution Estimation
- Audio controlled drone
- NLPlay with Transformers
- DIY FaceApp
- A Deep Dive into CNNs
- Competitive Coding
- Snake AI
- Facial Recognition App
- Gaming meets AI !!!
- R(ea)L Trader
- Computational Geometry
- Deep reinforcement learning - 2048 AI
- Reinforcement Learning to Finance
- Developing Hybrid ANN-Statistical Model for Robust Stock Market Prediction
- Astronomical Data-modelling and Interpretation
- Visual Perception for Self Driving Cars
- Convolutional Neural Networks and Applications
- Quantum Computing Algorithms
- Algorithm Visualizer
- Anime Club IITB Website using Django
- Machine Learning in Browser
This project aims to develop a source code plagiarism detector using Python.
The first step will be to implement a basic bag-of-words approach by parsing source files. After that, language-specific preprocessing, such as renaming variables and handling macros, can be integrated to improve accuracy on a specific language (such as C++). Based on the results, we will add machine learning techniques such as k-nearest neighbours to improve the results further.
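As a rough illustration of the first step, the sketch below counts tokens in two source strings and compares the counts with cosine similarity, reported as a percentage. The tokenizer pattern, the function names `bag_of_words` and `similarity_percentage`, and the choice of cosine similarity are all assumptions made for this example; the project itself does not prescribe a particular similarity measure.

```python
import math
import re
from collections import Counter

# Hypothetical tokenizer: identifiers, integer runs, or any single
# non-space symbol. A real tokenizer would be language-aware.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def bag_of_words(source: str) -> Counter:
    """Count how often each token occurs in a source string."""
    return Counter(TOKEN_RE.findall(source))


def similarity_percentage(a: str, b: str) -> float:
    """Cosine similarity of the two token-count vectors, scaled to 0-100."""
    bag_a, bag_b = bag_of_words(a), bag_of_words(b)
    # Counter returns 0 for missing tokens, so unshared tokens add nothing.
    dot = sum(count * bag_b[token] for token, count in bag_a.items())
    norm_a = math.sqrt(sum(c * c for c in bag_a.values()))
    norm_b = math.sqrt(sum(c * c for c in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty input has no tokens to compare
    return 100.0 * dot / (norm_a * norm_b)


if __name__ == "__main__":
    original = "int total = a + b;"
    reordered = "int total = b + a;"
    print(similarity_percentage(original, reordered))
```

Because a bag of words ignores token order, the two snippets in the usage example score as essentially identical. That is useful for catching reordered statements, but it also means unrelated files that happen to share a vocabulary can score high, which is one reason the later preprocessing and learning steps matter.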
This project can also be extended to compare source code against code available online, using the Google Search API (for example), if time permits.
We expect students to go through some of the references mentioned, do some research of their own, and include their own ideas related to the project topic in their proposals. More importantly, we look for enthusiasm, which will be judged by the effort students put into their proposals.
- Bag of words approach:
- Basic Python tutorial: https://www.w3schools.com/python/python_intro.asp
- File parsing using Python: https://stackabuse.com/read-a-file-line-by-line-in-python/
- KNN: https://towardsdatascience.com/machine-learning-basics-with-the-k-nearest-neighbors-algorithm-6a6e71d01761
- A research paper on this approach: http://ijmlc.org/papers/50-A243.pdf
|Week Number|Tasks to be Completed|
|---|---|
|Week 1|Learn basics of Python and file parsing|
|Week 2|Implement a basic bag of words approach|
|Week 3|Add some preprocessing specific to language syntax|
|Week 4|Integrate KNN|
|Week 5|Final touch and presentation|
|Bonus|If we meet deadlines earlier than planned, we can integrate the Google Search API to compare against code available online.|

|1|(4th April) - Implement bag of words with similarity percentage|
|Rest|Same as week schedule|
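The Week 3 preprocessing could, for instance, start from something like the sketch below, which strips C++ comments and replaces every non-keyword identifier with a placeholder so that renamed variables no longer affect the comparison. The keyword set is a small illustrative subset, and `normalize_cpp`, `CPP_KEYWORDS`, and the `ID` placeholder are hypothetical names chosen for this example. Macro handling is not covered, and the regular expressions do not understand string literals, so comment markers or words inside strings would be treated like code.

```python
import re

# Small illustrative subset of C++ keywords; a real list would be longer.
CPP_KEYWORDS = {
    "int", "float", "double", "char", "bool", "void", "if", "else",
    "for", "while", "return", "class", "struct", "const",
}

# Matches // line comments and /* block comments */ (non-greedy).
COMMENT_RE = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)
# Word boundaries keep the pattern from matching inside literals like 0x1F.
IDENTIFIER_RE = re.compile(r"\b[A-Za-z_]\w*\b")


def normalize_cpp(source: str) -> str:
    """Strip comments and replace non-keyword identifiers with 'ID'."""
    without_comments = COMMENT_RE.sub(" ", source)

    def rename(match):
        word = match.group(0)
        return word if word in CPP_KEYWORDS else "ID"

    renamed = IDENTIFIER_RE.sub(rename, without_comments)
    # Collapse all whitespace so layout differences do not matter.
    return " ".join(renamed.split())


if __name__ == "__main__":
    first = "int sum = a + b; // add the inputs"
    second = "int total = x + y; /* same logic, new names */"
    print(normalize_cpp(first) == normalize_cpp(second))  # True
```

Both snippets normalize to `int ID = ID + ID;`, so any similarity measure applied afterwards sees them as the same code. The trade-off is that collapsing all identifiers to one placeholder discards information and may raise the similarity of genuinely unrelated programs.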