Machine Learning


Traditional software development requires programmers to specify precise instructions to a computer via a programming language. In Machine Learning, however, a programmer provides a template (more formally termed a model) to a computer for a given task. The computer then attempts to learn precise instructions (compliant with the provided model) automatically, either from pre-processed data (Supervised Learning, Unsupervised Learning) or through interaction with an environment (Reinforcement Learning).

Machine Learning has received a lot of media attention due to its recent successes. It is playing an increasingly important role in our lives, and many popular tech companies use an arsenal of Machine Learning techniques to improve their products. To get you interested, here is a list of exciting breakthroughs in Machine Learning -

Various kinds of ML Models

There's actually more to the story. The training paradigm above, in which we use 'labelled' examples (this is a tree; that is not a tree) to train our algorithm, is called Supervised Learning.
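
To make the idea of learning from labelled examples concrete, here is a minimal sketch of a 1-nearest-neighbour classifier. The features (height in metres, leaf density) and all the values are hypothetical and chosen only for illustration; real supervised models are usually far more elaborate.

```python
import math

# Hypothetical labelled training set: (height_m, leaf_density) -> label.
# The feature names and values are invented purely for illustration.
TRAINING_DATA = [
    ((12.0, 0.9), "tree"),
    ((8.5, 0.8), "tree"),
    ((15.0, 0.7), "tree"),
    ((1.2, 0.1), "not a tree"),
    ((0.5, 0.3), "not a tree"),
    ((2.0, 0.0), "not a tree"),
]


def predict(features, training_data=TRAINING_DATA):
    """Return the label of the closest labelled example (1-nearest-neighbour)."""
    _, nearest_label = min(
        training_data,
        key=lambda example: math.dist(example[0], features),
    )
    return nearest_label


if __name__ == "__main__":
    print(predict((10.0, 0.85)))  # an unseen tall, leafy object
    print(predict((1.0, 0.2)))    # an unseen short, sparse object
```

The "training" here is nothing more than storing the labelled examples; the labels are what make the setting supervised.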

The paradigm that has been garnering the greatest attention of late is called Reinforcement Learning, which has been used to train AlphaGo and other game-playing and behavioural AI. Exciting ideas like Genetic and Evolutionary algorithms are closely related to this paradigm. Check out this video where a bot learns to play Super Mario. The idea is to reinforce correct behaviour and deter incorrect behaviour, often via a stochastic reward mechanism. More technically, the algorithm must maximise the cumulative reward it collects in a learning environment (or, equivalently, minimise a cost function). Lots of fascinating math comes into play, and this field sees the direct application of Game Theory and Markov Decision Processes.
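
As a rough illustration of reinforcing good behaviour through reward, the sketch below runs tabular Q-learning on a toy problem. The five-cell corridor environment, the reward of 1.0 at the right-hand end, and all hyperparameter values are assumptions made up for this example; systems such as AlphaGo use far more sophisticated methods.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the rewarding goal
ACTIONS = (-1, +1)    # index 0 steps left, index 1 steps right
GOAL = N_STATES - 1


def step(state, action):
    """Toy environment: move along the corridor, reward 1.0 on reaching the goal."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration; returns the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions look equal.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-goal state according to the learned Q-table."""
    return ["left" if left > right else "right" for left, right in q[:GOAL]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The agent is never told that moving right is correct; it only observes the reward, and the update rule propagates that reward back to earlier states.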

The last broad flavor of ML is called Unsupervised Learning. The algorithm is not given any labels or reinforcement at all, only data. It must draw its own conclusions or observations. This is extremely challenging due to the open-ended nature of the problem, and it remains largely unsolved. Most of the work in this field has been based on Pattern Finding and Clustering.
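
Clustering can be sketched in a few lines. The example below is a minimal one-dimensional k-means (Lloyd's algorithm); the function name, the deterministic initialisation, and the sample data are assumptions chosen for illustration, not a reference implementation.

```python
def k_means_1d(points, k=2, iterations=20):
    """Cluster 1-D points with Lloyd's algorithm; returns (centroids, assignments)."""
    ordered = sorted(points)
    # Deterministic initialisation (an assumption made for reproducibility):
    # spread the starting centroids evenly across the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: abs(p - centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, assignments


if __name__ == "__main__":
    data = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]  # unlabelled, made-up measurements
    print(k_means_1d(data, k=2))
```

No labels are supplied anywhere; the two groups emerge purely from the distances between the points.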

Basic Regressor-Classifier Models

Decision Trees & Random Forests

Unsupervised Models

Deep Learning

Neural Networks (Feedforward)

Convolutional Neural Networks

Recurrent Neural Networks

Finally, the most awesome kind of network: one that makes use of memory and feedback. First up, a blog to boost your spirits - RNN Effectiveness.

Learning Machine Learning

There are tons of resources online. Here are our recommendations for deep learning -

Machine Learning Platforms

Machine learning today is powered by very efficient libraries, many of which run on GPUs. Some of the exciting libraries to look at are:

  • TensorFlow - The tool powering many of your favourite Google products - Gmail, Google Translate, Google Search, Google Speech etc. TensorFlow was recently made open source on GitHub. TensorFlow has a Python API that makes machine learning easier and more efficient. Have a look at our TensorFlow tutorial to find a list of TensorFlow resources.
  • Keras
  • Torch - Have a look at our Torch guide.
  • Theano
  • scikit-learn

Motivation: Why Implement from Scratch?

It is always good practice to implement simple models like Neural Networks, RACTs, clustering etc. from scratch, at least once. For real applications, libraries are preferable, as they have large teams working on them and have evolved with time, but understanding the underlying mechanism helps in many scenarios. There may also be times when no single library can help and you have to get your hands dirty! Implementing from scratch is worthwhile for several further reasons:

  • it can help us to understand the inner workings of an algorithm
  • we could try to implement an algorithm more efficiently
  • we can add new features to an algorithm or experiment with different variations of the core idea
  • we circumvent licensing issues (e.g., Linux vs. Unix) or platform restrictions
  • we want to invent new algorithms or implement algorithms no one has implemented/shared yet
  • we are not satisfied with the API and/or we want to integrate it more "naturally" into an existing software library

Let us narrow down the phrase "implementing from scratch" a bit further in the context of the six points listed above, so that the question becomes tangible. Let's talk about a particular algorithm, simple logistic regression, to address the different points using concrete examples. I'd claim that logistic regression has been implemented more than a thousand times.

One reason why we'd still want to implement logistic regression from scratch could be that we don't have the impression that we fully understand how it works; we read a bunch of papers and only kind of understood the core concept. Using a programming language for prototyping (e.g., Python, MATLAB, R, and so forth), we could take the ideas from the papers and try to express them in code, step by step. An established library, such as scikit-learn, can then help us double-check the results and see whether our implementation -- our idea of how the algorithm is supposed to work -- is correct. Here, we don't really care about efficiency; even though we spend so much time implementing the algorithm, we probably want to use an established library if we want to perform some serious analysis in our research lab and/or company. Established libraries are typically more trustworthy: they have been battle-tested by many people, who may have already encountered certain edge cases and made sure that there are no weird surprises. Furthermore, it is also more likely that this code has been highly optimized for computational efficiency over time. Here, implementing from scratch simply serves the purpose of self-assessment. Reading about a concept is one thing, but putting it into action is a whole other level of understanding, and being able to explain it to others is the icing on the cake.
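
A from-scratch prototype of this kind might look like the sketch below, which fits binary logistic regression with plain batch gradient descent. The function names, the hyperparameter defaults, and the tiny one-feature dataset (already centred around zero) are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, row):
    """Estimated probability that `row` belongs to class 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)


def fit_logistic_regression(X, y, learning_rate=0.1, epochs=1000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, target in zip(X, y):
            # Derivative of the log-loss w.r.t. the linear score is (p - y).
            error = predict_proba(weights, bias, row) - target
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


if __name__ == "__main__":
    # Hypothetical data: one centred feature, class 1 for positive values.
    X = [[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    weights, bias = fit_logistic_regression(X, y)
    print(weights, bias, predict_proba(weights, bias, [0.75]))
```

When comparing the learned coefficients against scikit-learn's `LogisticRegression`, keep in mind that it applies L2 regularization by default, so the numbers will not match this unregularized sketch exactly.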

Another reason why we may want to re-implement logistic regression from scratch is that we are not satisfied with the "features" of other implementations. Let us naively assume that other implementations don't have regularization parameters, or don't support multi-class settings (i.e., via One-vs-All, One-vs-One, or softmax). Or, if computational (or predictive) efficiency is an issue, maybe we want to implement it with another solver (e.g., Newton vs. Gradient Descent vs. Stochastic Gradient Descent, etc.). But improvements in computational efficiency do not necessarily have to come from modifications of the algorithm; we could also use lower-level programming languages, for example, Scala instead of Python, or Fortran instead of Scala. This can go all the way down to assembly or machine code, or to designing a chip that is optimized for running this kind of analysis. However, if you are a machine learning (or "data science") practitioner or researcher, this is probably something you should delegate to the software engineering team.
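
To illustrate two of the variations mentioned above, the following sketch swaps in a different solver (stochastic gradient descent) and adds an L2 regularization parameter. The name `fit_sgd`, the `l2` argument, the default values, and the toy data are all hypothetical choices for this example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_sgd(X, y, l2=0.0, learning_rate=0.1, epochs=200, seed=0):
    """Logistic regression via stochastic gradient descent with an L2 penalty.

    `l2` is the regularization strength; the bias is left unpenalized.
    """
    rng = random.Random(seed)
    weights, bias = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a fresh random order
        for i in order:
            z = sum(w * x for w, x in zip(weights, X[i])) + bias
            error = sigmoid(z) - y[i]
            # Per-sample log-loss gradient plus the gradient of 0.5 * l2 * w^2.
            weights = [
                w - learning_rate * (error * x + l2 * w)
                for w, x in zip(weights, X[i])
            ]
            bias -= learning_rate * error
    return weights, bias


if __name__ == "__main__":
    # Hypothetical, linearly separable data with one centred feature.
    X = [[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    print(fit_sgd(X, y, l2=0.0))  # weight keeps growing on separable data
    print(fit_sgd(X, y, l2=1.0))  # penalty keeps the weight small
```

On separable data the unregularized weight grows without bound as training continues, while the penalized one settles at a small value, which is one practical reason to want a regularization parameter.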

Tips for applying ML

  • Use pickle to save trained models as objects, which can be loaded again easily, even after the kernel has been stopped.
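
A minimal sketch of this tip is shown below. The dictionary standing in for a trained model and the temporary file path are placeholders; trained scikit-learn estimators can be saved and loaded with the same two calls.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model: the learned parameters of a hypothetical
# linear classifier. Any picklable Python object works the same way.
trained_model = {"weights": [0.8, -1.3], "bias": 0.25}

# Hypothetical location; in practice, pick a path inside your project.
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")

# Save the model to disk once training has finished.
with open(model_path, "wb") as f:
    pickle.dump(trained_model, f)

# Later (for example in a fresh kernel), load it back without retraining.
with open(model_path, "rb") as f:
    restored_model = pickle.load(f)

print(restored_model)
```

Only load pickle files from sources you trust, since unpickling can execute arbitrary code.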

See Also