Excited by the hype around machine learning (ML)? Perhaps you’d like to transition into the field but you’re not sure where to start. Which type of ML practitioner would most suit you – a programmer who implements algorithms, a data scientist who wrangles data, or a project manager who specifies model requirements?
To help point you in the right direction, we've put together a short list of some specific hard skills (math and programming) you'll need, along with tips and pointers on how to steer your career towards the exciting field of ML.
From Nada to ML Model
To see why hard skills grounded in math and programming matter, it helps to first understand that ML is built on algorithms: sequences of instructions for performing computations. Many ML algorithms work by estimating probabilities from data. Examples include k-nearest neighbor (k-NN), decision trees, and clustering methods. See our Machine Learning Handbook for additional information.
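To make the idea of an ML algorithm concrete, here is a minimal sketch of k-NN in plain Python. The function name `knn_predict` and the toy data are assumptions made up for illustration; real projects would typically use a library implementation instead.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbors' labels
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two small clusters in 2-D
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```

The sequence of instructions is short: measure distances, pick the closest examples, and let them vote. The share of votes for each label can also be read as a rough probability estimate.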
The current state-of-the-art branch of ML is deep learning (DL), which involves creating models: computational graphs whose learned weights are used to compute the probability that certain things are true. For example, a neural network consists of layers of weights. The model is trained with data (e.g., images), and each successive layer captures whether progressively higher-level elements exist in the image, from pixels to groups of pixels to features. The final layer produces the predictions (e.g., the probabilities that a given image belongs to each class).
ML practitioners, such as developers and data scientists, iteratively train the model using existing data (e.g., collections of images and labels) to produce an outcome (e.g., a classification) until they're satisfied that the model can do this accurately on real-world data when hosted by an application for inference.
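As a rough illustration of how layers turn an input into class probabilities, the following sketch runs a tiny two-layer network forward in plain Python. The four-pixel "image", the layer sizes, and all weight values are made-up assumptions; in a real model the weights would be learned during training.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of the inputs plus a bias per unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Keep positive activations and zero out negative ones."""
    return [max(0.0, v) for v in values]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 4-pixel "image" and hand-picked weights
pixels = [0.0, 0.5, 1.0, 0.5]
hidden_w = [[0.2, -0.1, 0.4, 0.1], [-0.3, 0.8, 0.1, -0.2], [0.5, 0.5, -0.5, 0.5]]
hidden_b = [0.0, 0.1, -0.1]
output_w = [[1.0, -1.0, 0.5], [-0.5, 1.0, 0.2]]
output_b = [0.0, 0.0]

hidden = relu(dense(pixels, hidden_w, hidden_b))    # intermediate features
probs = softmax(dense(hidden, output_w, output_b))  # one probability per class
print(probs)
```

The final list holds one probability per class, which is what the last layer of a classifier is typically expected to produce.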
To build and understand ML models, it's important to have some foundational mathematical knowledge:
- Calculus: as the study of continuous change, calculus plays a critical role in training a model. Optimizing the model relies on algorithms like gradient descent, which use derivatives (gradients) of the objective function to minimize or maximize it.
- Linear Algebra: used in many areas of ML, most notably for its multi-dimensional data structures such as vectors and matrices, which are generalized as tensors and hold a model's data and weights. The associated operations (e.g., matrix multiplication) are now accelerated by parallel hardware like GPUs. For additional information see: 10 Examples of Linear Algebra in Machine Learning.
- Statistics: no matter how intelligent AI appears to be, ML models are ultimately driven by statistics. In data preparation, statistical methods can be used for sampling and outlier detection; in modeling, they help with understanding and comparing training results. For additional information see: Statistics for Machine Learning.
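To show how these pieces fit together, here is a minimal sketch of gradient descent fitting a straight line to a handful of points, written in plain Python. The function name `fit_line`, the learning rate, the step count, and the data are all assumptions chosen for illustration.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every data point
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to w and b
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient to reduce the error
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

The derivatives come from calculus, the sums over data points are the kind of work linear algebra libraries vectorize, and the mean squared error is a statistical measure of fit. With this data the fitted slope and intercept should approach 2 and 1.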
ML practitioners commonly use the Python programming language because of its concise syntax and because it is relatively easy to learn. With Python, ML developers can describe their models and algorithms in a syntax that reads closer to English than that of more cryptic programming languages. Python and its libraries offer a rich set of APIs for math and vector operations, and the language is easily extended through module imports. Other programming languages used in ML include R, Scala, Julia, and Java.
You should also become familiar with TensorFlow, a rich open-source ML framework that is most commonly used through its Python API (its core is implemented in C++). Arguably the most popular ML framework, TensorFlow's APIs construct a data-flow graph of nodes and edges through which tensors flow (hence the name TensorFlow). TensorFlow allows ML developers to construct a variety of model topologies.
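The data-flow graph idea can be sketched in a few lines of plain Python. This is a toy illustration only and is not TensorFlow's API: the `Node` class, the `placeholder` and `elementwise` helpers, and the sample values are all hypothetical.

```python
class Node:
    """A node in a toy data-flow graph: an operation plus edges to its input nodes."""

    def __init__(self, op, inputs=(), name=""):
        self.op = op
        self.inputs = list(inputs)
        self.name = name

    def evaluate(self, feed):
        # Placeholder nodes read their value from the feed dictionary
        if self.op is None:
            return feed[self.name]
        # Other nodes pull values along their incoming edges, then apply the operation
        args = [node.evaluate(feed) for node in self.inputs]
        return self.op(*args)


def placeholder(name):
    """A node whose value is supplied when the graph is evaluated."""
    return Node(None, name=name)


def elementwise(fn):
    """Lift a two-argument function so it works on lists element by element."""
    return lambda a, b: [fn(x, y) for x, y in zip(a, b)]


# Build the graph for x * w + b; nothing is computed yet
x = placeholder("x")
w = placeholder("w")
b = placeholder("b")
product = Node(elementwise(lambda p, q: p * q), [x, w], name="mul")
output = Node(elementwise(lambda p, q: p + q), [product, b], name="add")

# Feed values in and let them flow through the graph
result = output.evaluate({"x": [1.0, 2.0, 3.0], "w": [0.5, 0.5, 0.5], "b": [1.0, 1.0, 1.0]})
print(result)  # [1.5, 2.0, 2.5]
```

Separating graph construction from evaluation is the point of the sketch: the same graph can be evaluated again with different inputs.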
Courses and Projects
While ML was once reserved for PhD researchers, the growth of open-source tools, frameworks, and datasets has helped democratize it. You can also now find many ML courses online (e.g., Coursera) ranging from simple tutorials to degree programs.
ML practitioners often start out with the classic problem of identifying images of handwritten digits using the MNIST dataset. For beginners, we recommend learning how to solve this problem by following this excellent YouTube series.
Once you have the foundational knowledge, enhance it by building projects that can also serve as examples in future job hunts. These can include course assignments, hobby projects, and open-source projects solving real-world problems. You might also consider entering ML competitions (e.g., Kaggle competitions) and referencing those projects as well. Placing your projects on GitHub can also be beneficial, as it's used extensively by the ML community.
Of course, training ML models requires data, so we recommend building a list of quality data sources. Start with PerceptiLabs' Top 5 Open Source Datasets for Machine Learning and then expand your search through online resources like Kaggle.
Accelerate Your Understanding of ML with PerceptiLabs
PerceptiLabs is becoming the tool of choice for career professionals and is also a great tool to learn ML and TensorFlow. PerceptiLabs is built on TensorFlow and offers a visual API that abstracts common TensorFlow patterns into high-level components that you visually piece together into a model. Each component visualizes how it transforms its input so you can easily understand your model.
Using PerceptiLabs is also much easier than writing raw TensorFlow code (although you can still do so in PerceptiLabs). By visually building your model and adjusting settings, you can quickly learn the model's transformations and view the underlying code that was generated for you. In fact, a number of post-secondary schools are now considering PerceptiLabs to help shorten the time it takes to learn TensorFlow. Check out our Quick Start Guide for information on how to get started.
The Journey Begins
If you are considering a transition to ML, there have never been more tools, data, and resources at your disposal to enhance your ML knowledge. With the right foundational knowledge, the right connections, and some perseverance, you too can make the transition into the exciting and evolving field of ML.