Everyone wants to enhance their understanding of machine learning (ML) these days. From those who want to see what they can accomplish with ML to business analysts who need a general understanding of how their organizations' models are trained and deployed, they are all trying to figure out how to do ML.
But what does it really mean to do ML? What is an ML framework? And given that ML has traditionally been a developer-oriented practice, do you still have to be a programmer in this day and age to do it?
In this blog we'll explain how ML frameworks provide a way to describe the ML models we want to create and how PerceptiLabs builds upon those frameworks to make ML modeling more accessible to a whole new range of users.
An Overview of ML for the Non-Expert
The process of doing ML can be summarized as four simple steps:
- Find some data to use for training and wrangle it.
- Build an ML model that uses that data.
- Train the ML model with that data so that the model learns to recognize things and then test it.
- Export that model for inference (i.e., to use your model for making predictions, classifications, and decisions in the real world).
So what is an ML model exactly? Conceptually, an ML model consists of some way to store and organize data, and processes called algorithms which operate on that data. You can think of a model as a function – remember functions from math class? – which takes input(s), does some calculations, and produces output(s).
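To make the function analogy concrete, here is a minimal sketch in Python. The name `tiny_model` and the weight and bias values are made up for illustration; a real model holds far more values and learns them during training rather than having them typed in by hand.

```python
def tiny_model(inputs, weights, bias):
    """A toy 'model': multiply each input by a weight, add them up, add a bias."""
    total = bias
    for x, w in zip(inputs, weights):
        total += x * w
    return total


# Hypothetical values chosen only for illustration.
weights = [0.5, -1.0, 2.0]
bias = 1.0

# Inputs go in, a calculation happens, an output comes out.
output = tiny_model([2.0, 1.0, 3.0], weights, bias)
print(output)  # 7.0
```

The stored numbers (`weights` and `bias`) are the data the model organizes, and the arithmetic inside the function is the algorithm that operates on it.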
For example, an ML model can take an image's pixels as its input, break them down into varying levels of data representations (e.g., patterns of pixels, features, etc.), and then produce outputs (e.g., probabilities that the pixels in a medical image represent certain medical conditions). The output can then be used by applications in a variety of ways such as to alert a medical practitioner that they should take a closer look at that medical image when diagnosing a patient.
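As a rough illustration of that last step, the sketch below turns a model's raw scores into probabilities. It assumes the softmax function, a common choice for this job that the example above does not name explicitly, and the class names and scores are invented.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that add up to 1."""
    largest = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical class names and raw scores, for illustration only.
class_names = ["no finding", "condition A", "condition B"]
scores = [2.0, 0.5, -1.0]

probabilities = softmax(scores)
for name, p in zip(class_names, probabilities):
    print(f"{name}: {p:.2f}")
```

An application could then compare the highest probability against a threshold of its choosing to decide whether to flag the image for a closer look.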
An ML model typically organizes its computations into a specific topology (graph): the nodes are operations, and the data that flows between them is held in multidimensional arrays called tensors. Training that model involves the use of optimizer algorithms, which traverse the graph to update the various data representations it contains, using collections of training data as input (e.g., images along with labels indicating what each image represents). This training data is typically partitioned into three groups:
- Training Data: core training data on which to train the model as previously described.
- Verification Data (aka Validation Data): data used to test model fit during training.
- Test Data: data to test the model against after training, to see how well the trained model handles data it hasn't seen before.
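The three-way partition can be sketched in a few lines of Python. The 70/20/10 proportions and the `split_dataset` helper are assumptions made for this example; the right proportions depend on how much data you have.

```python
import random


def split_dataset(samples, train_frac=0.7, val_frac=0.2, seed=0):
    """Shuffle the samples, then cut them into training, validation, and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever is left over
    return train, val, test


# Stand-in for a real dataset: 100 numbered samples.
samples = list(range(100))
train, val, test = split_dataset(samples)
print(len(train), len(val), len(test))  # 70 20 10
```

Shuffling before splitting matters because datasets are often stored in some order (for example, grouped by label), and each partition should contain a representative mix.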
ML modeling is really just creating a fancy graph to store mathematical variables on which to perform statistics and to optimize those variables. But what makes ML models powerful is that they can represent complex data and probabilities, using thousands upon thousands of values which are analyzed and updated during training. Examples of ML models include neural networks, which loosely represent data the way neurons in our brains do, and specialized variants such as Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs).
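To show what "updating values during training" looks like, here is a minimal sketch of gradient descent, one common optimizer algorithm, fitting a model with just two values. The data, learning rate, and number of passes are made up for illustration; real models apply the same idea to thousands or millions of values.

```python
# Toy training data that follows y = 2x + 1 (invented for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def train_linear_model(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w*x + b by repeatedly nudging w and b to reduce the squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = (w * x + b) - y        # how far off the prediction is
            grad_w += 2 * error * x / n    # slope of the mean squared error w.r.t. w
            grad_b += 2 * error / n        # slope of the mean squared error w.r.t. b
        w -= learning_rate * grad_w        # step both values "downhill"
        b -= learning_rate * grad_b
    return w, b


w, b = train_linear_model(xs, ys)
print(round(w, 3), round(b, 3))
```

The model starts out knowing nothing (`w` and `b` are both zero) and ends up close to the values that generated the data, purely by measuring its errors and adjusting.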
From a pragmatic standpoint, groups of tensors are often stored as vectors which can be processed efficiently by vector processing hardware such as GPUs.
Traditionally, building an ML model required very specialized skills and knowledge. In addition to extensive coding and debugging skills, you would also have to be familiar with:
- Various branches of mathematics including algebra, statistics, and calculus.
- Implementing complex math algorithms from scratch.
- Designing graph topologies and representing them using programmatic constructs (e.g., arrays).
- Using primitive debugging tools like simple print statements.
- Designing a trained model format to export to, and writing code to export a model to it.
- Creating code that applications can use to host the model (i.e., run the model in the real world, provide inputs to it, and retrieve its output).
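To give a feel for a few of the items above, here is a deliberately tiny sketch of doing things by hand: a network topology stored in plain lists, the math written out manually, and a homemade export format. The layer layout, the weights, and the choice of JSON as the export format are all assumptions made for this example.

```python
import json

# A tiny topology described with plain lists: 2 inputs -> 2 hidden units -> 1 output.
# All weights and biases are made-up numbers.
model = {
    "hidden": {"weights": [[0.5, -0.5], [1.0, 1.0]], "biases": [0.0, -1.0]},
    "output": {"weights": [[1.0, 2.0]], "biases": [0.5]},
}


def relu(value):
    """Replace negative values with zero."""
    return max(0.0, value)


def dense(inputs, layer, activation=None):
    """Compute one layer by hand: a weighted sum plus a bias for each unit."""
    outputs = []
    for unit_weights, bias in zip(layer["weights"], layer["biases"]):
        total = bias + sum(w * x for w, x in zip(unit_weights, inputs))
        outputs.append(activation(total) if activation else total)
    return outputs


def predict(model, inputs):
    """Run the inputs through the hidden layer, then the output layer."""
    hidden = dense(inputs, model["hidden"], activation=relu)
    return dense(hidden, model["output"])


def export_model(model):
    """Serialize the model to a homemade text format (JSON here)."""
    return json.dumps(model)


def load_model(text):
    """Rebuild the model from the exported text."""
    return json.loads(text)


prediction = predict(model, [2.0, 1.0])
restored = load_model(export_model(model))
print(prediction, predict(restored, [2.0, 1.0]))  # [5.0] [5.0]
```

Even at this scale, every layer, every loop, and the file format had to be invented and debugged by the programmer, and nothing here handles training yet.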
On top of this, understanding an ML model required close inspection of its code to visualize the topology and see how its parts transform data. As a consequence, models often became black boxes.
Thankfully today, we have better solutions.
ML Frameworks to the Rescue, Sort of...
Over the last few years, a number of open-source, pure-code ML frameworks have been introduced to help overcome some of these challenges, one of the most popular being TensorFlow. For the non-programmers reading this blog, a framework is a set of code and applications/tools that has been put together to solve some sort of problem in a simpler way. This is called abstraction: the framework makes things easier by automating certain functions, reducing the amount of code you have to write, and so on.
One of the great aspects of ML frameworks is that they provide a command-like language (often in high-level [1] languages like Python) for programmers to describe or specify how the model should be architected and which algorithms and operations should be used. These collections of commands form the application programming interface (API) and provide a convenient way to construct large and complex graphs by describing the desired topologies in a more human-readable manner. Frameworks also provide a rich collection of features and algorithms, a pre-designed and optimized file format in which to store the exported model, and code/tools for hosting the model. These abstractions save you time because you don't need to code all of this fundamental functionality from scratch, and can focus on your model design instead. You can also take advantage of subsequent fixes and enhancements which are continually released by the framework's community.
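To illustrate the idea of abstraction, the sketch below builds a toy "framework" in a few lines. `ToyDense` and `ToySequential` are hypothetical classes written for this post, not part of TensorFlow or any real library; they only mimic the general style of describing a model as a sequence of layers.

```python
def relu(value):
    """Replace negative values with zero."""
    return max(0.0, value)


class ToyDense:
    """A hypothetical fully connected layer with fixed, hand-picked weights."""

    def __init__(self, weights, biases, activation=None):
        self.weights = weights
        self.biases = biases
        self.activation = activation

    def __call__(self, inputs):
        outputs = []
        for unit_weights, bias in zip(self.weights, self.biases):
            total = bias + sum(w * x for w, x in zip(unit_weights, inputs))
            outputs.append(self.activation(total) if self.activation else total)
        return outputs


class ToySequential:
    """A hypothetical container that runs its layers in the order they were added."""

    def __init__(self):
        self.layers = []

    def add(self, layer):
        self.layers.append(layer)

    def predict(self, inputs):
        for layer in self.layers:
            inputs = layer(inputs)
        return inputs


# With the plumbing hidden away, describing a model reads like a short list of commands.
model = ToySequential()
model.add(ToyDense(weights=[[1.0, 1.0]], biases=[0.0], activation=relu))
model.add(ToyDense(weights=[[2.0]], biases=[1.0]))

result = model.predict([1.0, 2.0])
print(result)  # [7.0]
```

The person describing the model only writes the last few lines; everything above them is the kind of code a framework writes once and maintains so that you don't have to.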
However, as great as these frameworks are, they still don't overcome some of the fundamental challenges of ML modeling. For starters, you still need to be a programmer, and even with a high-level language, there is still lots of code to write and review. Furthermore, visualizing the model and debugging it are still difficult because the model simply exists in the form of the programmatic constructs (i.e., many, many lines of code) used to describe it.
The following code snippet [2] describes how to build a simple CNN. For you non-programmers, see if you can figure out what is going on:
```python
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0

# Verify the data
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    # The CIFAR labels happen to be arrays,
    # which is why you need the extra index
    plt.xlabel(class_names[train_labels[i][0]])
plt.show()

# Create the convolutional base
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

# Add Dense layers on top
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))

# Compile and train the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))

# Evaluate the model
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.5, 1])
plt.legend(loc='lower right')

test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
```
During training, you would typically run this code many times, looking at the evaluation output, tweaking some values in code, and repeating until testing of the model indicates it can predict with a satisfactory level of accuracy. This iterative process can become very slow since the whole model has to be run on the entire dataset in order to see the output.
Taking it up a Level
PerceptiLabs builds on TensorFlow by further abstracting its code into commonly used patterns called components, each with an expected input and a specific output. These components are represented as visual elements in PerceptiLabs' UI, which you can drag, drop, and connect. We call this our visual API. In other words, PerceptiLabs essentially writes the code for you behind the scenes. In addition, each component's settings can be easily adjusted through the UI, and for you programmers out there, you also have the option to view and tweak the underlying code. What's more, each component provides a visualization of how it has transformed its input.
This visual workflow does a better job of separating model editing from model training than pure-code ML frameworks do. For starters, PerceptiLabs executes each component separately during modeling. This means that as you change a component in your model, PerceptiLabs immediately re-executes that component and any dependent components further down the model. PerceptiLabs performs this execution using only the first sample from your training data source(s) so that each component updates its visualization. This eliminates the need to re-run the whole model on the whole training dataset during modeling just to see if the model architecture seems correct. This, in turn, allows for faster model iteration, while leaving the heavy processing of the whole dataset for the training phase.
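The general idea of re-running only what changed, on a single sample, can be sketched as follows. This is an illustration of the concept only and is not PerceptiLabs' actual implementation; the `run_from` helper and the three toy components are hypothetical.

```python
def run_from(components, first_sample, previous_outputs, changed_index):
    """Re-run the changed component and every component after it on one sample.

    Outputs of components before changed_index are reused from previous_outputs.
    """
    outputs = list(previous_outputs[:changed_index])
    value = first_sample if changed_index == 0 else outputs[-1]
    for component in components[changed_index:]:
        value = component(value)
        outputs.append(value)  # each stored output could drive a preview
    return outputs


# Hypothetical components, each taking one input and producing one output.
def normalize(pixels):
    return [p / 255.0 for p in pixels]


def total(values):
    return sum(values)


def double(value):
    return value * 2


def triple(value):
    return value * 3


components = [normalize, total, double]
first_sample = [0, 255, 255]  # a single made-up sample, not the whole dataset

# Initial run: every component executes once on the first sample.
outputs = run_from(components, first_sample, [], changed_index=0)
print(outputs)  # [[0.0, 1.0, 1.0], 2.0, 4.0]

# Swap the last component: only it needs to run again.
components[2] = triple
outputs = run_from(components, first_sample, outputs, changed_index=2)
print(outputs)  # [[0.0, 1.0, 1.0], 2.0, 6.0]
```

Because only one sample passes through, and only through the components affected by the change, feedback arrives almost immediately regardless of how large the full dataset is.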
Speaking of the training phase, PerceptiLabs has visualizations for this too. During training, PerceptiLabs shows visualizations for each sample along with a rich set of statistics, all in real time. These allow you to watch how training is progressing, and potentially stop training much earlier than with pure-code ML frameworks to further adjust settings.
ML frameworks provide a convenient, more human-readable language for describing models, but they don’t go far enough in making ML easier to do. They fail to address common ML modeling challenges like understanding, visualizing, and debugging the model. They are also geared toward those with programming experience.
If you are new to ML, it's important to understand the topologies of fundamental models like neural networks, and appreciate how an ML framework like TensorFlow allows you to describe those models with less code. From there it's easy to see how PerceptiLabs' visual API makes ML modeling even easier and opens it up to more types of users.
Ready to get started? Visit the Quickstart Guide which walks you through how to install PerceptiLabs and build your first model.
For additional information about ML model fundamentals, we recommend checking out An Introduction to Deep Learning with PerceptiLabs, Four Common Types of Neural Network Layers (and When to use Them), and this great YouTube series on neural networks.
[1] The level of a programming language refers to how much it abstracts the machine language used to program the hardware. Low-level languages like assembly and C, for example, allow very direct control over hardware features like the CPU and memory, while higher-level languages, which are often implemented using lower-level languages, allow for more human-like expressions and representations of real-world concepts and objects.
[2] Source of example code: https://www.tensorflow.org/tutorials/images/cnn