Deep Learning 101 – A Real World Example


Today, I want to share with you a little-known secret: Deep Learning, a specific branch of Machine Learning, isn't hard to understand! I know, I know, with all the complex math and vocabulary like Epochs, Activation Functions, and Hidden Layers, it can feel overwhelming. However, in this blog post, I will break it down and keep it simple. I've found in my journey of learning TensorFlow that the best thing to do is just get started, begin with the simplest of examples, and build a model. But first, you might ask, what is a model?

Google has published an encompassing Glossary of Machine Learning Terms on this subject. It's fun to thumb through some of the terms and learn about some of the more abstract areas of Machine Learning. Google defines a model in this glossary as:

“The representation of what a machine learning system has learned from the training data.”

Fair enough. So basically, this is the core of how the system will give us the results that we are looking for. In essence, it's the main algorithm for deriving our results, and it's based on the data that we have given it to learn from. Before we go further, I believe there's another, more basic concept that should be covered, and that is why Machine Learning is such a different paradigm from how we have handled programming in the past. First, a bit of history.

History

When computers were first being programmed, and even to this day, many people think of them as large calculators running conditional logic. These computers can handle vast amounts of decisions and complex operations much faster than any human could. The approach was that you, the human, would program the computer so that when it received certain input, it would respond with a specific output. For example, a program might read:

user_input_1 = float(input("Enter the first number: "))
user_input_2 = float(input("Enter the second number: "))
total = user_input_1 + user_input_2
print(f"Your sum is: {total}")

If you haven’t seen this diagram on the internet before, I believe it does a great job of describing the differences between traditional programming and Machine Learning.

Traditional Programming vs. Machine Learning

Source: Machine Learning Mastery

I have been writing software applications since elementary school and throughout my entire career, and this idea that you could feed in data and output and get a program in return completely blew my mind the first time I encountered it.

If we step back and understand that the formal definition of Machine Learning reads:

“Machine learning (ML) is the study of computer algorithms that improve automatically through experience.”

It's this data and output that make up the experience, and the program then becomes more and more intelligent over time. In fact, there's a very human aspect to all of this if you believe in Bayes' Theorem, which I highly suggest you dig into and read more about, but in short, the theorem states:

“Initial belief plus new evidence = new and improved belief”.

There’s that “new evidence” (or what is covered in the ML definition of “experience”) piece again.

So, what have we learned? We've seen that Machine Learning is really the art of taking data with known outcomes and feeding it into a machine that then comes up with an algorithm to match that data. The beauty of this is that all those millions of lines of code that might otherwise have been needed to handle the data and produce a consistent output can be avoided. The machine does the work!

Our First Model

Let's finally get to work and build our first model.

Let’s say we have some data that looks like this:

Input:  0,  10, 12,   25, 35
Output: 32, 50, 53.6, 77, ?

Would you be able to calculate what the "?" would be in this series of outputs?

If you hadn't guessed already, the answer is 95. Through trial and error, you might recognize that the values are related by the equation:

Output = Input * 1.8 + 32

You might also realize that this is the same equation as converting from Celsius to Fahrenheit that reads:

F = C x 1.8 + 32
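
As a quick sanity check (plain Python, no Machine Learning yet), we can confirm that this formula reproduces the series above, including the missing value:

for celsius in [0, 10, 12, 25, 35]:
    print(celsius, celsius * 1.8 + 32)
# Prints 32.0, 50.0, 53.6, 77.0 and finally 95.0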

What your brain just did to figure out these relationships is exactly what Machine Learning does. Not so magical, is it?

In traditional Software Development, we would write a function like this:

def function(input):
    return input * 1.8 + 32

In the world of Machine Learning, we instead have a function for which we know a series of inputs and their outputs:

def function(input) ← 0, 10, 12, 25, 35
    < what will generate the output given the input? >
    return → 32, 50, 53.6, 77

That < ? > placeholder is where Machine Learning comes in. Specifically, we use what is known as a Neural Network to figure out the best mapping of inputs to outputs. It does this by going through a training process with data that we call our "Training Set". We'll get into more details on how a Neural Network works in future posts, but for now, think of it as a function with variables that will be tuned to produce the correct output given the known inputs. It uses a network of what are known as neurons (nodes) that make decisions based on weights.
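
To make that a bit more concrete, here is a minimal sketch of a single neuron in plain Python. The function name and the hand-picked weight and bias are purely illustrative; in a moment we'll let TensorFlow find those values for us.

def neuron(x, weight, bias):
    # A single neuron computes a weighted input plus a bias
    return x * weight + bias

# With the "right" weight and bias, one neuron reproduces our conversion
print(neuron(35, 1.8, 32))  # 95.0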

Single Layer Neural Network

TensorFlow and Google Colab to the Rescue

So, enough about how a human would solve this problem; let's solve it using Machine Learning. The first step on our journey is to learn about Colab!

From Google’s Colab FAQ, “Colaboratory, or “Colab” for short, is a product from Google Research. Colab allows anybody to write and execute arbitrary python code through the browser and is especially well suited to machine learning, data analysis, and education. More technically, Colab is a hosted Jupyter notebook service that requires no setup to use, while providing free access to computing resources including GPUs.”

Let’s Jump into Colab!

If you want to jump ahead and run an already defined notebook that goes through the exercise, you can watch the video that I have prepared. The notebook that we will cover can be found here.

After you watch the video and go through the notebook, return here for a preview of what’s coming in the next post.

What Have We Learned?

Hopefully, you now have a better understanding of what Machine Learning is and how it can be applied to a relatively straightforward application of creating a model that will convert from Celsius to Fahrenheit.

In this tutorial you have:

  1. Been exposed to a popular platform called TensorFlow and an online tool for exploring and training models called Colab.
  2. Created your own Colab notebook to build a model that converts an input in Celsius to Fahrenheit.
  3. Seen how I went through the process of taking training data and assembling the layers of a Neural Network.
  4. Used the Loss and Optimizer functions to compile the model, then trained it and tested it with some predictions.

Another way to look at how Machine Learning works is with the following graphic.

Machine Learning 101 – Input and Output

The input data and output data were fed into the system and the formula was calculated based on that data.

Amazing! If you looked at my Colab notebook and watched my video, you'll see that all of this was done with just a few lines of TensorFlow code.

import tensorflow as tf
import numpy as np

# Training data: the example values from above (the notebook's values may differ)
celsius_q = np.array([0, 10, 12, 25, 35], dtype=float)
fahrenheit_a = np.array([32, 50, 53.6, 77, 95], dtype=float)

l0 = tf.keras.layers.Dense(units=1, input_shape=[1])
model = tf.keras.Sequential([l0])
model.compile(loss='mean_squared_error', optimizer=tf.keras.optimizers.Adam(0.1))
history = model.fit(celsius_q, fahrenheit_a, epochs=500, verbose=False)
model.predict([100.0])

Here we successfully predict, with very good accuracy, that 100 degrees Celsius converts to roughly 211.7 degrees Fahrenheit.
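
If you're curious about what the model actually learned, you can peek at the weights of that single layer. This call isn't shown in the snippet above, but get_weights() is a standard Keras method, and the exact numbers will vary slightly from run to run:

print(l0.get_weights())
# Expect a weight close to 1.8 and a bias close to 32, which is our conversion formula!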

Gradient Descent

From the diagram below, you'll see that once a value is predicted, the difference between that predicted value and the correct value is calculated. The amount by which it is off is called the loss. It's a measure of how well the model performed in mapping the input to the output. After the loss is calculated, the internal weights of all the layers of the Neural Network are adjusted so as to minimize this loss, that is, to make the output value closer to the correct value. This optimization process is called Gradient Descent. As you get deeper into Machine Learning, you will learn more about these concepts, and I'll be working to cover them in future blog posts.

Machine Learning 101 – Optimization
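
To give you a feel for what that adjustment loop looks like, here is a hand-rolled sketch of gradient descent for our one-weight, one-bias model in plain Python with NumPy. It is illustrative only; the Adam optimizer that Keras used above is more sophisticated, and the learning rate and step count here are my own choices:

import numpy as np

celsius = np.array([0, 10, 12, 25, 35], dtype=float)
fahrenheit = np.array([32, 50, 53.6, 77, 95], dtype=float)

w, b = 0.0, 0.0          # start with an arbitrary weight and bias
learning_rate = 0.001

for step in range(20000):
    predictions = w * celsius + b
    error = predictions - fahrenheit
    loss = (error ** 2).mean()            # mean squared error
    # Gradients of the loss with respect to w and b
    grad_w = 2 * (error * celsius).mean()
    grad_b = 2 * error.mean()
    # Step the parameters "downhill" along the gradient to reduce the loss
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)  # lands close to 1.8 and 32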

Awesome! So, What’s Next?

You might have noticed that at the end of the Colab notebook we created, we saved the model. Why is this important?

While looking at and building models in Colab is interesting, it won't help us if we want to take this model and use it in our software application as a component that other parts of a system can call through an API. For that, we'll have to deploy our model as a service.
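
The exact saving code lives in the Colab notebook, but as a rough sketch (the file name below is illustrative and may differ from what the notebook uses), saving and reloading the trained model with the Keras API looks like this:

# Save the trained model so it can be deployed or reused later
model.save('celsius_model.keras')

# Later, for example on a server, load it back and make predictions
restored = tf.keras.models.load_model('celsius_model.keras')
restored.predict([100.0])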

In the next post of this series, we'll talk about some of the ways to use a trained model so that predictions can happen either from a server using TensorFlow Serving or on a device using TensorFlow Lite.

More on that coming soon. For now, I hope you have enjoyed this tutorial, and please reach out to me at justin@lab651.com if you have any questions or wish to talk further about how I can help bring Machine Learning to your organization. Until next time!
