Hi. In this video, we will train our first model in TensorFlow. To do that, you need to know that TensorFlow, which is a deep learning framework, has optimizers that can do backpropagation for you. Let's look at a simple graph that has a variable x, which is a scalar, and one operation that takes the square of that variable and stores it in tensor f. Let's say that we want to minimize the value of f with respect to x. To do that, you can call tf.train.GradientDescentOptimizer with learning rate 0.1, and this gives you an optimizer that you can use to minimize the value of f with respect to x. That minimize call gives you an operation, which we call step here, and you can run that operation in your graph as many times as you wish.
The first thing to notice, though, is that you don't have to specify the variables to optimize, because TensorFlow already knows your graph and already knows which variables are needed to compute the value of f. That's why you can omit the list of variables with respect to which you minimize. You can do that because all variables that you create in your graph are trainable by default. You can change that: you can say that a variable is not trainable, and that means it will not be updated during backpropagation. You can get all trainable variables by calling tf.trainable_variables, and for our graph it is just the variable x, which is a scalar.
Next, we make 10 gradient descent steps, and you can do that with a simple Python loop. In each iteration of that loop, you run our step operation, which does backpropagation and a gradient descent update for you. I also want to get the values of x and f, so that I can print them and see the progress. After running this code, you will see something like this: the value of x goes to zero, and the value of f goes to zero as well. So, our minimization seems to work, right?
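To make the update rule concrete, here is a minimal plain-Python sketch of the arithmetic behind those ten steps, written without TensorFlow. The starting value x = 1.0 and the helper name `gradient_descent_steps` are assumptions made for illustration; they are not taken from the lecture code.

```python
def gradient_descent_steps(x0, learning_rate=0.1, num_steps=10):
    """Minimize f(x) = x**2 with plain gradient descent.

    Returns a list of (x, f) pairs, one per step, recorded after each update.
    """
    x = x0
    history = []
    for _ in range(num_steps):
        grad = 2.0 * x                 # df/dx for f = x**2
        x = x - learning_rate * grad   # the gradient descent update
        f = x ** 2                     # f recomputed from the updated x
        history.append((x, f))
    return history


# x0 = 1.0 is an assumed starting point, chosen only for illustration.
history = gradient_descent_steps(1.0)
for i, (x, f) in enumerate(history, start=1):
    print(f"step {i:2d}: x = {x:.4f}, f = {f:.4f}")
```

With learning rate 0.1, each step multiplies x by 0.8, so both x and f shrink toward zero.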
But the problem is that the values we get are not synchronized, because when you run several operations in your session, the order in which they are evaluated is not fixed. So, you can see here that the value of x is already smaller than the one that the respective value of f was computed from. But this is not a problem, because there is a tool for that. To get a synchronized output, you do the following: right after creating your operation f, which is the square of x, you pass it through the tf.Print function, which does not change its value but prints the values of x and f along the way. Later, when you run your gradient descent steps and evaluate this tensor, you get nice logging in your console output with synchronized values of x and f. One thing to notice is that you will only see it in the Jupyter server's console output; it will not be visible in the notebook, because there is a bug in TensorFlow that is still not resolved.
But you can do logging even better with TensorBoard. This is a visual tool that shows plots of your loss functions and any other statistics. Before using TensorBoard, we have to add so-called summaries to our graph. You define a scalar summary for our scalar x and give it a name, current x. We also have a summary for f. To merge all these summaries into a single node, we call tf.summary.merge_all.
This is how we use these summaries. Right after you create your interactive session, you also need to create a so-called summary_writer and say where to store the logs. Notice that it is a good convention to append the run number to the end of this path, because it will be useful later during visualization. Then, after you initialize your variables, you can do 10 steps of gradient descent, and while you do that you can also evaluate your summaries, which gives you the current summary values that you can write with that summary writer.
Also, don't forget to flush that data to disk so that you can see it in a nice web interface. You can run that interface with a simple command like this: tensorboard --logdir=./logs. Then you can open it in your browser, and you will see the following screen. Here, you see how your summaries, current x and current f, change through the iterations. You can also see, here in the red square, your different runs. So, if you rerun your script with a different number at the end of the path, it will appear here, and you will be able to select which run you want to plot, or you can plot them all and compare, for example, your neural network architectures. So, this is a pretty neat technique. You can also visualize your graph in TensorBoard. You can see that the gradient computation, which we got from the backward pass, is also part of the graph. That graph of derivatives is there as well, but it was created for you, so you don't have to mess with it.
Now, let's try to solve a linear regression. But first, let's generate a model dataset. Here, we have 1,000 points in three dimensions. We have features, which are uniformly random. We have weights, which are random as well. Our target is approximately a linear combination of our features, with some random noise so that it looks a little bit more like a real dataset, right? To solve linear regression, we need some placeholders for our input data. The first placeholder is for the features; it has shape something by d, because every instance has exactly d features. Another placeholder is for our target; it is just a single column. We also need some operations to make predictions, and here we just take the product of our features with the weights matrix, which effectively applies a linear model to all our instances. Let's store that result in the predictions tensor. To optimize our parameters, we have to define a loss function. Here, we define it as the mean squared error.
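The dataset and loss described above can be sketched in plain Python, without TensorFlow, to show the arithmetic involved. The Gaussian noise with standard deviation 0.1, the random seed, and the helper names `predict` and `mean_squared_error` are assumptions made for illustration; the lecture does not specify them.

```python
import random

random.seed(0)          # assumed seed, only for reproducibility

N, D = 1000, 3          # 1,000 points with three features each

# Uniformly random features and random ground-truth weights.
features = [[random.random() for _ in range(D)] for _ in range(N)]
true_weights = [random.random() for _ in range(D)]

# Target: linear combination of the features plus random noise.
# Gaussian noise with standard deviation 0.1 is an assumed choice.
target = [
    sum(w * v for w, v in zip(true_weights, row)) + random.gauss(0.0, 0.1)
    for row in features
]


def predict(rows, weights):
    """Apply the linear model to every instance."""
    return [sum(w * v for w, v in zip(weights, row)) for row in rows]


def mean_squared_error(predictions, targets):
    """Average of squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


loss_at_truth = mean_squared_error(predict(features, true_weights), target)
loss_at_zero = mean_squared_error(predict(features, [0.0] * D), target)
print("loss at ground-truth weights:", loss_at_truth)
print("loss at zero weights:", loss_at_zero)
```

The loss at the ground-truth weights is left with only the noise term, while the loss at zero weights is much larger; that gap is what the optimizer has to close.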
We need to compute this loss in terms of TensorFlow operations as well, so that TensorFlow is able to run backpropagation and optimize all the variables that we use in our loss computation. To actually run backpropagation, you need an optimizer. We create it with tf.train.GradientDescentOptimizer, then we call its minimize method and pass our loss function, so that the step we get does backpropagation and gradient descent. Here, we run our gradient descent step for 300 iterations: first we create an interactive session, then we initialize our variables, and then we are ready to run our step 300 times. To actually run backpropagation here, we need to feed the values of the features and target placeholders, because, as you should remember, we used placeholders when we were defining the graph, and now that we want to execute it we need to fill them. I will also print the current loss every 50 iterations. After running this script, you can see our ground truth weights, which are random because it's a model dataset, and you can see that this code actually finds weights that are pretty close to the ground truth, so backpropagation seems to work in this case.
If a training process takes a long time, you will benefit from model checkpoints, because you can save all your work and continue if it's interrupted. To do that, you call tf.train.Saver and pass all the trainable variables to it. You can then use this saver to save all the trainable variables during your iterations to a desired location on disk. Later, you can retrieve the last saved checkpoints and choose which one you want to restore with the restore method. One thing to note, though, is that such a checkpoint contains only tensors and their values.
It doesn't contain the graph definition, which means that before restoring a checkpoint you have to define your graph in exactly the same way as it was defined before. So, this is a downside of this kind of checkpointing, but at least you can save your work.
So, to summarize this video: you should know that TensorFlow has built-in optimizers that do backpropagation automatically for you, so you don't have to write it by hand. TensorBoard is a great tool that lets you visualize your training process, your loss functions, and any other statistics. TensorFlow also allows you to checkpoint the variables of your graph, so that you can restore them later and continue where you left off. But you need to define the graph in exactly the same way, which is a downside, but at least it's possible. This video concludes this week. I want to wish you good luck with our programming assignments.