It is good to see you back. This second lecture deals with the evolution of a Markov chain over time. Recall that in the previous lecture you saw a definition of a DTMC that stems from the point of view of state-transition systems, where each transition carries a label from the interval [0,1] that can be interpreted as a probability. Today, I will show you a second definition, based on discrete stochastic processes, the so-called discrete-time Markov processes.

A stochastic process is described by a family of random variables X(k), where each k corresponds to a discrete point in time. This is, again, my timeline with discrete time points. Here, for example, we have X(0), which may take a value from the state space of the Markov chain, say s0. Then, here, we have X(1), and this random variable at time point 1 takes another value, for example s1, and so on. So we say that X(k) takes the value of the state of the system at time step k.

You have seen that DTMCs are memoryless. Using this definition of stochastic processes, we are able to formally define this memoryless property. Here you see the probability (let me choose red for you) that the random variable X(k) takes the value sk, given the history of all the random variables we have seen so far. This equals the probability that X(k) takes exactly this value sk given only the value of the random variable X at time point k-1. So this illustrates that in a DTMC you only need to know where you are now in order to derive what the next step will be; you do not need to know the history that has brought you there. This property allows us to move back from the stochastic-process point of view to the state-based view I showed you in the last lecture, because we do not need to take into account the history of the random variables in this stochastic process.
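In standard notation, the memoryless (Markov) property described above can be written as follows (the symbols match the random variables X(k) and states sk from the lecture):

```latex
\Pr\bigl(X(k) = s_k \mid X(k-1) = s_{k-1},\, \dots,\, X(0) = s_0\bigr)
  \;=\; \Pr\bigl(X(k) = s_k \mid X(k-1) = s_{k-1}\bigr)
```

The left-hand side conditions on the entire history of the process, while the right-hand side conditions only on the most recent state; the property says these two probabilities coincide.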
Now, of course, we're interested in the evolution over time. What can we say about such a DTMC? We can say that a DTMC is in a single starting state at the beginning, for example me with my coin at the vending machine. Or we can say it assumes a starting position according to an initial distribution. For example, you start to look at your favorite web server, let it be World of Warcraft: then you have a certain probability that it is up and the counter-probability that it is down. This gives you a starting probability. Of course, you want it to be up with probability 1. Then the state of the web server changes, and so does the state of the Markov chain, and the next state is chosen according to the probabilities on the arcs outgoing from each state.

What are the sorts of questions that we can ask? What is the state of the model after k steps? Also, is there a unique state after k steps, or is there something like an average state? All these questions we will address throughout this module. Another possible question is: can all states be reached, and with what probability?

So let us start with the transient evolution of the DTMC and with the so-called transient state probabilities. What is the probability of being in a state s' after exactly k step transitions have been taken? This is the transient state probability, denoted ps'(k). Let's take a quick look at what these variables stand for. The p just tells us it is a probability. The s' is the state we are interested in; s' has to be in the state space. And k tells us the time point we are looking at. For the initial state s, we have, for example, ps(0) = 1: the initial state is s, so at time 0 we are there with probability 1. This is the case of one single initial state. And then we can collect all these probabilities into a vector, into a state distribution.
So here, p(n) is the transient state distribution over the state space at time n. It is a vector with as many entries as we have states in the state space, and it is also a transient discrete probability distribution. What does that mean? Well, first, it means that all the entries come from the interval [0,1]. And it also means that all the probabilities for one time point n accumulate to 1; otherwise, it wouldn't be a probability distribution.

Let's come back to the example of the web server. Recall that if the web server is working today, then it will be working with probability 0.9 tomorrow. But now, what we want to know is actually the probability that this web server will be working on day three. This is the web server again, and we have two probabilities: p0(n), which is the probability to be in state 0 at day n, so the probability that the server is working; and p1(n), which is the probability that the web server is broken at day n. Also, we have an initial distribution, and here we are very optimistic and assume that we always start in the working state.

So what can we do now? We can do some calculations, and I want to show you how to derive the probability distribution for day one, then for day two, and so on. For day one, I want to derive p0(1), the probability to be in state 0 at day 1. This is 0.9 times p0(0), which is the probability that I come from state 0 and stay there with probability 0.9, plus 0.4 times p1(0), which is the probability that I am in state broken at time 0 and then move to state working. The same I can do for state 1 at time 1: either I move to broken, or I stay in broken. So let me clean this up for you. This is for day one. You can do the same for day two, and for day three; it is just the same computation. Now, you may think it is a bit cumbersome to do this explicitly for each state. I completely agree, and this is why we are going to use the matrix notation.
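The day-by-day hand calculation above can be sketched in a few lines of Python. The probabilities follow the web-server example (0.9 stay working, 0.1 break, 0.4 repair, 0.6 stay broken); the function and variable names are my own:

```python
# Web-server DTMC: state 0 = working, state 1 = broken.
# From working: stay working with 0.9, break with 0.1.
# From broken:  get repaired with 0.4, stay broken with 0.6.

def step(p0, p1):
    """One day of the hand calculation:
    p0(n) = 0.9 * p0(n-1) + 0.4 * p1(n-1)
    p1(n) = 0.1 * p0(n-1) + 0.6 * p1(n-1)
    """
    return 0.9 * p0 + 0.4 * p1, 0.1 * p0 + 0.6 * p1

p0, p1 = 1.0, 0.0  # optimistic start: working with probability 1
for day in (1, 2, 3):
    p0, p1 = step(p0, p1)
    print(f"day {day}: working = {p0:.4f}, broken = {p1:.4f}")
# day 1: working = 0.9000, broken = 0.1000
# day 2: working = 0.8500, broken = 0.1500
# day 3: working = 0.8250, broken = 0.1750
```

Note that each day's pair of probabilities sums to 1, as a probability distribution must.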
For such a DTMC model, you have learned that we have this probability matrix P, and now I would like to ask you to actually build this very simple transition probability matrix. It has this form, and you just need to enter the probabilities from the model. Now that we have done this, we have a probability matrix with four entries, and we have the initial probability distribution. And now I show you again the line that we have derived, which we can simply rewrite as p(0) times the probability matrix P. The same we can do for time point 2, 3, or even time point n, as indicated here: p(n) is p(n-1) times the probability matrix P. And now you can even play with this and instead write p(0), the initial distribution, times P to the power of n. Now, it may be cumbersome to compute the nth power of this stochastic matrix, so I would recommend sticking with the iterative approach and using p(n) = p(n-1) times P.

Now, I have done this for you, not by hand of course, for the first 25 steps. What you see is that the probability to be in state 0, that is the working state, drops from 1 and then stabilizes around 0.8. This stabilizing behavior is also called the long-run behavior, and the first part where the probabilities change, where they decrease in this case, is called the transient behavior. And, of course, the question is: does every system have such a long-run behavior? If you are interested in the answer, I would recommend you keep watching this lecture series.
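The iterative computation p(n) = p(n-1) · P for the first 25 steps can be sketched as follows. I am assuming NumPy here; the matrix entries are the ones from the web-server example:

```python
import numpy as np

# Transition probability matrix of the web-server DTMC
# (row = current state, column = next state; each row sums to 1).
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

p = np.array([1.0, 0.0])  # initial distribution: working with probability 1

for n in range(1, 26):
    p = p @ P             # p(n) = p(n-1) * P
    print(f"step {n:2d}: p = {p.round(4)}")

# The working probability drops from 1 and stabilizes around 0.8:
# the transient behavior followed by the long-run behavior.
```

Iterating a vector-matrix product like this costs one matrix multiplication per step, which is why it is usually preferable to explicitly raising the stochastic matrix to the nth power.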