Concerning actors, we have talked about their message-driven or event-driven nature. We have talked about how supervision makes them resilient against failure. And we have seen how routing of messages can achieve scalability. As a conclusion to this part we need to talk about the fourth tenet of reactive programming, which is responsiveness. And we will see that this trait ties all four of them together into a cohesive whole.

The first thing to note is that responsiveness means that a system can respond to input within a given time limit. If it is unable to respond within the waiting time you give it, that is equivalent to the system not being available. This means that resilience is not achieved if the system cannot respond. No matter how you restart parts of your actor hierarchy, for example, if that does not lead to the system generating responses in time, then it is all for nothing. One regime in which this is particularly hard to achieve is when the system is overloaded. But even there it holds that if the system does not respond in an overload scenario, then it is not resilient. In the following, we will talk about patterns for achieving responsiveness in the normal case, in the failure case, and under load.

Let us first look at the nominal case and see what we can do there. We had this example in the aggregation patterns where the PostSummary actor fires off requests to different backend services, aggregates them, and then responds to the sender when everything has been received. But what we do here is ask one actor, and when we get the reply we ask another one, and when we get that reply we ask the third one. Once we have all three of them, we construct the response, which can be a result or a failure, and pipe it back to the sender. This adds up the call latencies of all three actors before the final result can be dispatched. The PostSummary actor will respond quite a lot sooner if we instead fire off the requests in parallel. So we ask in parallel and then tie the resulting futures together in one for-comprehension to compute the result, and this will reduce the latency. Obviously, the slowest of the three actors to respond will define how long it takes the PostSummary actor to respond in turn.
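The following is a minimal sketch of this parallel formulation, assuming classic Akka actors. The message types and the three backend references (userService, postStore, commentStore) are illustrative placeholders, not taken from the original example.

    import akka.actor.{Actor, ActorRef}
    import akka.pattern.{ask, pipe}
    import akka.util.Timeout
    import scala.concurrent.duration._

    // Illustrative message types; the actual PostSummary example may differ.
    case class GetSummary(postId: String)
    case class GetAuthor(postId: String)
    case class GetText(postId: String)
    case class GetCommentCount(postId: String)
    case class Summary(author: String, text: String, comments: Int)

    class PostSummary(userService: ActorRef,
                      postStore: ActorRef,
                      commentStore: ActorRef) extends Actor {
      implicit val timeout: Timeout = Timeout(2.seconds)
      import context.dispatcher // ExecutionContext for the for-comprehension and pipeTo

      def receive = {
        case GetSummary(postId) =>
          // Fire off all three requests first so they run in parallel ...
          val author   = (userService ? GetAuthor(postId)).mapTo[String]
          val text     = (postStore ? GetText(postId)).mapTo[String]
          val comments = (commentStore ? GetCommentCount(postId)).mapTo[Int]
          // ... and only then tie the futures together in a for-comprehension.
          val summary = for {
            a <- author
            t <- text
            c <- comments
          } yield Summary(a, t, c)
          // The combined result (value or failure) is piped back to the sender.
          summary pipeTo sender()
      }
    }

The sequential version would instead nest the asks, so that the second request is only sent after the first reply has arrived; the only change above is that all three asks are issued before any future is awaited.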
Once all opportunity for parallelism has been exploited, the next step is to look at the responsiveness of each single component and try to reduce the latency there. One thing which should be avoided is that the processing cost depends on how loaded your system is, because that will amplify problems: users start to flock to your business, for example, you become successful, and then suddenly the site does not work anymore, because you built in something which is, say, order of n squared in the number of users currently in your system. This means choosing data structures and algorithms which preferably exhibit linear or logarithmic complexity. Once you have reduced the time it takes to process a single request as far as is practical or desirable, you need to add parallelism where needed, for example using the scalability patterns with the different routing strategies and pools we talked about. But inevitably, every system has a certain limit, and once the system's capacity is reached, requests will start piling up. This means that the processing gets backlogged, queues fill up, and the latency rises for everyone using the system. Eventually the clients themselves will time out.

You have seen this on web pages, for example, where the browser tells you that the site did not respond. What do you do in that case? You go to a different site if you have a choice, so this needs to be avoided. One tool for achieving this is the circuit breaker pattern, an implementation of which comes with Akka. Let us use an example where we want to contact the user service and ask it about a certain user, and we want to make this resilient against the user service being overloaded or just constantly failing. The ask returns a future, and the circuit breaker wraps this future and looks at whether and when it succeeds. The configuration is as follows: the call timeout is one second, so for every future put into it the breaker checks whether it was completed within one second. If it was not, it increases a failure counter, and when that reaches three, the circuit breaker opens. All subsequent requests then fail immediately without contacting the user service. This takes the pressure off the user service and makes the system respond a lot faster. Then, as requests keep coming in, every 30 seconds the circuit breaker allows one request through to see if it succeeds. If that is the case, the circuit breaker closes again and things proceed normally; but if that request also fails, the breaker opens again for another 30 seconds. You will notice that the timeout for the ask operation is two seconds, while the call timeout for the circuit breaker is one second. That can come in handy: a single request may take two or even five seconds, but if three in a row are slower than one second, you want the circuit breaker to trip (a sketch of this setup follows below).

This pattern is a good way to separate two components such that failures in one do not influence the other. But it does not completely isolate actors if they happen to run on the same dispatcher. Therefore, the last thing we need to consider is to segregate the resources available to different parts of your system, to make them independent of each other. For example, the part of the system which is responsible for sending a response to the client needs to function as long as possible, even if all the backend services are down. This can be achieved by configuring these actors to run on different nodes, for example, or, on the same host, on different dispatchers. You configure this when you create the Props of your actors by calling withDispatcher, naming for example "compute-jobs" in this case, and that will make the actor run on a different thread pool than its parent, for example. If you do not specify this, actors run on the so-called default dispatcher. Its configuration section, which is the current default in Akka, says that the executor is a so-called fork-join-executor, and that it uses a minimum of eight threads, a maximum of 64, and within that range defaults to three times the number of CPU cores you have available. You configure another dispatcher just by putting another config section into your file with the name you gave in the withDispatcher call. Then you can configure compute-jobs, for example with a fork-join-executor locked down to exactly four threads, which you reserve for this kind of compute job. While doing this, you should keep in mind that configuring many more threads than you have CPUs in your system can defeat the purpose of this bulkheading, because then those threads will compete for the available CPU cores.
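Here is a minimal sketch of the circuit breaker setup described above, assuming classic Akka. The UserProxy actor, the userService reference, and the GetUser message are illustrative placeholders; the timing values are the ones from the example.

    import akka.actor.{Actor, ActorRef}
    import akka.pattern.{ask, pipe, CircuitBreaker}
    import akka.util.Timeout
    import scala.concurrent.duration._

    case class GetUser(id: String) // illustrative message, not from the original example

    class UserProxy(userService: ActorRef) extends Actor {
      import context.dispatcher
      // A single ask may take up to two seconds ...
      implicit val timeout: Timeout = Timeout(2.seconds)
      // ... but three calls in a row slower than one second trip the breaker;
      // after 30 seconds one probe request is let through again.
      val breaker = new CircuitBreaker(
        context.system.scheduler,
        maxFailures  = 3,
        callTimeout  = 1.second,
        resetTimeout = 30.seconds)

      def receive = {
        case GetUser(id) =>
          // While the breaker is open this fails immediately,
          // without contacting the user service.
          breaker.withCircuitBreaker(userService ? GetUser(id)) pipeTo sender()
      }
    }

And a dedicated dispatcher of the kind just described might look like this in application.conf, assuming the example name compute-jobs; the parallelism values for the default dispatcher are the ones quoted above.

    # The default dispatcher uses a fork-join-executor with
    # parallelism-min = 8, parallelism-max = 64, parallelism-factor = 3.0.
    # A separate pool, locked down to exactly four threads, could look like this:
    compute-jobs {
      type = Dispatcher
      executor = "fork-join-executor"
      fork-join-executor {
        parallelism-min = 4
        parallelism-max = 4
      }
    }

    // Selecting that dispatcher when creating an actor; Worker is a placeholder.
    context.actorOf(Props[Worker].withDispatcher("compute-jobs"), "worker")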
Detecting failure in distributed systems takes time, because the only thing you can observe is that you do not observe anything, and you need to give yourself a timeout for that. There are systems where you can afford this wait, and then you have, for example, a backup system to which you switch once the failure is detected. But obviously the latency of this is limited by how frequently you check that the primary system is up. Where that is not fast enough, you can do nothing but always send to all immediately. Say you have three systems, A, B, and C, and requests coming in. Nominally you would use A and then switch to B or C as needed. But if that switch needs to happen within a millisecond, for example, there is no choice: you always need to send to all of them to keep them in sync, because within a millisecond you cannot transfer all the state between A and B. What you can do then, as in highly redundant satellite systems, for example, is get the responses from all three, and as soon as you have two responses and they match, you can send the reply. And if you notice that one of the nodes does not respond within the time allotted, you can shut it down and request a new one to be added. Such a scheme allows you to be always responsive, even if one node fails (a rough sketch of this matching step is given at the end of this section).

We have seen how event-driven systems can scale vertically, because you can dispatch the processing of events to any number of CPU cores in your system. And if you add the ability to send events over the network, making your system location transparent, that adds horizontal scalability, because you can run your computation on a whole cluster of nodes. But the quality we want to achieve in the end is that the system we construct responds to inputs, giving the correct outputs. This demands not only scalability, which we get by being event-driven and location transparent, but also resilience, which means that failure is contained and fixed by delegation. Therefore, we can see how responsiveness ties together all the principles of reactive programming: resilience, scalability, and the event-driven nature.
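As referenced above, here is a rough sketch of the "reply as soon as two answers match" step, written with plain Scala futures. The function name and its shape are illustrative only and not part of any Akka API; shutting down and replacing a slow node is not covered here.

    import scala.collection.mutable
    import scala.concurrent.{ExecutionContext, Future, Promise}

    // `replies` stands for the futures obtained by querying the redundant nodes
    // in parallel; the first value reported by at least two of them wins.
    def firstTwoMatching[A](replies: Seq[Future[A]])
                           (implicit ec: ExecutionContext): Future[A] = {
      val result = Promise[A]()
      val counts = mutable.Map.empty[A, Int]
      replies.foreach { reply =>
        reply.foreach { value =>
          val agreeing = counts.synchronized {
            val c = counts.getOrElse(value, 0) + 1
            counts(value) = c
            c
          }
          // As soon as two responses agree, the combined reply can go out;
          // a node that never answers in time simply does not contribute.
          if (agreeing >= 2) result.trySuccess(value)
        }
      }
      result.future
    }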