Now looking into the video frames that are in the fragments: we have the I frame, the intra-coded frame; the P frame, the predicted frame; and the B frame, the bi-predictive frame. An I frame is an independently encoded picture. So, if you get an I frame and try to present it on the monitor, it works well, because it has all the information for the image fully contained. Then, in order to increase the compression rate, we use P frames and B frames. A P frame depends on one previously decoded I or P frame, and this is to increase the compression rate: these P frames are smaller than I frames because, compared to the former frame, only the parts that are different, that have changed in this new frame, are encoded, and that increases the compression rate significantly. Then we have B frames, which depend on multiple previously decoded pictures. What does this mean? Why would you need to rely on multiple decoded pictures? Think of it this way. If the overall background came from one image, and a certain object that newly shows up came from another image, then what if you just took those two and combined them to create another image? You would be using multiple former images to construct your new image. If it works that way with only a little bit of extra image information, you can create the new image at a very low data rate. So, the compression becomes very powerful and the resulting file becomes very small, which is exactly what we want. That is one rough analogy for what the B frame is actually doing. P and B frames have higher compression rates, which make the video streaming file very small, and that is wonderful. They encode the motion-compensated differences, which is the relative information only, of other frames, and that is how these frames get to be so small and the compression rate so high.
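To make the idea concrete, here is a minimal toy sketch of why P frames are small. It is not a real codec: there is no motion compensation, and the frames are just lists of pixel values; only the difference from the reference frame is stored.

```python
# Toy sketch (not a real codec): a "P frame" stores only the pixel
# differences from a previously decoded reference frame.

def encode_p_frame(reference, current):
    """Store only the per-pixel differences from the reference."""
    return [c - r for r, c in zip(reference, current)]

def decode_p_frame(reference, residual):
    """Reconstruct the current frame: reference plus residual."""
    return [r + d for r, d in zip(reference, residual)]

i_frame = [10, 10, 10, 10, 200, 10]     # full picture, self-contained
next_frame = [10, 10, 10, 10, 205, 10]  # only one pixel changed

residual = encode_p_frame(i_frame, next_frame)
# Most residual values are zero, which compresses extremely well.
assert decode_p_frame(i_frame, residual) == next_frame
```

The mostly-zero residual is the point: a run of zeros costs almost nothing after entropy coding, which is why P and B frames are so much smaller than I frames.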
This provides advantages: when you save a video file on your device, it takes very little space. In addition, when you request it to be sent to you, just a little bit of data flow enables a large amount of video playback. So it is a win-win in terms of both networking and storage. The video fragments are made up of one or multiple GoPs, G-o-Ps, which stands for Groups of Pictures. Each group of pictures begins with an I frame and is then followed by P or B frames. This results in a much smaller video file size compared to using all I frames, because an I frame is large; it contains the full information of an overall image. A very high data compression rate is achieved due to the P and B frames used in a GoP. There are two types of GoP. One is the closed type, in which P and B decoding depends only on frames within that GoP. An open GoP, however, is one in which some P or B frames depend on frames beyond the range of the GoP they are contained in. So, when you're using open GoPs, you need to keep some earlier GoPs in memory, because an open GoP has some P and B frames that need to take information from those earlier GoPs that were already decoded and played. However, when you have closed GoPs, the overall buffer size you need to dedicate in memory for video streaming playback can be smaller, because everything is self-contained: you open up a GoP and all of the dependencies are within that GoP. So, based upon closed versus open, there are different buffer levels you can achieve, and how you play the video back will depend on which type of GoP you're using. HTTP progressive download is one of the most widely used pull-based media streaming methods, and that is because it has ubiquitous connectivity.
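The closed-versus-open distinction can be sketched as a simple dependency check. This is a toy model, not a real bitstream parser: frames are type codes, and dependencies are just index pairs that are assumed for illustration.

```python
# Toy sketch of GoP structure: each GoP starts with an I frame.
# A GoP is "closed" when every dependency points inside that GoP.

def is_valid_gop(frames):
    """frames: list of 'I', 'P', or 'B' type codes."""
    return bool(frames) and frames[0] == "I" and \
        all(f in ("I", "P", "B") for f in frames)

def is_closed_gop(frames, dependencies):
    """dependencies: (frame_index, ref_index) pairs.
    Closed means every referenced index lies within this GoP."""
    n = len(frames)
    return all(0 <= ref < n for _, ref in dependencies)

gop = ["I", "B", "B", "P", "B", "B", "P"]
deps_closed = [(3, 0), (1, 0), (1, 3)]   # all references inside the GoP
deps_open = [(1, -1)]                    # refers back into a previous GoP

assert is_valid_gop(gop)
assert is_closed_gop(gop, deps_closed)
assert not is_closed_gop(gop, deps_open)
```

With the closed set of dependencies the decoder can forget everything before the GoP; the open case forces it to keep earlier decoded frames in the buffer, which is exactly the memory cost described above.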
Almost all mobile devices, PCs, laptops, tablets, and pads use this technology. So, this is why it's so popular and why you need to learn about it. So, here we go. It is used in almost all applications and web services over the Internet. YouTube uses HTTP over TCP, and various other video streaming technologies use it as well. TCP supports error recovery and reordering of data segments, and erroneous and lost packets are recovered by TCP. YouTube needs to provide adaptive transmission control to avoid video stalling, and that is exactly what MPEG-DASH does. Pull-based media streaming based upon HTTP is the most common approach used for video service delivery and various other media streaming delivery. The media client is the active entity that requests content from the media server, and the server's response depends on the request from the client. So you can see that there is a direct pull mechanism: unless it is requested, the server will not send anything to your smartphone. In addition, the client decides the bitrate at which to receive the media packets. The bitrate-deciding factors are the buffer status of the client and the available network data transfer rate and delay. HTTP is an application protocol that enables data communication for the World Wide Web, and it is very popular. HTTP supports distributed, collaborative, hypermedia information systems. In addition, HTTP is the protocol used to exchange or transfer hypertext. Hypertext is structured text that uses hyperlinks between PCs, smartphones, and servers. This is what website addresses are requested through: text with special formats that enables websites and various hypertext support services. So, you can see why HTTP is needed and, naturally, why it is very popular. The fact that HTTP is based on a pull mechanism makes it even better, and these combinations come together in MPEG-DASH. Now, HTTP has a history that is quite interesting.
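Since the client decides the bitrate from its buffer status and the measured network rate, here is one hedged sketch of such a decision rule. The bitrate ladder and the 5-second threshold are made-up values for illustration; real players use more elaborate, player-specific policies.

```python
# Sketch of a client-side bitrate decision (values are assumptions):
# pick the highest representation the network supports, and step
# down one level when the buffer is running low.

REPRESENTATIONS_KBPS = [400, 800, 1600, 3200]  # hypothetical ladder

def choose_bitrate(throughput_kbps, buffer_seconds):
    candidates = [r for r in REPRESENTATIONS_KBPS if r <= throughput_kbps]
    if not candidates:
        return REPRESENTATIONS_KBPS[0]   # even the lowest may stall, but try
    choice = candidates[-1]
    if buffer_seconds < 5 and choice != REPRESENTATIONS_KBPS[0]:
        # buffer close to starvation: step down so it refills faster
        choice = REPRESENTATIONS_KBPS[REPRESENTATIONS_KBPS.index(choice) - 1]
    return choice

assert choose_bitrate(2000, 20) == 1600  # healthy buffer, take the best fit
assert choose_bitrate(2000, 3) == 800    # low buffer, step down a level
assert choose_bitrate(300, 10) == 400    # weak network, take the minimum
```

Note that both inputs come from the client side, which is exactly the pull-based division of labor: the server just answers requests.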
It was created in 1989 at CERN, the European Organization for Nuclear Research. It was standardized by the IETF, the Internet Engineering Task Force, and the World Wide Web Consortium. So, from the beginning, you can see that it was made to be a web and Internet support mechanism, and it is definitely serving its original purpose. HTTP versions include 1.0, the original HTTP, which makes a separate connection to the same server for every resource request. An upgraded version, 1.1, came in RFC 2068, released in 1997. This version can reuse a connection multiple times for downloading video, audio, and data. So, as you can see, there is a significant boost in efficiency in how it can support multiple requests. HTTP/1.1 was revised and RFC 2616 was released in 1999, replacing the former RFC 2068; this is what is commonly used on the Internet nowadays. HTTP/2 was standardized in 2015 and is now supported by major web servers. HTTP functions as a request-response protocol based upon a client-server computing model: a request is made and a response is sent back. It is the typical pull-based mechanism. Who is making the pull request? The client. And who is responding? The server. An example would be: the client is a web browser, and the server is the application running on a computer hosting a website. Here is an example where a client sends an HTTP request message to the server. You can see the request message based upon a GET with a URL, using HTTP version 1.1, and there is an "OK" message sent back, where the resource data is returned from the server to the client. Or it could be the other way around, as discussed in the example on the former page. The server provides resources such as HTML files and video content, or functions, to the client.
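The request and "OK" response just described look like this as raw HTTP/1.1 messages. The host name and path are made-up examples; only the message shape (request line, headers, blank line, then the body) is the point.

```python
# The GET request and the "200 OK" response as raw HTTP/1.1 text.
# Host and path are illustrative, not real addresses.

request = (
    "GET /video/fragment1.mp4 HTTP/1.1\r\n"
    "Host: media.example.com\r\n"
    "Connection: keep-alive\r\n"   # HTTP/1.1 can reuse this connection
    "\r\n"
)

response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: video/mp4\r\n"
    "Content-Length: 6\r\n"
    "\r\n"
    "<data>"   # the resource bytes follow the blank line
)

# The status line carries the "OK" the client checks for.
status_line = response.split("\r\n", 1)[0]
assert status_line == "HTTP/1.1 200 OK"
assert request.startswith("GET ")
```

Notice the `Connection: keep-alive` header: that is the HTTP/1.1 connection reuse mentioned above, so the next fragment can be fetched over the same TCP connection.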
The server sends back a response message to the client, and the response includes status information for the processed request. HTTP permits intermediate network elements to improve or enable communication between the client and server. HTTP is an application layer protocol that needs a reliable transport layer protocol, and this is why HTTP works so well with TCP: TCP does session control and, in addition, has window-based control that works with the session control, and therefore it is made very reliable. So, HTTP with TCP is a good combination, and that is why YouTube video uses it. HTTP can also use unreliable protocols, for example in HTTPU and SSDP (Simple Service Discovery Protocol). HTTP resources are identified on the Internet using URLs, and that is where HTTP works so well, considering its original URL-based connectivity support. URLs use the URI schemes of HTTP and HTTPS. The URIs are hyperlinks in HTML, which is the Hypertext Markup Language, and these documents form interlinked hypertext documents. HTTP is commonly supported by firewalls, so connection blocking due to security issues rarely occurs; once again, this is because it is based on a pull mechanism. The client device controls the streaming. Because it is a pull mechanism, the server does not need to maintain session state information for the client device. HTTP/1.1 can reuse a connection multiple times, and that was one of the benefits of 1.1: it can be used for downloading video, audio, and data. Popular websites can benefit from web cache servers, which can send previously stored content on behalf of an upstream server to reduce the response time. This is one way to make the response time faster. In addition, it prevents unnecessary traffic from being exchanged over the Internet, thereby helping to improve Internet quality.
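The web-cache behavior can be sketched in a few lines. This is a toy model with a made-up URL and payload; real caches also honor expiry and validation headers, which are omitted here.

```python
# Toy sketch of a web cache: serve a stored copy when we have one,
# and only go to the upstream server on a miss.

cache = {}
upstream_fetches = 0

def upstream_fetch(url):
    """The slow, long-distance path to the origin server."""
    global upstream_fetches
    upstream_fetches += 1
    return f"content of {url}"

def cached_get(url):
    """Return cached content when available; fetch and store otherwise."""
    if url not in cache:
        cache[url] = upstream_fetch(url)
    return cache[url]

first = cached_get("http://example.com/popular.html")
second = cached_get("http://example.com/popular.html")
assert first == second
assert upstream_fetches == 1   # the second request never left the cache
```

The second request is answered locally, which is both the faster response time and the saved upstream traffic described above.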
HTTP proxy servers at private network boundaries can facilitate communication for clients without a globally routable address. This is enabled by relaying messages with external servers. Adaptive HTTP streaming is a combination of adaptive video rate control and progressive downloading. Typical proprietary services that use adaptive HTTP streaming include Microsoft Smooth Streaming, Adobe HTTP Dynamic Streaming, and Apple HTTP Live Streaming. In this combination of adaptive video rate control and progressive downloading, several representations of the same video with different quality levels are stored in the server and downloaded. You will see this included in the MPEG-DASH package. Videos are commonly divided into fragments of 2 to 10 seconds in length. The client device periodically checks the network condition every 2 to 10 seconds, based upon the fragment length, and may make changes, and these changes are based upon fragment units. So, within a fragment you view what is going on, and for the next fragment you may want a different data rate or a different video quality. Therefore the controlling units are the fragments, and that is why the every-2-to-10-second period, set by the fragment size, is also the reference at which the periodic checks are conducted. The client device tries to avoid video stalling, because we want our video to keep streaming; therefore, we want to avoid having our buffer emptied. Buffer starvation: we don't want it to occur. Once again, we want the buffer to maintain a stable, good level, in which the rate of video playback and the rate of video packets arriving are the same or nearly the same, such that the buffer level is maintained in a stable fashion. One selection option that is commonly used is to select the video quality level to be slightly lower than the bitrate the network can support.
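That "slightly lower than the network can support" rule can be written as a safety margin on the measured throughput. The 0.8 margin and the bitrate ladder here are assumptions for illustration, not values from any standard.

```python
# Sketch: pick the quality level at a safety margin below the
# measured throughput (margin and ladder values are assumptions).

def select_quality(levels_kbps, measured_kbps, margin=0.8):
    """Choose the highest level not exceeding margin * throughput."""
    usable = margin * measured_kbps
    fits = [level for level in levels_kbps if level <= usable]
    return fits[-1] if fits else min(levels_kbps)

ladder = [400, 800, 1600, 3200]
# 2000 kbps measured, but we only "spend" 80% of it -> 1600 kbps fits.
assert select_quality(ladder, 2000) == 1600
# Without the margin we would have picked 1600 anyway here, but at
# 1700 kbps measured the margin (1360 usable) keeps us at 800.
assert select_quality(ladder, 1700) == 800
```

The headroom between the chosen level and the measured rate is what absorbs small throughput fluctuations, which is exactly the stability argument that follows.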
Because if you're using the full capacity of the network, then small changes in network throughput will have a fluctuating effect on the video stream being received. We don't want that; we want it to be as stable as possible. So, instead of using the full throughput the network can support, we lower it down a little bit and use that level in a steady, stable fashion. This has the effect of saving network bandwidth as well. So it saves network bandwidth and also makes video playback very stable, such that the buffer does not go empty and buffer starvation does not occur. Phase one of the process is the Burst Downloading Phase: a very fast client buffer fill-up is achieved, and quick video playback is attempted. Then, in phase two, the Throttle Download Phase, you maintain the buffer fill-up level to avoid video stalling. So, at first we want to hurry up and start the video, so we go into burst mode, download a lot, and quickly start to play; then we want to maintain a stable level.
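The two phases can be sketched as a small simulation. The target buffer level, burst rate, and tick granularity are all made-up numbers; the point is only the switch from burst filling to throttled, steady-state downloading.

```python
# Sketch of the two download phases (all constants are assumptions):
# burst-fill the buffer to start playback fast, then throttle the
# download to hold the buffer near a target level.

TARGET_BUFFER_S = 10   # hypothetical target fill level, in seconds
BURST_RATE = 3.0       # burst: fetch 3 s of video per 1 s of wall time
PLAYBACK_RATE = 1.0    # playback drains 1 s of video per second

def simulate(ticks):
    buffer_s, phases = 0.0, []
    for _ in range(ticks):
        if buffer_s < TARGET_BUFFER_S:
            phase, download = "burst", BURST_RATE
        else:
            # throttle: download only as fast as playback drains
            phase, download = "throttle", PLAYBACK_RATE
        buffer_s += download - PLAYBACK_RATE
        phases.append(phase)
    return phases, buffer_s

phases, final_level = simulate(8)
assert phases[0] == "burst"        # starts by filling up quickly
assert phases[-1] == "throttle"    # settles into maintenance mode
assert final_level >= TARGET_BUFFER_S
```

In the throttle phase the download rate matches the playback rate, so the buffer level stays flat, which is the stable state the lecture is describing.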