Welcome back. In this segment we'll briefly describe the latest video compression standard, the High Efficiency Video Coding standard, HEVC. It is the result of the effort by the Joint Collaborative Team on Video Coding, and it therefore has additional names: it's called H.265 by ITU-T, and MPEG-H Part 2 by MPEG.

As computers become more powerful, more involved ideas from the literature, or new ideas, can be implemented and tested. So the old idea of breaking a frame into blocks of various sizes using a quadtree structure to perform motion estimation and compensation can now become part of a standard, since adequate computational resources are available. Similarly, instead of the eight spatial prediction modes used in H.264, 33 directional modes (plus the planar and DC modes, for 35 in total) can be implemented in H.265. The overall result is that the latest standard is a very powerful one, especially for large spatial resolutions. As a matter of fact, it supports resolutions up to 8K UHD, ultra high definition.

It's really fascinating to watch the progress the video compression standards are making. They of course bring up the important question of how much longer we can keep pushing the envelope, by halving the rate of the previous standard while maintaining the quality of the reconstructed video. Maybe some of you might be the ones to push this envelope further. So let us look into some of the details of this exciting standard.

The High Efficiency Video Coding standard is the latest standard out of the Joint Collaborative Team on Video Coding. Its initial objective was to reduce the bit rate by 50% at the same quality when compared to H.264 High Profile. Since it is a joint effort, it has various names: in the ITU it is referred to as H.265 (during development it was also called Next Generation Video Coding), while in the ISO it is referred to as MPEG-H Part 2. Comparing HEVC to H.264, in practice it provides better coding efficiency by more than 35% at the same quality level. It supports resolutions up to 8K UHD, while H.264 supports resolutions up to 4K UHD. It also supports stereo and multi-view video encoding, and it supports increased use of parallel processing, as we are going to see.

One of the novelties of HEVC is the use of coding tree units (CTUs) and coding tree blocks (CTBs). The CTU contains the luma and chroma coding tree blocks as well as the syntax elements. This is a departure from the standard 16 by 16 macroblock used by all previous standards. Of course, we saw that H.264 subdivides this macroblock into various sizes. The CTU, however, can be as large as 64 by 64 pixels, and it is therefore suitable for encoding high-definition content and results in better coding efficiency. The CTU is subdivided into coding blocks, which can be as small as 4 by 4, based on a quadtree structure and, of course, some rate control logic.

What we see in this table is how the maximum CTU size affects the bit rate, the encoding time, and the decoding time. When the largest possible CTU is used, we save bits, because there are many more choices in subdividing the CTU and adapting to the content of the frame. With the smaller CTU sizes, the bit rate increases: by 2.2% here, and by 11% if we consider the smallest size, which coincides with the standard size of the macroblock. The encoding time, however, is higher for the largest CTU, because again there are more choices to be made. The decoding time is smaller than for the other CTU choices, because the blocks are now larger and there are not so many blocks to be decoded.
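To make the quadtree partitioning concrete, here is a minimal sketch in Python. It is not HEVC's actual mode-decision algorithm: the cost function (block variance as a stand-in for distortion, plus a hypothetical per-block signaling penalty `lam`) is a toy substitute for the rate-distortion logic of a real encoder.

```python
# Toy quadtree partitioning of a CTU: split a square block into four
# quadrants whenever coding the quadrants separately looks cheaper than
# coding the block whole, under a hypothetical variance-based cost.

import numpy as np

def block_cost(block: np.ndarray, lam: float) -> float:
    """Stand-in RD cost: distortion approximated by variance times block
    size, plus a fixed signaling cost `lam` for this block's side info."""
    return float(np.var(block)) * block.size + lam

def split_ctu(block: np.ndarray, x: int, y: int, lam: float, min_size: int = 4):
    """Recursively partition a square block; return a list of (x, y, size)."""
    n = block.shape[0]
    if n <= min_size:
        return [(x, y, n)]
    h = n // 2
    quads = [(block[:h, :h], x,     y),
             (block[:h, h:], x + h, y),
             (block[h:, :h], x,     y + h),
             (block[h:, h:], x + h, y + h)]
    # Compare coding the block whole against coding the four quadrants.
    whole = block_cost(block, lam)
    split = sum(block_cost(q, lam) for q, _, _ in quads)
    if split < whole:
        out = []
        for q, qx, qy in quads:
            out += split_ctu(q, qx, qy, lam, min_size)
        return out
    return [(x, y, n)]

# Usage: a flat region tends to stay one large block; a busy region splits.
ctu = np.zeros((64, 64))
ctu[32:, 32:] = np.random.randint(0, 255, (32, 32))  # one detailed quadrant
print(split_ctu(ctu, 0, 0, lam=500.0))
```

Raising `lam` makes each additional block more expensive to signal, so the partition stays coarser; this is a toy version of the trade-off between partition depth, bit rate, and encoding time that the table illustrates.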
The idea, of course, of using a quadtree to segment or tessellate the frame is not new. Here's an old result, and we see here the segmentation for this particular frame, and these are the associated motion vectors for each of these specific blocks. The actual segmentation as well as the bit allocation was done in a rate-distortion optimal manner in this particular work. Behind the segmentation I showed you in the previous frame, there is this multilevel trellis that, on the one hand, shows the quadtree structure: a 32 by 32 block is potentially broken down into four 16 by 16 blocks, each 16 by 16 into four 8 by 8 blocks, and so on. But this trellis also allows us to perform optimal bit allocation for encoding the segmentation, the motion vectors, as well as the texture, so that the overall performance is optimized, or the overall distortion is minimized. We see here the coding tree unit idea being demonstrated: based on this segmentation or tessellation of the space, the coding decisions will be made, and the bits will be allocated, again, amongst segmentation, motion vectors, and texture.

Here's an example showing the tessellation of this frame from the Foreman sequence by H.264 and H.265. Clearly, with the many more choices available in H.265, the coding efficiency only increases. So here we see large 64 by 64 blocks and smaller ones all the way down to 4 by 4. The choices are not as many when we work with H.264; we talked about those various choices in the previous presentation.

When it comes to motion estimation and compensation, quarter-sample precision is used for motion estimation, and this was the case also with H.264. H.265, however, uses 7-tap or 8-tap filters for the fractional-sample interpolation, while H.264 uses a 6-tap filter to reach half-sample precision and then linear interpolation for quarter-sample precision (a small sketch of this filtering appears at the end of this segment). Regarding motion vector signaling, an advanced motion vector prediction (AMVP) scheme is used, according to which candidate motion vectors are derived from neighbouring prediction blocks. In addition, there is the option to merge, or inherit, motion vectors from temporal or spatial neighbors; compared to H.264, this is an improved skip/direct motion mode.

As far as intra prediction goes, H.265 supports 33 different directional modes, as compared to the eight supported by H.264, and these directional modes are applied to blocks from 4 by 4 all the way up to 32 by 32. In addition, there is the planar mode prediction, which is done by assuming an amplitude surface with a horizontal and vertical slope that is derived from the boundaries. And finally there is another mode, the DC or flat mode prediction, according to which a flat surface with a value matching the mean value of the boundary samples is utilized. We see here pictorially the 33 prediction directions; direction zero is the planar prediction and one the DC prediction we just mentioned. It should be noted here that these directions are not all equally spaced, and some directions are more popular than others. So we see here schematically the intra prediction along the direction of mode 29, as it was numbered in the previous slide. When using such an intra-angular mode, each block is predicted directionally from spatially neighbouring samples that have been reconstructed but not yet filtered by the in-loop filters. For an N by N block, a total of 4N + 1 spatially neighbouring samples may be used for this prediction.
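As promised, here is a minimal sketch in Python of half-sample interpolation along one row, using the 8-tap half-sample luma filter associated with HEVC; the border handling (replicating edge samples) and the plain 8-bit clipping are simplifications I have assumed, not the standard's full specification.

```python
# Single-stage half-sample interpolation with an 8-tap filter, in the
# spirit of HEVC (contrast with H.264's 6-tap half-pel filter followed
# by bilinear averaging for quarter-pel positions).

import numpy as np

HALF_PEL_TAPS = np.array([-1, 4, -11, 40, 40, -11, 4, -1])  # taps sum to 64

def interpolate_half_pel_row(row: np.ndarray) -> np.ndarray:
    """Return the half-sample positions between integer samples of a row."""
    n = len(row)
    padded = np.pad(row.astype(np.int64), (3, 4), mode="edge")
    out = np.empty(n, dtype=np.int64)
    for i in range(n):
        # Half-pel sample between row[i] and row[i+1]: weighted sum of the
        # 8 nearest integer samples, normalized by the filter gain of 64.
        out[i] = (padded[i:i + 8] * HALF_PEL_TAPS).sum() >> 6
    return np.clip(out, 0, 255)

row = np.array([10, 10, 10, 200, 200, 200, 10, 10], dtype=np.int64)
print(interpolate_half_pel_row(row))
```

And here is a sketch of the two non-directional intra modes just described, DC (flat) and planar, following the descriptions above rather than the HEVC reference implementation; as a simplification, the boundary arrays carry N + 1 reconstructed neighbors each instead of the full 4N + 1 neighborhood.

```python
# DC and planar intra prediction for an N x N block from reconstructed
# (but not yet in-loop-filtered) top and left boundary samples.

import numpy as np

def dc_prediction(top: np.ndarray, left: np.ndarray, n: int) -> np.ndarray:
    # Flat surface whose value is the mean of the boundary samples.
    dc = (top[:n].sum() + left[:n].sum() + n) // (2 * n)
    return np.full((n, n), dc, dtype=np.int64)

def planar_prediction(top: np.ndarray, left: np.ndarray, n: int) -> np.ndarray:
    # Average of a horizontal and a vertical linear interpolation, with
    # the top-right and bottom-left neighbors as the far anchor points.
    pred = np.zeros((n, n), dtype=np.int64)
    top_right, bottom_left = top[n], left[n]
    for y in range(n):
        for x in range(n):
            horiz = (n - 1 - x) * left[y] + (x + 1) * top_right
            vert = (n - 1 - y) * top[x] + (y + 1) * bottom_left
            pred[y, x] = (horiz + vert + n) // (2 * n)
    return pred

n = 4
top = np.array([10, 12, 14, 16, 18], dtype=np.int64)   # n+1 top neighbors
left = np.array([10, 20, 30, 40, 50], dtype=np.int64)  # n+1 left neighbors
print(dc_prediction(top, left, n))
print(planar_prediction(top, left, n))
```

The planar predictor averages a horizontal and a vertical linear interpolation, which is what produces the constant horizontal and vertical slopes derived from the boundary samples.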
When it comes to entropy coding, a CABAC encoder, context-adaptive binary arithmetic coding, is used in H.265, as is the case with H.264. The one in H.265, however, has undergone several improvements: to increase its throughput speed, especially when it is used for parallel processing, to improve its compression performance, and to reduce its context memory requirements. Also, in H.265 no variable-length codes are used, as opposed to the context-adaptive variable-length coding (CAVLC) alternative in H.264.

With respect to the in-loop filters, which are used in the inter-picture prediction loop: the deblocking filter is similar to the one used in H.264, however it is simpler, and it better supports parallel processing. In addition, a different filter, the sample adaptive offset (SAO) filter, is used. It is applied after the deblocking filter, so that the original signal amplitudes are better reconstructed by using an offset from a look-up table.

HEVC supports slices and tiles. Slices are rows of CTUs, as shown here; this is an example of a slice. As mentioned multiple times during this week, slices are needed for terminating error propagation, since they are independently decodable. In addition to slices, HEVC supports tiles. Tiles are self-contained and independently decodable rectangular regions of the picture. So this is one tile here, this is another tile, and so on. The main purpose of tiles is to enable the use of parallel processing for encoding and decoding. Furthermore, a slice can be divided into rows of CTUs so that the decoding of each row can begin as soon as a few decisions in the previous row have been made. This supports parallel processing in the form of threads, as shown here; all these threads can be processed at the same time. So there are two different mechanisms, if you will, that support parallel processing: the use of tiles and the use of threads.

We show here an experimental comparison of some of the more recent standards. HEVC is compared against H.264, H.263, MPEG-4, and MPEG-2. The rate-distortion curves are shown here for this particular sequence: the bit rate is on the horizontal axis, from 0 to 10 megabits per second, and the PSNR is shown on the vertical axis. As expected, you see the chronology of the development of these standards: the oldest standard has the rate-distortion curve shown here, and the newest one has the rate-distortion curve shown here in blue. So there are considerable savings as the newer standards arrive, and the savings are shown on the plot here on the right. What is shown is the bit rate savings with respect to HEVC. We see that with respect to the older standards the savings are anywhere between 85% and 70%, while with respect to H.264 the savings range from 55% to around 40% down here.

So in some sense we have come full circle. If you recall, at the very beginning of this week's lectures I showed a similar rate-distortion comparison with some of the earlier standards, with also smaller sequences; we used the QCIF format, at a frame rate of, I believe, ten frames per second. Nevertheless, the philosophical question still remains. As time goes on these curves keep being pushed in this direction, and we see here that there is not that much space left. So the obvious question is: will there be a next standard that can push the envelope even further? Or, in other words, what is really the limit of compression? How far can we compress a given video? This is a question that we cannot really answer; we have to wait and see
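As a rough illustration of the band-offset flavor of SAO, here is a minimal sketch, assuming the sample amplitude range is split into 32 equal-width bands and that certain bands carry signaled offsets; the specific bands and offset values below are hypothetical, and the real standard signals offsets only for a few consecutive bands chosen by the encoder.

```python
# Toy band-offset SAO: after deblocking, each sample's amplitude selects
# a band, and the offset looked up for that band is added to nudge the
# reconstruction back toward the original signal amplitude.

import numpy as np

def sao_band_offset(deblocked: np.ndarray, offsets: dict,
                    num_bands: int = 32, bit_depth: int = 8) -> np.ndarray:
    max_val = (1 << bit_depth) - 1
    band_width = (max_val + 1) // num_bands      # 8 for 8-bit samples
    bands = deblocked // band_width              # band index per sample
    out = deblocked.astype(np.int32)
    for band, offset in offsets.items():
        out[bands == band] += offset             # look-up-table offset
    return np.clip(out, 0, max_val).astype(deblocked.dtype)

# Usage: an encoder would pick offsets minimizing distortion versus the
# original frame and signal them; these values are purely illustrative.
deblocked = np.array([[30, 31, 100], [101, 200, 201]], dtype=np.uint8)
print(sao_band_offset(deblocked, offsets={3: 2, 12: -1, 25: 1}))
```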
what the reduction of the bit rate is going to be with the next standard while keeping the same quality. Of course, the objectives change as well. We have the proliferation of large displays and large formats, the 4K and 8K formats, and clearly we need new tools and new approaches to handle these larger formats. At the same time, computational speed and capabilities increase tremendously as time goes on, and it is only natural that more sophisticated algorithms will be able to be implemented, which will of course provide the additional gains that are needed.

So we have reached the end of week ten; two more weeks to go. During this week, the third week out of three dedicated to compression, we talked about video compression. This is probably the most important topic from an application point of view, since we are surrounded by applications that use video compression. This is also reflected by the large number of video compression standards. We discussed and analyzed the important topics and covered some of the intellectually interesting and practically useful techniques. We talked about the important concept of temporal prediction, and also about the concepts of intra and inter prediction. We described the evolution of the video compression standards, which represents a history of the research and development in this area.

I repeat here a comment I made last week: there was nothing really terribly mathematical in this part, so I hope everybody feels comfortable with this material. You are now knowledgeable enough to also follow the future developments in this exciting field. Next week we'll talk about a different topic, the topic of image and video segmentation. So, see you next week.