Augmented reality: feature detection and description technology. Now we are going into the details of how augmented reality is actually done. In order to display information or place an object at a certain location, on a certain entity in your real-world view, you first have to detect that object. You have to decide what in the real-world view you are going to put text, an image, or a character on top of. To do that, you need feature detection and description technology.

Augmented reality feature detection. Characteristic primitives of an image are identified. Identification is done by highlighting unique visual cues, and various feature detection techniques exist. The objective is efficient and effective extraction of stable visual features. What does this mean? Well, if the processing task is too heavy, requiring a great deal of computation, or if it computes inaccurately so that the results contain errors, then augmented reality cannot be used. For these reasons, augmented reality was not feasible on older devices. It is very capable now because the technology has reached a level where it is efficient: it runs quickly with low computation, saving the battery of your augmented reality device. In addition, what it finds is very reliable and accurate, so the information displayed on top of it is very effective. This combination of efficiency and effectiveness is what enables augmented reality applications, and it is why augmented reality is so popular now and is going to become much more popular in the near future.

Augmented reality feature detector requirements. The first is robustness against changing imaging conditions. What does that mean? Well, think of it this way.
This is not virtual reality, where the environment and all the characters within it are make-believe. This is augmented reality: on the glasses you are using right now, you get a view of the real world. But the probability that you will stay fixed at exactly one location, looking at one object, is zero. You are going to move around and look at it from various angles as you go about your daily work, or your daily pleasure of playing and relaxing. That means your view of a certain object or of the environment is going to change constantly. Because this is the real world, displayed on your smartphone, your glasses, a head-mounted display, or another device such as your car windshield, the viewing angle, the lighting, and the surrounding environment are going to change constantly. Based on these changing imaging conditions, we need the augmented reality feature detection technology to remain robust and reliable, so that even if you look at an object under different lighting, in a different environment, or from a different angle, it will still detect the same object the same way and provide your desired information correctly.

The second requirement is satisfying the application's Quality of Service (QoS) requirements. Quality of Service here includes device lifetime: how long you can use it on a certain device. No matter how good a device is, if you can only use it for five minutes, it will not sell. It also includes accuracy, and whether the information is displayed within the given amount of time. People using their glasses are not going to keep staring at the same object until the information pops up in an augmented reality fashion; they are going to look at whatever they need to look at at that moment.
The augmented information, or the augmented object that is added onto the real-world view, has to be displayed immediately. How immediately? Down to the level of perhaps 10 to 50 milliseconds, or even faster. The entire augmented reality process must complete and be served within that time, before the user looks in a different direction and a new view enters their vision range. Therefore, the device has to be very accurate and very fast, with very little power consumption. These Quality of Service requirements must be satisfied for a product that is going to be used in a specific way.

AR feature detection influencing factors include the following. The environment, of course, because the environment will constantly change. Changes in viewpoint: even in the same environment, the angle from which I look gives me a different view of the object. Image scale: when you look close and when you look far, the object appears large or small. Whether viewed through a magnified close-up or a wide view that takes in the whole landscape, the object will be greatly enlarged or reduced, and the resolution of the object in the image will vary significantly. Your augmented reality device needs to work properly at magnified scales as well as reduced, compact scales; that is where image scale support comes in. Resolution: your camera may have different resolutions. The resolution may change based on the lighting, based on the amount of energy left in your device, or based on the network connection: if the network is becoming slow, the throughput is reduced and the data rate becomes smaller.
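The 10 to 50 millisecond target can be made concrete with a simple per-frame budget check. A minimal sketch, assuming a hypothetical pipeline; the stage names and times below are illustrative assumptions, not measurements from any real device:

```python
# Hypothetical AR per-frame latency budget check. The 50 ms ceiling
# corresponds to the upper end of the 10-50 ms target; every stage of
# the pipeline must fit inside it before the user's view changes.

def fits_budget(stage_times_ms, budget_ms=50.0):
    """Return (total_ms, ok): total pipeline time and whether it fits."""
    total = sum(stage_times_ms)
    return total, total <= budget_ms

# Illustrative stage times (assumed, not measured).
stages = {
    "capture": 5.0,    # camera frame acquisition
    "detect": 12.0,    # interest point detection
    "describe": 8.0,   # feature description
    "match": 10.0,     # descriptor matching
    "render": 6.0,     # overlay rendering
}

total, ok = fits_budget(stages.values())  # 41.0 ms, within budget
```

The point of the sketch is that detection and description together consume a large share of the budget, which is why their computational efficiency matters so much.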
And there is more and more queueing delay. If you rely on networking support to get augmented reality services, then the resolution of the video that takes in the image may be reduced. Those are the resolution issues. Lighting is another factor: is it bright or dark? If it becomes really dark and the device is equipped with infrared, it may go into infrared mode, and the infrared camera that takes in the image may not provide the same resolution that the camera would on a bright day. In that case the resolution drops as the device goes into infrared mode, because the lighting is not supportive. We will look at these kinds of effects in our augmented reality smartphone testing project, which is the last module of this lecture. So hang in there, and we will test a couple of things on some interesting apps.

The interest points detected in the interest point detection (IPD) process can be used to obtain feature descriptors. A feature descriptor is basically computed over a circular or square region centered at a detected interest point. As you can see, there are circles in this image and in that image over there, and the size of the circles fits the size of the regions based on their scales: there are big circles and small circles. Circles are used in this figure and that figure as well; sometimes, however, you will see squares instead of circles representing the region around the object being identified. In feature detection, image features with unique interest points are detected. Feature descriptors characterize image features using a sampling region. Exactly how is this done? We will talk about this in further detail in this lecture and the following lectures, so hang in there. Sampling an image patch around each detected interest point is where most of the feature description work happens.
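The idea of a sampling region centered at an interest point can be sketched in plain Python. This is a toy illustration under assumptions: the function name, the toy image, and the base region size are made up for this example, and real schemes derive the region size from the scale at which the interest point was detected:

```python
def extract_patch(image, x, y, scale, base_size=4):
    """Return a square patch of side (2*base_size*scale + 1) centered at
    interest point (x, y); pixels outside the image are zero-padded."""
    h, w = len(image), len(image[0])
    r = int(base_size * scale)  # half-width grows with the detected scale
    patch = []
    for dy in range(-r, r + 1):
        row = []
        for dx in range(-r, r + 1):
            px, py = x + dx, y + dy
            row.append(image[py][px] if 0 <= px < w and 0 <= py < h else 0)
        patch.append(row)
    return patch

# Toy 8x8 grayscale image with a bright pixel at (x=4, y=3).
img = [[0] * 8 for _ in range(8)]
img[3][4] = 255

small = extract_patch(img, 4, 3, scale=1)  # 9x9 region
large = extract_patch(img, 4, 3, scale=2)  # 17x17 region
```

The two calls mirror the big and small circles in the figure: the same interest point, described over regions whose size tracks its scale.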
As for methods, one method (method 1) uses spectra descriptors that are generated by considering the gradients of the local image region. Techniques supporting this include SIFT and SURF, which we will look at in further detail, of course. What is a gradient? It is what is used for the local image region processing in method 1. You can think of it like this: a gradient is a multi-variable generalization of the derivative. A derivative is taken with respect to a single variable and is normally a scalar; a gradient is taken over multiple variables and is therefore normally represented as a vector. Using gradients is what enables method 1 to generate the spectra descriptors.

Another method (method 2) uses local binary features that are identified using simple point-pair pixel intensity comparisons. Instead of using the gradient, which involves differentiation, these methods are much simpler and less computationally burdening, so you can run them quicker and the energy consumption of the device is reduced significantly. That is the purpose of these techniques. The algorithms that use this approach in feature detection are BRIEF, ORB, and BRISK, which of course we will study in further detail.

Feature detection and description schemes. Schemes that do both feature detection and description are SIFT, SURF, ORB, and BRISK. BRIEF does feature description only, and FAST is a feature detector only. We will study these algorithms in detail.

AR scheme mix matching. What do we mean by mix matching? Feature detection results in a specific set of pixels at specific scales identified as interest points. Different feature description schemes can then be applied to different interest points to generate feature descriptors.
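The two methods can be sketched side by side in plain Python. This is a heavily simplified, hypothetical illustration: the central-difference gradient is only the basic ingredient that spectra descriptors such as SIFT build histograms from, and the point-pair layout is made up rather than taken from BRIEF; the real algorithms differ substantially:

```python
import math

# Toy grayscale image: image[y][x] is a pixel intensity.
img = [[0] * 5 for _ in range(5)]
img[1][1] = 30
img[2][2] = 100

# Method 1 ingredient: the image gradient, a 2-variable generalization
# of the derivative, approximated here with central differences.
def gradient_at(image, x, y):
    """Return (gx, gy, magnitude, orientation) at pixel (x, y)."""
    gx = (image[y][x + 1] - image[y][x - 1]) / 2.0  # d/dx
    gy = (image[y + 1][x] - image[y - 1][x]) / 2.0  # d/dy
    return gx, gy, math.hypot(gx, gy), math.atan2(gy, gx)

# Method 2 ingredient: point-pair intensity comparisons. Each pair is
# (offset1, offset2) relative to the interest point (an assumed layout).
PAIRS = [((-1, -1), (1, 1)), ((1, -1), (-1, 1)),
         ((0, -1), (0, 1)), ((-1, 0), (1, 0))]

def binary_descriptor(image, x, y):
    """One bit per pair: 1 when the first sample is brighter than the second."""
    return tuple(
        1 if image[y + dy1][x + dx1] > image[y + dy2][x + dx2] else 0
        for (dx1, dy1), (dx2, dy2) in PAIRS
    )

def hamming(d1, d2):
    """Binary descriptors are matched with a cheap Hamming distance."""
    return sum(a != b for a, b in zip(d1, d2))

# A left-to-right intensity ramp gives a pure horizontal gradient.
ramp = [[10 * x for x in range(5)] for _ in range(5)]
gx, gy, mag, theta = gradient_at(ramp, 2, 2)  # gx=10, gy=0

desc = binary_descriptor(img, 2, 2)  # bits from comparisons, no derivatives
```

Note the contrast the lecture draws: method 2 needs only comparisons and bit operations, no division or trigonometry, which is why the binary-feature family is so much cheaper to compute and to match.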
This is because some schemes have specific strengths in the way they process, such that you may want to mix and match a certain sequence to enable a certain feature while achieving a certain level of reliability, robustness, and accuracy, and at the same time being sufficiently fast and sufficiently low in power consumption. Scheme selection based on the application and on the characteristics of the interest points is possible. You will see some mixing and matching used to create a certain algorithm, such as ORB, which I will soon explain in further detail. These are the references that I used, and I recommend them to you. Thank you.