Welcome to the course on Audio Signal Processing for Music Applications. This week we're talking about the sinusoidal model. In the theory lectures we presented the model from a signal processing and mathematical point of view. In this demonstration class I want to present it from a practical point of view, so actually using it.

So, let's go directly to the sms-tools GUI. Let's start with the DFT model, and let's open 1_sine.wav. It's a sine wave at 540 hertz. Let's listen to that. [SOUND] Okay, so let's use a Hanning window, that's fine. Maybe let's use a window size of 511 and an FFT size of 1,024, and compute it. Okay, so this is the sine wave, the magnitude and phase spectra, and the signal reconstructed from the spectrum. The sinusoidal model will attempt to find the spectral peak: its height, its location, and the phase corresponding to that location. Of course, it will do that for the whole sound, so it will start from the STFT.

So, if we get the same sound and the same type of parameters, 511 and 1024, and for this one, let's put half of that window. And this is the STFT, the spectrogram. We listen to the output. [NOISE] Right, pretty good. So, here we are seeing this red line, which is basically the peak as it evolves in time, and it is very stable. And in the phase we see this clear area, which is the flat phase of the main lobe.

Okay, but now let's go to the sine model and do the same thing. Let's open the sine wave. Let's use the same parameters, 511 and 1024. And now we come to parameters that are specific to the sinusoidal analysis. For example, we have the magnitude threshold, in decibels, for the peaks that we're going to be looking for. So, we're going to be looking for peaks that are within 50 dB of 0 dB, that is, of the maximum of the sound. We can restrict the minimum duration of the tracks; for example, let's just put zero, so no restriction.
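The magnitude threshold described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the sms-tools implementation: the sine is generated in code rather than read from a file, the sampling rate is taken to be 44,100 Hz, and the FFT size is set to 4,096 (larger than the 1,024 used in the demonstration) so that the side lobes of the window are sampled finely enough to show up as local maxima. The helper name `count_peaks` is hypothetical.

```python
import numpy as np

def count_peaks(mag_db, threshold_db):
    """Count local maxima of mag_db that lie above threshold_db."""
    inner = mag_db[1:-1]
    is_peak = (inner > mag_db[:-2]) & (inner > mag_db[2:]) & (inner > threshold_db)
    return int(np.count_nonzero(is_peak))

fs = 44100   # sampling rate in Hz (assumed)
f0 = 540.0   # frequency of the sine in Hz
M = 511      # window size, as in the demonstration
N = 4096     # FFT size (assumption: larger than in the demo, for finer sampling)

# One frame of a sine wave, windowed with a Hanning window.
n = np.arange(M)
x = np.cos(2 * np.pi * f0 * n / fs)
spectrum = np.fft.rfft(x * np.hanning(M), N)

# Magnitude spectrum in dB, normalized so that the maximum is 0 dB.
mag_db = 20 * np.log10(np.abs(spectrum) + 1e-12)
mag_db -= mag_db.max()

peaks_at_minus_50 = count_peaks(mag_db, -50.0)
peaks_at_minus_20 = count_peaks(mag_db, -20.0)
print("peaks above -50 dB:", peaks_at_minus_50)
print("peaks above -20 dB:", peaks_at_minus_20)
```

With a threshold of -50 dB the side lobes of the Hanning window count as peaks, while with -20 dB only the main lobe survives.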
We can restrict the maximum number of sinusoids; 150 is plenty. In fact, there is only one. And we can restrict the deviation from frame to frame, in terms of how a track evolves. Here we put 10 at the lowest frequency, and normally it's good for this deviation to increase as the frequency goes up, so this is the factor by which the deviation increases with frequency. Okay, so let's compute that.

And okay, interestingly enough, we are seeing more than one line. Okay? This is the resynthesis; we can listen to that. [SOUND] That's pretty good. It's a sinusoid. But in the analysis, here we are only seeing the frequency values of the peaks. We're not seeing the magnitude or the phase of these lines, and clearly they may not all have the same amplitude. So, where do these come from? In fact, they come from the side lobes. We specified [INAUDIBLE] threshold of minus 50. And if we go back to the DFT of the signal and zoom into the peak, we're going to see that apart from the main peak there are side lobes that are peaks and are within 50 decibels of the maximum energy, so these have been picked up. If we don't want to pick up these side lobes, we should make this threshold less permissive, for example minus 20. And yeah, now there is only one line.

Okay, let's complicate this and analyze a slightly more complicated signal, for example two sine waves. So, let's go back to the DFT, let's get rid of all these windows, and let's open the sum of two sine waves, 440 and 490 hertz, summed together. We have to choose a window size appropriate for the specific window, and we said in the theory class that in order to do that we can use an equation: the window size has to be the number of bins of the window's main lobe, which in the case of the Hanning window is 4, times the sampling rate.
Which in this case is 44,100, and we have to divide it by the frequency distance that we want to resolve, in this case between 440 and 490 hertz, so 50 hertz. So this number, 3,528, is the window size, the minimum window size if we want the main lobes of these two sinusoids to be separate. So, let's put 3,528 here, and we'll make it an odd size, so we'll put 3,529. The FFT size has to be a power of two bigger than that, and let's just analyze at this point. Okay, and now if we zoom into this region, we will see, okay, the two peaks of the two sinusoids.

Okay, then we will do the sinusoidal model. We could do the short-time Fourier transform, but let's go directly to the sinusoidal model and put in the same values. Let's get the sum of these two sinusoids. Maybe let's listen to it. [NOISE] Okay. And now we can set the window size to 3,529. Okay? And again, let's set the FFT size to 4,096. And okay, for the rest of the parameters, let's lower the threshold; I think minus 20 is probably too little, so let's just put minus 30, for example. Okay, and let's leave the others the same as we had before. Okay, let's compute that. Okay, and now we have the original sound and the synthesized one. We can listen to that. [SOUND] Okay, that's pretty good. And if we zoom into this area, we see the two lines, at 440 hertz and 490 hertz.

What happens if the window size is not large enough? For example, what happens if it's 1,800? And of course, we're going to need such a [INAUDIBLE]; let's say 2048 is sufficient. And here, to be safe, let's have a lower threshold, let's say -40. And let's leave the rest the same. Okay, and now we see something interesting. If we listen to the output sound [SOUND] it's not that bad, but if we look at the analysis, it definitely is very different. Okay, what we are seeing is that it kind of gets confused.
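The window-size rule used above (main-lobe width in bins, times the sampling rate, divided by the frequency distance to resolve) is easy to check numerically. The sketch below is a minimal illustration; the function names `min_window_size`, `make_odd` and `next_power_of_two` are hypothetical helpers, not part of sms-tools.

```python
import math

def min_window_size(main_lobe_bins, fs, delta_f):
    """Minimum window size so that two main lobes delta_f Hz apart do not overlap."""
    return main_lobe_bins * fs / delta_f

def make_odd(size):
    """Round up to an integer and, if it is even, go to the next odd size."""
    size = math.ceil(size)
    return size if size % 2 == 1 else size + 1

def next_power_of_two(size):
    """Smallest power of two that is greater than or equal to size."""
    return 1 << (size - 1).bit_length()

fs = 44100           # sampling rate in Hz
hanning_bins = 4     # main-lobe width of the Hanning window, in bins
delta_f = 490 - 440  # frequency distance to resolve, in Hz

m_min = min_window_size(hanning_bins, fs, delta_f)  # 3528.0
m_odd = make_odd(m_min)                             # 3529
n_fft = next_power_of_two(m_odd)                    # 4096

# Frequency distance that a window of 1,800 samples can resolve, in Hz.
resolvable_with_1800 = hanning_bins * fs / 1800     # 98.0

print(m_min, m_odd, n_fft, resolvable_with_1800)
```

A Hanning window of 1,800 samples only separates components that are about 98 Hz apart, which is why the two sinusoids 50 Hz apart are not tracked reliably.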
Sometimes it finds two sinusoids, sometimes it doesn't, so clearly we don't have enough resolution to discriminate these two sinusoids, even though when we sum it all together it doesn't sound that bad. Okay, but anyway, we have to pay attention to this and make sure that we understand how the parameters have to be set for analyzing specific types of sound.

And now let's go to a real sound. Let's go directly to the oboe sound, which is the note A4 at 440 hertz. So let's calculate again the size of the window that we need to have. Maybe let's change the window, and let's put, for example, a Blackman window. Okay, what should the window size be for this sound? For the Blackman window, the number of bins is 6. The sampling rate is the same, 44,100. And now we have to divide by the fundamental frequency. The fundamental frequency of the sound is 440 Hz, and that's the distance between two consecutive harmonics, so we put 440 Hz here. And this is the size that we need, 601 samples, okay? So let's put 601 samples. For the FFT size, well, let's put the next power of two.

And now, of course, we are dealing with a big sound that has a lot of harmonics, and they go pretty far down in amplitude. So let's use a more permissive threshold, that is, a lower value. Okay, so let's put -80. These parameters matter a bit, and there might be some spurious tracks. So let's get rid of tracks that are short; for example, let's put 0.05 here. In this case we will need quite a few sinusoids. And we definitely will need this deviation, so this is a deviation that will be quite useful. Okay, and let's compute it. Of course, it takes longer because it's a much more complex sound. And this is the result.
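The frequency deviation parameters used in this class (a deviation at the lowest frequency, plus a factor that increases it as the frequency goes up) can be sketched as follows. This is an illustration under assumptions, not the tracking code of sms-tools: the offset of 10 Hz is the value used earlier in the class, while the slope of 0.001 is an assumed value chosen only for the example, and the function names are hypothetical.

```python
def allowed_deviation(freq_hz, offset_hz=10.0, slope=0.001):
    """Maximum frame-to-frame frequency change allowed at freq_hz.

    offset_hz is the deviation at 0 Hz; slope (an assumed value here)
    makes the allowed deviation grow with frequency.
    """
    return offset_hz + slope * freq_hz

def continues_track(prev_freq_hz, new_freq_hz, offset_hz=10.0, slope=0.001):
    """True if a peak at new_freq_hz may continue a track that was at prev_freq_hz."""
    limit = allowed_deviation(new_freq_hz, offset_hz, slope)
    return abs(new_freq_hz - prev_freq_hz) < limit

# A harmonic near 440 Hz that moves by 5 Hz stays in the same track,
# while a jump of 30 Hz does not.
print(continues_track(440.0, 445.0))  # True
print(continues_track(440.0, 470.0))  # False

# Higher harmonics are allowed to move more between frames.
print(allowed_deviation(440.0), allowed_deviation(4400.0))
```

For a sound like the oboe, whose upper harmonics move more in hertz than the lower ones, letting the limit grow with frequency keeps the high tracks from breaking up.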
We have the original sound and the tracks. Here we're only visualizing the first 5,000 hertz, but if we went higher we would see more. We see all the horizontal lines pretty well, and of course here at the end maybe some harmonics that are disappearing. Let's listen to the original. [SOUND] [SOUND] And the synthesis. [SOUND] Okay, that's pretty good.

What happens if, for example, the threshold is not low enough? Let's put -30, for example, and let's recompute it. Okay, now what we are seeing is that it has not been able to capture all the harmonics. The harmonics that were too soft were not analyzed. So, let's hear what we got back. [SOUND] Yeah, clearly this is only a part of that sound.

And that's basically what I wanted to talk about. So, this was a class in which I tried to introduce the sinusoidal model from a practical perspective. And of course we have used sms-tools, the GUI, and the sinusoidal sounds and the oboe sound that we have been playing around with come from Freesound. And that's all, so hopefully that has given you a practical view of the sinusoidal model. Of course, this requires you to practice quite a bit more. It's very important to analyze many sounds and to actually see how the different parameters affect different sounds. And in fact, that's what we're going to do in the next demonstration class. So, I hope to see you in the next demonstration class, where we will analyze more complex sounds and go into the sinusoidal model more deeply. So, see you next time.