So if we break down the components of the singular value decomposition, we can look at the U and the V matrices, which contain the left singular vectors and the right singular vectors. The original data matrix is here on the left; that's the image plot that we saw before. In the middle plot I've plotted the first left singular vector, and it roughly tracks the mean of each row of the data set. If you plot it across the rows, you can see that the first left singular vector has a negative value for roughly rows 18 through 40, and then a positive value for the remaining rows. So it shows the clear separation in the means between the two sets of rows. And if you look at the first right singular vector, you can see that it shows the change in the mean between the first five columns and the second five columns. The nice thing about the singular value decomposition here is that it immediately picked up on the shift in the means, both in the row dimension and in the column dimension, without you having to tell it anything. It was totally unsupervised; it just picked up the pattern automatically. Remember, we made a plot that was very similar to this before, but that was when we already knew there was a pattern in the data set. Here the singular value decomposition found the pattern right away, in the first left and right singular vectors.

Another component of the singular value decomposition is known as the variance explained, and it comes from the singular values in the D matrix. Remember, the D matrix is a diagonal matrix; it only has elements on the diagonal. You can think of each singular value as representing the share of the total variation in your data set that's explained by that particular component. The components are ordered so that the first one explains as much variation as possible, the second one explains the second most, et cetera. So you can plot the proportion of variance explained, as in the plot below. On the left-hand side I've plotted the raw singular values, and you can see that they decrease as you go across the components. Of course, a raw singular value doesn't have much meaning on its own because it's not on an interpretable scale. But if I square each singular value and divide by the total sum of the squared singular values, then, on the right-hand side, I can interpret the result as the proportion of variance explained. It's essentially the same picture, but the y-axis has changed. And you can see that the first singular value, which, if you recall, captures the shift in the mean between the rows and between the columns, explains about 40% of the variation in the data. So almost half of the variation in the data is explained by a single singular value, or, if you like, a single dimension. The remaining variation in the data is explained by the other components.
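If it helps to make this concrete, here is a minimal R sketch of the kind of analysis being described. The simulated matrix, the seed, and the size of the mean shift are assumptions for illustration, not the lecture's actual code; `svd()` returns the `u`, `d`, and `v` components discussed above.

```r
## A minimal sketch (assumed simulation, not the lecture's exact code):
## a 40 x 10 matrix in which rows 21-40 have a higher mean in columns 6-10.
set.seed(678910)
dataMatrix <- matrix(rnorm(400), nrow = 40)
dataMatrix[21:40, 6:10] <- dataMatrix[21:40, 6:10] + 3

svd1 <- svd(scale(dataMatrix))

## Data image, first left singular vector (rows), first right singular vector (columns)
par(mfrow = c(1, 3))
image(t(dataMatrix)[, nrow(dataMatrix):1], main = "Data")
plot(svd1$u[, 1], xlab = "Row", ylab = "First left singular vector", pch = 19)
plot(svd1$v[, 1], xlab = "Column", ylab = "First right singular vector", pch = 19)

## Variance explained: raw singular values vs. proportion of variance explained
par(mfrow = c(1, 2))
plot(svd1$d, xlab = "Component", ylab = "Singular value", pch = 19)
plot(svd1$d^2 / sum(svd1$d^2), xlab = "Component",
     ylab = "Proportion of variance explained", pch = 19)
```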
So, just to show that the relationship between the singular value decomposition and principal components is close, here I've calculated the SVD and the PCA of the same data matrix. I've plotted the first principal component on the x-axis and the first right singular vector on the y-axis, and you can see that they fall exactly on a line; they're exactly the same thing. So the SVD and principal components analysis are essentially the same thing. If you're at a cocktail party and you hear two people talking about the SVD and PCA, you can rest assured that they're basically talking about the same thing.

Now let's take a closer look at the variance explained. Here we've got a very simple example, which is just a matrix that contains only zeros and ones. The idea is that there's only one pattern in this matrix: the first five columns are zeros and the second five columns are ones. That's it, nothing special. If I plot the data, you can see on the left-hand side that the first five columns are red and the second five columns are yellowish white. Now if I take the SVD of this relatively boring matrix, I can plot the singular values, and you can see that there's one singular value that's very large and the rest are basically zero. What you'll see is that the first singular value explains 100% of the variation in the data set. So how can that be? Well, if you think about this data set, even though there are 40 rows and ten columns, there's really only one dimension to it: if you're in the first five columns, you're equal to 0, and if you're in the second five columns, you're equal to 1. So even though you have lots of so-called observations, only a small fraction of the information in this matrix is actually useful. And that's captured by the SVD, by the fact that you can explain 100% of the variation with a single component.

So let's add a second pattern to the data set, one that varies across the rows and also across the columns. The first pattern is the block pattern that we saw before: the first half of the columns has one mean and the second half has another mean. The second pattern alternates between the columns. You can see what that looks like over here. On the left-hand side is the data, and you can see that there's one pattern that's like a block pattern, so the left five columns are low and the right five columns are higher. And then you can see that, nested within that, is a pattern where every other column is higher than the one before it. If we plot what the truth is, we can see in the middle plot that the first five columns have a lower mean and the second five columns have a higher mean; that's the first pattern. And then the second pattern shows that the first column has a mean of 0, the second column has a mean of 1, the third column has a mean of 0, the fourth column has a mean of 1, et cetera.
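To check the SVD/PCA equivalence yourself, a small sketch like the following works. The simulated matrix is again an assumption; `prcomp` is R's principal components function, and its first loading vector should line up with the first right singular vector of the scaled data.

```r
## Sketch: the first principal component loading matches the first right
## singular vector of the scaled data matrix.
set.seed(678910)
dataMatrix <- matrix(rnorm(400), nrow = 40)
dataMatrix[21:40, 6:10] <- dataMatrix[21:40, 6:10] + 3

svd1 <- svd(scale(dataMatrix))
pca1 <- prcomp(dataMatrix, scale = TRUE)
plot(pca1$rotation[, 1], svd1$v[, 1], pch = 19,
     xlab = "First principal component", ylab = "First right singular vector")
abline(0, 1)  # the points fall on this line (up to a possible sign flip)
```

And the "one pattern" matrix of zeros and ones can be sketched like this; the point is just that a rank-one matrix puts 100% of its variation in the first component.

```r
## Sketch: a noiseless 40 x 10 matrix that is 0 in columns 1-5 and 1 in columns 6-10
constantMatrix <- matrix(rep(c(rep(0, 5), rep(1, 5)), times = 40),
                         nrow = 40, byrow = TRUE)
svd2 <- svd(constantMatrix)
round(svd2$d^2 / sum(svd2$d^2), 2)
## first component: 1.00 (100% of the variation); all others: 0.00
```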
So there are two patterns here: one is a shift pattern, and the other is this alternating pattern. That's the truth. But of course we rarely know the truth, so we need to learn it from the data. The idea is: if you were presented with the data set shown here on the left, could you come up with an algorithm that picks up on the two separate patterns buried within the data? That's what the SVD is for. So if we run the SVD on this new matrix with the two patterns, I've got the data here on the left, and in the middle panel I've plotted the first right singular vector. You can see it roughly picks up the block pattern: the first five columns are somewhat lower and the second five are somewhat higher. It's not as clean as the truth was, but it's discernible that there are two different means here. The second right singular vector is in the right-hand panel, and it's a little bit harder to see, but it does try to pick up the alternating mean pattern. It's not as obvious as it was when we plotted the truth, but you can see that every other point is either higher or lower. Now, since we know what the truth is, it's a little easier to talk about this. But in general you can see that the two patterns are roughly confounded with each other, because it's hard to set them apart. Even though we know the truth is two separate patterns, the first and second right singular vectors, which are also known as the principal components, mix the two patterns together: each of them has some of the block pattern and some of the alternating pattern. Unfortunately, just as with most real data, the truth is a little harder to discern than it would be if you'd known it in advance.

Now if you look at the variance explained in this problem, you can see in the right-hand panel, which shows the percent of variance explained, that the first component explains over 50% of the total variation in the data set. That's basically because the shift pattern, with the first five columns versus the second five columns, is so strong; it represents a large amount of the variation in the data set. So if you capture that shift pattern, you capture a lot of the variation. The second component only captures about 18% or so of the variation, and it trails off from there. That roughly tells you how many components are in this data set. But the alternating pattern, the one that changes with every other column, is a little bit harder to pick up, as you can see.
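Here is a rough sketch of this two-pattern example in R. The shift sizes, the affected rows, and the seed are assumptions for illustration, not the lecture's exact simulation: add a block shift plus an every-other-column shift, take the SVD, and look at the first two right singular vectors and the variance explained.

```r
## Sketch: two overlaid patterns -- a block shift in columns 6-10 plus an
## alternating shift in every other column, added to rows 21-40.
set.seed(678910)
dataMatrix <- matrix(rnorm(400), nrow = 40)
blockPattern       <- rep(c(0, 3), each = 5)     # 0 0 0 0 0 3 3 3 3 3
alternatingPattern <- rep(c(0, 1.5), times = 5)  # 0 1.5 0 1.5 ...
for (i in 21:40) {
  dataMatrix[i, ] <- dataMatrix[i, ] + blockPattern + alternatingPattern
}

svd2 <- svd(scale(dataMatrix))

## The first two right singular vectors each mix the block and alternating patterns
par(mfrow = c(1, 3))
image(t(dataMatrix)[, nrow(dataMatrix):1], main = "Data")
plot(svd2$v[, 1], xlab = "Column", ylab = "First right singular vector", pch = 19)
plot(svd2$v[, 2], xlab = "Column", ylab = "Second right singular vector", pch = 19)

## Variance explained: the first (block) component dominates, the rest trail off
par(mfrow = c(1, 1))
plot(svd2$d^2 / sum(svd2$d^2), xlab = "Component",
     ylab = "Proportion of variance explained", pch = 19)
```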