Hello everyone. Today we'll talk about machine learning basics. In this class we'll cover three topics: supervised learning methods, unsupervised learning methods, and evaluation metrics for machine learning models. Let's start with supervised learning.

For supervised learning, we'll talk about how to build a predictive model. A predictive modeling pipeline, as the name suggests, is not a single algorithm but a computational pipeline. This pipeline is an iterative process that involves several steps. First, we define the prediction target. Second, we construct a cohort of relevant patients in the cohort construction step. Third, we define all the potentially relevant features in feature construction. Fourth, we select which features are actually relevant to predicting the target variable in the feature selection step. Fifth, we compute the predictive model, which can be a classification model or a regression model. Sixth, we evaluate the predictive model's performance. This process is often iterated multiple times until we're satisfied with the resulting model. Next, let's introduce each step.

First, let's talk about the prediction target. There are often many targets that investigators want to predict using data, and they are all interesting. However, only a subset of them are possible. A good prediction target is one that is both interesting to the investigator and possible to answer with the data.

So how do we know if a target is interesting? There are several options. First, in medical domains, we should talk to domain experts, for example medical doctors, who understand which problems or targets are important to predict. Second, there are a lot of domain publications in medicine, and we can study them to learn what other researchers consider important. Third, we can use general common-sense criteria. For example, a high-cost target such as heart failure admission would be an interesting one.
If we can predict heart failure before it actually occurs, we can try to avoid the disease onset. Targets that take a long time, for example complicated surgery, will also be interesting, because if we can predict them, we can potentially change the patient's trajectory to avoid the need for complicated surgery. Targets that indicate bad quality of care, for example sepsis, a life-threatening complication of infection, would be interesting to study, because if we can accurately predict sepsis early, we can save the patient's life.

How do we know if a target is possible before we actually build the model? There are also several strategies. First, we can consider human performance on the task. For example, for diagnosing an x-ray image, we should first understand the average performance of a human radiologist on this task. Human performance here provides a simple feasibility check for any machine learning model. Experience from similar projects is another feasibility check. For example, if we have developed an accurate predictive model for heart failure in one hospital, we can feel more confident about developing a heart failure (HF) predictive model for another hospital. Knowing the literature also helps. If other researchers have done a similar task and published their results, you can be more confident that the target is possible, and you can also learn from their approach by reading their papers.

For this lesson, let's focus on predicting the onset of heart failure, which is an interesting and potentially possible target. Here is a quiz question about heart failure. Make a guess: how many new cases of heart failure occur each year in the USA? Is it A, 17,000; B, 260,000; C, 550,000; or D, 1,250,000? The correct answer is C, 550,000, which makes it a huge healthcare problem.

So what are the motivations for early detection of heart failure? Heart failure is a very complex disease.
There's no widely accepted characterization or definition of heart failure, probably because of the complexity of the syndrome. It has many potential etiologies, diverse clinical features, and numerous clinical subsets. If we can predict heart failure earlier, we can potentially reduce the cost and hospitalization associated with it. We can also introduce early interventions to try to slow down the disease progression, improve quality of life, and reduce mortality. In the long term, we can improve existing clinical guidelines for heart failure prevention.
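
To make the six pipeline steps described earlier a bit more concrete, here is a minimal sketch in Python for the heart failure onset target. Everything in it is an assumption made for illustration and does not come from the lecture: the toy patient records, the field names (`age`, `sbp`, `diabetes`, `visits`, `hf_onset`), the age cutoff used for the cohort, the class-mean-gap score used for feature selection, and the nearest-centroid classifier standing in for the predictive model. A real pipeline would use real clinical data and a proper modeling library.

```python
from statistics import mean

# Step 1: prediction target (hypothetical label name for heart failure onset).
TARGET = "hf_onset"

# Hypothetical toy records; every field and value is invented for illustration.
PATIENTS = [
    {"age": 72, "sbp": 165, "diabetes": 1, "visits": 4, "hf_onset": 1},
    {"age": 55, "sbp": 125, "diabetes": 0, "visits": 2, "hf_onset": 0},
    {"age": 35, "sbp": 118, "diabetes": 0, "visits": 1, "hf_onset": 0},
    {"age": 68, "sbp": 158, "diabetes": 1, "visits": 3, "hf_onset": 1},
    {"age": 49, "sbp": 120, "diabetes": 0, "visits": 4, "hf_onset": 0},
    {"age": 80, "sbp": 170, "diabetes": 0, "visits": 2, "hf_onset": 1},
    {"age": 60, "sbp": 130, "diabetes": 1, "visits": 3, "hf_onset": 0},
    {"age": 75, "sbp": 162, "diabetes": 1, "visits": 2, "hf_onset": 1},
    {"age": 52, "sbp": 122, "diabetes": 0, "visits": 3, "hf_onset": 0},
    {"age": 66, "sbp": 155, "diabetes": 1, "visits": 4, "hf_onset": 1},
]


def construct_cohort(patients, min_age=40):
    """Step 2: keep only relevant patients (assumed criterion: age >= min_age)."""
    return [p for p in patients if p["age"] >= min_age]


def construct_features(patient):
    """Step 3: build all candidate features (everything except the target)."""
    return {name: value for name, value in patient.items() if name != TARGET}


def select_features(rows, labels, k=2):
    """Step 4: keep the k features whose class means differ most, scaled by range."""
    scores = {}
    for name in rows[0]:
        values = [r[name] for r in rows]
        spread = (max(values) - min(values)) or 1
        pos = mean(r[name] for r, y in zip(rows, labels) if y == 1)
        neg = mean(r[name] for r, y in zip(rows, labels) if y == 0)
        scores[name] = abs(pos - neg) / spread
    return sorted(scores, key=scores.get, reverse=True)[:k]


def fit_centroids(rows, labels, features):
    """Step 5: a nearest-centroid classifier, one mean point per class."""
    return {
        cls: [mean(r[f] for r, y in zip(rows, labels) if y == cls) for f in features]
        for cls in (0, 1)
    }


def predict(centroids, row, features):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    point = [row[f] for f in features]
    return min(
        centroids,
        key=lambda cls: sum((a - b) ** 2 for a, b in zip(point, centroids[cls])),
    )


def accuracy(y_true, y_pred):
    """Step 6: fraction of held-out patients classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def run_pipeline(patients, k=2, n_train=6):
    cohort = construct_cohort(patients)
    rows = [construct_features(p) for p in cohort]
    labels = [p[TARGET] for p in cohort]
    train_rows, test_rows = rows[:n_train], rows[n_train:]
    train_y, test_y = labels[:n_train], labels[n_train:]
    # Features are selected on the training split only, to avoid leakage.
    selected = select_features(train_rows, train_y, k)
    model = fit_centroids(train_rows, train_y, selected)
    preds = [predict(model, r, selected) for r in test_rows]
    return selected, accuracy(test_y, preds)


if __name__ == "__main__":
    selected, acc = run_pipeline(PATIENTS)
    print("selected features:", selected)
    print("held-out accuracy:", acc)
```

With only a handful of invented patients, the accuracy printed here says nothing about real predictive performance. The point is the shape of the pipeline: each step is a separate function, so when we iterate, we can change the cohort criterion, the candidate features, or the model and rerun the whole thing.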