About this Course
4.6
Ratings: 1,507
Reviews: 268
Specialization

Course 4 of 4 in the program

100% online

Start now and learn on your own schedule.
Flexible deadlines

Set deadlines that fit your schedule.
Hours to complete

Approx. 21 hours to complete

Suggested workload: 6 weeks of study, 5-8 hours/week...
Available languages

English

Subtitles: English, Arabic

Skills you will gain

Data Clustering Algorithms, K-Means Clustering, Machine Learning, K-D Tree

Syllabus: what you will learn

Week 1
1 hour to complete

Welcome

Clustering and retrieval are some of the most high-impact machine learning tools out there. Retrieval is used in almost every application and device we interact with, like providing a set of products related to one a shopper is currently considering, or a list of people you might want to connect with on a social media platform. Clustering can be used to aid retrieval, but is a more broadly useful tool for automatically discovering structure in data, like uncovering groups of similar patients.

This introduction to the course provides you with an overview of the topics we will cover and the background knowledge and resources we assume you have...
4 videos (Total 25 min), 4 readings
Videos (4)
Course overview (3 min)
Module-by-module topics covered (8 min)
Assumed background (6 min)
Readings (4)
Important Update regarding the Machine Learning Specialization (10 min)
Slides presented in this module (10 min)
Software tools you'll need for this course (10 min)
A big week ahead! (10 min)
Week 2
4 hours to complete

Nearest Neighbor Search

We start the course by considering a retrieval task of fetching a document similar to one someone is currently reading. We cast this problem as one of nearest neighbor search, which is a concept we have seen in the Foundations and Regression courses. However, here, you will take a deep dive into two critical components of the algorithms: the data representation and metric for measuring similarity between pairs of datapoints. You will examine the computational burden of the naive nearest neighbor search algorithm, and instead implement scalable alternatives using KD-trees for handling large datasets and locality sensitive hashing (LSH) for providing approximate nearest neighbors, even in high-dimensional spaces. You will explore all of these ideas on a Wikipedia dataset, comparing and contrasting the impact of the various choices you can make on the nearest neighbor results produced....
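The "random lines to partition points" idea behind LSH can be sketched in a few lines of Python. This is an illustrative sketch only, not the course's own implementation: the function name `lsh_bins` and the toy data are invented for the example. Each random hyperplane contributes one bit (which side of the plane a point falls on), and the bit pattern becomes the point's bin id, so nearby points tend to share a bin.

```python
import numpy as np

def lsh_bins(X, n_planes=4, seed=0):
    """Hash each row of X to a bin using random hyperplanes."""
    rng = np.random.default_rng(seed)
    planes = rng.normal(size=(X.shape[1], n_planes))
    # one bit per plane: which side of the hyperplane the point lies on
    bits = (X @ planes) >= 0
    # pack the bit pattern into a single integer bin id
    return (bits.astype(int) * (1 << np.arange(n_planes))).sum(axis=1)

# two nearby points and one diametrically opposite point
X = np.array([[1.0, 0.1], [0.9, 0.2], [-1.0, -0.1]])
bins = lsh_bins(X)
```

At query time you would only compare against points in the query's bin (and, as the lectures discuss, neighboring bins), trading exactness for speed.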
22 videos (Total 137 min), 4 readings, 5 quizzes
Videos (22)
1-NN algorithm (2 min)
k-NN algorithm (6 min)
Document representation (5 min)
Distance metrics: Euclidean and scaled Euclidean (6 min)
Writing (scaled) Euclidean distance using (weighted) inner products (4 min)
Distance metrics: Cosine similarity (9 min)
To normalize or not and other distance considerations (6 min)
Complexity of brute force search (1 min)
KD-tree representation (9 min)
NN search with KD-trees (7 min)
Complexity of NN search with KD-trees (5 min)
Visualizing scaling behavior of KD-trees (4 min)
Approximate k-NN search using KD-trees (7 min)
Limitations of KD-trees (3 min)
LSH as an alternative to KD-trees (4 min)
Using random lines to partition points (5 min)
Defining more bins (3 min)
Searching neighboring bins (8 min)
LSH in higher dimensions (4 min)
(OPTIONAL) Improving efficiency through multiple tables (22 min)
A brief recap (2 min)
Readings (4)
Slides presented in this module (10 min)
Choosing features and metrics for nearest neighbor search (10 min)
(OPTIONAL) A worked-out example for KD-trees (10 min)
Implementing Locality Sensitive Hashing from scratch (10 min)
Quizzes (5 practice exercises)
Representations and metrics (12 min)
Choosing features and metrics for nearest neighbor search (10 min)
KD-trees (10 min)
Locality Sensitive Hashing (10 min)
Implementing Locality Sensitive Hashing from scratch (10 min)
Week 3
2 hours to complete

Clustering with k-means

In clustering, our goal is to group the datapoints in our dataset into disjoint sets. Motivated by our document analysis case study, you will use clustering to discover thematic groups of articles by "topic". These topics are not provided in this unsupervised learning task; rather, the idea is to output such cluster labels that can be post-facto associated with known topics like "Science", "World News", etc. Even without such post-facto labels, you will examine how the clustering output can provide insights into the relationships between datapoints in the dataset. The first clustering algorithm you will implement is k-means, which is the most widely used clustering algorithm out there. To scale up k-means, you will learn about the general MapReduce framework for parallelizing and distributing computations, and then how the iterates of k-means can utilize this framework. You will show that k-means can provide an interpretable grouping of Wikipedia articles when appropriately tuned....
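The k-means iteration described above — assign each point to its nearest center, then move each center to the mean of its points — can be sketched as a minimal single-machine version. This is not the MapReduce-distributed variant the module builds, and names like `kmeans` and the toy data are invented for illustration:

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd iterations: alternate assignment and centroid update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # assignment step: each point goes to its nearest center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # update step: each center moves to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# two well-separated toy clusters
X = np.vstack([np.zeros((5, 2)), np.ones((5, 2)) * 10])
labels, centers = kmeans(X, k=2)
```

The two steps map naturally onto MapReduce: assignment is embarrassingly parallel over points (map), and the per-cluster means are aggregations over assigned points (reduce), which is the connection the module develops.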
13 videos (Total 79 min), 2 readings, 3 quizzes
Videos (13)
An unsupervised task (6 min)
Hope for unsupervised learning, and some challenge cases (4 min)
The k-means algorithm (7 min)
k-means as coordinate descent (6 min)
Smart initialization via k-means++ (4 min)
Assessing the quality and choosing the number of clusters (9 min)
Motivating MapReduce (8 min)
The general MapReduce abstraction (5 min)
MapReduce execution overview and combiners (6 min)
MapReduce for k-means (7 min)
Other applications of clustering (7 min)
A brief recap (1 min)
Readings (2)
Slides presented in this module (10 min)
Clustering text data with k-means (10 min)
Quizzes (3 practice exercises)
k-means (18 min)
Clustering text data with K-means (16 min)
MapReduce for k-means (10 min)
Week 4
3 hours to complete

Mixture Models

In k-means, observations are each hard-assigned to a single cluster, and these assignments are based just on the cluster centers, rather than also incorporating shape information. In our second module on clustering, you will perform probabilistic model-based clustering that provides (1) a more descriptive notion of a "cluster" and (2) accounts for uncertainty in assignments of datapoints to clusters via "soft assignments". You will explore and implement a broadly useful algorithm called expectation maximization (EM) for inferring these soft assignments, as well as the model parameters. To gain intuition, you will first consider a visually appealing image clustering task. You will then cluster Wikipedia articles, handling the high-dimensionality of the tf-idf document representation considered....
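The EM iteration described above — soft assignments (responsibilities) in the E-step, parameter re-estimation from those soft assignments in the M-step — can be sketched for a one-dimensional Gaussian mixture. This is a simplified illustration under invented names and toy data, not the course's multivariate implementation:

```python
import numpy as np

def e_step(X, means, variances, weights):
    """E-step: compute soft assignments (responsibilities) for each point."""
    # unnormalized responsibility: mixture weight times Gaussian density
    dens = (weights * np.exp(-0.5 * (X[:, None] - means) ** 2 / variances)
            / np.sqrt(2 * np.pi * variances))
    return dens / dens.sum(axis=1, keepdims=True)

def m_step(X, resp):
    """M-step: re-estimate mixture parameters from soft assignments."""
    n_k = resp.sum(axis=0)                                # effective cluster sizes
    means = (resp * X[:, None]).sum(axis=0) / n_k
    variances = (resp * (X[:, None] - means) ** 2).sum(axis=0) / n_k
    weights = n_k / len(X)
    return means, variances, weights

# toy data: two groups of points near 0 and near 5
X = np.array([0.0, 0.1, -0.1, 5.0, 5.1, 4.9])
means, variances, weights = np.array([0.0, 5.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])
for _ in range(10):
    resp = e_step(X, means, variances, weights)
    means, variances, weights = m_step(X, resp)
```

Unlike k-means' hard assignments, each row of `resp` is a probability distribution over clusters, which is what lets EM express uncertainty about borderline points.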
15 videos (Total 91 min), 4 readings, 3 quizzes
Videos (15)
Aggregating over unknown classes in an image dataset (6 min)
Univariate Gaussian distributions (2 min)
Bivariate and multivariate Gaussians (7 min)
Mixture of Gaussians (6 min)
Interpreting the mixture of Gaussian terms (5 min)
Scaling mixtures of Gaussians for document clustering (5 min)
Computing soft assignments from known cluster parameters (7 min)
(OPTIONAL) Responsibilities as Bayes' rule (5 min)
Estimating cluster parameters from known cluster assignments (6 min)
Estimating cluster parameters from soft assignments (8 min)
EM iterates in equations and pictures (6 min)
Convergence, initialization, and overfitting of EM (9 min)
Relationship to k-means (3 min)
A brief recap (1 min)
Readings (4)
Slides presented in this module (10 min)
(OPTIONAL) A worked-out example for EM (10 min)
Implementing EM for Gaussian mixtures (10 min)
Clustering text data with Gaussian mixtures (10 min)
Quizzes (3 practice exercises)
EM for Gaussian mixtures (18 min)
Implementing EM for Gaussian mixtures (12 min)
Clustering text data with Gaussian mixtures (8 min)
Career direction

32%

started a new career after completing these courses
Career benefit

35%

got a tangible career benefit from this course

Top reviews

by JM, Jan 17th 2017

Excellent course, well thought out lectures and problem sets. The programming assignments offer an appropriate amount of guidance that allows the students to work through the material on their own.

by AG, Sep 25th 2017

Nice course with all the practical stuffs and nice analysis about each topic but practical part of LDA was restricted for GraphLab users only which is a weak fallback and rest everything is fine.

Instructors


Emily Fox

Amazon Professor of Machine Learning
Statistics

Carlos Guestrin

Amazon Professor of Machine Learning
Computer Science and Engineering

About the University of Washington

Founded in 1861, the University of Washington is one of the oldest state-supported institutions of higher education on the West Coast and is one of the preeminent research universities in the world....

About the Machine Learning Specialization

This Specialization from leading researchers at the University of Washington introduces you to the exciting, high-demand field of Machine Learning. Through a series of practical case studies, you will gain applied experience in major areas of Machine Learning including Prediction, Classification, Clustering, and Information Retrieval. You will learn to analyze large and complex datasets, create systems that adapt and improve over time, and build intelligent applications that can make predictions from data....
Machine Learning

Frequently asked questions

  • Once you enroll for a Certificate, you get access to all course videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you take the course without paying, some assignments may be unavailable.

  • When you enroll in the course, you get access to all of the courses in the Specialization, as well as the opportunity to earn a certificate of completion. After you successfully finish the course, an electronic certificate will appear on your Accomplishments page; from there you can print it or add it to your LinkedIn profile. If you only want to browse the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.