About this course
4.3
Ratings: 663
Reviews: 142
Specialization

Course 1 of 4 in the program

100% online

Start now and learn on your own schedule.
Flexible deadlines

Set deadlines to fit your schedule.
Hours to complete

Approx. 21 hours to complete

Suggested workload: 4 weeks of study, 6-8 hours/week...
Available languages

English

Subtitles: English

Skills you will gain

Relational Algebra, Python Programming, MapReduce, SQL

Syllabus: what you will learn

Week 1
6 hours to complete

Data Science Context and Concepts

Understand the terminology and recurring principles associated with data science, and understand the structure of data science projects and emerging methodologies to approach them. Why does this emerging field exist? How does it relate to other fields? How does this course distinguish itself? What do data science projects look like, and how should they be approached? What are some examples of data science projects? ...
22 videos (125 min total), 4 readings, 1 quiz
22 videos
Appetite Whetting: Extreme Weather (2 min)
Appetite Whetting: Digital Humanities (8 min)
Appetite Whetting: Bibliometrics (4 min)
Appetite Whetting: Food, Music, Public Health (5 min)
Appetite Whetting: Public Health cont'd, Earthquakes, Legal (4 min)
Characterizing Data Science (5 min)
Characterizing Data Science, cont'd (5 min)
Distinguishing Data Science from Related Topics (4 min)
Four Dimensions of Data Science (6 min)
Tools vs. Abstractions (7 min)
Desktop Scale vs. Cloud Scale (5 min)
Hackers vs. Analysts (2 min)
Structs vs. Stats (5 min)
Structs vs. Stats cont'd (5 min)
A Fourth Paradigm of Science (3 min)
Data-Intensive Science Examples (6 min)
Big Data and the 3 Vs (5 min)
Big Data Definitions (4 min)
Big Data Sources (6 min)
Course Logistics (7 min)
Twitter Assignment: Getting Started (14 min)
4 readings
Supplementary: Three-Course Reading List (10 min)
Supplementary: Resources for Learning Python (10 min)
Supplementary: Class Virtual Machine (10 min)
Supplementary: Github Instructions (10 min)
Week 2
5 hours to complete

Relational Databases and the Relational Algebra

Relational databases are the workhorse of large-scale data management. Although originally motivated by problems in enterprise operations, they have proven remarkably capable for analytics as well. But most importantly, the principles underlying relational databases are universal in managing, manipulating, and analyzing data at scale. Even as the landscape of large-scale data systems has expanded dramatically in the last decade, relational models and languages have remained a unifying concept. For working with large-scale data, there is no more important programming model to learn....
24 videos (122 min total), 1 quiz
24 videos
From Data Models to Databases (4 min)
Pre-Relational Databases (5 min)
Motivating Relational Databases (3 min)
Relational Databases: Key Ideas (4 min)
Algebraic Optimization Overview (6 min)
Relational Algebra Overview (4 min)
Relational Algebra Operators: Union, Difference, Selection (6 min)
Relational Algebra Operators: Projection, Cross Product (4 min)
Relational Algebra Operators: Cross Product cont'd, Join (6 min)
Relational Algebra Operators: Outer Join (4 min)
Relational Algebra Operators: Theta-Join (4 min)
From SQL to RA (6 min)
Thinking in RA: Logical Query Plans (4 min)
Practical SQL: Binning Timeseries (5 min)
Practical SQL: Genomic Intervals (6 min)
User-Defined Functions (3 min)
Support for User-Defined Functions (4 min)
Optimization: Physical Query Plans (5 min)
Optimization: Choosing Physical Plans (4 min)
Declarative Languages (5 min)
Declarative Languages: More Examples (4 min)
Views: Logical Data Independence (5 min)
Indexes (6 min)
Week 3
5 hours to complete

MapReduce and Parallel Dataflow Programming

The MapReduce programming model (as distinct from its implementations) was proposed as a simplifying abstraction for parallel manipulation of massive datasets, and remains an important concept to know when using and evaluating modern big data platforms. ...
26 videos (122 min total), 1 quiz
26 videos
A Sketch of Algorithmic Complexity (5 min)
A Sketch of Data-Parallel Algorithms (5 min)
"Pleasingly Parallel" Algorithms (4 min)
More General Distributed Algorithms (4 min)
MapReduce Abstraction (4 min)
MapReduce Data Model (3 min)
Map and Reduce Functions (2 min)
MapReduce Simple Example (3 min)
MapReduce Simple Example cont'd (3 min)
MapReduce Example: Word Length Histogram (2 min)
MapReduce Examples: Inverted Index, Join (6 min)
Relational Join: Map Phase (4 min)
Relational Join: Reduce Phase (4 min)
Simple Social Network Analysis: Counting Friends (3 min)
Matrix Multiply Overview (5 min)
Matrix Multiply Illustrated (4 min)
Shared Nothing Computing (4 min)
MapReduce Implementation (5 min)
MapReduce Phases (6 min)
A Design Space for Large-Scale Data Systems (4 min)
Parallel and Distributed Query Processing (5 min)
Teradata Example, MR Extensions (5 min)
RDBMS vs. MapReduce: Features (6 min)
RDBMS vs. Hadoop: Grep (5 min)
RDBMS vs. Hadoop: Select, Aggregate, Join (3 min)
Week 4
3 hours to complete

NoSQL: Systems and Concepts

NoSQL systems are purely about scale rather than analytics, and are arguably less relevant for the practicing data scientist. However, they occupy an important place in many practical big data platform architectures, and data scientists need to understand their limitations and strengths to use them effectively....
36 videos (166 min total)
36 videos
NoSQL Roundup (4 min)
Relaxing Consistency Guarantees (3 min)
Two-Phase Commit and Consensus Protocols (5 min)
Eventual Consistency (4 min)
CAP Theorem (4 min)
Types of NoSQL Systems (4 min)
ACID, Major Impact Systems (4 min)
Memcached: Consistent Hashing (2 min)
Consistent Hashing, cont'd (4 min)
DynamoDB: Vector Clocks (5 min)
Vector Clocks, cont'd (5 min)
CouchDB Overview (4 min)
CouchDB Views (3 min)
BigTable Overview (5 min)
BigTable Implementation (5 min)
HBase, Megastore (3 min)
Spanner (5 min)
Spanner cont'd, Google Systems (6 min)
MapReduce-based Systems (5 min)
Bringing Back Joins (4 min)
NoSQL Rebuttal (4 min)
Almost SQL: Pig (4 min)
Pig Architecture and Performance (3 min)
Data Model (3 min)
Load, Filter, Group (5 min)
Group, Distinct, Foreach, Flatten (5 min)
CoGroup, Join (3 min)
Join Algorithms (3 min)
Skew (5 min)
Other Commands (3 min)
Evaluation Walkthrough (3 min)
Review (6 min)
Context (3 min)
Spark Examples (5 min)
RDDs, Benefits (6 min)
2 hours to complete

Graph Analytics

Graph-structured data are increasingly common in data science contexts due to their ubiquity in modeling the communication between entities: people (social networks), computers (Internet communication), cities and countries (transportation networks), or corporations (financial transactions). Learn the common algorithms for extracting information from graph data and how to scale them up. ...
21 videos (91 min total)
21 videos
Structural Analysis (4 min)
Degree Histograms, Structure of the Web (4 min)
Connectivity and Centrality (4 min)
PageRank (3 min)
PageRank in more Detail (3 min)
Traversal Tasks: Spanning Trees and Circuits (5 min)
Traversal Tasks: Maximum Flow (1 min)
Pattern Matching (6 min)
Querying Edge Tables (4 min)
Relational Algebra and Datalog for Graphs (4 min)
Querying Hybrid Graph/Relational Data (3 min)
Graph Query Example: NSA (6 min)
Graph Query Example: Recursion (4 min)
Evaluation of Recursive Programs (3 min)
Recursive Queries in MapReduce (4 min)
The End-Game Problem (3 min)
Representation: Edge Table, Adjacency List (4 min)
Representation: Adjacency Matrix (2 min)
PageRank in MapReduce (5 min)
PageRank in Pregel (5 min)
4.3
Reviews: 142

Top reviews

by HA, Jan 11th 2016

Great course that strikes a balance between teaching general principles and concepts, and providing hands-on technical skills and practice. The lessons are well designed and clearly conveyed.

by SL, May 28th 2016

I like the breadth of coverage of this class. Each of the exercise is a gem in that I get to learn something new also. I would highly recommend this even to experience practitioner also.

Instructor

Bill Howe

Director of Research
Scalable Data Analytics

About University of Washington

Founded in 1861, the University of Washington is one of the oldest state-supported institutions of higher education on the West Coast and is one of the preeminent research universities in the world....

About the 'Data Science at Scale' Specialization

Learn scalable data management, evaluate big data technologies, and design effective visualizations. This Specialization covers intermediate topics in data science. You will gain hands-on experience with scalable SQL and NoSQL data management solutions, data mining algorithms, and practical statistical and machine learning concepts. You will also learn to visualize data and communicate results, and you’ll explore legal and ethical issues that arise in working with big data. In the final Capstone Project, developed in partnership with the digital internship platform Coursolve, you’ll apply your new skills to a real-world data science project....
Frequently asked questions

  • Once you enroll for a certificate, you will have access to all videos, quizzes, and programming assignments (if applicable). Peer-review assignments can only be submitted and reviewed once the session has begun. If you take the course without paying, some assignments may be unavailable.

  • By enrolling in the course, you get access to all courses in the Specialization, as well as the opportunity to earn a certificate of completion. After you successfully complete the course, an electronic certificate will appear on your Accomplishments page. From there you can print it or add it to your LinkedIn profile. You can browse the course content for free.

More questions? Visit the Learner Help Center.