About this course
4.0
Ratings: 192
Reviews: 57
Have you ever heard about such technologies as HDFS, MapReduce, Spark? Always wanted to learn these new tools but missed concise starting material? Don't miss this course!

In this 6-week course you will:
- learn some basic technologies of the modern Big Data landscape, namely HDFS, MapReduce and Spark;
- be guided through both the internals of these systems and their applications;
- learn about distributed file systems, why they exist and what function they serve;
- grasp the MapReduce framework, a workhorse for many modern Big Data applications;
- apply the framework to process texts and solve sample business cases;
- learn about Spark, the next-generation computational framework;
- build a strong understanding of basic Spark concepts;
- develop skills to apply these tools to creating solutions in finance, social networks, telecommunications and many other fields.

Your learning experience will be as close to real life as possible, with the chance to evaluate your practical assignments on a real cluster. No mocking, just a friendly, considerate atmosphere to make your learning smooth and enjoyable. Get ready to work with real datasets alongside real masters!

Special thanks to:
- Prof. Mikhail Roytberg, APT dept., MIPT, who was the initial reviewer of the project and the supervisor and mentor of half of the BigData team. He was the one who helped to get this show on the road.
- Oleg Sukhoroslov (PhD, Senior Researcher at IITP RAS), who has been teaching MapReduce, Hadoop and friends since 2008. He now leads the infrastructure team.
- Oleg Ivchenko (PhD student at APT dept., MIPT), Pavel Akhtyamov (MSc student at APT dept., MIPT) and Vladimir Kuznetsov (Assistant at P.G. Demidov Yaroslavl State University), the superbrains who developed and now maintain the infrastructure used for practical assignments in this course.
- Asya Roitberg, Eugene Baulin, Marina Sudarikova. These people never sleep: they babysit this course day and night to make your learning experience productive, smooth and exciting....
Online courses only
Start now and learn on your own schedule.

Flexible deadlines
Set deadlines to fit your schedule.

Intermediate level

Approx. 41 hours to complete
Suggested workload: 6 weeks of study, 6-8 hours/week...

English
Subtitles: English...

Skills you will gain

Python Programming, Apache Hadoop, MapReduce, Apache Spark

Syllabus: what you will learn

Week 1
14 minutes to complete

Welcome

8 videos (14 min total)
Videos (8)
Issues BigData can solve (1 min)
BigData Applications (1 min)
What is BigData Essentials? (2 min)
Course Structure (2 min)
Meet Emeli (1 min)
Meet Alexey (2 min)
Meet Ivan (1 min)

8 hours to complete

What are BigData and distributed file systems (e.g. HDFS)?

18 videos (136 min total), 10 readings, 5 quizzes
Videos (18)
File system managing (6 min)
File content exploration 1 (5 min)
File content exploration 2 (13 min)
Processes (4 min)
Scaling Distributed File System (9 min)
Block and Replica States, Recovery Process 1 (6 min)
Block and Replica States, Recovery Process 2 (7 min)
HDFS Client (9 min)
Web UI, REST API (4 min)
Namenode Architecture (8 min)
Introduction (10 min)
Text formats (9 min)
Binary formats 1 (8 min)
Binary formats 2 (8 min)
Compression (7 min)
How to submit your first assignment (3 min)
How to Install Docker on Windows 7, 8, 10 (4 min)
Readings (10)
Basic Bash Commands (10 min)
Slack Channel is the quickest way to get answers to your questions (10 min)
HDFS Lesson Introduction (10 min)
Gentle Introduction into "curl" (10 min)
File formats extra (optional) (10 min)
Grading System: Instructions and Common Problems (10 min)
Docker Installation Guide (10 min)
Programming Assignment: Instructions and Common Problems (10 min)
FAQ: How to show your code to teaching staff (10 min)
Slack channel "Bigdata-coursera": the quickest way to solve technical problems (10 min)
Practice exercises (2)
Distributed File Systems (16 min)
Big Data and Distributed File Systems (25 min)

Week 2
3 hours to complete

Solving Problems with MapReduce

17 videos (94 min total), 1 reading, 3 quizzes
Videos (17)
Unreliable Components 2 (8 min)
MapReduce (4 min)
Distributed Shell (8 min)
Fault Tolerance (7 min)
Fault Tolerance. Live Demo (3 min)
Streaming (7 min)
Streaming in Python (3 min)
WordCount in Python (5 min)
Distributed Cache (4 min)
Environment, Counters (4 min)
Testing (5 min)
Combiner (5 min)
Partitioner (7 min)
Comparator (1 min)
Speculative Execution / Backup Tasks (3 min)
Compression (4 min)
Readings (1)
Hadoop Streaming Assignments: Intro and Code Samples (10 min)
Practice exercises (3)
Hadoop MapReduce Intro (26 min)
MapReduce Streaming (26 min)
Hadoop Streaming Final (30 min)

Week 3
4 hours to complete

Solving Problems with MapReduce (practice week)

1 video (3 min total), 5 readings, 5 quizzes
Readings (5)
Hadoop Streaming Assignments: Intro and Code Samples (10 min)
Hints to Debug Hadoop Streaming Applications (10 min)
Grading System and Grading System Sandbox User Guide (10 min)
Hadoop Streaming Assignments: Instructions (10 min)
Hint to the "Stop words" programming assignment (10 min)

Week 4
3 hours to complete

Introduction to Apache Spark

16 videos (95 min total), 2 readings, 2 quizzes
Videos (16)
Welcome (6 min)
RDDs (8 min)
Transformations 1 (6 min)
Transformations 2 (7 min)
Actions (5 min)
Resiliency (6 min)
Execution & Scheduling (6 min)
Caching & Persistence (5 min)
Broadcast variables (5 min)
Accumulator variables (5 min)
Getting started with Spark & Python (6 min)
Working with text files (6 min)
Joins (4 min)
Broadcast & Accumulator variables (5 min)
Spark UI (4 min)
Cluster mode (3 min)
Readings (2)
Spark Assignments Intro (10 min)
Instructions for Spark programming assignment (10 min)
Practice exercises (2)
Lesson 1 Quiz (20 min)
Lesson 2 Quiz (24 min)

Top reviews

by SD, Jun 28th 2018

Absolutely essential for everyone who wants a proper introduction to HDFS, MapReduce and Spark. Brought to you by a great team of geniuses of their time ;)

by NP, Apr 27th 2018

The course gave me more technical skills with Hadoop and Spark, which give me confidence in my career. Thank you, Coursera and Yandex, so much.

Instructors

Ivan Puzyrevskiy

Technical Team Lead

Alexey A. Dral

Founder and Chief Executive Officer
BigData Team

About Yandex

Yandex is a technology company that builds intelligent products and services powered by machine learning. Our goal is to help consumers and businesses better navigate the online and offline world....

About the 'Big Data for Data Engineers' specialization

This specialization is made for people working with data (either small or big). If you are a Data Analyst, Data Scientist, Data Engineer or Data Architect (or you want to become one), don't miss the opportunity to expand your knowledge and skills in data engineering and data analysis at large scale. In four concise courses you will learn the basics of Hadoop, MapReduce and Spark, methods of offline data processing for warehousing, real-time data processing and large-scale machine learning. A Capstone project then lets you build and deploy your own Big Data Service (and make your portfolio even more competitive).

Over the course of the specialization, you will complete progressively harder programming assignments (mostly in Python). Make sure you have some experience in it.

This course will sharpen your skills in designing solutions for common Big Data tasks:
- creating batch and real-time data processing pipelines,
- doing machine learning at scale,
- deploying machine learning models into a production environment,
and much more!

Join some of the best hands-on big data professionals, who know their job inside out, and learn the basics, as well as some tricks of the trade, from them.

Special thanks to Prof. Mikhail Roytberg (APT dept., MIPT), Oleg Sukhoroslov (PhD, Senior Researcher, IITP RAS), Oleg Ivchenko (APT dept., MIPT), Pavel Akhtyamov (APT dept., MIPT), Vladimir Kuznetsov, Asya Roitberg, Eugene Baulin, Marina Sudarikova....
Big Data for Data Engineers

Frequently asked questions

  • Once you enroll for a certificate, you will have access to all videos, quizzes and programming assignments (if applicable). Peer-review assignments can be submitted and reviewed only after the session has begun. If you take the course without paying, some assignments may be unavailable.

  • When you enroll in the course, you get access to all courses in the specialization, as well as the opportunity to earn a certificate of completion. After you successfully complete the course, an electronic certificate will appear on your accomplishments page. From there you can print it or add it to your LinkedIn profile. If you only want to look through the course content, you can do so for free.

Still have questions? Visit the Learner Help Center.