The log-derivative trick


Reviews

4.2 (432 ratings)
  • 5 stars: 58.33%
  • 4 stars: 23.14%
  • 3 stars: 9.02%
  • 2 stars: 4.16%
  • 1 star: 5.32%

FZ

Feb 13, 2019

5 stars

A great course with very practical assignments to help you learn how to implement RL algorithms. But it also has some stupid quiz questions that leave you feeling confused.

LJ

Oct 6, 2019

5 stars

Challenging (unlike many other courses on Coursera, it does not baby you and does not seem to be targeting as high a pass rate as possible), but very very rewarding.

From the lesson

Policy-based methods

We spent the previous three modules working on value-based methods: learning state values, action values, and so on. Now it's time to see an alternative approach that doesn't require you to predict all future rewards to learn something.

Instructors

  • Pavel Shvechikov

    Researcher at HSE and Sberbank AI Lab

  • Alexander Panin

    Lecturer
