Artificial Intelligence: Reinforcement Learning in Python

Name: Artificial Intelligence: Reinforcement Learning in Python
Brand: Lazy Programmer
Price: 25.00 USD
Availability: Discontinued

5.5 Hours

$25.00 (regular price $180.00)

You save 86%

71 Lessons (5.5h)

Introduction and Outline
Introduction and outline (6:22)
What is Reinforcement Learning? (13:46)
Where to get the Code (2:41)
Strategy for Passing the Course (5:56)
Return of the Multi-Armed Bandit
Problem Setup and The Explore-Exploit Dilemma (3:55)
Epsilon-Greedy (1:48)
Updating a Sample Mean (1:22)
Comparing Different Epsilons (4:06)
Optimistic Initial Values (2:56)
UCB1 (4:56)
Bayesian / Thompson Sampling (9:52)
Thompson Sampling vs. Epsilon-Greedy vs. Optimistic Initial Values vs. UCB1 (5:11)
Nonstationary Bandits (4:51)
Build an Intelligent Tic-Tac-Toe Agent
Naive Solution to Tic-Tac-Toe (3:50)
Components of a Reinforcement Learning System (8:00)
Notes on Assigning Rewards (2:41)
The Value Function and Your First Reinforcement Learning Algorithm (16:33)
Tic Tac Toe Code: Outline (3:16)
Tic Tac Toe Code: Representing States (2:56)
Tic Tac Toe Code: Enumerating States Recursively (6:14)
Tic Tac Toe Code: The Environment (6:36)
Tic Tac Toe Code: The Agent (5:48)
Tic Tac Toe Code: Main Loop and Demo (6:02)
Tic Tac Toe Summary (5:25)
Markov Decision Processes
Gridworld (2:13)
The Markov Property (4:36)
Defining and Formalizing the MDP (4:10)
Future Rewards (3:16)
Value Functions (4:38)
Optimal Policy and Optimal Value Function (4:09)
MDP Summary (1:35)
Dynamic Programming
Intro to Dynamic Programming and Iterative Policy Evaluation (3:06)
Gridworld in Code (5:47)
Iterative Policy Evaluation in Code (6:24)
Policy Improvement (2:51)
Policy Iteration (2:00)
Policy Iteration in Code (3:46)
Policy Iteration in Windy Gridworld (4:57)
Value Iteration (3:58)
Value Iteration in Code (2:14)
Dynamic Programming Summary (5:14)
Monte Carlo
Monte Carlo Intro (3:10)
Monte Carlo Policy Evaluation (5:45)
Monte Carlo Policy Evaluation in Code (3:35)
Policy Evaluation in Windy Gridworld (3:38)
Monte Carlo Control (5:59)
Monte Carlo Control in Code (4:04)
Monte Carlo Control without Exploring Starts (2:58)
Monte Carlo Control without Exploring Starts in Code (2:51)
Monte Carlo Summary (3:42)
Temporal Difference Learning
Temporal Difference Intro (1:42)
TD(0) Prediction (3:46)
TD(0) Prediction in Code (2:27)
SARSA (5:15)
SARSA in Code (3:38)
Q Learning (3:05)
Q Learning in Code (2:13)
TD Summary (2:34)
Approximation Methods
Approximation Intro (4:11)
Linear Models for Reinforcement Learning (4:16)
Features (4:02)
Monte Carlo Prediction with Approximation (1:54)
Monte Carlo Prediction with Approximation in Code (2:58)
TD(0) Semi-Gradient Prediction (4:22)
Semi-Gradient SARSA (3:08)
Semi-Gradient SARSA in Code (4:08)
Course Summary and Next Steps (8:38)
Appendix
How to install Numpy, Scipy, Matplotlib, Pandas, IPython, Theano, and TensorFlow (17:32)
How to Code by Yourself (part 1) (15:54)
How to Code by Yourself (part 2) (9:23)
Where to get discount coupons and FREE deep learning material (2:20)


Complete Guide to Artificial Intelligence & Machine Learning

Lazy Programmer

The Lazy Programmer is a data scientist, big data engineer, and full stack software engineer. For his master's thesis he worked on brain-computer interfaces using machine learning. Such interfaces help non-verbal and non-mobile people communicate with their family and caregivers.

He has worked in online advertising and digital media as both a data scientist and big data engineer, and has built various high-throughput web services around that data. He has created new big data pipelines using Hadoop/Pig/MapReduce. He has also created machine learning models to predict click-through rate and news feed recommender systems, using linear regression, Bayesian Bandits, and collaborative filtering, and validated the results using A/B testing.

He has taught data science, statistics, machine learning, algorithms, calculus, computer graphics, and physics to undergraduate and graduate students attending universities such as Columbia University, NYU, Humber College, and The New School.

Multiple businesses have benefitted from his web programming expertise. He does all the backend (server), frontend (HTML/JS/CSS), and operations/deployment work. Some of the technologies he has used are Python, Ruby/Rails, PHP, Bootstrap, jQuery (JavaScript), Backbone, and Angular. For storage and databases he has used MySQL, Postgres, Redis, MongoDB, and more.

Description

When people talk about artificial intelligence, they usually don't mean supervised and unsupervised machine learning. These tasks are pretty trivial compared to what we think of AIs doing: playing chess and Go, driving cars, etc. Reinforcement learning has recently become popular for doing all of that and more. Reinforcement learning opens up a whole new world. It has led to new and amazing insights in both behavioral psychology and neuroscience. It's the closest thing we have so far to a true general artificial intelligence, and this course will be your introduction.

Access 71 lectures & 5.5 hours of content 24/7
Discuss the multi-armed bandit problem & the explore-exploit dilemma
Learn ways to calculate means & moving averages and their relationship to stochastic gradient descent
Explore Markov Decision Processes, Dynamic Programming, Monte Carlo, & Temporal Difference Learning
Understand approximation methods
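
The link between running means and stochastic gradient descent mentioned in the list above can be sketched as follows. This is a minimal illustration, not code from the course: the function names `update_mean` and `update_moving_average`, the step size `alpha`, and the Gaussian test data are all assumptions made for the example. The idea is that the update `mean + step * (x - mean)` is a gradient descent step on the squared error `0.5 * (x - mean) ** 2`; a step of `1 / n` reproduces the ordinary sample mean, while a constant step gives an exponentially weighted moving average.

```python
import random


def update_mean(mean, n, x):
    """Return the mean of n samples, given the mean of the first n - 1 and the n-th sample x."""
    # Equivalent to a gradient step on 0.5 * (x - mean)**2 with step size 1 / n.
    return mean + (x - mean) / n


def update_moving_average(mean, x, alpha=0.1):
    """Exponentially weighted moving average: the same update with a constant step size."""
    return mean + alpha * (x - mean)


# Hypothetical data: 1000 noisy rewards centered on 1.0.
random.seed(0)
samples = [random.gauss(1.0, 1.0) for _ in range(1000)]

running = 0.0
for n, x in enumerate(samples, start=1):
    running = update_mean(running, n, x)

# The incremental result should agree with the batch mean up to rounding error.
batch = sum(samples) / len(samples)
print(abs(running - batch) < 1e-9)
```

A constant step size weights recent samples more heavily, which is one common way to track a quantity whose true mean drifts over time.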

Specs

Details & Requirements

Length of time users can access this course: lifetime
Access options: web streaming, mobile streaming
Certification of completion not included
Redemption deadline: redeem your code within 30 days of purchase
Experience level required: all levels, but knowledge of calculus, probability, object-oriented programming, Python, Numpy, linear regression, and gradient descent is expected
All code for this course is available for download here, in the directory rl

Terms

Unredeemed licenses can be returned for store credit within 30 days of purchase. Once your license is redeemed, all sales are final.
