Deep Reinforcement Learning Hands-On (Reprint Edition) (English Edition), by Maxim Lapan (Russia), Southeast University Press - 兰台网

Deep Reinforcement Learning Hands-On (Reprint Edition) (English Edition)

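Overview

About the book

Recent developments in reinforcement learning (RL), combined with deep learning (DL), have brought unprecedented progress in training agents to solve complex problems in a human-like way. Google's use of algorithms to play and beat well-known Atari arcade games pushed the field to prominence, and researchers keep generating new ideas.
This book is a comprehensive guide to the latest DL tools and their limitations. You will evaluate methods including cross-entropy and policy gradients before applying them to real-world environments. Take on the Atari set of virtual games as well as family favorites such as Connect4. The book introduces the basics of RL and gives you the know-how to code intelligent learning agents that take on a range of formidable practical tasks. You will learn how to implement Q-learning in "grid world" environments, teach your agent to buy and trade stocks, and find out how natural language models are driving the boom in chatbots.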

About the author

Maxim Lapan is a deep learning enthusiast and independent researcher. His background and 15 years of work experience as a software developer and systems architect range from low-level Linux kernel driver development to performance optimization and the design of distributed applications running on thousands of servers. With extensive experience in big data, machine learning, and large parallel distributed HPC and non-HPC systems, he has a talent for explaining the gist of complicated things in simple words and vivid examples. His current areas of interest lie in practical applications of deep learning, such as deep natural language processing and deep reinforcement learning.
Maxim lives in Moscow, Russian Federation, with his family, and he works for an Israeli start-up as a Senior NLP developer.

Preface
Chapter 1: What is Reinforcement Learning?
Learning - supervised, unsupervised, and reinforcement
RL formalisms and relations
Reward
The agent
The environment
Actions
Observations
Markov decision processes
Markov process
Markov reward process
Markov decision process
Summary
Chapter 2: OpenAI Gym
The anatomy of the agent
Hardware and software requirements
OpenAI Gym API
Action space
Observation space
The environment
Creation of the environment
The CartPole session
The random CartPole agent
The extra Gym functionality - wrappers and monitors
Wrappers
Monitor
Summary
Chapter 3: Deep Learning with PyTorch
Tensors
Creation of tensors
Scalar tensors
Tensor operations
GPU tensors
Gradients
Tensors and gradients
NN building blocks
Custom layers
Final glue - loss functions and optimizers
Loss functions
Optimizers
Monitoring with TensorBoard
TensorBoard 101
Plotting stuff
Example - GAN on Atari images
Summary
Chapter 4: The Cross-Entropy Method
Taxonomy of RL methods
Practical cross-entropy
Cross-entropy on CartPole
Cross-entropy on FrozenLake
Theoretical background of the cross-entropy method
Summary
Chapter 5: Tabular Learning and the Bellman Equation
Value, state, and optimality
The Bellman equation of optimality
Value of action
The value iteration method
Value iteration in practice
Q-learning for FrozenLake
Summary
Chapter 6: Deep Q-Networks
Chapter 7: DQN Extensions
Chapter 8: Stocks Trading Using RL
Chapter 9: Policy Gradients - An Alternative
Chapter 10: The Actor-Critic Method
Chapter 11: Asynchronous Advantage Actor-Critic
Chapter 12: Chatbots Training with RL
Chapter 13: Web Navigation
Chapter 14: Continuous Action Space
Chapter 15: Trust Regions - TRPO, PPO, and ACKTR
Chapter 16: Black-Box Optimization in RL
Chapter 17: Beyond Model-Free - Imagination
Chapter 18: AlphaGo Zero
Other Books You May Enjoy
Index

Book details

Title: Deep Reinforcement Learning Hands-On (Reprint Edition) (English Edition)
Author: Maxim Lapan (Russia)
Publisher: Southeast University Press (东南大学出版社)
ISBN: 9787564183219
Format: 16开 (16mo)
Pages: 523
Binding: Paperback
Word count: 670
Publication date: 2019-05-01
First edition date: 2019-05-01
Printing date: 2019-05-01
Text language: English
Audience: General public
Distribution scope: Public release
Distribution mode: Physical book
Weight: 868
CIP approval number: 2019046195
Chinese Library Classification: TP181
Printed sheets: 34.25
Place of publication: Jiangsu
Length: 233
Width: 186
