Python reinforcement learning projects : eight hands-on projects exploring reinforcement learning algorithms using TensorFlow (2018)
- Record Type:
- Book
- Title:
- Python reinforcement learning projects : eight hands-on projects exploring reinforcement learning algorithms using TensorFlow (2018)
- Main Title:
- Python reinforcement learning projects : eight hands-on projects exploring reinforcement learning algorithms using TensorFlow
- Further Information:
- Note: Sean Saito, Yang Wenzhuo, Rajalingappaa Shanmugamani.
- Authors:
- Saito, Sean
Yang, Wenzhuo
Shanmugamani, Rajalingappaa
- Contents:
- Cover; Title Page; Copyright and Credits; Packt Upsell; Contributors; Table of Contents; Preface;
Chapter 1: Up and Running with Reinforcement Learning; Introduction to this book; Expectations; Hardware and software requirements; Installing packages; What is reinforcement learning?; The agent; Policy; Value function; Model; Markov decision process (MDP); Deep learning; Neural networks; Backpropagation; Convolutional neural networks; Advantages of neural networks; Implementing a convolutional neural network in TensorFlow; TensorFlow; The Fashion-MNIST dataset; Building the network; Methods for building the network; build method; fit method; Summary; References;
Chapter 2: Balancing CartPole; OpenAI Gym; Gym; Installation; Running an environment; Atari; Algorithmic tasks; MuJoCo; Robotics; Markov models; CartPole; Summary;
Chapter 3: Playing Atari Games; Introduction to Atari games; Building an Atari emulator; Getting started; Implementation of the Atari emulator; Atari simulator using gym; Data preparation; Deep Q-learning; Basic elements of reinforcement learning; Demonstrating basic Q-learning algorithm; Implementation of DQN; Experiments; Summary;
Chapter 4: Simulating Control Tasks; Introduction to control tasks; Getting started; The classic control tasks; Deterministic policy gradient; The theory behind policy gradient; DPG algorithm; Implementation of DDPG; Experiments; Trust region policy optimization; Theory behind TRPO; TRPO algorithm; Experiments on MuJoCo tasks; Summary;
Chapter 5: Building Virtual Worlds in Minecraft; Introduction to the Minecraft environment; Data preparation; Asynchronous advantage actor-critic algorithm; Implementation of A3C; Experiments; Summary;
Chapter 6: Learning to Play Go; A brief introduction to Go; Go and other board games; Go and AI research; Monte Carlo tree search; Selection; Expansion; Simulation; Update; AlphaGo; Supervised learning policy networks; Reinforcement learning policy networks; Value network; Combining neural networks and MCTS; AlphaGo Zero; Training AlphaGo Zero; Comparison with AlphaGo; Implementing AlphaGo Zero; Policy and value networks; preprocessing.py; features.py; network.py; Monte Carlo tree search; mcts.py; Combining PolicyValueNetwork and MCTS; alphagozero_agent.py; Putting everything together; controller.py; train.py; Summary; References;
Chapter 7: Creating a Chatbot; The background problem; Dataset; Step-by-step guide; Data parser; Data reader; Helper methods; Chatbot model; Training the data; Testing and results; Summary;
Chapter 8: Generating a Deep Learning Image Classifier; Neural Architecture Search; Generating and training child networks; Training the Controller; Training algorithm; Implementing NAS; child_network.py; cifar10_processor.py; controller.py; Method for generating the Controller; Generating a child network using the Controller; train_controller method; Testing ChildCNN; config.py; train.py; …
- Publisher Details:
- Birmingham : Packt Publishing Ltd
- Publication Date:
- 2018
- Extent:
- 1 online resource (287 pages)
- Subjects:
- 006.31
Algorithms -- Study and teaching
Machine learning
Artificial intelligence
Python (Computer program language)
Electronic books
- Languages:
- English
- ISBNs:
- 9781788993227
1788993225
- Notes:
- Note: Print version record.
- Access Rights:
- Legal Deposit; Only available on premises controlled by the deposit library and to one user at any one time; The Legal Deposit Libraries (Non-Print Works) Regulations (UK).
- Access Usage:
- Restricted: Printing from this resource is governed by The Legal Deposit Libraries (Non-Print Works) Regulations (UK) and UK copyright law currently in force.
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library HMNTS - ELD.DS.334856
- Ingest File:
- 02_334.xml