Detail View

Statistical reinforcement learning : modern machine learning approaches

Material type
Monograph
Personal Author
Sugiyama, Masashi, 1974- author.
Title Statement
Statistical reinforcement learning : modern machine learning approaches / Masashi Sugiyama, University of Tokyo, Tokyo, Japan.
Publication, Distribution, etc
Boca Raton, FL : CRC Press, Taylor & Francis Group, 2015.
Physical Medium
xiii, 192 p. : ill. ; 25 cm.
Series Statement
Chapman & Hall/CRC machine learning & pattern recognition series
ISBN
9781439856895 (hardcover) ; 1439856893 (hardcover)
General Note
"A Chapman & Hall book."  
Content Notes
1. Introduction -- 2. Model-free policy iteration -- 3. Model-free policy search -- 4. Model-based reinforcement learning.
Bibliography, Etc. Note
Includes bibliographical references and index.
Subject Added Entry-Topical Term
Reinforcement learning -- Statistical methods.
Reinforcement learning.
000 00000cam u22002057a 4500
001 000045902446
005 20170407105139
008 170407s2015 flua b 001 0 eng d
010 ▼a 2015413943
020 ▼a 9781439856895 (hardcover) : ▼c USD94.95
020 ▼a 1439856893 (hardcover)
035 ▼a (KERIS)BIB000014249369
040 ▼a 224010 ▼c 224010 ▼d 211009
082 0 4 ▼a 006.31 ▼2 22
084 ▼a 006.31 ▼2 DDCK
090 ▼a 006.31 ▼b S947s
100 1 ▼a Sugiyama, Masashi, ▼d 1974- ▼e author.
245 1 0 ▼a Statistical reinforcement learning : ▼b modern machine learning approaches / ▼c Masashi Sugiyama, University of Tokyo, Tokyo, Japan.
260 ▼a Boca Raton, FL : ▼b CRC Press, Taylor & Francis Group, ▼c 2015.
300 ▼a xiii, 192 p. : ▼b ill. ; ▼c 25 cm.
490 1 ▼a Chapman & Hall/CRC machine learning & pattern recognition series
500 ▼a "A Chapman & Hall book."
504 ▼a Includes bibliographical references and index.
505 0 ▼a 1. Introduction -- 2. Model-free policy iteration -- 3. Model-free policy search -- 4. Model-based reinforcement learning.
650 0 ▼a Reinforcement learning ▼x Statistical methods.
650 7 ▼a Reinforcement learning. ▼2 fast ▼0 (OCoLC)fst01732553.
830 0 ▼a Chapman & Hall/CRC machine learning & pattern recognition series.

Holdings Information

No. 1
Location: Medical Library / Monographs (3F)
Call Number: 006.31 S947s
Accession No.: 131051554
Availability: Available

Contents information

Table of Contents

Introduction to Reinforcement Learning
Reinforcement Learning
Mathematical Formulation
Structure of the Book
     Model-Free Policy Iteration
     Model-Free Policy Search
     Model-Based Reinforcement Learning

MODEL-FREE POLICY ITERATION

Policy Iteration with Value Function Approximation
Value Functions
     State Value Functions
     State-Action Value Functions
Least-Squares Policy Iteration
     Immediate-Reward Regression
     Algorithm
     Regularization
     Model Selection
Remarks

Basis Design for Value Function Approximation
Gaussian Kernels on Graphs
     MDP-Induced Graph
     Ordinary Gaussian Kernels
     Geodesic Gaussian Kernels
     Extension to Continuous State Spaces
Illustration
     Setup
     Geodesic Gaussian Kernels
     Ordinary Gaussian Kernels
     Graph-Laplacian Eigenbases
     Diffusion Wavelets
Numerical Examples
     Robot-Arm Control
     Robot-Agent Navigation
Remarks

Sample Reuse in Policy Iteration
Formulation
Off-Policy Value Function Approximation
     Episodic Importance Weighting
     Per-Decision Importance Weighting
     Adaptive Per-Decision Importance Weighting
     Illustration
Automatic Selection of Flattening Parameter
     Importance-Weighted Cross-Validation
     Illustration
Sample-Reuse Policy Iteration
     Algorithm
     Illustration
Numerical Examples
     Inverted Pendulum
     Mountain Car
Remarks

Active Learning in Policy Iteration
Efficient Exploration with Active Learning
     Problem Setup
     Decomposition of Generalization Error
     Estimation of Generalization Error
     Designing Sampling Policies
     Illustration
Active Policy Iteration
     Sample-Reuse Policy Iteration with Active Learning
     Illustration
Numerical Examples
Remarks

Robust Policy Iteration
Robustness and Reliability in Policy Iteration
     Robustness
     Reliability
Least Absolute Policy Iteration
     Algorithm
     Illustration
     Properties
Numerical Examples
Possible Extensions
     Huber Loss
     Pinball Loss
     Deadzone-Linear Loss
     Chebyshev Approximation
     Conditional Value-At-Risk
Remarks

MODEL-FREE POLICY SEARCH

Direct Policy Search by Gradient Ascent
Formulation
Gradient Approach
     Gradient Ascent
     Baseline Subtraction for Variance Reduction
     Variance Analysis of Gradient Estimators
Natural Gradient Approach 
     Natural Gradient Ascent
     Illustration
Application in Computer Graphics: Artist Agent
     Sumie Painting
     Design of States, Actions, and Immediate Rewards
     Experimental Results
Remarks

Direct Policy Search by Expectation-Maximization
Expectation-Maximization Approach
Sample Reuse
     Episodic Importance Weighting
     Per-Decision Importance Weighting
     Adaptive Per-Decision Importance Weighting
     Automatic Selection of Flattening Parameter
     Reward-Weighted Regression with Sample Reuse
Numerical Examples
Remarks

Policy-Prior Search
Formulation
Policy Gradients with Parameter-Based Exploration 
     Policy-Prior Gradient Ascent
     Baseline Subtraction for Variance Reduction
     Variance Analysis of Gradient Estimators
     Numerical Examples
Sample Reuse in Policy-Prior Search 
     Importance Weighting
     Variance Reduction by Baseline Subtraction
     Numerical Examples
Remarks

MODEL-BASED REINFORCEMENT LEARNING

Transition Model Estimation
Conditional Density Estimation
     Regression-Based Approach
     ε-Neighbor Kernel Density Estimation
     Least-Squares Conditional Density Estimation
Model-Based Reinforcement Learning
Numerical Examples
     Continuous Chain Walk
     Humanoid Robot Control
Remarks

Dimensionality Reduction for Transition Model Estimation
Sufficient Dimensionality Reduction
Squared-Loss Conditional Entropy
     Conditional Independence
     Dimensionality Reduction with SCE
     Relation to Squared-Loss Mutual Information
Numerical Examples
     Artificial and Benchmark Datasets 
     Humanoid Robot
Remarks

References
Index


Information Provided By: Aladin

New Arrivals: Books in Related Fields

Baumer, Benjamin (2021)