University of California San Diego
****************************
Math 296 - Graduate Student Colloquium
Prof. Yuhua Zhu
UC San Diego
A PDE-Based Bellman Equation for Continuous-Time Reinforcement Learning
Abstract:
In this talk, we address the problem of continuous-time reinforcement learning in scenarios where the dynamics follow a stochastic differential equation. When the underlying dynamics are unknown and we have access only to discrete-time information, how can we effectively conduct policy evaluation? We first demonstrate that the commonly used Bellman equation is a first-order approximation to the true value function. We then introduce a higher-order PDE-based Bellman equation called PhiBE. We show that the solution to the i-th order PhiBE is an i-th order approximation to the true value function. Additionally, even the first-order PhiBE outperforms the Bellman equation in approximating the true value function when the system dynamics change slowly. We develop a numerical algorithm based on the Galerkin method to solve PhiBE when only discrete-time trajectory data are available. Numerical experiments are provided to validate the theoretical guarantees we propose.
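The first-order claim about the Bellman equation can be checked by hand on a toy problem. The sketch below is an illustration under stated assumptions, not the speaker's method or code: it assumes a linear reward r(s) = s, dynamics whose conditional mean satisfies E[s_t | s_0 = s] = exp(lam * t) * s (for example ds = lam * s dt + sigma dW), a discount rate beta > lam, and the standard time-discretized Bellman equation V(s) = r(s) * dt + exp(-beta * dt) * E[V(s_dt)]. Under these assumptions both the true value function and the Bellman solution are linear, V(s) = c * s, so comparing them reduces to comparing two coefficients. All function and parameter names are hypothetical.

```python
import math


def true_coefficient(beta, lam):
    """Coefficient of the continuous-time value function V(s) = c * s.

    V(s) = integral_0^inf exp(-beta*t) * E[s_t] dt = s / (beta - lam),
    valid only for beta > lam (assumed toy setting).
    """
    return 1.0 / (beta - lam)


def bellman_coefficient(beta, lam, dt):
    """Coefficient c solving the discretized Bellman equation for V(s) = c * s.

    c*s = s*dt + exp(-beta*dt) * c * exp(lam*dt) * s
    =>  c = dt / (1 - exp(-(beta - lam) * dt)).
    """
    return dt / (1.0 - math.exp(-(beta - lam) * dt))


def bellman_errors(beta=1.0, lam=0.2, dts=(0.4, 0.2, 0.1, 0.05)):
    """Absolute error of the Bellman coefficient for each time step in dts."""
    exact = true_coefficient(beta, lam)
    return [abs(bellman_coefficient(beta, lam, dt) - exact) for dt in dts]


if __name__ == "__main__":
    dts = (0.4, 0.2, 0.1, 0.05)
    for dt, err in zip(dts, bellman_errors(dts=dts)):
        # Halving dt should roughly halve the error (first-order behavior).
        print(f"dt = {dt:<5} error = {err:.6f}")
```

In this toy setting the error expands as dt/2 plus higher-order terms, so it shrinks linearly with the sampling interval, which is consistent with the first-order statement in the abstract.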
Host: Jon Novak
February 14, 2024
3:00 PM
Remote Access via Zoom
https://ucsd.zoom.us/j/
****************************