Model-free Reinforcement Learning Technique for Nonlinear Systems

Mohamadizaniani, Maryam
Model-free RL, Lie-bracket Averaging, Set-based estimation, ESC
In this study, we propose an approach based on extremum-seeking control (ESC) via Lie-bracket averaging for approximating the solution of optimal control problems for a class of unknown nonlinear dynamical systems. This model-free approach combines ESC via Lie-bracket averaging with a reinforcement learning (RL) strategy. The learning scheme estimates the unknown value function and the corresponding optimal control policy using the Bellman equation and set-based least-squares estimation, which avoids the dual parameterization of the actor-critic (AC) methodology for RL. The Lie-bracket approximation for ESC is used to approximate the optimal state-feedback controller, which avoids overparameterizing the system's dynamics and the associated increase in estimation bias that occurs in typical model-free AC methods. The proposed approach is shown to provide reasonable approximations of optimal control problems without requiring a parameterization of the nonlinear system's dynamics.
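To illustrate the Lie-bracket averaging idea behind ESC, the following is a minimal sketch on a scalar static cost, not the method developed in this work. It assumes a hypothetical quadratic cost `cost(x)` and illustrative gains `alpha`, `c`, and dither frequency `omega`, none of which come from the abstract. The controller only measures the cost value, yet for large `omega` its averaged behavior approximates the gradient descent flow x' = -(c * alpha / 2) * dJ/dx.

```python
import math


def cost(x):
    """Hypothetical cost J(x); the controller only sees its measured value."""
    return (x - 2.0) ** 2


def esc_rhs(t, x, alpha, c, omega):
    """Extremum-seeking vector field with sinusoidal dither.

    For large omega, the averaged (Lie-bracket) dynamics are approximately
    x' = -(c * alpha / 2) * dJ/dx, i.e. gradient descent on J.
    """
    s = math.sqrt(omega)
    return s * (alpha * math.cos(omega * t) - c * cost(x) * math.sin(omega * t))


def simulate_esc(x0=0.0, alpha=0.5, c=1.0, omega=400.0, t_end=20.0, dt=5e-4):
    """Integrate the ESC loop with a fixed-step RK4 scheme; return the final x."""
    x, t = x0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = esc_rhs(t, x, alpha, c, omega)
        k2 = esc_rhs(t + 0.5 * dt, x + 0.5 * dt * k1, alpha, c, omega)
        k3 = esc_rhs(t + 0.5 * dt, x + 0.5 * dt * k2, alpha, c, omega)
        k4 = esc_rhs(t + dt, x + dt * k3, alpha, c, omega)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return x


if __name__ == "__main__":
    # The state should settle in a small neighborhood of the minimizer x = 2.
    print(simulate_esc())
```

The gradient of the cost is never computed: the descent direction emerges from the interaction of the two dither terms. The residual oscillation around the minimizer shrinks as `omega` grows, at the price of a faster control signal.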