• Login
    View Item 
    •   Home
    • Graduate Theses, Dissertations and Projects
    • Queen's Graduate Theses and Dissertations
    • View Item
    •   Home
    • Graduate Theses, Dissertations and Projects
    • Queen's Graduate Theses and Dissertations
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    Online Learning in Control Theory

    Thumbnail
    View/Open
    AkbariVarnousfaderani_Mohammad_202206_PhD.pdf (937.8Kb)
    Author
    Akbari Varnousfaderani, Mohammad
    Metadata
    Show full item record
    Abstract
    In this thesis, we study two classes of problems in optimal control theory involving unknown parameters, with focus on Linear-Quadratic-Gaussian systems.

    In the first problem, the control system is known and linear, and a sequence of quadratic cost functions is revealed to the controller in hindsight. The controller aims to achieve a sublinear regret, similar to the online optimization setting. A modified online Riccati algorithm is introduced that under some uniform boundedness assumptions results in a logarithmic regret bound. In particular, the logarithmic regret for the scalar case is achieved without any extra condition, while in the vector case we impose a uniform boundedness assumption. In addition to having a better regret bound, the proposed algorithm has reduced complexity compared to the ones in the literature which mainly rely on solving semi-definite programs in each time step.

    In the second problem, the true system transition parameters (matrices A and B) are unknown, and the objective is to design and analyze algorithms that generate control policies with sublinear regret. Recent studies show that when the system parameters are fully unknown, for any algorithm that only uses data from the past system trajectory there exists a choice of the unknown parameters such that the algorithm at best achieves a square-root regret bound, providing a hard fundamental limit on the achievable regret in general. However, it is known that (poly)-logarithmic regret is achievable when only matrix A, or only matrix B is unknown. We prove a result, showing that (poly)-logarithmic regret is achievable when both of these matrices are unknown, but a hint about one of them is given to the learner periodically. This result is the first attempt to achieve poly-logarithmic regret for the case that both A and B are unknown under milder assumptions.
    URI for this record
    http://hdl.handle.net/1974/30191
    Collections
    • Queen's Graduate Theses and Dissertations
    Request an alternative format
    If you require this document in an alternate, accessible format, please contact the Queen's Adaptive Technology Centre

    DSpace software copyright © 2002-2015  DuraSpace
    Contact Us
    Theme by 
    Atmire NV
     

     

    Browse

    All of QSpaceCommunities & CollectionsPublished DatesAuthorsTitlesSubjectsTypesThis CollectionPublished DatesAuthorsTitlesSubjectsTypes

    My Account

    LoginRegister

    Statistics

    View Usage StatisticsView Google Analytics Statistics

    DSpace software copyright © 2002-2015  DuraSpace
    Contact Us
    Theme by 
    Atmire NV