Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret
Abstract
We present the first computationally efficient algorithm with $\widetilde{O}(\sqrt{T})$ regret for learning in Linear Quadratic Control systems with unknown linear dynamics and known quadratic costs.