Learning to Switch Optimizers for Quadratic Programming

Grant E Getzelman (Argonne National Lab)*; Prasanna Balaprakash (Argonne National Laboratory)


Quadratic programming (QP) seeks to solve optimization problems involving quadratic functions that can include complex boundary constraints. QP in the unrestricted form is NP-hard, but when restricted to the convex case, they become tractable. Active-set methods and interior point methods are used to solve convex problems, and in the nonconvex case various heuristic or relaxations are used to produce high quality solutions in finitetime. Learning to optimize is an emerging approach to design solvers for optimization problems. We develop a learning to optimize approach that use reinforcement learning tolearn a stochastic policy to switch between pre-existing optimization algorithms to solve QP problem instances. In particular, our agent switches between between three simple optimiz-ers, Adam, Gradient Descent (GD), and Random Search (RS). Our experiments show the learned optimizer minimizes quadratic functions faster and finds better quality solutions in the long term than any of the possible optimizers switched between. We also compare our solver against the standard QP algorithms in Matlab, and find better performance in fewer function evaluations.