Online Strongly Convex Optimization with Unknown Delays

Yuanyu Wan (Nanjing University); Wei-Wei Tu (4Paradigm Inc.); Lijun Zhang (Nanjing University)*

Abstract

We investigate the problem of online convex optimization with unknown delays, in which the feedback of a decision may arrive with an arbitrary delay. Previous studies have proposed delayed online gradient descent (DOGD), which achieves a regret bound of $O(\sqrt{T+D})$ by exploiting only convexity, where $D$ is the sum of delays over $T$ rounds. In this paper, we further exploit strong convexity to improve the regret bound. Specifically, we first propose a variant of DOGD for strongly convex functions and establish a better regret bound of $O(d\log T)$, where $d$ is the maximum delay. The essential idea is to let the learning rate decay linearly with the total number of feedbacks received so far. Furthermore, we extend the strongly convex variant of DOGD and its theoretical guarantee to the more challenging bandit setting by combining it with the classical $(n+1)$-point and two-point gradient estimators. To the best of our knowledge, this is the first work that solves online strongly convex optimization under the general delayed setting.
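To illustrate the idea, the following is a minimal sketch of delayed online gradient descent for a $\lambda$-strongly convex loss, where each gradient arrives after an arbitrary delay and the learning rate is set to $1/(\lambda s)$ with $s$ the number of feedbacks received so far. The function names, the projection onto a ball of radius `radius`, and the buffering scheme are our own illustrative assumptions, not the paper's exact pseudocode.

```python
import numpy as np

def delayed_ogd_strongly_convex(T, delays, grad_fn, dim, lam=1.0, radius=1.0):
    """Illustrative sketch (not the paper's exact algorithm): DOGD-style
    updates where the step size decays with the count of received feedbacks.

    grad_fn(t, x) returns the gradient of the round-t loss at decision x;
    that gradient is only applied delays[t] rounds later.
    """
    x = np.zeros(dim)
    s = 0          # number of gradients (feedbacks) received so far
    buffer = {}    # arrival round -> list of gradients arriving then
    history = []   # decision played in each round
    for t in range(T):
        history.append(x.copy())
        g = grad_fn(t, x)                  # gradient queried now...
        buffer.setdefault(t + delays[t], []).append(g)  # ...seen later
        # Apply every piece of feedback that arrives in this round.
        for g_late in buffer.pop(t, []):
            s += 1
            eta = 1.0 / (lam * s)          # step size decays with s
            x = x - eta * g_late
            norm = np.linalg.norm(x)       # project back onto the ball
            if norm > radius:
                x = x * (radius / norm)
    # Gradients scheduled to arrive after round T-1 are simply discarded.
    return history
```

As a quick sanity check, running this on the 1-strongly convex losses $f_t(x) = \tfrac12\|x - c\|^2$ with a constant delay of one round drives the decisions toward the common minimizer $c$.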