Deep Reinforcement Learning
-
BRAC+: Improved Behavior Regularized Actor Critic for Offline Reinforcement Learning
Chi Zhang (University of Southern California)*; Sanmukh Rao Kuppannagari (University of Southern California); Viktor K Prasanna (University of Southern California) -
Dynamic Coordination Graph for Cooperative Multi-Agent Reinforcement Learning
Chapman Siu (University of Technology Sydney)*; Jason Traish (University of Technology Sydney); Richard Yi Da Xu (University of Technology, Sydney) -
CTS2: Time Series Smoothing with Constrained Reinforcement Learning
Yongshuai Liu (University of California, Davis)*; Xin Liu (University of California) -
Language Representations for Generalization in Reinforcement Learning
Nikolaj S Goodger (Federation University Australia)*; Peter Vamplew (Federation University); Cameron Foale (Federation University); Richard Dazeley (Deakin University) -
Meta-Model-Based Meta-Policy Optimization
Takuya Hiraoka (NEC / AIST / RIKEN)*; Takahisa Imagawa (National Institute of Advanced Industrial Science and Technology); Voot Tangkaratt (RIKEN); Takayuki Osa (Kyushu Institute of Technology / RIKEN); Takashi Onishi (NEC Corporation); Yoshimasa Tsuruoka (The University of Tokyo) -
ContriQ: Ally-Focused Cooperation and Enemy-Concentrated Confrontation in Multi-Agent Reinforcement Learning
Chenran Zhao (National University of Defense Technology); Dianxi Shi (National Innovation Institute of Defense Technology; Tianjin Artificial Intelligence Innovation Center)*; Yaowen Zhang (National Innovation Institute of Defense Technology (NIIDT)); Huanhuan Yang (National University of Defense Technology); Shaowu Yang (National University of Defense Technology); Yongjun Zhang (National Innovation Institute of Defense Technology) -
Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation
Chenyang Zhao (University of Edinburgh)*; Timothy Hospedales (Edinburgh University) -
Learning 3-opt heuristics for traveling salesman problem via deep reinforcement learning
Jingyan Sui (Institute of Computing Technology, Chinese Academy of Sciences)*; Shi-Zhe Ding (Institute of Computing Technology, Chinese Academy of Sciences); Ruizhi Liu (Institute of Computing Technology, Chinese Academy of Sciences); Liming Xu (Institute of Computing Technology, Chinese Academy of Sciences); Dongbo Bu (Institute of Computing Technology, Chinese Academy of Sciences)