Reinforcement Learning with Structured Actions and Policies

Balaraman Ravindran - Indian Institute of Technology Madras


Deep Reinforcement Learning has been very successful in solving a variety of hard problems. But many RL architectures treat the action as coming from an unordered set or from a bounded interval. It is often the case that the actions and policies have a non-trivial structure that can be exploited for more efficient learning. This ranges from game-playing settings where the same action is repeated multiple times, to supply-chain problems where the action space has a combinatorial structure, to problems that require a hierarchical decomposition to solve effectively. In this talk, I will present several scenarios in which taking advantage of the structure leads to more efficient learning. In particular, I will talk about some of our recent work on action repetition, actions that are related via a graph structure, ensemble policies, and policies learnt through a combination of hierarchical planning and learning.


Professor B. Ravindran heads the Robert Bosch Centre for Data Science & Artificial Intelligence (RBCDSAI) at IIT Madras, one of the leading interdisciplinary AI research centres in India. He is the Mindtree Faculty Fellow and Professor in the Department of Computer Science and Engineering at IIT Madras. He has held visiting positions at the Indian Institute of Science, Bangalore, India; the University of Technology, Sydney, Australia; and Google Research. Currently, his research interests are centred on learning from and through interactions and span the areas of geometric deep learning and reinforcement learning. He is one of the founding executive committee members of the India chapter of ACM SIGKDD. He is currently serving on the editorial boards of Machine Learning Journal (MLJ), Journal of AI Research (JAIR), ACM Transactions on Intelligent Systems and Technology (ACM TIST), PLOS One, and Frontiers in Big Data and AI. He has published nearly 100 papers in premier journals and conferences. His work with students has won multiple best paper awards, the most recent being the best application paper at PAKDD 2021. He received his PhD from the University of Massachusetts, Amherst and his Master’s degree from the Indian Institute of Science, Bangalore. He is a senior member of the Association for the Advancement of Artificial Intelligence (AAAI).