Publications
Please check my Google Scholar for a full list of my papers/publications.
Journals
Centralised Rehearsal of Decentralised Cooperation: Multi-Agent Reinforcement Learning for the Scalable Coordination of Residential Energy Flexibility
Flora Charbonnier, Bei Peng, Julie Vienne, Elena Stai, Thomas Moisten, and Malcolm McCulloch. Applied Energy, 2024. [pdf]
Dependable Learning-Enabled Multiagent Systems
Xiaowei Huang, Bei Peng, and Xingyu Zhao. AI Communications, 2022. [pdf]
Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey
Sanmit Narvekar, Bei Peng, Matteo Leonetti, Jivko Sinapov, Matthew E. Taylor, Peter Stone. Journal of Machine Learning Research (JMLR), 21(181):1-50, 2020. [pdf]
Curriculum Design for Machine Learners in Sequential Decision Tasks
Bei Peng, James MacGlashan, Robert Loftin, Michael L. Littman, David L. Roberts, and Matthew E. Taylor. IEEE Transactions on Emerging Topics in Computational Intelligence, 2018. [pdf]
Learning behaviors via human-delivered discrete feedback: modeling implicit feedback strategies to speed up learning
Robert Loftin, Bei Peng, James MacGlashan, Michael L. Littman, Matthew E. Taylor, Jeff Huang, and David L. Roberts. Journal of Autonomous Agents and Multi-Agent Systems, pages 1-30, 2015. [pdf]
Conferences
Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning
Tianhui Zhang, Bei Peng, and Danushka Bollegala. In Proceedings of the Findings of Empirical Methods in Natural Language Processing (EMNLP), 2024. [pdf]
Contextual Transformers for Goal-Oriented Reinforcement Learning
Oliver Dippel, Alexei Lisitsa, and Bei Peng. In Proceedings of the SGAI International Conference on Artificial Intelligence, AI-2024.
A Comparison Between Kalman-MLE and KalmanNet for State Estimation with Unknown Noise Parameters
Bettina Hanlon, Ángel F. García-Fernández, and Bei Peng. In Proceedings of the IEEE International Conference on Multisensor Fusion and Integration, 2024.
Accelerating Laboratory Automation Through Robot Skill Learning for Sample Scraping
Gabriella Pizzuto, Hetong Wang, Hatem Fakhruldeen, Bei Peng, Kevin Sebastian Luck, Andrew Ian Cooper. In Proceedings of the IEEE 20th International Conference on Automation Science and Engineering (CASE), 2024. [pdf]
Learning to Predict Concept Ordering for Common Sense Generation
Tianhui Zhang, Danushka Bollegala, and Bei Peng. In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (IJCNLP-AACL), 2023.
Deep Reinforcement Learning for Continuous Control of Material Thickness
Oliver Dippel, Alexei Lisitsa, and Bei Peng. In Proceedings of the SGAI International Conference on Artificial Intelligence, AI-2023. [pdf]
FACMAC: Factored Multi-Agent Centralised Policy Gradients
Bei Peng (equal contribution), Tabish Rashid (equal contribution), Christian Schroeder de Witt (equal contribution), Pierre-Alexandre Kamienny, Philip Torr, Wendelin Böhmer, Shimon Whiteson. In Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS), 2021. [pdf]
Regularized Softmax Deep Multi-Agent Q-Learning
Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson. In Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS), 2021. [pdf]
UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning
Tarun Gupta, Anuj Mahajan, Bei Peng, Wendelin Böhmer, Shimon Whiteson. In Proceedings of the 38th International Conference on Machine Learning (ICML), 2021. [pdf]
Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning
Shariq Iqbal, Christian A. Schroeder de Witt, Bei Peng, Wendelin Böhmer, Shimon Whiteson, Fei Sha. In Proceedings of the 38th International Conference on Machine Learning (ICML), 2021. [pdf]
RODE: Learning Roles to Decompose Multi-Agent Tasks
Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson, and Chongjie Zhang. In Proceedings of the 9th International Conference on Learning Representations (ICLR), 2021. [pdf]
Weighted QMIX: Improving Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson. In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS), 2020. [pdf]
Optimistic Exploration even with a Pessimistic Initialisation
Tabish Rashid, Bei Peng, Wendelin Böhmer, Shimon Whiteson. In Proceedings of the 8th International Conference on Learning Representations (ICLR), 2020. [pdf]
Interactive Learning from Policy-Dependent Human Feedback
James MacGlashan, Mark K. Ho, Robert Loftin, Bei Peng, Guan Wang, David L. Roberts, Matthew E. Taylor, and Michael L. Littman. In Proceedings of the 34th International Conference on Machine Learning (ICML), August 2017. [pdf]
Curriculum Design for Machine Learners in Sequential Decision Tasks
Bei Peng, James MacGlashan, Robert Loftin, Michael L. Littman, David L. Roberts, and Matthew E. Taylor. In Proceedings of the 2017 International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Extended Abstracts, May 2017. [pdf]
A Need for Speed: Adapting Agent Action Speed to Improve Task Learning from Non-Expert Humans
Bei Peng, James MacGlashan, Robert Loftin, Michael L. Littman, David L. Roberts, and Matthew E. Taylor. In Proceedings of the 2016 International Conference on Autonomous Agents and Multiagent Systems (AAMAS), May 2016. [pdf]
Towards Integrating Real-Time Crowd Advice with Reinforcement Learning
Gabriel V. de la Cruz Jr., Bei Peng, Walter S. Lasecki, and Matthew E. Taylor. In The 20th ACM Conference on Intelligent User Interfaces (IUI), Poster Session, March 2015. [pdf]
A Strategy-Aware Technique for Learning Behaviors from Discrete Human Feedback
Robert Loftin, James MacGlashan, Bei Peng, Michael L. Littman, Matthew E. Taylor, Jeff Huang, and David L. Roberts. In Proceedings of the 28th AAAI Conference on Artificial Intelligence (AAAI), July 2014. [pdf]
Workshops
Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients
Bozhidar Vasilev, Tarun Gupta, Bei Peng, Shimon Whiteson. In Proceedings of the Adaptive and Learning Agents workshop at AAMAS, 2021. [pdf]
VIABLE: Fast Adaptation via Backpropagating Learned Loss
Leo Feng, Luisa Zintgraf, Bei Peng, Shimon Whiteson. 3rd Workshop on Meta-Learning at NeurIPS, 2019. [pdf]
Optimistic Exploration with Pessimistic Initialisation
Tabish Rashid, Bei Peng, Wendelin Böhmer, Shimon Whiteson. 2nd Exploration in Reinforcement Learning Workshop at ICML, 2019. [pdf]
Curriculum Design for Machine Learners in Sequential Decision Tasks
Bei Peng, James MacGlashan, Robert Loftin, Michael L. Littman, David L. Roberts, and Matthew E. Taylor. In Proceedings of the Adaptive and Learning Agents workshop (at AAMAS), Sao Paulo, Brazil, May 2017. [pdf]
Open Problems for Online Bayesian Inference in Neural Networks
Robert Loftin, Matthew E. Taylor, Michael L. Littman, James MacGlashan, Bei Peng, and David L. Roberts. In Proceedings of the Bayesian Deep Learning workshop (at NIPS), December 2016. [pdf]
Towards Behavior-Aware Model Learning from Human-Generated Trajectories
Robert Loftin, James MacGlashan, Bei Peng, Matthew E. Taylor, Michael L. Littman, and David L. Roberts. In AAAI Fall Symposium on Artificial Intelligence for Human-Robot Interaction, November 2016. [pdf]
Convergent Actor Critic by Humans
James MacGlashan, Michael L. Littman, David L. Roberts, Robert Loftin, Bei Peng, and Matthew E. Taylor. In Workshop on Human-Robot Collaboration: Towards Co-Adaptive Learning Through Semi-Autonomy and Shared Control (at IROS), October 2016. [pdf]
An Empirical Study of Non-Expert Curriculum Design for Machine Learners
Bei Peng, James MacGlashan, Robert Loftin, Michael L. Littman, David L. Roberts, and Matthew E. Taylor. In Proceedings of the Interactive Machine Learning workshop (at IJCAI), July 2016. [pdf]
On the Ability to Provide Demonstrations on a UAS: Observing 90 Untrained Participants Abusing a Flying Robot
Mitchell Scott, Bei Peng, Madeline Chili, Tanay Nigam, Francis Pascual, Cynthia Matuszek, and Matthew E. Taylor. In Proceedings of the AAAI Fall Symposium on Artificial Intelligence and Human-Robot Interaction (AI-HRI), November 2015. [pdf]
Language and Policy Learning from Human-delivered Feedback
Bei Peng, Robert Loftin, James MacGlashan, Michael L. Littman, Matthew E. Taylor, and David L. Roberts. In Proceedings of the Machine Learning for Social Robotics workshop (at ICRA), May 2015. [pdf]
Generating Real-Time Crowd Advice to Improve Reinforcement Learning Agents
Gabriel V. de la Cruz Jr., Bei Peng, Walter S. Lasecki, and Matthew E. Taylor. In Proceedings of the Learning for General Competency in Video Games workshop (AAAI), January 2015. [pdf]
Learning Something from Nothing: Leveraging Implicit Human Feedback Strategies
Robert Loftin, Bei Peng, James MacGlashan, Michael Littman, Matthew E. Taylor, David Roberts, and Jeff Huang. In Proceedings of the 23rd IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), August 2014. [pdf]
Training an Agent to Ground Commands with Reward and Punishment
James MacGlashan, Michael L. Littman, Robert Loftin, Bei Peng, David Roberts, and Matthew E. Taylor. In Proceedings of the Machine Learning for Interactive Systems workshop (at AAAI), July 2014. [pdf]
Preprints
Interactive Learning of Environment Dynamics for Sequential Tasks
Robert Loftin, Bei Peng, Matthew E. Taylor, Michael L. Littman, David L. Roberts. arXiv preprint arXiv:1907.08478, 2019. [pdf]