Publications
Please see my Google Scholar profile for a full list of my publications.
Journals
- Centralised Rehearsal of Decentralised Cooperation: Multi-Agent Reinforcement Learning for the Scalable Coordination of Residential Energy Flexibility 
 Flora Charbonnier, Bei Peng, Julie Vienne, Eleni Stai, Thomas Morstyn, and Malcolm McCulloch. Applied Energy, 2024. [pdf]
- Dependable Learning-Enabled Multiagent Systems 
 Xiaowei Huang, Bei Peng, and Xingyu Zhao. AI Communications, 2022. [pdf]
- Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey 
 Sanmit Narvekar, Bei Peng, Matteo Leonetti, Jivko Sinapov, Matthew E. Taylor, Peter Stone. Journal of Machine Learning Research (JMLR), 21(181):1-50, 2020. [pdf]
- Curriculum Design for Machine Learners in Sequential Decision Tasks 
 Bei Peng, James MacGlashan, Robert Loftin, Michael L. Littman, David L. Roberts, and Matthew E. Taylor. IEEE Transactions on Emerging Topics in Computational Intelligence, 2018. [pdf]
- Learning behaviors via human-delivered discrete feedback: modeling implicit feedback strategies to speed up learning 
 Robert Loftin, Bei Peng, James MacGlashan, Michael L. Littman, Matthew E. Taylor, Jeff Huang, and David L. Roberts. Journal of Autonomous Agents and Multi-Agent Systems, pages 1-30, 2015. [pdf]
Conferences
- Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning 
 Tianhui Zhang, Bei Peng, and Danushka Bollegala. In Proceedings of the Findings of Empirical Methods in Natural Language Processing (EMNLP), 2024. [pdf]
- Contextual Transformers for Goal-Oriented Reinforcement Learning 
 Oliver Dippel, Alexei Lisitsa, and Bei Peng. In Proceedings of the SGAI International Conference on Artificial Intelligence, AI-2024.
- A Comparison Between Kalman-MLE and KalmanNet for State Estimation with Unknown Noise Parameters 
 Bettina Hanlon, Ángel F. García-Fernández, and Bei Peng. In Proceedings of the IEEE International Conference on Multisensor Fusion and Integration, 2024.
- Accelerating Laboratory Automation Through Robot Skill Learning for Sample Scraping 
 Gabriella Pizzuto, Hetong Wang, Hatem Fakhruldeen, Bei Peng, Kevin Sebastian Luck, and Andrew Ian Cooper. In Proceedings of the IEEE 20th International Conference on Automation Science and Engineering (CASE), 2024. [pdf]
- Learning to Predict Concept Ordering for Common Sense Generation 
 Tianhui Zhang, Danushka Bollegala, and Bei Peng. In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (IJCNLP-AACL), 2023.
- Deep Reinforcement Learning for Continuous Control of Material Thickness 
 Oliver Dippel, Alexei Lisitsa, and Bei Peng. In Proceedings of the SGAI International Conference on Artificial Intelligence, AI-2023. [pdf]
- FACMAC: Factored Multi-Agent Centralised Policy Gradients 
 Bei Peng (equal contribution), Tabish Rashid (equal contribution), Christian Schroeder de Witt (equal contribution), Pierre-Alexandre Kamienny, Philip Torr, Wendelin Böhmer, Shimon Whiteson. In Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS), 2021. [pdf]
- Regularized Softmax Deep Multi-Agent Q-Learning 
 Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson. In Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS), 2021. [pdf]
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning 
 Tarun Gupta, Anuj Mahajan, Bei Peng, Wendelin Böhmer, Shimon Whiteson. In Proceedings of the 38th International Conference on Machine Learning (ICML), 2021. [pdf]
- Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning 
 Shariq Iqbal, Christian A. Schroeder de Witt, Bei Peng, Wendelin Böhmer, Shimon Whiteson, Fei Sha. In Proceedings of the 38th International Conference on Machine Learning (ICML), 2021. [pdf]
- RODE: Learning Roles to Decompose Multi-Agent Tasks 
 Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson, and Chongjie Zhang. In Proceedings of the 9th International Conference on Learning Representations (ICLR), 2021. [pdf]
- Weighted QMIX: Improving Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning 
 Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson. In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS), 2020. [pdf]
- Optimistic Exploration even with a Pessimistic Initialisation 
 Tabish Rashid, Bei Peng, Wendelin Böhmer, Shimon Whiteson. In Proceedings of the 8th International Conference on Learning Representations (ICLR), 2020. [pdf]
- Interactive Learning from Policy-Dependent Human Feedback 
 James MacGlashan, Mark K. Ho, Robert Loftin, Bei Peng, Guan Wang, David L. Roberts, Matthew E. Taylor, and Michael L. Littman. In Proceedings of the 34th International Conference on Machine Learning (ICML), August 2017. [pdf]
- Curriculum Design for Machine Learners in Sequential Decision Tasks 
 Bei Peng, James MacGlashan, Robert Loftin, Michael L. Littman, David L. Roberts, and Matthew E. Taylor. In Proceedings of the 2017 International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Extended Abstracts, May 2017. [pdf]
- A Need for Speed: Adapting Agent Action Speed to Improve Task Learning from Non-Expert Humans 
 Bei Peng, James MacGlashan, Robert Loftin, Michael L. Littman, David L. Roberts, and Matthew E. Taylor. In Proceedings of the 2016 International Conference on Autonomous Agents and Multiagent Systems (AAMAS), May 2016. [pdf]
- Towards Integrating Real-Time Crowd Advice with Reinforcement Learning 
 Gabriel V. de la Cruz Jr., Bei Peng, Walter S. Lasecki, and Matthew E. Taylor. In The 20th ACM Conference on Intelligent User Interfaces (IUI), Poster Session, March 2015. [pdf]
- A Strategy-Aware Technique for Learning Behaviors from Discrete Human Feedback 
 Robert Loftin, James MacGlashan, Bei Peng, Michael L. Littman, Matthew E. Taylor, Jeff Huang, and David L. Roberts. In Proceedings of the 28th AAAI Conference on Artificial Intelligence (AAAI), July 2014. [pdf]
Workshops
- Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients 
 Bozhidar Vasilev, Tarun Gupta, Bei Peng, Shimon Whiteson. In Proceedings of the Adaptive and Learning Agents workshop at AAMAS, 2021. [pdf]
- VIABLE: Fast Adaptation via Backpropagating Learned Loss 
 Leo Feng, Luisa Zintgraf, Bei Peng, Shimon Whiteson. 3rd Workshop on Meta-Learning at NeurIPS, 2019. [pdf]
- Optimistic Exploration with Pessimistic Initialisation 
 Tabish Rashid, Bei Peng, Wendelin Böhmer, Shimon Whiteson. 2nd Exploration in Reinforcement Learning Workshop at ICML, 2019. [pdf]
- Curriculum Design for Machine Learners in Sequential Decision Tasks 
 Bei Peng, James MacGlashan, Robert Loftin, Michael L. Littman, David L. Roberts, and Matthew E. Taylor. In Proceedings of the Adaptive and Learning Agents workshop (at AAMAS), Sao Paulo, Brazil, May 2017. [pdf]
- Open Problems for Online Bayesian Inference in Neural Networks 
 Robert Loftin, Matthew E. Taylor, Michael L. Littman, James MacGlashan, Bei Peng, and David L. Roberts. In Proceedings of Bayesian Deep Learning workshop (at NIPS), December 2016. [pdf]
- Towards Behavior-Aware Model Learning from Human-Generated Trajectories 
 Robert Loftin, James MacGlashan, Bei Peng, Matthew E. Taylor, Michael L. Littman, and David L. Roberts. In AAAI Fall Symposium on Artificial Intelligence for Human-Robot Interaction, November 2016. [pdf]
- Convergent Actor Critic by Humans 
 James MacGlashan, Michael L. Littman, David L. Roberts, Robert Loftin, Bei Peng, and Matthew E. Taylor. In Workshop on Human-Robot Collaboration: Towards Co-Adaptive Learning Through Semi-Autonomy and Shared Control (at IROS), October 2016. [pdf]
- An Empirical Study of Non-Expert Curriculum Design for Machine Learners 
 Bei Peng, James MacGlashan, Robert Loftin, Michael L. Littman, David L. Roberts, and Matthew E. Taylor. In Proceedings of the Interactive Machine Learning workshop (at IJCAI), July 2016. [pdf]
- On the Ability to Provide Demonstrations on a UAS: Observing 90 Untrained Participants Abusing a Flying Robot 
 Mitchell Scott, Bei Peng, Madeline Chili, Tanay Nigam, Francis Pascual, Cynthia Matuszek, and Matthew E. Taylor. In Proceedings of the AAAI Fall Symposium on Artificial Intelligence and Human-Robot Interaction (AI-HRI), November 2015. [pdf]
- Language and Policy Learning from Human-delivered Feedback 
 Bei Peng, Robert Loftin, James MacGlashan, Michael L. Littman, Matthew E. Taylor, and David L. Roberts. In Proceedings of the Machine Learning for Social Robotics workshop (at ICRA), May 2015. [pdf]
- Generating Real-Time Crowd Advice to Improve Reinforcement Learning Agents 
 Gabriel V. de la Cruz Jr., Bei Peng, Walter S. Lasecki, and Matthew E. Taylor. In Proceedings of the Learning for General Competency in Video Games workshop (AAAI), January 2015. [pdf]
- Learning Something from Nothing: Leveraging Implicit Human Feedback Strategies 
 Robert Loftin, Bei Peng, James MacGlashan, Michael Littman, Matthew E. Taylor, David Roberts, and Jeff Huang. In Proceedings of the 23rd IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), August 2014. [pdf]
- Training an Agent to Ground Commands with Reward and Punishment 
 James MacGlashan, Michael L. Littman, Robert Loftin, Bei Peng, David Roberts, and Matthew E. Taylor. In Proceedings of the Machine Learning for Interactive Systems workshop (at AAAI), July 2014. [pdf]
Preprints
- Interactive Learning of Environment Dynamics for Sequential Tasks 
 Robert Loftin, Bei Peng, Matthew E. Taylor, Michael L. Littman, David L. Roberts. arXiv preprint arXiv:1907.08478, 2019. [pdf]
