Publications
80 papers on multi-agent reinforcement learning, LLM reasoning, and machine learning systems. Google Scholar →
2026
Learning to Reason in Structured In-context Environments with Reinforcement Learning
Peng Yu, Zeyuan Zhao, Shao Zhang, Luoyi Fu, Xinbing Wang, Ying Wen
MAGIC: A Co-Evolving Attacker-Defender Adversarial Game for Robust LLM Safety
Xiaoyu Wen, Zhida He, Han Qi, Ziyu Wan, Zhongtian Ma, Ying Wen, Tianhang Zheng, Xingcheng Xu, Chaochao Lu, Qiaosheng Zhang
MemRL: Self-evolving agents via runtime reinforcement learning on episodic memory
Shengtao Zhang, Jiaqian Wang, Ruiwen Zhou, Junwei Liao, Yuchen Feng, Zhuo Li, Yujie Zheng, Weinan Zhang, Ying Wen, Zhiyu Li, Feiyu Xiong, Yutao Qi, Bo Tang, Muning Wen
Understanding Agent Scaling in LLM-Based Multi-Agent Systems via Diversity
Yingxuan Yang, Chengrui Qu, Muning Wen, Laixi Shi, Ying Wen, Weinan Zhang, Adam Wierman, Shangding Gu
2025
A survey of AI agent protocols
Yingxuan Yang, Huacan Chai, Yuanyi Song, Siyuan Qi, Muning Wen, Ning Li, Junwei Liao, Haoyi Hu, Jianghao Lin, Gaowei Chang, Weiwen Liu, Ying Wen, Yong Yu, Weinan Zhang
Agentic web: Weaving the next web with AI agents
Yingxuan Yang, Mulei Ma, Yuxuan Huang, Huacan Chai, Chenyu Gong, Haoran Geng, Yuanjian Zhou, Ying Wen, Meng Fang, Muhao Chen, Shangding Gu, Ming Jin, Costas Spanos, Yang Yang, Pieter Abbeel, Dawn Song, Weinan Zhang, Jun Wang
Agent exchange: Shaping the future of AI agent economics
Yingxuan Yang, Ying Wen, Jun Wang, Weinan Zhang
AT-Drone: Benchmarking Adaptive Teaming in Multi-Drone Pursuit
Y Li, J Chen, F Xue, J Qiu, W Li, Q Zhang, Y Wen, W Pan
Embodied arena: A comprehensive, unified, and evolving evaluation platform for embodied AI
Fei Ni, Min Zhang, Pengyi Li, Yifu Yuan, Lingfeng Zhang, Yuecheng Liu, Peilong Han, Longxin Kou, Shaojin Ma, Jinbin Qiao, David Gamaliel Arcos Bravo, Yuening Wang, Xiao Hu, Zhanguang Zhang, Xianze Yao, Yutong Li, Zhao Zhang, Ying Wen, Ying-Cong Chen, Xiaodan Liang, Liang Lin, Bin He, Haitham Bou-Ammar, He Wang, Huazhe Xu, Jiankang Deng, Shan Luo, Shuqiang Jiang, Wei Pan, Yang Gao, Stefanos Zafeiriou, Jan Peters, Yuzheng Zhuang, Yingxue Zhang, Yan Zheng, Hongyao Tang, Jianye Hao
Language Games as the Pathway to Artificial Superhuman Intelligence
Ying Wen, Ziyu Wan, Shao Zhang
Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration
Shao Zhang, Xihuai Wang, Wenhao Zhang, Chaoran Li, Junru Song, Tingyu Li, Lin Qiu, Xuezhi Cao, Xunliang Cai, Wen Yao, Weinan Zhang, Xinbing Wang, Ying Wen
ML-Master: Towards AI-for-AI via integration of exploration and reasoning
Zexi Liu, Yuzhu Cai, Xinyu Zhu, Yujie Zheng, Runkun Chen, Ying Wen, Yanfeng Wang, Weinan E, Siheng Chen
PMAT: Optimizing Action Generation Order in Multi-Agent Reinforcement Learning
Kun Hu, Muning Wen, Xihuai Wang, Shao Zhang, Yiwei Shi, Minne Li, Minglong Li, Ying Wen
Progra: Progress-Aware Reinforcement Learning for Multi-Turn Function Calling
Haochen Chai, Zhicheng Cao, Mingxuan Ran, Yiming Yang, Jiawei Lin, Ruizhi Ding, Ziyu Wan, Muning Wen, Weinan Liu, Ying Wen
RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors
Fengshuo Bai, Runze Liu, Yali Du, Ying Wen, Yaodong Yang
ReMA: Learning to Meta-Think for LLMs with Multi-Agent Reinforcement Learning
Ziyu Wan, Yunxiang Li, Xiaoyu Wen, Yan Song, Hanjing Wang, Linyi Yang, Mark Schmidt, Jun Wang, Weinan Zhang, Shuyue Hu, Ying Wen
RHINO: Learning Real-Time Humanoid-Human-Object Interaction from Human Demonstrations
J Chen, X Li, J Cao, Z Zhu, W Dong, M Liu, Y Wen, Y Yu, L Zhang
Retrieval-Augmented Process Reward Model for Generalizable Mathematical Reasoning
J Zhu, C Zheng, J Lin, K Du, Y Wen, Y Yu, J Wang, W Zhang
Sequence Pathfinder for Multi-Agent Pickup and Delivery in the Warehouse
Z Zhao, C Li, S Zhang, Y Wen
Retrieval dexterity: Efficient object retrieval in clutters with dexterous hand
F Bai, Y Li, J Chu, T Chou, R Zhu, Y Wen, Y Yang, Y Chen
STAR: Efficient Preference-based Reinforcement Learning via Dual Regularization
Muning Wen, Ziyu Wan, Weinan Zhang, Jun Wang, Ying Wen
Towards Monotonic Improvement in In-Context Reinforcement Learning
W Zhang, S Zhang, X Wang, Y Li, Y Wen
ThinkBench: Dynamic out-of-distribution evaluation for robust LLM reasoning
S Huang, L Yang, Y Song, S Chen, L Cui, Z Wan, Q Zeng, Y Wen, K Shao
ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling
J Lin, Y Shi, X Peng, R Ding, H Wang, Y Peng, B Bai, W Song, F Bai
Unlocking the potential of decentralized LLM-based MAS: Privacy preservation and monetization in collective intelligence
Y Yang, Q Peng, J Wang, Y Wen, W Zhang
2024
Agent Exchange: An Auction Platform for AI Agent Marketplaces
Y Yang, Y Wen, J Wang, W Zhang
Aligning Individual and Collective Objectives in Multi-Agent Cooperation
Yang Li, Wenhao Zhang, Jianhong Wang, Shao Zhang, Yali Du, Ying Wen, Wei Pan
Cooperative Open-ended Learning Framework for Zero-Shot Coordination
Yang Li, Shao Zhang, Jichen Sun, Yali Du, Ying Wen, Xinbing Wang, Wei Pan
Conflux-PSRO: Effectively leveraging collective advantages in policy space response oracles
Yucong Huang, Jiesong Lian, Mingzhi Wang, Chengdong Ma, Ying Wen
Cross-Utterance Conditioned VAE for Speech Generation
Y Li, C Yu, G Sun, W Zu, Z Tian, Y Wen, W Pan, C Zhang, J Wang, Y Yang
Critic-Guided Decision Transformer for Offline Reinforcement Learning
Yuanfu Wang, Chao Yang, Ying Wen*, Yu Liu, Yu Qiao
Controlling large language model-based agents for large-scale decision-making: An actor-critic approach
B Zhang, H Mao, J Ruan, Y Wen, Y Li, S Zhang, Z Xu, D Li, Z Li, R Zhao
Efficient model-agnostic alignment via Bayesian persuasion
Fengshuo Bai, Mingzhi Wang, Zhaowei Zhang, Boyuan Chen, Yinda Xu, Ying Wen, Yaodong Yang
Efficient preference-based reinforcement learning via aligned experience estimation
F Bai, R Zhao, H Zhang, S Cui, Y Wen, Y Yang, B Xu, L Han
DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning
S Guo, C Deng, Y Wen, H Chen, Y Chang, J Wang
Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement
Muning Wen, Junwei Liao, Cheng Deng, Jun Wang, Weinan Zhang, Ying Wen
Fusion-PSRO: Nash policy fusion for policy space response oracles
Jiesong Lian, Yucong Huang, Chengdong Ma, Mingzhi Wang, Ying Wen, Long Hu, Yixue Hao
HOLA-Drone: Hypergraphic Open-ended Learning for Zero-Shot Multi-Drone Cooperative Pursuit
Yang Li, Dengyu Zhang, Junfan Chen, Ying Wen, Qingrui Zhang, Shaoshuai Mou, Wei Pan
KaLM: Knowledge-aligned autoregressive language modeling via dual-view knowledge graph contrastive learning
Peng Yu, Cheng Deng, Beiya Dai, Xinbing Wang, Ying Wen
Leveraging Team Correlation for Approximating Equilibrium in Two-Team Zero-Sum Games
Naming Liu, Mingzhi Wang, Youzhi Zhang, Yaodong Yang, Bo An, Ying Wen
Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task
Shao Zhang, Xihuai Wang, Wenhao Zhang, Yongshan Chen, Lian Gao, Dong Wang, Weinan Zhang, Xinbing Wang, Ying Wen
Natural language reinforcement learning
X Feng, B Liu, Y Song, H Fu, Z Wan, GA Koushik, Z Hu, M Yang, Y Wen
Open-Ended Learning in General-Sum Games: The Role of Diversity in Correlated Equilibrium
Z Zhao, M Wen, Y Wen, Y Yang
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
OpenR Team, Ying Wen
Reinforcing Language Agents via Policy Optimization with Action Decomposition
Muning Wen, Ziyu Wan, Weinan Zhang, Jun Wang, Ying Wen
Reinforcing LLM Agents via Policy Optimization with Action Decomposition
M Wen, Z Wan, J Wang, W Zhang, Y Wen
Tackling cooperative incompatibility for zero-shot human-AI coordination
Y Li, S Zhang, J Sun, W Zhang, Y Du, Y Wen, X Wang, W Pan
AlphaZero-like Tree-Search can Guide Large Language Model Decoding and Training
Ziyu Wan, Xidong Feng, Muning Wen, Stephen Marcus McAleer, Ying Wen, Weinan Zhang, Jun Wang
TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision
R Zhou, Y Yang, M Wen, Y Wen, W Wang, C Xi, G Xu, Y Yu, W Zhang
ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination
Xihuai Wang, Shao Zhang, Wenhao Zhang, Wentao Dong, Jingxiao Chen, Ying Wen, Weinan Zhang
2023
GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models
Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, J. Wang, Yaodong Yang, Luo Mai
Offline Pre-trained Multi-agent Decision Transformer
Linghui Meng, Muning Wen, Chenyang Le, Xiyun Li, Dengpeng Xing, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Yaodong Yang, Bo Xu
Order Matters: Agent-by-agent Policy Optimization
Xihuai Wang, Zheng Tian, Ziyu Wan, Ying Wen, J. Wang, Weinan Zhang
2022
Greedy when sure and conservative when uncertain about the opponents
H Fu, Y Tian, H Yu, W Liu, S Wu, J Xiong, Y Wen, K Li, J Xing, Q Fu
Multi-agent feedback enabled neural networks for intelligent communications
F Sun, Y Li, Y Wen, J Hu, J Wang, Y Yang, K Li
2021
A Game-Theoretic Approach to Multi-Agent Trust Region Optimization
Ying Wen, Hui Chen, Yaodong Yang, Zheng Tian, Minne Li, Xu Chen, Jun Wang
🏆 Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems
Yaodong Yang, Jun Luo, Ying Wen, Oliver Slumbers, D. Graves, H. Ammar, Jun Wang, Matthew E. Taylor
Discovering Multi-Agent Auto-Curricula in Two-Player Zero-Sum Games
Xidong Feng, Oliver Slumbers, Yaodong Yang, Ziyu Wan, Bo Liu (Benjamin Liu), S. McAleer, Ying Wen, Jun Wang
Learning in Nonzero-Sum Stochastic Games with Potentials
D. Mguni, Yutong Wu, Yali Du, Yaodong Yang, Ziyi Wang, Minne Li, Ying Wen, Joel Jennings, Jun Wang
MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning
Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Weinan Zhang, Jun Wang
Modelling behavioural diversity for learning in open-ended games
N Perez-Nieves, Y Yang, O Slumbers, DH Mguni, Y Wen, J Wang
Neural Auto-Curricula in Two-Player Zero-Sum Games
Xidong Feng, Oliver Slumbers, Ziyu Wan, Bo Liu (Benjamin Liu), S. McAleer, Ying Wen, Jun Wang, Yaodong Yang
Towards Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games
Xiangyu Liu, Hangtian Jia, Ying Wen, Yaodong Yang, Yujing Hu, Yingfeng Chen, Changjie Fan, Zhipeng Hu
Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning
J. Kuba, Ruiqing Chen, Muning Wen, Ying Wen, Fanglei Sun, Jun Wang, Yaodong Yang
Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games
Xiangyu Liu, Hangtian Jia, Ying Wen, Yaodong Yang, Yujing Hu, Yingfeng Chen, Changjie Fan, Zhipeng Hu
2020
Multi-Agent Determinantal Q-Learning
Yaodong Yang, Ying Wen, Lihuan Chen, Jun Wang, Kun Shao, D. Mguni, Weinan Zhang
🏆 SMARTS: An Open-Source Scalable Multi-Agent RL Training School for Autonomous Driving
Ming Zhou, Jun Luo, Julian Villela, Yaodong Yang, David Rusu, J. Miao, Weinan Zhang, Montgomery Alban, Iman Fadakar, Zheng Chen, Chong-ping Huang, Ying Wen, Kimia Hassanzadeh, D. Graves, Zhengbang Zhu, Yihan Ni, Nhat M. Nguyen, Mohamed Elsayed, H. Ammar, A. Cowen-Rivers, S. Ahilan, Zheng Tian, Daniel Palenicek, Kasra Rezaee, Peyman Yadmellat, Kun Shao, Dong Chen, Baokuan Zhang, Hongbo Zhang, Jianye Hao, Wulong Liu, Jun Wang
SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving
Ming Zhou, Jun Luo, Julian Villela, Yaodong Yang, David Rusu, Jiayu Miao, Weinan Zhang, Montgomery Alban, Iman Fadakar, Zheng Chen, Aurora Chongxi Huang, Ying Wen, Kimia Hassanzadeh, D. Graves, Dong Chen, Zhengbang Zhu, Nhat M. Nguyen, M. ElSayed, Kun Shao, S. Ahilan, Baokuan Zhang, Jiannan Wu, Zhengang Fu, Kasra Rezaee, Peyman Yadmellat, Mohsen Rohani, Nicolas Perez Nieves, Yihan Ni, Seyedershad Banijamali, Alexander Cowen Rivers, Zheng Tian, Daniel Palenicek, H. Ammar, Hongbo Zhang, Wulong Liu, Jianye Hao, Jun Wang
2019
A Regularized Opponent Model with Maximum Entropy Objective
Zheng Tian, Ying Wen, Zhichen Gong, Faiz Punakkath, Shihao Zou, Jun Wang
Modelling Bounded Rationality in Multi-Agent Interactions by Generalized Recursive Reasoning
Ying Wen, Yaodong Yang, Jun Wang
Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning
Ying Wen, Yaodong Yang, Rui Luo, Jun Wang, Wei Pan
2018
Factorized Q-learning for large-scale multi-agent systems
Yong Chen, M. Zhou, Ying Wen, Yaodong Yang, Yufeng Su, Weinan Zhang, Dell Zhang, Jun Wang, Han Liu
2017
A Study of AI Population Dynamics with Million-agent Reinforcement Learning
Yaodong Yang, Lantao Yu, Yiwei Bai, Ying Wen, Weinan Zhang, Jun Wang
Learning to Design Games: Strategic Environments in Deep Reinforcement Learning
Haifeng Zhang, Jun Wang, Zhiming Zhou, Weinan Zhang, Ying Wen, Yong Yu, Wenxin Li
Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games
Peng Peng, Quan Yuan, Ying Wen, Yaodong Yang, Zhenkun Tang, Haitao Long, Jun Wang
Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games
Peng Peng, Ying Wen, Yaodong Yang, Quan Yuan, Zhenkun Tang, Haitao Long, Jun Wang