My name is Chao Yu (于超). I received my Ph.D. from the Department of Electronic Engineering at Tsinghua University in 2023. I am currently an Assistant Professor (Distinguished Research Fellow) at the Embodied Decision Intelligence Lab (EDI Lab) at Tsinghua Shenzhen International Graduate School (SIGS). I also serve as the chairman of the Tsinghua Shenzhen International Graduate School - AgiBot Joint Research Center for Embodied Cognition and Decision Systems (JCES), and I am the co-founder of Striding AI (正行创新). I have been selected for the Youth Talent Support Program of the Chinese Institute of Electronics. My research has long focused on reinforcement learning–based decision intelligence. As first author or corresponding author, I have published more than 50 papers in top-tier international conferences and journals, including ICML, NeurIPS, ICLR, CVPR, ECCV, CoRL, IROS, ICRA, TMLR, and RA-L, with over 7,000 citations on Google Scholar. My representative works include the multi-agent reinforcement learning algorithm MAPPO, which has received more than 4,000 Google Scholar citations, and RLinf, a large-scale reinforcement learning training framework for embodied intelligence, which has accumulated over 4,000 GitHub stars.

Feel free to reach out if you’d like to discuss research or explore potential collaboration!

📃 Research Interests

RL Infra

My Technical Preference: Scalable reinforcement learning systems, training infrastructure, and system-algorithm co-design for large-scale policy optimization.
Representative works on RL Infra include RLinf, among others, covering efficient RL training, real-world online policy learning, and VLA+RL system design.

Strategic Agent

My Research Focus: multi-agent RL, strategic reasoning, self-play, cooperation/competition, and language agents.
Representative works include MAPPO, Fictitious Cross-Play, MARSHAL, WideSeek-R1, and work on the Werewolf game, among others.

Embodied Agent

My Application Interest: embodied intelligence with quadrupeds, drones, multi-robot systems, VLA models, and world-model-based robotic training.
Representative works include WoVR, πRL, World4RL, RoboScape-R, VolleyBots, FlightBench, and OmniDrones, among others.

🔥 News

2026.01: 🎉 2 papers (1 as first author, 1 as contributing author) have been accepted to The Fourteenth International Conference on Learning Representations (ICLR 2026). See you in Rio de Janeiro🇧🇷!

🏫 Education

2019 - 2023: Department of Electronic Engineering, Tsinghua University.
  Ph.D. in Electronic Science and Technology.
  Outstanding Doctoral Graduate (Top 5%), Outstanding Doctoral Thesis (Top 10%).
  Advisor: Prof. Yu Wang; Co-advisor: Assistant Prof. Yi Wu.
2016 - 2019: Department of Mechanical Engineering, Tsinghua University.
  M.S. in Mechanical Engineering and Automation.
  Outstanding Master’s Thesis (Top 10%).
  Advisor: Prof. Xin-Jun Liu.
2012 - 2016: School of Automation, Beijing Institute of Technology.
  B.S. in Automation.
  Outstanding Graduate (Top 15%).

📃 Publications

RA-L 2026

Human-Guided Online Reward Adaptation for Real-Robot Arm Manipulations

Tianxing Zhou, Haojia Ao, Haoyang Lu, Guangyan Cheen, Zichen Zhou, Te Cui, Chao Yu📧, Yufeng Yue📧

IEEE Robotics and Automation Letters (RA-L 2026)

Paper

RLC 2026

ICPL: Few-shot In-Context Preference Learning via LLMs

Chao Yu, Qixin Tan, Hong Lu, Jiaxuan Gao, Xinting Yang, Yu Wang, Yi Wu, Eugene Vinitsky📧

Reinforcement Learning Conference (RLC 2026)

Paper

OSDI 2026

DynaRL: Flexible and Dynamic Scheduling of Large-scale Reinforcement Learning Training

Yuanqing Wang, Hao Lin, Junhao Hu, Chunyang Zhu, Quanlu Zhang, Zhen Guo, Yuchen Zhang, Xu Fu, Si Xu, Bo Dai, Zixiao Huang, Chao Yu, Boxun Li, Guohao Dai, Zhi Yang, Yu Wang📧

20th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2026)

Paper

ACM MM 2026

Tex3D: Objects as Attack Surfaces via Adversarial 3D Textures for Vision-Language-Action Models

Jiawei Chen⭐️, Simin Huang⭐️, Jiawei Du, Shuaihang Chen, Yu Tian, Mingjie Wei, Chao Yu📧, Zhaoxia Yin📧

Proceedings of the ACM Multimedia Conference (ACM MM 2026)

Paper Project

arXiv 2026

WoVR: World Models as Reliable Simulators for Post-Training VLA Policies with RL

Zhennan Jiang, Shangqing Zhou, Yutong Jiang, Zefang Huang, Mingjie Wei, Yuhui Chen, Tianxing Zhou, Zhen Guo, Hao Lin, Quanlu Zhang, Yu Wang, Haoran Li, Chao Yu, Dongbin Zhao

arXiv preprint arXiv:2602.13977

Paper

arXiv 2026

Beyond Imitation: Reinforcement Learning-Based Sim-Real Co-Training for VLA Models

Liangzhi Shi, Sheng Chen, Feng Gao, Yuhui Chen, Kang Chen, Tonghe Zhang, Hongzhi Zang, Weinan Zhang, Chao Yu, Yu Wang

arXiv preprint arXiv:2602.12628

Paper

RSS 2026

USER: A Unified and Extensible System for Real-World Online Policy Learning in Embodied AI

Hongzhi Zang, Shu'ang Yu, Hao Lin, Tianxing Zhou, Zefang Huang, Zhen Guo, Xin Xu, Jiakai Zhou, Yuze Sheng, Shizhe Zhang, Feng Gao, Wenhao Tang, Yufeng Yue, Quanlu Zhang, Xinlei Chen, Chao Yu, Yu Wang

Robotics: Science and Systems (RSS 2026)

Paper

Preprint 2026

WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

Zelai Xu⭐️, Zhexuan Xu⭐️, Ruize Zhang⭐️, Chunyang Zhu, Shi Yu, Weilin Liu, Quanlu Zhang, Wenbo Ding, Chao Yu📧, Yu Wang📧

Preprint (2026)

Paper

CVPR 2026

RoboScape-R: Unified Reward-Observation World Models for Generalizable Robotics Training via RL

Yinzhou Tang⭐️, Yu Shang⭐️, Yinuo Chen⭐️, Bingwen Wei, Xin Zhang, Shu'ang Yu, Liangzhi Shi, Chao Yu, Chen Gao, Wei Wu, Yong Li📧

The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)

Paper

ACL 2025

Red Teaming Large Reasoning Models

Jiawei Chen⭐️, Yang Yang⭐️, Chao Yu⭐️, Yu Tian, Zhi Cao, Xue Yang, Linghao Li, Hang Su, Zhaoxia Yin📧

Annual Meeting of the Association for Computational Linguistics (ACL 2025)

Paper

Preprint 2025

πRL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models

Kang Chen⭐️, Zhihao Liu⭐️, Tonghe Zhang, Zhen Guo, Si Xu, Hao Lin, Hongzhi Zang, Xiang Li, Bingwen Wei, Jiakai Zhou, Quanlu Zhang, Zhaofei Yu, Guoliang Fan, Tiejun Huang, Yu Wang📧, Chao Yu📧

Preprint (2025)

Paper

IROS 2025

Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models

Yutao Ouyang, Jinhan Li, Yunfei Li, Zhongyu Li, Chao Yu, Koushil Sreenath, Yi Wu

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)

Paper

ICLR 2026

MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs

Huining Yuan⭐️, Zelai Xu⭐️, Zheyue Tan, Xiangmin Yi, Mo Guang, Kaiwen Long, Haojia Hui, Boxun Li, Xinlei Chen, Bo Zhao, Xiao-Ping Zhang📧, Chao Yu📧, Yu Wang📧

The Fourteenth International Conference on Learning Representations (ICLR 2026)

Paper

RSS 2026

RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training

Hongzhi Zang, Mingjie Wei, Si Xu, Yongji Wu, Zhen Guo, Yuanqing Wang, Hao Lin, Liangzhi Shi, Yuqing Xie, Zhexuan Xu, Zhihao Liu, Kang Chen, Wenhao Tang, Quanlu Zhang, Weinan Zhang, Chao Yu📧, Yu Wang📧

Robotics: Science and Systems (RSS 2026)

Paper

ICLR 2026

SAC Flow: Sample-Efficient Reinforcement Learning of Flow-Based Policies via Velocity-Reparameterized Sequential Modeling

Yixian Zhang, Shu'ang Yu, Tonghe Zhang, Mo Guang, Haojia Hui, Kaiwen Long, Yu Wang, Chao Yu📧, Wenbo Ding📧

The Fourteenth International Conference on Learning Representations (ICLR 2026)

Paper

ICLR 2026

RE-PO: Robust Enhanced Policy Optimization as a General Framework for LLM Alignment

Xiaoyang Cao⭐️, Zelai Xu⭐️, Mo Guang, Kaiwen Long, Michiel A. Bakker, Yu Wang, Chao Yu📧

The Fourteenth International Conference on Learning Representations (ICLR 2026)

Paper

ICRA 2026

JuggleRL: Mastering Ball Juggling with a Quadrotor via Deep Reinforcement Learning

Shilong Ji, Yinuo Chen, Chuqi Wang, Jiayu Chen, Ruize Zhang, Feng Gao, Wenhao Tang, Shu'ang Yu, Sirui Xiang, Xinlei Chen📧, Chao Yu📧, Yu Wang📧

International Conference on Robotics and Automation (ICRA 2026)

Paper

RA-L 2026

World4RL: Diffusion World Models for Policy Refinement with Reinforcement Learning for Robotic Manipulation

Zhennan Jiang, Kai Liu, Yuxin Qin, Shuai Tian, Yupeng Zheng, Mingcai Zhou, Chao Yu📧, Haoran Li📧, Dongbin Zhao📧

IEEE Robotics and Automation Letters (RA-L 2026)

Paper

OSDI 2026

RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation

Chao Yu, Yuanqing Wang, Zhen Guo, Hao Lin, Si Xu, Hongzhi Zang, Quanlu Zhang, Yongji Wu, Chunyang Zhu, Junhao Hu, Zixiao Huang, Mingjie Wei, Yuqing Xie, Ke Yang, Bo Dai, Zhexuan Xu, Jiakun Du, Xiangyuan Wang, Xu Fu, Letong Shi, Zhihao Liu, Kang Chen, Weilin Liu, Gang Liu, Boxun Li, Jianlei Yang, Zhi Yang, Guohao Dai, Yu Wang📧

20th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2026)

Paper Code

IROS 2026

D3P: Dynamic Denoising Diffusion Policy via Reinforcement Learning

Shu-Ang Yu, Feng Gao, Yi Wu, Chao Yu📧, Yu Wang📧

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2026)

Paper

EMNLP 2025

Spec-VLA: Speculative Decoding for Vision-Language-Action Models with Relaxed Acceptance

Songsheng Wang, Rucheng Yu, Zhihang Yuan, Chao Yu, Feng Gao, Yu Wang, Derek F. Wong

EMNLP 2025 Main Conference

Paper

RA-L 2025

Online Planning for Multi-UAV Pursuit-Evasion in Unknown Environments Using Deep Reinforcement Learning

Jiayu Chen, Chao Yu, Guosheng Li, Wenhao Tang, Shilong Ji, Xinyi Yang, Botian Xu, Huazhong Yang, Yu Wang

IEEE Robotics and Automation Letters (2025)

Paper

arXiv 2025

Exploring the Secondary Risks of Large Language Models

Jiawei Chen, Zhengwei Fang, Xiao Yang, Chao Yu, Zhaoxia Yin, Hang Su

arXiv preprint arXiv:2506.12382

Paper

ICLR 2026

VS-Bench: Evaluating VLMs for Strategic Reasoning and Decision-Making in Multi-Agent Environments

Zelai Xu⭐️, Zhexuan Xu⭐️, Xiangmin Yi, Huining Yuan, Xinlei Chen, Yongji Wu, Chao Yu📧, Yu Wang📧

The Fourteenth International Conference on Learning Representations (ICLR 2026) Oral

Paper

NeurIPS 2025

What Can RL Bring to VLA Generalization? An Empirical Study

Jijia Liu, Feng Gao, Bingwen Wei, Xinlei Chen, Qingmin Liao, Yi Wu, Chao Yu, Yu Wang

NeurIPS 2025

Paper

NeurIPS 2025

ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning

Tonghe Zhang, Chao Yu📧, Shenzhi Su, Yu Wang

NeurIPS 2025

Paper

CoRL 2025

Toward Real-World Cooperative and Competitive Soccer with Quadrupedal Robot Teams

Zhi Su, Yuman Gao, Emily Lukas, Yunfei Li, Jiaze Cai, Faris Tulbah, Fei Gao, Chao Yu, Zhongyu Li, Yi Wu, Koushil Sreenath

Conference on Robot Learning (CoRL 2025)

Paper

Preprint 2025

Fine-tuning Diffusion Policies with Backpropagation Through Diffusion Timesteps

Ningyuan Yang, Jiaxuan Gao, Feng Gao, Yi Wu📧, Chao Yu📧

Preprint (2025)

Paper

CoRL 2025

Mastering Multi-Drone Volleyball through Hierarchical Co-Self-Play Reinforcement Learning

Ruize Zhang, Sirui Xiang, Zelai Xu, Feng Gao, Shilong Ji, Wenhao Tang, Wenbo Ding, Chao Yu📧, Yu Wang📧

9th Conference on Robot Learning (CoRL 2025), Seoul, Korea

Paper Project

Preprint 2025

Hysteresis-Aware Neural Network Modeling and Whole-Body Reinforcement Learning Control of Soft Robots

Zongyuan Chen⭐️, Yan Xia⭐️, Jiayuan Liu, Jijia Liu, Wenhao Tang, Jiayu Chen, Feng Gao, Longfei Ma, Hongen Liao, Yu Wang, Chao Yu📧, Boyu Zhang📧, Fei Xing📧

Preprint (2025)

Paper

CASE 2026

AED: Automatic Discovery of Effective and Diverse Vulnerabilities for Autonomous Driving Policy with Large Language Models

Le Qiu⭐️, Zelai Xu⭐️, Qixin Tan⭐️, Wenhao Tang, Chao Yu📧, Yu Wang📧

IEEE 22nd International Conference on Automation Science and Engineering (CASE 2026)

Paper Code

Survey 2025

Multi-Robot System for Cooperative Exploration in Unknown Environments: A Survey

Chuqi Wang⭐️, Chao Yu⭐️📧, Xin Xu, Yinuo Chen, Yuman Gao, Xinyi Yang, Wenhao Tang, Shu'ang Yu, Feng Gao, Zhuozhu Jian, Xinlei Chen, Fei Gao, Boyu Zhou, Yu Wang📧

Survey Paper (2025)

Paper

ICLR 2026

Translate Policy to Language: Flow Matching Generated Rewards for LLM Explanations

Xinyi Yang⭐️, Liang Zeng, Heng Dong, Chao Yu, Xiaoran Wu, Huazhong Yang, Yu Wang, Milind Tambe, Tonghan Wang📧

The Fourteenth International Conference on Learning Representations (ICLR 2026)

Paper

ICML 2025

Learning Strategic Language Agents in the Werewolf Game with Iterative Latent Space Policy Optimization

Zelai Xu⭐️📧, Wanjun Gu, Chao Yu📧, Yi Wu📧, Yu Wang📧

Proceedings of the 42nd International Conference on Machine Learning (ICML 2025)

Paper

NeurIPS 2025 D&B

VolleyBots: A Testbed for Multi-Drone Volleyball Game Combining Motion Control and Strategic Play

Zelai Xu⭐️, Ruize Zhang⭐️, Chao Yu📧, Huining Yuan, Xiangmin Yi, Shilong Ji, Chuqi Wang, Wenhao Tang, Feng Gao, Wenbo Ding, Xinlei Chen, Yu Wang📧

The Thirty-ninth Conference on Neural Information Processing Systems (NeurIPS 2025), Track on Datasets and Benchmarks

Paper Code Project

ICML 2025

Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network

Jijia Liu, Feng Gao, Qingmin Liao, Chao Yu📧, Yu Wang📧

Proceedings of the 42nd International Conference on Machine Learning (ICML 2025)

Paper Project

JMLR 2025

Learning Global Nash Equilibrium in Team Competitive Games with Generalized Fictitious Cross-Play

Zelai Xu, Chao Yu📧, Yancheng Liang, Yi Wu, Yu Wang📧

Journal of Machine Learning Research, 26 (2025), 1–30

Paper

PACM IMWUT 2024

SleepNetZero: Zero-Burden Zero-Shot Reliable Sleep Staging with Neural Networks Based on Ballistocardiograms

Shuzhen Li, Yuxin Chen, Xuesong Chen, Rong Gao, Yina Zhang, Chao Yu, Yu Li, Ziyi Ye, Wei Huang, Hui Yi, Jiaxuan Gao, Wenbo Ding, Yu Wang

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (PACM IMWUT 2024)

Paper

RA-L 2025

Neural Internal Model Control: Learning a Robust Control Policy via Predictive Error Feedback

Feng Gao, Chao Yu📧, Yu Wang, Yi Wu📧

IEEE Robotics and Automation Letters (RA-L 2025)

Paper Code

IROS 2025

Multi-UAV Behavior-based Formation with Static and Dynamic Obstacles Avoidance via Reinforcement Learning

Yuqing Xie⭐️, Chao Yu⭐️📧, Hongzhi Zang⭐️, Feng Gao, Wenhao Tang, Jingyi Huang, Jiayu Chen, Botian Xu, Yi Wu, Yu Wang📧

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)

Paper Project

ICRA 2025

Human-Robot Cooperative Distribution Coupling for Hamiltonian-Constrained Social Navigation

Weizheng Wang, Chao Yu, Yu Wang, Byung-Cheol Min

IEEE International Conference on Robotics and Automation (ICRA 2025)

Paper Project

Preprint 2024

Reward-Robust RLHF in LLMs

Yuzi Yan, Xingzhou Lou, Jialian Li, Yiping Zhang, Jian Xie, Chao Yu, Yu Wang, Dong Yan📧, Yuan Shen📧

Preprint (2024)

Paper

Survey 2024

A Survey on Self-Play Methods in Reinforcement Learning

Ruize Zhang, Zelai Xu, Chengdong Ma, Chao Yu📧, Wei-Wei Tu, Wenhao Tang, Shiyu Huang📧, Deheng Ye, Wenbo Ding, Yaodong Yang, Yu Wang📧

Preprint (2024)

Paper

RA-L 2025

FlightBench: A Comprehensive Benchmark of Spatial Planning Methods for Quadrotors

Shu-Ang Yu⭐️, Chao Yu⭐️, Feng Gao⭐️, Yi Wu, Yu Wang

IEEE Robotics and Automation Letters (RA-L 2025)

Paper

Preprint 2024

CityLight: A Universal Model Towards Real-world City-scale Traffic Signal Control Coordination

Jinwei Zeng, Chao Yu, Xinyi Yang, Wenxuan Ao, Jian Yuan, Yong Li, Yu Wang, Huazhong Yang

Preprint (2024)

Paper

ICRA 2024

LAGOON: Language-Guided Motion Control

Shusheng Xu⭐️📧, Huaijie Wang⭐️, Yutao Ouyang, Jiaxuan Gao, Zhiyu Mei, Chao Yu📧, Yi Wu📧

IEEE International Conference on Robotics and Automation (ICRA 2024)

Paper

ICML 2024 Oral

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

Shusheng Xu⭐️📧, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weilin Liu, Zhiyu Mei, Guangju Wang, Chao Yu📧, Yi Wu📧

Proceedings of the 41st International Conference on Machine Learning (ICML 2024)

Paper Code

AAAI 2024

Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning

Jiayu Chen⭐️, Zelai Xu⭐️, Yunfei Li, Chao Yu, Jiaming Song, Huazhong Yang, Fei Fang, Yu Wang📧, Yi Wu📧

Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2024)

Paper Project

RA-L 2024

OmniDrones: An Efficient and Flexible Platform for Reinforcement Learning in Drone Control

Botian Xu⭐️, Feng Gao⭐️, Chao Yu📧, Ruize Zhang, Yi Wu, Yu Wang📧

IEEE Robotics and Automation Letters, 9(3): 2838–2844 (2024)

Paper Code

NeurIPS 2024

Sharing Minds during MARL Training for Enhanced Cooperative LLM Agents

Jiaxuan Gao, Yule Wen, Chao Yu📧, Yi Wu📧

The Thirty-eighth Conference on Neural Information Processing Systems (NeurIPS 2024)

Paper

ICLR 2025

Few-shot In-context Preference Learning using Large Language Models

Chao Yu, Hong Lu, Jiaxuan Gao, Qixin Tan, Xinting Yang, Yu Wang, Yi Wu, Eugene Vinitsky

The Thirteenth International Conference on Learning Representations (ICLR 2025)

Paper

AAMAS 2024

LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination

Jijia Liu⭐️, Chao Yu⭐️, Jiaxuan Gao⭐️, Yuqing Xie, Qingmin Liao, Yi Wu, Yu Wang

Proc. of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2024)

Paper

RA-L 2024

MASP: Scalable Graph-based Planning towards Multi-Agent Navigation

Xinyi Yang⭐️, Xinting Yang⭐️, Chao Yu📧, Jiayu Chen, Wenbo Ding, Huazhong Yang, Yu Wang📧

IEEE Robotics and Automation Letters (RA-L 2024)

Paper Project

Preprint 2023

Active Neural Topological Mapping for Multi-Agent Exploration

Xinyi Yang⭐️, Yuxiang Yang⭐️, Chao Yu📧, Jiayu Chen, Jingchen Yu, Haibing Ren, Huazhong Yang, Yu Wang📧

Preprint (2023)

Paper Project

ICML 2024

Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game

Zelai Xu⭐️📧, Chao Yu, Fei Fang, Yu Wang, Yi Wu📧

Proceedings of the 41st International Conference on Machine Learning (ICML 2024)

Paper

AAMAS 2023

Fictitious Cross-Play: Learning Global Nash Equilibrium in Mixed Cooperative-Competitive Games

Zelai Xu⭐️, Yancheng Liang, Chao Yu, Yu Wang, Yi Wu

Proceedings of the 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023)

Paper

IJCAI 2023

Automatic Truss Design with Reinforcement Learning

Weihua Du⭐️, Jinglun Zhao⭐️, Chao Yu, Xingcheng Yao, Zimeng Song, Siyang Wu, Ruifeng Luo, Zhiyuan Liu, Xianzhong Zhao, Yi Wu

IJCAI 2023

Paper

ICLR 2023

Learning Zero-Shot Cooperation with Humans, Assuming Humans Are Biased

Chao Yu⭐️, Jiaxuan Gao⭐️, Weilin Liu, Botian Xu, Hao Tang, Jiaqi Yang, Yu Wang📧, Yi Wu📧

The Eleventh International Conference on Learning Representations (ICLR 2023)

Paper

AAMAS 2023

Asynchronous Multi-Agent Reinforcement Learning for Efficient Real-Time Multi-Robot Cooperative Exploration

Chao Yu⭐️, Xinyi Yang⭐️, Jiaxuan Gao⭐️, Jiayu Chen, Yunfei Li, Jijia Liu, Yunfei Xiang, Ruixin Huang, Huazhong Yang, Yi Wu📧, Yu Wang📧

Proceedings of the 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023)

Paper

AAMAS 2023

Learning Graph-Enhanced Commander-Executor for Multi-Agent Navigation

Xinyi Yang⭐️, Shiyu Huang, Yiwen Sun, Yuxiang Yang, Chao Yu, Wei-Wei Tu, Huazhong Yang📧, Yu Wang📧

Proceedings of the 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023)

Paper

NeurIPS 2022 D&B

The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games

Chao Yu⭐️, Akash Velu⭐️, Eugene Vinitsky, Jiaxuan Gao, Yu Wang, Alexandre Bayen, Yi Wu

The Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022), Track on Datasets and Benchmarks

Paper Code

ROBIO 2022

A Benchmark of Planning-based Exploration Methods in Photo-Realistic 3D Simulator

Xuan Du⭐️, Xinyi Yang⭐️, Chao Yu⭐️, Jiaxuan Gao, Qingmin Liao, Huazhong Yang, Yu Wang📧

IEEE International Conference on Robotics and Biomimetics (ROBIO 2022)

Paper

ECCV 2022

Learning Efficient Multi-Agent Cooperative Visual Exploration

Chao Yu📧, Xinyi Yang⭐️, Jiaxuan Gao⭐️, Huazhong Yang, Yu Wang, Yi Wu📧

European Conference on Computer Vision (ECCV 2022)

Paper Project

ICIP 2022

SAVE: Spatial-Attention Visual Exploration

Xinyi Yang⭐️, Chao Yu⭐️, Jiaxuan Gao⭐️, Yu Wang, Huazhong Yang

IEEE International Conference on Image Processing (ICIP 2022)

Paper

CoG 2022

VMAPD: Generate Diverse Solutions for Multi-Agent Games with Recurrent Trajectory Discriminators

Shiyu Huang, Chao Yu, Bin Wang, Dong Li, Yu Wang, Ting Chen, Jun Zhu

IEEE Conference on Games (CoG 2022)

Paper

ICML 2022

Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning

Wei Fu📧, Chao Yu, Zelai Xu, Jiaqi Yang, Yi Wu📧

Proceedings of the 39th International Conference on Machine Learning (ICML 2022)

Paper Project

Preprint 2021

Multi-Agent Vulnerability Discovery for Autonomous Driving with Hazard Arbitration Reward

Weilin Liu⭐️, Ye Mu⭐️, Chao Yu, Xuefei Ning📧, Zhong Cao, Yi Wu, Shuang Liang, Huazhong Yang, Yu Wang📧

Preprint (2021)

Paper

CAAI 2021

Unlocking the Potential of MAPPO with Asynchronous Optimization

Wei Fu, Chao Yu, Yunfei Li, Yi Wu📧

CAAI International Conference on Artificial Intelligence (2021)

Paper

ICLR 2021

Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization

Zhenggang Tang⭐️, Chao Yu⭐️📧, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Du, Yu Wang, Yi Wu📧

The Ninth International Conference on Learning Representations (ICLR 2021)

Paper Project

DAC 2020

INCA: INterruptible CNN Accelerator for Multi-tasking in Embedded Robots

Jincheng Yu, Zhilin Xu, Shulin Zeng, Chao Yu, Jiantao Qiu, Chaoyang Shen, Yuanfan Xu, Guohao Dai, Yu Wang, Huazhong Yang

57th ACM/IEEE Design Automation Conference (DAC 2020)

Paper

IPDPSW 2020

CNN-based Monocular Decentralized SLAM on Embedded FPGA

Jincheng Yu, Feng Gao, Jianfei Cao, Chao Yu, Zhaoliang Zhang, Zhengfeng Huang, Yu Wang, Huazhong Yang

IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2020)

Paper

FCCM 2020

CNN-based Feature-point Extraction for Real-time Visual SLAM on Embedded FPGA

Zhilin Xu⭐️, Jincheng Yu⭐️, Chao Yu⭐️, Hao Shen, Yu Wang📧, Huazhong Yang📧

IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM 2020)

Paper

ICCRT 2019

Long-Sighted Imitation Learning for Partially Observable Control

Bo Xiong, Fangshi Wang, Chao Yu, Fei Qiao, Yi Yang, Qi Wei, Xinjun Liu

Proceedings of the 2019 2nd International Conference on Control and Robot Technology (ICCRT 2019)

Paper

ROBIO 2019

A DenseNet Feature-based Loop Closure Method for Visual SLAM System

Chao Yu, Zuxin Liu, Xin-Jun Liu, Fei Qiao, Yu Wang, Fugui Xie, Qi Wei, Yi Yang

2019 IEEE International Conference on Robotics and Biomimetics (ROBIO 2019)

Paper

Workshop 2019

Learning Safety-Aware Policy with Imitation Learning for Context-Adaptive Navigation

Bo Xiong, Fangshi Wang, Chao Yu, Fei Qiao, Yi Yang, Qi Wei, Xinjun Liu

Workshop Paper / Technical Report (2019)

Paper

IROS 2018

DS-SLAM: A Semantic Visual SLAM towards Dynamic Environments

Chao Yu, Zuxin Liu, Xin-Jun Liu📧, Fugui Xie, Yi Yang, Qi Wei, Qiao Fei📧

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018)

Paper

🏆 Awards

2024: China Postdoctoral Excellent Special Foundation (Top 1,000 nationwide), China Postdoctoral Science Foundation (CPSF).
2024: Postdoctoral Fellowship Program (Top 3,000 nationwide), China Postdoctoral Science Foundation (CPSF).
2024: Runner-up for Outstanding Doctoral Thesis (Top 5), Chinese Intelligent Agent and Multi-Agent Systems.
2023: Shuimu Scholar Program, Tsinghua University.
2023: Chuanxin Future Scholar Program, Department of Electronic Engineering, Tsinghua University.
2023: Zhang Keqian Postdoctoral Fellowship, Department of Electronic Engineering, Tsinghua University.
2023: Outstanding Doctoral Thesis (Top 10%), Tsinghua University.
2023: Outstanding Doctoral Graduate (Top 5%), Tsinghua University.
2019: Outstanding Master’s Thesis (Top 10%), Tsinghua University.
2019 - 2023: First-Class Scholarship (3 times), Tsinghua University.
2015: National Scholarship, China Ministry of Education.

🎤 Talks

Nanjing, China🇨🇳

RLinf: A Highly Flexible Reinforcement Learning Post-Training Framework for Embodied Intelligence

Zhihui Jinling (智汇金陵) AI Open-Source Talent Summit and ModelScope (魔搭) Developer Conference, 2026.03.22, 13:30 - 18:00

Link

Nanjing, China🇨🇳

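China Embodied Intelligence Conference (中国具身智能大会)

Chao Yu was invited to give two talks at the China Embodied Intelligence Conference.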

Link

👓 Projects

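Projects: [1] Research on Key Technologies for Decision-Making and Control in Multi-UAV Pursuit-Evasion Games Based on Deep Reinforcement Learning, National Natural Science Foundation of China (NSFC), Young Scientists Fund (Category C), 2025-2027.
Projects: [2] Research on Efficient Multi-Machine Collaborative Machine Learning Systems, NSFC Sino-German Cooperation and Exchange Fund, 2021-2025.
Projects: [3] Research on Key Technologies for Large Language Model Agents with Strong Reasoning Capabilities, China Postdoctoral Science Foundation Special Funding, 2023-2025.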

💼 Work Experience

2026.01 - Present: Assistant Professor, Tsinghua University.
2023.07 - 2025.12: Postdoctoral Researcher, Tsinghua University.

🙋 Recruitment

We are actively recruiting!

We are looking for Ph.D. students, Master's students, and postdocs at Tsinghua University, Ph.D. students in the Joint Program of Zhongguancun Academy and Tsinghua University, and undergraduate interns with a strong interest in and motivation for frontier research topics, including:

Reinforcement learning & embodied intelligence infrastructure

Embodied agent & embodied large model training

Strategic agent

Open source community

Candidates with hands-on systems-building skills and a mathematical background are highly encouraged to apply.

Application → RLinf related work →

🔗 Bond

EE, THU

Energy Efficient Computing Group

The Nanoscale Integrated Circuits and System Lab, Energy Efficient Computing Group (NICS-EFC) is led by Professor Yu Wang in the Department of Electronic Engineering, Tsinghua University. The group is committed to research on energy-efficient circuit and system design methodologies for Artificial Intelligence (AI) scenarios: multi-agent reinforcement learning algorithms, efficient and robust DL systems, domain-specific acceleration, and multi-agent systems. Each direction is headed by a research associate.

Link