Under-Review

Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning.

In Artificial Intelligence (under review)

This work designs and analyzes a novel set of algorithms for multi-agent reinforcement learning (MARL) based on the principle of information-directed sampling (IDS).

Qiaosheng Zhang , Chenjia Bai , Shuyu Hu , Zhen Wang^✉ , Xuelong Li^✉

VLP: Vision-Language Preference Learning for Embodied Manipulation.

under review

We propose a novel Vision-Language Preference learning framework that learns a vision-language preference model to provide preference feedback for embodied manipulation tasks.

Runze Liu , Chenjia Bai^✉ , Jiafei Lyu , Shengjie Sun , Yali Du , Xiu Li^✉

Information-Theoretic Reward Decomposition for Generalizable RLHF.

under review

We decompose the reward value in RLHF into two independent components, a prompt-free reward and a prompt-related reward, and propose a new reward learning algorithm that prioritizes data samples based on their prompt-free reward values.

Liyuan Mao , Haoran Xu , Amy Zhang , Weinan Zhang^✉ , Chenjia Bai^✉

Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning

under review

We propose Adversarial Locomotion and Motion Imitation (ALMI), a novel framework for humanoid loco-manipulation tasks that enables adversarial policy learning between the upper and lower body.

Jiyuan Shi , Xinzhe Liu , Dewei Wang , Ouyang Lu , Sören Schwertfeger , Fuchun Sun , Chenjia Bai^✉ , Xuelong Li^✉

Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective

under review

We propose the Diffusion-Inspired Multi-Agent world model (DIMA), a novel framework for multi-agent reinforcement learning that leverages diffusion models to reduce modeling complexity and improve sample efficiency.

Yang Zhang , Xinran Li , Jianing Ye , Delin Qu , Shuang Qiu , Chongjie Zhang , Xiu Li , Chenjia Bai^✉

Learn as Individuals, Evolve as a Team: Multi-agent LLMs Adaptation in Embodied Environments

under review

We propose the Learn as Individuals, Evolve as a Team (LIET) framework to enable multi-agent LLMs to adapt to embodied environments through individual learning and team evolution.

Xinran Li , Chenjia Bai^✉ , Zijian Li , Jiakun Zheng , Ting Xiao , Jun Zhang^✉

Towards a Generalizable Bimanual Foundation Policy via Flow-based Video Prediction

under review

We propose a novel bimanual foundation policy that leverages text-to-video models to predict robot trajectories and uses optical flow as an intermediate variable to improve generalization.

Chenyou Fan , Fangzheng Yan , Chenjia Bai^✉ , Jiepeng Wang , Chi Zhang , Zhen Wang , Xuelong Li^✉

MoRE: Mixture of Residual Experts for Humanoid Lifelike Gaits Learning on Complex Terrains

under review

We propose a novel framework that enables humanoid robots to traverse complex terrains with controllable human-like gaits using a mixture of latent residual experts and multiple discriminators.

Dewei Wang , Xinmiao Wang , Xinzhe Liu , Jiyuan Shi , Yingnan Zhao , Chenjia Bai^✉ , Xuelong Li^✉

KungfuBot: Physics-Based Humanoid Whole-Body Control for Learning Highly-Dynamic Skills

under review

We propose a physics-based humanoid control framework that aims to master highly dynamic human behaviors such as Kungfu and dancing through multi-step motion processing and adaptive motion tracking.

Weiji Xie(+) , Jinrui Han(+) , Jiakun Zheng(+) , Huanyu Li , Xinzhe Liu , Jiyuan Shi , Weinan Zhang , Chenjia Bai^✉ , Xuelong Li^✉
