Chenjia Bai
Under-Review
On the Value of Myopic Behavior in Policy Reuse.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023 (under review)
We present a framework called Selective Myopic bEhavior Control (SMEC), which results from the insight that the short-term behaviors of prior policies are sharable across tasks.
Kang Xu, Chenjia Bai ✉, Shuang Qiu, Haoran He, Bin Zhao, Zhen Wang, Wei Li, Xuelong Li
Towards Robust Offline-to-Online Reinforcement Learning via Uncertainty and Smoothness.
Journal of Artificial Intelligence Research, 2023 (under review)
We propose the Robust Offline-to-Online (RO2O) algorithm, designed to enhance offline policies through uncertainty and smoothness, and to mitigate the performance drop in online adaptation.
Xiaoyu Wen, Xudong Yu, Rui Yang, Chenjia Bai ✉, Zhen Wang
Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning.
Artificial Intelligence (under review)
This work designs and analyzes a novel set of algorithms for multi-agent reinforcement learning (MARL) based on the principle of information-directed sampling (IDS).
Qiaosheng Zhang, Chenjia Bai, Shuyu Hu, Zhen Wang ✉, Xuelong Li ✉
Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration.
under review
We propose a novel framework for multi-agent collaboration that introduces Reinforced Advantage feedback (ReAd) for efficient self-refinement of plans.
Yang Zhang, Shixin Yang, Chenjia Bai ✉, Fei Wu, Xiu Li, Xuelong Li, Zhen Wang
Decentralized Transformers with Centralized Aggregation are Sample-Efficient Multi-Agent World Models.
under review
We propose a novel world model for MARL that learns decentralized local dynamics for scalability, combined with a centralized representation aggregation from all agents.
Yang Zhang, Chenjia Bai ✉, Bin Zhao, Junchi Yan, Xiu Li, Xuelong Li
Task-agnostic Pre-training and Task-guided Fine-tuning for Versatile Diffusion Planner.
under review
We develop a versatile diffusion planner that leverages large-scale inferior data containing task-agnostic, sub-optimal trajectories and adapts quickly to specific tasks.
Chenyou Fan, Chenjia Bai ✉, Zhao Shan, Haoran He, Yang Zhang, Zhen Wang
Preference Aligned Diffusion Planner for Quadrupedal Locomotion Control.
under review
We develop a learning framework that combines an offline diffusion planner with online preference alignment, using weak preference labeling, for legged locomotion control.
Xinyi Yuan, Zhiwei Shang, Zifan Wang, Chenkai Wang, Zhao Shan, Zhenchao Qi, Meixin Zhu ✉, Chenjia Bai ✉, Xuelong Li