Chenjia Bai
Conference
Principled Exploration via Optimistic Bootstrapping and Backward Induction.
In International Conference on Machine Learning (ICML), 2021. Spotlight.
We propose a principled exploration method for DRL through Optimistic Bootstrapping and Backward Induction (OB2I).
Chenjia Bai, Lingxiao Wang, Lei Han, Jianye Hao, Animesh Garg, Peng Liu, Zhaoran Wang
Dynamic Bottleneck for Robust Self-Supervised Exploration.
In Neural Information Processing Systems (NeurIPS), 2021.
We propose a Dynamic Bottleneck (DB) model, which attains a dynamics-relevant representation based on the information-bottleneck principle.
Chenjia Bai, Lingxiao Wang, Lei Han, Animesh Garg, Jianye Hao, Peng Liu, Zhaoran Wang
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning.
International Conference on Learning Representations (ICLR), 2022. Spotlight.
We propose Pessimistic Bootstrapping for offline RL (PBRL), a purely uncertainty-driven offline algorithm without explicit policy constraints.
Chenjia Bai, Lingxiao Wang, Zhuoran Yang, Zhihong Deng, Animesh Garg, Peng Liu, Zhaoran Wang
Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning.
In International Conference on Machine Learning (ICML), 2022. Spotlight.
We study how RL can be empowered by contrastive learning in a class of Markov decision processes (MDPs) and Markov games (MGs) with low-rank transitions. For both models, we propose to extract the correct feature representations of the low-rank model by minimizing a contrastive loss.
Shuang Qiu, Lingxiao Wang, Chenjia Bai, Zhuoran Yang, Zhaoran Wang
RORL: Robust Offline Reinforcement Learning via Conservative Smoothing.
In Neural Information Processing Systems (NeurIPS), 2022. Spotlight.
We propose Robust Offline Reinforcement Learning (RORL) with a novel conservative smoothing technique.
Rui Yang ✉, Chenjia Bai ✉, Xiaoteng Ma, Zhaoran Wang, Chongjie Zhang, Lei Han
False Correlation Reduction for Offline Reinforcement Learning.
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023.
We propose falSe COrrelation REduction (SCORE) for offline RL, a practically effective and theoretically provable algorithm.
Zhihong Deng, Zuyue Fu, Lingxiao Wang, Zhuoran Yang, Chenjia Bai, Tianyi Zhou, Jing Jiang
Behavior Contrastive Learning for Unsupervised Skill Discovery.
In International Conference on Machine Learning (ICML), 2023.
We propose a novel unsupervised skill discovery method through contrastive learning among behaviors, which makes the agent produce similar behaviors for the same skill and diverse behaviors for different skills.
Rushuai Yang, Chenjia Bai ✉, Hongyi Guo, Siyuan Li, Bin Zhao, Zhen Wang, Peng Liu, Xuelong Li
Cross-Domain Policy Adaptation via Value-Guided Data Filtering.
In Neural Information Processing Systems (NeurIPS), 2023.
We reveal the limitations of existing cross-domain adaptation methods and explore the problem from the value-difference perspective, building on a novel insight into value consistency across domains.
Kang Xu, Chenjia Bai ✉, Xiaoteng Ma, Dong Wang, Bin Zhao, Zhen Wang, Xuelong Li, Wei Li
Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning.
In Neural Information Processing Systems (NeurIPS), 2023.
We aim to investigate the effectiveness of a single diffusion model in modeling large-scale multi-task offline data, which can be challenging due to diverse and multimodal data distribution.
Haoran He, Chenjia Bai ✉, Kang Xu, Zhuoran Yang, Weinan Zhang, Dong Wang, Bin Zhao, Xuelong Li
OVD-Explorer: Optimism should not be the Sole Pursuit of Exploration in Noisy Environments.
In AAAI Conference on Artificial Intelligence (AAAI), 2024.
We propose Optimistic Value Distribution Explorer (OVD-Explorer) to achieve noise-aware optimistic exploration for continuous control.
Jinyi Liu, Zhi Wang, Yan Zheng, Jianye Hao, Chenjia Bai, Junjie Ye, Zhen Wang, et al.