Chenjia Bai
Article-Journal
Active Sampling for Deep Q-learning Based on TD-error Adaptive Correction.
Journal of Computer Research and Development (in Chinese), 2019
We propose an active sampling method based on TD-error adaptive correction to address the sample-efficiency problem in deep Q-learning.
Chenjia Bai, Peng Liu, Wei Zhao, Xianglong Tang
PDF
Cite
Obtaining Accurate Estimated Action Values in Categorical Distributional Reinforcement Learning.
Knowledge-Based Systems (KBS), 2020
We describe a method that uses adaptive bounds to obtain more accurate estimated action values in categorical distributional reinforcement learning (CDRL).
Yingnan Zhao, Peng Liu, Chenjia Bai, Wei Zhao, Xianglong Tang
PDF
Cite
Generating Attentive Goals for Prioritized Hindsight Reinforcement Learning.
Knowledge-Based Systems (KBS), 2020
We propose a novel prioritized hindsight model for multi-goal RL in which the agent is provided with more valuable goals, as measured by the expected temporal-difference (TD) error.
Peng Liu, Chenjia Bai, Yingnan Zhao, Chenyao Bai, Wei Zhao, Xianglong Tang
PDF
Cite
Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning.
IEEE Transactions on Neural Networks and Learning Systems, 2021
We propose a variational dynamic model based on conditional variational inference to model the multimodality and stochasticity of the environment dynamics.
Chenjia Bai, Peng Liu, Kaiyu Liu, Lingxiao Wang, Yingnan Zhao, Lei Han
PDF
Cite
Code
Project
Addressing Hindsight Bias in Multi-Goal Reinforcement Learning.
IEEE Transactions on Cybernetics, 2021
We analyze the hindsight bias introduced by the use of hindsight goals and propose bias-corrected HER (BHER), an efficient algorithm that corrects this bias during training.
Chenjia Bai, Lingxiao Wang, Yixin Wang, Rui Zhao, Chenyao Bai, Peng Liu
PDF
Cite
Code
Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain.
IEEE Transactions on Neural Networks and Learning Systems, 2022
We present a comprehensive survey of existing exploration methods for both single-agent and multiagent RL.
Jianye Hao, Tianpei Yang, Hongyao Tang, Chenjia Bai, Jinyi Liu, Zhaopeng Meng, Peng Liu, Zhen Wang
PDF
Cite
Monotonic Quantile Network for Worst-Case Offline Reinforcement Learning.
IEEE Transactions on Neural Networks and Learning Systems, 2022
We propose a monotonic quantile network (MQN) with conservative quantile regression (CQR) for risk-averse policy learning.
Chenjia Bai, Ting Xiao, Zhoufan Zhu, Lingxiao Wang, Fan Zhou, Peng Liu
PDF
Cite
Self-Supervised Imitation for Offline Reinforcement Learning with Hindsight Relabeling.
IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2022
We present an offline RL algorithm that combines hindsight relabeling and supervised regression to predict actions without oracle information.
Xudong Yu, Chenjia Bai, Changhong Wang, Dengxiu Yu, C. L. Philip Chen, Zhen Wang ✉
PDF
Cite
Diverse Randomized Value Functions: A Provably Pessimistic Approach for Offline Reinforcement Learning.
Information Sciences, 2023
We introduce a novel strategy employing diverse randomized value functions to estimate the posterior distribution of Q-values.
Xudong Yu, Chenjia Bai ✉, Hongyi Guo, Changhong Wang ✉, Zhen Wang
PDF
Cite
Pessimistic Value Iteration for Multi-Task Data Sharing in Offline Reinforcement Learning.
Artificial Intelligence (AIJ), 2023
We propose an uncertainty-based multi-task data sharing (MTDS) approach that shares the entire dataset without data selection.
Chenjia Bai, Lingxiao Wang, Jianye Hao, Zhuoran Yang, Bin Zhao, Zhen Wang ✉, Xuelong Li ✉
PDF
Cite
Code