Chenjia Bai
Conference
ODRL: A Benchmark for Off-Dynamics Reinforcement Learning.
In Neural Information Processing Systems (NeurIPS), Datasets and Benchmarks Track, 2024
We introduce ODRL, the first benchmark tailored for evaluating off-dynamics RL methods where one needs to transfer policies across different domains with dynamics mismatch.
Jiafei Lyu, Kang Xu, Jiacheng Xu, Mengbei Yan, Jing-Wen Yang, Zongzhang Zhang, Chenjia Bai ✉, Zongqing Lu ✉, Xiu Li ✉
Forward KL Regularized Preference Optimization for Aligning Diffusion Policies.
In AAAI Conference on Artificial Intelligence (AAAI), 2025
We propose Forward KL regularized Preference optimization for aligning Diffusion policies, a method that aligns a diffusion policy's output with human intent across various tasks.
Zhao Shan, Chenyou Fan, Shuang Qiu, Jiyuan Shi, Chenjia Bai ✉
Radiology Report Generation via Multi-objective Preference Optimization.
In AAAI Conference on Artificial Intelligence (AAAI), 2025
We propose a new radiology report generation method that aligns the pre-trained model with multiple human preferences via preference-guided multi-objective reinforcement learning.
Ting Xiao, Lei Shi, Peng Liu, Zhe Wang, Chenjia Bai ✉