Publications

Robust Quadrupedal Locomotion via Risk-Averse Policy Learning.
In IEEE International Conference on Robotics and Automation (ICRA), 2024     Oral
We take a novel risk-sensitive perspective on enhancing the robustness of legged locomotion.
On the Value of Myopic Behavior in Policy Reuse.
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023 (under review)
We present a framework called Selective Myopic bEhavior Control (SMEC), built on the insight that the short-term behaviors of prior policies are sharable across tasks.
ODRL: A Benchmark for Off-Dynamics Reinforcement Learning.
In Neural Information Processing Systems (NeurIPS), Datasets and Benchmarks Track, 2024
We introduce ODRL, the first benchmark tailored for evaluating off-dynamics RL methods, where policies must be transferred across domains with mismatched dynamics.
False Correlation Reduction for Offline Reinforcement Learning.
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
We propose falSe COrrelation REduction (SCORE) for offline RL, a practically effective algorithm with theoretical guarantees.
RORL: Robust Offline Reinforcement Learning via Conservative Smoothing.
In Neural Information Processing Systems (NeurIPS), 2022     Spotlight
We propose Robust Offline Reinforcement Learning (RORL) with a novel conservative smoothing technique.
Monotonic Quantile Network for Worst-Case Offline Reinforcement Learning.
IEEE Transactions on Neural Networks and Learning Systems, 2022
We propose a monotonic quantile network (MQN) with conservative quantile regression (CQR) for risk-averse policy learning.
Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain.
IEEE Transactions on Neural Networks and Learning Systems, 2022
We conduct a comprehensive survey of existing exploration methods for both single-agent and multiagent RL.
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning.
In International Conference on Learning Representations (ICLR), 2022     Spotlight
We propose Pessimistic Bootstrapping for offline RL (PBRL), a purely uncertainty-driven offline algorithm without explicit policy constraints.
Dynamic Bottleneck for Robust Self-Supervised Exploration.
In Neural Information Processing Systems (NeurIPS), 2021
We propose a Dynamic Bottleneck (DB) model that learns a dynamics-relevant representation based on the information-bottleneck principle.
Principled Exploration via Optimistic Bootstrapping and Backward Induction.
In International Conference on Machine Learning (ICML), 2021     Spotlight
We propose a principled exploration method for DRL through Optimistic Bootstrapping and Backward Induction (OB2I).