Publications

2026

Unleashing Efficient Asynchronous RL Post-Training via Staleness-Constrained Rollout Coordination
Haoyang Li*, Sheng Lin*, Fangcheng Fu, Yuming Zhou, Xiaodong Ji, Yanfeng Zhao, Lefeng Wang, Jie Jiang, Bin Cui
Preprint.
[PDF] [Slides]

Elastor: Elastic and Efficient Model Partitioning and Checkpointing for Fault-tolerant Distributed DL Training
Xuanyu Wang, Fangcheng Fu, Haoyang Li, Hao Ge, Sheng Lin, Jiawen Niu, Bin Cui
PPoPP'26.
[PDF] [Slides]

Hydraulis: Balancing Large Transformer Model Training via Co-designing Parallel Strategies and Data Assignment
Haoyang Li*, Fangcheng Fu*, Sheng Lin, Hao Ge, Xuanyu Wang, Jiawen Niu, Jinbao Xue, Yangyu Tao, Di Wang, Jie Jiang, Bin Cui
SIGMOD'26.
[PDF] [Slides]

Hetu v2: A General and Scalable Deep Learning System with Hierarchical and Heterogeneous Single Program Multiple Data Annotations
Haoyang Li*, Fangcheng Fu*, Hao Ge, Sheng Lin, Xuanyu Wang, Jiawen Niu, Xupeng Miao, Bin Cui
Preprint.
[PDF] [Slides]

2025

LobRA: Multi-tenant Fine-tuning over Heterogeneous Data
Sheng Lin*, Fangcheng Fu*, Haoyang Li, Hao Ge, Xuanyu Wang, Jiawen Niu, Yaofeng Tu, Bin Cui
VLDB'25.
[PDF] [Slides]

Malleus: Straggler-Resilient Hybrid Parallel Training of Large-scale Models via Malleable Data and Model Parallelization
Haoyang Li*, Fangcheng Fu*, Hao Ge, Sheng Lin, Xuanyu Wang, Jiawen Niu, Yujie Wang, Hailin Zhang, Xiaonan Nie, Bin Cui
SIGMOD'25.
[PDF] [Slides]

2024

Enabling Parallelism Hot Switching for Efficient Training of Large Language Models
Hao Ge*, Fangcheng Fu*, Haoyang Li, Xuanyu Wang, Sheng Lin, Yujie Wang, Xiaonan Nie, Hailin Zhang, Xupeng Miao, Bin Cui
SOSP'24.
[PDF] [Slides]