Hanfei Yu (余涵非)

Email: hyu42 [at] stevens [dot] edu


Greetings! I am a fifth-year Ph.D. student in the Department of Electrical and Computer Engineering at Stevens Institute of Technology, advised by Dr. Hao Wang.

I received my Master’s Degree in Computer Science and Systems from the University of Washington Tacoma, advised by Dr. Wes J. Lloyd and Dr. Athirai A. Irissappane. I received my Bachelor’s Degree in Electronic Engineering from Shanghai Jiao Tong University.

I was a Research Intern at Microsoft Azure Research and Microsoft 365 Research.

I have received several academic awards, including the SoCC’24 Best Paper Award, and was an SC’24 Best Student Paper Finalist. I was selected as one of the 2025 MLCommons ML and Systems Rising Stars.

My research interests lie in AI Systems, LLM Serving, RL(HF) Systems, and Serverless Computing. I develop full-stack AI systems that integrate cloud and HPC infrastructures through algorithm–system co-design.

News

Mar 13, 2026 [Talk] Invited to give a talk at Computer Systems Seminar @ Rice University
Feb 27, 2026 [Talk] Invited to give a talk at Sky Computing Lab @ UC Berkeley
Jan 13, 2026 [Talk] Invited to give a talk at UTNS Lab @ UT Austin
Nov 12, 2025 [Service] Serve on the Program Committee and Artifact Evaluation Program Committee for MLSys’26
Nov 09, 2025 [Paper] “Accelerating ML Inference via Opportunistic Pre-Loading on Serverless Clusters” accepted by TPDS
Oct 21, 2025 [Talk] Invited to give a talk at MSRA Vancouver
Sep 26, 2025 [Paper] “Multi-Agent Reinforcement Learning with Serverless Computing” accepted by SoCC’25
Sep 03, 2025 [Service] Serve as a Reviewer for ICLR’26

Selected Publications

  1. EuroSys
    Taming Latency-Memory Trade-Off in MoE-Based LLM Serving via Fine-Grained Expert Offloading
    Hanfei Yu, Xingqi Cui, Hong Zhang, Hao Wang@Rutgers, and Hao Wang
    In ACM European Conference on Computer Systems, 2026
  2. SoCC
    Pre-Warming is Not Enough: Accelerating Serverless Inference With Opportunistic Pre-Loading
    Yifan Sui, Hanfei Yu, Yitao Hu, Jianxun Li, and Hao Wang
    In ACM Symposium on Cloud Computing, 2024
  3. SC
    Stellaris: Staleness-Aware Distributed Reinforcement Learning with Serverless Computing
    In ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis, 2024
  4. ASPLOS
    RainbowCake: Mitigating Cold-starts in Serverless with Layer-wise Container Caching and Sharing
    Hanfei Yu, Rohan Basu Roy, Christian Fontenot, Devesh Tiwari, Jian Li, Hong Zhang, Hao Wang, and Seung-Jong Park
    In ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, 2024