Hi, I'm Jiarui Yuan, an undergraduate at Tsinghua University, majoring in Computer Science.

I will soon begin my Ph.D. at THUNLP under the supervision of Prof. Zhiyuan Liu. Currently, I work closely with Dr. Weize Chen and Dr. Bingxiang He.

My primary research interests lie in Reinforcement Learning and Agent Systems. Please feel free to contact me if you’re interested in my research!

Research

The Overthinker’s DIET: Cutting Token Calories with Difficulty-Aware Training

Weize Chen*, Jiarui Yuan*, Tailin Jin, Ning Ding, Huimin Chen, Zhiyuan Liu, Maosong Sun

NeurIPS 2025 Poster [paper] [code]

Process Reinforcement through Implicit Rewards

Ganqu Cui, Lifan Yuan, Zefan Wang, Hanbin Wang, Wendi Li, Bingxiang He, Yuchen Fan, Tianyu Yu, Qixin Xu, Weize Chen, Jiarui Yuan, Huayu Chen, Kaiyan Zhang, Xingtai Lv, Shuo Wang, Yuan Yao, Xu Han, Hao Peng, Yu Cheng, Zhiyuan Liu, Maosong Sun, Bowen Zhou, Ning Ding

Preprint [paper] [code]

Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System

Weize Chen*, Jiarui Yuan*, Chen Qian, Cheng Yang, Zhiyuan Liu, Maosong Sun

ACL 2025 Findings [project page] [paper] [code]

Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication

Weize Chen*, Chenfei Yuan*, Jiarui Yuan*, Yusheng Su, Chen Qian, Cheng Yang, Ruobing Xie, Zhiyuan Liu, Maosong Sun

EMNLP 2024 Findings [paper] [code]

Projects

MiniCPM4: Ultra-Efficient LLMs on End Devices

TL;DR: MiniCPM4 is an efficient large language model optimized for on-device deployment. It achieves high performance with only 8T training tokens through innovations in four areas: (1) architecture: InfLLM v2 introduces sparse attention for faster long-context processing; (2) data: UltraClean and UltraChat v2 provide clean, effective pretraining and fine-tuning corpora, respectively; (3) training algorithms: ModelTunnel v2 enables efficient pre-training strategy search, while chunk-wise rollout for load-balanced RL and the ternary model BitCPM improve post-training; (4) inference systems: CPM.cu integrates quantization, sparse attention, and speculative decoding for acceleration.

[paper] [code]

Selected Awards

  • Overall Excellence Scholarship, Tsinghua University, 2024
  • Academic Excellence Scholarship, Tsinghua University, 2025

Miscellanea

  • Photography: Enjoy capturing street scenes and the night sky, and occasionally landscapes.
  • Sports: Volleyball, tennis.
  • Music: Play some saxophone; fond of R&B and soul music.
  • Others: Enjoy reading novels, especially mystery ones; also like movies and anime.