Qiyuan Zhang

I am currently a third‑year Ph.D. student advised by Prof. Kede Ma and Prof. Chen Ma. Previously, I completed my B.Sc. and M.Sc. in Computer Science at the University of Electronic Science and Technology of China and spent time at Singapore Management University working with Jing Jiang. Soon, I will join Prof. Xue Liu’s group at MBZUAI as a visiting student.

My research interests lie in auto‑evaluation, reward modeling, preference modeling, and improved scaling strategies such as test‑time scaling for large language models. I am always excited about new collaborations—if you share these interests or see potential synergies, feel free to reach out via email!

I am currently interning with the Hunyuan-X team at Tencent, where I focus on advancing generative reward modeling. I am also seeking visiting or research‑intern opportunities to further explore frontier research topics.

In addition, I regularly post self-reflections on Medium—feel free to take a look if you’re interested!

Current Research Areas

LLM‑as‑a‑Judge / Generative Reward Models
Methods for Test‑Time Scaling
LLM Performance Prediction
Automatic Benchmark Construction

News

16 May 2025 One paper accepted at ACL 2025

31 Mar 2025 Survey released: A Survey on Test‑Time Scaling …

21 Apr 2025 One paper accepted at PAKDD 2025

11 Feb 2025 One paper accepted at ICLR 2025

10 Oct 2024 One paper accepted at EMNLP 2024

Selected Publications

The publications below represent my research style and interests.

What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models

Qiyuan Zhang, Fuyuan Lyu, Zexu Sun, Lei Wang, Weixu Zhang, Wenyue Hua, Haolun Wu, Zhihan Guo, Yufei Wang, Niklas Muennighoff, Irwin King, Xue Liu, Chen Ma. · Preprint

arXiv Page Code PPT

Crowd Comparative Reasoning: Unlocking Comprehensive Evaluations for LLM-as-a-Judge

Qiyuan Zhang, Yufei Wang, Yuxin Jiang, Liangyou Li, Chuhan Wu, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Fuyuan Lyu, Chen Ma. · ACL 2025

arXiv Code

RevisEval: Improving LLM-as-a-Judge via Response-Adapted References

Qiyuan Zhang, Yufei Wang, Tiezheng Yu, Yuxin Jiang, Chuhan Wu, Liangyou Li, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Fuyuan Lyu, Chen Ma. · ICLR 2025

arXiv Code

Collaborative Performance Prediction for Large Language Models

Qiyuan Zhang, Fuyuan Lyu, Xue Liu, Chen Ma. · EMNLP 2024

arXiv Code

NOAHQA: Numerical Reasoning with Interpretable Graph QA Dataset

Qiyuan Zhang, Lei Wang, Sicheng Yu, Shuohang Wang, Yang Wang, Jing Jiang, Ee-Peng Lim. · EMNLP 2021 Findings

arXiv Code

MWPToolkit: An Open-Source Framework for DL-Based Math Word Problem Solvers

Yihuai Lan, Lei Wang, Qiyuan Zhang, Yunshi Lan, Bing Tian Dai, Yan Wang, Dongxiang Zhang, Ee-Peng Lim. · AAAI 2021 Workshop

arXiv Code
