Document Details
Clip:
FlowRL: Matching Reward Distributions for LLM Reasoning. Xuekai Zhu, Daixuan Cheng, Dinghuai Zhang, Hengli Li, Kaiyan Zhang, Che Jiang
Filename:
2509.15207v1.pdf
Filetype:
application/pdf
Size:
926461 bytes
Uploaded On:
2025-10-24
Abstract:
Summary:
Tags:
Notes:
Visible:
1
Status:
Parsed
Author:
Xuekai Zhu; Daixuan Cheng; Dinghuai Zhang; Hengli Li; Kaiyan Zhang; Che Jiang; Youbang Sun; Ermo Hua; Yuxin Zuo; Xingtai Lv; Qizheng Zhang; Lin Chen; Fanghao Shao; Bo Xue; Yunchong Song; Zhenjie Yang; Ganqu Cui; Ning Ding; Jianfeng Gao; Xiaodong Liu; Bowen Zhou; Hongyuan Mei; Zhouhan Lin
Creator:
arXiv GenPDF (tex2pdf:)
DOI:
https://doi.org/10.48550/arXiv.2509.15207
License:
http://arxiv.org/licenses/nonexclusive-distrib/1.0/
PTEX.Fullbanner:
This is pdfTeX, Version 3.141592653-2.6-1.40.28 (TeX Live 2025) kpathsea version 6.4.1
Producer:
pikepdf 8.15.1
Title:
FlowRL: Matching Reward Distributions for LLM Reasoning
Trapped:
False
ArXivID:
https://arxiv.org/abs/2509.15207v1
Pages:
21