Document Details


2509.15207v1.pdf
Clip: FlowRL: Matching Reward Distributions for LLM Reasoning. Xuekai Zhu, Daixuan Cheng, Dinghuai Zhang, Hengli Li, Kaiyan Zhang, Che Jiang
Filename: 2509.15207v1.pdf
Filetype: application/pdf
Size: 926461 bytes
Uploaded On: 2025-10-24
Abstract:
Summary:
Tags:
Notes:
Visible: 1
Status: Parsed
Author: Xuekai Zhu; Daixuan Cheng; Dinghuai Zhang; Hengli Li; Kaiyan Zhang; Che Jiang; Youbang Sun; Ermo Hua; Yuxin Zuo; Xingtai Lv; Qizheng Zhang; Lin Chen; Fanghao Shao; Bo Xue; Yunchong Song; Zhenjie Yang; Ganqu Cui; Ning Ding; Jianfeng Gao; Xiaodong Liu; Bowen Zhou; Hongyuan Mei; Zhouhan Lin
Creator: arXiv GenPDF (tex2pdf:)
DOI: https://doi.org/10.48550/arXiv.2509.15207
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
PTEX.Fullbanner: This is pdfTeX, Version 3.141592653-2.6-1.40.28 (TeX Live 2025) kpathsea version 6.4.1
Producer: pikepdf 8.15.1
Title: FlowRL: Matching Reward Distributions for LLM Reasoning
Trapped: False
ArXivID: https://arxiv.org/abs/2509.15207v1
Pages: 21
