Document Details
Clip:
Direct Preference Optimization: Your Language Model is Secretly a Reward Model. Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn
Filename:
2305.18290
Filetype:
application/pdf
Size:
1298077 bytes
Uploaded On:
2024-06-07
Abstract:
Summary:
Tags:
Notes:
Visible:
1
Status:
Parsed
Author:
CreationDate:
2023-12-14T01:34:45+00:00
Creator:
LaTeX with hyperref
Keywords:
ModDate:
2023-12-14T01:34:45+00:00
PTEX.Fullbanner:
This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023) kpathsea version 6.3.5
Producer:
pdfTeX-1.40.25
Subject:
Title:
Trapped:
False
Pages:
27