Document Details


Clip: Direct Preference Optimization: Your Language Model is Secretly a Reward Model. Rafael Rafailov∗†, Archit Sharma∗†, Eric Mitchell∗†, Stefano Ermon†‡, Christopher D. Manning†, Chelsea Finn†
Filename: 2305.18290
Filetype: application/pdf
Size: 1298077 bytes
Uploaded On: 2024-06-07
Abstract:
Summary:
Tags:
Notes:
Visible: 1
Status: Parsed
Author:
CreationDate: 2023-12-14T01:34:45+00:00
Creator: LaTeX with hyperref
Keywords:
ModDate: 2023-12-14T01:34:45+00:00
PTEX.Fullbanner: This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023) kpathsea version 6.3.5
Producer: pdfTeX-1.40.25
Subject:
Title:
Trapped: False
Pages: 27
