Document Details


2203.15556.pdf
Clip: Training Compute-Optimal Large Language Models. Jordan Hoffmann★, Sebastian Borgeaud★, Arthur Mensch★, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, Diego de Las Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, Tom Hennigan, Eric Noland, Katie Millican, George van den Driessche, Bogdan Damoc, Aurelia Guy, Simon Osindero, Karen Simonyan, Erich Elsen, Jack W. Rae, Oriol Vinyals and Laurent Sifre★. (★ Equal contributions.) We investigate the optimal model size and number of tokens for training a transformer language model
Filename: 2203.15556.pdf
Filetype: application/pdf
Size: 6004349 bytes
Uploaded On: 2024-01-27
Abstract:
Summary:
Tags:
Notes:
Visible: 1
Status: Parsed
Author:
CreationDate: 2023-01-08T05:03:37+00:00
Creator: LaTeX with hyperref
Keywords:
ModDate: 2023-01-08T05:03:37+00:00
PTEX.Fullbanner: This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2
Producer: pdfTeX-1.40.21
Subject:
Title:
Trapped: False
Pages: 36
