Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
2024
35 citations
Journal Article
gold Open Access
Field-Weighted Citation Impact: 12.18
Li Lucy
Xinxi Lyu
Nathan Lambert
Ian Magnusson
Jacob Morrison
Niklas Muennighoff
Aakanksha Naik
Crystal Nam
Matthew E. Peters
Abhilasha Ravichander
Kyle Richardson
Zejiang Shen
Emma Strubell
Nishant Subramani
Oyvind Tafjord
Evan Walsh
Luke Zettlemoyer
Noah A. Smith
Hannaneh Hajishirzi
Iz Beltagy
Dirk Groeneveld
Jesse Dodge
Kyle Lo