Collection for: "Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale"
- Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale
  Paper • 2409.17115 • Published • 59
- gair-prox/FineWeb-pro
  Viewer • Updated • 63.1M • 2.34k • 17
- gair-prox/open-web-math-pro
  Viewer • Updated • 2.58M • 1.44k • 9
- gair-prox/RedPajama-pro
  Viewer • Updated • 10.2M • 1.16k • 4