DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models Paper • 2309.03883 • Published Sep 7, 2023 • 33
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 10 days ago • 157
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Sep 25 • 682
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 602
Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning Paper • 2307.02053 • Published Jul 5, 2023 • 23
LongNet: Scaling Transformers to 1,000,000,000 Tokens Paper • 2307.02486 • Published Jul 5, 2023 • 80
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis Paper • 2307.01952 • Published Jul 4, 2023 • 82