Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages Paper • 2410.16153 • Published 22 days ago • 42
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures Paper • 2410.13754 • Published 26 days ago • 74
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated 19 days ago • 462
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Sep 18 • 326
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching Paper • 2410.06885 • Published Oct 9 • 40
OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction Paper • 2410.04932 • Published Oct 7 • 9
Distilling an End-to-End Voice Assistant Without Instruction Training Data Paper • 2410.02678 • Published Oct 3 • 22
Atlas-Chat: Adapting Large Language Models for Low-Resource Moroccan Arabic Dialect Paper • 2409.17912 • Published Sep 26 • 20
Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1 • 143
Models That Speak Darija Collection A collection of models I tested and verified can speak Darija. Some are proprietary models, such as im-a-good-gpt2-chatbot • 3 items • Updated Sep 30 • 1
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions Paper • 2409.18042 • Published Sep 26 • 36
MonoFormer: One Transformer for Both Diffusion and Autoregression Paper • 2409.16280 • Published Sep 24 • 17
YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models Paper • 2409.13592 • Published Sep 20 • 48
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 602