Lyte (Yassine Ennaour)

upvoted a paper 21 days ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published 22 days ago • 42

upvoted 2 papers 24 days ago

Movie Gen: A Cast of Media Foundation Models

Paper • 2410.13720 • Published 26 days ago • 86

MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures

Paper • 2410.13754 • Published 26 days ago • 74

upvoted 2 collections 28 days ago

Llama 3.2

Collection

This collection hosts the transformers and original repos of Llama 3.2 and Llama Guard 3 • 15 items • Updated 19 days ago • 462

Qwen2.5

Collection

Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Sep 18 • 326

upvoted a paper 29 days ago

Baichuan-Omni Technical Report

Paper • 2410.08565 • Published Oct 11 • 83

upvoted 7 papers about 1 month ago

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

Paper • 2410.06885 • Published Oct 9 • 40

Differential Transformer

Paper • 2410.05258 • Published Oct 7 • 165

OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction

Paper • 2410.04932 • Published Oct 7 • 9

Distilling an End-to-End Voice Assistant Without Instruction Training Data

Paper • 2410.02678 • Published Oct 3 • 22

Atlas-Chat: Adapting Large Language Models for Low-Resource Moroccan Arabic Dialect

Paper • 2409.17912 • Published Sep 26 • 20

Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1 • 143

DiaSynth -- Synthetic Dialogue Generation Framework

Paper • 2409.19020 • Published Sep 25 • 19

upvoted a collection about 1 month ago

Models That Speak Darija

Collection

A collection of models I tested and verified can speak Darija. Some are proprietary models, such as im-a-good-gpt2-chatbot • 3 items • Updated Sep 30 • 1

upvoted 2 papers about 1 month ago

MIO: A Foundation Model on Multimodal Tokens

Paper • 2409.17692 • Published Sep 26 • 49

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27 • 90

upvoted 4 papers about 2 months ago

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

Paper • 2409.18042 • Published Sep 26 • 36

MonoFormer: One Transformer for Both Diffusion and Autoregression

Paper • 2409.16280 • Published Sep 24 • 17

YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models

Paper • 2409.13592 • Published Sep 20 • 48

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 602

