sugatoray's Collections
AV LLMs
Updated 2 days ago
A collection of Audio, Video and Visual LLMs.
Upvote 2
myshell-ai/OpenVoice • Text-to-Speech • Updated Apr 24 • 384
🤗 OpenVoice • Running • 927
dataautogpt3/ProteusV0.3 • Text-to-Image • Updated Feb 12 • 43.4k • 89
ByteDance/SDXL-Lightning • Text-to-Image • Updated Apr 3 • 96.6k • 1.88k
openai/whisper-large-v3 • Automatic Speech Recognition • Updated Aug 12 • 5.69M • 3.41k
stabilityai/TripoSR • Image-to-3D • Updated Aug 9 • 26.9k • 437
Efficient-Large-Model/VILA-7b • Text Generation • Updated Mar 4 • 1.05k • 25
google/paligemma-3b-pt-896 • Image-Text-to-Text • Updated Jul 19 • 66.1k • 106
microsoft/Phi-3-vision-128k-instruct • Text Generation • Updated about 1 month ago • 167k • 890
stabilityai/stable-audio-open-1.0 • Text-to-Audio • Updated Jul 31 • 23.4k • 859
OpenVLA: An Open-Source Vision-Language-Action Model • Paper • 2406.09246 • Published Jun 13 • 36
aiola/whisper-medusa-v1 • Updated Aug 3 • 610 • 172
merve/idefics3llama-vqav2 • Updated 9 days ago • 9
black-forest-labs/FLUX.1-schnell • Text-to-Image • Updated Aug 16 • 1.46M • 2.3k
😻 Llama3.1 S V0.2 Checkpoint 2024 08 20 • Running on Zero • 97
gpt-omni/mini-omni • Text-to-Speech • Updated 16 days ago • 4 • 343
fishaudio/fish-speech-1.4 • Text-to-Speech • Updated about 15 hours ago • 4.37k • 317
📲🫴🏻👁 Tonic's GOT OCR • Running on Zero • 92 • GOT - OCR (from : UCAS, Beijing)
stepfun-ai/GOT-OCR2_0 • Image-Text-to-Text • Updated 2 days ago • 119k • 371
apple/coreml-sam2-large • Mask Generation • Updated 6 days ago • 52 • 8
coreml-projects/sam-2-studio • Updated 6 days ago • 11
mistralai/Pixtral-12B-2409 • Updated 2 days ago • 9 • 252