Update readme
- README.md +18 -0
- comparison.png +0 -0
- icon.png +0 -0
README.md
ADDED
@@ -0,0 +1,18 @@
+---
+inference: false
+---
+
+# Model Card
+
+<p align="center">
+<img src="./icon.png" alt="Logo" width="350">
+</p>
+
+📖 Technical report (coming soon) | 🏠 [Code](https://github.com/BAAI-DCAI/Bunny) | 🐰 [Demo](http://bunny.dataoptim.org/)
+
+Bunny is a family of lightweight but powerful multimodal models. It offers multiple plug-and-play vision encoders, such as EVA-CLIP and SigLIP, and language backbones, including Phi-1.5, StableLM-2 and Phi-2. To compensate for the smaller model size, we construct more informative training data through curated selection from a broader set of sources. Remarkably, our Bunny-3B model, built on SigLIP and Phi-2, outperforms not only state-of-the-art MLLMs of a similar size but also larger frameworks (7B), and even achieves performance on par with 13B models.
+
+The model is pretrained on LAION-2M and finetuned on Bunny-695K.
+More details about this model can be found on [GitHub](https://github.com/BAAI-DCAI/Bunny).
+
+![comparison](comparison.png)
comparison.png
ADDED
icon.png
ADDED
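
For readers who want to try the model described in this card, below is a minimal loading sketch using the Hugging Face `transformers` API. It is an illustration under stated assumptions, not part of this commit: the repo id `BAAI/bunny-phi-2-siglip` is hypothetical (check the model page for the published name), and the sketch assumes the checkpoint ships Bunny's custom modeling code, hence `trust_remote_code=True`. Image preprocessing is model-specific and omitted here; see the [GitHub](https://github.com/BAAI-DCAI/Bunny) repo for the full pipeline.

```python
# Minimal sketch: loading a Bunny-style checkpoint with transformers.
# The repo id below is an ASSUMPTION for illustration; check the actual
# model page for the published name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BAAI/bunny-phi-2-siglip"  # hypothetical repo id

# Bunny ships custom modeling code, so remote code must be trusted.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision fits a 3B model on one GPU
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Text-only round trip; multimodal image input needs Bunny's own
# preprocessing, which is omitted in this sketch.
prompt = "Describe the image."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```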