Weiyun1025 committed
Commit f41e149
1 parent: 5e68c9e
Upload folder using huggingface_hub

README.md CHANGED
@@ -3,7 +3,7 @@ license: mit
 pipeline_tag: visual-question-answering
 ---
 
-#
+# InternVL2-4B
 
 [\[Blog\]](https://internvl.github.io/blog/) [\[InternVL 1.0 Paper\]](https://arxiv.org/abs/2312.14238) [\[InternVL 1.5 Report\]](https://arxiv.org/abs/2404.16821) [\[Chat Demo\]](https://internvl.opengvlab.com/)
 
@@ -17,6 +17,10 @@ Compared to the state-of-the-art open-source multimodal large language models, I
 
 InternVL 2.0 is trained with an 8k context window and utilizes training data consisting of long texts, multiple images, and videos, significantly improving its ability to handle these types of inputs compared to InternVL 1.5. For more details, please refer to our blog and GitHub.
 
+## Model Details
+
+InternVL2 is a multimodal large language model series, featuring models of various sizes. For each size, we release instruction-tuned models optimized for multimodal tasks. InternVL2-4B consists of [InternViT-300M-448px](https://huggingface.co/OpenGVLab/InternViT-300M-448px), an MLP projector, and [Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct).
+
 ## Performance
 
 | Benchmark | PaliGemma-3B | Phi-3-Vision | Mini-InternVL-4B-1.5 | InternVL2-4B |
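
Note on the added Model Details paragraph: it names three components, a vision encoder (InternViT-300M-448px), an MLP projector, and a language model (Phi-3-mini-128k-instruct). The toy sketch below only illustrates how a projector of that kind can bridge the two feature spaces. It is not the InternVL2 implementation: the dimensions, the two-layer GELU design, the random weights, and the choice to place image tokens before text tokens are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random

VISION_DIM = 8   # toy stand-in for the vision encoder's feature width (assumption)
LLM_DIM = 12     # toy stand-in for the language model's embedding width (assumption)


def make_linear(in_dim, out_dim, rng):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def apply_linear(vec, layer):
    """Compute W @ vec + b for a single token vector."""
    weights, bias = layer
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]


def gelu(x):
    """Exact GELU activation."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def project(vision_tokens, fc1, fc2):
    """Hypothetical two-layer MLP projector: vision features -> LLM embedding space."""
    projected = []
    for token in vision_tokens:
        hidden = [gelu(h) for h in apply_linear(token, fc1)]
        projected.append(apply_linear(hidden, fc2))
    return projected


def build_llm_input(vision_tokens, text_embeddings, fc1, fc2):
    """Project image tokens and place them before the text embeddings (simplification)."""
    return project(vision_tokens, fc1, fc2) + text_embeddings


rng = random.Random(0)
fc1 = make_linear(VISION_DIM, LLM_DIM, rng)
fc2 = make_linear(LLM_DIM, LLM_DIM, rng)

# Stand-ins for the vision encoder output (4 image tokens) and embedded text (3 tokens).
vision_tokens = [[rng.random() for _ in range(VISION_DIM)] for _ in range(4)]
text_embeddings = [[rng.random() for _ in range(LLM_DIM)] for _ in range(3)]

sequence = build_llm_input(vision_tokens, text_embeddings, fc1, fc2)
print(len(sequence), len(sequence[0]))  # 7 12
```

The point of the sketch is the shape change: image tokens enter at the vision width and leave at the language model's width, so both kinds of token can sit in one input sequence.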