Weiyun1025 committed on
Commit
f41e149
•
1 Parent(s): 5e68c9e

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +5 -1
README.md CHANGED
@@ -3,7 +3,7 @@ license: mit
3
  pipeline_tag: visual-question-answering
4
  ---
5
 
6
- # Model Card for InternVL2-4B
7
 
8
  [\[🆕 Blog\]](https://internvl.github.io/blog/) [\[📜 InternVL 1.0 Paper\]](https://arxiv.org/abs/2312.14238) [\[📜 InternVL 1.5 Report\]](https://arxiv.org/abs/2404.16821) [\[🗨️ Chat Demo\]](https://internvl.opengvlab.com/)
9
 
@@ -17,6 +17,10 @@ Compared to the state-of-the-art open-source multimodal large language models, I
17
 
18
  InternVL 2.0 is trained with an 8k context window and utilizes training data consisting of long texts, multiple images, and videos, significantly improving its ability to handle these types of inputs compared to InternVL 1.5. For more details, please refer to our blog and GitHub.
19
 
 
 
 
 
20
  ## Performance
21
 
22
  | Benchmark | PaliGemma-3B | Phi-3-Vision | Mini-InternVL-4B-1.5 | InternVL2-4B |
 
3
  pipeline_tag: visual-question-answering
4
  ---
5
 
6
+ # InternVL2-4B
7
 
8
  [\[🆕 Blog\]](https://internvl.github.io/blog/) [\[📜 InternVL 1.0 Paper\]](https://arxiv.org/abs/2312.14238) [\[📜 InternVL 1.5 Report\]](https://arxiv.org/abs/2404.16821) [\[🗨️ Chat Demo\]](https://internvl.opengvlab.com/)
9
 
 
17
 
18
  InternVL 2.0 is trained with an 8k context window and utilizes training data consisting of long texts, multiple images, and videos, significantly improving its ability to handle these types of inputs compared to InternVL 1.5. For more details, please refer to our blog and GitHub.
19
 
20
+ ## Model Details
21
+
22
+ InternVL2 is a multimodal large language model series, featuring models of various sizes. For each size, we release instruction-tuned models optimized for multimodal tasks. InternVL2-4B consists of [InternViT-300M-448px](https://huggingface.co/OpenGVLab/InternViT-300M-448px), an MLP projector, and [Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct).
23
+
24
  ## Performance
25
 
26
  | Benchmark | PaliGemma-3B | Phi-3-Vision | Mini-InternVL-4B-1.5 | InternVL2-4B |