Upload folder using huggingface_hub
README.md CHANGED
```diff
@@ -1,6 +1,20 @@
 ---
 license: llama3
 pipeline_tag: image-text-to-text
+library_name: transformers
+base_model:
+- OpenGVLab/InternViT-6B-448px-V1-5
+- NousResearch/Hermes-2-Theta-Llama-3-70B
+base_model_relation: finetune
+language:
+- multilingual
+tags:
+- internvl
+- vision
+- ocr
+- multi-image
+- video
+- custom_code
 ---
 
 # InternVL2-Llama3-76B
```
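The practical effect of this front-matter hunk is worth spelling out: `library_name: transformers` tells the Hub which library loads the model, and the `custom_code` tag signals that the repo ships its own modeling code, so loading requires `trust_remote_code=True`. Below is a minimal loading sketch under those assumptions; the repo id `OpenGVLab/InternVL2-Llama3-76B` is inferred from the card title and organization, not stated in this diff.

```python
# Minimal sketch: load the model the way the new metadata implies.
# Assumption: repo id inferred from the card title. A 76B model in bf16
# still needs multiple GPUs, hence device_map="auto" for sharding.
import torch
from transformers import AutoModel, AutoTokenizer

path = "OpenGVLab/InternVL2-Llama3-76B"  # assumed repo id
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,   # half precision to fit the 76B weights
    low_cpu_mem_usage=True,
    trust_remote_code=True,       # required because of the `custom_code` tag
    device_map="auto",            # shard across available GPUs
).eval()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
```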
```diff
@@ -64,11 +78,11 @@ InternVL 2.0 is a multimodal large language model series, featuring models of va
 
 For more details and evaluation reproduction, please refer to our [Evaluation Guide](https://internvl.readthedocs.io/en/latest/internvl2.0/evaluation.html).
 
-We simultaneously use InternVL and VLMEvalKit repositories for model evaluation. Specifically, the results reported for DocVQA, ChartQA, InfoVQA, TextVQA, MME, AI2D, MMBench, CCBench, MMVet, and SEED-Image were tested using the InternVL repository. OCRBench, RealWorldQA, HallBench, and MathVista were evaluated using the VLMEvalKit.
+We simultaneously use [InternVL](https://github.com/OpenGVLab/InternVL) and [VLMEvalKit](https://github.com/open-compass/VLMEvalKit) repositories for model evaluation. Specifically, the results reported for DocVQA, ChartQA, InfoVQA, TextVQA, MME, AI2D, MMBench, CCBench, MMVet, and SEED-Image were tested using the InternVL repository. OCRBench, RealWorldQA, HallBench, and MathVista were evaluated using the VLMEvalKit.
 
 For MMMU, we report both the original scores (left side: evaluated using the InternVL codebase for InternVL series models, and sourced from technical reports or webpages for other models) and the VLMEvalKit scores (right side: collected from the OpenCompass leaderboard).
 
-Please note that evaluating the same model using different testing toolkits like InternVL and VLMEvalKit can result in slight differences, which is normal. Updates to code versions and variations in environment and hardware can also cause minor discrepancies in results.
+Please note that evaluating the same model using different testing toolkits like [InternVL](https://github.com/OpenGVLab/InternVL) and [VLMEvalKit](https://github.com/open-compass/VLMEvalKit) can result in slight differences, which is normal. Updates to code versions and variations in environment and hardware can also cause minor discrepancies in results.
 
 ### Video Benchmarks
 
```
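For readers who want to reproduce the VLMEvalKit numbers referenced in this hunk, benchmark runs go through that repo's `run.py` entry point (for example, `python run.py --data MMBench_DEV_EN --model <model-name>`). A minimal Python-level sketch of the same toolkit follows; the registry key used below is an assumption, so check `vlmeval.config.supported_VLM` in your installed version for the exact name.

```python
# Minimal sketch of a single forward pass through VLMEvalKit's Python API.
# Assumption: the registry key 'InternVL2-Llama3-76B' is hypothetical; list
# supported_VLM.keys() to find the exact name in your VLMEvalKit version.
from vlmeval.config import supported_VLM

model = supported_VLM['InternVL2-Llama3-76B']()  # assumed registry key
# generate() takes a list of [image path, text prompt]; path is illustrative.
response = model.generate(['assets/example.jpg', 'Describe this image.'])
print(response)
```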