zR committed

Commit • 2e5cbb2
Parent(s): 4c3068a

readme

- README.md +25 -8
- README_zh.md +10 -13
README.md CHANGED

@@ -1,12 +1,28 @@
+---
+license: other
+language:
+- en
+base_model:
+- meta-llama/Meta-Llama-3.1-8B-Instruct
+pipeline_tag: video-text-to-text
+inference: false
+---
+
+[中文阅读](README_zh.md)
+
 # CogVLM2-Llama3-Caption
 
 <div align="center">
 <img src=https://raw.githubusercontent.com/THUDM/CogVLM2/cf9cb3c60a871e0c8e5bde7feaf642e3021153e6/resources/logo.svg>
 </div>
 
-
+# Introduction
+
+Typically, most video data does not come with corresponding descriptive text, so it is necessary to convert the video
+data into textual descriptions to provide the essential training data for text-to-video models.
+
+## Usage
 
-## 使用方式
 ```python
 import io
 import numpy as np
@@ -119,12 +135,14 @@ if __name__ == '__main__':
 
 ```
 
-##
+## License
 
-
-[
+This model is released under the
+CogVLM2 [LICENSE](https://modelscope.cn/models/ZhipuAI/cogvlm2-video-llama3-base/file/view/master?fileName=LICENSE&status=0).
+For models built with Meta Llama 3, please also adhere to
+the [LLAMA3_LICENSE](https://modelscope.cn/models/ZhipuAI/cogvlm2-video-llama3-base/file/view/master?fileName=LLAMA3_LICENSE&status=0).
 
-##
+## Citation
 
 🌟 If you find our work helpful, please leave us a star and cite our paper.
 
@@ -134,5 +152,4 @@ if __name__ == '__main__':
 author={Yang, Zhuoyi and Teng, Jiayan and Zheng, Wendi and Ding, Ming and Huang, Shiyu and Xu, Jiazheng and Yang, Yuanming and Hong, Wenyi and Zhang, Xiaohan and Feng, Guanyu and others},
 journal={arXiv preprint arXiv:2408.06072},
 year={2024}
-}
-```
+}
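The README's full usage script is elided in the hunks above. As a rough orientation only, a minimal sketch of loading the captioning model with Hugging Face transformers might look like the following; the model ID, dtype, and device placement are illustrative assumptions, not the repository's exact code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model path; adjust to the actual repository ID (assumption).
MODEL_PATH = "THUDM/cogvlm2-llama3-caption"

# CogVLM2 ships custom modeling code, so trust_remote_code must be enabled.
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,  # assumption: a GPU with bf16 support
    trust_remote_code=True,
).eval().to("cuda")

# Video frame extraction and prompt construction follow the repository's
# full usage example and are omitted here.
```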
README_zh.md CHANGED

@@ -1,16 +1,14 @@
+[Read This in English](README_en.md)
+
 # CogVLM2-Llama3-Caption
 
 <div align="center">
 <img src=https://raw.githubusercontent.com/THUDM/CogVLM2/cf9cb3c60a871e0c8e5bde7feaf642e3021153e6/resources/logo.svg>
 </div>
 
-
-
-Typically, most video data does not come with corresponding descriptive text, so it is necessary to convert the video
-data into textual descriptions to provide the essential training data for text-to-video models.
-
-## Usage
+通常情况下，大部分视频数据并没有附带相应的描述性文本，因此有必要将视频数据转换成文本描述，以提供文本到视频模型所需的必要训练数据。
 
+## 使用方式
 ```python
 import io
 import numpy as np
@@ -123,14 +121,12 @@ if __name__ == '__main__':
 
 ```
 
-##
+## 模型协议
 
-
-
-For models built with Meta Llama 3, please also adhere to
-the [LLAMA3_LICENSE](https://modelscope.cn/models/ZhipuAI/cogvlm2-video-llama3-base/file/view/master?fileName=LLAMA3_LICENSE&status=0).
+此模型根据 CogVLM2 [LICENSE](https://modelscope.cn/models/ZhipuAI/cogvlm2-video-llama3-base/file/view/master?fileName=LICENSE&status=0) 发布。对于使用 Meta Llama 3 构建的模型，还请遵守
+[LLAMA3_LICENSE](https://modelscope.cn/models/ZhipuAI/cogvlm2-video-llama3-base/file/view/master?fileName=LLAMA3_LICENSE&status=0)。
 
-##
+## 引用
 
 🌟 If you find our work helpful, please leave us a star and cite our paper.
 
@@ -140,4 +136,5 @@ the [LLAMA3_LICENSE](https://modelscope.cn/models/ZhipuAI/cogvlm2-video-llama3-b
 author={Yang, Zhuoyi and Teng, Jiayan and Zheng, Wendi and Ding, Ming and Huang, Shiyu and Xu, Jiazheng and Yang, Yuanming and Hong, Wenyi and Zhang, Xiaohan and Feng, Guanyu and others},
 journal={arXiv preprint arXiv:2408.06072},
 year={2024}
-}
+}
+```