sberbank-ai committed fade180 (parent c1277dc): Update README.md
# RUDOLPH-2.7B (XL)

RUDOLPH: One Hyper-Tasking Transformer can be as creative as DALL-E and as smart as CLIP

<img src="https://raw.githubusercontent.com/sberbank-ai/ru-dolph/master/pics/RUDOLPH.png" height="60" border="2"/>

Model was trained by [Sber AI](https://github.com/ai-forever) and [AIRI](https://airi.net) teams.

* Task: `text2image generation`; `self reranking`; `text ranking`; `image ranking`; `image2text generation`; `zero-shot image classification`; `text2text generation`; `text-qa`; `math-qa`; `image captioning`; `image generation`; `text-in-the-wild`; `vqa`
* Language: `Russian`
* Type: `decoder`
* Num Parameters: `2.7B`
* Training Data Volume: `119 million text-image pairs; 60 million text paragraphs`
* Fine-tuning Data Volume: `43,334 text question-answer pairs; 100,000 math tasks; 85,000 text-image pairs (for captioning and generation); 85,759 visual question-answer pairs; 140,000 image-text pairs for text recognition`

# Model Description

**RU**ssian **D**ecoder **O**n **L**anguage **P**icture **H**yper-Tasking (RUDOLPH) 2.7B is the largest text-image-text transformer, designed for easy fine-tuning on a variety of tasks: from generating images from text descriptions and image classification to visual question answering and more. This model demonstrates the power of Hyper-modality Transformers.

*(!!!) Hyper-Tasking means generalized Multi-Tasking, i.e., a model that can solve almost all tasks within its supported modalities (two modalities in the case of RUDOLPH: images and Russian texts).*
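To illustrate one Hyper-Tasking mode, zero-shot image classification can be framed as ranking candidate class labels by an image-text score. The sketch below is a minimal, hypothetical outline only: `score` is a toy stand-in for the real model's image-conditioned text likelihood (the actual RUDOLPH API and checkpoint are not used here), so the example runs on its own.

```python
# Hypothetical sketch: zero-shot image classification as text ranking.
# `score` is a toy stand-in; the real model would return a likelihood of
# the label tokens given the image tokens.

def score(image, text):
    # Toy scoring rule so the example is self-contained and runnable.
    return -abs(len(text) - image["hint"])

def classify(image, labels):
    # Rank candidate Russian class labels by model score; the
    # best-scoring label is the zero-shot prediction.
    return max(labels, key=lambda label: score(image, label))

labels = ["кошка", "собака", "самолёт"]
image = {"hint": 6}  # toy "image" whose best-matching label has length 6
print(classify(image, labels))  # → собака
```

With the real model, `score` would be replaced by the transformer's log-likelihood of each label given the image, and the ranking step stays the same.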

This is a fine-tuned version of the pre-trained RuDOLPH 2.7B model.