---
language:
- ru
license: apache-2.0
pipeline_tag: text-generation
tags:
- aeonium
inference:
  parameters:
    temperature: 0.8
---

# Aeonium v1 BaseWeb 1B

A causal language model for Russian text generation. This checkpoint is a preliminary version of the model with 1.6 billion parameters, trained only on web pages.

## Models

| Name | Number of parameters | Number of dataset tokens | Context window |
|:---------------------:|:-----------------:|:---------------------:|:--------------:|
| **Aeonium-v1-BaseWeb-1B** | 1.6B | 32B | 4K |
| Aeonium-v1-Base-1B | 1.6B | In training | 4K |

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Use the GPU when one is available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("aeonium/Aeonium-v1-BaseWeb-1B")
model = AutoModelForCausalLM.from_pretrained("aeonium/Aeonium-v1-BaseWeb-1B").to(device)

# Prompt (in Russian): "Artificial intelligence is"
input_ids = tokenizer("Искусственный интеллект - это", return_tensors="pt").to(model.device)["input_ids"]

# Sample up to 48 new tokens and print the decoded text without special tokens.
output = model.generate(input_ids, max_new_tokens=48, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

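Output:

```
Искусственный интеллект - это не только про компьютеры и смартфоны. Его возможности безграничны, а с развитием интернета и интернета вещей он становится еще и самым настоящим оружием в борьбе с преступностью.
Мы поговорили с юристом о самых интересных и опасных способах использования ИИ.
```

English translation of the sample: "Artificial intelligence is not only about computers and smartphones. Its possibilities are limitless, and with the development of the Internet and the Internet of Things it is also becoming a real weapon in the fight against crime. We talked to a lawyer about the most interesting and dangerous ways of using AI."
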
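The 4K context window listed in the Models table bounds the prompt and the generated tokens together. The following is a minimal sketch of that budget, not part of the model's API. It assumes "4K" means 4096 tokens (the card does not give the exact figure; the model config is the authoritative source), and the helper names `max_prompt_tokens` and `truncate_prompt` are hypothetical.

```python
# Assumption: "4K" in the Models table means 4096 tokens.
CONTEXT_WINDOW = 4096


def max_prompt_tokens(max_new_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many prompt tokens fit alongside `max_new_tokens` generated tokens."""
    if not 0 <= max_new_tokens < context_window:
        raise ValueError("max_new_tokens must be in [0, context_window)")
    return context_window - max_new_tokens


def truncate_prompt(token_ids, max_new_tokens: int, context_window: int = CONTEXT_WINDOW):
    """Keep only the most recent prompt tokens so prompt + generation fits the window."""
    budget = max_prompt_tokens(max_new_tokens, context_window)
    return token_ids[-budget:]


# With max_new_tokens=48, as in the usage example, 4048 prompt tokens remain.
print(max_prompt_tokens(48))
```

Truncating from the left keeps the text closest to the continuation point, which is usually what matters for a base model that simply continues its prompt.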
## Dataset Detail

The pre-training dataset was collected from public data, most of which is web pages in Russian. The total size of the data is 32B tokens.

## Training Detail

Training was performed on a TPU v4-128 node, thanks to a grant from [TPU Research Cloud](https://sites.research.google/trc/about/).

Loss: 2.68; Accuracy: 0.48; Batch Size: 1024

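For intuition, the reported loss can be converted to a perplexity, and the figures in the Models table give a tokens-per-parameter ratio. This is a back-of-the-envelope sketch under assumptions the card does not state: that the loss is the mean per-token cross-entropy in nats, and that "32B" and "1.6B" can be treated as exact values.

```python
import math

# Reported in this card.
loss = 2.68              # assumed: mean per-token cross-entropy, in nats
dataset_tokens = 32e9    # "32B" tokens, treated as exact
parameters = 1.6e9       # "1.6B" parameters, treated as exact

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(loss)

# Number of training tokens seen per model parameter.
tokens_per_parameter = dataset_tokens / parameters

print(f"perplexity: {perplexity:.1f}")
print(f"tokens per parameter: {tokens_per_parameter:.0f}")
```

Under these assumptions the perplexity comes out at roughly 14.6 and the ratio at 20 tokens per parameter. If the loss was measured in a different unit or on a different split, the perplexity figure does not apply.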
## Copyright

The model is released under the Apache 2.0 license.