|
---
library_name: transformers
tags:
- trl
- sft
datasets:
- Vikhrmodels/Veles-2.5
- dichspace/darulm
- zjkarina/Vikhr_instruct
---
|
|
|
# Veles Instruct [Don't Touch, Under Development]
|
|
|
Simply the best Russian instruct model, now with ChatML.
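
For reference, a minimal sketch of what the applied ChatML template is expected to produce (assuming the tokenizer's chat template follows the standard ChatML layout, as stated above; the exact rendering may differ):

```python
from transformers import AutoTokenizer

# Render a single user turn through the chat template without tokenizing,
# just to inspect the ChatML-style prompt the model expects.
tokenizer = AutoTokenizer.from_pretrained("Vikhrmodels/Vikhr-7B-instruct_0.3", use_fast=False)
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Привет!"}],
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)
# Expected ChatML-style layout (approximate):
# <|im_start|>user
# Привет!<|im_end|>
# <|im_start|>assistant
```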
|
|
|
Metrics, DPO results, and run scripts will come later; honestly, I don't much care, and I doubt you do either.

The fastest way to get started: https://colab.research.google.com/drive/10g5LSuzwsGVCCtiTuVM35T0LiiXwlWSQ?usp=sharing
|
|
|
|
|
|
|
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load the model in bfloat16 with FlashAttention-2 and automatic device placement
model = AutoModelForCausalLM.from_pretrained(
    "Vikhrmodels/Vikhr-7B-instruct_0.3",
    device_map="auto",
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("Vikhrmodels/Vikhr-7B-instruct_0.3", use_fast=False)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompts = [
    "В чем разница между фруктом и овощем?",
    "Годы жизни Колмогорова?",
]

def test_inference(prompt):
    # Wrap the user message in the model's ChatML chat template
    prompt = pipe.tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        tokenize=False,
        add_generation_prompt=True,
    )
    print(prompt)
    outputs = pipe(
        prompt,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_k=50,
        top_p=0.95,
        eos_token_id=tokenizer.eos_token_id,
    )
    # Strip the prompt from the generated text and return only the response
    return outputs[0]["generated_text"][len(prompt):].strip()

for prompt in prompts:
    print(f" prompt:\n{prompt}")
    print(f" response:\n{test_inference(prompt)}")
    print("-" * 50)
```
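
Note: `attn_implementation="flash_attention_2"` requires the flash-attn package and a compatible GPU. If that is not available, a sketch of a fallback load using PyTorch's built-in SDPA backend (a standard transformers option, not specific to this model):

```python
import torch
from transformers import AutoModelForCausalLM

# Fallback: load without FlashAttention-2, e.g. when flash-attn is not installed.
# "sdpa" selects PyTorch's scaled-dot-product attention backend.
model = AutoModelForCausalLM.from_pretrained(
    "Vikhrmodels/Vikhr-7B-instruct_0.3",
    device_map="auto",
    attn_implementation="sdpa",
    torch_dtype=torch.bfloat16,
)
```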