simecek committed
Commit
69ec876
1 Parent(s): cffc544

Usage example

  1. README.md +32 -0
README.md CHANGED
@@ -8,3 +8,35 @@ language:
 This is a Mistral7B model fine-tuned with QLoRA on Czech Wikipedia data. The model is primarily designed for further fine-tuning for Czech-specific NLP tasks, including summarization and question answering. This adaptation allows for better performance in tasks that require an understanding of the Czech language and context.
 
+ Example of usage:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ model_name = "simecek/cswikimistral_0.1"
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", load_in_4bit=True)
+
+ def generate_text(prompt, max_new_tokens=50):
+     inputs = tokenizer(prompt, return_tensors="pt").to(device)
+     attention_mask = inputs["attention_mask"]
+     input_ids = inputs["input_ids"]
+
+     output = model.generate(
+         input_ids,
+         attention_mask=attention_mask,
+         max_new_tokens=max_new_tokens,
+         num_return_sequences=1,
+         pad_token_id=tokenizer.eos_token_id,
+     )
+
+     return tokenizer.decode(output[0], skip_special_tokens=True)
+
+ prompt = "Hlavní město České republiky je"
+ generated_text = generate_text(prompt, max_new_tokens=5)
+ print(generated_text)
+ ```
+
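Review note: the example added in this commit combines `device_map="auto"` with a separately computed `device`, and passes `load_in_4bit=True` directly to `from_pretrained`. A possible variant is sketched below; it is not part of the commit. It assumes a CUDA GPU with `bitsandbytes` installed, and the `load_model` helper, the fp16 compute dtype, and the reworked `generate_text` signature are illustrative choices rather than anything the README specifies. Inputs are moved to `model.device` so they follow wherever the model was actually placed.

```python
def load_model(model_name="simecek/cswikimistral_0.1"):
    """Load the tokenizer and a 4-bit quantized model.

    Assumption: a CUDA GPU and the bitsandbytes package are available.
    """
    # Imported lazily so generate_text below can be used without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,  # assumption: fp16 compute dtype
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",
        quantization_config=quant_config,
    )
    return tokenizer, model


def generate_text(model, tokenizer, prompt, max_new_tokens=50):
    """Generate a continuation of `prompt` and return the decoded text."""
    # Send the inputs to the device the model was placed on by device_map="auto".
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,  # passes input_ids and attention_mask
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

With this variant, `tokenizer, model = load_model()` followed by `generate_text(model, tokenizer, "Hlavní město České republiky je", max_new_tokens=5)` would mirror the README example; the Czech prompt means "The capital of the Czech Republic is".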