stardust-coder committed
Commit f7cdd63
1 Parent(s): 55fc1e4

Update README.md

Files changed (1): README.md (+87 −1)
README.md CHANGED

---
library_name: peft
license: llama2
datasets:
- izumi-lab/llm-japanese-dataset
language:
- ja
pipeline_tag: text-generation
---

# AIgroup-CVM-utokyohospital/Llama-2-70b-chat-4bit-japanese

This model is Llama-2-Chat 70B fine-tuned on part of the following Japanese version of the Alpaca dataset:

https://huggingface.co/datasets/izumi-lab/llm-japanese-dataset

- 10,000 steps
- batch_size = 4
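
The card does not include the training script itself. For reference, below is a minimal sketch of how a QLoRA-style run with these settings could look; the LoRA hyperparameters, learning rate, and prompt layout are assumptions, not values from the card.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "meta-llama/Llama-2-70b-chat-hf"

# Same 4-bit NF4 quantization config the card reports using for training
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Hypothetical LoRA hyperparameters -- the card does not state them
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

dataset = load_dataset("izumi-lab/llm-japanese-dataset", split="train")

def tokenize(example):
    # Hypothetical prompt layout; the card does not show its template
    prompt = f"{example['instruction']}\n{example['output']}"
    return tokenizer(prompt, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama2-ja-qlora",
        per_device_train_batch_size=4,  # batch_size = 4, from the card
        max_steps=10_000,               # 10,000 steps, from the card
        learning_rate=2e-4,             # assumed, not stated by the card
        bf16=True,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```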

## Copyright Notice

This model is built on Meta's LLaMA and is subject to its copyright.

Users of this model must also agree to Meta's license below.

https://ai.meta.com/llama/

## How to use

```python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, AutoConfig

torch.cuda.empty_cache()

# Load the base model in 4-bit (NF4, double quantization)
model_id = "meta-llama/Llama-2-70b-chat-hf"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
config = AutoConfig.from_pretrained(model_id)
config.pretraining_tp = 1  # required for Llama-2-70b
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    quantization_config=bnb_config,
    device_map="auto",
)

# Load the LoRA adapter weights on top of the base model
peft_name = "AIgroup-CVM-utokyohospital/Llama-2-70b-chat-4bit-japanese"
model_peft = PeftModel.from_pretrained(
    model,
    peft_name,
    device_map="auto",
)
model_peft.eval()

device = "cuda:0"

text = "東京大学について教えてください。"  # example prompt; replace with your own
inputs = tokenizer(text, return_tensors="pt").to(device)

# Greedy decoding (do_sample=False plays the role of temperature=0)
with torch.no_grad():
    # Base model output
    outputs = model.generate(**inputs, do_sample=False, repetition_penalty=1.0)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    # Fine-tuned (adapter) output
    outputs = model_peft.generate(**inputs, do_sample=False, repetition_penalty=1.0)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
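
The example above feeds the prompt in raw. Llama-2-Chat checkpoints are normally prompted with the `[INST]` template; the card does not specify one, so the helper below is an assumption:

```python
# Hypothetical helper; the card does not specify a prompt template,
# but Llama-2-Chat models commonly use this [INST] format.
def build_prompt(user_message: str) -> str:
    return f"[INST] {user_message} [/INST]"

inputs = tokenizer(build_prompt("日本の首都はどこですか?"), return_tensors="pt").to(device)
```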

## Sample Responses

```

```

## Training procedure

The `bitsandbytes` quantization config used during training is the one shown in `bnb_config` above (4-bit NF4, double quantization, bfloat16 compute).

### Framework versions

- PEFT 0.4.0
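
To confirm your runtime matches the card's only pinned dependency, a quick version check (transformers and bitsandbytes versions are not pinned by the card):

```python
# The card pins only PEFT 0.4.0; the other versions are unpinned.
import bitsandbytes
import peft
import transformers

print("peft:", peft.__version__)
print("transformers:", transformers.__version__)
print("bitsandbytes:", bitsandbytes.__version__)
```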