nitky committed d66ee74 (1 parent: ccb5c8c)

Upload README.md

Files changed (1): README.md (+83 -0)
README.md CHANGED
 
As far as I know, Swallow is the Llama 2 model family, available as a full set of sizes (7B, 13B, 70B), that can output the most beautiful Japanese. Therefore, I used it as the base model for this merge. Thank you to the Swallow team for their wonderful work.

## Test environment

This model was tested using [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main). I used the `simple-1` preset for generation.

Users have reported that setting **repetition_penalty** is important to prevent repetitive output. If you run into any issues, check these sampling settings first (a sketch of how to apply them in code follows the list):

- temperature: 0.7
- top_p: 0.9
- **repetition_penalty: 1.15**
- top_k: 20
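
Outside the webui, the same preset can be expressed with the Hugging Face transformers API. A minimal sketch, assuming you are generating with transformers; `max_new_tokens=200` is an illustrative choice (matching the example later in this card), not part of the preset:

```python
from transformers import GenerationConfig

# simple-1-style sampling settings from the list above.
generation_config = GenerationConfig(
    do_sample=True,           # enable sampling so temperature/top_p/top_k take effect
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.15,  # the setting users flagged as important
    top_k=20,
    max_new_tokens=200,       # illustrative length cap, not part of the preset
)

# Then pass it to generate(), e.g.:
# model.generate(input_ids, generation_config=generation_config)
```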

## Prompt template: Swallow (Alpaca format)

```
以下に、あるタスクを説明する指示があります。リクエストを適切に完了するための回答を記述してください。

### 指示:
{instruction}

### 応答:
```

## Use the instruct model

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "nitky/Superswallow-7b-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    device_map="auto",
)

PROMPT_DICT = {
    "prompt_input": (
        "以下に、あるタスクを説明する指示があり、それに付随する入力が更なる文脈を提供しています。"
        "リクエストを適切に完了するための回答を記述してください。\n\n"
        "### 指示:\n{instruction}\n\n### 入力:\n{input}\n\n### 応答:"
    ),
    "prompt_no_input": (
        "以下に、あるタスクを説明する指示があります。"
        "リクエストを適切に完了するための回答を記述してください。\n\n"
        "### 指示:\n{instruction}\n\n### 応答:"
    ),
}

def create_prompt(instruction, input=None):
    """
    Generates a prompt based on the given instruction and an optional input.
    If input is provided, it uses the 'prompt_input' template from PROMPT_DICT.
    If no input is provided, it uses the 'prompt_no_input' template.

    Args:
        instruction (str): The instruction describing the task.
        input (str, optional): Additional input providing context for the task. Default is None.

    Returns:
        str: The generated prompt.
    """
    if input:
        # Use the 'prompt_input' template when additional input is provided
        return PROMPT_DICT["prompt_input"].format(instruction=instruction, input=input)
    else:
        # Use the 'prompt_no_input' template when no additional input is provided
        return PROMPT_DICT["prompt_no_input"].format(instruction=instruction)

# Example usage
# "Please provide concise information on the following topic."
instruction_example = "以下のトピックに関する簡潔な情報を提供してください。"
# "Please list the main campuses of Tokyo Institute of Technology."
input_example = "東京工業大学の主なキャンパスの一覧を、リスト形式で教えてください"
prompt = create_prompt(instruction_example, input_example)

input_ids = tokenizer.encode(
    prompt,
    add_special_tokens=False,
    return_tensors="pt"
)

tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=200,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.15,
    top_k=20,
    do_sample=True,
)

out = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(out)
```
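
If you want to see tokens as they are produced instead of waiting for the full completion, transformers provides `TextStreamer`, which can be attached to the same `generate` call. A minimal sketch, reusing `model`, `tokenizer`, and `input_ids` from the example above:

```python
from transformers import TextStreamer

# Prints decoded tokens to stdout as they are generated.
# skip_prompt avoids echoing the prompt back; skip_special_tokens is
# forwarded to the tokenizer's decode step.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=200,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.15,
    top_k=20,
    do_sample=True,
    streamer=streamer,
)
```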

## Merge Details
### Merge Method