for testing #3
by AICloudOtabek - opened

README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-license:
+license: other
 pipeline_tag: text-generation
 tags:
 - chemistry
@@ -8,20 +8,12 @@ language:
 - zh
 ---
 # ChemLLM-7B-Chat: LLM for Chemistry and Molecule Science
-
-> [!IMPORTANT]
-> Better using New version of ChemLLM!
-> [AI4Chem/ChemLLM-7B-Chat-1.5-DPO](https://huggingface.co/AI4Chem/ChemLLM-7B-Chat-1.5-DPO) or [AI4Chem/ChemLLM-7B-Chat-1.5-SFT](https://huggingface.co/AI4Chem/ChemLLM-7B-Chat-1.5-SFT)
-
-
 ChemLLM-7B-Chat, The First Open-source Large Language Model for Chemistry and Molecule Science, Build based on InternLM-2 with ❤
 [![Paper page](https://huggingface.co/datasets/huggingface/badges/resolve/main/paper-page-sm.svg)](https://huggingface.co/papers/2402.06852)

 <center><img src='https://cdn-uploads.huggingface.co/production/uploads/64bce15bafd1e46c5504ad38/wdFV6p3rTBCtskbeuVwNJ.png'></center>

 ## News
-- ChemLLM-1.5 released! Two versions are available [AI4Chem/ChemLLM-7B-Chat-1.5-DPO](https://huggingface.co/AI4Chem/ChemLLM-7B-Chat-1.5-DPO) or [AI4Chem/ChemLLM-7B-Chat-1.5-SFT](https://huggingface.co/AI4Chem/ChemLLM-7B-Chat-1.5-SFT).[2024-4-2]
-- ChemLLM-1.5 updated! Have a try on [Demo Site](https://chemllm.org/#/chat) or [API Reference](https://api.chemllm.org/docs).[2024-3-23]
 - ChemLLM has been featured by HuggingFace on [“Daily Papers” page](https://huggingface.co/papers/2402.06852).[2024-2-13]
 - ChemLLM arXiv preprint released.[ChemLLM: A Chemical Large Language Model](https://arxiv.org/abs/2402.06852)[2024-2-10]
 - News report from [Shanghai AI Lab](https://mp.weixin.qq.com/s/u-i7lQxJzrytipek4a87fw)[2024-1-26]
@@ -44,7 +36,7 @@ import torch
 model_name_or_id = "AI4Chem/ChemLLM-7B-Chat"

 model = AutoModelForCausalLM.from_pretrained(model_name_or_id, torch_dtype=torch.float16, device_map="auto",trust_remote_code=True)
-tokenizer = AutoTokenizer.from_pretrained(model_name_or_id)
+tokenizer = AutoTokenizer.from_pretrained(model_name_or_id, trust_remote_code=True)

 prompt = "What is Molecule of Ibuprofen?"

@@ -75,21 +67,17 @@ You can format it into this InternLM2 Dialogue format like,
 ```
 def InternLM2_format(instruction,prompt,answer,history):
     prefix_template=[
-        "<|im_start|>system\n",
-        "{}",
-        "<|im_end|>\n"
+        "<|system|>:",
+        "{}"
     ]
     prompt_template=[
-        "<|im_start|>user\n",
-        "{}",
-        "<|im_end|>\n",
-        "<|im_start|>assistant\n",
-        "{}",
-        "<|im_end|>\n"
+        "<|user|>:",
+        "{}\n",
+        "<|Bot|>:\n"
     ]
-    system = f'{prefix_template[0]}{prefix_template[1].format(instruction)}{prefix_template[2]}'
-    history = "".join([f'{prompt_template[0]}{prompt_template[1].format(qa[0])}{prompt_template[2]}{prompt_template[3]}{prompt_template[4].format(qa[1])}{prompt_template[5]}' for qa in history])
-    prompt = f'{prompt_template[0]}{prompt_template[1].format(prompt)}{prompt_template[2]}{prompt_template[3]}'
+    system = f'{prefix_template[0]}\n{prefix_template[-1].format(instruction)}\n'
+    history = "\n".join([f'{prompt_template[0]}\n{prompt_template[1].format(qa[0])}{prompt_template[-1]}{qa[1]}' for qa in history])
+    prompt = f'\n{prompt_template[0]}\n{prompt_template[1].format(prompt)}{prompt_template[-1]}'
     return f"{system}{history}{prompt}"
 ```
 And there is a good example for system prompt,
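The updated `InternLM2_format` on the `+` side of this diff can be sanity-checked standalone. The sketch below reproduces the patched function and prints the dialogue string it builds; the system instruction and question are placeholder values, not taken from the README:

```python
def InternLM2_format(instruction, prompt, answer, history):
    # Dialogue-piece templates from the patched README
    prefix_template = [
        "<|system|>:",
        "{}"
    ]
    prompt_template = [
        "<|user|>:",
        "{}\n",
        "<|Bot|>:\n"
    ]
    # System block, one user/assistant pair per history entry, then the new turn
    system = f'{prefix_template[0]}\n{prefix_template[-1].format(instruction)}\n'
    history = "\n".join([f'{prompt_template[0]}\n{prompt_template[1].format(qa[0])}{prompt_template[-1]}{qa[1]}' for qa in history])
    prompt = f'\n{prompt_template[0]}\n{prompt_template[1].format(prompt)}{prompt_template[-1]}'
    return f"{system}{history}{prompt}"

text = InternLM2_format("You are a helpful chemistry assistant.",
                        "What is Molecule of Ibuprofen?", None, [])
print(text)
```

With an empty history the result is the system block followed directly by the new `<|user|>:` turn, ending in `<|Bot|>:` so the model continues as the assistant.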