No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
model_path = "itpossible/Chinese-Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map=device)
text = "θ―·δΈΊζζ¨θδΈε½δΈεΊ§ζ―θΎθεηε±±"
messages = [{"role": "user", "content": text}]
print("\n\n====conversation====\n", messages)
print('debug: tokenizer.chat_template:\n{}'.format(tokenizer.chat_template))
print('debug: tokenizer.default_chat_template:\n{}'.format(tokenizer.default_chat_template))
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", tokenize=True).to(device)
outputs = model.generate(inputs, max_new_tokens=300, do_sample=True)
outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(outputs)
D:\code_app\anaconda\envs\ai\python.exe D:\ai\Chinese-Mistral-7B-Instruct-v0.1\main.py
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 26.65it/s]
No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class. If the default is not appropriate for your model, please set `tokenizer.chat_template` to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
====conversation====
[{'role': 'user', 'content': '请为我推荐中国三座比较著名的山'}]
debug: tokenizer.chat_template:
None
debug: tokenizer.default_chat_template:
......
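The two warnings above come from generate() receiving bare input_ids with no attention_mask and no pad_token_id. A minimal sketch of a call that silences both, assuming the same tokenizer, model, device, and messages as in the script above:

# Render the chat template to text first, then tokenize it ourselves so the
# tokenizer also returns an attention_mask alongside the input_ids.
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
encoded = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(device)
outputs = model.generate(
    input_ids=encoded["input_ids"],
    attention_mask=encoded["attention_mask"],  # silences the attention-mask warning
    pad_token_id=tokenizer.eos_token_id,       # silences the pad-token warning
    max_new_tokens=300,
    do_sample=True,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])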
You can find the usage of tokenizer.apply_chat_template here:
https://huggingface.co/docs/transformers/main/chat_templating
We use the following template to construct the SFT dataset:
register_template(
    name="mistral",
    prefix=[
        "{{system}}"
    ],
    prompt=[
        "[INST] {{query}} [/INST]"
    ],
    system="",
    sep=[]
)
As you can see, you can simply use the default template for inference. I think it has no impact on model performance, so you can ignore the warning.
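If you would rather make the warning go away, one option is to pin an explicit template that mirrors the SFT format above. The following is only a sketch of an equivalent Jinja template, not a template shipped with the model:

# Hypothetical equivalent of the "mistral" SFT template: wrap each user turn
# in [INST] ... [/INST] and close assistant turns with EOS.
tokenizer.chat_template = (
    "{{ bos_token }}"
    "{% for message in messages %}"
    "{% if message['role'] == 'user' %}"
    "[INST] {{ message['content'] }} [/INST]"
    "{% else %}"
    "{{ message['content'] }}{{ eos_token }}"
    "{% endif %}"
    "{% endfor %}"
)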
Thank you for your prompt response. Unfortunately, I didn't get the expected output, and on further investigation I suspect the issue is PyTorch/CUDA compatibility on my setup, since torch.cuda.is_available() returns False. I'll keep investigating potential mismatches between PyTorch and my CUDA version.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
print("torch.cuda.is_available()=",torch.cuda.is_available())
model_path = "itpossible/Chinese-Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map=device)
text = "θ―·δΈΊζζ¨θδΈε½δΈεΊ§ζ―θΎθεηε±±"
messages = [{"role": "user", "content": text}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(device)
print("inputs=",inputs)
outputs = model.generate(inputs, max_new_tokens=300, do_sample=True, pad_token_id=tokenizer.eos_token_id)
outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print("outputs=",outputs)
output:
D:\code_app\anaconda\envs\ai\python.exe D:\ai\Chinese-Mistral-7B-Instruct-v0.1\main.py
torch.cuda.is_available()= False
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 26.54it/s]
No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class. If the default is not appropriate for your model, please set `tokenizer.chat_template` to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.
inputs= tensor([[ 1, 733, 16289, 28793, 38919, 41987, 34745, 34481, 29492, 30635,
34653, 55082, 29480, 733, 28748, 16289, 28793]])
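An inputs tensor printed without a device='cuda:0' suffix confirms everything stayed on CPU. A CPU-only torch wheel reports exactly this; a quick diagnostic sketch to confirm which build is installed:

import torch

print(torch.__version__)          # a "+cpu" suffix means a CPU-only build
print(torch.version.cuda)         # None on CPU-only builds, e.g. "11.8" otherwise
print(torch.cuda.is_available())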
(ai) PS D:\ai> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_19:00:59_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0
(ai) PS D:\ai> cd .\Chinese-Mistral-7B-Instruct-v0.1
(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1> conda install pytorch torchvision torchaudio cudatoolkit=11.7 -c pytorch
Collecting package metadata (current_repodata.json): done
Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.
PackagesNotFoundError: The following packages are not available from current channels:
- cudatoolkit=11.7
Current channels:
- https://conda.anaconda.org/pytorch/win-64
- https://conda.anaconda.org/pytorch/noarch
- https://repo.anaconda.com/pkgs/main/win-64
- https://repo.anaconda.com/pkgs/main/noarch
- https://repo.anaconda.com/pkgs/r/win-64
- https://repo.anaconda.com/pkgs/r/noarch
- https://repo.anaconda.com/pkgs/msys2/win-64
- https://repo.anaconda.com/pkgs/msys2/noarch
To search for alternate channels that may provide the conda package you're
looking for, navigate to
https://anaconda.org
and use the search bar at the top of the page.
(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1>
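The conda failure is expected: cudatoolkit=11.7 is no longer published on those channels. Note also that a plain pip install torch on Windows typically resolves to the CPU-only wheel; per the PyTorch install selector, CUDA-enabled builds come from a version-specific index, e.g. pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118.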
(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1> pip install torch torchvision torchaudio
Requirement already satisfied: torch in d:\code_app\anaconda\envs\ai\lib\site-packages (2.2.2)
Collecting torchvision
Downloading torchvision-0.17.2-cp311-cp311-win_amd64.whl.metadata (6.6 kB)
Collecting torchaudio
Downloading torchaudio-2.2.2-cp311-cp311-win_amd64.whl.metadata (6.4 kB)
Requirement already satisfied: filelock in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (3.13.3)
Requirement already satisfied: typing-extensions>=4.8.0 in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (4.11.0)
Requirement already satisfied: sympy in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (1.12)
Requirement already satisfied: networkx in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (3.3)
Requirement already satisfied: jinja2 in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (3.1.3)
Requirement already satisfied: fsspec in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (2024.3.1)
Requirement already satisfied: numpy in d:\code_app\anaconda\envs\ai\lib\site-packages (from torchvision) (1.26.4)
Collecting pillow!=8.3.*,>=5.3.0 (from torchvision)
Downloading pillow-10.3.0-cp311-cp311-win_amd64.whl.metadata (9.4 kB)
Requirement already satisfied: MarkupSafe>=2.0 in d:\code_app\anaconda\envs\ai\lib\site-packages (from jinja2->torch) (2.1.5)
Requirement already satisfied: mpmath>=0.19 in d:\code_app\anaconda\envs\ai\lib\site-packages (from sympy->torch) (1.3.0)
Downloading torchvision-0.17.2-cp311-cp311-win_amd64.whl (1.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 6.2 MB/s eta 0:00:00
Downloading torchaudio-2.2.2-cp311-cp311-win_amd64.whl (2.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.4/2.4 MB 6.9 MB/s eta 0:00:00
Downloading pillow-10.3.0-cp311-cp311-win_amd64.whl (2.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.5/2.5 MB 6.7 MB/s eta 0:00:00
Installing collected packages: pillow, torchvision, torchaudio
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ultralytics 8.0.154 requires matplotlib>=3.2.2, which is not installed.
ultralytics 8.0.154 requires pandas>=1.1.4, which is not installed.
ultralytics 8.0.154 requires py-cpuinfo, which is not installed.
ultralytics 8.0.154 requires scipy>=1.4.1, which is not installed.
Successfully installed pillow-10.3.0 torchaudio-2.2.2 torchvision-0.17.2
Good luck.
got it :)
(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1> python .\main.py
torch.cuda.is_available()= True
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:04<00:00, 1.23s/it]
No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class. If the default is not appropriate for your model, please set `tokenizer.chat_template` to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.
inputs= tensor([[ 1, 733, 16289, 28793, 38919, 41987, 34745, 34481, 29492, 30635,
34653, 55082, 29480, 733, 28748, 16289, 28793]], device='cuda:0')
outputs= ['[INST] 请为我推荐中国三座比较著名的山 [/INST] 1. 泰山（Taishan）：地处山东省泰山市，是“五岳之首”，被古人誉为天庭的入口。泰山以巍峨雄伟著称，其山体由花岗岩构成，在海拔1500多米的高处建有“天街”寺庙。\n2. 喀纳斯（Keyanan）：位于新疆巴里坤哈萨克自治县西北部，因山峦巍峨，流水潺潺而著称。喀纳斯景区内的湖水呈深蓝，山峦高耸入云，有“人间仙境”之称。\n3. 稻城亚丁（Linzhiyading）：位于四川省甘孜藏族自治州西南部。稻城亚丁以其景色奇丽，自然环境独特，被誉为中国的“香格里拉”。其群山林立，云海缭绕，蓝天白云与雪峰交相辉映。']
(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1>