No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class

#1 by ooAKLoo - opened

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")

model_path = "itpossible/Chinese-Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map=device)

text = "θ―·δΈΊζˆ‘ζŽ¨θδΈ­ε›½δΈ‰εΊ§ζ―”θΎƒθ‘—εηš„ε±±"  # "Please recommend three fairly well-known mountains in China"
messages = [{"role": "user", "content": text}]

print("\n\n====conversation====\n", messages)
print('debug: tokenizer.chat_template:\n{}'.format(tokenizer.chat_template))
print('debug: tokenizer.default_chat_template:\n{}'.format(tokenizer.default_chat_template))

inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", tokenize=True).to(device)
outputs = model.generate(inputs, max_new_tokens=300, do_sample=True)
outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(outputs)

D:\code_app\anaconda\envs\ai\python.exe D:\ai\Chinese-Mistral-7B-Instruct-v0.1\main.py
Loading checkpoint shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4/4 [00:00<00:00, 26.65it/s]

No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class. If the default is not appropriate for your model, please set tokenizer.chat_template to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:2 for open-end generation.

====conversation====
[{'role': 'user', 'content': 'θ―·δΈΊζˆ‘ζŽ¨θδΈ­ε›½δΈ‰εΊ§ζ―”θΎƒθ‘—εηš„ε±±'}]
debug: tokenizer.chat_template:
None
debug: tokenizer.default_chat_template:
......
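
As an aside, the attention-mask and pad-token warnings are harmless here, since the batch holds a single unpadded sequence. A minimal sketch of how to silence them, reusing the inputs tensor from the script above:

attention_mask = torch.ones_like(inputs)  # a mask of all ones is correct for one unpadded sequence
outputs = model.generate(inputs, attention_mask=attention_mask,
                         max_new_tokens=300, do_sample=True,
                         pad_token_id=tokenizer.eos_token_id)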

You can find the usage of tokenizer.apply_chat_template here:
https://huggingface.co/docs/transformers/main/chat_templating

We use the following template to construct the SFT dataset:
register_template(
    name="mistral",
    prefix=[
        "{{system}}"
    ],
    prompt=[
        "[INST] {{query}} [/INST]"
    ],
    system="",
    sep=[]
)

As you can see, you can just use the default template for inference. I think it has no impact on model performance, so you can ignore the warning.
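
If you would rather make the warning go away, you can also set the template explicitly. Below is a minimal sketch of a Jinja chat template matching the SFT format above; the template string is my own construction, not one shipped with the model, so double-check the special tokens before relying on it:

tokenizer.chat_template = (
    "{{ bos_token }}"                          # prepend BOS once
    "{% for message in messages %}"
    "{% if message['role'] == 'user' %}"
    "[INST] {{ message['content'] }} [/INST]"  # wrap user turns in [INST] markers
    "{% elif message['role'] == 'assistant' %}"
    "{{ message['content'] }}{{ eos_token }}"  # close assistant turns with EOS
    "{% endif %}"
    "{% endfor %}"
)

Once set, apply_chat_template uses this instead of the class default and the warning disappears.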

Thank you for your prompt response. Unfortunately, I didn't get the expected output, and upon further investigation, I suspect the issue might be related to the PyTorch and CUDA compatibility on my setup, as torch.cuda.is_available() returns False. I'll continue to investigate potential mismatches between PyTorch and my CUDA version.
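
A quick way to narrow that down is to check whether the installed wheel was built with CUDA at all (standard torch attributes, nothing model-specific):

import torch
print(torch.__version__)    # pip CPU-only wheels usually carry a "+cpu" suffix
print(torch.version.cuda)   # None on a CPU-only build, e.g. "11.8" on a CUDA build
print(torch.cuda.is_available())

If torch.version.cuda is None, the problem is the installed build rather than the driver.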

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")

print("torch.cuda.is_available()=",torch.cuda.is_available())
model_path = "itpossible/Chinese-Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map=device)

text = "θ―·δΈΊζˆ‘ζŽ¨θδΈ­ε›½δΈ‰εΊ§ζ―”θΎƒθ‘—εηš„ε±±"  # "Please recommend three fairly well-known mountains in China"
messages = [{"role": "user", "content": text}]

inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(device)
print("inputs=",inputs)
outputs = model.generate(inputs, max_new_tokens=300, do_sample=True, pad_token_id=tokenizer.eos_token_id)
outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print("outputs=",outputs)

output:
D:\code_app\anaconda\envs\ai\python.exe D:\ai\Chinese-Mistral-7B-Instruct-v0.1\main.py
torch.cuda.is_available()= False
Loading checkpoint shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4/4 [00:00<00:00, 26.54it/s]

No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class. If the default is not appropriate for your model, please set tokenizer.chat_template to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.

inputs= tensor([[ 1, 733, 16289, 28793, 38919, 41987, 34745, 34481, 29492, 30635,
34653, 55082, 29480, 733, 28748, 16289, 28793]])
(ai) PS D:\ai> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_19:00:59_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0
(ai) PS D:\ai> cd .\Chinese-Mistral-7B-Instruct-v0.1
(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1> conda install pytorch torchvision torchaudio cudatoolkit=11.7 -c pytorch
Collecting package metadata (current_repodata.json): done
Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - cudatoolkit=11.7

Current channels:

To search for alternate channels that may provide the conda package you're
looking for, navigate to

https://anaconda.org

and use the search bar at the top of the page.

(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1>
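
Two notes on that session, in case someone hits the same wall: nvcc --version reports the locally installed CUDA toolkit, while the prebuilt PyTorch binaries only depend on the driver (nvidia-smi shows the highest CUDA version the driver supports). Also, for recent PyTorch releases the conda package was renamed from cudatoolkit to pytorch-cuda, so the conda command should be along the lines of the following (double-check the selector on pytorch.org for your exact versions):

conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
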
(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1> pip install torch torchvision torchaudio
Requirement already satisfied: torch in d:\code_app\anaconda\envs\ai\lib\site-packages (2.2.2)
Collecting torchvision
Downloading torchvision-0.17.2-cp311-cp311-win_amd64.whl.metadata (6.6 kB)
Collecting torchaudio
Downloading torchaudio-2.2.2-cp311-cp311-win_amd64.whl.metadata (6.4 kB)
Requirement already satisfied: filelock in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (3.13.3)
Requirement already satisfied: typing-extensions>=4.8.0 in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (4.11.0)
Requirement already satisfied: sympy in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (1.12)
Requirement already satisfied: networkx in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (3.3)
Requirement already satisfied: jinja2 in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (3.1.3)
Requirement already satisfied: fsspec in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (2024.3.1)
Requirement already satisfied: numpy in d:\code_app\anaconda\envs\ai\lib\site-packages (from torchvision) (1.26.4)
Collecting pillow!=8.3.*,>=5.3.0 (from torchvision)
Downloading pillow-10.3.0-cp311-cp311-win_amd64.whl.metadata (9.4 kB)
Requirement already satisfied: MarkupSafe>=2.0 in d:\code_app\anaconda\envs\ai\lib\site-packages (from jinja2->torch) (2.1.5)
Requirement already satisfied: mpmath>=0.19 in d:\code_app\anaconda\envs\ai\lib\site-packages (from sympy->torch) (1.3.0)
Downloading torchvision-0.17.2-cp311-cp311-win_amd64.whl (1.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 6.2 MB/s eta 0:00:00
Downloading torchaudio-2.2.2-cp311-cp311-win_amd64.whl (2.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.4/2.4 MB 6.9 MB/s eta 0:00:00
Downloading pillow-10.3.0-cp311-cp311-win_amd64.whl (2.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.5/2.5 MB 6.7 MB/s eta 0:00:00
Installing collected packages: pillow, torchvision, torchaudio
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ultralytics 8.0.154 requires matplotlib>=3.2.2, which is not installed.
ultralytics 8.0.154 requires pandas>=1.1.4, which is not installed.
ultralytics 8.0.154 requires py-cpuinfo, which is not installed.
ultralytics 8.0.154 requires scipy>=1.4.1, which is not installed.
Successfully installed pillow-10.3.0 torchaudio-2.2.2 torchvision-0.17.2
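
Note that a plain pip install torch on Windows typically resolves to the CPU-only wheel, which would leave torch.cuda.is_available() at False. The CUDA builds live on a separate package index; the exact URL for your CUDA version is listed on pytorch.org, for example:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118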

Good luck.

got it :)
(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1> python .\main.py
torch.cuda.is_available()= True
Loading checkpoint shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4/4 [00:04<00:00, 1.23s/it]

No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class. If the default is not appropriate for your model, please set tokenizer.chat_template to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.

inputs= tensor([[ 1, 733, 16289, 28793, 38919, 41987, 34745, 34481, 29492, 30635,
34653, 55082, 29480, 733, 28748, 16289, 28793]], device='cuda:0')
outputs= ['[INST] θ―·δΈΊζˆ‘ζŽ¨θδΈ­ε›½δΈ‰εΊ§ζ―”θΎƒθ‘—εηš„ε±± [/INST] 1. 泰山(TaishanοΌ‰οΌšεœ°ε€„ε±±δΈœηœζ³°ε±±εΈ‚οΌŒζ˜―β€œδΊ”ε²³δΉ‹ι¦–β€οΌŒθ’«εŽδΊΊθͺ‰δΈΊε€©εΊ­ηš„ε…₯口。泰山δ»₯ε·ε³¨ι›„δΌŸθ‘—η§°οΌŒε…Άε±±δ½“η”±θŠ±
ε²©ζž„ζˆοΌŒεœ¨ζ΅·ζ‹”1500ε€šη±³ηš„ι«˜ε€„ε»Ίζœ‰β€œε€©θ‘—β€ε―ΊεΊ™γ€‚\n2. ε–€ηΊ³ζ–―οΌˆKeyananοΌ‰οΌšδ½δΊŽζ–°η–†ε·΄ι‡Œε€ε“ˆθ¨ε…‹θ‡ͺ治县θ₯ΏεŒ—ιƒ¨οΌŒε› ε±±ε³¦ε·ε³¨οΌŒζ΅ζ°΄ζ½Ίζ½Ίθ€Œθ‘—η§°γ€‚ε–€ηΊ³ζ–―ζ™―εŒΊε†…ηš„ζΉ–ζ°΄ε‘ˆζ·±θ“οΌŒε±±
ε³¦ι«˜θ€Έε…₯δΊ‘οΌŒζœ‰β€œδΊΊι—΄δ»™ε’ƒβ€δΉ‹η§°γ€‚\n3. 稻城亚丁(LinzhiyadingοΌ‰οΌšδ½δΊŽε››ε·ηœη”˜ε­œθ—ζ—θ‡ͺ治州θ₯Ώε—ιƒ¨γ€‚η¨»εŸŽδΊšδΈδ»₯兢景色ε₯‡δΈ½οΌŒθ‡ͺη„ΆηŽ―ε’ƒη‹¬η‰ΉοΌŒθ’«θͺ‰δΈΊδΈ­ε›½ηš„β€œι¦™ζ Όι‡Œζ‹‰β€γ€‚ε…ΆηΎ€ε±±ζž—οΌŒζž—ζ΅·ηΌ­η»•οΌŒθ“ε€©η™½δΊ‘δΈŽι›ͺε³°δΊ€η›ΈθΎ‰ζ˜ γ€‚']
(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1>
