No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class

#1 by ooAKLoo - opened

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")

model_path = "itpossible/Chinese-Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map=device)

text = "θ―·δΈΊζˆ‘ζŽ¨θδΈ­ε›½δΈ‰εΊ§ζ―”θΎƒθ‘—εηš„ε±±"  # "Please recommend three fairly well-known mountains in China"
messages = [{"role": "user", "content": text}]

print("\n\n====conversation====\n", messages)
print('debug: tokenizer.chat_template:\n{}'.format(tokenizer.chat_template))
print('debug: tokenizer.default_chat_template:\n{}'.format(tokenizer.default_chat_template))

inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", tokenize=True).to(device)
outputs = model.generate(inputs, max_new_tokens=300, do_sample=True)
outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(outputs)

D:\code_app\anaconda\envs\ai\python.exe D:\ai\Chinese-Mistral-7B-Instruct-v0.1\main.py
Loading checkpoint shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4/4 [00:00<00:00, 26.65it/s]

No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class. If the default is not appropriate for your model, please set tokenizer.chat_template to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:2 for open-end generation.

====conversation====
[{'role': 'user', 'content': 'θ―·δΈΊζˆ‘ζŽ¨θδΈ­ε›½δΈ‰εΊ§ζ―”θΎƒθ‘—εηš„ε±±'}]
debug: tokenizer.chat_template:
None
debug: tokenizer.default_chat_template:
......
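
As an aside, the attention-mask and pad-token warnings are harmless here, since the batch holds a single unpadded sequence. A minimal sketch of how to silence them, reusing the inputs tensor from the script above:

attention_mask = torch.ones_like(inputs)  # a mask of all ones is correct for one unpadded sequence
outputs = model.generate(inputs, attention_mask=attention_mask,
                         max_new_tokens=300, do_sample=True,
                         pad_token_id=tokenizer.eos_token_id)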

You can find the usage of tokenizer.apply_chat_template here:
https://huggingface.co/docs/transformers/main/chat_templating

We use the following template to construct the SFT dataset:
register_template(
    name="mistral",
    prefix=[
        "{{system}}"
    ],
    prompt=[
        "[INST] {{query}} [/INST]"
    ],
    system="",
    sep=[]
)

As you can see, you can just use the default template for inference. I think it has no impact on model performance, so you can ignore the warning.
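
If you would rather make the warning go away, you can also set the template explicitly. Below is a minimal sketch of a Jinja chat template matching the SFT format above; the template string is my own construction, not one shipped with the model, so double-check the special tokens before relying on it:

tokenizer.chat_template = (
    "{{ bos_token }}"                          # prepend BOS once
    "{% for message in messages %}"
    "{% if message['role'] == 'user' %}"
    "[INST] {{ message['content'] }} [/INST]"  # wrap user turns in [INST] markers
    "{% elif message['role'] == 'assistant' %}"
    "{{ message['content'] }}{{ eos_token }}"  # close assistant turns with EOS
    "{% endif %}"
    "{% endfor %}"
)

Once set, apply_chat_template uses this instead of the class default and the warning disappears.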

Thank you for your prompt response. Unfortunately, I didn't get the expected output, and upon further investigation, I suspect the issue might be related to the PyTorch and CUDA compatibility on my setup, as torch.cuda.is_available() returns False. I'll continue to investigate potential mismatches between PyTorch and my CUDA version.
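
A quick way to narrow that down is to check whether the installed wheel was built with CUDA at all (standard torch attributes, nothing model-specific):

import torch
print(torch.__version__)    # pip CPU-only wheels usually carry a "+cpu" suffix
print(torch.version.cuda)   # None on a CPU-only build, e.g. "11.8" on a CUDA build
print(torch.cuda.is_available())

If torch.version.cuda is None, the problem is the installed build rather than the driver.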

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")

print("torch.cuda.is_available()=",torch.cuda.is_available())
model_path = "itpossible/Chinese-Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map=device)

text = "θ―·δΈΊζˆ‘ζŽ¨θδΈ­ε›½δΈ‰εΊ§ζ―”θΎƒθ‘—εηš„ε±±"  # "Please recommend three fairly well-known mountains in China"
messages = [{"role": "user", "content": text}]

inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(device)
print("inputs=",inputs)
outputs = model.generate(inputs, max_new_tokens=300, do_sample=True, pad_token_id=tokenizer.eos_token_id)
outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print("outputs=",outputs)

output:
D:\code_app\anaconda\envs\ai\python.exe D:\ai\Chinese-Mistral-7B-Instruct-v0.1\main.py
torch.cuda.is_available()= False
Loading checkpoint shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4/4 [00:00<00:00, 26.54it/s]

No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class. If the default is not appropriate for your model, please set tokenizer.chat_template to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.

inputs= tensor([[ 1, 733, 16289, 28793, 38919, 41987, 34745, 34481, 29492, 30635,
34653, 55082, 29480, 733, 28748, 16289, 28793]])
(ai) PS D:\ai> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_19:00:59_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0
(ai) PS D:\ai> cd .\Chinese-Mistral-7B-Instruct-v0.1
(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1> conda install pytorch torchvision torchaudio cudatoolkit=11.7 -c pytorch
Collecting package metadata (current_repodata.json): done
Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - cudatoolkit=11.7

Current channels:

To search for alternate channels that may provide the conda package you're
looking for, navigate to

https://anaconda.org

and use the search bar at the top of the page.

(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1>
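
Two notes on that session, in case someone hits the same wall: nvcc --version reports the locally installed CUDA toolkit, while the prebuilt PyTorch binaries only depend on the driver (nvidia-smi shows the highest CUDA version the driver supports). Also, for recent PyTorch releases the conda package was renamed from cudatoolkit to pytorch-cuda, so the conda command should be along the lines of the following (double-check the selector on pytorch.org for your exact versions):

conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
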
(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1> pip install torch torchvision torchaudio
Requirement already satisfied: torch in d:\code_app\anaconda\envs\ai\lib\site-packages (2.2.2)
Collecting torchvision
Downloading torchvision-0.17.2-cp311-cp311-win_amd64.whl.metadata (6.6 kB)
Collecting torchaudio
Downloading torchaudio-2.2.2-cp311-cp311-win_amd64.whl.metadata (6.4 kB)
Requirement already satisfied: filelock in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (3.13.3)
Requirement already satisfied: typing-extensions>=4.8.0 in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (4.11.0)
Requirement already satisfied: sympy in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (1.12)
Requirement already satisfied: networkx in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (3.3)
Requirement already satisfied: jinja2 in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (3.1.3)
Requirement already satisfied: fsspec in d:\code_app\anaconda\envs\ai\lib\site-packages (from torch) (2024.3.1)
Requirement already satisfied: numpy in d:\code_app\anaconda\envs\ai\lib\site-packages (from torchvision) (1.26.4)
Collecting pillow!=8.3.*,>=5.3.0 (from torchvision)
Downloading pillow-10.3.0-cp311-cp311-win_amd64.whl.metadata (9.4 kB)
Requirement already satisfied: MarkupSafe>=2.0 in d:\code_app\anaconda\envs\ai\lib\site-packages (from jinja2->torch) (2.1.5)
Requirement already satisfied: mpmath>=0.19 in d:\code_app\anaconda\envs\ai\lib\site-packages (from sympy->torch) (1.3.0)
Downloading torchvision-0.17.2-cp311-cp311-win_amd64.whl (1.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 6.2 MB/s eta 0:00:00
Downloading torchaudio-2.2.2-cp311-cp311-win_amd64.whl (2.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.4/2.4 MB 6.9 MB/s eta 0:00:00
Downloading pillow-10.3.0-cp311-cp311-win_amd64.whl (2.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.5/2.5 MB 6.7 MB/s eta 0:00:00
Installing collected packages: pillow, torchvision, torchaudio
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ultralytics 8.0.154 requires matplotlib>=3.2.2, which is not installed.
ultralytics 8.0.154 requires pandas>=1.1.4, which is not installed.
ultralytics 8.0.154 requires py-cpuinfo, which is not installed.
ultralytics 8.0.154 requires scipy>=1.4.1, which is not installed.
Successfully installed pillow-10.3.0 torchaudio-2.2.2 torchvision-0.17.2
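
Note that a plain pip install torch on Windows typically resolves to the CPU-only wheel, which would leave torch.cuda.is_available() at False. The CUDA builds live on a separate package index; the exact URL for your CUDA version is listed on pytorch.org, for example:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118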

Good luck.

got it :)
(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1> python .\main.py
torch.cuda.is_available()= True
Loading checkpoint shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4/4 [00:04<00:00, 1.23s/it]

No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class. If the default is not appropriate for your model, please set tokenizer.chat_template to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.

inputs= tensor([[ 1, 733, 16289, 28793, 38919, 41987, 34745, 34481, 29492, 30635,
34653, 55082, 29480, 733, 28748, 16289, 28793]], device='cuda:0')
outputs= ['[INST] θ―·δΈΊζˆ‘ζŽ¨θδΈ­ε›½δΈ‰εΊ§ζ―”θΎƒθ‘—εηš„ε±± [/INST] 1. 泰山(TaishanοΌ‰οΌšεœ°ε€„ε±±δΈœηœζ³°ε±±εΈ‚οΌŒζ˜―β€œδΊ”ε²³δΉ‹ι¦–β€οΌŒθ’«εŽδΊΊθͺ‰δΈΊε€©εΊ­ηš„ε…₯口。泰山δ»₯ε·ε³¨ι›„δΌŸθ‘—η§°οΌŒε…Άε±±δ½“η”±θŠ±
ε²©ζž„ζˆοΌŒεœ¨ζ΅·ζ‹”1500ε€šη±³ηš„ι«˜ε€„ε»Ίζœ‰β€œε€©θ‘—β€ε―ΊεΊ™γ€‚\n2. ε–€ηΊ³ζ–―οΌˆKeyananοΌ‰οΌšδ½δΊŽζ–°η–†ε·΄ι‡Œε€ε“ˆθ¨ε…‹θ‡ͺ治县θ₯ΏεŒ—ιƒ¨οΌŒε› ε±±ε³¦ε·ε³¨οΌŒζ΅ζ°΄ζ½Ίζ½Ίθ€Œθ‘—η§°γ€‚ε–€ηΊ³ζ–―ζ™―εŒΊε†…ηš„ζΉ–ζ°΄ε‘ˆζ·±θ“οΌŒε±±
ε³¦ι«˜θ€Έε…₯δΊ‘οΌŒζœ‰β€œδΊΊι—΄δ»™ε’ƒβ€δΉ‹η§°γ€‚\n3. 稻城亚丁(LinzhiyadingοΌ‰οΌšδ½δΊŽε››ε·ηœη”˜ε­œθ—ζ—θ‡ͺ治州θ₯Ώε—ιƒ¨γ€‚η¨»εŸŽδΊšδΈδ»₯兢景色ε₯‡δΈ½οΌŒθ‡ͺη„ΆηŽ―ε’ƒη‹¬η‰ΉοΌŒθ’«θͺ‰δΈΊδΈ­ε›½ηš„β€œι¦™ζ Όι‡Œζ‹‰β€γ€‚ε…ΆηΎ€ε±±ζž—οΌŒζž—ζ΅·ηΌ­η»•οΌŒθ“ε€©η™½δΊ‘δΈŽι›ͺε³°δΊ€η›ΈθΎ‰ζ˜ γ€‚']
(ai) PS D:\ai\Chinese-Mistral-7B-Instruct-v0.1>
