mistralai/Mistral-7B-v0.1 fine-tuned on Anthony Bourdain's "Kitchen Confidential" with QLoRA, using the ChatML prompt format.
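The card does not publish the training script, so the snippet below is only a minimal sketch of what a QLoRA adapter setup with `peft` and `bitsandbytes` on top of Mistral-7B-v0.1 might look like. The rank, dropout, target modules, and quantization settings are illustrative assumptions, not the configuration actually used for this model.

```python
# Hypothetical QLoRA setup sketch; hyperparameters are illustrative, not the author's.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "mistralai/Mistral-7B-v0.1"

# 4-bit NF4 quantization of the frozen base weights (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# low-rank adapters on the attention and MLP projections; typical values, assumed here
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
)
model = get_peft_model(model, lora_config)
```

For inference, the card's example below loads the fine-tuned model and applies the ChatML chat template: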
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "models/Mistral-Bourdain"

# load the fine-tuned model in bfloat16, spreading layers across available GPUs
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# fast tokenizer; the ChatML chat template is applied below
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True, legacy=False)

# sampling parameters: llama-precise preset
gen_config = {
    "temperature": 0.7,
    "top_p": 0.1,
    "repetition_penalty": 1.18,
    "top_k": 40,
    "do_sample": True,
    "max_new_tokens": 300,
}

messages = [
    {"role": "user", "content": "Good morning Mr. Bourdain! Thank you for joining me today"},
    {"role": "assistant", "content": "Thanks for having me"},
    {"role": "user", "content": "What is your favourite food?"},
]

# format the conversation with the ChatML template and move the token ids to the GPU
prompt_tokenized = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True)
prompt_tokenized = torch.tensor([prompt_tokenized]).to("cuda")

output_ids = model.generate(prompt_tokenized, **gen_config)
response = tokenizer.decode(output_ids[0])
```
```
>>> print(response)
<|im_start|>user
Good morning Mr. Bourdain! Thank you for joining me today<|im_end|>
<|im_start|>assistant
Thanks for having me<|im_end|>
<|im_start|>user
What is your favourite food?<|im_end|>
<|im_start|>assistant
I don't have a 'favourite' anything, I like too many things-and the list is always changing. If you asked me tomorrow, I might well give you another answer. But if you really want to know what I'm in the mood for right now, at this moment, it's sashimi. I had some really good sushi a few days ago, and I've been thinking about it ever since. I'm not even going to talk about why I like sushi so much. The less said about that, the better. Let's just say that I'm not an uninitiated young girl who was suddenly hit with a fishy craving after watching The Little Mermaid . . . and leave it at that.<|im_end|>
```
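Note that the decoded response contains the full prompt as well as the new completion. An optional post-processing step, not part of the original example, is to decode only the newly generated tokens and to stop generation at the ChatML end-of-turn marker; the sketch below assumes the variables from the example above and that `<|im_end|>` is present in the tokenizer's vocabulary.

```python
# decode only the tokens generated after the prompt
reply_ids = output_ids[0][prompt_tokenized.shape[1]:]
reply = tokenizer.decode(reply_ids, skip_special_tokens=True)

# optionally stop generation at the ChatML end-of-turn token,
# assuming "<|im_end|>" exists in this tokenizer's vocabulary
im_end_id = tokenizer.convert_tokens_to_ids("<|im_end|>")
output_ids = model.generate(prompt_tokenized, eos_token_id=im_end_id, **gen_config)
```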