---
license: mit
datasets:
- glaiveai/glaive-function-calling-v2
language:
- zh
library_name: transformers
pipeline_tag: text-generation
tags:
- function-call
---
# Qwen2-7B-Instruct-glaive-function-calling
## Introduction

Fine-tuned from Qwen2-7B-Instruct on the glaive-function-calling-v2 dataset.
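For reference, the dataset can be loaded from the Hugging Face Hub. A minimal sketch; the split and field names below reflect the dataset's default layout and should be checked against the dataset card:

```python
from datasets import load_dataset

# glaive-function-calling-v2 ships a single train split of chat transcripts.
dataset = load_dataset("glaiveai/glaive-function-calling-v2", split="train")
print(dataset[0])  # one sample: a system prompt plus a multi-turn chat
```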
## Training details

Trained with LoRA; a configuration sketch follows the sample below. A training sample looks like this:
```
<|im_start|>system
You are a helpful assistant with access to the following functions. Use them if required -
{
    "name": "generate_invoice",
    "description": "Generate an invoice with specified details",
    "parameters": {
        "type": "object",
        "properties": {
            "customer_name": {
                "type": "string",
                "description": "The name of the customer"
            },
            "items": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {
                            "type": "string",
                            "description": "The name of the item"
                        },
                        "quantity": {
                            "type": "integer",
                            "description": "The quantity of the item"
                        },
                        "price": {
                            "type": "number",
                            "description": "The price of the item"
                        }
                    },
                    "required": [
                        "name",
                        "quantity",
                        "price"
                    ]
                }
            }
        },
        "required": [
            "customer_name",
            "items"
        ]
    }
}
<|im_end|>
<|im_start|>user
I need to generate an invoice for a customer named John Doe. He bought 2 apples for $1 each and 3 oranges for $2 each.<|im_end|>
<|im_start|>assistant
<functioncall> {"name": "generate_invoice", "arguments": '{"customer_name": "John Doe", "items": [{"name": "apple", "quantity": 2, "price": 1}, {"name": "orange", "quantity": 3, "price": 2}]}'} <|endoftext|><|im_end|>
<|im_start|>function
{"invoice_id": "INV12345", "customer_name": "John Doe", "items": [{"name": "apple", "quantity": 2, "price": 1, "total": 2}, {"name": "orange", "quantity": 3, "price": 2, "total": 6}], "total": 8, "status": "Generated"}<|im_end|>
<|im_start|>assistant
The invoice has been successfully generated. The invoice ID is INV12345. The total amount for 2 apples and 3 oranges is $8. <|endoftext|><|im_end|>
```
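The exact LoRA hyperparameters are not published with this card. The following is a minimal sketch of a comparable setup with `peft`, where the rank, alpha, dropout, and target modules are assumptions rather than the values actually used:

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

# Assumed hyperparameters -- the card does not state the values actually used.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                # assumed LoRA rank
    lora_alpha=16,      # assumed scaling factor
    lora_dropout=0.05,  # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
)

base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-7B-Instruct")
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
```

Only the adapter weights are updated during training, which is what makes fine-tuning a 7B model practical on modest hardware.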
## Quickstart

See Qwen2-7B-Instruct. The code snippet below uses `apply_chat_template` to show how to load the tokenizer and model and how to generate content.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
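To use the model for function calling, the system prompt and turn structure from the training sample above have to be reproduced at inference time, and the `<functioncall>` payload has to be parsed out of the generated text. Below is a minimal parsing sketch, assuming the model emits exactly the `<functioncall> {...}` format shown in the training sample; the regex and the `parse_function_call` helper are illustrative, not part of any published API:

```python
import json
import re

def parse_function_call(generated_text):
    """Extract the function name and arguments from a <functioncall> response.

    Illustrative helper, assuming the output format shown in the training
    sample, where the arguments object is serialized as a single-quoted
    JSON string.
    """
    match = re.search(
        r'<functioncall>\s*\{"name":\s*"([^"]+)",\s*"arguments":\s*\'(.*)\'\s*\}',
        generated_text,
        re.DOTALL,
    )
    if match is None:
        return None  # plain-text answer, no tool call
    return match.group(1), json.loads(match.group(2))

# Example on the assistant turn from the training sample above:
output = (
    '<functioncall> {"name": "generate_invoice", "arguments": '
    '\'{"customer_name": "John Doe", "items": [{"name": "apple", '
    '"quantity": 2, "price": 1}]}\'} <|endoftext|>'
)
name, arguments = parse_function_call(output)
print(name)       # generate_invoice
print(arguments)  # {'customer_name': 'John Doe', 'items': [...]}
```

After executing the call on your side, append its result as a `function` turn (as in the training sample) and generate again to obtain the final natural-language answer.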