---
library_name: transformers
datasets:
- hypervariance/function-calling-sharegpt
---

# Model Card for gemma-2b-function-calling

Gemma 2B for function calling: [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it) fine-tuned on [hypervariance/function-calling-sharegpt](https://huggingface.co/datasets/hypervariance/function-calling-sharegpt).


## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("rodrigo-pedro/gemma-2b-function-calling", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("rodrigo-pedro/gemma-2b-function-calling", trust_remote_code=True, device_map="auto")

# Build your prompt following the prompt template and delimiters described below.
prompt = "YOUR PROMPT"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, do_sample=True, temperature=0.1, top_p=0.95, max_new_tokens=100)

print(tokenizer.decode(outputs[0]))
```

You can also use sharegpt formatted prompts:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("rodrigo-pedro/gemma-2b-function-calling", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("rodrigo-pedro/gemma-2b-function-calling", trust_remote_code=True, device_map="auto")

chat = [
    {
        "from": "system",
        "value": "SYSTEM PROMPT",
    },
    {
        "from": "human",
        "value": "USER QUESTION"
    },
]

# Render the conversation with the model's chat template before tokenizing.
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, do_sample=True, temperature=0.1, top_p=0.95, max_new_tokens=100)

print(tokenizer.decode(outputs[0]))
```

## Prompt template

```text
You are a helpful assistant with access to the following functions. Use them if required -
{
    "name": "function name",
    "description": "function description",
    "parameters": {
        "type": "type (object/number/string)",
        "properties": {
            "property_1": {
                "type": "type",
                "description": "property description"
            }
        },
        "required": [
            "property_1"
        ]
    }
}

To use these functions respond with:
<functioncall> {"name": "function_name", "arguments": {"arg_1": "value_1", "arg_1": "value_1", ...}} </functioncall>

Edge cases you must handle:
 - If there are no functions that match the user request, you will respond politely that you cannot help.

User Question:
USER_QUESTION
```
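For illustration, a prompt following this template can be assembled programmatically from a function spec. This is a minimal sketch, assuming a hypothetical `get_current_weather` function and plain string assembly; neither is part of this repository or the dataset:

```python
import json

# Hypothetical function spec, used only to illustrate the template above.
weather_fn = {
    "name": "get_current_weather",
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "Name of the city"
            }
        },
        "required": ["city"]
    }
}

# Assemble the system prompt and user question following the template.
prompt = (
    "You are a helpful assistant with access to the following functions. Use them if required -\n"
    + json.dumps(weather_fn, indent=4)
    + "\n\nTo use these functions respond with:\n"
    + '<functioncall> {"name": "function_name", "arguments": {"arg_1": "value_1", ...}} </functioncall>\n\n'
    + "Edge cases you must handle:\n"
    + " - If there are no functions that match the user request, you will respond politely that you cannot help.\n\n"
    + "User Question:\nWhat's the weather in Lisbon?"
)
```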

The model encloses function calls in `<functioncall>` and `</functioncall>` tags.
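The call can then be pulled out of the decoded output and parsed. This is a minimal sketch, assuming the model emits valid JSON between the tags; `extract_function_call` is a hypothetical helper, not part of this repository:

```python
import json
import re

def extract_function_call(generated_text: str):
    """Return (name, arguments) from a <functioncall> ... </functioncall> span, or None."""
    match = re.search(r"<functioncall>\s*(\{.*?\})\s*</functioncall>", generated_text, re.DOTALL)
    if match is None:
        return None  # the model answered in plain text instead of calling a function
    call = json.loads(match.group(1))  # assumes the span is valid JSON
    return call["name"], call["arguments"]

# Example: inspect the decoded generation for a function call.
# result = extract_function_call(tokenizer.decode(outputs[0], skip_special_tokens=True))
```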

The model was trained using the same delimiters as [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it):

```text
<bos><start_of_turn>user
Write a hello world program<end_of_turn>
<start_of_turn>model
```

Use the `<end_of_turn>` stop sequence to prevent the model from generating further text.
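
One way to do this with `transformers` (a sketch, assuming a recent version where `generate` accepts a list of `eos_token_id` values) is to treat `<end_of_turn>` as an additional end-of-sequence token:

```python
# Stop generation at <end_of_turn> by treating it as an end-of-sequence token.
end_of_turn_id = tokenizer.convert_tokens_to_ids("<end_of_turn>")

outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.1,
    top_p=0.95,
    max_new_tokens=100,
    eos_token_id=[tokenizer.eos_token_id, end_of_turn_id],
)
```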