RonanMcGovern
committed on
Commit
•
42afa5c
1
Parent(s):
ad8a331
add yi models
README.md
CHANGED
@@ -13,11 +13,12 @@ tags:
|
|
13 |
- function calling
|
14 |
- sharded
|
15 |
---
|
16 |
-
# Function Calling Llama 2 + Mistral + Zephyr + Deepseek Coder Models (version 2)
|
17 |
- Function calling Llama extends the Hugging Face Llama 2 models with function calling capabilities.
|
18 |
- The model responds with a structured JSON object containing the function name and arguments.
|
19 |
|
20 |
**Recent Updates**
|
|
|
21 |
- Nov 8th 2023 -> added Zephyr beta, an improved version of Mistral 7B (achieved via DPO)
|
22 |
- November 6th 2023 -> added Deepseek Coder 1.3B, 6.7B and 33B
|
23 |
- October 11th 2023 -> added Mistral 7B with function calling
|
@@ -27,7 +28,9 @@ tags:
|
|
27 |
1. Shortened syntax: Only function descriptions are needed for inference and no added instruction is required.
|
28 |
2. Function descriptions are moved outside of the system prompt. This prevents function-calling behaviour from being affected by how the model was trained to respond to system prompts.
|
29 |
|
30 |
-
|
|
|
|
|
31 |
- Deepseek-Coder-1.3B-Instruct with function calling ([Base Model](https://huggingface.co/Trelis/deepseek-coder-1.3b-instruct-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/deepseek-coder-1.3b-instruct-function-calling-adapters-v2/settings)) - Paid, [purchase here](https://buy.stripe.com/9AQbJubSda9Z8EM00A)
|
32 |
- Llama-7B-chat with function calling ([Base Model](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-adapters-v2)), ([GGUF - files are in the main branch of the base model]) - Free
|
33 |
- zephyr-7b-beta with function calling ([Base Model](https://huggingface.co/Trelis/zephyr-7b-beta-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/zephyr-7b-beta-function-calling-adapters-v2)), ([GGUF - files are in the main branch of the base model]) - Paid, [purchase here](https://buy.stripe.com/14k00M4pLeqf9IQbJk)
|
@@ -58,6 +61,8 @@ Mistral-7B, Llama-13B, Code-llama-34b, Llama-70B and Falcon-180B with function c
|
|
58 |
|
59 |
Use of all Llama models with function calling is further subject to terms in the [Meta license](https://ai.meta.com/resources/models-and-libraries/llama-downloads/).
|
60 |
|
|
|
|
|
61 |
Zephyr models were generated using Ultrachat, which relies on OpenAI. OpenAI does not permit the use of its models to train competing models. This makes it unclear whether Zephyr may be used commercially. Buyers/users do so at their sole risk.
|
62 |
|
63 |
## Dataset
|
@@ -92,6 +97,7 @@ import json
|
|
92 |
B_FUNC, E_FUNC = "<FUNCTIONS>", "</FUNCTIONS>\n\n"
|
93 |
B_INST, E_INST = "[INST] ", " [/INST]" #Llama style
|
94 |
# B_INST, E_INST = "\n### Instruction:\n", "\n### Response:\n" #DeepSeek Coder Style
|
|
|
95 |
|
96 |
# Define the function metadata
|
97 |
function_metadata = {
|
@@ -135,6 +141,7 @@ Example without a system message:
|
|
135 |
B_FUNC, E_FUNC = "<FUNCTIONS>", "</FUNCTIONS>\n\n"
|
136 |
B_INST, E_INST = "[INST] ", " [/INST]" #Llama style
|
137 |
# B_INST, E_INST = "\n### Instruction:\n", "\n### Response:\n" #DeepSeek Coder Style
|
|
|
138 |
|
139 |
functionList = {function_1_metadata}{function_2_metadata}...
|
140 |
user_prompt = '...'
|
@@ -149,6 +156,7 @@ Example with a system message:
|
|
149 |
B_FUNC, E_FUNC = "<FUNCTIONS>", "</FUNCTIONS>\n\n"
|
150 |
B_INST, E_INST = "[INST] ", " [/INST]" #Llama style
|
151 |
# B_INST, E_INST = "\n### Instruction:\n", "\n### Response:\n" #DeepSeek Coder Style
|
|
|
152 |
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
|
153 |
|
154 |
# assuming functionList is defined as above
|
|
|
13 |
- function calling
|
14 |
- sharded
|
15 |
---
|
16 |
+
# Function Calling Llama 2 + Yi + Mistral + Zephyr + Deepseek Coder Models (version 2)
|
17 |
- Function calling Llama extends the Hugging Face Llama 2 models with function calling capabilities.
|
18 |
- The model responds with a structured JSON object containing the function name and arguments.
|
19 |
|
20 |
**Recent Updates**
|
21 |
+
- Nov 15th 2023 -> added Yi 200k context models in 6B and 34B form.
|
22 |
- Nov 8th 2023 -> added Zephyr beta, an improved version of Mistral 7B (achieved via DPO)
|
23 |
- November 6th 2023 -> added Deepseek Coder 1.3B, 6.7B and 33B
|
24 |
- October 11th 2023 -> added Mistral 7B with function calling
|
|
|
28 |
1. Shortened syntax: Only function descriptions are needed for inference and no added instruction is required.
|
29 |
2. Function descriptions are moved outside of the system prompt. This prevents function-calling behaviour from being affected by how the model was trained to respond to system prompts.
|
30 |
|
31 |
+
Latest Models:
|
32 |
+
- Yi-6B-200k context with function calling ([Base Model](Trelis/Yi-6B-200K-Llamafied-function-calling-v2)), ([PEFT Adapters](Trelis/Yi-6B-200K-Llamafied-function-calling-adapters-v2)) - Paid, [purchase here](https://buy.stripe.com/00gdRC7BX1Dt08gbJp)
|
33 |
+
- Yi-34B-200k context with function calling ([Base Model](Trelis/Yi-34B-200K-Llamafied-function-calling-v2)), ([PEFT Adapters](Trelis/Yi-34B-200K-Llamafied-function-calling-adapters-v2)) - Paid, [purchase here](https://buy.stripe.com/8wM00M5tP81R6wE9Bi)
|
34 |
- Deepseek-Coder-1.3B-Instruct with function calling ([Base Model](https://huggingface.co/Trelis/deepseek-coder-1.3b-instruct-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/deepseek-coder-1.3b-instruct-function-calling-adapters-v2/settings)) - Paid, [purchase here](https://buy.stripe.com/9AQbJubSda9Z8EM00A)
|
35 |
- Llama-7B-chat with function calling ([Base Model](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-adapters-v2)), ([GGUF - files are in the main branch of the base model]) - Free
|
36 |
- zephyr-7b-beta with function calling ([Base Model](https://huggingface.co/Trelis/zephyr-7b-beta-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/zephyr-7b-beta-function-calling-adapters-v2)), ([GGUF - files are in the main branch of the base model]) - Paid, [purchase here](https://buy.stripe.com/14k00M4pLeqf9IQbJk)
|
|
|
61 |
|
62 |
Use of all Llama models with function calling is further subject to terms in the [Meta license](https://ai.meta.com/resources/models-and-libraries/llama-downloads/).
|
63 |
|
64 |
+
Yi models are subject to the Yi license, which permits commercial use as of Nov 15th 2023.
|
65 |
+
|
66 |
Zephyr models were generated using Ultrachat, which relies on OpenAI. OpenAI does not permit the use of its models to train competing models. This makes it unclear whether Zephyr may be used commercially. Buyers/users do so at their sole risk.
|
67 |
|
68 |
## Dataset
|
|
|
97 |
B_FUNC, E_FUNC = "<FUNCTIONS>", "</FUNCTIONS>\n\n"
|
98 |
B_INST, E_INST = "[INST] ", " [/INST]" #Llama style
|
99 |
# B_INST, E_INST = "\n### Instruction:\n", "\n### Response:\n" #DeepSeek Coder Style
|
100 |
+
# B_INST, E_INST = "Human: ", " Assistant: " #Yi Style
|
101 |
|
102 |
# Define the function metadata
|
103 |
function_metadata = {
|
|
|
141 |
B_FUNC, E_FUNC = "<FUNCTIONS>", "</FUNCTIONS>\n\n"
|
142 |
B_INST, E_INST = "[INST] ", " [/INST]" #Llama style
|
143 |
# B_INST, E_INST = "\n### Instruction:\n", "\n### Response:\n" #DeepSeek Coder Style
|
144 |
+
# B_INST, E_INST = "Human: ", " Assistant: " #Yi Style
|
145 |
|
146 |
functionList = {function_1_metadata}{function_2_metadata}...
|
147 |
user_prompt = '...'
|
|
|
156 |
B_FUNC, E_FUNC = "<FUNCTIONS>", "</FUNCTIONS>\n\n"
|
157 |
B_INST, E_INST = "[INST] ", " [/INST]" #Llama style
|
158 |
# B_INST, E_INST = "\n### Instruction:\n", "\n### Response:\n" #DeepSeek Coder Style
|
159 |
+
# B_INST, E_INST = "Human: ", " Assistant: " #Yi Style
|
160 |
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
|
161 |
|
162 |
# assuming functionList is defined as above
|
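The hunks above show only the delimiter definitions, not how they are combined. As a rough sketch, the pieces might be assembled into a prompt as follows. The helper names `build_prompt` and `parse_function_call`, the `example_function` metadata and its field names, and the ordering of the blocks (functions first, then the instruction, with any system message inside the instruction) are all assumptions made for illustration. None of them are confirmed by this diff. For the DeepSeek Coder or Yi models, `B_INST` and `E_INST` would presumably be swapped for the commented-out alternatives.

```python
import json

# Delimiters copied from the README snippets in the diff above.
B_FUNC, E_FUNC = "<FUNCTIONS>", "</FUNCTIONS>\n\n"
B_INST, E_INST = "[INST] ", " [/INST]"  # Llama style
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

# Hypothetical function metadata; the field names are illustrative only.
example_function = {
    "function": "get_weather",
    "description": "Look up the current weather for a city.",
    "arguments": [{"name": "city", "type": "string"}],
}


def build_prompt(functions, user_prompt, system_prompt=None):
    """Assemble a prompt string. The block ordering is an assumption."""
    # Serialize each function description and concatenate them.
    function_list = "".join(json.dumps(f, indent=4) for f in functions)
    # The system block is optional and sits inside the instruction block.
    sys_block = f"{B_SYS}{system_prompt.strip()}{E_SYS}" if system_prompt else ""
    return (
        f"{B_FUNC}{function_list.strip()}{E_FUNC}"
        f"{B_INST}{sys_block}{user_prompt.strip()}{E_INST}"
    )


def parse_function_call(response_text):
    """Return the decoded JSON object, or None if the text is not a JSON object."""
    try:
        parsed = json.loads(response_text)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None
```

A caller would pass the result of `build_prompt` to the tokenizer, then run the generated text through `parse_function_call` to distinguish a function call from a plain-text reply.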