RonanMcGovern committed on
Commit
42afa5c
1 Parent(s): ad8a331

add yi models

  1. README.md +10 -2
README.md CHANGED
@@ -13,11 +13,12 @@ tags:
13
  - function calling
14
  - sharded
15
  ---
16
- # Function Calling Llama 2 + Mistral + Zephyr + Deepseek Coder Models (version 2)
17
  - Function calling Llama extends the Hugging Face Llama 2 models with function calling capabilities.
18
  - The model responds with a structured JSON object containing the function name and arguments.
19
 
20
  **Recent Updates**
 
21
  - Nov 8th 2023 -> added Zephyr beta, an improved version of Mistral 7B (achieved via DPO)
22
  - November 6th 2023 -> added Deepseek Coder 1.3B, 6.7B and 33B
23
  - October 11th 2023 -> added Mistral 7B with function calling
@@ -27,7 +28,9 @@ tags:
27
  1. Shortened syntax: Only function descriptions are needed for inference and no added instruction is required.
28
  2. Function descriptions are moved outside of the system prompt. This avoids the behaviour of function calling being affected by how the system prompt had been trained to influence the model.
29
 
30
- Most Popular Models:
 
 
31
  - Deepseek-Coder-1.3B-Instruct with function calling ([Base Model](https://huggingface.co/Trelis/deepseek-coder-1.3b-instruct-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/deepseek-coder-1.3b-instruct-function-calling-adapters-v2)) - Paid, [purchase here](https://buy.stripe.com/9AQbJubSda9Z8EM00A)
32
  - Llama-7B-chat with function calling ([Base Model](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-adapters-v2)), ([GGUF - files are in the main branch of the base model]) - Free
33
  - zephyr-7b-beta with function calling ([Base Model](https://huggingface.co/Trelis/zephyr-7b-beta-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/zephyr-7b-beta-function-calling-adapters-v2)), ([GGUF - files are in the main branch of the base model]) - Paid, [purchase here](https://buy.stripe.com/14k00M4pLeqf9IQbJk)
@@ -58,6 +61,8 @@ Mistral-7B, Llama-13B, Code-llama-34b, Llama-70B and Falcon-180B with function c
58
 
59
  Use of all Llama models with function calling is further subject to terms in the [Meta license](https://ai.meta.com/resources/models-and-libraries/llama-downloads/).
60
 
 
 
61
  Zephyr models were generated using Ultrachat, which relies on OpenAI. OpenAI does not permit the use of its models to train competitive models. This makes it unclear whether Zephyr may be used commercially. Buyers/users do so at their sole risk.
62
 
63
  ## Dataset
@@ -92,6 +97,7 @@ import json
92
  B_FUNC, E_FUNC = "<FUNCTIONS>", "</FUNCTIONS>\n\n"
93
  B_INST, E_INST = "[INST] ", " [/INST]" #Llama style
94
  # B_INST, E_INST = "\n### Instruction:\n", "\n### Response:\n" #DeepSeek Coder Style
 
95
 
96
  # Define the function metadata
97
  function_metadata = {
@@ -135,6 +141,7 @@ Example without a system message:
135
  B_FUNC, E_FUNC = "<FUNCTIONS>", "</FUNCTIONS>\n\n"
136
  B_INST, E_INST = "[INST] ", " [/INST]" #Llama style
137
  # B_INST, E_INST = "\n### Instruction:\n", "\n### Response:\n" #DeepSeek Coder Style
 
138
 
139
  functionList = {function_1_metadata}{function_2_metadata}...
140
  user_prompt = '...'
@@ -149,6 +156,7 @@ Example with a system message:
149
  B_FUNC, E_FUNC = "<FUNCTIONS>", "</FUNCTIONS>\n\n"
150
  B_INST, E_INST = "[INST] ", " [/INST]" #Llama style
151
  # B_INST, E_INST = "\n### Instruction:\n", "\n### Response:\n" #DeepSeek Coder Style
 
152
  B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
153
 
154
  # assuming functionList is defined as above
 
13
  - function calling
14
  - sharded
15
  ---
16
+ # Function Calling Llama 2 + Yi + Mistral + Zephyr + Deepseek Coder Models (version 2)
17
  - Function calling Llama extends the Hugging Face Llama 2 models with function calling capabilities.
18
  - The model responds with a structured JSON object containing the function name and arguments.
19
 
20
  **Recent Updates**
21
+ - Nov 15th 2023 -> added Yi 200k context models in 6B and 34B form.
22
  - Nov 8th 2023 -> added Zephyr beta, an improved version of Mistral 7B (achieved via DPO)
23
  - November 6th 2023 -> added Deepseek Coder 1.3B, 6.7B and 33B
24
  - October 11th 2023 -> added Mistral 7B with function calling
 
28
  1. Shortened syntax: Only function descriptions are needed for inference and no added instruction is required.
29
  2. Function descriptions are moved outside of the system prompt. This avoids the behaviour of function calling being affected by how the system prompt had been trained to influence the model.
30
 
31
+ Latest Models:
32
+ - Yi-6B-200k context with function calling ([Base Model](https://huggingface.co/Trelis/Yi-6B-200K-Llamafied-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Yi-6B-200K-Llamafied-function-calling-adapters-v2)) - Paid, [purchase here](https://buy.stripe.com/00gdRC7BX1Dt08gbJp)
33
+ - Yi-34B-200k context with function calling ([Base Model](https://huggingface.co/Trelis/Yi-34B-200K-Llamafied-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Yi-34B-200K-Llamafied-function-calling-adapters-v2)) - Paid, [purchase here](https://buy.stripe.com/8wM00M5tP81R6wE9Bi)
34
  - Deepseek-Coder-1.3B-Instruct with function calling ([Base Model](https://huggingface.co/Trelis/deepseek-coder-1.3b-instruct-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/deepseek-coder-1.3b-instruct-function-calling-adapters-v2)) - Paid, [purchase here](https://buy.stripe.com/9AQbJubSda9Z8EM00A)
35
  - Llama-7B-chat with function calling ([Base Model](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-adapters-v2)), ([GGUF - files are in the main branch of the base model]) - Free
36
  - zephyr-7b-beta with function calling ([Base Model](https://huggingface.co/Trelis/zephyr-7b-beta-function-calling-v2)), ([PEFT Adapters](https://huggingface.co/Trelis/zephyr-7b-beta-function-calling-adapters-v2)), ([GGUF - files are in the main branch of the base model]) - Paid, [purchase here](https://buy.stripe.com/14k00M4pLeqf9IQbJk)
 
61
 
62
  Use of all Llama models with function calling is further subject to terms in the [Meta license](https://ai.meta.com/resources/models-and-libraries/llama-downloads/).
63
 
64
+ Yi models are subject to the Yi license, which permits commercial use as of Nov 15th 2023.
65
+
66
  Zephyr models were generated using Ultrachat, which relies on OpenAI. OpenAI does not permit the use of its models to train competitive models. This makes it unclear whether Zephyr may be used commercially. Buyers/users do so at their sole risk.
67
 
68
  ## Dataset
 
97
  B_FUNC, E_FUNC = "<FUNCTIONS>", "</FUNCTIONS>\n\n"
98
  B_INST, E_INST = "[INST] ", " [/INST]" #Llama style
99
  # B_INST, E_INST = "\n### Instruction:\n", "\n### Response:\n" #DeepSeek Coder Style
100
+ # B_INST, E_INST = "Human: ", " Assistant: " #Yi Style
101
 
102
  # Define the function metadata
103
  function_metadata = {
 
141
  B_FUNC, E_FUNC = "<FUNCTIONS>", "</FUNCTIONS>\n\n"
142
  B_INST, E_INST = "[INST] ", " [/INST]" #Llama style
143
  # B_INST, E_INST = "\n### Instruction:\n", "\n### Response:\n" #DeepSeek Coder Style
144
+ # B_INST, E_INST = "Human: ", " Assistant: " #Yi Style
145
 
146
  functionList = {function_1_metadata}{function_2_metadata}...
147
  user_prompt = '...'
 
156
  B_FUNC, E_FUNC = "<FUNCTIONS>", "</FUNCTIONS>\n\n"
157
  B_INST, E_INST = "[INST] ", " [/INST]" #Llama style
158
  # B_INST, E_INST = "\n### Instruction:\n", "\n### Response:\n" #DeepSeek Coder Style
159
+ # B_INST, E_INST = "Human: ", " Assistant: " #Yi Style
160
  B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
161
 
162
  # assuming functionList is defined as above
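
The hunks above only show the delimiter definitions, so the following is a minimal sketch of how the three instruction styles (Llama, DeepSeek Coder, and the newly added Yi) might be selected and combined into one prompt. The delimiter strings are copied from the diff. The helper `build_prompt`, the `INSTRUCTION_STYLES` table, the example function metadata, and the exact ordering of the pieces are assumptions made for illustration, since the prompt assembly lines fall outside the visible hunks. The only ordering detail taken from the README is that function descriptions sit outside the system prompt.

```python
import json

# Delimiters copied from the README snippets in this diff.
B_FUNC, E_FUNC = "<FUNCTIONS>", "</FUNCTIONS>\n\n"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

# Instruction wrappers per model family, as listed in the diff.
# The dictionary itself is a hypothetical convenience, not part of the README.
INSTRUCTION_STYLES = {
    "llama": ("[INST] ", " [/INST]"),
    "deepseek": ("\n### Instruction:\n", "\n### Response:\n"),
    "yi": ("Human: ", " Assistant: "),
}


def build_prompt(function_list, user_prompt, style="llama", system_prompt=None):
    """Assemble a prompt string (assumed layout, see the note above)."""
    b_inst, e_inst = INSTRUCTION_STYLES[style]
    # The optional system message goes inside the instruction wrapper,
    # while the function descriptions stay outside of it.
    system_block = f"{B_SYS}{system_prompt.strip()}{E_SYS}" if system_prompt else ""
    return (
        f"{B_FUNC}{function_list.strip()}{E_FUNC}"
        f"{b_inst}{system_block}{user_prompt.strip()}{e_inst}"
    )


# Hypothetical function metadata, for illustration only.
example_function = {
    "function": "get_weather",
    "description": "Look up the current weather for a city",
    "arguments": [{"name": "city", "type": "string"}],
}
function_list = json.dumps(example_function, indent=4)

prompt = build_prompt(function_list, "What is the weather in Dublin?", style="yi")
print(prompt)
```

Keeping the wrappers in a lookup table means switching model families is a one-argument change instead of commenting lines in and out, which is how the README snippets handle it. Check the assembled string against the full README before relying on it, because small whitespace differences in a prompt template can affect model behaviour.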