brittlewis12 committed
Commit 38be3f9 • Parent(s): c181f75
Update README.md

README.md CHANGED:
```diff
@@ -17,6 +17,8 @@ quantized_by: brittlewis12
 
 # Phi 3 Mini 4K Instruct GGUF
 
+***Updated with Microsoft’s [latest model changes](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/commit/4f818b18e097c9ae8f93a29a57027cad54b75304) as of July 21, 2024***
+
 **Original model**: [Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct)
 
 **Model creator**: [Microsoft](https://huggingface.co/microsoft)
@@ -31,7 +33,7 @@ Learn more on Microsoft’s [Model page](https://azure.microsoft.com/en-us/blog/
 
 GGUF is a file format for representing AI models. It is the third version of the format,
 introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.
-Converted with llama.cpp build
+Converted with llama.cpp build 3432 (revision [45f2c19](https://github.com/ggerganov/llama.cpp/commit/45f2c19cc57286eead7b232ce8028273a817aa4d)),
 using [autogguf](https://github.com/brittlewis12/autogguf).
 
 ### Prompt template
@@ -63,6 +65,24 @@ using [autogguf](https://github.com/brittlewis12/autogguf).
 
 ## Original Model Evaluation
 
+Comparison of July update vs original April release:
+
+| Benchmarks | Original | June 2024 Update |
+|------------|----------|------------------|
+| Instruction Extra Hard | 5.7 | 6.0 |
+| Instruction Hard | 4.9 | 5.1 |
+| Instructions Challenge | 24.6 | 42.3 |
+| JSON Structure Output | 11.5 | 52.3 |
+| XML Structure Output | 14.4 | 49.8 |
+| GPQA | 23.7 | 30.6 |
+| MMLU | 68.8 | 70.9 |
+| **Average** | **21.9** | **36.7** |
+
+---
+
+### Original April release
+
 > As is now standard, we use few-shot prompts to evaluate the models, at temperature 0.
 > The prompts and number of shots are part of a Microsoft internal tool to evaluate language models, and in particular we did no optimization to the pipeline for Phi-3.
 > More specifically, we do not change prompts, pick different few-shot examples, change prompt format, or do any other form of optimization for the model.
```
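As a quick sanity check on the added comparison table, the **Average** row can be recomputed from the individual benchmark scores. The scores below are copied from the table; rounding each mean to one decimal place is an assumption, but it reproduces the published figures:

```python
# Benchmark scores copied from the README's comparison table:
# (Original April release, June 2024 update)
scores = {
    "Instruction Extra Hard": (5.7, 6.0),
    "Instruction Hard":       (4.9, 5.1),
    "Instructions Challenge": (24.6, 42.3),
    "JSON Structure Output":  (11.5, 52.3),
    "XML Structure Output":   (14.4, 49.8),
    "GPQA":                   (23.7, 30.6),
    "MMLU":                   (68.8, 70.9),
}

# Unweighted mean of each column, rounded to one decimal place
# (the rounding convention is assumed, not stated in the table).
original_avg = round(sum(o for o, _ in scores.values()) / len(scores), 1)
updated_avg = round(sum(u for _, u in scores.values()) / len(scores), 1)

print(original_avg, updated_avg)  # → 21.9 36.7, matching the table
```

Both recomputed means agree with the table's **Average** row (21.9 and 36.7), so the row is an unweighted mean over the seven listed benchmarks.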