doberst committed on
Commit
b865dd0
1 Parent(s): d78042d

Update README.md

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -3,11 +3,11 @@ license: apache-2.0
  inference: false
  ---
 
- # bling-phi-3-gguf
+ # dragon-mistral-0.3-gguf
 
  <!-- Provide a quick summary of what the model is/does. -->
 
- bling-phi-3-gguf is part of the BLING ("Best Little Instruct No-GPU") model series, RAG-instruct trained for fact-based question-answering use cases on top of a Microsoft Phi-3 base model.
+ dragon-mistral-0.3-gguf is part of the DRAGON model series, RAG-instruct trained for fact-based question-answering use cases on top of a Mistral 7b v0.3 base model.
 
 
  ### Benchmark Tests
@@ -15,17 +15,17 @@ bling-phi-3-gguf is part of the BLING ("Best Little Instruct No-GPU") model seri
  Evaluated against the benchmark test: [RAG-Instruct-Benchmark-Tester](https://www.huggingface.co/datasets/llmware/rag_instruct_benchmark_tester)
  1 Test Run (with temperature = 0.0 and sample = False) with 1 point for correct answer, 0.5 point for partial correct or blank / NF, 0.0 points for incorrect, and -1 points for hallucinations.
 
- --**Accuracy Score**: **100.0** correct out of 100
+ --**Accuracy Score**: **99.5** correct out of 100
  --Not Found Classification: 95.0%
- --Boolean: 97.5%
- --Math/Logic: 80.0%
+ --Boolean: 82.5%
+ --Math/Logic: 67.5%
  --Complex Questions (1-5): 4 (Above Average - multiple-choice, causal)
  --Summarization Quality (1-5): 4 (Above Average)
  --Hallucinations: No hallucinations observed in test runs.
 
  For test run results (and good indicator of target use cases), please see the files ("core_rag_test" and "answer_sheet" in this repo).
 
- Note: compare results with [bling-phi-2](https://www.huggingface.co/llmware/bling-phi-2-v0), and [dragon-mistral-7b](https://www.huggingface.co/llmware/dragon-mistral-7b-v0).
+ Note: compare results with [dragon-mistral-7b](https://www.huggingface.co/llmware/dragon-mistral-7b-v0).
 
 
  ### Model Description
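
The scoring rule quoted in the README (1 point per correct answer, 0.5 for a partially correct or blank / NF answer, 0 for an incorrect answer, -1 for a hallucination) can be tallied roughly as in the Python sketch below. The label names, the `POINTS` table, and the `accuracy_score` helper are illustrative assumptions, not part of llmware or of the benchmark's tooling. The example breakdown (99 correct, 1 partial) is also hypothetical: the README reports only the 99.5 total, not how it was reached.

```python
# Point values as stated in the README's scoring description.
# The label strings themselves are hypothetical names chosen for this sketch.
POINTS = {
    "correct": 1.0,
    "partial": 0.5,
    "not_found": 0.5,       # blank / NF answers score the same as partial
    "incorrect": 0.0,
    "hallucination": -1.0,
}


def accuracy_score(labels):
    """Sum the rubric points over a list of per-question labels."""
    return sum(POINTS[label] for label in labels)


# Hypothetical 100-question run: 99 correct answers and 1 partial answer.
labels = ["correct"] * 99 + ["partial"]
print(accuracy_score(labels))  # 99.5
```

Under this rubric a single hallucination costs as much as two partial answers gain, so the same total can arise from different mixes of outcomes; the per-question files named in the README ("core_rag_test" and "answer_sheet") are the place to check the actual breakdown.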