Update README.md
README.md CHANGED

<!-- Provide a quick summary of what the model is/does. -->

This model aims to optimize QA and summarization tasks for the capstone project "Edge LLM - Reducing LLM Memory Footprint to < 2GB" at the University of Washington, sponsored by Amazon.
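
A minimal usage sketch for QA- and summarization-style prompts, assuming the standard `transformers` causal-LM API. The repo id below is the base model named under Model Details; substitute this model's own repo id.

```python
# Hedged example: load the model and run a summarization-style prompt.
# "model_id" points at the base model named in this card; replace it with
# this fine-tuned model's repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Fighoture/Llama-2-7b-chat-shortgpt-with-angular-25-percent"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Summarize in one sentence: Edge LLM shrinks a 7B chat model so it runs in under 2GB of memory."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```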
## Model Details
<!-- Provide a longer summary of what this model is. -->

The base model is Fighoture/Llama-2-7b-chat-shortgpt-with-angular-25-percent, which was pruned with ShortGPT: 25% of the layers (8 layers) were removed, selected by angular distance.
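
The sketch below is an assumption of how ShortGPT-style angular-distance selection works, not the authors' code: each layer is scored by the angular distance between its input and output hidden states on calibration text, and the lowest-scoring 25% of layers are marked for removal.

```python
# Hedged sketch, not the authors' code: ShortGPT-style pruning scores each
# transformer layer by the angular distance between its input and output
# hidden states, then removes the layers that change the representation
# least (25% of 32 layers = 8, as stated above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-2-7b-chat-hf"  # the unpruned ancestor implied by the card
tok = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto", output_hidden_states=True
)

# In practice this would run over a few thousand calibration tokens.
batch = tok("Short calibration text for illustration.", return_tensors="pt").to(model.device)
with torch.no_grad():
    hidden = model(**batch).hidden_states  # embeddings + one state per layer

def angular_distance(a, b):
    # mean over tokens of arccos(cosine similarity), normalized to [0, 1]
    cos = torch.nn.functional.cosine_similarity(a.float(), b.float(), dim=-1)
    return (torch.acos(cos.clamp(-1.0, 1.0)) / torch.pi).mean().item()

scores = [angular_distance(hidden[i], hidden[i + 1]) for i in range(len(hidden) - 1)]
num_prune = int(0.25 * len(scores))  # 8 layers for a 32-layer model
to_prune = sorted(range(len(scores)), key=scores.__getitem__)[:num_prune]
print("layers to prune (lowest angular distance):", sorted(to_prune))
```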

This model is fine-tuned on a combination of datasets (a sketch of the sampling step follows the list):

1. A randomly selected 5k sample of the ShareGPT dataset: https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/blob/main/ShareGPT_V3_unfiltered_cleaned_split_no_imsorry.json
2. A randomly selected 2.5k sample of the Open-Orca dataset, filtered by allenai/tulu-v2-sft-mixture
3. A randomly selected 2.5k sample of the GPT4-Alpaca dataset, filtered by allenai/tulu-v2-sft-mixture
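
One way the mix above might be assembled is sketched below. Only the ShareGPT JSON is linked in this card; the other two input file names are hypothetical stand-ins for the filtered Open-Orca and GPT4-Alpaca samples.

```python
# Hedged sketch of building the fine-tuning mix described above. The two
# "*_tulu_filtered.json" file names are hypothetical stand-ins for the
# Open-Orca and GPT4-Alpaca data after filtering against
# allenai/tulu-v2-sft-mixture.
import json
import random

random.seed(0)  # assumption: any fixed seed, for reproducibility

def take(records, k):
    # random sample without replacement, as "randomly selected" suggests
    return random.sample(records, k)

with open("ShareGPT_V3_unfiltered_cleaned_split_no_imsorry.json") as f:
    sharegpt = json.load(f)
with open("open_orca_tulu_filtered.json") as f:    # hypothetical filename
    open_orca = json.load(f)
with open("gpt4_alpaca_tulu_filtered.json") as f:  # hypothetical filename
    gpt4_alpaca = json.load(f)

mix = take(sharegpt, 5000) + take(open_orca, 2500) + take(gpt4_alpaca, 2500)
random.shuffle(mix)

with open("finetune_mix.json", "w") as f:
    json.dump(mix, f)
print(f"wrote {len(mix)} training examples")
```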
- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]