Fighoture committed
Commit c7d2d36 · Parent: 6d45717

Update README.md

Files changed (1):
  1. README.md +10 -1
README.md CHANGED
@@ -7,6 +7,7 @@ tags: []
 
 <!-- Provide a quick summary of what the model is/does. -->
 
+This model aims to optimize QA & summarization tasks for the capstone project "Edge LLM - Reducing LLM Memory Footprint to < 2GB" at UW, sponsored by Amazon.
 
 
 ## Model Details
@@ -15,7 +16,15 @@ tags: []
 
 <!-- Provide a longer summary of what this model is. -->
 
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+Base model is Fighoture/Llama-2-7b-chat-shortgpt-with-angular-25-percent, which has been pruned with ShortGPT by 25% (8 layers) according to angular distance.
+
+This model is fine-tuned on a dataset combination including:
+
+1. A randomly selected 5k sample of the ShareGPT dataset: https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/blob/main/ShareGPT_V3_unfiltered_cleaned_split_no_imsorry.json
+
+2. A randomly selected 2.5k sample of the Open-Orca dataset, filtered by allenai/tulu-v2-sft-mixture
+
+3. A randomly selected 2.5k sample of the GPT4-Alpaca dataset, filtered by allenai/tulu-v2-sft-mixture
 
 - **Developed by:** [More Information Needed]
 - **Funded by [optional]:** [More Information Needed]
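
The updated card does not yet include usage code. As a minimal sketch (an assumption, not part of the commit), the pruned base checkpoint named in the diff could be loaded through the standard transformers API, assuming the repository hosts transformers-format weights; the fine-tuned model described here would load the same way via its own Hub ID.

```python
# Minimal sketch (assumption, not from the model card): load the pruned base
# checkpoint referenced in the diff and run a short summarization prompt.
# device_map="auto" requires the accelerate package; drop it to load on CPU.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Fighoture/Llama-2-7b-chat-shortgpt-with-angular-25-percent"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize in one sentence: The Edge LLM capstone project reduces LLM memory footprint to under 2GB."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```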