Malmuk1 committed on
Commit cdb2339
1 Parent(s): 8b23958

Upload 10 files

LICENSE.txt ADDED
@@ -0,0 +1,50 @@
+ This project utilizes materials from Llama 2, provided by Meta Platforms, Inc. The Llama 2 materials are licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.
+
+ A copy of the license agreement can be found at [Link to the License, e.g. https://github.com/facebookresearch/llama/blob/main/LICENSE].
+
+ All applicable terms and conditions outlined in the LLAMA 2 Community License Agreement have been adhered to, including but not limited to the retention of the attribution notice in all redistributed copies of the Llama Materials as follows:
+
+ "Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved."
+
+ This project complies with all applicable laws and regulations and adheres to the Acceptable Use Policy for the Llama Materials.
+
+
+ AceGPT COMMUNITY LICENSE AGREEMENT
+ AceGPT Version Release Date: Sep 23, 2023
+
+
+ "Agreement" means the terms and conditions for use, reproduction, distribution and modification of the AceGPT Materials set forth herein.
+
+ "Licensee" or "you" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity's behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf.
+
+ "AceGPT" means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by SRIBD, CUHK (Shenzhen) and KAUST at https://github.com/FreedomIntelligence/AceGPT and https://huggingface.co/FreedomIntelligence/.
+
+ "AceGPT Materials" means, collectively, our proprietary AceGPT and
+ Documentation (and any portion thereof) made available under this Agreement.
+
+ By clicking "I Accept" below or by using or distributing any portion or element of our Materials, you agree to be bound by this Agreement.
+
+ 1. License Rights and Redistribution.
+
+ a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under our intellectual property or other rights owned by us embodied in the AceGPT Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the AceGPT Materials.
+
+ b. Redistribution and Use.
+
+ i. If you distribute or make the AceGPT Materials, or any derivative works thereof, available to a third party, you shall provide a copy of this Agreement to such third party.
+ ii. If you receive AceGPT Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you.
+
+ iii. You must retain in all copies of the AceGPT Materials that you distribute the following attribution notice within a "Notice" text file distributed as a part of such copies: "AceGPT is licensed under the AceGPT Community License"
+
+ iv. Your use of the AceGPT Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the AceGPT Materials, which is hereby incorporated by reference into this Agreement.
+
+ v. You will not use the AceGPT Materials or any output or results of the AceGPT Materials to improve any other large language model (excluding AceGPT or derivative works thereof).
+
+ 2. Additional Commercial Terms. If, on the AceGPT version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee's affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from SRIBD, which SRIBD may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until SRIBD otherwise expressly grants you such rights.
+
+ 3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE ACEGPT MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE ACEGPT MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE ACEGPT MATERIALS AND ANY OUTPUT AND RESULTS.
+
+ 4. Limitation of Liability. IN NO EVENT WILL SRIBD OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF SRIBD OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING.
+
+ 5. Intellectual Property.
+
+ a. No trademark licenses are granted under this Agreement, and in connection with the AceGPT Materials, neither SRIBD nor Licensee may use any name or m
README.md ADDED
@@ -0,0 +1,73 @@
+ ---
+ license: apache-2.0
+ language:
+ - ar
+ ---
+ # AceGPT
+ AceGPT is a collection of fully fine-tuned generative text models based on Llama 2, specialized for the Arabic language. This is the repository for the 7B-chat model.
+
+ ---
+ ## Model Details
+ We have released the AceGPT family of large language models, a collection of fully fine-tuned generative text models based on Llama 2, ranging from 7B to 13B parameters. The family includes two main categories: AceGPT and AceGPT-chat, where AceGPT-chat is an optimized version designed specifically for dialogue applications. Our models outperform all currently available open-source Arabic dialogue models on multiple benchmarks, and in our human evaluations they reach satisfaction levels comparable to some closed-source models, such as ChatGPT, in Arabic.
+ ## Model Developers
+ We are from the School of Data Science, the Chinese University of Hong Kong, Shenzhen (CUHKSZ), the Shenzhen Research Institute of Big Data (SRIBD), and the King Abdullah University of Science and Technology (KAUST).
+ ## Variations
+ The AceGPT family comes in two parameter sizes, 7B and 13B; each size has a base variant and a -chat variant.
+ ## Input
+ Models input text only.
+ ## Output
+ Models output text only.
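A minimal loading sketch (not part of the original card): using the model with Hugging Face `transformers`. The repository id and the plain-prompt format are assumptions; the exact prompt template used during fine-tuning is not specified in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FreedomIntelligence/AceGPT-7B-chat"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Plain prompt for illustration; the chat fine-tuning may expect a specific template.
prompt = "ما هي عاصمة المملكة العربية السعودية؟"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```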
+ ## Model Evaluation Results
+
+ Experiments on Arabic Vicuna-80 and Arabic AlpacaEval. Numbers are the average performance ratio relative to ChatGPT over three runs. We do not report results for the raw Llama 2 models since they cannot properly generate Arabic text.
+ | Model | Arabic Vicuna-80 | Arabic AlpacaEval |
+ |------------------------------|--------------------|---------------------|
+ | Phoenix (Chen et al., 2023a) | 71.92% ± 0.2% | 65.62% ± 0.3% |
+ | Phoenix-multiple-langs (Chen et al., 2023b) | 71.67% ± 0.7% | 65.36% ± 0.1% |
+ | Jais-13B-chat (Sengupta et al., 2023) | 75.40% ± 1.6% | 74.95% ± 0.2% |
+ | AceGPT-7B-chat | 94.82% ± 0.2% | 93.81% ± 0.1% |
+ | AceGPT-13B-chat | 100.88% ± 0.4% | 97.95% ± 0.1% |
+
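Reading note (an assumption, not stated in the card): the "performance ratio" appears to be the evaluator's score for the model's answers divided by its score for ChatGPT's answers on the same questions,

$$ \text{ratio} = \frac{\text{score(model)}}{\text{score(ChatGPT)}} \times 100\%, $$

so a value above 100% (as for AceGPT-13B-chat on Arabic Vicuna-80) means the model's responses were rated higher than ChatGPT's.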
+ ## Samples
+ #### Sample 1 (Alpaca)
+ * question:
+
+ ما هي أسماء بعض الممثلين المشهورين الذين بدأوا مسيراتهم المهنية على برودواي؟
+ * answer:
+ 1. ليوناردو دي كابريو: بدأ مسيرته الفنية في برودواي على الرغم من أنه اشتهر بشكل أكبر من خلال أدواره في الأفلام مثل "The Wolf of Wall Street" و "Titanic".
+ 2. فرانسيس مكدورماند: بدأت مسيرتها الفنية في برودواي وأصبحت ممثلة مشهورة في أوائل القرن العشرين بفضل أدوارها في العديد من المسرحيات الكلاسيكية.
+ 3. جودي دنش: بدأت مسيرتها الفنية في برودواي وأصبحت ممثلة مشهورة بفضل أدائها المتميز في العديد من المسرحيات الكلاسيكية.
+ 4. بيتر أوستروم: بدأ مسيرته الفنية في برودواي وأصبح ممثلًا مشهورًا بفضل أدائه المتميز في العديد من المسرحيات الكلاسيكية.
+ 5. مارك رايلانس: بدأ مسيرته الفنية في برودواي وأصبح ممثلًا مشهورًا بفضل أدائه المتميز في العديد من المسرحيات الكلاسيكية.
+ 6. جيمس كان: بدأ مسيرته الفنية في برودواي وأصبح ممثلًا مشهورًا بفضل أدائه المتميز في العديد من المسرحيات الكلاسيكية.
+ 7. إليزابيث تايلور: بدأت مسيرتها الفنية في برودواي وأصبحت ممثلة مشهورة في أوائل القرن العشرين بفضل أدوارها في العديد من المسرحيات الكلاسيكية.
+ 8. روبرت ويلسون: بدأ مسيرته الفنية في برودواي وأصبح ممثلًا مشهورًا بفضل أدائه المتميز في العديد من المسرحيات الكلاسيكية.
+ 9. جودي غارلاند: بدأت مسيرتها الفنية في برودواي وأصبحت ممثلة مشهورة في أوائل القرن العشرين بفضل أدوارها في العديد من المسرحيات الكلاسيكية.
+ 10. ماثيو برودريك: بدأ مسيرته الفنية في برودواي وأصبح ممثلًا مشهورًا بفضل أدائه المتميز في العديد من المسرحيات الكلاسيكية.
+
+ ---
+ #### Sample 2 (Vicuna)
+ * question:
+
+ كيف يمكنني تحسين مهارات إدارة الوقت الخاصة بي؟
+ * answer:
+ 1. تحديد الأولويات: قم بتحديد الأعمال الأكثر أهمية والتي تتطلب اهتمامك الأكبر.
+ 2. تنظيم الجدول الزمني: استخدم جدول زمني أو تقويم لتنظيم وقتك وتحديد الأوقات المحددة للقيام بمهام معينة.
+ 3. تجنب التسويف: تجنب تأجيل الأعمال التي يمكنك القيام بها الآن.
+ 4. استخدام تقنيات الإدارة الزمنية: استخدم تقنيات مثل تقنية بومودورو، حيث تعمل لمدة 25 دقيقة ثم تأخذ استراحة لمدة 5 دقائق.
+ 5. تعلم قول "لا": تعلم كيفية رفض الطلبات التي لا تتناسب مع أهدافك وجدولك الزمني.
+ 6. استخدم التكنولوجيا: استخدم التطبيقات والأدوات التكنولوجية لتتبع الوقت وتنظيم المهام.
+ 7. تحديد الوقت للراحة: تأكد من تحديد الوقت للراحة والاسترخاء لتجنب الإرهاق والشعور بالحرق.
+ 8. تحسين المهارات التنظيمية: تعلم كيفية تنظيم المهام والأولويات بشكل فعال.
+ 9. استخدم التقنيات البصرية: استخدم الرسوم البيانية والرسوم البيانية لتتبع الوقت والتقدم.
+ 10. تحسين الصحة العامة: تأكد من الحصول على قسط كاف من النوم والتغذية الجيدة والتمارين الرياضية لتحسين قدرتك على التركيز والإنتاجية.
+ 11. استخدم تقنية الطماطم: هذه التقنية تتطلب منك العمل لمدة 25 دقيقة دون انقطاع، ثم تأخذ استراحة لمدة 5 دقائق.
+ 12. تعلم كيفية قول "لا": قدرتك على قول "لا" للأشياء التي لا تخدم أهدافك يمكن أن تحسن بشكل كبير إدارة الوقت الخاصة بك.
+ 13. استخدم تقنية التركيز العميق: تقنية تستخدم التقنيات مثل التأمل والتركيز العقلي لزيادة القدرة على التركيز وتحسين الإنتاجية.
+ 14. تعلم كيفية تقسيم المهام: تعلم كيفية تقسيم المهام الكبيرة إلى مهام أصغر وأكثر إدارة.
+ 15. استخدم تقنية الطماطم: تقنية تتطلب منك العمل لمدة 25 دقيقة دون انقطاع، ثم تأخذ استراحة لمدة 5 دقائق.
+ 16. تعلم كيفية تحديد الأولويات: تعلم كيفية تحديد الأولويات والتركيز على المهام الأكثر أهمية أولاً.
+ 17. استخدم تقنية الترتيب الثلاثي: تقنية تتطلب منك ترتيب المهام حسب الأهمية والعاجلة، ثم تعمل على المهمة الأعلى أولاً.
+ 18. تعلم كيفية تحسين التركيز: تعلم
+ You can get more details at https://github.com/FreedomIntelligence/AceGPT/tree/main
config.json ADDED
@@ -0,0 +1,30 @@
+ {
+ "_name_or_path": "/ibex/user/xuj0h/LLM_ckpt/llama-pretrained-RLHF-v1.0",
+ "architectures": [
+ "LlamaForCausalLM"
+ ],
+ "bos_token_id": 1,
+ "dropout": 0.0,
+ "end_token_id": 2,
+ "eos_token_id": 2,
+ "hidden_act": "silu",
+ "hidden_size": 4096,
+ "initializer_range": 0.02,
+ "intermediate_size": 11008,
+ "max_length": 4096,
+ "max_position_embeddings": 2048,
+ "model_type": "llama",
+ "num_attention_heads": 32,
+ "num_hidden_layers": 32,
+ "num_key_value_heads": 32,
+ "pad_token_id": 2,
+ "pretraining_tp": 1,
+ "rms_norm_eps": 1e-05,
+ "rope_scaling": null,
+ "rope_theta": 10000.0,
+ "tie_word_embeddings": false,
+ "torch_dtype": "float32",
+ "transformers_version": "4.33.2",
+ "use_cache": true,
+ "vocab_size": 32000
+ }
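As a sanity check (an editorial note, not part of the upload), the parameter count implied by this configuration follows directly from the fields above; in float32 it matches the 26,953,662,464-byte total recorded in pytorch_model.bin.index.json further down.

```python
# Back-of-the-envelope parameter count from config.json (a sketch, not an official figure).
# Llama architecture: untied embeddings + 32 decoder layers.
vocab, hidden, inter, layers = 32000, 4096, 11008, 32

embed = vocab * hidden                      # input embeddings
lm_head = vocab * hidden                    # output head (tie_word_embeddings = false)
attn = 4 * hidden * hidden                  # q, k, v, o projections per layer
mlp = 3 * hidden * inter                    # gate, up, down projections per layer
norms = 2 * hidden                          # input + post-attention RMSNorm per layer

total = embed + lm_head + layers * (attn + mlp + norms) + hidden  # + final norm
print(total)          # 6738415616  (~6.7B parameters)
print(total * 4)      # 26953662464 bytes in float32
```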
convert.py ADDED
@@ -0,0 +1,293 @@
+ from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Tokenizer
+ import argparse, os, shutil
+ import sys
+ import json
+ from conversion.tokenize import tokenize
+ from conversion.quantize import embeddings, measure_quant, quant
+ from conversion.optimize import optimize
+ from conversion.compile import compile_model
+ from conversion.qparams import qparams_headoptions
+
+ # import tracemalloc
+ # tracemalloc.start()
+
+ parser = argparse.ArgumentParser(description = "Convert model to ExLlamaV2")
+ parser.add_argument("-i", "--in_dir", type = str, help = "Input directory", default = "")
+ parser.add_argument("-o", "--out_dir", type = str, help = "Output (working) directory")
+ parser.add_argument("-nr", "--no_resume", action = "store_true", help = "Do not resume an interrupted job (deletes all files in the output directory)")
+ parser.add_argument("-cf", "--compile_full", type = str, help = "Output folder for compiled model with all config/tokenizer files")
+ parser.add_argument("-om", "--output_measurement", type = str, help = "Only perform measurement pass, then save measurement to the specified file")
+ parser.add_argument("-c", "--cal_dataset", type = str, help = "Calibration dataset (.parquet file)", default = "")
+ parser.add_argument("-r", "--dataset_rows", type = int, default = 100, help = "Number of rows to apply from dataset")
+ parser.add_argument("-mr", "--measurement_rows", type = int, default = 16, help = "Number of rows to apply from dataset when measuring")
+ parser.add_argument("-gr", "--gpu_rows", type = int, default = 0, help = "Threshold for paging hidden state to CPU")
+ parser.add_argument("-l", "--length", type = int, default = 2048, help = "Max no. tokens per sample")
+ parser.add_argument("-ml", "--measurement_length", type = int, default = 2048, help = "Max no. tokens per sample when measuring")
+ parser.add_argument("-b", "--bits", type = float, default = 4.125, help = "Target bits per weight")
+ parser.add_argument("-hb", "--head_bits", type = int, default = 6, help = "Target bits per weight (head layer)")
+ parser.add_argument("-m", "--measurement", type = str, help = "Reuse previous measurement")
+ parser.add_argument("-ss", "--shard_size", type = float, help = "Max shard size in MB (default: 8192)", default = 8192)
+
+ args = parser.parse_args()
+
+ # Check some args
+
+ if not args.in_dir:
+     print(" ## Please specify input model directory (-i, --in_dir)")
+     sys.exit()
+
+ if not args.out_dir:
+     print(" ## Please specify output/working directory (-o, --out_dir)")
+     sys.exit()
+
+ if not args.cal_dataset:
+     print(" ## Please specify dataset Parquet file (-c, --cal_dataset)")
+     sys.exit()
+
+ if args.length > 2048 or args.measurement_length > 2048:
+     print(" !! Warning: calibration rows > 2048 tokens may result in excessive VRAM use")
+
+ if args.head_bits not in qparams_headoptions:
+     print(f" ## Error: {args.head_bits} is not a supported option for head layer bitrate")
+     sys.exit()
+
+ if args.bits < 2 or args.bits > 8:
+     print(f" !! Warning: target bitrate {args.bits} will likely not be attainable")
+
+ if args.output_measurement is not None and args.compile_full is not None:
+     print(" ## Conflicting options: --output_measurement and --compile_full")
+     sys.exit()
+
+ # Arguments
+
+ in_dir = None if args.in_dir == "" else os.path.abspath(args.in_dir)
+ out_dir = os.path.abspath(args.out_dir)
+ cal_dataset = None if args.cal_dataset == "" else os.path.abspath(args.cal_dataset)
+ dataset_rows = args.dataset_rows
+ measurement_rows = args.measurement_rows
+ gpu_rows = args.gpu_rows
+ length = args.length
+ measurement_length = args.measurement_length
+ bits = args.bits
+ head_bits = args.head_bits
+ reuse_measurement = args.measurement
+ shard_size = args.shard_size if args.shard_size > 0 else 1024 ** 3  # 1 PB = unlimited
+ no_resume = args.no_resume
+ output_measurement = args.output_measurement
+ if output_measurement is not None:
+     if os.path.isdir(output_measurement):
+         output_measurement = os.path.join(output_measurement, "measurement.json")
+
+ compile_full = args.compile_full
+
+ if not os.path.exists(out_dir):
+     print(f" ## Error: Directory not found: {out_dir}")
+     sys.exit()
+
+ # Create config
+
+ config = ExLlamaV2Config()
+ config.model_dir = in_dir
+ config.qkv_embed = False
+ config.prepare()
+
+ # Tokenizer
+
+ tokenizer = ExLlamaV2Tokenizer(config)
+
+ # Job file
+
+ job_file = os.path.join(out_dir, "job.json")
+
+ # Create new job
+
+ def save_job():
+     global job_file, job
+     with open(job_file, "w") as f:
+         f.write(json.dumps(job, indent = 4))
+
+ if no_resume or not os.path.exists(job_file):
+
+     print(f" -- Beginning new job")
+
+     if len(os.listdir(out_dir)) != 0:
+         print(f" !! Warning: Output directory is not empty: {out_dir}")
+         if no_resume:
+             print(f" !! Cleaning output directory: {out_dir}")
+             for filename in os.listdir(out_dir):
+                 file_path = os.path.join(out_dir, filename)
+                 if os.path.isfile(file_path):
+                     os.unlink(file_path)
+                 elif os.path.isdir(file_path):
+                     shutil.rmtree(file_path)
+
+     if in_dir is None:
+         print(f" ## Error: No input directory specified")
+         sys.exit()
+
+     if cal_dataset is None:
+         print(f" ## Error: No calibration dataset specified")
+         sys.exit()
+
+     job = { "in_dir": in_dir,
+             "out_dir": out_dir,
+             "cal_dataset": cal_dataset,
+             "dataset_rows": dataset_rows,
+             "measurement_rows": measurement_rows,
+             "gpu_rows": gpu_rows,
+             "length": length,
+             "measurement_length": measurement_length,
+             "bits": bits,
+             "head_bits": head_bits,
+             "progress": "begin",
+             "shard_size": shard_size,
+             "output_measurement": output_measurement,
+             "compile_full": compile_full
+             }
+
+     if reuse_measurement is not None:
+
+         with open(reuse_measurement, "r") as f:
+
+             imp_measurement = json.load(f)
+             job["measurement"] = imp_measurement["measurement"]
+             job["last_module_idx"] = imp_measurement["last_module_idx"]
+             job["base_perplexity"] = imp_measurement["base_perplexity"]
+             job["reuse_measurement"] = reuse_measurement
+
+     save_job()
+
+ # Resume existing job
+
+ else:
+
+     print(f" -- Resuming job")
+     print(f" !! Note: Overriding options with settings from existing job")
+
+     with open(job_file, "r") as f:
+         job = json.load(f)
+
+     if "invalid" in job:
+         print(" ** Error: Corrupted job")
+         sys.exit()
+
+     if "shard_size" not in job: job["shard_size"] = shard_size
+     if "output_measurement" not in job: job["output_measurement"] = output_measurement
+     if "compile_full" not in job: job["compile_full"] = compile_full
+
+     job["out_dir"] = out_dir
+
+ # Feedback
+
+ print(f" -- Input: {job['in_dir']}")
+ print(f" -- Output: {out_dir}")
+ print(f" -- Calibration dataset: {job['cal_dataset']}, {job['dataset_rows']} / {job['measurement_rows']} ({job['gpu_rows']}) rows, {job['length']} tokens per sample")
+
+ if job["output_measurement"] is None:
+     print(f" -- Target bits per weight: {job['bits']} (decoder), {job['head_bits']} (head)")
+     print(f" -- Max shard size: {job['shard_size']} MB")
+ else:
+     print(f" -- Measurement will be saved to {job['output_measurement']}")
+     print(f" !! Conversion script will end after measurement pass")
+
+ # Make sure subfolders exist
+
+ if job["compile_full"] is not None:
+     print(f" -- Full model will be compiled to: {job['compile_full']}")
+     if os.path.exists(job["compile_full"]):
+         if not os.path.isdir(job["compile_full"]):
+             print(f" ## Error: Output path {job['compile_full']} exists but is not a directory")
+             sys.exit()
+         if len(os.listdir(job["compile_full"])) > 0:
+             print(f" !! Warning: Output path {job['compile_full']} exists but is not empty")
+
+ out_tensor_dir = os.path.join(job["out_dir"], "out_tensor")
+ if not os.path.exists(out_tensor_dir):
+     os.makedirs(out_tensor_dir)
+
+ # Allocate space for hidden state
+
+ max_l = max(job["measurement_length"], job["length"])
+ config.max_input_len = max_l
+ config.max_attention_size = max_l ** 2
+
+ # Create model without loading weights
+
+ model = ExLlamaV2(config)
+ model.load(lazy = True)
+
+ # Do the things
+
+ while True:
+
+     progress = job["progress"]
+
+     if progress == "begin":
+
+         if "reuse_measurement" in job:
+
+             print(f" -- Reusing measurement: {job['reuse_measurement']}")
+             job["progress"] = "optimize"
+             save_job()
+
+         else:
+
+             print(f" -- Tokenizing samples (measurement)...")
+             tokenize(job, save_job, tokenizer, measure = True)
+             job["progress"] = "initial_embeddings"
+             save_job()
+
+     if progress == "initial_embeddings":
+
+         print(f" -- Token embeddings (measurement)...")
+         embeddings(job, save_job, model)
+         job["progress"] = "measure_quant"
+         save_job()
+
+     if progress == "measure_quant":
+
+         print(f" -- Measuring quantization impact...")
+         measure_quant(job, save_job, model)
+         if job["output_measurement"] is None:
+             job["progress"] = "optimize"
+         else:
+             job["progress"] = "finished"
+         save_job()
+
+     if progress == "optimize":
+
+         print(f" -- Optimizing...")
+         optimize(job, save_job)
+         job["progress"] = "tokens_cal"
+         save_job()
+
+     if progress == "tokens_cal":
+
+         print(f" -- Tokenizing samples...")
+         tokenize(job, save_job, tokenizer)
+         job["progress"] = "embeddings"
+         save_job()
+
+     if progress == "embeddings":
+         print(f" -- Token embeddings again...")
+         embeddings(job, save_job, model)
+         job["progress"] = "quant"
+         save_job()
+
+     if progress == "quant":
+
+         print(f" -- Quantizing...")
+         quant(job, save_job, model)
+         job["progress"] = "compile"
+         save_job()
+
+     if progress == "compile":
+
+         print(f" -- Compiling output file...")
+         compile_model(job, save_job, model)
+         job["progress"] = "finished"
+         save_job()
+
+     if progress == "finished": break
+
+ print(f" -- Finished")
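For orientation (an editorial note, not one of the uploaded files): convert.py is the ExLlamaV2 conversion and quantization script (per its imports and the argparse description "Convert model to ExLlamaV2"), presumably included so the full-precision checkpoint can be re-quantized locally. With the flags defined above, an invocation would look something like `python convert.py -i <hf_model_dir> -o <working_dir> -c <calibration.parquet> -b 4.125 -cf <compiled_output_dir>`; the paths are illustrative and 4.125 is simply the script's default bitrate, not a value taken from this repository.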
output.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:91c6074aeefde1b028e4c9c3b1c1873ae243adeabcf040d72febe3320c7c273d
+ size 6867124388
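What is stored in the repository here is a Git LFS pointer file rather than the tensor data itself: the `oid` is the SHA-256 of the actual output.safetensors blob and `size` is its length in bytes. A quick integrity check of a downloaded copy could look like the sketch below (the local filename is assumed).

```python
import hashlib

# Recompute the SHA-256 of the downloaded file and compare it with the LFS oid above.
h = hashlib.sha256()
with open("output.safetensors", "rb") as f:           # assumed local path
    for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
        h.update(chunk)

print(h.hexdigest())  # expect 91c6074aeefde1b028e4c9c3b1c1873ae243adeabcf040d72febe3320c7c273d
```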
pytorch_model.bin.index.json ADDED
@@ -0,0 +1,298 @@
1
+ {
2
+ "metadata": {
3
+ "total_size": 26953662464
4
+ },
5
+ "weight_map": {
6
+ "lm_head.weight": "pytorch_model-00003-of-00003.bin",
7
+ "model.embed_tokens.weight": "pytorch_model-00001-of-00003.bin",
8
+ "model.layers.0.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
9
+ "model.layers.0.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
10
+ "model.layers.0.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
11
+ "model.layers.0.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
12
+ "model.layers.0.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
13
+ "model.layers.0.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
14
+ "model.layers.0.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
15
+ "model.layers.0.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
16
+ "model.layers.0.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
17
+ "model.layers.1.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
18
+ "model.layers.1.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
19
+ "model.layers.1.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
20
+ "model.layers.1.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
21
+ "model.layers.1.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
22
+ "model.layers.1.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
23
+ "model.layers.1.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
24
+ "model.layers.1.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
25
+ "model.layers.1.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
26
+ "model.layers.10.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
27
+ "model.layers.10.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
28
+ "model.layers.10.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
29
+ "model.layers.10.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
30
+ "model.layers.10.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
31
+ "model.layers.10.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
32
+ "model.layers.10.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
33
+ "model.layers.10.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
34
+ "model.layers.10.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
35
+ "model.layers.11.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
36
+ "model.layers.11.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
37
+ "model.layers.11.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
38
+ "model.layers.11.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
39
+ "model.layers.11.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
40
+ "model.layers.11.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
41
+ "model.layers.11.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
42
+ "model.layers.11.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
43
+ "model.layers.11.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
44
+ "model.layers.12.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
45
+ "model.layers.12.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
46
+ "model.layers.12.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
47
+ "model.layers.12.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
48
+ "model.layers.12.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
49
+ "model.layers.12.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
50
+ "model.layers.12.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
51
+ "model.layers.12.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
52
+ "model.layers.12.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
53
+ "model.layers.13.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
54
+ "model.layers.13.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
55
+ "model.layers.13.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
56
+ "model.layers.13.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
57
+ "model.layers.13.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
58
+ "model.layers.13.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
59
+ "model.layers.13.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
60
+ "model.layers.13.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
61
+ "model.layers.13.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
62
+ "model.layers.14.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
63
+ "model.layers.14.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
64
+ "model.layers.14.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
65
+ "model.layers.14.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
66
+ "model.layers.14.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
67
+ "model.layers.14.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
68
+ "model.layers.14.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
69
+ "model.layers.14.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
70
+ "model.layers.14.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
71
+ "model.layers.15.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
72
+ "model.layers.15.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
73
+ "model.layers.15.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
74
+ "model.layers.15.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
75
+ "model.layers.15.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
76
+ "model.layers.15.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
77
+ "model.layers.15.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
78
+ "model.layers.15.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
79
+ "model.layers.15.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
80
+ "model.layers.16.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
81
+ "model.layers.16.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
82
+ "model.layers.16.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
83
+ "model.layers.16.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
84
+ "model.layers.16.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
85
+ "model.layers.16.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
86
+ "model.layers.16.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
87
+ "model.layers.16.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
88
+ "model.layers.16.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
89
+ "model.layers.17.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
90
+ "model.layers.17.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
91
+ "model.layers.17.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
92
+ "model.layers.17.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
93
+ "model.layers.17.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
94
+ "model.layers.17.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
95
+ "model.layers.17.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
96
+ "model.layers.17.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
97
+ "model.layers.17.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
98
+ "model.layers.18.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
99
+ "model.layers.18.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
100
+ "model.layers.18.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
101
+ "model.layers.18.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
102
+ "model.layers.18.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
103
+ "model.layers.18.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
104
+ "model.layers.18.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
105
+ "model.layers.18.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
106
+ "model.layers.18.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
107
+ "model.layers.19.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
108
+ "model.layers.19.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
109
+ "model.layers.19.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
110
+ "model.layers.19.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
111
+ "model.layers.19.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
112
+ "model.layers.19.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
113
+ "model.layers.19.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
114
+ "model.layers.19.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
115
+ "model.layers.19.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
116
+ "model.layers.2.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
117
+ "model.layers.2.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
118
+ "model.layers.2.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
119
+ "model.layers.2.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
120
+ "model.layers.2.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
121
+ "model.layers.2.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
122
+ "model.layers.2.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
123
+ "model.layers.2.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
124
+ "model.layers.2.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
125
+ "model.layers.20.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
126
+ "model.layers.20.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
127
+ "model.layers.20.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
128
+ "model.layers.20.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
129
+ "model.layers.20.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
130
+ "model.layers.20.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
131
+ "model.layers.20.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
132
+ "model.layers.20.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
133
+ "model.layers.20.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
134
+ "model.layers.21.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
135
+ "model.layers.21.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
136
+ "model.layers.21.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
137
+ "model.layers.21.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
138
+ "model.layers.21.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
139
+ "model.layers.21.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
140
+ "model.layers.21.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
141
+ "model.layers.21.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
142
+ "model.layers.21.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
143
+ "model.layers.22.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
144
+ "model.layers.22.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
145
+ "model.layers.22.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
146
+ "model.layers.22.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
147
+ "model.layers.22.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
148
+ "model.layers.22.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
149
+ "model.layers.22.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
150
+ "model.layers.22.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
151
+ "model.layers.22.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
152
+ "model.layers.23.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
153
+ "model.layers.23.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
154
+ "model.layers.23.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
155
+ "model.layers.23.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
156
+ "model.layers.23.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
157
+ "model.layers.23.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
158
+ "model.layers.23.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
159
+ "model.layers.23.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
160
+ "model.layers.23.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
161
+ "model.layers.24.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
162
+ "model.layers.24.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
163
+ "model.layers.24.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
164
+ "model.layers.24.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
165
+ "model.layers.24.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
166
+ "model.layers.24.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
167
+ "model.layers.24.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
168
+ "model.layers.24.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
169
+ "model.layers.24.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
170
+ "model.layers.25.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
171
+ "model.layers.25.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
172
+ "model.layers.25.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
173
+ "model.layers.25.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
174
+ "model.layers.25.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
175
+ "model.layers.25.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
176
+ "model.layers.25.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
177
+ "model.layers.25.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
178
+ "model.layers.25.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
179
+ "model.layers.26.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
180
+ "model.layers.26.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
181
+ "model.layers.26.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
182
+ "model.layers.26.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
183
+ "model.layers.26.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
184
+ "model.layers.26.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
185
+ "model.layers.26.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
186
+ "model.layers.26.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
187
+ "model.layers.26.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
188
+ "model.layers.27.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
189
+ "model.layers.27.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
190
+ "model.layers.27.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
191
+ "model.layers.27.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
192
+ "model.layers.27.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
193
+ "model.layers.27.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
194
+ "model.layers.27.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
195
+ "model.layers.27.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
196
+ "model.layers.27.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
197
+ "model.layers.28.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
198
+ "model.layers.28.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
199
+ "model.layers.28.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
200
+ "model.layers.28.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
201
+ "model.layers.28.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
202
+ "model.layers.28.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
203
+ "model.layers.28.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
204
+ "model.layers.28.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
205
+ "model.layers.28.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
206
+ "model.layers.29.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
207
+ "model.layers.29.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
208
+ "model.layers.29.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
209
+ "model.layers.29.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
210
+ "model.layers.29.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
211
+ "model.layers.29.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
212
+ "model.layers.29.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
213
+ "model.layers.29.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
214
+ "model.layers.29.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
215
+ "model.layers.3.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
216
+ "model.layers.3.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
217
+ "model.layers.3.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
218
+ "model.layers.3.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
219
+ "model.layers.3.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
220
+ "model.layers.3.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
221
+ "model.layers.3.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
222
+ "model.layers.3.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
223
+ "model.layers.3.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
224
+ "model.layers.30.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
225
+ "model.layers.30.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
226
+ "model.layers.30.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
227
+ "model.layers.30.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
228
+ "model.layers.30.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
229
+ "model.layers.30.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
230
+ "model.layers.30.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
231
+ "model.layers.30.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
232
+ "model.layers.30.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
233
+ "model.layers.31.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
234
+ "model.layers.31.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
235
+ "model.layers.31.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
236
+ "model.layers.31.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
237
+ "model.layers.31.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
238
+ "model.layers.31.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
239
+ "model.layers.31.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
240
+ "model.layers.31.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
241
+ "model.layers.31.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
242
+ "model.layers.4.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
243
+ "model.layers.4.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
244
+ "model.layers.4.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
245
+ "model.layers.4.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
246
+ "model.layers.4.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
247
+ "model.layers.4.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
248
+ "model.layers.4.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
249
+ "model.layers.4.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
250
+ "model.layers.4.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
251
+ "model.layers.5.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
252
+ "model.layers.5.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
253
+ "model.layers.5.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
254
+ "model.layers.5.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
255
+ "model.layers.5.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
256
+ "model.layers.5.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
257
+ "model.layers.5.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
258
+ "model.layers.5.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
259
+ "model.layers.5.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
260
+ "model.layers.6.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
261
+ "model.layers.6.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
262
+ "model.layers.6.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
263
+ "model.layers.6.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
264
+ "model.layers.6.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
265
+ "model.layers.6.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
266
+ "model.layers.6.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
267
+ "model.layers.6.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
268
+ "model.layers.6.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
269
+ "model.layers.7.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
270
+ "model.layers.7.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
271
+ "model.layers.7.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
272
+ "model.layers.7.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
273
+ "model.layers.7.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
274
+ "model.layers.7.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
275
+ "model.layers.7.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
276
+ "model.layers.7.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
277
+ "model.layers.7.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
278
+ "model.layers.8.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
279
+ "model.layers.8.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
280
+ "model.layers.8.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
281
+ "model.layers.8.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
282
+ "model.layers.8.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
283
+ "model.layers.8.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
284
+ "model.layers.8.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
285
+ "model.layers.8.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
286
+ "model.layers.8.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
287
+ "model.layers.9.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
288
+ "model.layers.9.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
289
+ "model.layers.9.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
290
+ "model.layers.9.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
291
+ "model.layers.9.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
292
+ "model.layers.9.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
293
+ "model.layers.9.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
294
+ "model.layers.9.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
295
+ "model.layers.9.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
296
+ "model.norm.weight": "pytorch_model-00003-of-00003.bin"
297
+ }
298
+ }
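The index above maps every tensor name to one of the three PyTorch shards and records the total checkpoint size (26,953,662,464 bytes, i.e. the full float32 weights). A small inspection sketch, assuming the file is read under the name it is shipped with:

```python
import json
from collections import Counter

with open("pytorch_model.bin.index.json") as f:
    index = json.load(f)

print(index["metadata"]["total_size"])   # 26953662464 bytes (~25.1 GiB)

# Count how many tensors live in each of the three shards.
per_shard = Counter(index["weight_map"].values())
for shard, count in sorted(per_shard.items()):
    print(f"{shard}: {count} tensors")
```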
special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
+ {
+ "bos_token": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "eos_token": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": "<unk>",
+ "unk_token": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+ size 499723
tokenizer_config.json ADDED
@@ -0,0 +1,36 @@
+ {
+ "add_bos_token": true,
+ "add_eos_token": false,
+ "bos_token": {
+ "__type": "AddedToken",
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "clean_up_tokenization_spaces": false,
+ "eos_token": {
+ "__type": "AddedToken",
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "legacy": false,
+ "model_max_length": 1000000000000000019884624838656,
+ "pad_token": null,
+ "sp_model_kwargs": {},
+ "spaces_between_special_tokens": false,
+ "tokenizer_class": "LlamaTokenizer",
+ "unk_token": {
+ "__type": "AddedToken",
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "use_default_system_prompt": true
+ }
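Taken together, the tokenizer files say that a BOS token is prepended automatically at encode time (`add_bos_token: true`), no EOS is appended (`add_eos_token: false`), and special_tokens_map.json designates `<unk>` as the padding token. A quick check (a sketch; the repository id is an assumption):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("FreedomIntelligence/AceGPT-7B-chat")  # assumed repo id
print(tok.bos_token, tok.eos_token, tok.unk_token, tok.pad_token)

ids = tok("مرحبا بالعالم").input_ids
print(ids[0] == tok.bos_token_id)   # True if <s> was prepended (add_bos_token)
print(tok.eos_token_id in ids)      # False: no </s> appended (add_eos_token is false)
```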