---
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
library_name: transformers
tags:
- mergekit
- merge
---
# Llama-3-6B-Instruct-pruned

*Experimental*

Using [PruneMe](https://github.com/arcee-ai/PruneMe) to find the minimal average distance. Thank you for the awesome toolkit, @arcee-ai!

[Figure: distance]

*It shows that pruning layers 22-30 is the best option, but I'm worried about the drastic change between layers 22 and 23.*

### Disclaimer

I haven't done any post-training (the 'healing' process that the [paper](https://arxiv.org/abs/2403.17887) suggests). I plan to do it later, but there is no guarantee at all.

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the passthrough merge method.

### Models Merged

The following models were included in the merge:
* [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
dtype: bfloat16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 21]
    model:
      model:
        path: meta-llama/Meta-Llama-3-8B-Instruct
- sources:
  - layer_range: [29, 32]
    model:
      model:
        path: meta-llama/Meta-Llama-3-8B-Instruct
```
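
To make the effect of the two slices in the configuration concrete, here is a minimal sketch that lists which decoder layers are kept and which are dropped. It assumes that `layer_range` is half-open (the end index is excluded) and that the base model has 32 decoder layers; the helper name `kept_layers` is made up for this illustration and is not part of mergekit.

```python
# Slices copied from the configuration; the end index is ASSUMED to be exclusive.
SLICES = [(0, 21), (29, 32)]
TOTAL_LAYERS = 32  # decoder layers in Meta-Llama-3-8B-Instruct


def kept_layers(slices):
    """Hypothetical helper: expand [start, end) ranges into layer indices."""
    layers = []
    for start, end in slices:
        layers.extend(range(start, end))
    return layers


kept = kept_layers(SLICES)
kept_set = set(kept)
removed = [i for i in range(TOTAL_LAYERS) if i not in kept_set]

print(len(kept), removed)
# → 24 [21, 22, 23, 24, 25, 26, 27, 28]
```

Under these assumptions the merged model keeps 24 of the 32 layers, with the removed block counted in 0-based indices.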