munish0838 committed ccaf93f (1 parent: f613205)

Create README.md

Files changed (1): README.md (+36, -0)

README.md ADDED:
---
license: apache-2.0
language:
- en
base_model: PowerInfer/Bamboo-base-v0_1
---

# Bamboo-base-v0.1-GGUF

- Model creator: [PowerInfer](https://huggingface.co/PowerInfer)
- Original model: [Bamboo base v0.1](https://huggingface.co/PowerInfer/Bamboo-base-v0_1)

## Description

Sparse computing is increasingly recognized as an important direction for improving the computational efficiency (e.g., inference speed) of large language models (LLMs).

Recent studies ([Zhang et al., 2021](https://arxiv.org/abs/2110.01786); [Liu et al., 2023](https://openreview.net/pdf?id=wIPIhHd00i); [Mirzadeh et al., 2023](https://arxiv.org/abs/2310.04564)) reveal that LLMs inherently exhibit properties conducive to sparse computation when employing the ReLU activation function.
This insight opens up new avenues for faster inference, akin to MoE's selective activation.
By dynamically choosing which model parameters to compute, we can substantially boost inference speed.

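The toy NumPy sketch below (not Bamboo's actual kernels; the dimensions and random weights are made up for illustration) shows the mechanism: after a ReLU, many hidden activations are exactly zero, so the down-projection only needs the columns that correspond to active neurons.

```python
# Toy illustration of activation sparsity in a ReLU MLP block.
# Dimensions and weights are arbitrary; real ReLU LLMs reach much higher sparsity.
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff = 64, 256

x = rng.standard_normal(d_model)
w_up = rng.standard_normal((d_ff, d_model))    # up projection
w_down = rng.standard_normal((d_model, d_ff))  # down projection

h = np.maximum(w_up @ x, 0.0)        # ReLU: many entries are exactly zero
active = np.nonzero(h)[0]            # indices of activated neurons

dense_out = w_down @ h                           # full computation
sparse_out = w_down[:, active] @ h[active]       # skip inactive columns

print(f"activation sparsity: {1 - active.size / d_ff:.0%}")
print("outputs match:", np.allclose(dense_out, sparse_out))
```
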
However, the widespread adoption of ReLU-based models in the LLM field remains limited.
Here we introduce a new 7B ReLU-based LLM, Bamboo (GitHub: [https://github.com/SJTU-IPADS/Bamboo](https://github.com/SJTU-IPADS/Bamboo)),
which boasts nearly 85% sparsity and performance on par with [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1).

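As a minimal usage sketch (assuming llama-cpp-python is installed; the GGUF filename below is a placeholder for whichever quantization you download from this repo):

```python
# Run a GGUF file from this repo with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="./bamboo-base-v0_1.Q4_K_M.gguf",  # placeholder filename
    n_ctx=2048,   # context window
    n_threads=8,  # CPU threads to use
)

out = llm(
    "Sparse computing can speed up LLM inference because",
    max_tokens=64,
    temperature=0.7,
)
print(out["choices"][0]["text"])
```
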
## Citation

Please kindly cite using the following BibTeX:

```
@misc{bamboo,
  title={Bamboo: Harmonizing Sparsity and Performance in Large Language Models},
  author={Yixin Song and Haotong Xie and Zeyu Mi and Li Ma and Haibo Chen},
  year={2024}
}
```