munish0838 committed ccaf93f (1 parent: f613205)

Create README.md

Files changed (1): README.md (+36, -0)

README.md ADDED:
---
license: apache-2.0
language:
- en
base_model: PowerInfer/Bamboo-base-v0_1
---

# Bamboo-base-v0.1-GGUF

- Model creator: [PowerInfer](https://huggingface.co/PowerInfer)
- Original model: [Bamboo base v0.1](https://huggingface.co/PowerInfer/Bamboo-base-v0_1)

## Description

Sparse computing is increasingly recognized as an important direction for improving the computational efficiency (e.g., inference speed) of large language models (LLMs).

Recent studies ([Zhang et al., 2021](https://arxiv.org/abs/2110.01786); [Liu et al., 2023](https://openreview.net/pdf?id=wIPIhHd00i); [Mirzadeh et al., 2023](https://arxiv.org/abs/2310.04564)) reveal that LLMs inherently exhibit properties conducive to sparse computation when employing the ReLU activation function.
This insight opens up new avenues for faster inference, akin to MoE's selective activation.
By dynamically choosing which model parameters to compute, we can substantially boost inference speed.

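The toy NumPy sketch below (not Bamboo's actual kernels; the dimensions and random weights are made up for illustration) shows the mechanism: after a ReLU, many hidden activations are exactly zero, so the down-projection only needs the columns that correspond to active neurons.

```python
# Toy illustration of activation sparsity in a ReLU MLP block.
# Dimensions and weights are arbitrary; real ReLU LLMs reach much higher sparsity.
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff = 64, 256

x = rng.standard_normal(d_model)
w_up = rng.standard_normal((d_ff, d_model))    # up projection
w_down = rng.standard_normal((d_model, d_ff))  # down projection

h = np.maximum(w_up @ x, 0.0)        # ReLU: many entries are exactly zero
active = np.nonzero(h)[0]            # indices of activated neurons

dense_out = w_down @ h                           # full computation
sparse_out = w_down[:, active] @ h[active]       # skip inactive columns

print(f"activation sparsity: {1 - active.size / d_ff:.0%}")
print("outputs match:", np.allclose(dense_out, sparse_out))
```
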
However, the widespread adoption of ReLU-based models in the LLM field remains limited.
Here we introduce a new 7B ReLU-based LLM, Bamboo (GitHub: [https://github.com/SJTU-IPADS/Bamboo](https://github.com/SJTU-IPADS/Bamboo)),
which boasts nearly 85% sparsity and performance on par with [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1).

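As a minimal usage sketch (assuming llama-cpp-python is installed; the GGUF filename below is a placeholder for whichever quantization you download from this repo):

```python
# Run a GGUF file from this repo with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="./bamboo-base-v0_1.Q4_K_M.gguf",  # placeholder filename
    n_ctx=2048,   # context window
    n_threads=8,  # CPU threads to use
)

out = llm(
    "Sparse computing can speed up LLM inference because",
    max_tokens=64,
    temperature=0.7,
)
print(out["choices"][0]["text"])
```
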
## Citation

Please kindly cite using the following BibTeX:

```
@misc{bamboo,
  title={Bamboo: Harmonizing Sparsity and Performance in Large Language Models},
  author={Yixin Song and Haotong Xie and Zeyu Mi and Li Ma and Haibo Chen},
  year={2024}
}
```