|
---
pipeline_tag: text-generation
inference: true
widget:
- text: 'def print_hello_world():'
  example_title: Hello world
  group: Python
license: bigscience-openrail-m
pretrain-datasets:
- books
- arxiv
- c4
- falcon-refinedweb
- wiki
- github-issues
- stack_markdown
- self-made dataset of permissive github code
datasets:
- bigcode/the-stack-dedup
- rombodawg/2XUNCENSORED_MegaCodeTraining188k
- bigcode/commitpackft
library_name: llama.cpp
tags:
- code
language:
- en
---
|
|
|
# Refact 1.6B FIM GGUF |
|
|
|
## Introduction |
|
|
|
Refact 1.6B FIM is a 1.6-billion-parameter coding model developed by Small Magellanic Cloud AI Ltd., packaged here in GGUF format for use with [llama.cpp](https://github.com/ggerganov/llama.cpp). It is designed to assist developers with code completion (including fill-in-the-middle, or FIM, infilling), refactoring, and chat-based interactions, and it performs strongly on code-related natural language understanding and generation tasks.
|
|
|
## Quantized Model Files |
|
|
|
The model is provided in several quantized versions that trade file size and memory use against output fidelity:
|
|
|
- **refact-1.6B-fim-q4_0.gguf**: A 4-bit quantized model with a file size of 878 MB. |
|
- **refact-1.6B-fim-q5_0.gguf**: A 5-bit quantized model with a file size of 1.1 GB. |
|
- **refact-1.6B-fim-q8_0.gguf**: An 8-bit quantized model with a file size of 1.6 GB. |
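
If you need a quantization type not listed here, the files can be regenerated from the f16 GGUF (produced by the conversion step under "Example Command for Testing" below) with llama.cpp's `quantize` tool. A minimal sketch, assuming a llama.cpp build that includes the `quantize` binary; paths are illustrative:

```sh
# Regenerate the quantized variants from the f16 conversion (paths illustrative).
./quantize ./Refact-1_6B-fim/ggml-model-f16.gguf ./refact-1.6B-fim-q4_0.gguf q4_0
./quantize ./Refact-1_6B-fim/ggml-model-f16.gguf ./refact-1.6B-fim-q5_0.gguf q5_0
./quantize ./Refact-1_6B-fim/ggml-model-f16.gguf ./refact-1.6B-fim-q8_0.gguf q8_0
```

Lower-bit files are smaller and faster to load at some cost in output quality; q8_0 stays closest to the f16 baseline.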
|
|
|
## Features and Usage |
|
|
|
The model is versatile and can be employed for: |
|
|
|
- Code completion, including fill-in-the-middle (FIM) infilling (see the prompt sketch after this list)
|
- Code refactoring |
|
- Chat-based interactions |
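
For FIM infilling, the upstream Refact model card formats prompts with the `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>` special tokens. The sketch below assumes that token layout, that the q4_0 file sits in the current directory, and that your llama.cpp build tokenizes these special tokens in prompts; verify all three before relying on it:

```sh
# FIM prompt: the model generates the code that belongs between the prefix
# and the suffix. Token layout assumed from the upstream model card.
./main -m ./refact-1.6B-fim-q4_0.gguf -n 64 --temp 0.2 \
  -p '<fim_prefix>def multiply(a, b):<fim_suffix>    return result<fim_middle>'
```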
|
|
|
### Example Usage |
|
|
|
Here's a sample shell command that runs the model with llama.cpp's `main` binary:
|
|
|
```sh
# Run the f16 model; substitute one of the quantized .gguf files above
# to reduce memory use.
./main -m models/smallcloudai/Refact-1_6B-fim/ggml-model-f16.gguf -n 300 \
  -p "write a function to multiply two integers in python" \
  --temp 1.0 --top-p 1.0 --top-k 1 --repeat_penalty 1.0
```
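
Note that `--top-k 1` restricts sampling to the single most probable token, so generation is effectively greedy and deterministic; the `--temp 1.0` and `--top-p 1.0` settings then have no practical effect. Raise `--top-k` and lower `--temp` if you want varied completions.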
|
|
|
## Performance Metrics |
|
|
|
The model outperforms larger code models such as StableCode and ReplitCode v1 in both code completion and chat-based interactions, despite being roughly half their size, as the HumanEval results show:
|
|
|
| Model               | Size | HumanEval pass@1 | HumanEval pass@10 |
|---------------------|------|------------------|-------------------|
| **Refact-1.6B-fim** | 1.6B | 32.0%            | 53.0%             |
| StableCode          | 3B   | 20.2%            | 33.8%             |
| ReplitCode v1       | 3B   | 21.9%            | N/A               |
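
For context, HumanEval's pass@k is the probability that at least one of $k$ sampled completions for a problem passes its unit tests. It is conventionally reported with the unbiased estimator of Chen et al. (2021), where $n \ge k$ completions are drawn per problem and $c$ of them pass:

$$
\text{pass@}k = \mathbb{E}_{\text{problems}}\left[ 1 - \frac{\binom{n-c}{k}}{\binom{n}{k}} \right]
$$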
|
|
|
## Installation and Setup |
|
|
|
The model can be integrated into your IDE via the [Refact plugin](https://refact.ai/). For self-hosting, an [open-source Docker container](https://github.com/smallcloudai/refact) is available. |
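
As a rough sketch of the self-hosted route (the image name, volume, and port below are taken from the refact repository's README at the time of writing and may have changed, so verify them against the repository before use):

```sh
# Illustrative only: check https://github.com/smallcloudai/refact for the
# current image name, port, and GPU flags.
docker run -d --rm -p 8008:8008 -v perm-storage:/perm_storage --gpus all smallcloudai/refact_self_hosting
```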
|
|
|
## Limitations and Bias |
|
|
|
The model was trained primarily on English text, so output quality for non-English prompts, comments, and identifiers may be noticeably lower.
|
|
|
## Technical Specifications |
|
|
|
- **Architecture**: LLaMA-like model with multi-query attention (all query heads share a single key/value head, which shrinks the KV cache and speeds up inference)
|
- **Training Tokens**: 1.2T for pretraining, 40B for fine-tuning |
|
- **Precision**: bfloat16 |
|
- **Training Time**: 28 days |
|
|
|
## License |
|
|
|
The model is licensed under the BigScience OpenRAIL-M v1 license agreement. |
|
|
|
## Citation |
|
|
|
If you use this model in your work, please cite it by linking back to the following page for proper attribution: |
|
|
|
[Refact 1.6B FIM Model](https://huggingface.co/smallcloudai/Refact-1_6B-fim) |
|
|
|
## Acknowledgments |
|
|
|
Special thanks to [ds5t5](https://github.com/ggerganov/llama.cpp/pull/3329) for implementing the Hugging Face to GGUF tensor conversion for this model in llama.cpp PR #3329; their work made this GGUF release possible.
|
|
|
### Example Command for Testing |
|
|
|
To convert the model and check the GGUF output against the original Hugging Face implementation, you can use the following commands:
|
|
|
```sh
# Convert the Hugging Face checkpoint to an f16 GGUF file.
python3 convert-refact-hf-to-gguf.py ./Refact-1_6B-fim 1

# Run the converted model so its output can be compared with the original.
./main -m ./Refact-1_6B-fim/ggml-model-f16.gguf -n 300 \
  -p "write a function to multiply two integers in python" \
  --temp 1.0 --top-p 1.0 --top-k 1 --repeat_penalty 1.0
```
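
The first command converts the Hugging Face checkpoint into an f16 GGUF file; the second runs the converted model so its completions can be compared with the original Hugging Face implementation. In llama.cpp conversion scripts of this vintage the trailing `1` typically selects f16 output (with `0` meaning f32), but verify that mapping in the script itself if your version differs.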
|
|
|
This resolves llama.cpp issue [#3061](https://github.com/ggerganov/llama.cpp/issues/3061). |
|
|