MaziyarPanahi's picture
Update README.md (#5)
02a408c verified
|
raw
history blame
4.45 kB
---
license: apache-2.0
base_model: HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1
tags:
- orpo
- GGUF
- quantized
- 2-bit
- 3-bit
- 4-bit
- 5-bit
- 6-bit
- 8-bit
- 16-bit
- GGUF
- mixtral
- moe
language:
- en
datasets:
- argilla/distilabel-capybara-dpo-7k-binarized
inference: false
model_creator: MaziyarPanahi
model_name: zephyr-orpo-141b-A35b-v0.1-GGUF
pipeline_tag: text-generation
quantized_by: MaziyarPanahi
library_name: GGUF
---
<img src="https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1/resolve/main/logo.png" alt="Zephyr 141B Logo" width="400" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
# zephyr-orpo-141b-A35b-v0.1-GGUF
On April 11th, [@HuggingFaceH4](https://huggingface.co/HuggingFaceH4) released a fine-tuned model called [HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1](https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1) based on `Mixtral-8x22B-v0.1` model.
- **Model type:** A Mixture of Experts (MoE) model with 141B total parameters and 35B active parameters. Fine-tuned on a mix of publicly available, synthetic datasets.
- **Language(s) (NLP):** Primarily English.
- **License:** Apache 2.0
- **Finetuned from model:** [mistral-community/Mixtral-8x22B-v0.1](https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1)
## Load sharded model
`llama_load_model_from_file` will detect the number of files and will load additional tensors from the rest of files.
```sh
llama.cpp/main -m zephyr-orpo-141b-A35b-v0.1.Q2_K-00001-of-00005.gguf --file prompt.txt -n 1024 -e
```
The output from `Q2_K` quantized model:
```
<|system|>
You are a helpful assistant.
<|user|>
Hello, how are you?
<|assistant|>
I'm doing great. How can I help you today?
<|user|>
Show me how to build a website in 10 simple steps
<|assistant|>
Of course, I'd be happy to guide you through the process of building a website in 10 simple steps.
1. Determine the purpose of your website: Decide what your website will be used for, whether it's for business, personal use, or a blog.
2. Choose a domain name: Your domain name is the address that people will use to find your website. It should be easy to remember and relevant to your website's purpose.
3. Select a web hosting provider: A web hosting provider will store your website files and make them accessible to users around the world. Research different providers and choose one that fits your needs.
4. Choose a platform: There are many platforms available for building a website, including WordPress, Wix, and Squarespace. Consider your technical skills and the features you need when choosing a platform.
5. Plan your website structure: Decide on the pages you want to include on your website and how they will be organized. This will help guide the design process.
6. Design your website: Use a template or create your own design for your website. Make sure it's visually appealing and easy to navigate.
7. Add content to your website: Write the text and create images or videos for your website. Make sure the content is relevant to your website's purpose and engaging for users.
8. Optimize your website for search engines: Make sure your website is easy to find by using keywords in your content and metadata. This will help your website rank higher in search engine results.
9. Test your website: Test your website to make sure everything is working properly and that there are no broken links or errors.
10. Launch your website: Once everything is working properly, it's time to launch your website. Share the link with your audience and start promoting your website.
I hope this guide helps you build a successful website. Let me know if you have any questions. [end of text]
llama_print_timings: load time = 11670.53 ms
llama_print_timings: sample time = 16.30 ms / 422 runs ( 0.04 ms per token, 25894.34 tokens per second)
llama_print_timings: prompt eval time = 5084.73 ms / 78 tokens ( 65.19 ms per token, 15.34 tokens per second)
llama_print_timings: eval time = 279055.53 ms / 421 runs ( 662.84 ms per token, 1.51 tokens per second)
llama_print_timings: total time = 284314.00 ms / 499 tokens
Log end
```
What's inside the `prompt.txt`:
```
<|system|>
You are a helpful assistant.</s>
<|user|>
Hello, how are you?</s>
<|assistant|>
I'm doing great. How can I help you today?</s>
<|user|>
Show me how to build a website in 10 simple steps</s>
<|assistant|>
```