---
tags:
- uqff
- mistral.rs
base_model: meta-llama/Llama-3.2-11B-Vision-Instruct
base_model_relation: quantized
---
<!-- Autogenerated from user input. -->
# `meta-llama/Llama-3.2-11B-Vision-Instruct`, UQFF quantization
Run with [mistral.rs](https://github.com/EricLBuehler/mistral.rs). Documentation: [UQFF docs](https://github.com/EricLBuehler/mistral.rs/blob/master/docs/UQFF.md).
1) **Flexible** 🌀: Multiple quantization formats in *one* file format with *one* framework to run them all.
2) **Reliable** 🔒: Compatibility ensured with *embedded* and *checked* semantic versioning information from day 1.
3) **Easy** 🤗: Download UQFF models *easily* and *quickly* from Hugging Face, or use a local file.
4) **Customizable** 🛠️: Make and publish your own UQFF files in minutes (see the sketch below).
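
The UQFF docs linked above describe producing your own UQFF file by combining in-situ quantization (`--isq`) with the `--write-uqff` flag. A minimal sketch is shown here; the exact flag placement and the output filename are assumptions, so verify against the UQFF docs for your mistral.rs version:

```shell
# Sketch: quantize the base model with ISQ and serialize the result to UQFF.
# Flag placement is assumed from the UQFF docs; the output filename is illustrative.
./mistralrs-server --isq Q4K vision-plain \
  -m meta-llama/Llama-3.2-11B-Vision-Instruct -a vllama \
  --write-uqff llama3.2-vision-instruct-q4k.uqff
```
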
## Examples
|Quantization type(s)|Example|
|--|--|
|FP8|`./mistralrs-server -i vision-plain -m EricB/Llama-3.2-11B-Vision-Instruct-UQFF -a vllama --from-uqff llam3.2-vision-instruct-f8e4m3.uqff`|
|HQQ4|`./mistralrs-server -i vision-plain -m EricB/Llama-3.2-11B-Vision-Instruct-UQFF -a vllama --from-uqff llam3.2-vision-instruct-hqq4.uqff`|
|HQQ8|`./mistralrs-server -i vision-plain -m EricB/Llama-3.2-11B-Vision-Instruct-UQFF -a vllama --from-uqff llam3.2-vision-instruct-hqq8.uqff`|
|Q3K|`./mistralrs-server -i vision-plain -m EricB/Llama-3.2-11B-Vision-Instruct-UQFF -a vllama --from-uqff llam3.2-vision-instruct-q3k.uqff`|
|Q4K|`./mistralrs-server -i vision-plain -m EricB/Llama-3.2-11B-Vision-Instruct-UQFF -a vllama --from-uqff llam3.2-vision-instruct-q4k.uqff`|
|Q5K|`./mistralrs-server -i vision-plain -m EricB/Llama-3.2-11B-Vision-Instruct-UQFF -a vllama --from-uqff llam3.2-vision-instruct-q5k.uqff`|
|Q8_0|`./mistralrs-server -i vision-plain -m EricB/Llama-3.2-11B-Vision-Instruct-UQFF -a vllama --from-uqff llam3.2-vision-instruct-q8_0.uqff`|
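
The commands above use `-i` for interactive chat. To serve the model over mistral.rs's OpenAI-compatible HTTP API instead, drop `-i` and pass a port; a minimal sketch (the port, the `"model"` field value, and the image URL are assumptions for illustration):

```shell
# Start an HTTP server instead of interactive mode (drop -i, add --port).
./mistralrs-server --port 1234 vision-plain \
  -m EricB/Llama-3.2-11B-Vision-Instruct-UQFF -a vllama \
  --from-uqff llam3.2-vision-instruct-q4k.uqff

# Query the OpenAI-compatible chat completions endpoint with an image + text prompt.
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "EricB/Llama-3.2-11B-Vision-Instruct-UQFF",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        {"type": "text", "text": "What is in this image?"}
      ]
    }]
  }'
```
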