
llama-3-neural-chat-v1-8b

Description

This repo contains GGUF format model files for Locutusque's llama-3-neural-chat-v1-8b.

Provided files

| Name | Quant method | Bits | Size | Max RAM required | Use case |
| ---- | ---- | ---- | ---- | ---- | ---- |
| llama-3-neural-chat-v1-8b.Q2_K.gguf | Q2_K | 2 | 2.72 GB | 5.22 GB | significant quality loss - not recommended for most purposes |
| llama-3-neural-chat-v1-8b.Q3_K_M.gguf | Q3_K_M | 3 | 3.52 GB | 6.02 GB | very small, high quality loss |
| llama-3-neural-chat-v1-8b.Q4_K_S.gguf | Q4_K_S | 4 | 4.14 GB | 6.64 GB | small, greater quality loss |
| llama-3-neural-chat-v1-8b.Q4_K_M.gguf | Q4_K_M | 4 | 4.37 GB | 6.87 GB | medium, balanced quality - recommended |
| llama-3-neural-chat-v1-8b.Q5_K_M.gguf | Q5_K_M | 5 | 5.13 GB | 7.63 GB | large, very low quality loss - recommended |
| llama-3-neural-chat-v1-8b.Q6_K.gguf | Q6_K | 6 | 5.94 GB | 8.44 GB | very large, extremely low quality loss |
| llama-3-neural-chat-v1-8b.Q8_0.gguf | Q8_0 | 8 | 7.70 GB | 10.20 GB | very large, extremely low quality loss - not recommended |
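
As a rough aid for choosing among the files above, the following minimal sketch picks the largest quantization whose listed "Max RAM required" fits within a given memory budget. The sizes are copied from the table; the `QUANTS` list and the `pick_quant` helper are hypothetical names introduced only for this illustration, and the table's RAM figures are treated as approximate upper bounds rather than measured values.

```python
# Rows copied from the "Provided files" table:
# (file name, file size in GB, max RAM required in GB)
QUANTS = [
    ("llama-3-neural-chat-v1-8b.Q2_K.gguf", 2.72, 5.22),
    ("llama-3-neural-chat-v1-8b.Q3_K_M.gguf", 3.52, 6.02),
    ("llama-3-neural-chat-v1-8b.Q4_K_S.gguf", 4.14, 6.64),
    ("llama-3-neural-chat-v1-8b.Q4_K_M.gguf", 4.37, 6.87),
    ("llama-3-neural-chat-v1-8b.Q5_K_M.gguf", 5.13, 7.63),
    ("llama-3-neural-chat-v1-8b.Q6_K.gguf", 5.94, 8.44),
    ("llama-3-neural-chat-v1-8b.Q8_0.gguf", 7.70, 10.20),
]


def pick_quant(available_ram_gb):
    """Return the largest file whose listed max RAM fits the budget.

    Hypothetical helper: returns None when no listed file fits.
    """
    fitting = [row for row in QUANTS if row[2] <= available_ram_gb]
    if not fitting:
        return None
    # Larger files correspond to higher-bit quantizations in this table.
    return max(fitting, key=lambda row: row[1])[0]


if __name__ == "__main__":
    print(pick_quant(8.0))
```

This selects purely by size, so it ignores the "recommended" notes in the Use case column; with a budget large enough for every file it would return the Q8_0 file even though the table marks it as not recommended.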

Model size: 8.03B params
Architecture: llama
