Commit 4d32359 by jbochi (1 parent: 52e7645)

Upload tinyllama-1.1b-chat-v1.0.Q4_1.gguf

I've been working on adding [GGUF support to MLX](https://github.com/ml-explore/mlx/pull/350), and Q4_1 seems to be the format most closely aligned with MLX quantization. Its quantization error is also slightly lower than that of Q4_0 (tested with [gguf-tools](https://github.com/antirez/gguf-tools/pull/9)).
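To make the comparison concrete, here is a minimal sketch of the two block formats as round-trip functions in plain Python. It assumes the usual ggml layout of 32 weights per block, with Q4_0 storing one scale per block (`x ≈ d * (q - 8)`) and Q4_1 storing a scale plus a minimum (`x ≈ d * q + m`). The scale-plus-offset form is what lines up with MLX's scale-and-bias quantization. The helper names (`q4_0_roundtrip`, `q4_1_roundtrip`, `rmse`) are hypothetical, and the sketch ignores details such as the half-precision storage of `d` and `m`, so the numbers it prints are illustrative rather than a reproduction of the gguf-tools measurement.

```python
import math
import random

BLOCK = 32  # weights per block (assumed to match ggml's Q4_0/Q4_1 block size)


def _blocks(values):
    """Split a flat list of floats into consecutive blocks of BLOCK values."""
    if len(values) % BLOCK:
        raise ValueError("length must be a multiple of BLOCK")
    return [values[i:i + BLOCK] for i in range(0, len(values), BLOCK)]


def q4_0_roundtrip(values):
    """Quantize and dequantize with a per-block scale only: x ~ d * (q - 8)."""
    out = []
    for block in _blocks(values):
        extreme = max(block, key=abs)   # signed value with the largest magnitude
        d = extreme / -8.0              # scale; the extreme value maps to q = 0
        inv = 1.0 / d if d else 0.0
        for x in block:
            q = min(15, int(x * inv + 8.5))  # 4-bit code in [0, 15]
            out.append((q - 8) * d)
    return out


def q4_1_roundtrip(values):
    """Quantize and dequantize with a per-block scale and minimum: x ~ d * q + m."""
    out = []
    for block in _blocks(values):
        lo, hi = min(block), max(block)
        d = (hi - lo) / 15.0            # scale covers the full [min, max] range
        inv = 1.0 / d if d else 0.0
        for x in block:
            q = min(15, int((x - lo) * inv + 0.5))  # 4-bit code in [0, 15]
            out.append(q * d + lo)
    return out


def rmse(a, b):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


if __name__ == "__main__":
    rng = random.Random(0)
    datasets = {
        "centered": [rng.gauss(0.0, 1.0) for _ in range(BLOCK * 256)],
        "shifted": [rng.gauss(1.0, 0.1) for _ in range(BLOCK * 256)],
    }
    for name, data in datasets.items():
        e0 = rmse(data, q4_0_roundtrip(data))
        e1 = rmse(data, q4_1_roundtrip(data))
        print(f"{name}: Q4_0 rmse={e0:.5f}  Q4_1 rmse={e1:.5f}")
```

The per-block minimum matters most when a block's values are not centered on zero: Q4_0 has to spend its 16 levels on a range symmetric around zero, while Q4_1 spends them only on the interval the block actually occupies.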

.gitattributes CHANGED
@@ -45,3 +45,4 @@ tinyllama-1.1b-chat-v1.0.Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
 tinyllama-1.1b-chat-v1.0.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
 tinyllama-1.1b-chat-v1.0.Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
 tinyllama-1.1b-chat-v1.0.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+tinyllama-1.1b-chat-v1.0.Q4_1.gguf filter=lfs diff=lfs merge=lfs -text
tinyllama-1.1b-chat-v1.0.Q4_1.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:54509f708568d36d4f3186433525340fcf47ab441f3faa87d826af04a3538268
+size 702350688
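
The three-line pointer above is all that Git stores for the model file; the `oid` and `size` fields can be used to check a downloaded copy. A minimal sketch, assuming the file has already been fetched locally (the path in the usage comment is a placeholder, and `parse_lfs_pointer` and `verify_file` are hypothetical helper names, not part of any Git LFS tooling):

```python
import hashlib
import os

POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:54509f708568d36d4f3186433525340fcf47ab441f3faa87d826af04a3538268
size 702350688
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_file(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` matches the pointer's size and hash."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so a ~700 MB file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Usage (placeholder path; point it at the downloaded file):
# pointer = parse_lfs_pointer(POINTER)
# print(verify_file("tinyllama-1.1b-chat-v1.0.Q4_1.gguf", pointer))
```

Checking the size first is a cheap way to reject truncated downloads before hashing the whole file.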