Upload tinyllama-1.1b-chat-v1.0.Q4_1.gguf

I've been working on adding [GGUF support to MLX](https://github.com/ml-explore/mlx/pull/350), and Q4_1 appears to be the GGUF format most closely aligned with MLX quantization. Its quantization error is also slightly lower than that of Q4_0 (tested with [gguf-tools](https://github.com/antirez/gguf-tools/pull/9)).
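For context, here is a minimal sketch of why Q4_1 might line up with MLX quantization. It assumes the usual description of the two GGUF formats: Q4_1 stores a scale and a minimum per block of 32 weights (`w ≈ d * q + m`), while Q4_0 stores only a scale (`w ≈ d * (q - 8)`). MLX quantization is, as far as I understand it, also affine with a per-group scale and bias. The function names are illustrative and come from neither llama.cpp nor MLX, the rounding is simplified, and the toy block is not real model data.

```python
# Illustrative sketch only: real GGUF files pack two 4-bit codes per byte
# and store the per-block scale (and minimum) as float16.
BLOCK_SIZE = 32  # assumed block size for both Q4_0 and Q4_1


def quantize_q4_1(block):
    """Affine 4-bit quantization: w ~= d * q + m, with q in 0..15."""
    lo, hi = min(block), max(block)
    d = (hi - lo) / 15 or 1.0  # fall back to 1.0 for constant blocks
    q = [min(15, max(0, round((w - lo) / d))) for w in block]
    return d, lo, q


def dequantize_q4_1(d, m, q):
    return [d * code + m for code in q]


def quantize_q4_0(block):
    """Symmetric 4-bit quantization: w ~= d * (q - 8), no stored offset."""
    amax = max(block, key=abs)  # signed value with the largest magnitude
    d = amax / -8 or 1.0  # fall back to 1.0 for all-zero blocks
    q = [min(15, max(0, round(w / d) + 8)) for w in block]
    return d, q


def dequantize_q4_0(d, q):
    return [d * (code - 8) for code in q]


def max_abs_error(original, restored):
    return max(abs(a - b) for a, b in zip(original, restored))


if __name__ == "__main__":
    # Toy block whose values are all positive, so an offset should help.
    block = [1.0 + i / (BLOCK_SIZE - 1) for i in range(BLOCK_SIZE)]
    d1, m1, q1 = quantize_q4_1(block)
    d0, q0 = quantize_q4_0(block)
    print("Q4_1 max error:", max_abs_error(block, dequantize_q4_1(d1, m1, q1)))
    print("Q4_0 max error:", max_abs_error(block, dequantize_q4_0(d0, q0)))
```

On a block like this one, where the values are not centred on zero, the stored minimum lets Q4_1 spend all 16 levels on the actual value range. This says nothing about the error on the real TinyLlama weights, which is what the gguf-tools comparison measures.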

Files changed (2)

.gitattributes +1 -0
tinyllama-1.1b-chat-v1.0.Q4_1.gguf +3 -0

.gitattributes CHANGED

@@ -45,3 +45,4 @@ tinyllama-1.1b-chat-v1.0.Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
 tinyllama-1.1b-chat-v1.0.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
 tinyllama-1.1b-chat-v1.0.Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
 tinyllama-1.1b-chat-v1.0.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+tinyllama-1.1b-chat-v1.0.Q4_1.gguf filter=lfs diff=lfs merge=lfs -text

tinyllama-1.1b-chat-v1.0.Q4_1.gguf ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:54509f708568d36d4f3186433525340fcf47ab441f3faa87d826af04a3538268
+size 702350688
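
The three added lines are a Git LFS pointer, not the model itself: they record the SHA-256 digest and byte size of the real file. A minimal sketch of checking a downloaded copy against this pointer follows. The helper names and the local file path are assumptions for illustration, not part of Git LFS or the Hugging Face tooling.

```python
import hashlib
import os

# Pointer contents copied from the diff above (without the leading "+").
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:54509f708568d36d4f3186433525340fcf47ab441f3faa87d826af04a3538268
size 702350688
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap check first, before hashing ~700 MB
    h = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


# Example usage, assuming the file was downloaded to the current directory:
# ok = matches_pointer("tinyllama-1.1b-chat-v1.0.Q4_1.gguf",
#                      parse_lfs_pointer(POINTER))
```

Comparing the size before hashing avoids reading the whole file when a download was truncated.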