Mikko Juola
Upload Q4_K_M and Q6_K, also add a README.md
47653af

This repository contains .gguf files for:

https://huggingface.co/grimulkan/aurelian-alpha0.1-70b-rope8-32K-fp16

Made with llama.cpp commit e18f7345a300920e234f732077bda660cc6cda9c

IMPORTANT: This model requires linear RoPE scaling with a factor of 8. Use a factor of 8 even if you are not using the full 32K context length. The setting typically defaults to 1, so you need to change it explicitly.
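As a rough sketch of what that setting can look like on the llama.cpp command line: a linear scaling factor of 8 corresponds to a RoPE frequency scale of 1/8. The snippet below only computes that value and prints a candidate command instead of running it. The binary path `./main`, the model filename, and the context size are assumptions for illustration. Flag names differ between llama.cpp versions and front ends, so check `--help` for your build.

```shell
# Assumed placeholders: adjust LLAMA_BIN and MODEL for your own setup.
LLAMA_BIN="${LLAMA_BIN:-./main}"
MODEL="${MODEL:-aurelian-alpha0.1_Q4_K_M.gguf}"

# Linear RoPE scaling factor required by this model.
ROPE_SCALE=8

# The equivalent frequency scale is the reciprocal of the scaling factor.
ROPE_FREQ_SCALE=$(awk -v s="$ROPE_SCALE" 'BEGIN { printf "%g", 1 / s }')

# Print the command instead of running it, so it can be reviewed first.
# Keep the same frequency scale even if you lower the -c (context) value.
echo "$LLAMA_BIN -m $MODEL -c 32768 --rope-freq-scale $ROPE_FREQ_SCALE"
```

Other front ends usually expose the same setting under a name like "rope scale" or "compress_pos_emb"; the value to enter there is 8, not 0.125.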

md5sums

  • aurelian-alpha0.1_Q4_K_M.gguf 27ba8b8dc99776cc48d667d1766f8771
  • aurelian-alpha0.1_Q6_K.gguf ab36ed3f2cfd2f833cb814304a5cbe50
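One way to check a download against these sums is sketched below. It assumes GNU `md5sum` is available (on macOS the equivalent tool is `md5`, with different output), and the helper name `check_md5` is made up for this example. Files that are not present are skipped, so the Q6_K check only does something once the two parts have been joined into a single file.

```shell
# Hypothetical helper: compare a file's md5 against an expected value.
check_md5() {
  file=$1
  expected=$2
  if [ ! -f "$file" ]; then
    echo "SKIP: $file (not found)"
    return 0
  fi
  # md5sum prints "<hash>  <filename>"; keep only the hash.
  actual=$(md5sum "$file" | awk '{ print $1 }')
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (got $actual)"
    return 1
  fi
}

check_md5 aurelian-alpha0.1_Q4_K_M.gguf 27ba8b8dc99776cc48d667d1766f8771
check_md5 aurelian-alpha0.1_Q6_K.gguf ab36ed3f2cfd2f833cb814304a5cbe50
```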

The aurelian-alpha0.1_Q6_K.gguf file is just over 50G, HuggingFace's per-file size limit, so it is uploaded in two parts.

On a UNIX-like system, you can use cat to piece it together:

cat aurelian-alpha0.1_Q6_K.gguf-split-a aurelian-alpha0.1_Q6_K.gguf-split-b > aurelian-alpha0.1_Q6_K.gguf