
# Midnight-Miqu-103B-v1.0 - GGUF

These are GGUF quants of sophosympatheia/Midnight-Miqu-103B-v1.0

Details about the model and the merge configuration can be found on the model page linked above.

Note: I'd recommend checking out the mradermacher/Midnight-Miqu-103B-v1.0-GGUF quants as well. He has IQ quants, which are likely better than my non-IQ ones.

## GGUF File sizes

| Name | Disk Size (GB) |
| --- | --- |
| Midnight-Miqu-103B-v1.0-Q2_K.gguf | 35.31 |
| Midnight-Miqu-103B-v1.0-IQ3_XXS.gguf | 39.14 |
| Midnight-Miqu-103B-v1.0-Q3_K_XS.gguf | 39.08 |
| Midnight-Miqu-103B-v1.0-Q3_K_S.gguf | 41.40 |
| Midnight-Miqu-103B-v1.0-Q3_K_M.gguf | 46.20 |
| Midnight-Miqu-103B-v1.0-Q3_K_L.gguf | 50.35 |
| Midnight-Miqu-103B-v1.0-Q4_0.gguf | 54.13 |
| Midnight-Miqu-103B-v1.0-Q4_K_S.gguf | 54.55 |
| Midnight-Miqu-103B-v1.0-Q4_K_M.gguf | 57.64 |
| Midnight-Miqu-103B-v1.0-Q5_0.gguf | 66.12 |
| Midnight-Miqu-103B-v1.0-Q5_K_S.gguf | 66.12 |
| Midnight-Miqu-103B-v1.0-Q5_K_M.gguf | 67.92 |
| Midnight-Miqu-103B-v1.0-Q6_K.gguf | 78.85 |
| Midnight-Miqu-103B-v1.0-Q8_0.gguf | 102.13 |
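After downloading, it's worth confirming that the size on disk matches the table; a mismatch usually means a truncated transfer. A minimal check, assuming the table lists decimal gigabytes (10^9 bytes):

```bash
# Print the size of a downloaded quant in decimal GB for comparison against the table
ls -l Midnight-Miqu-103B-v1.0-Q4_K_M.gguf | awk '{printf "%.2f GB\n", $5 / 1e9}'
```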

## Joining split files

Note: HF does not support uploading files larger than 50GB. Therefore I have uploaded some quants as split files.

For split files, please download all parts of the file.
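If you use the huggingface_hub CLI, one way to fetch every part of a quant in a single command is to glob on the part suffix. A minimal sketch (the `--include` pattern below is illustrative; adjust it for the quant you want):

```bash
# Download all parts of the Q6_K quant into the current directory
pip install -U "huggingface_hub[cli]"
huggingface-cli download Dracones/Midnight-Miqu-103B-v1.0-GGUF \
  --include "Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-*" \
  --local-dir .
```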

To join the files, run the commands below. The example is for the Q6_K quant.

Linux and macOS:

```bash
cat Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-* > Midnight-Miqu-103B-v1.0-Q6_K.gguf && rm Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-*
```

Windows command line:

```cmd
COPY /B Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-a + Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-b Midnight-Miqu-103B-v1.0-Q6_K.gguf
del Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-a Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-b
```
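Once joined, the file loads like any single-file GGUF. A quick smoke test with the llama.cpp `main` binary (a sketch; adjust `-ngl` to however many layers fit on your GPU):

```bash
# Generate a few tokens to confirm the joined file is intact
./main -m Midnight-Miqu-103B-v1.0-Q6_K.gguf -ngl 40 -c 4096 -p "Hello" -n 32
```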

## Split details

For reference, below are the commands used to create the splits:

```bash
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q3_K_L.gguf Midnight-Miqu-103B-v1.0-Q3_K_L.gguf-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q4_0.gguf Midnight-Miqu-103B-v1.0-Q4_0.gguf-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q4_K_M.gguf Midnight-Miqu-103B-v1.0-Q4_K_M.gguf-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q4_K_S.gguf Midnight-Miqu-103B-v1.0-Q4_K_S.gguf-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q5_0.gguf Midnight-Miqu-103B-v1.0-Q5_0.gguf-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q5_K_M.gguf Midnight-Miqu-103B-v1.0-Q5_K_M.gguf-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q5_K_S.gguf Midnight-Miqu-103B-v1.0-Q5_K_S.gguf-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q6_K.gguf Midnight-Miqu-103B-v1.0-Q6_K.gguf-part-
split -b 40G -a 1 Midnight-Miqu-103B-v1.0-Q8_0.gguf Midnight-Miqu-103B-v1.0-Q8_0.gguf-part-
```
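With `-a 1`, `split` uses single-letter suffixes, so each quant becomes a handful of parts. For example, the Q8_0 quant (102.13 GB) ends up as three pieces:

```
Midnight-Miqu-103B-v1.0-Q8_0.gguf-part-a
Midnight-Miqu-103B-v1.0-Q8_0.gguf-part-b
Midnight-Miqu-103B-v1.0-Q8_0.gguf-part-c
```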

## Quant Details

For reference, this is the script used for quantization.

```bash
#!/bin/bash

# Activate the conda environment
source ~/miniconda3/etc/profile.d/conda.sh
conda activate llamacpp

# Model name used throughout the script
MODEL_NAME="Midnight-Miqu-103B-v1.0"

# Define the output directory
outputDir="${MODEL_NAME}-GGUF"

# Create the output directory if it doesn't exist
mkdir -p "${outputDir}"

# Make the F32 base file (use $HOME rather than a quoted ~, which would not expand)
f32file="/mnt/storage/models/GGUF/${MODEL_NAME}-F32.gguf"
if [ -f "${f32file}" ]; then
    echo "Skipping f32 as ${f32file} already exists."
else
    python convert.py "${HOME}/src/models/${MODEL_NAME}" --outfile "${f32file}" --outtype "f32"
fi

# Define the array of quantization strings
quants=("Q2_K" "IQ3_XXS" "Q3_K_L" "Q3_K_M" "Q3_K_S" "Q3_K_XS" "Q4_0" "Q4_K_M" "Q4_K_S" "Q5_0" "Q5_K_M" "Q5_K_S" "Q6_K" "Q8_0")

# Loop through the quants array
for quant in "${quants[@]}"; do
    outfile="${outputDir}/${MODEL_NAME}-${quant}.gguf"

    # Check if the outfile already exists
    if [ -f "${outfile}" ]; then
        echo "Skipping ${quant} as ${outfile} already exists."
    else
        # Quantize the F32 base file to the current quant type
        ./quantize "${f32file}" "${outfile}" "${quant}"

        echo "Processed ${quant} and generated ${outfile}"
    fi
done
```
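To reproduce the quants, run the script from the root of a llama.cpp checkout so that `convert.py` and the `quantize` binary resolve (the script filename below is hypothetical):

```bash
chmod +x make_quants.sh
./make_quants.sh
```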