benk04/NoromaidxOpenGPT4-2-3.75bpw-h6-exl2

ExLlamaV2 3.75bpw quantization of NoromaidxOpenGPT4-2 from NeverSleep, quantized with the default calibration dataset. The measurement JSON file is included, so you can make your own quants.

At 3.75bpw, this quant is sized for 24GB GPUs and can fit 32k context. Make sure to enable the 4-bit cache option, or you'll run into OOM errors.
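To see why the 4-bit cache matters, here is a rough back-of-the-envelope estimate. The architecture figures (about 46.7B total parameters, 32 layers, 8 KV heads, head dimension 128) are assumptions taken from the public Mixtral-8x7B configuration, not measurements of this quant. The estimate ignores the 6-bit head layer, activation buffers, and the scale overhead of the quantized cache, so treat the numbers as approximate.

```python
# Rough VRAM estimate for a 3.75bpw Mixtral-8x7B quant at 32k context.
# All architecture constants below are assumptions from the public
# Mixtral-8x7B config; real usage will be somewhat higher.
TOTAL_PARAMS = 46.7e9        # approximate parameter count, all experts
BITS_PER_WEIGHT = 3.75
NUM_LAYERS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128
CONTEXT_TOKENS = 32 * 1024
GIB = 1024 ** 3


def weights_gib(params=TOTAL_PARAMS, bpw=BITS_PER_WEIGHT):
    """Size of the quantized weights in GiB."""
    return params * bpw / 8 / GIB


def kv_cache_gib(tokens=CONTEXT_TOKENS, bits_per_element=16):
    """Size of the key/value cache in GiB (one K and one V tensor per layer)."""
    elements = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * tokens
    return elements * bits_per_element / 8 / GIB


weights = weights_gib()
fp16_cache = kv_cache_gib(bits_per_element=16)
q4_cache = kv_cache_gib(bits_per_element=4)

print(f"weights:            {weights:.2f} GiB")
print(f"FP16 cache at 32k:  {fp16_cache:.2f} GiB (total {weights + fp16_cache:.2f})")
print(f"4-bit cache at 32k: {q4_cache:.2f} GiB (total {weights + q4_cache:.2f})")
```

Under these assumptions the weights take roughly 20.4 GiB. An FP16 cache at 32k adds 4 GiB and pushes the total past 24 GiB, while a 4-bit cache adds about 1 GiB and leaves some headroom.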

Notes: This model is one of the better Mixtral derivatives for RP, and I recommend using it with the Alpaca preset in SillyTavern.

Original Card

Description

This repo contains fp16 files of NoromaidxOpenGPT4-2.

The model was created by merging Noromaid-8x7b-Instruct with Open_Gpt4_8x7B_v0.2 in exactly the same way Rombodawg did his merge.

The only difference between NoromaidxOpenGPT4-1 and NoromaidxOpenGPT4-2 is that the first iteration uses Mixtral-8x7B as the base for the merge (f16), whereas the second uses Open_Gpt4_8x7B_v0.2 as the base (bf16).

After further testing and usage, both models were released, because each has its own qualities.

You can download the imatrix file to make other quants HERE.

Prompt template:

Alpaca

### Instruction:
{system prompt}

### Input:
{prompt}

### Response:
{output}

Mistral

[INST] {prompt} [/INST]
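If you build prompts by hand rather than through a frontend preset, the two templates above can be filled in along these lines. This is a minimal sketch: the function names `alpaca_prompt` and `mistral_prompt` are hypothetical, and the blank lines between Alpaca sections follow the layout shown in this card.

```python
def alpaca_prompt(system_prompt: str, prompt: str) -> str:
    """Fill the Alpaca template; the model's output follows '### Response:'."""
    return (
        "### Instruction:\n"
        f"{system_prompt}\n\n"
        "### Input:\n"
        f"{prompt}\n\n"
        "### Response:\n"
    )


def mistral_prompt(prompt: str) -> str:
    """Fill the Mistral instruct template for a single turn."""
    return f"[INST] {prompt} [/INST]"


print(alpaca_prompt("You are a helpful narrator.", "Describe the tavern."))
print(mistral_prompt("Describe the tavern."))
```

The `{output}` slot in the Alpaca template is where generation begins, so the helper stops right after the `### Response:` line.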

Merge Details

Merge Method

This model was merged using the TIES merge method using rombodawg/Open_Gpt4_8x7B_v0.2 as a base.

Models Merged

The following models were included in the merge:

mistralai/Mixtral-8x7B-Instruct-v0.1
NeverSleep/Noromaid-v0.1-mixtral-8x7b-Instruct-v3

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: mistralai/Mixtral-8x7B-Instruct-v0.1
    parameters:
      density: .5
      weight: 1
  - model: NeverSleep/Noromaid-v0.1-mixtral-8x7b-Instruct-v3
    parameters:
      density: .5
      weight: .7
merge_method: ties
base_model: rombodawg/Open_Gpt4_8x7B_v0.2
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
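For readers unfamiliar with TIES, the toy sketch below illustrates what the `density`, `weight`, and `normalize` settings mean on small Python lists. It is a simplified illustration of the general TIES idea (trim each task vector, elect a sign per parameter, merge only the agreeing values), not the code of the merge tool that produced this model, which works on full tensors and may differ in detail. The names `trim` and `ties_merge` and the sample numbers are hypothetical.

```python
def trim(delta, density):
    """Keep the top `density` fraction of entries by magnitude, zero the rest."""
    k = int(round(density * len(delta)))
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(base, models, weights, density, normalize=True):
    """Toy TIES merge over flat lists of parameters."""
    # 1. Task vectors: difference from the base model, then trimmed.
    deltas = [trim([m - b for m, b in zip(model, base)], density)
              for model in models]
    merged = []
    for i, b in enumerate(base):
        weighted = [w * d[i] for w, d in zip(weights, deltas)]
        # 2. Elect the sign with the larger total weighted magnitude.
        sign = 1.0 if sum(weighted) >= 0 else -1.0
        # 3. Merge only the contributions that agree with the elected sign.
        agree = [(w, v) for w, v in zip(weights, weighted) if v * sign > 0]
        if not agree:
            merged.append(b)
            continue
        total = sum(v for _, v in agree)
        scale = sum(w for w, _ in agree) if normalize else 1.0
        merged.append(b + total / scale)
    return merged


base = [0.0, 0.0, 0.0, 0.0]
model_a = [1.0, -0.2, 0.5, 0.1]   # stands in for the weight-1 model
model_b = [-0.4, 0.8, 0.6, 0.05]  # stands in for the weight-0.7 model
print(ties_merge(base, [model_a, model_b], weights=[1.0, 0.7], density=0.5))
```

With `density: .5`, each model contributes only the half of its changes with the largest magnitude, and `normalize: true` divides by the summed weights of the contributions that survive, so the result stays on the scale of a single model's change.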

Support

If you want to support us, you can here.
