metadata
base_model:
  - tokyotech-llm/Swallow-70b-NVE-instruct-hf
  - dreamgen/opus-v0.5-70b
  - GOAT-AI/GOAT-70B-Storytelling
  - Doctor-Shotgun/lzlv-limarpv3-l2-70b
  - alac/Waxwing-Storytelling-70B-LoRA
tags:
  - mergekit
  - merge
language:
  - en
  - ja
library_name: transformers
pipeline_tag: text-generation
license: llama2
model_type: llama

Swallow-70b-NVE-RP

Important Notice:

For personal and academic use only.

Description

This model is suited to role-playing and storytelling, but it is not a strong choice for multi-turn chat.

This model was created for personal and academic use only. The merge uses only fine-tuned Llama 2 models, but the commercial-use licensing of some of them is unclear.

If there is a licensing problem, the rights holder should contact me directly. License changes will not be made in response to requests from anyone else.

Test environment

This model was tested using text-generation-webui. I used the simple-1 and Null presets for generation.

Recommendation

Use simple-1 settings:

  • temperature: 0.7
  • top_p: 0.9
  • repetition_penalty: 1.15
  • top_k: 20

Tested temperature Range

  • temperature: 0.3 - 1.0

Tested repetition_penalty Range

  • repetition_penalty: 1.0 - 1.15
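
As a rough illustration of what these settings control, the sketch below applies temperature, top_k, and top_p to a toy set of logits in pure Python. It is not the text-generation-webui or transformers implementation: the function name `filter_distribution` and the example logits are hypothetical, and repetition_penalty is left out for brevity.

```python
import math

def filter_distribution(logits, temperature=0.7, top_k=20, top_p=0.9):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-k: keep at most top_k tokens, most probable first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    top_k_mass = sum(probs[i] for i in ranked)
    # Top-p (nucleus): keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / top_k_mass
        if cumulative >= top_p:
            break
    # Renormalise the surviving probabilities so they sum to 1.
    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}

toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]  # hypothetical values, not model output
print(filter_distribution(toy_logits))
```

Lowering the temperature or top_p shrinks the set of tokens that can be sampled, which is why the lower end of the tested ranges tends to give more conservative text.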

Prompt template

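Swallow Style (Alpaca format)

The template is written in Japanese and should be passed to the model as-is. Its opening line means "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request." `### 指示:` is the instruction header and `### 応答:` is the response header.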

以下に、あるタスクを説明する指示があり、それに付随する入力が更なる文脈を提供しています。リクエストを適切に完了するための回答を記述してください。

### 指示:
{instruction}

### 応答:

The prompt styles of Doctor-Shotgun/lzlv-limarpv3-l2-70b and alac/Waxwing-Storytelling-70B-LoRA can also be used, although they have not been fully tested.

Use the instruct model

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "nitky/Swallow-70b-NVE-RP"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    device_map="auto",
    load_in_4bit=True,  # 4-bit quantization; requires the bitsandbytes package
)


PROMPT_DICT = {
    "prompt_input": (
        "以下に、あるタスクを説明する指示があり、それに付随する入力が更なる文脈を提供しています。"
        "リクエストを適切に完了するための回答を記述してください。\n\n"
        "### 指示:\n{instruction}\n\n### 入力:\n{input}\n\n### 応答:"
    ),
    "prompt_no_input": (
        "以下に、あるタスクを説明する指示があります。"
        "リクエストを適切に完了するための回答を記述してください。\n\n"
        "### 指示:\n{instruction}\n\n### 応答:"
    ),
}

def create_prompt(instruction, input=None):
    """
    Generates a prompt based on the given instruction and an optional input.
    If input is provided, it uses the 'prompt_input' template from PROMPT_DICT.
    If no input is provided, it uses the 'prompt_no_input' template.

    Args:
        instruction (str): The instruction describing the task.
        input (str, optional): Additional input providing context for the task. Default is None.

    Returns:
        str: The generated prompt.
    """
    if input:
        # Use the 'prompt_input' template when additional input is provided
        return PROMPT_DICT["prompt_input"].format(instruction=instruction, input=input)
    else:
        # Use the 'prompt_no_input' template when no additional input is provided
        return PROMPT_DICT["prompt_no_input"].format(instruction=instruction)

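# Example usage
# Instruction: "Please provide detailed information on the following topic."
instruction_example = "以下のトピックに関する詳細な情報を提供してください。"
# Input: "Tell me about the main campuses of Tokyo Institute of Technology."
input_example = "東京工業大学の主なキャンパスについて教えてください"
prompt = create_prompt(instruction_example, input_example)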

input_ids = tokenizer.encode(
    prompt,
    add_special_tokens=False,
    return_tensors="pt"
)

tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=200,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.15,
    top_k=20,
    do_sample=True,
)

out = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(out)
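
Because `tokens[0]` holds the prompt followed by the continuation, the decoded text starts with the full template. The sketch below shows one way to keep only the reply, assuming the response header `### 応答:` from the template above appears in the decoded text. The helper name `extract_response` and the sample string are hypothetical.

```python
RESPONSE_MARKER = "### 応答:"  # response header used by the prompt template

def extract_response(decoded_text, marker=RESPONSE_MARKER):
    """Return the text after the first response marker, or the whole text if absent."""
    _, found, tail = decoded_text.partition(marker)
    return tail.strip() if found else decoded_text.strip()

# Hypothetical decoded output: instruction "Please say hello.", reply "Hello."
sample = "### 指示:\n挨拶してください。\n\n### 応答:\nこんにちは。"
print(extract_response(sample))  # prints: こんにちは。
```

An alternative is to slice the generated ids before decoding, for example `tokens[0][input_ids.shape[-1]:]`, which avoids relying on the marker text.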

Merge Details

Merge Method

This model was merged using the DARE TIES and SLERP merge methods, with tokyotech-llm/Swallow-70b-NVE-instruct-hf as the base. Two DARE TIES merges produce intermediate models, which are then combined with SLERP.
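
For readers unfamiliar with SLERP, the sketch below shows spherical linear interpolation between two small vectors in pure Python. It illustrates the general formula only and is not mergekit's implementation, which operates on model tensors. The function name `slerp`, the `eps` threshold, and the example vectors are assumptions made for illustration.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors (lists of floats)."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 == 0.0 or norm1 == 0.0:
        # A zero vector has no direction: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    # Angle between the two vectors, from their normalised dot product.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    omega = math.acos(dot)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly parallel (or opposite) vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Halfway between two orthogonal unit vectors (hypothetical example values).
print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

Unlike a plain weighted average, the interpolated vector here follows the arc between the two inputs, so the midpoint of two unit vectors still has unit length.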

Models Merged

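The following models were included in the merge:

  • GOAT-AI/GOAT-70B-Storytelling
  • dreamgen/opus-v0.5-70b
  • Doctor-Shotgun/lzlv-limarpv3-l2-70b
  • alac/Waxwing-Storytelling-70B-LoRA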

Configuration

Command example:

# Adjust the paths and options to match your environment
mergekit-mega --cuda --lora-merge-cache ~/text-generation-webui/loras/models--alac--Waxwing-Storytelling-70B-LoRA Swallow-70b-NVE-RP.yml ~/text-generation-webui/models

The following YAML configuration was used to produce this model:

models:
  - model: tokyotech-llm/Swallow-70b-NVE-instruct-hf
    # no parameters necessary for base model
  - model: GOAT-AI/GOAT-70B-Storytelling # storytelling
    parameters:
      density: 1
      weight: 0.25
  - model: dreamgen/opus-v0.5-70b # creative roleplay
    parameters:
      density: 1
      weight: 0.25
merge_method: dare_ties
base_model: tokyotech-llm/Swallow-70b-NVE-instruct-hf
dtype: bfloat16
name: Swallow-70b-NVE-RP-base
---
models:
  - model: tokyotech-llm/Swallow-70b-NVE-instruct-hf
    # no parameters necessary for base model
  - model: Doctor-Shotgun/lzlv-limarpv3-l2-70b # roleplay configuration
    parameters:
      density: 1
      weight: 0.25
merge_method: dare_ties
base_model: tokyotech-llm/Swallow-70b-NVE-instruct-hf
dtype: bfloat16
name: Swallow-70b-NVE-RP-flavor
---
slices:
  - sources:
      - model: Swallow-70b-NVE-RP-base
        layer_range: [0, 80]
      - model: Swallow-70b-NVE-RP-flavor
        layer_range: [0, 80]
merge_method: slerp
base_model: Swallow-70b-NVE-RP-base
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
dtype: bfloat16
name: Swallow-70b-NVE-RP
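
The list values given for `t` act as a gradient across the layer stack rather than a single constant. In mergekit's slerp, t = 0 returns the base model (here Swallow-70b-NVE-RP-base) and t = 1 returns the other model. The sketch below shows one reading of the gradient, assuming the anchor values are evenly spaced over the 80 layers and linearly interpolated in between. The helper `layer_t` is hypothetical, and mergekit's exact endpoint handling may differ.

```python
def layer_t(anchors, layer_index, num_layers=80):
    """Linearly interpolate evenly spaced anchor values at a given layer index."""
    if num_layers < 2 or len(anchors) < 2:
        return float(anchors[0])
    # Position of this layer on a 0..(len(anchors) - 1) scale.
    position = layer_index / (num_layers - 1) * (len(anchors) - 1)
    lower = min(int(position), len(anchors) - 2)
    fraction = position - lower
    return anchors[lower] * (1 - fraction) + anchors[lower + 1] * fraction

SELF_ATTN_T = [0, 0.5, 0.3, 0.7, 1]  # self_attn values from the slerp config
MLP_T = [1, 0.5, 0.7, 0.3, 0]        # mlp values from the slerp config

# Print the interpolation factor for a few layers of each tensor group.
for layer in (0, 20, 40, 60, 79):
    print(layer, round(layer_t(SELF_ATTN_T, layer), 3), round(layer_t(MLP_T, layer), 3))
```

Under this reading, the self-attention tensors shift from the base merge toward the flavor merge with depth, while the MLP tensors shift in the opposite direction, and all remaining tensors use the fixed fallback of 0.5.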