---
library_name: transformers
license: apache-2.0
datasets:
- OpenAssistant/oasst2
- nvidia/HelpSteer
language:
- en
- ja
tags:
- mistral
- steerlm
base_model: mistral-community/Mistral-7B-v0.2
---

# KARAKURI LM 7B APM v0.2

## Model Details

### Model Description

- **Developed by:** [KARAKURI Inc.](https://about.karakuri.ai/)
- **Model type:** Causal decoder-only transformer language model
- **Languages:** Primarily English
- **License:** Apache 2.0
- **Finetuned from model:** [mistral-community/Mistral-7B-v0.2](https://huggingface.co/mistral-community/Mistral-7B-v0.2)
- **Contact:** For questions and comments about the model, please email `[email protected]`

## Usage

KARAKURI LM 7B APM v0.2 is an attribute prediction model that rates model responses on various aspects that make a response desirable.

Given a multi-turn conversation between a user and an assistant, the model rates every assistant turn on the following attributes, each on a scale from 0 to 4:

- helpfulness: Overall helpfulness of the response to the prompt.
- correctness: Inclusion of all pertinent facts without errors.
- coherence: Consistency and clarity of expression.
- complexity: Intellectual depth required to write the response (i.e., whether the response can be written by anyone with basic language competency or requires deep domain expertise).
- verbosity: Amount of detail included in the response, relative to what is asked for in the prompt.
- quality: Perceived goodness of the response.
- toxicity: Undesirable elements such as vulgar, harmful, or potentially biased content.
- humor: Sense of humor within the response.
- creativity: Willingness to generate a non-conventional response.

The first five attributes are derived from HelpSteer, while the remaining four are derived from OASST2. Pass `label="helpsteer"` or `label="oasst"` to the chat template to choose which set the model predicts.

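The ratings come back as a flat `name: score` string, followed by a closing attribute tag and the end-of-sequence token (see the decoded outputs in the Transformers example in this section). A minimal sketch of turning that string into a dictionary is shown here. `parse_ratings` and the constant names are hypothetical helpers for illustration, not part of the model or of 🤗 Transformers, and the range check assumes the 0 to 4 scale described above.

```python
import re

# Attribute names as listed in this card, grouped by source dataset.
HELPSTEER_ATTRIBUTES = ("helpfulness", "correctness", "coherence", "complexity", "verbosity")
OASST_ATTRIBUTES = ("quality", "toxicity", "humor", "creativity")
KNOWN_ATTRIBUTES = set(HELPSTEER_ATTRIBUTES) | set(OASST_ATTRIBUTES)

# Matches lowercase "name: digits" pairs; tags such as [/ATTR_1] do not match.
_PAIR = re.compile(r"([a-z]+):\s*(\d+)")


def parse_ratings(decoded: str) -> dict[str, int]:
    """Hypothetical helper: extract `name: score` pairs from decoded model output."""
    ratings = {}
    for name, value in _PAIR.findall(decoded):
        if name not in KNOWN_ATTRIBUTES:
            continue  # ignore anything that is not a documented attribute
        score = int(value)
        if not 0 <= score <= 4:
            raise ValueError(f"{name} score {score} is outside the 0-4 range")
        ratings[name] = score
    return ratings


decoded = "helpfulness: 2 correctness: 1 coherence: 2 complexity: 1 verbosity: 1 [/ATTR_1]<eos>"
print(parse_ratings(decoded))
```

Raising on out-of-range scores rather than clamping them is a design choice of this sketch: it surfaces malformed generations early instead of silently rewriting them.
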
You can run the model using 🤗 Transformers:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "karakuri-ai/karakuri-lm-7b-apm-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Rate a single assistant turn on the HelpSteer attributes.
messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I help you today?"},
]
# Inspect the prompt text that the chat template produces.
tokenizer.apply_chat_template(
    messages,
    label="helpsteer",
    tokenize=False,
    add_generation_prompt=True,
)
# <bos>[INST] Hello! [/INST] Hello! How can I help you today? [ATTR_1]

input_ids = tokenizer.apply_chat_template(
    messages,
    label="helpsteer",
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=32)
# Decode only the newly generated tokens, which hold the ratings.
tokenizer.decode(outputs[0][input_ids.shape[-1]:])
# helpfulness: 2 correctness: 1 coherence: 2 complexity: 1 verbosity: 1 [/ATTR_1]<eos>

# Rate a later turn: pass the earlier rating back as a "label" message.
messages += [
    {"role": "label", "content": "helpfulness: 2 correctness: 1 coherence: 2 complexity: 1 verbosity: 1"},
    {"role": "user", "content": "Thank you!"},
    {"role": "assistant", "content": "You're welcome! I'm happy to help however I can."},
]
tokenizer.apply_chat_template(
    messages,
    label="helpsteer",
    tokenize=False,
    add_generation_prompt=True,
)
# <bos>[INST] Hello! [/INST] Hello! How can I help you today? [ATTR_1] helpfulness: 2 correctness: 1 coherence: 2 complexity: 1 verbosity: 1 [/ATTR_1]<eos>[INST] Thank you! [/INST] You're welcome! I'm happy to help however I can. [ATTR_1]

# Rate a turn on the OASST2 attributes instead.
messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I help you today?"},
]
tokenizer.apply_chat_template(
    messages,
    label="oasst",
    tokenize=False,
    add_generation_prompt=True,
)
# <bos>[INST] Hello! [/INST] Hello! How can I help you today? [ATTR_2]

input_ids = tokenizer.apply_chat_template(
    messages,
    label="oasst",
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=32)
tokenizer.decode(outputs[0][input_ids.shape[-1]:])
# quality: 3 toxicity: 1 humor: 1 creativity: 1 [/ATTR_2]<eos>
```

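In the multi-turn part of the example above, the rating of an earlier assistant turn is placed in the conversation as a message with role `label` before the next user turn. The following is a minimal sketch of automating that interleaving for a whole conversation. `rate_conversation`, `rate_fn`, and `fake_rate_fn` are hypothetical names introduced for illustration; `rate_fn` stands in for the `apply_chat_template`, `generate`, and `decode` steps and is stubbed here so the sketch runs without loading the model.

```python
from typing import Callable

Message = dict[str, str]


def rate_conversation(
    messages: list[Message],
    rate_fn: Callable[[list[Message]], str],
) -> list[Message]:
    """Hypothetical helper: copy `messages`, adding a `label` message after each assistant turn.

    `rate_fn` receives the conversation up to and including the assistant turn
    being rated (with earlier labels interleaved) and returns the rating string.
    """
    labeled: list[Message] = []
    for message in messages:
        labeled.append(message)
        if message["role"] == "assistant":
            rating = rate_fn(list(labeled))
            labeled.append({"role": "label", "content": rating})
    return labeled


def fake_rate_fn(history: list[Message]) -> str:
    # Stand-in for the real model call; always returns the same made-up scores.
    return "helpfulness: 2 correctness: 1 coherence: 2 complexity: 1 verbosity: 1"


conversation = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I help you today?"},
    {"role": "user", "content": "Thank you!"},
    {"role": "assistant", "content": "You're welcome! I'm happy to help however I can."},
]
for message in rate_conversation(conversation, fake_rate_fn):
    print(f"{message['role']}: {message['content']}")
```

With the real model, `rate_fn` would presumably tokenize `history` with `tokenizer.apply_chat_template(..., add_generation_prompt=True)`, call `model.generate`, and strip the trailing closing tag and `<eos>` from the decoded text before returning it, so that the `label` content matches the format used in the example above.
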
## Training Details

### Training Data

- [OASST2](https://huggingface.co/datasets/OpenAssistant/oasst2)
- [HelpSteer](https://huggingface.co/datasets/nvidia/HelpSteer)

### Training Infrastructure

- **Hardware:** The model was trained on a single node of an Amazon EC2 trn1.32xlarge instance.
- **Software:** We use code based on [neuronx-nemo-megatron](https://github.com/aws-neuron/neuronx-nemo-megatron).

## Citation

```
@misc{karakuri_lm_7b_apm_v02,
    author       = { {KARAKURI} {I}nc. },
    title        = { {KARAKURI} {LM} 7{B} {APM} v0.2 },
    year         = { 2024 },
    url          = { https://huggingface.co/karakuri-ai/karakuri-lm-7b-apm-v0.2 },
    publisher    = { Hugging Face },
    journal      = { Hugging Face repository }
}
```