	---
	library_name: transformers
	license: apache-2.0
	datasets:
	- OpenAssistant/oasst2
	- nvidia/HelpSteer
	language:
	- en
	- ja
	tags:
	- mistral
	- steerlm
	base_model: mistral-community/Mistral-7B-v0.2
	---

	# KARAKURI LM 7B APM v0.2

	## Model Details

	### Model Description

	- Developed by: [KARAKURI Inc.](https://about.karakuri.ai/)
	- Model type: Causal decoder-only transformer language model
	- Languages: Primarily English
	- License: Apache 2.0
	- Finetuned from model: [mistral-community/Mistral-7B-v0.2](https://huggingface.co/mistral-community/Mistral-7B-v0.2)
	- Contact: For questions and comments about the model, please email `[email protected]`

	## Usage

	KARAKURI LM 7B APM v0.2 is an attribute prediction model that rates model responses on the aspects that make a response desirable.

	Given a conversation with multiple turns between user and assistant, the model rates the following attributes (between 0 and 4) for every assistant turn.

	- helpfulness: Overall helpfulness of the response to the prompt.
	- correctness: Inclusion of all pertinent facts without errors.
	- coherence: Consistency and clarity of expression.
	- complexity: Intellectual depth required to write the response (i.e., whether the response can be written by anyone with basic language competency or requires deep domain expertise).
	- verbosity: Amount of detail included in the response, relative to what is asked for in the prompt.
	- quality: Perceived goodness of response.
	- toxicity: Undesirable elements such as vulgar, harmful, or potentially biased content.
	- humor: Sense of humor within the response.
	- creativity: Willingness to generate a non-conventional response.

	The first five are derived from HelpSteer, while the remaining four are derived from OASST2.
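
The two attribute groups and the 0 to 4 range can be written down as plain Python data, which is handy for checking ratings before they are used downstream. This is a minimal sketch: the names `HELPSTEER_ATTRIBUTES`, `OASST_ATTRIBUTES`, and `validate_ratings` are illustrative and not part of the model or of Transformers, and it assumes scores are integers, as in the example outputs in this card.

```python
# Illustrative constants (not part of the model's API): attribute names per label set.
HELPSTEER_ATTRIBUTES = ("helpfulness", "correctness", "coherence", "complexity", "verbosity")
OASST_ATTRIBUTES = ("quality", "toxicity", "humor", "creativity")


def validate_ratings(ratings: dict[str, int], attributes: tuple[str, ...]) -> None:
    """Raise ValueError unless `ratings` has exactly `attributes`, each scored 0..4."""
    missing = set(attributes) - set(ratings)
    extra = set(ratings) - set(attributes)
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)}, unexpected={sorted(extra)}")
    for name, score in ratings.items():
        # Assumption: scores are integers, matching the sample outputs.
        if not isinstance(score, int) or not 0 <= score <= 4:
            raise ValueError(f"{name} must be an integer between 0 and 4, got {score!r}")
```

For example, `validate_ratings({"quality": 3, "toxicity": 1, "humor": 1, "creativity": 1}, OASST_ATTRIBUTES)` returns without error, while a score of 5 or a missing attribute raises `ValueError`.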

	You can run the model using 🤗 Transformers:

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_id = "karakuri-ai/karakuri-lm-7b-apm-v0.2"
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    torch_dtype="auto",
	    device_map="auto",
	)

	# Rate the HelpSteer attributes of the last assistant turn.
	messages = [
	    {"role": "user", "content": "Hello!"},
	    {"role": "assistant", "content": "Hello! How can I help you today?"},
	]
	# Render the prompt as text to inspect it.
	tokenizer.apply_chat_template(
	    messages,
	    label="helpsteer",
	    tokenize=False,
	    add_generation_prompt=True,
	)
	# <bos>[INST] Hello! [/INST] Hello! How can I help you today? [ATTR_1]

	# Tokenize the same prompt and generate the ratings.
	input_ids = tokenizer.apply_chat_template(
	    messages,
	    label="helpsteer",
	    add_generation_prompt=True,
	    return_tensors="pt",
	).to(model.device)
	outputs = model.generate(input_ids, max_new_tokens=32)
	# Decode only the newly generated tokens.
	tokenizer.decode(outputs[0][input_ids.shape[-1]:])
	# helpfulness: 2 correctness: 1 coherence: 2 complexity: 1 verbosity: 1 [/ATTR_1]<eos>

	# Multi-turn: pass the previous ratings back as a "label" message, then add the next turn.
	messages += [
	    {"role": "label", "content": "helpfulness: 2 correctness: 1 coherence: 2 complexity: 1 verbosity: 1"},
	    {"role": "user", "content": "Thank you!"},
	    {"role": "assistant", "content": "You're welcome! I'm happy to help however I can."},
	]
	tokenizer.apply_chat_template(
	    messages,
	    label="helpsteer",
	    tokenize=False,
	    add_generation_prompt=True,
	)
	# <bos>[INST] Hello! [/INST] Hello! How can I help you today? [ATTR_1] helpfulness: 2 correctness: 1 coherence: 2 complexity: 1 verbosity: 1 [/ATTR_1]<eos>[INST] Thank you! [/INST] You're welcome! I'm happy to help however I can. [ATTR_1]

	# Rate the OASST attributes instead by switching the label.
	messages = [
	    {"role": "user", "content": "Hello!"},
	    {"role": "assistant", "content": "Hello! How can I help you today?"},
	]
	tokenizer.apply_chat_template(
	    messages,
	    label="oasst",
	    tokenize=False,
	    add_generation_prompt=True,
	)
	# <bos>[INST] Hello! [/INST] Hello! How can I help you today? [ATTR_2]

	input_ids = tokenizer.apply_chat_template(
	    messages,
	    label="oasst",
	    add_generation_prompt=True,
	    return_tensors="pt",
	).to(model.device)
	outputs = model.generate(input_ids, max_new_tokens=32)
	tokenizer.decode(outputs[0][input_ids.shape[-1]:])
	# quality: 3 toxicity: 1 humor: 1 creativity: 1 [/ATTR_2]<eos>
	```
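
The decoded output is a flat string of `name: score` pairs followed by a closing tag, so it usually needs to be parsed before the scores can be used. The following is a minimal sketch of one way to do that with a regular expression. `parse_ratings` is a hypothetical helper, not part of the model or of Transformers, and it assumes the output follows the format shown in the comments above.

```python
import re

# Matches "name: digits" pairs such as "helpfulness: 2".
_PAIR = re.compile(r"([a-z_]+):\s*(\d+)")


def parse_ratings(decoded: str) -> dict[str, int]:
    """Extract attribute scores from decoded model output into a dict."""
    # Discard the closing tag (e.g. "[/ATTR_1]") and anything after it.
    body = decoded.split("[/ATTR", 1)[0]
    return {name: int(score) for name, score in _PAIR.findall(body)}


print(parse_ratings("quality: 3 toxicity: 1 humor: 1 creativity: 1 [/ATTR_2]<eos>"))
# {'quality': 3, 'toxicity': 1, 'humor': 1, 'creativity': 1}
```

The same string, minus the closing tag and `<eos>`, is what goes into the `"label"` message when rating a later turn of the same conversation.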

	## Training Details

	### Training Data

	- [OASST2](https://huggingface.co/datasets/OpenAssistant/oasst2)
	- [HelpSteer](https://huggingface.co/datasets/nvidia/HelpSteer)

	### Training Infrastructure

	- Hardware: The model was trained on a single node, an Amazon EC2 trn1.32xlarge instance.
	- Software: We use code based on [neuronx-nemo-megatron](https://github.com/aws-neuron/neuronx-nemo-megatron).

	## Citation

	```
	@misc{karakuri_lm_7b_apm_v02,
	author = { {KARAKURI} {I}nc. },
	title = { {KARAKURI} {LM} 7{B} {APM} v0.2 },
	year = { 2024 },
	url = { https://huggingface.co/karakuri-ai/karakuri-lm-7b-apm-v0.2 },
	publisher = { Hugging Face },
	journal = { Hugging Face repository }
	}
	```