metadata
language:
- ja
tags:
- japanese-stablelm
- causal-lm
pipeline_tag: text-generation
base_model: stabilityai/japanese-stablelm-base-gamma-7b
datasets: argilla/ultrafeedback-binarized-preferences-cleaned
license: apache-2.0
extra_gated_fields:
Name: text
Email: text
Country: text
Organization or Affiliation: text
I allow Stability AI to contact me about information related to its models and research: checkbox
Reproduced Japanese Stable LM Instruct Gamma 7B
Model Description
This is a reproduction of a 7B-parameter decoder-only Japanese language model fine-tuned on instruction-following datasets, built on top of the base model Japanese Stable LM Base Gamma 7B.
This model was trained with the notus code base.
If you are looking for the official model, see Japanese Stable LM Instruct Gamma 7B.
Model Details
Training Datasets
- Japanese translation of the Databricks Dolly-15k dataset
- Japanese translation of a subset of the Anthropic HH dataset
- Wikinews subset of the izumi-lab/llm-japanese-dataset
Benchmarks
The results below were obtained with Nejumi-leaderboard Neo.
llm-jp-eval:
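| AVG | EL | FA | MC | MR | NLI | QA | RC | chabsa_set_f1 | jamp_exact_match | janli_exact_match | jcommonsenseqa_exact_match | jemhopqa_char_f1 | jnli_exact_match | jsem_exact_match | jsick_exact_match | jsquad_char_f1 | niilc_char_f1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1691 | 0.0 | 0.0 | 0.24 | 0.0 | 0.286 | 0.1688 | 0.4887 | 0.0 | 0.3 | 0.56 | 0.24 | 0.1334 | 0.08 | 0.28 | 0.21 | 0.4887 | 0.2042 |

Japanese MT-Bench:

| coding | extraction | humanities | math | reasoning | roleplay | stem | writing |
|---|---|---|---|---|---|---|---|
| 1.3 | 1.75 | 2.35 | 1.45 | 3.4 | 5.8 | 4.3 | 3.1 |

Overall Average: 0.266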
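As a sanity check on the scores reported above, the following minimal sketch recomputes the llm-jp-eval category scores and AVG from the per-task values, and the mean of the eight Japanese MT-Bench category scores. The task-to-category grouping is an assumption inferred from the numbers themselves (the card does not state it), and FA and MR are taken as reported because no per-task column is listed for them. The sketch does not try to reproduce the overall average of 0.266, since the card does not say how that figure is derived.

```python
from statistics import mean

# Per-task llm-jp-eval scores, copied from the benchmark results in this card.
task_scores = {
    "chabsa_set_f1": 0.0,
    "jamp_exact_match": 0.3,
    "janli_exact_match": 0.56,
    "jcommonsenseqa_exact_match": 0.24,
    "jemhopqa_char_f1": 0.1334,
    "jnli_exact_match": 0.08,
    "jsem_exact_match": 0.28,
    "jsick_exact_match": 0.21,
    "jsquad_char_f1": 0.4887,
    "niilc_char_f1": 0.2042,
}

# ASSUMPTION: this grouping is inferred from the reported numbers,
# not taken from the card or from the leaderboard documentation.
category_tasks = {
    "EL": ["chabsa_set_f1"],
    "MC": ["jcommonsenseqa_exact_match"],
    "NLI": [
        "jamp_exact_match",
        "janli_exact_match",
        "jnli_exact_match",
        "jsem_exact_match",
        "jsick_exact_match",
    ],
    "QA": ["jemhopqa_char_f1", "niilc_char_f1"],
    "RC": ["jsquad_char_f1"],
}

# FA and MR have no per-task column in the card, so use the reported values.
category_scores = {"FA": 0.0, "MR": 0.0}
for category, tasks in category_tasks.items():
    category_scores[category] = mean(task_scores[t] for t in tasks)

# Unweighted mean over the seven categories.
llm_jp_eval_avg = mean(category_scores.values())

# Japanese MT-Bench category scores, copied from this card.
mt_bench = {
    "coding": 1.3,
    "extraction": 1.75,
    "humanities": 2.35,
    "math": 1.45,
    "reasoning": 3.4,
    "roleplay": 5.8,
    "stem": 4.3,
    "writing": 3.1,
}
mt_bench_mean = mean(mt_bench.values())

print(f"llm-jp-eval AVG: {llm_jp_eval_avg:.4f}")
print(f"MT-Bench mean:   {mt_bench_mean:.2f}")
```

Under the assumed grouping, the recomputed NLI, QA, and AVG values agree with the reported 0.286, 0.1688, and 0.1691 to four decimal places, which suggests that AVG is an unweighted mean over the seven categories.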