---
base_model:
- karakuri-ai/karakuri-lm-70b-chat-v0.1
library_name: transformers
tags:
- mergekit
- merge
license: llama2
language:
- ja
---
|
# karakuri-lm-chat-upscaled-103b-v0.1
|
[GGUF版はこちら/Click here for the GGUF version](https://huggingface.co/Aratako/karakuri-lm-chat-upscaled-103b-v0.1-GGUF) |
|
|
|
## 概要/Description |
|
|
|
[karakuri-ai/karakuri-lm-70b-chat-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-70b-chat-v0.1)を自身でフランケンマージし、103bまで拡張したモデルです。ライセンスに関しては元モデルと同一です。 |
|
|
|
[wolfram/miqu-1-103b](https://huggingface.co/wolfram/miqu-1-103b)と同じマージ手法を用いています。 |
|
|
|
This is a 103b frankenmerge of [karakuri-ai/karakuri-lm-70b-chat-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-70b-chat-v0.1) created by interleaving layers of [karakuri-ai/karakuri-lm-70b-chat-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-70b-chat-v0.1) with itself using [mergekit](https://github.com/cg123/mergekit). Please refer to the original model regarding the license. |
|
|
|
Inspired by [wolfram/miqu-1-103b](https://huggingface.co/wolfram/miqu-1-103b). |
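
For reproducibility, a merge like this can be run either with the `mergekit-yaml` CLI or through mergekit's Python API. The sketch below follows the API shown in mergekit's README and assumes the recipe from the Configuration section has been saved as `config.yaml`; option names may differ slightly between mergekit versions.

```python
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the merge recipe (the YAML shown under "Configuration" below).
with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Run the passthrough merge and write the 120-layer model to ./merged.
run_merge(
    merge_config,
    out_path="./merged",
    options=MergeOptions(
        copy_tokenizer=True,  # reuse the base model's tokenizer as-is
        lazy_unpickle=True,   # lower peak memory while reading shards
    ),
)
```

The equivalent CLI invocation would be `mergekit-yaml config.yaml ./merged`.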
|
|
|
## ライセンス/License |
|
元モデルのライセンスを継承します。元モデルのライセンスを引用します。 |
|
|
|
This model inherits the license of the original model, which is quoted below.
|
|
|
> Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.
>
> Subject to the license above, and except for commercial purposes, you are free to share and adapt KARAKURI LM, provided that you must, in a recognizable and appropriate manner, (i) state that you are using KARAKURI LM developed by KARAKURI Inc., when you publish or make available to third parties KARAKURI LM, its derivative works or modification, or any output or results of KARAKURI LM or its derivative works or modification, and (ii) indicate your contributions, if you modified any material of KARAKURI LM.
>
> If you plan to use KARAKURI LM for commercial purposes, please contact us beforehand. You are not authorized to use KARAKURI LM for commercial purposes unless we expressly grant you such rights.
>
> If you have any questions regarding the interpretation of above terms, please also feel free to contact us.
|
|
|
## ベンチマーク/Benchmark |
|
ベースとしたkarakuri-ai/karakuri-lm-70b-chat-v0.1と本モデルの[japanese-mt-bench](https://github.com/Stability-AI/FastChat/tree/jp-stable/fastchat/llm_judge)の結果は以下の通りです。(シングルターン, 4ビット量子化)

The [japanese-mt-bench](https://github.com/Stability-AI/FastChat/tree/jp-stable/fastchat/llm_judge) scores for the base model karakuri-ai/karakuri-lm-70b-chat-v0.1 and for this model are shown below. (Single turn, 4-bit quantization)
|
|
|
平均スコアは低くなっていますが、本モデルの出力は元モデルより長くなっていることが目視で確認され、ベンチマーク設定の関係上出力が途中で途切れてしまい低評価をされることが多い印象でした。(主にHumanitiesやWriting)

こちらを加味すると総合的な性能は同等かあるいはやや高いのではと考察しています。

Although the average score is lower, manual inspection confirmed that this model's outputs are longer than the base model's; because of the benchmark settings, they were often truncated mid-generation and rated poorly as a result (mainly in Humanities and Writing). Taking this into account, I believe the overall performance is comparable to, or slightly better than, the base model.
|
|Model|Size|Coding|Extraction|Humanities|Math|Reasoning|Roleplay|STEM|Writing|avg_score|
|---|---|---|---|---|---|---|---|---|---|---|
| karakuri-lm-70b-chat-v0.1 | 70B | **4.8** | 7.4 | **9.3** | 2.8 | 5.9 | **8.2** | **9.3** | **9.3** | **7.125** |
| This model | 103B | 3.3 | **8.0** | 8.5 | **3.4** | **6.8** | 7.6 | 9.0 | 8.2 | 6.850 |
|
|
|
![japanese-mt-bench radar chart](./japanese_mt_bench.png)
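
As the numbers in the table confirm, the avg_score column is the unweighted mean of the eight category scores:

```python
# avg_score is the unweighted mean of the eight japanese-mt-bench categories.
base   = [4.8, 7.4, 9.3, 2.8, 5.9, 8.2, 9.3, 9.3]  # karakuri-lm-70b-chat-v0.1
merged = [3.3, 8.0, 8.5, 3.4, 6.8, 7.6, 9.0, 8.2]  # this model (103B)

print(sum(base) / len(base))      # 7.125
print(sum(merged) / len(merged))  # 6.85
```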
|
|
|
**ベンチマークに使用したプロンプト/Prompt used for the benchmark**
|
```
<s>[INST] <<SYS>>
あなたは誠実で優秀な日本人のアシスタントです。
<</SYS>>

{instruction} [ATTR] helpfulness: 4 correctness: 4 coherence: 4 complexity: 4 verbosity: 4 quality: 4 toxicity: 0 humor: 0 creativity: 0 [/ATTR] [/INST]
```
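
For reference, here is a minimal transformers sketch that loads the model in 4-bit (as in the benchmark runs) and applies the prompt template above verbatim. The repository id and generation settings are illustrative assumptions, not the exact benchmark harness.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumed repository id for this merge; substitute a local path if needed.
model_id = "Aratako/karakuri-lm-chat-upscaled-103b-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
)

instruction = "日本の首都はどこですか?"
prompt = (
    "<s>[INST] <<SYS>>\n"
    "あなたは誠実で優秀な日本人のアシスタントです。\n"
    "<</SYS>>\n\n"
    f"{instruction} [ATTR] helpfulness: 4 correctness: 4 coherence: 4 "
    "complexity: 4 verbosity: 4 quality: 4 toxicity: 0 humor: 0 "
    "creativity: 0 [/ATTR] [/INST]"
)

# The template already contains <s>, so skip the tokenizer's own BOS token.
inputs = tokenizer(prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```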
|
|
|
## Merge Details |
|
### Merge Method |
|
|
|
This model was merged using the passthrough merge method. |
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge: |
|
* ./karakuri-lm-70b-chat-v0.1 |
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml
merge_method: passthrough
slices:
- sources:
  - model: ./karakuri-lm-70b-chat-v0.1
    layer_range: [0, 40]
- sources:
  - model: ./karakuri-lm-70b-chat-v0.1
    layer_range: [20, 60]
- sources:
  - model: ./karakuri-lm-70b-chat-v0.1
    layer_range: [40, 80]
dtype: bfloat16
```
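
The three slices overlap by 20 layers each, so the 80 decoder layers of the base model become 40 + 40 + 40 = 120 layers in the merged model; since the embeddings and LM head are not duplicated, this takes the parameter count from roughly 70B to roughly 103B. A quick sketch of the resulting layer order:

```python
# Slice ranges from the config above (end-exclusive, as mergekit interprets them).
slices = [(0, 40), (20, 60), (40, 80)]

layers = [i for start, end in slices for i in range(start, end)]
print(len(layers))    # 120 (vs. 80 layers in the base model)
print(layers[38:43])  # [38, 39, 20, 21, 22] -- the seam where the second slice begins
```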