---
language:
- th
- en
license: apache-2.0
library_name: transformers
tags:
- openthaigpt
- llama
datasets:
- kobkrit/rd-taxqa
- iapp_wiki_qa_squad
- Thaweewat/alpaca-cleaned-52k-th
- Thaweewat/instruction-wild-52k-th
- Thaweewat/databricks-dolly-15k-th
- Thaweewat/hc3-24k-th
- Thaweewat/gpteacher-20k-th
- Thaweewat/onet-m6-social
- Thaweewat/alpaca-finance-43k-th
pipeline_tag: text-generation
model-index:
- name: openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 44.97
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 70.19
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 36.22
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 49.99
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 69.38
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 1.36
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
      name: Open LLM Leaderboard
---
# 🇹🇭 OpenThaiGPT 1.0.0-beta
<a href="https://openthaigpt.aieat.or.th/"><img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2Fb8eiMDaqiEQL6ahbAY0h%2Fimage.png?alt=media&token=6fce78fd-2cca-4c0a-9648-bd5518e644ce" width="200px"></a>
🇹🇭 OpenThaiGPT Version 1.0.0-beta is a 7B-parameter Thai-language chat model based on LLaMA v2, finetuned to follow Thai-translated instructions. More than 24,500 of the most common Thai words have been added to the tokenizer vocabulary, which speeds up Thai text generation considerably.
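The checkpoint is distributed in Hugging Face format, so it can be loaded with the `transformers` library. The snippet below is a minimal sketch, assuming a recent `transformers` plus `accelerate` and a GPU that fits a 7B model in fp16; the Thai prompt and generation parameters are illustrative only, and the official prompt template and demo are linked under Code and Weight below.

```python
# Minimal inference sketch (assumptions: transformers + accelerate installed,
# a GPU with enough memory for a 7B model in fp16; prompt and sampling
# parameters are illustrative, not the official settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so the 7B model fits on a single GPU
    device_map="auto",          # requires the accelerate package
)

# Illustrative Thai prompt: "Hello, please recommend interesting places to visit in Bangkok."
prompt = "สวัสดีครับ ช่วยแนะนำสถานที่ท่องเที่ยวที่น่าสนใจในกรุงเทพฯ หน่อย"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```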
## Upgrade from OpenThaiGPT 1.0.0-alpha
- Added more than 24,500 of the most common Thai words to the tokenizer vocabulary and re-pretrained the embedding layers, which makes the model generate Thai text about 10 times faster than the previous version (a quick way to inspect this is sketched below).
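A rough way to see the effect of the extended vocabulary is to tokenize a Thai sentence and look at the tokenizer size and token count, compared with the base LLaMA v2 tokenizer's 32,000 entries. The sample sentence and printed numbers below are illustrative assumptions, not official benchmark figures.

```python
# Quick tokenizer check (illustrative; the sentence is an arbitrary example).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf")

text = "ปัญญาประดิษฐ์กำลังเปลี่ยนแปลงประเทศไทย"  # "Artificial intelligence is transforming Thailand"
tokens = tokenizer.tokenize(text)

print("tokenizer size:", len(tokenizer))            # base LLaMA v2 has 32,000 entries; this should be larger
print("tokens for the sample sentence:", len(tokens))
# Fewer tokens per Thai sentence means fewer decoding steps, which is where the speedup comes from.
```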
## Support
- Official website: https://openthaigpt.aieat.or.th
- Facebook page: https://web.facebook.com/groups/openthaigpt
- A Discord server for discussion and support [here](https://discord.gg/rUTp6dfVUF)
- E-mail: [email protected]
## License
**Source Code**: Apache Software License 2.0.<br>
**Weights**: Released for research and **commercial use**.<br>
## Code and Weight
**Colab Demo**: https://colab.research.google.com/drive/1kDQidCtY9lDpk49i7P3JjLAcJM04lawu?usp=sharing<br>
**Finetune Code**: https://github.com/OpenThaiGPT/openthaigpt-finetune-010beta<br>
**Inference Code**: https://github.com/OpenThaiGPT/openthaigpt<br>
**Weight (Huggingface Checkpoint)**: https://huggingface.co/openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
## Sponsors
Pantip.com, ThaiSC<br>
<img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2FiWjRxBQgo0HUDcpZKf6A%2Fimage.png?alt=media&token=4fef4517-0b4d-46d6-a5e3-25c30c8137a6" width="100px">
<img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2Ft96uNUI71mAFwkXUtxQt%2Fimage.png?alt=media&token=f8057c0c-5c5f-41ac-bb4b-ad02ee3d4dc2" width="100px">
### Powered by
OpenThaiGPT Volunteers, Artificial Intelligence Entrepreneur Association of Thailand (AIEAT), and Artificial Intelligence Association of Thailand (AIAT)
<img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2F6yWPXxdoW76a4UBsM8lw%2Fimage.png?alt=media&token=1006ee8e-5327-4bc0-b9a9-a02e93b0c032" width="100px">
<img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2FBwsmSovEIhW9AEOlHTFU%2Fimage.png?alt=media&token=5b550289-e9e2-44b3-bb8f-d3057d74f247" width="100px">
### Authors
* Kobkrit Viriyayudhakorn ([email protected])
* Sumeth Yuenyong ([email protected])
* Thaweewat Rugsujarit ([email protected])
* Jillaphat Jaroenkantasima ([email protected])
* Norapat Buppodom ([email protected])
* Koravich Sangkaew ([email protected])
* Peerawat Rojratchadakorn ([email protected])
* Surapon Nonesung ([email protected])
* Chanon Utupon ([email protected])
* Sadhis Wongprayoon ([email protected])
* Nucharee Thongthungwong ([email protected])
* Chawakorn Phiantham ([email protected])
* Patteera Triamamornwooth ([email protected])
* Nattarika Juntarapaoraya ([email protected])
* Kriangkrai Saetan ([email protected])
* Pitikorn Khlaisamniang ([email protected])
<i>Disclaimer: Provided responses are not guaranteed.</i>
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_openthaigpt__openthaigpt-1.0.0-beta-7b-chat-ckpt-hf)
| Metric |Value|
|---------------------------------|----:|
|Avg. |45.35|
|AI2 Reasoning Challenge (25-Shot)|44.97|
|HellaSwag (10-Shot) |70.19|
|MMLU (5-Shot) |36.22|
|TruthfulQA (0-shot) |49.99|
|Winogrande (5-shot) |69.38|
|GSM8k (5-shot) | 1.36|