Update README.md
README.md
@@ -1,3 +1,81 @@
---
license: apache-2.0
datasets:
- kobkrit/rd-taxqa
- iapp_wiki_qa_squad
- Thaweewat/alpaca-cleaned-52k-th
- Thaweewat/instruction-wild-52k-th
- Thaweewat/databricks-dolly-15k-th
- Thaweewat/hc3-24k-th
- Thaweewat/gpteacher-20k-th
- Thaweewat/onet-m6-social
- Thaweewat/alpaca-finance-43k-th
language:
- th
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- openthaigpt
- llama
---

# 🇹🇭 OpenThaiGPT 1.0.0-beta
<img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2Fb8eiMDaqiEQL6ahbAY0h%2Fimage.png?alt=media&token=6fce78fd-2cca-4c0a-9648-bd5518e644ce" width="200px">

🇹🇭 OpenThaiGPT Version 1.0.0-beta is a 7B-parameter Thai-language LLaMA v2 Chat model, fine-tuned to follow Thai-translated instructions. Its vocabulary is extended with more than 24,500 of the most common Thai words, which speeds up Thai text generation. For more information, please visit the project's website (https://openthaigpt.aieat.or.th) or GitHub (https://github.com/OpenThaiGPT/openthaigpt).

## Upgrade from OpenThaiGPT 1.0.0-alpha
- Added more than 24,500 of the most common Thai words to the LLM's vocabulary and re-pretrained the embedding layers, which makes Thai text generation 10 times faster than in the previous version.
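
To illustrate why a larger Thai vocabulary can speed up generation, here is a minimal sketch using a toy greedy longest-match tokenizer. It is not the project's actual tokenizer (LLaMA uses SentencePiece), and the two vocabulary entries are hypothetical examples chosen for the demonstration. The point is only that whole-word entries reduce the number of tokens, and so the number of decoding steps, needed for the same text.

```python
def tokenize(text, vocab):
    """Toy greedy longest-match tokenizer; unknown text falls back to single characters."""
    max_len = max((len(word) for word in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Assumption: the base vocabulary has no Thai words, so each character becomes a token.
base_vocab = set()
# Assumption: two hypothetical whole-word entries added to the vocabulary.
extended_vocab = base_vocab | {"สวัสดี", "ครับ"}

text = "สวัสดีครับ"
before = tokenize(text, base_vocab)
after = tokenize(text, extended_vocab)
print(f"tokens before: {len(before)}, tokens after: {len(after)}")
```

An autoregressive model emits one token per decoding step, so fewer tokens per sentence means fewer steps for the same output text.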

## Support
- Official website: https://openthaigpt.aieat.or.th
- Facebook page: https://web.facebook.com/groups/openthaigpt
- Discord server for discussion and support: [here](https://discord.gg/rUTp6dfVUF)
- E-mail: [email protected]

## License
**Source Code**: Apache Software License 2.0.<br>
**Weight**: Research and **commercial use**.<br>

## Code and Weight
**Colab Demo**: https://colab.research.google.com/drive/1kDQidCtY9lDpk49i7P3JjLAcJM04lawu?usp=sharing<br>
**Finetune Code**: https://github.com/OpenThaiGPT/openthaigpt-finetune-010beta<br>
**Inference Code**: https://github.com/OpenThaiGPT/openthaigpt<br>
**Weight (Lora Adapter)**: https://huggingface.co/openthaigpt/openthaigpt-1.0.0-alpha-7b-chat<br>
**Weight (Huggingface Checkpoint)**: https://huggingface.co/openthaigpt/openthaigpt-1.0.0-alpha-7b-chat-ckpt-hf<br>
**Weight (GGML)**: https://huggingface.co/openthaigpt/openthaigpt-1.0.0-alpha-7b-chat-ggml<br>
**Weight (Quantized 4bit GGML)**: https://huggingface.co/openthaigpt/openthaigpt-1.0.0-alpha-7b-chat-ggml-q4
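
As a rough sketch of how the Hugging Face checkpoint listed above might be loaded with the `transformers` library, consider the following. The URL is copied verbatim from the "Weight (Huggingface Checkpoint)" entry. It is an assumption that the repository loads through `AutoTokenizer` and `AutoModelForCausalLM`, and the plain-text prompt is an assumption as well, since this README does not specify a prompt template. The helper names `repo_id_from_url` and `generate` are hypothetical.

```python
from urllib.parse import urlparse

# Copied from the "Weight (Huggingface Checkpoint)" entry in this README.
CHECKPOINT_URL = "https://huggingface.co/openthaigpt/openthaigpt-1.0.0-alpha-7b-chat-ckpt-hf"


def repo_id_from_url(url: str) -> str:
    """Turn a huggingface.co model URL into the 'owner/name' repository id."""
    return urlparse(url).path.strip("/")


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Download the checkpoint and generate a completion (assumed to load as a causal LM)."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = repo_id_from_url(CHECKPOINT_URL)
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(repo_id_from_url(CHECKPOINT_URL))
    # Uncomment to run the model; this needs transformers, torch,
    # and enough memory for 7B-parameter weights.
    # print(generate("สวัสดีครับ"))
```

The Colab demo and the inference repository linked above remain the authoritative references for the intended prompt format and generation settings.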


## Sponsors
Pantip.com, ThaiSC<br>
<img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2FiWjRxBQgo0HUDcpZKf6A%2Fimage.png?alt=media&token=4fef4517-0b4d-46d6-a5e3-25c30c8137a6" width="100px">
<img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2Ft96uNUI71mAFwkXUtxQt%2Fimage.png?alt=media&token=f8057c0c-5c5f-41ac-bb4b-ad02ee3d4dc2" width="100px">

### Powered by
OpenThaiGPT Volunteers, Artificial Intelligence Entrepreneur Association of Thailand (AIEAT), and Artificial Intelligence Association of Thailand (AIAT)

<img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2F6yWPXxdoW76a4UBsM8lw%2Fimage.png?alt=media&token=1006ee8e-5327-4bc0-b9a9-a02e93b0c032" width="100px">
<img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2FBwsmSovEIhW9AEOlHTFU%2Fimage.png?alt=media&token=5b550289-e9e2-44b3-bb8f-d3057d74f247" width="100px">

### Authors
* Kobkrit Viriyayudhakorn ([email protected])
* Sumeth Yuenyong ([email protected])
* Thaweewat Rugsujarit ([email protected])
* Jillaphat Jaroenkantasima ([email protected])
* Norapat Buppodom ([email protected])
* Koravich Sangkaew ([email protected])
* Peerawat Rojratchadakorn ([email protected])
* Surapon Nonesung ([email protected])
* Chanon Utupon ([email protected])
* Sadhis Wongprayoon ([email protected])
* Nucharee Thongthungwong ([email protected])
* Chawakorn Phiantham ([email protected])
* Patteera Triamamornwooth ([email protected])
* Nattarika Juntarapaoraya ([email protected])
* Kriangkrai Saetan ([email protected])
* Pitikorn Khlaisamniang ([email protected])

<i>Disclaimer: Provided responses are not guaranteed.</i>