OrionStarAI/Orion-14B-Chat

renillhuang committed on Jan 25

Commit a9189fd • 1 Parent(s): 00b0e32

Update README_ja.md

Files changed (1)

README_ja.md +33 -4

@@ -32,7 +32,7 @@
 - [📖 Model Introduction](#model-introduction)
 - [🔗 Model Download](#model-download)
 - [🔖 Model Benchmark](#model-benchmark)
-- [📊 Model Inference](#model-inference)
 - [📜 Declarations & License](#declarations-license)
 - [🥇 Company Introduction](#company-introduction)
@@ -262,9 +262,38 @@ CUDA_VISIBLE_DEVICES=0 python demo/text_generation_base.py --model OrionStarAI/O
 # Chat model
 CUDA_VISIBLE_DEVICES=0 python demo/text_generation.py --model OrionStarAI/Orion-14B-Chat --tokenizer OrionStarAI/Orion-14B-Chat --prompt hi
 ```
-## 4.4 Example Output
-### 4.4.1 Casual Chat
 `````
 User: Hello
@@ -286,7 +315,7 @@ User: Tell me a joke.
 Orion-14B: Sure, here's a classic one-liner: Why don't scientists trust atoms? Because they make up everything.
 `````
-### 4.4.2. Japanese & Korean Chat
 `````
 User：自己を紹介してください

 - [📖 Model Introduction](#model-introduction)
 - [🔗 Model Download](#model-download)
 - [🔖 Model Benchmark](#model-benchmark)
+- [📊 Model Inference](#model-inference)[<img src="./assets/imgs/vllm.png" alt="vllm" height="20"/>](#vllm) [<img src="./assets/imgs/llama_cpp.png" alt="llamacpp" height="20"/>](#llama-cpp)
 - [📜 Declarations & License](#declarations-license)
 - [🥇 Company Introduction](#company-introduction)
 # Chat model
 CUDA_VISIBLE_DEVICES=0 python demo/text_generation.py --model OrionStarAI/Orion-14B-Chat --tokenizer OrionStarAI/Orion-14B-Chat --prompt hi
 ```
+## 4.4. Inference by vllm
+- Project URL<br>
+  https://github.com/vllm-project/vllm
+- Pull Request<br>
+  https://github.com/vllm-project/vllm/pull/2539
+<a name="llama-cpp"></a><br>
+## 4.5. Inference by llama.cpp
+- Project URL<br>
+  https://github.com/ggerganov/llama.cpp
+- Pull Request<br>
+  https://github.com/ggerganov/llama.cpp/pull/5118
+- How to convert to GGUF model
+  ```shell
+  python convert-hf-to-gguf.py path/to/Orion-14B-Chat --outfile chat.gguf
+  ```
+- How to run generation
+  ```shell
+  ./main --frequency-penalty 0.5 --top-k 5 --top-p 0.9 -m chat.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 400 -e
+  ```
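Note (not part of the commit): the two llama.cpp steps added above can be chained in one script. The following is a minimal sketch, assuming it is run from the root of a llama.cpp checkout that contains `convert-hf-to-gguf.py` and a built `./main`. The names `MODEL_DIR`, `OUT`, `DRY_RUN`, and `run` are hypothetical helpers introduced for illustration. By default the sketch only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch only: chain the GGUF conversion and generation steps from section 4.5.
# MODEL_DIR, OUT, DRY_RUN and run() are hypothetical names for illustration;
# they are not part of llama.cpp or of this commit.
MODEL_DIR="${MODEL_DIR:-path/to/Orion-14B-Chat}"  # local Hugging Face checkpoint directory
OUT="${OUT:-chat.gguf}"                           # GGUF file to write and then load
DRY_RUN="${DRY_RUN:-1}"                           # 1 = print commands only, 0 = execute them

run() {
  # Always show the command; execute it only when DRY_RUN=0.
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Step 1: convert the Hugging Face checkpoint to GGUF; stop if conversion fails.
run python convert-hf-to-gguf.py "$MODEL_DIR" --outfile "$OUT" || exit 1

# Step 2: generate from the converted model with the sampling flags from the README.
run ./main --frequency-penalty 0.5 --top-k 5 --top-p 0.9 -m "$OUT" \
  -p 'Building a website can be done in 10 simple steps:\nStep 1:' -n 400 -e
```

Saved under any name (for example a hypothetical `orion_gguf.sh`), `DRY_RUN=0 MODEL_DIR=/your/checkpoint sh orion_gguf.sh` would execute both steps. Without `DRY_RUN=0` the script only echoes the commands, which is a cheap way to check paths and flags before a long conversion.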
+## 4.6 Example Output
+### 4.6.1 Casual Chat
 `````
 User: Hello
 Orion-14B: Sure, here's a classic one-liner: Why don't scientists trust atoms? Because they make up everything.
 `````
+### 4.6.2. Japanese & Korean Chat
 `````
 User：自己を紹介してください