rbgo committed on
Commit
4b0db67
1 Parent(s): 5de8b0b

Update README.md

Files changed (1)
  1. README.md +37 -0
README.md CHANGED
@@ -79,3 +79,40 @@ Models are released as sharded safetensors files.
  <!-- README_AWQ.md-provided-files end -->
 
  <!-- README_AWQ.md-text-generation-webui start -->
+
+ <!-- How to use start -->
+ ## How to use
+ You will need the following software packages and Python libraries:
+ ```yaml
+ build:
+   cuda_version: "12.1.1"
+   system_packages:
+     - "libssl-dev"
+   python_packages:
+     - "torch==2.1.2"
+     - "vllm==0.2.6"
+     - "transformers==4.36.2"
+     - "accelerate==0.25.0"
+ ```
+
+
+ Here is the code for <b>app.py</b>:
+ ```python
+ from vllm import LLM, SamplingParams
+
+ class InferlessPythonModel:
+     def initialize(self):
+         # Build the sampling settings and load the GPTQ-quantized model once at startup.
+         self.sampling_params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=256)
+         self.llm = LLM(model="Inferless/SOLAR-10.7B-Instruct-v1.0-GPTQ", quantization="gptq", dtype="float16")
+
+     def infer(self, inputs):
+         prompts = inputs["prompt"]
+         result = self.llm.generate(prompts, self.sampling_params)
+         result_output = [[output.outputs[0].text, output.outputs[0].token_ids] for output in result]
+         # Return the [text, token_ids] pair for the first prompt only.
+         return {'generated_result': result_output[0]}
+
+     def finalize(self):
+         pass
+ ```
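
The shape of the value returned by `infer` can be checked without a GPU or a vLLM install by running the same result-shaping step on stand-in objects. This is only a minimal sketch: `FakeCompletion`, `FakeRequestOutput`, and `shape_result` are hypothetical names introduced here for illustration. They mimic just the `.outputs[0].text` and `.outputs[0].token_ids` attributes that `app.py` reads and are not part of vLLM.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeCompletion:
    # Hypothetical stand-in for one generated completion.
    text: str
    token_ids: List[int]


@dataclass
class FakeRequestOutput:
    # Hypothetical stand-in for the per-prompt result; app.py only reads .outputs.
    outputs: List[FakeCompletion]


def shape_result(result):
    # Same shaping as infer(): build one [text, token_ids] pair per prompt,
    # then return only the first prompt's pair.
    result_output = [[o.outputs[0].text, o.outputs[0].token_ids] for o in result]
    return {"generated_result": result_output[0]}


fake = [FakeRequestOutput(outputs=[FakeCompletion(text="Hello!", token_ids=[1, 2, 3])])]
response = shape_result(fake)
print(response)  # {'generated_result': ['Hello!', [1, 2, 3]]}
```

Because only `result_output[0]` is returned, a request carrying several prompts would still yield a single pair. Returning `result_output` whole would be one way to support batches, if that is the intent.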