Choosing sampling or greedy.
#240, opened by Tristo
On the Hugging Face model page for bigscience/bloom, the hosted widget has an option to choose between sampling and greedy decoding. When we use the API, it automatically uses greedy (I assume). How do I switch that to sampling?
import requests

API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom"
headers = {"Authorization": "Bearer token"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": "Can you please let us know more details about your ",
})
print(output)
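For reference, here is a minimal sketch of what a sampling request might look like. It assumes the hosted endpoint for this model accepts the text-generation `parameters` object documented for the Inference API (`do_sample`, `top_p`, `temperature`, `max_new_tokens`) and the `use_cache` option. `build_payload` is a hypothetical helper written for this example, and the parameter values are arbitrary illustrations, not recommended settings.

```python
API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom"
headers = {"Authorization": "Bearer token"}  # replace "token" with your own API token


def build_payload(prompt, sample=True, **generation_kwargs):
    """Build the JSON body for a text-generation request.

    Hypothetical helper for this example; not part of any library.
    """
    # Assumption: "do_sample" switches between sampling (True) and greedy (False).
    parameters = {"do_sample": sample}
    # Extra generation settings (e.g. top_p, temperature) are passed through as-is.
    parameters.update(generation_kwargs)
    return {
        "inputs": prompt,
        "parameters": parameters,
        # Assumption: disabling the cache avoids getting the same cached
        # completion back when sampling the same prompt repeatedly.
        "options": {"use_cache": False},
    }


def query(payload):
    # Imported here so the payload helper works without the HTTP dependency.
    import requests

    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()


# Illustrative values only.
payload = build_payload(
    "Can you please let us know more details about your ",
    sample=True,
    top_p=0.9,
    temperature=0.7,
    max_new_tokens=32,
)
```

Calling `print(query(payload))` would then send the request. Passing `sample=False` to `build_payload` should give greedy decoding again, if the endpoint honors `do_sample` as assumed.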