
Choosing sampling or greedy.

#240
by Tristo - opened

On the bigscience/bloom page on Hugging Face, there is an option for sampling or greedy decoding. When we use the API, it automatically uses greedy (I assume). How do I switch that to sampling?

import requests

API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom"
headers = {"Authorization": "Bearer token"}  # replace `token` with your Hugging Face API token

def query(payload):
    # Send the payload to the hosted Inference API and return the parsed JSON response
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": "Can you please let us know more details about your ",
})

print(output)
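For illustration, the Inference API's text-generation task also accepts a `parameters` object alongside `inputs`, which is where generation settings such as sampling go. Below is a minimal sketch of the same request with sampling switched on; the specific fields (`do_sample`, `temperature`, `top_k`, `top_p`, `max_new_tokens`) and their exact behavior on the hosted bloom endpoint are assumptions to verify against the Inference API documentation.

import requests

API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom"
headers = {"Authorization": "Bearer token"}  # replace `token` with your Hugging Face API token

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

# Pass generation settings in "parameters" to ask for sampling instead of greedy decoding.
output = query({
    "inputs": "Can you please let us know more details about your ",
    "parameters": {                # assumed field names; check the API docs
        "do_sample": True,         # switch from greedy decoding to sampling
        "temperature": 0.8,        # higher values give more varied output
        "top_k": 50,               # sample only from the 50 most likely tokens
        "top_p": 0.95,             # nucleus sampling over the top 95% probability mass
        "max_new_tokens": 50,      # number of tokens to generate
    },
})

print(output)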
