Every Inference gives the prompt as part of output also, any way to remove that?
#26 · opened by vermanic
Hey, I have a general problem: any model on HF always includes the input prompt in its output. Is there any way to exclude it? I just need the generated output.
Code:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

checkpoint = "HuggingFaceH4/starchat-beta"
device = "cuda" if torch.cuda.is_available() else "cpu"  # "cuda:X" for GPU usage or "cpu" for CPU usage


class Model:
    def __init__(self):
        print("Running in " + device)
        self.tokenizer = AutoTokenizer.from_pretrained(checkpoint)
        self.model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map='auto')

    def infer(self, input_text, token_count):
        inputs = self.tokenizer.encode(input_text, return_tensors="pt").to(device)
        outputs = self.model.generate(inputs, max_new_tokens=token_count)
        return self.tokenizer.decode(outputs[0])
```
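For reference, this is how I call it (a minimal sketch; the prompt string and token count below are just placeholders):

```python
model = Model()
# Example call; any prompt and token budget work here.
print(model.infer("Write a Python function that reverses a string.", 200))
```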
Also, `max_new_tokens` means the number of tokens I want the model to generate in its response, right?
vermanic changed discussion title from "Every Inference gives the prompt as part of output also, any way to fix this?" to "Every Inference gives the prompt as part of output also, any way to remove that?"
Resolved by:

```python
return self.tokenizer.decode(outputs[0])[len(input_text):]
```
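Note that slicing the decoded string by `len(input_text)` assumes the decoded prompt matches the raw input exactly (special tokens can break that). A more robust sketch, assuming the same decoder-only `generate` setup as above, is to drop the prompt token IDs before decoding:

```python
def infer(self, input_text, token_count):
    inputs = self.tokenizer.encode(input_text, return_tensors="pt").to(device)
    outputs = self.model.generate(inputs, max_new_tokens=token_count)
    # For causal LMs, generate() returns prompt + new tokens, so skip the prompt tokens.
    new_tokens = outputs[0][inputs.shape[-1]:]
    return self.tokenizer.decode(new_tokens, skip_special_tokens=True)
```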
vermanic changed discussion status to closed