Does it work with Open Interpreter locally, and how much RAM is required?
#24 opened by aiworld44
I tried running interpreter --local --model mistralai/Mistral-7B-Instruct-v0.1, but it didn't work.
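For a rough answer to the RAM question (back-of-the-envelope, not from official docs): Mistral-7B has about 7.2B parameters, so the fp16 weights alone take roughly 14-15 GB; budget around 16 GB of free RAM or VRAM for an unquantized run. With 4-bit quantization that drops to roughly 4-6 GB. Here is a minimal sketch of 4-bit loading with the transformers/bitsandbytes quantization API (this path needs a CUDA GPU; for pure CPU, a llama.cpp/GGUF build is the usual route instead):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization shrinks the ~14 GB fp16 weights to roughly 4 GB.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    quantization_config=quant_config,
    device_map="auto",  # let accelerate place layers on GPU/CPU
)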
Here is my code. You need to save the model locally in a subfolder (./Mistral/, relative to your .py file).
It does work for one to three queries before it breaks down (see the sketch after the code below for one likely fix). Since there is essentially no documentation on how to wire the inference workflow into a local pipeline, this is the best I've got. If people are interested in reverse engineering it, shoot me a message. As it stands, this looks like an ad front to promote paid services; let's change that.
import gradio as gr
from transformers import pipeline, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./Mistral/")
pipe = pipeline("text-generation", model="./Mistral/", max_new_tokens=512)

chat_history_tokens = []

def generate(message, history):
    global chat_history_tokens

    # gr.ChatInterface passes the new user message as a string;
    # fall back to the last entry if a message list arrives instead.
    new_message = message[-1]["content"] if isinstance(message, list) else message

    # Tokenize the new message; only the last message is kept as context.
    new_message_tokens = tokenizer.encode(new_message, add_special_tokens=False)
    chat_history_tokens = new_message_tokens

    # Decode the tokens back to a string to use as the prompt.
    prompt = tokenizer.decode(chat_history_tokens)

    try:
        print("Debug: Sending this prompt to the model:", prompt)
        outputs = pipe(prompt, pad_token_id=tokenizer.eos_token_id)
        print("Debug: Model's raw output:", outputs)

        # Strip the echoed prompt and common answer prefixes from the output.
        generated_text = outputs[0]["generated_text"].replace(prompt, "").strip()
        generated_text = generated_text.replace("Answer:", "").replace("A:", "").strip()
        print("Debug: Generated text after cleanup:", generated_text)

        # Append the model's reply tokens to the history.
        bot_reply_tokens = tokenizer.encode(generated_text, add_special_tokens=False)
        chat_history_tokens.extend(bot_reply_tokens)
    except Exception as e:
        print("Debug: Caught an exception:", str(e))
        return str(e)

    return generated_text

iface = gr.ChatInterface(fn=generate)
iface.launch()
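On the "works for one to three queries, then breaks down" point: Mistral-7B-Instruct is trained on the [INST] ... [/INST] chat format, and feeding it raw unformatted text while re-appending reply tokens tends to degenerate after a few turns. Here is a minimal sketch of a history-aware variant, assuming the tokenizer's built-in chat template (present for this model in recent transformers releases) and reusing the pipe and tokenizer defined above; the 6000-token threshold is an arbitrary safety margin, not a tuned value:

history_messages = []  # running list of {"role": ..., "content": ...} dicts

def generate_with_history(message, history):
    global history_messages
    history_messages.append({"role": "user", "content": message})

    # Render the whole conversation in Mistral's [INST] chat format.
    prompt = tokenizer.apply_chat_template(
        history_messages, tokenize=False, add_generation_prompt=True
    )

    # Drop the oldest user/assistant pair once the prompt nears the
    # 8192-token context window (6000 is an illustrative margin).
    while len(tokenizer.encode(prompt)) > 6000 and len(history_messages) > 2:
        del history_messages[:2]
        prompt = tokenizer.apply_chat_template(
            history_messages, tokenize=False, add_generation_prompt=True
        )

    # return_full_text=False makes the pipeline return only the new tokens.
    outputs = pipe(prompt, return_full_text=False, pad_token_id=tokenizer.eos_token_id)
    reply = outputs[0]["generated_text"].strip()

    history_messages.append({"role": "assistant", "content": reply})
    return reply

Swapping fn=generate for fn=generate_with_history in the ChatInterface call above should keep multi-turn context without the prompt growing unbounded.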