How to use in llama.cpp server
#15 by subbur - opened
How do I use this chat template in the llama.cpp server? Should I copy-paste it into the Prompt template box?
Hi,
Unfortunately, I am not very familiar with the llama.cpp server. I suspect a new chat template would need to be implemented based on the prompt template we provide in the model card. You can also check the sample code for more details on the prompt template.
First of all, you need to convert the model to GGUF format. You can do that with this notebook: https://colab.research.google.com/drive/1P646NEg33BZy4BfLDNpTz0V0lwIU3CHu#scrollTo=fD24jJxq7t3k. Afterwards, you can install llama.cpp and run the model:

```
./main -m model.gguf -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt
```
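Since the original question was about the llama.cpp *server* rather than the interactive CLI, here is a rough sketch of how that could look. This is an assumption on my part, not tested against this model: binary and flag names come from recent llama.cpp builds (older builds shipped `./server` instead of `./llama-server`), and `chatml` is only a placeholder template name, which you would replace with one matching this model's prompt format.

```shell
# Sketch, assuming a recent llama.cpp build with the OpenAI-compatible server.
# Start the server with the converted GGUF model; --chat-template picks one of
# llama.cpp's built-in chat templates ("chatml" here is just an example).
./llama-server -m model.gguf --port 8080 --chat-template chatml

# Then query it. The server applies the chat template itself, so you send
# plain chat messages rather than a hand-formatted prompt string:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello"}]}'
```

If none of the built-in templates match this model's prompt format, you would need to format the prompt yourself and use the server's plain `/completion` endpoint instead.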