Problem with the 'google/gemma-2-2b-it' API for Chat completion
Hi !
I've run into a big problem: the google/gemma-2-2b-it API does not seem to work for 'Chat Completion'. The official Hugging Face documentation for 'Chat Completion' gives this request:
curl 'https://api-inference.huggingface.co/models/google/gemma-2-2b-it/v1/chat/completions' \
  -H "Authorization: Bearer hf_***" \
  -H 'Content-Type: application/json' \
  -d '{ "model": "google/gemma-2-2b-it", "messages": [{"role": "user", "content": "What is the capital of France?"}], "max_tokens": 500, "stream": false }'
The address 'https://api-inference.huggingface.co/models/google/gemma-2-2b-it/v1/chat/completions' returns:
```
// 20240918223200
// https://api-inference.huggingface.co/models/google/gemma-2-2b-it/v1/chat/completions
{
  "error": "Model google/gemma-2-2b-it/v1/chat/completions does not exist"
}
```
Which API endpoint should I use to call google/gemma-2-2b-it for Chat completion, please?
Thx !
I was able to reproduce the issue. To resolve it, please use the following API endpoint: https://api-inference.huggingface.co/models/google/gemma-2-2b-it and refer to the corrected code below:
Thank you.
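A minimal sketch of a call to that base endpoint, assuming the plain text-generation payload shape (`inputs` plus a `parameters` object). The helper names `build_text_generation_request` and `generate`, and the `max_new_tokens` default of 500, are illustrative choices and not taken from the thread:

```python
import json
import urllib.request

# Base model endpoint suggested in the answer above (no /v1/chat/completions suffix).
BASE_URL = "https://api-inference.huggingface.co/models/google/gemma-2-2b-it"


def build_text_generation_request(prompt, token, max_new_tokens=500):
    """Build (but do not send) a POST request for the base model endpoint.

    Assumption: the endpoint accepts a text-generation body of the form
    {"inputs": ..., "parameters": {...}}.
    """
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(prompt, token):
    """Send the request and return the decoded JSON response."""
    request = build_text_generation_request(prompt, token)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `generate("What is the capital of France?", "hf_***")` with a real token sends the prompt as raw text, so the model receives no chat roles unless the prompt string itself encodes them.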
Thx @GopiUppari for your answer.
Yeah, it works for me this way. However, this solution makes a 'text-to-text' API call.
Unfortunately, it doesn't reproduce a 'Chat completion', i.e. a conversation with google/gemma-2-2b-it.
It doesn't accept the "messages" option:
"messages": [ { "role": "user", "content": "What is the best approach for integrating AI and blockchain technologies in a decentralized application?" } ],
from this request body:
{ "model": "google/gemma-2-2b-it", "messages": [ { "role": "user", "content": "What is the best approach for integrating AI and blockchain technologies in a decentralized application?" } ], "max_tokens": 500, "temperature": 0.7, "top_p": 0.95, "repetition_penalty": 1.15, "stream": false }
Indeed, the 'Chat completion' documentation says that the following request should work:
curl 'https://api-inference.huggingface.co/models/google/gemma-2-2b-it/v1/chat/completions' \
  -H "Authorization: Bearer hf_***" \
  -H 'Content-Type: application/json' \
  -d '{ "model": "google/gemma-2-2b-it", "messages": [{"role": "user", "content": "What is the capital of France?"}], "max_tokens": 500, "stream": false }'
It doesn't, because the endpoint https://api-inference.huggingface.co/models/google/gemma-2-2b-it/v1/chat/completions is reported as not existing.
How can I use the conversational API of google/gemma-2-2b-it, please?
Thx !
According to the documentation, passing the chat messages to the tokenizer.apply_chat_template function (with tokenize=False) returns a formatted string (<class 'str'>) that the model can interpret. You can use this same formatted string as the input in the curl command to ensure the model understands the conversation correctly.
Thank you.
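To illustrate the formatted string, here is a minimal sketch that mimics the Gemma chat template by hand. The turn markers (`<start_of_turn>`, `<end_of_turn>`, the `model` role) are written from memory of the Gemma template and should be treated as an assumption; `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` remains the authoritative source. The function name `format_gemma_prompt` is hypothetical, and the leading `<bos>` token is left out on the assumption that the server-side tokenizer adds it:

```python
def format_gemma_prompt(messages):
    """Render chat messages as one prompt string in a Gemma-style layout.

    Assumption: each turn is wrapped as
    <start_of_turn>{role}\n{content}<end_of_turn>\n
    with the "assistant" role renamed to "model".
    """
    parts = []
    for message in messages:
        role = message["role"]
        if role == "system":
            # Assumption: the Gemma template has no system role.
            raise ValueError("system role is not supported in this sketch")
        if role == "assistant":
            role = "model"
        content = message["content"].strip()
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")
    # Open a model turn so generation continues as the assistant's reply.
    parts.append("<start_of_turn>model\n")
    return "".join(parts)


if __name__ == "__main__":
    prompt = format_gemma_prompt(
        [{"role": "user", "content": "What is the capital of France?"}]
    )
    print(prompt)
```

The resulting string can be sent as the input text of the curl request to the base endpoint, and earlier turns can be kept in the `messages` list to carry a conversation forward.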
Thx for your answer @GopiUppari !