Suggestion

#3
by KingNish - opened
Poscye org

The Hugging Face InferenceClient is a quick and efficient way to obtain model responses, It also provides many models and also didn't require any storage.
Sample Space - https://huggingface.co/spaces/ehristoforu/mixtral-46.7b-chat

By the way Nice Project

Poscye org

Hello @KingNish ,

What a honor!

I found it very amazing you like it our space!

With @Lumpen1 we work on this also we have more spaces ideas to put in here!

Cool thank you for the suggestion we are very noob using gradio. But with this ZeroGPU advantage that HF give us we take the wheels.

client = InferenceClient(
    "mistralai/Mixtral-8x7B-Instruct-v0.1"
)

is this support JSON Schema or Grammars

Nevertheless if wanna be friends or work together we are open to that!

Poscye org
โ€ข
edited May 28

In this scenario, we are letting the llama-cpp-agent framework handle the client-inference, through the use of a provider selection.
Currently, it supports LlamaCPPPython (as shown in this example), LlamaCPPServer, VLLMServer and TGIServer.

Poscye org
client = InferenceClient(
    "mistralai/Mixtral-8x7B-Instruct-v0.1"
)

is this support JSON Schema or Grammars

Sadly, No.

Nevertheless if wanna be friends or work together we are open to that!

I'm definitely interested in exploring ways we can collaborate and share ideas.

Poscye org

Nice we can talk on discord of llama-cpp-agent if you like https://discord.gg/DwVpftn4
We are always open and friendly.

Poscye org

Also let me know if wanna join this org... The main purpose was to experiment in spaces with ZeroGPU

Poscye org

I would like to join this org.

Sign up or log in to comment