flashback29 
posted an update May 26
I just subscribed to the PRO monthly plan, but I still got rate limited when calling the Inference API.


const api_url = "https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B";
  // The Inference API reads the prompt from "inputs"; per-request flags go under "options".
  const payload = JSON.stringify({
    "inputs": input,
    "options": {"wait_for_model": true},
  });

  // Synchronous request (third argument is false), as in the original call.
  var xmlHttp = new XMLHttpRequest();
  xmlHttp.open("POST", api_url, false);
  // Headers must be set on the request object; send() only accepts the body.
  xmlHttp.setRequestHeader("Authorization", `Bearer ${API_TOKEN}`);
  xmlHttp.setRequestHeader("Content-Type", "application/json");
  xmlHttp.send(payload);
  return xmlHttp.responseText;

I need some help.

How many requests did you make before getting rate limited? (cc @Narsil and @osanseviero )

Maybe 10s or 20s
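
While the cause is being tracked down, one way to confirm that the failures really are rate-limit responses, and to ride them out, is to check the HTTP status and retry with a delay. The following is only a minimal sketch under a few assumptions: that rate limiting is signalled with HTTP 429 (the standard status for "Too Many Requests"), that a global `fetch` is available, and that the helper names `backoffDelayMs` and `postWithRetry` are hypothetical, not part of any Hugging Face library. It does not explain why a PRO account would be limited after so few requests.

```javascript
// Hypothetical helper: delay before retry number `attempt` (0-based).
// Uses a numeric Retry-After value (in seconds) when one is given,
// otherwise falls back to exponential backoff. Both are capped at maxMs.
function backoffDelayMs(attempt, retryAfter, baseMs = 1000, maxMs = 30000) {
  const seconds = Number(retryAfter);
  if (retryAfter != null && retryAfter !== "" && Number.isFinite(seconds) && seconds >= 0) {
    return Math.min(seconds * 1000, maxMs);
  }
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Hypothetical wrapper: POSTs the prompt and retries only on HTTP 429.
// `fetchImpl` is injectable so the function can be exercised without network access.
async function postWithRetry(url, token, input, maxRetries = 5, fetchImpl = fetch) {
  for (let attempt = 0; ; attempt++) {
    const res = await fetchImpl(url, {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ inputs: input }),
    });
    // Assumption: the service signals rate limiting with status 429.
    if (res.status !== 429 || attempt >= maxRetries) {
      return { status: res.status, text: await res.text() };
    }
    const delay = backoffDelayMs(attempt, res.headers.get("retry-after"));
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
}
```

Calling `postWithRetry(api_url, API_TOKEN, input)` returns the final status alongside the response text, so a status other than 429 (for example 401 for a token that is not being sent correctly) would point to a different problem than rate limiting.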