
Model Card for Model ID

Model Details

Model Description

  • Developed by: Declan Bracken, Armando Ordorica, Michael Santorelli, Paul Zhou
  • Model type: Transformer
  • Language(s) (NLP): English
  • Finetuned from model: bert-base-uncased

Model Sources [optional]

  • Repository: [More Information Needed]
  • Paper [optional]: [More Information Needed]
  • Demo [optional]: [More Information Needed]

Uses

Direct Use

Create a custom class that loads the model, the label encoder, and the BERT tokenizer used for training (bert-base-uncased), as shown below. Use the tokenizer to tokenize any input string you'd like, then pass it through the model to get outputs.

    import pickle

    import requests
    import torch
    from transformers import AutoConfig, BertForSequenceClassification, BertTokenizer

    class BERTClassifier:
        def __init__(self, model_identifier):
            # Load the tokenizer used for training (bert-base-uncased)
            self.tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

            # Load the config
            config = AutoConfig.from_pretrained(model_identifier)

            # Load the model
            self.model = BertForSequenceClassification.from_pretrained(model_identifier, config=config)
            self.model.eval()  # Set the model to evaluation mode

            # Load the label encoder
            encoder_url = f'https://huggingface.co/{model_identifier}/resolve/main/model_encoder.pkl'
            self.labels = pickle.loads(requests.get(encoder_url).content)

        def predict_category(self, text):
            # Tokenize the text
            inputs = self.tokenizer(text, return_tensors='pt', truncation=True, padding=True)

            # Predict
            with torch.no_grad():
                outputs = self.model(**inputs)

            # Get the prediction index
            prediction_idx = torch.argmax(outputs.logits, dim=1).item()

            # Decode the prediction index to get the label
            prediction_label = self.labels[prediction_idx]  # Use indexing for a NumPy array

            return prediction_label
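
A minimal usage sketch is shown below. The repository identifier is a placeholder (the actual model ID is not listed in this card), and the input string is arbitrary:

    # Hypothetical repository ID -- replace with this model's actual identifier
    classifier = BERTClassifier('your-username/your-model-id')

    # Classify an arbitrary input string
    label = classifier.predict_category("Some example text to classify")
    print(label)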
Model size: 109M params (F32, Safetensors)