
Using NeyabAI:

Direct Use:

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("XsoraS/NeyabAI")
tokenizer = GPT2TokenizerFast.from_pretrained("XsoraS/NeyabAI")

def generate_response(prompt):
    # Add .to(torch.device("cuda")) to the model and the inputs to use GPU acceleration.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        inputs.input_ids, attention_mask=inputs.attention_mask,
        max_length=512, do_sample=True, top_p=0.8, temperature=0.7, num_return_sequences=1,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

prompt = "Hello"
raw = generate_response("### Human: " + prompt + " \n### AI:")
# Strip the end-of-sequence marker and collapse whitespace.
response = " ".join(raw.replace("</s>", "").split())
print(response)
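
If a CUDA GPU is available, the same function can run there; only the device placement changes. A minimal sketch (generate_response_gpu is just an illustrative name, and the snippet assumes a CUDA-enabled PyTorch build):

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

def generate_response_gpu(prompt):
    # Move the tokenized inputs to the same device as the model before generating.
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output_ids = model.generate(
        inputs.input_ids, attention_mask=inputs.attention_mask,
        max_length=512, do_sample=True, top_p=0.8, temperature=0.7,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)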

Fine-Tuning:

This repository demonstrates how to fine-tune the NeyabAI (GPT-2) language model on a custom dataset using PyTorch and Hugging Face's Transformers library. The code provides an end-to-end example, from loading the dataset to training the model and evaluating its performance.

Requirements

  • Python 3.6+
  • PyTorch
  • Transformers (Hugging Face)
  • NumPy

You can install the required packages using pip:

pip install torch transformers numpy

Fine-Tuning Script

The following script outlines the steps for fine-tuning GPT-2 on a custom dataset:

from transformers import GPT2LMHeadModel, GPT2TokenizerFast
import torch
from torch.optim import AdamW  # transformers' own AdamW is deprecated/removed in recent versions
from torch.utils.data import DataLoader, TensorDataset
import numpy as np

# Load pre-trained model and tokenizer
model_name = "XsoraS/NeyabAI"
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2TokenizerFast.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# Example dataset
dataset = ["Your custom dataset goes here."]  # Replace with your actual dataset

# Tokenization function
def tokenize_function(examples):
    return tokenizer(examples, padding='max_length', truncation=True, max_length=512)

# Tokenize the dataset
tokenized_inputs = [tokenize_function(text) for text in dataset]
input_ids = [enc["input_ids"] for enc in tokenized_inputs]
attention_masks = [enc["attention_mask"] for enc in tokenized_inputs]

# Convert to torch tensors
input_ids = torch.tensor(input_ids)
attention_masks = torch.tensor(attention_masks)

# Use the inputs as labels; set padded positions to -100 so the loss ignores them
labels = input_ids.clone()
labels[attention_masks == 0] = -100

# Create DataLoader
batch_size = 8
dataset = TensorDataset(input_ids, attention_masks, labels)
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

# Configure device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Optional: cast to FP16 on GPU to save memory. Pure FP16 training can be numerically
# unstable; mixed precision (torch.cuda.amp) is usually the more robust choice.
if device.type == "cuda":
    model = model.half()

# Set up optimizer
optimizer = AdamW(model.parameters(), lr=3e-5)

# Define token-level accuracy calculation
def calculate_accuracy(preds, labels):
    # Shift by one position: the logits at step t predict the token at step t + 1.
    pred_flat = np.argmax(preds, axis=-1)[:, :-1].flatten()
    labels_flat = labels[:, 1:].flatten()
    mask = labels_flat != -100  # ignore padded positions
    return np.sum(pred_flat[mask] == labels_flat[mask]) / max(mask.sum(), 1)

# Training loop (simplified)
model.train()  # from_pretrained() returns the model in eval mode
for epoch in range(3):  # Adjust the number of epochs as needed
    for batch in dataloader:
        batch = tuple(t.to(device) for t in batch)
        input_ids, attention_masks, labels = batch

        outputs = model(input_ids, attention_mask=attention_masks, labels=labels)
        loss = outputs.loss
        logits = outputs.logits

        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        preds = logits.detach().cpu().numpy()
        label_ids = labels.to('cpu').numpy()
        acc = calculate_accuracy(preds, label_ids)

        print(f"Loss: {loss.item()}, Accuracy: {acc}")

print("Training complete!")

Notes

  • Dataset: Replace the dataset variable with your actual dataset (see the loading sketch after this list).
  • Max Length: Adjust the max_length parameter in the tokenize_function as needed based on the length of your input texts.
  • Batch Size and Learning Rate: You may need to tune the batch_size and learning rate (lr) according to your dataset and hardware capabilities.
  • Epochs: Adjust the number of epochs based on your convergence criteria.
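
As a concrete example of replacing the placeholder dataset, here is a minimal sketch that reads one training example per line from a plain-text file (data.txt is a hypothetical path; adapt it to your own data format):

with open("data.txt", encoding="utf-8") as f:
    dataset = [line.strip() for line in f if line.strip()]

For larger corpora you may prefer the Hugging Face datasets library, but a plain list of strings is all the script above assumes.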

Acknowledgments

  • This project uses the Transformers library by Hugging Face.
  • Inspired by various fine-tuning examples and tutorials from the Hugging Face community.