
Model Card for V2 Models

Model Description

This repository contains several GPT-2 based models that generate creative stories, superhero names, and superhero abilities. Each model produces narrative content from a user-supplied prompt.

Model Variants

  • Story Model: Generates stories based on prompts.
  • Name Model: Generates superhero names based on story context.
  • Abilities Model: Generates superhero abilities based on story context.
  • Midjourney Model: Generates Midjourney prompts for storytelling.

Training Data

The models were trained on a custom dataset stored in batch_ds_v2.txt, which includes various story prompts, superhero names, and abilities. The dataset was preprocessed to extract relevant parts for training.

Training Procedure

  • Framework: PyTorch with Hugging Face Transformers
  • Model: GPT-2
  • Training Arguments:
    • Learning Rate: 1e-4
    • Number of Epochs: 15
    • Max Steps: 5000
    • Batch Size: Auto-detected
    • Gradient Clipping: 1.0
    • Logging Steps: 1
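
The settings above could map onto Hugging Face `TrainingArguments` roughly as sketched below. This is an illustration under stated assumptions, not the training script used for these models. It assumes that "Auto-detected" batch size refers to `auto_find_batch_size=True` and that gradient clipping corresponds to `max_grad_norm`. The helper `effective_steps` is hypothetical and ignores gradient accumulation. It only illustrates that, in the Hugging Face Trainer, a positive `max_steps` takes precedence over `num_train_epochs`.

```python
import math

# Keyword arguments mirroring the hyperparameters listed above, using
# parameter names from transformers.TrainingArguments.
TRAINING_KWARGS = {
    "learning_rate": 1e-4,
    "num_train_epochs": 15,
    "max_steps": 5000,             # a positive value overrides num_train_epochs
    "auto_find_batch_size": True,  # ASSUMPTION: what "Auto-detected" refers to
    "max_grad_norm": 1.0,          # gradient clipping threshold
    "logging_steps": 1,
}


def effective_steps(num_examples: int, batch_size: int,
                    num_train_epochs: int, max_steps: int) -> int:
    """Hypothetical helper: optimizer steps a run would take.

    Ignores gradient accumulation. A positive max_steps wins over the
    epoch count, as in the Hugging Face Trainer.
    """
    if max_steps > 0:
        return max_steps
    return math.ceil(num_examples / batch_size) * num_train_epochs


# With transformers installed, the dict could be passed on like this
# (output_dir is a placeholder):
#   from transformers import TrainingArguments
#   args = TrainingArguments(output_dir="out", **TRAINING_KWARGS)
```

For example, with a hypothetical 10,000 training examples and a batch size of 8, 15 epochs would amount to 18,750 steps, so the 5000-step cap would end training first.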

Evaluation

The models were evaluated qualitatively during development, by judging whether the generated text was coherent and relevant to the prompt. No quantitative metrics are reported.

Inference

To run inference, send a POST request to the /generate/<model_path> endpoint of the Flask application. The request body must be a JSON object with an input_text key.

Example Request

```json
{
  "input_text": "[Ivan Ivanov, Lead Software Engineer, Superhero for Justice, Writing code, fixing issues, solving problems, Masculine, Long Hair, Adult]<endoftext>"
}
```

Example Response

The response will contain the generated text based on the input prompt.
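
The request described above could be sent from Python with only the standard library, as in the following minimal sketch. The base URL (`http://localhost:5000`, Flask's default development address) and the function names are assumptions for illustration. The response format is not documented here, so the sketch returns the raw response body as text instead of parsing it.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:5000"  # ASSUMPTION: Flask's default dev address


def build_generate_request(model_path: str, input_text: str,
                           base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for /generate/<model_path>."""
    url = f"{base_url}/generate/{quote(model_path)}"
    body = json.dumps({"input_text": input_text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model_path: str, input_text: str, timeout: float = 60.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_generate_request(model_path, input_text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage, with the Flask app running ("story_model" is a placeholder path):
#   text = generate("story_model", "[Ivan Ivanov, ..., Adult]<endoftext>")
#   print(text)
```

Building the request separately from sending it makes it possible to inspect the URL and payload before the server is available.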

Limitations

  • The models may generate biased or nonsensical outputs based on the training data.
  • They may not always understand complex prompts or context, leading to irrelevant or inaccurate responses.
  • The models are sensitive to input phrasing; slight changes in the prompt can yield different results.

License

These models are released under the MIT License. Please refer to the LICENSE file for more details.

Citation

If you use this model in your research or applications, please cite it as follows: