---
license: cc-by-nc-nd-4.0
language:
- en
model-index:
- name: roberta-large Image Prompt Classifier
results:
- task:
type: text-classification
dataset:
name: nsfw-text-detection
type: custom
metrics:
- name: Accuracy
type: self-reported
value: 93%
- name: Precision
type: self-reported
value: 88%
- name: Recall
type: self-reported
value: 90%
---
# roberta-large Image Prompt Classifier
## Model Overview
This model is a fine-tuned version of `roberta-large` designed specifically for classifying image generation prompts into three distinct categories: SAFE, QUESTIONABLE, and UNSAFE. Leveraging the robust capabilities of the `roberta-large` architecture, this model ensures high accuracy and reliability in identifying the nature of prompts used for generating images.
## Model Details
- **Model Name:** roberta-large Image Prompt Classifier
- **Base Model:** [roberta-large](https://huggingface.co/roberta-large)
- **Fine-tuned By:** Michał Młodawski
- **Categories:**
- `0`: SAFE
- `1`: QUESTIONABLE
- `2`: UNSAFE
## Use Cases
This model is particularly useful for platforms and applications involving AI-generated content, where it is crucial to filter and classify prompts to maintain content safety and appropriateness. Some potential applications include:
- **Content Moderation:** Automatically classify and filter prompts to prevent the generation of inappropriate or harmful images.
- **User Safety:** Enhance user experience by ensuring that generated content adheres to safety guidelines.
- **Compliance:** Help platforms comply with regulatory requirements by identifying and flagging potentially unsafe prompts.
## How It Works
The model takes an input prompt and classifies it into one of three categories:
1. **SAFE:** Prompts that are deemed appropriate and free from harmful content.
2. **QUESTIONABLE:** Prompts that may require further review due to potential ambiguity or slight risk.
3. **UNSAFE:** Prompts that are likely to generate inappropriate or harmful content.
The classification is based on the semantic understanding and contextual analysis provided by the `roberta-large` architecture, fine-tuned on a curated dataset tailored for this specific task.
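As an illustration of the final classification step, the sketch below converts a raw logits vector into a label and a softmax confidence. The logits values are made up for the example; only the id-to-label mapping comes from this card:

```python
import math

# Category ids as defined by this model card.
ID2LABEL = {0: "SAFE", 1: "QUESTIONABLE", 2: "UNSAFE"}

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_with_confidence(logits):
    """Return (label, confidence) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]

# Hypothetical logits for one prompt -- not real model output.
label, conf = label_with_confidence([3.1, 0.4, -1.2])
print(label, round(conf, 3))  # SAFE, with high confidence
```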
## Performance
The model has been tested and validated to gauge accuracy and reliability. Key self-reported metrics:
- **Accuracy:** 93%
- **Precision:** 88%
- **Recall:** 90%
These metrics highlight the model's capability to effectively distinguish between different categories of prompts with high confidence.
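For reference, accuracy and macro-averaged precision/recall for a three-class task can be computed as sketched below. The labels and predictions here are a toy example for illustration, not the model's actual evaluation data:

```python
def evaluate(y_true, y_pred, num_classes=3):
    """Return accuracy plus macro-averaged precision and recall."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls = [], []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return accuracy, sum(precisions) / num_classes, sum(recalls) / num_classes

# Toy labels: 0 = SAFE, 1 = QUESTIONABLE, 2 = UNSAFE.
acc, prec, rec = evaluate([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2])
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```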
## Getting Started
To start using the roberta-large Image Prompt Classifier, you can easily integrate it into your existing workflows with the following steps:
### Installation
```bash
pip install transformers==4.37.2
pip install torch==2.3.1
```
### Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_repo = "MichalMlodawski/nsfw-text-detection-large"
tokenizer = AutoTokenizer.from_pretrained(model_repo)
model = AutoModelForSequenceClassification.from_pretrained(model_repo)

def classify_text(text):
    """Classify a prompt: 0 = SAFE, 1 = QUESTIONABLE, 2 = UNSAFE."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    predicted_class = torch.argmax(outputs.logits, dim=1).item()
    return predicted_class

if __name__ == "__main__":
    examples = [
        {"text": "wild west cowboy caravan through Arizona Desert , Men Woman and children, horse drawn carriages", "expected_class": 0},
        {"text": "A person with questionable intentions in a dimly lit alley.", "expected_class": 1},
        {"text": "An inappropriate scene involving explicit content.", "expected_class": 2},
    ]
    for example in examples:
        predicted_class = classify_text(example["text"])
        print(f"Prompt: {example['text']}")
        print(f"Predicted class: {predicted_class}")
        print(f"Expected class: {example['expected_class']}")
        print("---")
```
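In a moderation pipeline, the predicted class id typically drives a policy decision. A minimal sketch follows; the action names and the fallback choice are illustrative assumptions, not part of the model:

```python
# Map the model's class ids to illustrative moderation actions.
ACTIONS = {
    0: "allow",         # SAFE
    1: "human_review",  # QUESTIONABLE
    2: "block",         # UNSAFE
}

def moderate(predicted_class):
    """Translate a predicted class id into a moderation action."""
    # Unknown ids fall back to human review as the conservative choice.
    return ACTIONS.get(predicted_class, "human_review")

print(moderate(0))  # allow
print(moderate(2))  # block
```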
### Disclaimer
The training dataset includes real image-generation prompts that may be perceived as abusive, offensive, or obscene. The examples and data may also contain unfavorable information about certain businesses. This data was collected as-is, and no legal responsibility is assumed for its contents.
Please note: a portion of the data was generated using Large Language Models (LLMs).