---
library_name: transformers
license: cc-by-4.0
datasets:
- hendrycks/ethics
---
# Model Card for fc91/phi3-mini-instruct-full_ethics-lora_v2.5

A fine-tuned (LoRA) version of microsoft/Phi-3-mini-4k-instruct, trained on subsets of the hendrycks/ethics dataset.
## How to Get Started with the Model

Use the code below to get started with the model.

Install the latest versions of the following Python libraries (for example, `pip install -U transformers torch accelerate peft bitsandbytes`):

- transformers
- torch
- accelerate
- peft
- bitsandbytes
### Run the model

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the fine-tuned LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

peft_model_id = "fc91/phi3-mini-instruct-full_ethics-lora_v2.5"
model = PeftModel.from_pretrained(base_model, peft_model_id)
```
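Optionally, the adapter can be folded into the base weights with PEFT's `merge_and_unload()`, which returns a plain transformers model. A minimal sketch; the save path is illustrative, not part of this repository:

```python
# Merge the LoRA weights into the base model so it can be used and saved
# without PEFT. The output path below is illustrative.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("phi3-mini-ethics-merged")
```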
### Run the model with a quantization configuration

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline
from peft import PeftModel

# Set up a 4-bit NF4 quantization configuration
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Load the base model with quantization
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    quantization_config=quantization_config,
    device_map="auto",
    attn_implementation="eager",
    torch_dtype="auto",
    trust_remote_code=True,
)

# Attach the fine-tuned LoRA adapter
peft_model_id = "fc91/phi3-mini-instruct-full_ethics-lora_v2.5"
model = PeftModel.from_pretrained(base_model, peft_model_id)

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Replace the bracketed placeholders with your ethical theory and user content.
messages = [
    {"role": "system", "content": "You are a helpful AI assistant sensitive to ethical concerns. Carefully read and interpret the user prompt under a [SPECIFY ETHICAL THEORY] perspective. Does it represent an 'ethical' or an 'unethical' [SPECIFY ETHICAL THEORY] reply? Respond ONLY with 'ethical' or 'unethical'."},
    {"role": "user", "content": "[PROVIDE USER CONTENT]"},
    {"role": "assistant", "content": "The user reply is..."},
]

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 1000,
    "return_full_text": False,
    "temperature": 0.5,  # ignored when do_sample=False (greedy decoding)
    "do_sample": False,
}

# Run inference
output = pipe(messages, **generation_args)
print(output[0]["generated_text"])
```
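Since the system prompt constrains the reply to a single label, the raw generation can be normalized with a little post-processing. A minimal sketch, assuming the model follows the one-word instruction:

```python
# Normalize the generated text to one of the two labels. Note that
# "unethical" contains "ethical" as a substring, so it must be checked first.
label = output[0]["generated_text"].strip().lower()
print("unethical" if "unethical" in label else "ethical")
```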
## Training Details

### Training Data

The following subsets of the hendrycks/ethics dataset were used:

- commonsense/train (13.9k random samples)
- commonsense/validation (3.6k random samples)
- deontology/train (18.2k random samples)
- deontology/validation (2.8k random samples)
- justice/train (21k random samples)
- utilitarianism/train (21k random samples)
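For reference, subsets like these can be drawn with the `datasets` library. A minimal sketch for the commonsense/train split; the shuffle seed is an assumption, and the exact sampling used for training is not documented beyond the counts above:

```python
from datasets import load_dataset

# Illustrative only: draw 13.9k random samples from commonsense/train.
commonsense_train = (
    load_dataset("hendrycks/ethics", "commonsense", split="train")
    .shuffle(seed=42)  # seed is an assumption
    .select(range(13_900))
)
```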
### Training Procedure

#### Training Hyperparameters

- per_device_train_batch_size=64
- per_device_eval_batch_size=64
- gradient_accumulation_steps=2
- gradient_checkpointing=True
- warmup_steps=100
- num_train_epochs=1
- learning_rate=0.00005
- weight_decay=0.01
- optim="adamw_hf"
- fp16=True
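These names map directly onto `transformers.TrainingArguments`. A minimal sketch; the output directory and anything not listed above are assumptions:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above; output_dir is illustrative,
# not taken from the original run.
training_args = TrainingArguments(
    output_dir="phi3-mini-ethics-lora",
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    warmup_steps=100,
    num_train_epochs=1,
    learning_rate=5e-5,
    weight_decay=0.01,
    optim="adamw_hf",
    fp16=True,
)
```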
#### Speeds, Sizes, Times

The overall training took 5 hours and 24 minutes on the hardware listed below (roughly 32 GPU-hours across the six A100s). With 6 GPUs, a per-device batch size of 64, and 2 gradient accumulation steps, the effective global batch size was 768.
## Evaluation

- Training loss: 0.210800
- Validation loss: 0.234834
### Testing Data, Factors & Metrics

#### Testing Data

The following subsets of the hendrycks/ethics dataset were used:

- commonsense/test (2.5k random samples)
- deontology/test (2.5k random samples)
- justice/test (2.5k random samples)
- utilitarianism/test (2.5k random samples)
## Hardware

6× NVIDIA A100-SXM4-40GB