|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- pankajmathur/WizardLM_Orca |
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
--- |
|
Base model:

- `mistralai/Mistral-7B-v0.1` (checkpoint v1)
- `mistralai/Mistral-7B-v0.2` (checkpoint v2 and later)
|
|
|
## Reverse Instruct LoRA Adapter |
|
|
|
This LoRA adapter is fine-tuned to reverse-engineer the original prompt that produced a given LLM output/response.
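
As a rough usage sketch, the adapter can be loaded on top of its base model with Transformers and PEFT. The adapter id below is a placeholder for this repository's id (or a local path), and the dtype/device settings are illustrative rather than prescribed by this card.

```python
# Minimal loading sketch (placeholder adapter id, illustrative settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"             # base model for checkpoint v1
adapter_id = "<this-adapter-repo-or-local-path>"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()
```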
|
|
|
## Response Format

`"[INST]\n### System:\n{system}\n### Instruction:\n{instruction}\n[/INST]\n"`
|
|
|
|
|
## Prompt Template

`"\n### System:\nYou craft instructions for generating the given output through reverse engineering.\n### Instruction:\nDecipher the steps used to produce the given output and articulate a refined set of instructions (System & Instruction).\n### OUTPUT:\n {output}"`
|
|
|
(Use the templates without the surrounding quotation marks; `\n` denotes a literal newline, and `{output}` is replaced by the LLM response whose prompt you want to reconstruct.)
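
Putting the two templates together, a call might look roughly like the sketch below. It reuses the `tokenizer` and `model` from the loading sketch above; the example output text, the generation settings, and the small parsing helper are illustrative and not part of the released code.

```python
import re

# Prompt template from this card, written out as a Python string.
PROMPT_TEMPLATE = (
    "\n### System:\nYou craft instructions for generating the given output "
    "through reverse engineering.\n### Instruction:\nDecipher the steps used to "
    "produce the given output and articulate a refined set of instructions "
    "(System & Instruction).\n### OUTPUT:\n {output}"
)

llm_output = "Paris is the capital of France and is famous for the Eiffel Tower."
prompt = PROMPT_TEMPLATE.format(output=llm_output)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
generated = model.generate(**inputs, max_new_tokens=256, do_sample=False)
completion = tokenizer.decode(
    generated[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)

# Pull the reconstructed system prompt and instruction out of the response format.
match = re.search(
    r"\[INST\]\s*### System:\s*(?P<system>.*?)\s*"
    r"### Instruction:\s*(?P<instruction>.*?)\s*\[/INST\]",
    completion,
    flags=re.DOTALL,
)
if match:
    print("System:", match.group("system"))
    print("Instruction:", match.group("instruction"))
```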
|
|
|
## Training Dataset |
|
|
|
About 21k items from the following datasets were used; coding-style tasks were mostly removed.
|
|
|
```bash
wget https://raw.githubusercontent.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/main/data/alpaca_gpt4_data.json
wget https://raw.githubusercontent.com/teknium1/GPTeacher/main/Roleplay%20Supplemental/roleplay-instruct-v2.1.json
wget https://huggingface.co/datasets/pankajmathur/WizardLM_Orca/resolve/main/wizardlm_orca.json
```
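
The card does not include the preprocessing step, but building reverse-instruct pairs from these files presumably amounts to swapping roles: the original model output becomes the prompt, and the original system prompt plus instruction become the target. The sketch below is hypothetical; it assumes alpaca-style `instruction`/`input`/`output` fields, a generic system prompt, and a naive keyword filter standing in for the coding-task removal, and the resulting JSON would still have to be registered in LLaMA-Factory's dataset config.

```python
import json

PROMPT = (
    "\n### System:\nYou craft instructions for generating the given output "
    "through reverse engineering.\n### Instruction:\nDecipher the steps used to "
    "produce the given output and articulate a refined set of instructions "
    "(System & Instruction).\n### OUTPUT:\n {output}"
)
RESPONSE = "[INST]\n### System:\n{system}\n### Instruction:\n{instruction}\n[/INST]\n"

with open("alpaca_gpt4_data.json") as f:
    records = json.load(f)

pairs = []
for rec in records:
    # Naive stand-in for the "coding-style tasks removed" filtering mentioned above.
    if "code" in rec["instruction"].lower():
        continue
    pairs.append({
        "prompt": PROMPT.format(output=rec["output"]),
        "response": RESPONSE.format(
            system="You are a helpful assistant.",  # assumed; alpaca data has no system field
            instruction=(rec["instruction"] + "\n" + rec["input"]).strip(),
        ),
    })

with open("reverse_instruct.json", "w") as f:
    json.dump(pairs, f, ensure_ascii=False, indent=2)
```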
|
|
|
## Training Procedure |
|
|
|
```bash
CUDA_VISIBLE_DEVICES=0 WANDB_DISABLED=True python LLaMA-Factory/src/train_bash.py \
    --stage sft \
    --model_name_or_path model_name_or_path \
    --checkpoint_dir checkpoint_dir \
    --flash_attn \
    --shift_attn \
    --neftune_noise_alpha 5 \
    --do_train \
    --dataset default \
    --template vanilla \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --output_dir path_to_sft_checkpoint \
    --overwrite_cache \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 1 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 100 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --plot_loss \
    --fp16 \
    --overwrite_output_dir \
    --cutoff_len 2048 \
    --quantization_bit 4
```
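
After training, the adapter can be used as-is on top of the base model (as in the loading sketch above) or optionally folded into the dense weights for deployment. A merge sketch with PEFT, reusing the placeholder paths from the command above:

```python
# Optional: merge the trained LoRA weights into the base model (placeholder paths).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1", torch_dtype="auto")
model = PeftModel.from_pretrained(base, "path_to_sft_checkpoint")
merged = model.merge_and_unload()  # folds the LoRA deltas into the base weights
merged.save_pretrained("reverse-instruct-merged")
AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1").save_pretrained("reverse-instruct-merged")
```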
|
|
|
## Training Time |
|
|
|
- v1: ~12h on Kaggle's P100 GPU |
|
- v2: >30h on Kaggle's T4 x2 |
|
|
|
### Framework versions |
|
|
|
- LLaMA-Factory |