metadata
license: llama3
Llama 3 70B Instruct no refusal
This model applies orthogonal feature ablation, the technique described in this paper.
Calibration data:
- 256 prompts from jondurbin/airoboros-2.2
- 256 prompts from AdvBench
- The direction is extracted between layers 40 and 41
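Since the author's code is not posted yet, the following is only a rough sketch of the general idea, not the implementation used for this model. It assumes the direction is the normalized difference between the mean activations of the two calibration sets (here taken to be AdvBench prompts as the "harmful" set and airoboros prompts as the "harmless" set), and that ablation means projecting that direction out of a vector. The function names and the toy 3-dimensional activations are hypothetical, and plain Python lists are used instead of a tensor library to keep the example self-contained.

```python
import math

def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of the two activation means.

    Assumption: difference of means is used; the model card does not
    say how the direction is actually computed.
    """
    mu_harmful = mean_vector(harmful_acts)
    mu_harmless = mean_vector(harmless_acts)
    diff = [a - b for a, b in zip(mu_harmful, mu_harmless)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("activation means are identical; no direction to extract")
    return [d / norm for d in diff]

def ablate(vec, direction):
    """Remove the component of vec that lies along the unit vector direction."""
    coeff = sum(v * d for v, d in zip(vec, direction))
    return [v - coeff * d for v, d in zip(vec, direction)]

# Toy stand-ins for residual-stream activations (hypothetical values).
harmful = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
harmless = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

direction = refusal_direction(harmful, harmless)
cleaned = ablate([5.0, 2.0, -1.0], direction)
```

After ablation, the dot product of `cleaned` with `direction` is zero, which is what "orthogonal" refers to here. In a real model the same projection would be applied to activations or folded into the weights, which this sketch does not attempt.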
I haven't tested the model, but like the 8B model, it may still refuse some instructions. Use this model responsibly; I decline any liability resulting from its use.
I will post the code later.