Mechanistic Interpretability Benchmark

university

AI & ML interests

Principled evaluation of mechanistic interpretability methods.

Models

None public yet

Datasets

None public yet