---
license: apache-2.0
---

A stance detection model distilled from a news dataset labeled by a larger model. The larger model was trained on a combination of stance datasets from the literature: FNC-1 (Pomerleau and Rao, 2017), Perspectrum (Chen et al., 2019), ARC (Habernal et al., 2017), Emergent (Ferreira and Vlachos, 2016), and NewsClaims (Reddy et al., 2021).

Achieves an F1 score of 0.571 on hand-labeled, in-domain news data.

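For readers who want to check an F1 score on their own labeled data, here is a minimal sketch of how a multi-class F1 can be computed. The card does not say which averaging was used for the reported number, so the macro average below is an assumption, and the function names `per_class_f1` and `macro_f1` are hypothetical helpers rather than part of this model.

```python
from typing import Dict, List


def per_class_f1(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Return the F1 score of each label seen in y_true or y_pred."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the label never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Toy example with made-up labels, not data from the evaluation set.
gold = ["__Affirm__", "__Refute__", "__Discuss__", "__Neutral__"]
pred = ["__Affirm__", "__Refute__", "__Neutral__", "__Neutral__"]
toy_score = macro_f1(gold, pred)
```
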
To run:

```
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name_or_path = 'alex2awesome/stance-detection-t5'
model = T5ForConditionalGeneration.from_pretrained(model_name_or_path)

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
# Second tokenizer, used only to look up the token ids of the class labels.
tokenizer_with_prefix_space = AutoTokenizer.from_pretrained(model_name_or_path, add_prefix_space=True)

def get_tokens_as_tuple(word):
    # sequence_bias expects each key to be a tuple of token ids.
    return tuple(tokenizer_with_prefix_space([word], add_special_tokens=False).input_ids[0])

text = "..."  # placeholder: replace with the text you want to classify
input_ids = tokenizer(text, return_tensors="pt").input_ids

y_pred_gen_output = model.generate(
    input_ids,
    renormalize_logits=True,
    # Per-class weights: a positive bias makes that label more likely to be generated.
    sequence_bias={
        get_tokens_as_tuple('__Affirm__'): 0.143841,
        get_tokens_as_tuple('__Discuss__'): -0.294732,
        get_tokens_as_tuple('__Neutral__'): -0.103820,
        get_tokens_as_tuple('__Refute__'): 0.637734,
    },
)

# Decode the generated token ids back into text.
print(tokenizer.batch_decode(y_pred_gen_output))
```

You can tweak the class weights yourself if you want.

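
One way to adjust the class weights without editing the dictionary by hand is sketched below. It assumes the label strings and bias values from the snippet above; `DEFAULT_CLASS_BIASES` and `build_sequence_bias` are hypothetical names introduced here for illustration, not part of the model or of `transformers`.

```python
from typing import Callable, Dict, Optional, Tuple

# Bias values copied from the snippet above.
DEFAULT_CLASS_BIASES: Dict[str, float] = {
    "__Affirm__": 0.143841,
    "__Discuss__": -0.294732,
    "__Neutral__": -0.103820,
    "__Refute__": 0.637734,
}


def build_sequence_bias(
    class_biases: Dict[str, float],
    to_token_tuple: Callable[[str], Tuple[int, ...]],
    overrides: Optional[Dict[str, float]] = None,
) -> Dict[Tuple[int, ...], float]:
    """Build a dict keyed by token-id tuples, replacing any overridden biases."""
    overrides = overrides or {}
    unknown = set(overrides) - set(class_biases)
    if unknown:
        # Guard against typos in label names.
        raise ValueError(f"Unknown labels: {sorted(unknown)}")
    merged = {**class_biases, **overrides}
    return {to_token_tuple(label): bias for label, bias in merged.items()}
```

With the tokenizer helper from the snippet above, this could be used as `sequence_bias=build_sequence_bias(DEFAULT_CLASS_BIASES, get_tokens_as_tuple, {"__Refute__": 1.0})` to push the model further toward `__Refute__`. Which values work best depends on your data, so treat any override as something to validate on a labeled sample.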