roberta-large-condaqa-neg-tag-token-classification-v2
This model is a fine-tuned version of roberta-large, trained on negation annotations from CondaQA and cd-sco. It achieves the following results on the evaluation set:
- Loss: 0.0443
- Precision: 0.0
- Recall: 0.0
- F1: 0.0
- Accuracy: 0.9928
Model description
Negation detector: a fine-tuned roberta-large that tags negation words in sentences. Each token that belongs to a negation word gets the label "Y"; every other token gets the label "N".
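The labeling scheme above can be consumed as in the following minimal sketch. It assumes the model is loaded through the standard transformers token-classification pipeline with a fast tokenizer (so each prediction carries `entity`, `start`, and `end`), and that the label names in the model config are literally "Y" and "N". The helper names `build_tagger` and `extract_negation_cues` are hypothetical, and the Hub model id is left as a parameter because the card does not state the owner namespace.

```python
def build_tagger(model_id):
    """Create a token-classification pipeline for the given Hub model id.

    The import is done lazily so the post-processing helper below can be
    used (and tested) without transformers installed.
    """
    from transformers import pipeline
    return pipeline("token-classification", model=model_id)


def extract_negation_cues(text, predictions, cue_label="Y"):
    """Return the substrings of `text` that were tagged as negation cues.

    `predictions` is a list of dicts with at least the keys "entity",
    "start" and "end", as produced by the pipeline without aggregation.
    Subword pieces whose character spans touch are merged into one cue.
    """
    spans = []
    for pred in predictions:
        if pred["entity"] != cue_label:
            continue
        start, end = pred["start"], pred["end"]
        if spans and start <= spans[-1][1]:
            # Contiguous with the previous piece: extend that span.
            spans[-1][1] = max(spans[-1][1], end)
        else:
            spans.append([start, end])
    return [text[s:e] for s, e in spans]


# Hand-written predictions standing in for real pipeline output (assumed shape).
example_text = "She was unhappy and did not leave."
example_predictions = [
    {"entity": "N", "start": 0, "end": 3},
    {"entity": "N", "start": 4, "end": 7},
    {"entity": "Y", "start": 8, "end": 10},   # "un"
    {"entity": "Y", "start": 10, "end": 15},  # "happy"
    {"entity": "N", "start": 16, "end": 19},
    {"entity": "N", "start": 20, "end": 23},
    {"entity": "Y", "start": 24, "end": 27},  # "not"
    {"entity": "N", "start": 28, "end": 33},
]
cues = extract_negation_cues(example_text, example_predictions)
```

With a real model, `predictions = build_tagger(model_id)(text)` would replace the hand-written list. Multi-word cues separated by spaces come back as separate entries, because only touching character spans are merged.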
Intended uses & limitations
The training set is small (2,250 items) and does not cover every style of negation, so some kinds of negated sentences may receive the label "N" on every token.
Training and evaluation data
The model was trained on sentences and negation annotations from CondaQA and cd-sco. CondaQA is available on both GitHub and Hugging Face. GitHub sources: https://github.com/AbhilashaRavichander/CondaQA (CondaQA) and https://github.com/mosharafhossain/negation-cue (cd-sco data).
Common negation cues in CondaQA: ['halt', 'inhospitable', 'unhappy', 'unserviceable', 'dislike', 'unaware', 'unfavorable', 'barely', 'unseen', 'unoccupied', 'unreliability', 'insulator', 'stop', 'indistinguishable', 'unrestricted', 'unfairly', 'unsupervised', 'unicameral', 'forbid', 'unforgettable', 'reject', 'uneducated', 'unlimited', 'illegal', 'uncertainty', 'nonhuman', 'unborn', 'unshaven', 'uncanny', 'incomplete', 'unsure', 'unconscious', 'atypical', 'indirectly', 'unloaded', 'disadvantage', 'contrary', 'infrequent', 'unofficial', 'few', 'untouched', 'refuse', 'inequitable', 'disproportionate', 'unexpected', 'displeased', 'unpaved', 'unwieldy', 'not at all', 'absent', 'unnoticed', 'unpleasant', 'unsafe', 'unsigned', 'not', 'inaccurate', 'cannot', 'involuntary', 'unequipped', 'illiterate', 'cease', 'disagreeable', 'prohibit', 'unable', 'unstable', 'uninhabited', 'unclean', 'useless', 'disapprove', 'insensitive', 'in the absence of', 'impractical', 'unorthodox', 'untreated', 'unsuccessful', 'unwitting', 'unfashionable', 'disagreement', 'unmyelinated', 'unfortunate', 'unknown', 'ineffective', 'a lack of', 'instead of', 'refused', 'illegitimate', 'little', 'unpaid', 'fail', 'unintentionally', 'unglazed', "didn't", 'unprocessed', 'inability', 'undeveloped', 'exclude', 'neither', 'except', 'unequivocal', 'unconventional', 'incorrectly', 'unconditional', 'prevent', 'dissimilar', 'uncommon', 'inorganic', 'unquestionable', 'uncoated', 'unassisted', 'unprecedented', 'nonviolent', 'unarmed', 'unpopular', 'inadequate', 'uncomfortable', 'unwilling', 'unaffected', 'unfaithful', 'nobody', 'loss', 'without', 'undamaged', 'nothing', 'could not', 'impossible to', 'unaccompanied', 'unlike', 'oppose', 'compromising', 'unmarried', 'rarely', 'unlighted', 'inexperienced', 'rather than', 'unrelated', 'untied', 'dishonest', 'insecure', 'uneven', 'harmless', 'avoid', 'with the exception of', 'no', 'undefeated', 'no longer', 'inadvertently', 'absence', 'lack', 'unconnected', 'unfinished', 'invalid', 'unnecessary', 'invisibility', 'unusual', 'none', 'incredulous', 'impossible', 'never', 'untrained', 'incorrect', 'immobility', 'unclear', 'impartial', 'unlucky', 'deny', 'uncertain', 'hardly', 'unsaturated', 'informal', 'irregular', 'dissatisfaction']
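The card does not describe how the cue annotations were converted into token labels. As an illustration only, the sketch below shows one plausible way to turn a sentence and its annotated cue into word-level "Y"/"N" tags. It assumes whitespace tokenization, a single cue string per sentence, and case-insensitive matching with surrounding punctuation ignored; the function name `tag_sentence` is hypothetical and this is not the author's preprocessing code.

```python
PUNCTUATION = ".,;:!?\"'()"


def tag_sentence(sentence, cue):
    """Label each whitespace-separated word "Y" if it is part of the cue, else "N".

    Only the first occurrence of the cue is tagged. Multi-word cues such as
    "not at all" are matched as a consecutive word sequence.
    """
    words = sentence.split()
    labels = ["N"] * len(words)
    cue_words = cue.lower().split()
    if not cue_words:
        return list(zip(words, labels))

    # Normalize words for comparison: lowercase, strip edge punctuation.
    normalized = [w.lower().strip(PUNCTUATION) for w in words]
    n = len(cue_words)
    for i in range(len(words) - n + 1):
        if normalized[i:i + n] == cue_words:
            labels[i:i + n] = ["Y"] * n
            break
    return list(zip(words, labels))


tagged = tag_sentence("He was not at all pleased.", "not at all")
```

Word-level tags like these would still need to be aligned to RoBERTa subword tokens before training, for example by copying each word's label to all of its pieces.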
Training procedure
Training used the token-classification example code from the Hugging Face source repository.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10.0
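To make the schedule concrete, the following sketch works out the number of optimizer steps and the learning rate under a linear decay. It assumes that all 2,250 items mentioned under limitations were used for training, a single device, no gradient accumulation, and no warmup steps; none of these are stated in the card, so the numbers are illustrative only.

```python
import math

# Values taken from the hyperparameter list above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 16
NUM_EPOCHS = 10

# Assumption: the 2,250 items mentioned in the card are all training examples.
NUM_TRAIN_EXAMPLES = 2250

steps_per_epoch = math.ceil(NUM_TRAIN_EXAMPLES / TRAIN_BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step, base_lr=LEARNING_RATE, total=total_steps):
    """Learning rate at `step` for a linear decay to zero with no warmup."""
    remaining = max(0, total - step)
    return base_lr * remaining / total


lr_start = linear_lr(0)
lr_midpoint = linear_lr(total_steps // 2)
lr_end = linear_lr(total_steps)
```

Under these assumptions the run would take 141 steps per epoch and 1,410 steps in total, with the learning rate falling from 2e-05 to zero over that span.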
Training results
Framework versions
- Transformers 4.25.0.dev0
- Pytorch 1.10.1
- Datasets 2.6.1
- Tokenizers 0.13.1