metadata
license: mit
language: en
tags:
  - Pre-CoFactv3
  - Text-Classification
datasets:
  - FACTIFY5WQA
metrics:
  - accuracy
pipeline_tag: text-classification
library_name: transformers
base_model: microsoft/deberta-v3-large
widget:
  - text: >-
      BREAKING: Another nearly 1.9 million Americans filed for unemployment
      insurance last week, the Department of Labor said. https://t.co/dVwyI6avmx
      [SEP] By Anneken Tappe, CNN BusinessUpdated 11:50 AM ET, Thu June 4, 2020
      New York (CNN Business)Millions of Americans again filed for unemployment
      benefits last week, as the coronavirus recession drags on.
    example_title: Support
  - text: >-
      Micah Richards spent an entire season at Aston Vila without playing a
      single game. [SEP] Despite speculation that Richards would leave Aston
      Villa before the transfer deadline for the 2018~19 season , he remained at
      the club , although he is not being considered for first team selection.
    example_title: Neutral
  - text: >-
      Mahatma Gandhi having breakfast with British official inside the jail.
      [SEP] A photo is being shared on Facebook with a claim that Gandhi was
      having breakfast with British officials inside the jail while people are
      fighting for Independence.  Let’s try to check the authenticity of the
      image in the post. Claim: Mahatma Gandhi having breakfast with British
      official inside the jail. Fact:  The photo was not taken inside the jail. 
      It was taken during a breakfast meeting between Gandhi and Mountbatten at
      Viceroy’s House in April 1947.  Hence the claim made in the post is FALSE.
      When the image in the post is run Google Reverse Image Search, a link to
      Getty Images website containing the same image can be found in the search
      results.  In that website, the image has a description which reads,
      “Breakfast meeting between Mahatma Gandhi and Viceroy of India, Lord
      Mountbatten 1947”.  Also, in the book ‘India Remembered’ written by Pamela
      Mountbatten (the daughter of Lord Mountbatten), the same image can be
      found in the ‘A Huge Task’ chapter.  She writes that the photo was taken
      on 1st April 1947 at the Viceroy’s House.  The Viceroy invited Gandhi for
      breakfast to discuss the transfer of power, declared by England’s PM
      Clement R.  Atlee in February 1947.  So, the photo was not taken inside
      the jail. To sum it up, the photo was taken in April 1947 at the Viceroy’s
      house, not inside the jail. Did you watch our Facebook live on Fake News
      (Misinformation).
    example_title: Refute

Pre-CoFactv3-Text-Classification

Model description

This is a text classification model from the AAAI 2024 workshop paper “Team Trifecta at Factify5WQA: Setting the Standard in Fact Verification with Fine-Tuning”.

It takes a claim and its evidence as input, joined by [SEP] into a single string, and outputs a predicted label: Support, Neutral, or Refute.

The model is microsoft/deberta-v3-large fine-tuned on the FACTIFY5WQA dataset.

For more details, see our paper or GitHub repository.

How to use?

  1. Download the model with Hugging Face transformers.
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the fine-tuned classifier and its tokenizer from the Hub.
model = AutoModelForSequenceClassification.from_pretrained("AndyChiang/Pre-CoFactv3-Text-Classification")
tokenizer = AutoTokenizer.from_pretrained("AndyChiang/Pre-CoFactv3-Text-Classification")
  2. Create a pipeline.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
  3. Use the pipeline to predict the label.
# The input is the claim and the evidence joined by [SEP].
label = classifier("Micah Richards spent an entire season at Aston Vila without playing a single game. [SEP] Despite speculation that Richards would leave Aston Villa before the transfer deadline for the 2018~19 season , he remained at the club , although he is not being considered for first team selection.")
print(label)
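
The example above passes the claim and the evidence as one string joined by [SEP]. The helpers below are a minimal sketch of that step and of reading the pipeline result. The names `build_input`, `top_label`, and `classify` are hypothetical and not part of this repository, and reusing the 650-token training limit at inference time is an assumption.

```python
from typing import Callable

SEP = " [SEP] "  # separator used in the examples above


def build_input(claim: str, evidence: str) -> str:
    """Join a claim and its evidence into the single string passed to the model."""
    return f"{claim.strip()}{SEP}{evidence.strip()}"


def top_label(predictions: list) -> str:
    """Pick the highest-scoring label from text-classification pipeline output,
    which is a list of {"label": ..., "score": ...} dictionaries."""
    return max(predictions, key=lambda p: p["score"])["label"]


def classify(classifier: Callable, claim: str, evidence: str, max_length: int = 650) -> str:
    """Classify one claim/evidence pair with a text-classification pipeline."""
    # truncation and max_length are forwarded to the tokenizer. 650 mirrors the
    # input limit listed in the training hyperparameters (assumed to be a
    # sensible limit at inference time as well).
    predictions = classifier(
        build_input(claim, evidence), truncation=True, max_length=max_length
    )
    return top_label(predictions)
```

With the `classifier` created in step 2, `classify(classifier, claim, evidence)` returns a single label string. The exact label names depend on the model's configuration, so check the printed output of step 3 before relying on them.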

Dataset

We use the FACTIFY5WQA dataset provided by the AAAI-24 workshop Factify 3.0.

This dataset is designed for fact verification: the task is to determine the veracity of a claim based on the given evidence. Each sample contains the following fields:

  • claim: the statement to be verified.
  • evidence: the facts to verify the claim.
  • question: the questions generated from the claim by the 5W framework (who, what, when, where, and why).
  • claim_answer: the answers derived from the claim.
  • evidence_answer: the answers derived from the evidence.
  • label: the veracity of the claim based on the given evidence, which is one of three categories: Support, Neutral, or Refute.
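
| Label   | Training | Validation | Testing | Total |
|---------|----------|------------|---------|-------|
| Support | 3500     | 750        | 750     | 5000  |
| Neutral | 3500     | 750        | 750     | 5000  |
| Refute  | 3500     | 750        | 750     | 5000  |
| Total   | 10500    | 2250       | 2250    | 15000 |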
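
As a quick sanity check on the table above, the following sketch recomputes the per-split totals and the share of the data in each split. The counts are copied from the table. The function names are illustrative only.

```python
# Per-label sample counts, copied from the table above.
SPLITS = ("Training", "Validation", "Testing")
COUNTS = {
    "Support": (3500, 750, 750),
    "Neutral": (3500, 750, 750),
    "Refute": (3500, 750, 750),
}


def split_totals(counts):
    """Sum each split over all labels."""
    return {
        split: sum(row[i] for row in counts.values())
        for i, split in enumerate(SPLITS)
    }


def split_fractions(counts):
    """Share of all samples that falls into each split."""
    totals = split_totals(counts)
    grand_total = sum(totals.values())
    return {split: n / grand_total for split, n in totals.items()}


totals = split_totals(COUNTS)
fractions = split_fractions(COUNTS)
print(totals)
print(fractions)
```

The three labels are perfectly balanced within every split, and the splits follow a 70/15/15 ratio.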

Fine-tuning

Fine-tuning is done with the Hugging Face Trainer API on the text classification task.

Training hyperparameters

The following hyperparameters were used during training:

  • Pre-trained language model: microsoft/deberta-v3-large
  • Optimizer: Adam
  • Learning rate: 0.00001
  • Max input tokens: 650
  • Batch size: 4
  • Epochs: 12
  • Device: NVIDIA RTX A5000
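
The sketch below shows one way the listed values could map onto `transformers.TrainingArguments` parameter names, and how many optimizer steps the run implies. It is not the authors' training script. It assumes a single device, no gradient accumulation, and that the batch size is per device. `trainer_kwargs` and `total_optimizer_steps` are hypothetical helper names.

```python
import math

# Values copied from the hyperparameter list above.
HYPERPARAMETERS = {
    "base_model": "microsoft/deberta-v3-large",
    "learning_rate": 1e-5,
    "max_length": 650,   # applied when tokenizing, not a Trainer argument
    "batch_size": 4,
    "epochs": 12,
}
TRAIN_SAMPLES = 10500  # size of the training split in the dataset table


def trainer_kwargs(hp):
    """Map the listed values onto TrainingArguments parameter names."""
    return {
        "learning_rate": hp["learning_rate"],
        "per_device_train_batch_size": hp["batch_size"],
        "num_train_epochs": hp["epochs"],
    }


def total_optimizer_steps(n_samples, hp):
    """Steps over the full run, assuming one device and no gradient accumulation."""
    steps_per_epoch = math.ceil(n_samples / hp["batch_size"])
    return steps_per_epoch * hp["epochs"]


print(trainer_kwargs(HYPERPARAMETERS))
print(total_optimizer_steps(TRAIN_SAMPLES, HYPERPARAMETERS))
```

The dictionary can be unpacked into `TrainingArguments(output_dir="out", **trainer_kwargs(HYPERPARAMETERS))`, where `"out"` is a placeholder path. Under the stated assumptions, the run comes to 2625 steps per epoch.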

Testing

Accuracy is the evaluation metric for the text classification task.

| Accuracy |
|----------|
| 0.8502   |
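
For reference, accuracy is the fraction of predictions that match the gold labels. The sketch below is a plain-Python illustration with made-up toy labels, not the evaluation script used for the reported score. The last lines assume the score was computed on the 2250-sample testing split.

```python
def accuracy(predicted, gold):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Toy labels for illustration only (not taken from the dataset).
gold = ["Support", "Neutral", "Refute", "Refute"]
predicted = ["Support", "Neutral", "Refute", "Neutral"]
toy_accuracy = accuracy(predicted, gold)
print(toy_accuracy)

# Assuming the reported 0.8502 was measured on the 2250 testing samples,
# it corresponds to approximately this many correct predictions.
approx_correct = round(0.8502 * 2250)
print(approx_correct)
```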

Other models

AndyChiang/Pre-CoFactv3-Question-Answering

Citation