roberta-large for Extractive QA

This is the roberta-large model, fine-tuned on the SQuAD 2.0 dataset. It has been trained on question-answer pairs, including unanswerable questions, for the task of extractive question answering.

Overview

Language model: roberta-large
Language: English
Downstream-task: Extractive QA
Training data: SQuAD 2.0
Eval data: SQuAD 2.0
Code: See an example extractive QA pipeline built with Haystack
Infrastructure: 4x Tesla V100

Hyperparameters

base_LM_model = "roberta-large"

Using a distilled model instead

Note that we have also released a distilled version of this model, deepset/roberta-base-squad2-distilled. The distilled model has comparable prediction quality and runs at twice the speed of the large model.
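
The actual speed difference depends on hardware, batch size, and sequence length, so it may be worth measuring on your own setup. The following is a minimal sketch and not part of the original model card: `mean_latency` and `compare_models` are hypothetical helper names, and the question and context you pass in are your own test inputs.

```python
import time
from typing import Callable, Dict


def mean_latency(fn: Callable[[], object], repeats: int = 5, warmup: int = 1) -> float:
    """Return the mean wall-clock seconds per call of `fn` (hypothetical helper)."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


def compare_models(question: str, context: str) -> Dict[str, float]:
    """Time both checkpoints on one example. Downloads both models on first use."""
    from transformers import pipeline  # imported lazily: only needed for the real run

    timings = {}
    for name in ("deepset/roberta-large-squad2", "deepset/roberta-base-squad2-distilled"):
        qa = pipeline("question-answering", model=name, tokenizer=name)
        timings[name] = mean_latency(lambda: qa(question=question, context=context))
    return timings
```

Calling `compare_models(...)` with a question and a context of realistic length returns the mean seconds per prediction for each checkpoint. A single short example gives only a rough picture, so longer contexts and more repeats are advisable before drawing conclusions.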

Usage

In Haystack

Haystack is an AI orchestration framework to build customizable, production-ready LLM applications. You can use this model in Haystack to do extractive question answering on documents. To load and run the model with Haystack:

# After running pip install haystack-ai "transformers[torch,sentencepiece]"

from haystack import Document
from haystack.components.readers import ExtractiveReader

docs = [
    Document(content="Python is a popular programming language"),
    Document(content="python ist eine beliebte Programmiersprache"),
]

reader = ExtractiveReader(model="deepset/roberta-large-squad2")
reader.warm_up()

question = "What is a popular programming language?"
result = reader.run(query=question, documents=docs)
# {'answers': [ExtractedAnswer(query='What is a popular programming language?', score=0.5740374326705933, data='python', document=Document(id=..., content: '...'), context=None, document_offset=ExtractedAnswer.Span(start=0, end=6),...)]}

For a complete example with an extractive question answering pipeline that scales over many documents, check out the corresponding Haystack tutorial.

In Transformers

from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "deepset/roberta-large-squad2"

# a) Get predictions
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    'question': 'Why is model conversion important?',
    'context': 'The option to convert models between FARM and transformers gives freedom to the user and lets people easily switch between frameworks.'
}
res = nlp(QA_input)

# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
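
Because the model was trained on SQuAD 2.0, it can also signal that a question has no answer in the context. The sketch below shows one common convention for decoding the raw model outputs by hand: the score at token index 0 (the `<s>` token) is treated as the "no answer" score and compared against the best real span. This is an illustration under stated assumptions, not the decoding used by the pipeline above: `best_span` and `answer` are hypothetical helper names, the span search is not restricted to context tokens, and long contexts are simply truncated rather than split into windows.

```python
from typing import List, Optional, Tuple


def best_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,
) -> Tuple[Optional[Tuple[int, int]], float]:
    """Pick the highest-scoring (start, end) token span, or None for "no answer".

    Assumes index 0 is the <s> token, whose score stands for "unanswerable".
    """
    null_score = start_logits[0] + end_logits[0]
    best, best_score = None, null_score
    for start in range(1, len(start_logits)):
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = start_logits[start] + end_logits[end]
            if score > best_score:
                best, best_score = (start, end), score
    return best, best_score


def answer(question: str, context: str, model_name: str = "deepset/roberta-large-squad2") -> str:
    """Run the model by hand; returns "" when the null answer wins."""
    import torch
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForQuestionAnswering.from_pretrained(model_name)
    inputs = tokenizer(question, context, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    span, _ = best_span(outputs.start_logits[0].tolist(), outputs.end_logits[0].tolist())
    if span is None:
        return ""
    token_ids = inputs["input_ids"][0][span[0] : span[1] + 1]
    return tokenizer.decode(token_ids, skip_special_tokens=True).strip()
```

With the pipeline API, a similar effect is available by passing `handle_impossible_answer=True` when calling the pipeline, which allows it to return an empty answer.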

Authors

Branden Chan
Timo Möller
Malte Pietsch
Tanay Soni

About us

deepset is the company behind the production-ready open-source AI framework Haystack.

Some of our other work:

Get in touch and join the Haystack community

For more info on Haystack, visit our GitHub repo and Documentation.

We also have a Discord community open to everyone!

By the way: we're hiring!

Model size: 354M params (Safetensors; tensor types I64, F32)

Evaluation results

All results are self-reported.

Exact Match on squad_v2 (validation set): 85.168
F1 on squad_v2 (validation set): 88.349
Exact Match on squad (validation set): 87.162
F1 on squad (validation set): 93.603
Exact Match on adversarial_qa (validation set): 35.900
F1 on adversarial_qa (validation set): 48.923
Exact Match on squad_adversarial (validation set): 81.142
F1 on squad_adversarial (validation set): 87.099
Exact Match on squadshifts amazon (test set): 72.453
F1 on squadshifts amazon (test set): 86.325
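
For readers unfamiliar with these metrics, the following is a minimal sketch of how SQuAD-style Exact Match and token-level F1 are typically computed for a single prediction. It is an illustration, not the evaluation code used to produce the numbers above; the function names are hypothetical. Reported scores are usually these per-example values averaged over the dataset (taking the best match among the reference answers) and scaled to 0-100.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1(prediction: str, truth: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    true_tokens = normalize(truth).split()
    if not pred_tokens or not true_tokens:
        # Unanswerable case: credit only if both sides are empty.
        return float(pred_tokens == true_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

An empty string stands for "no answer" here, so a correctly predicted unanswerable question scores 1.0 on both metrics. This is why F1 is always at least as high as Exact Match in the results above: partial overlaps earn partial credit under F1 but nothing under Exact Match.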
