Cross-Encoder for Quora Duplicate Questions Detection
This model was trained using SentenceTransformers Cross-Encoder class.
This model uses roberta-large.
Training Data
This model was trained on the Quora Duplicate Questions dataset.
The model will predict a score between 0 and 1: How likely the two given questions are duplicates.
Note: The model is not suitable to estimate the similarity of questions, e.g. the two questions "How to learn Java" and "How to learn Python" will result in a rahter low score, as these are not duplicates.
Usage and Performance
The trained model can be used like this:
from sentence_transformers import CrossEncoder
model = CrossEncoder('model_name')
scores = model.predict([('Question 1', 'Question 2'), ('Question 3', 'Question 4')])
print(scores)
- Downloads last month
- 135
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.