---
base_model: BAAI/bge-small-en
datasets: []
language:
- en
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
- dot_accuracy@1
- dot_accuracy@3
- dot_accuracy@5
- dot_accuracy@10
- dot_precision@1
- dot_precision@3
- dot_precision@5
- dot_precision@10
- dot_recall@1
- dot_recall@3
- dot_recall@5
- dot_recall@10
- dot_ndcg@10
- dot_mrr@10
- dot_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:1010
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: How does Prompt-RAG differ from traditional vector embedding-based
methodologies?
sentences:
- Prompt-RAG differs from traditional vector embedding-based methodologies by adopting
a more direct and flexible retrieval process based on natural language prompts,
eliminating the need for a vector database or an algorithm for indexing and selecting
vectors.
- By introducing a pre-aligned phrase prior to the standard SFT stage, LLMs are
guided to concentrate on the aligned knowledge, thereby unlocking their internal
alignment abilities and improving their performance.
- The accuracy of GPT 3.5 on 2500 overall TeleQnA questions related to 3GPP documents
is 60.1, while the accuracy of GPT 3.5 + Telco-RAG is 6.9 points higher.
- source_sentence: Explain the concept of in-context learning as described in the
paper 'An explanation of in-context learning as implicit Bayesian inference'.
sentences:
- The main theme of the paper is that language models can learn to perform many
tasks in a zero-shot setting, without any explicit supervision.
- In-context learning, as explained in the paper, is a process where a language
model uses the context provided in the input to make predictions or generate outputs
without explicit training on the specific task. The paper argues that this process
can be understood as an implicit form of Bayesian inference.
- The paper was presented in the 55th Annual Meeting of the Association for Computational
Linguistics.
- source_sentence: What is the purpose of the survey conducted by Huang et al. (2023)?
sentences:
- The purpose of the survey conducted by Huang et al. (2023) is to provide a comprehensive
overview of hallucination in large language models, including its principles,
taxonomy, challenges, and open questions.
- The study of Human and American Translation Learning contributes to language development
by understanding the cognitive processes involved in translating between languages,
which can lead to improved teaching methods and translation technology.
- Using profile data, triplet examples are constructed in the format of (x_i, x_i^-,
  x_i^+). The anchor example x_i is constructed as the combination of the content
  and the corresponding label.
- source_sentence: Who is the first author of the paper and what is their last name?
sentences:
- The key findings are that Vul-RAG achieves the highest accuracy and pairwise accuracy
among all baselines, substantially outperforming the best baseline LLMAO. It also
achieves the best trade-off between recall and precision.
- The first author of the paper is Nandan Thakur. Their last name is Thakur.
- The paper was presented at the 2022 Conference on Empirical Methods in Natural
Language Processing (EMNLP).
- source_sentence: Compare the top-5 retrieval accuracy of BM25 + MQ and SERM + BF
for the NQ Dataset and HotpotQA.
sentences:
- For the NQ Dataset, SERM + BF has a top-5 retrieval accuracy of 88.22, which is
significantly higher than BM25 + MQ's accuracy of 25.19. For HotpotQA, SERM +
BF was not tested, but BM25 + MQ has a top-5 retrieval accuracy of 49.52.
- The paper was presented at the 17th Annual International ACM-SIGIR Conference
on Research and Development in Information Retrieval.
- The proof for Equation 5 progresses from Equation 20 to Equation 22 by applying
  the transformation motivated by Xie et al. [2021] and introducing the term p(R,
  x1:i-1|z) to the equation.
model-index:
- name: SentenceTransformer based on BAAI/bge-small-en
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: Unknown
type: unknown
metrics:
- type: cosine_accuracy@1
value: 0.01782178217821782
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.04356435643564356
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.06534653465346535
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.12475247524752475
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.01782178217821782
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.015841584158415842
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.016039603960396043
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.015841584158415842
name: Cosine Precision@10
- type: cosine_recall@1
value: 1.839902956558168e-05
name: Cosine Recall@1
- type: cosine_recall@3
value: 4.498766525563503e-05
name: Cosine Recall@3
- type: cosine_recall@5
value: 7.262670252004521e-05
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.00015079859335392304
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.016300874257683427
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.04234598459845988
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.0018766020656866668
name: Cosine Map@100
- type: dot_accuracy@1
value: 0.01782178217821782
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.04356435643564356
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.06534653465346535
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.12475247524752475
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.01782178217821782
name: Dot Precision@1
- type: dot_precision@3
value: 0.015841584158415842
name: Dot Precision@3
- type: dot_precision@5
value: 0.016039603960396043
name: Dot Precision@5
- type: dot_precision@10
value: 0.015841584158415842
name: Dot Precision@10
- type: dot_recall@1
value: 1.839902956558168e-05
name: Dot Recall@1
- type: dot_recall@3
value: 4.498766525563503e-05
name: Dot Recall@3
- type: dot_recall@5
value: 7.262670252004521e-05
name: Dot Recall@5
- type: dot_recall@10
value: 0.00015079859335392304
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.016300874257683427
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.04234598459845988
name: Dot Mrr@10
- type: dot_map@100
value: 0.0018766020656866668
name: Dot Map@100
- type: cosine_accuracy@1
value: 0.019801980198019802
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.040594059405940595
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.06534653465346535
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.12673267326732673
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.019801980198019802
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.01485148514851485
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.014851485148514853
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.016831683168316833
name: Cosine Precision@10
- type: cosine_recall@1
value: 1.9670857914229207e-05
name: Cosine Recall@1
- type: cosine_recall@3
value: 3.554268094376118e-05
name: Cosine Recall@3
- type: cosine_recall@5
value: 6.67664165823309e-05
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.0001670844654494185
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.01679069935920913
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.04252396668238257
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.002057887757857092
name: Cosine Map@100
- type: dot_accuracy@1
value: 0.019801980198019802
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.040594059405940595
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.06534653465346535
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.12673267326732673
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.019801980198019802
name: Dot Precision@1
- type: dot_precision@3
value: 0.01485148514851485
name: Dot Precision@3
- type: dot_precision@5
value: 0.014851485148514853
name: Dot Precision@5
- type: dot_precision@10
value: 0.016831683168316833
name: Dot Precision@10
- type: dot_recall@1
value: 1.9670857914229207e-05
name: Dot Recall@1
- type: dot_recall@3
value: 3.554268094376118e-05
name: Dot Recall@3
- type: dot_recall@5
value: 6.67664165823309e-05
name: Dot Recall@5
- type: dot_recall@10
value: 0.0001670844654494185
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.01679069935920913
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.04252396668238257
name: Dot Mrr@10
- type: dot_map@100
value: 0.002057887757857092
name: Dot Map@100
- type: cosine_accuracy@1
value: 0.01881188118811881
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.03762376237623762
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.06435643564356436
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.1306930693069307
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.01881188118811881
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.013861386138613862
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.015841584158415842
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.01722772277227723
name: Cosine Precision@10
- type: cosine_recall@1
value: 1.8836739119030395e-05
name: Cosine Recall@1
- type: cosine_recall@3
value: 3.852282962664283e-05
name: Cosine Recall@3
- type: cosine_recall@5
value: 7.907232140954174e-05
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.00018073758516299118
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.01704492626324548
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.04188786735816444
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.002251865468050825
name: Cosine Map@100
- type: dot_accuracy@1
value: 0.01881188118811881
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.03762376237623762
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.06435643564356436
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.1306930693069307
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.01881188118811881
name: Dot Precision@1
- type: dot_precision@3
value: 0.013861386138613862
name: Dot Precision@3
- type: dot_precision@5
value: 0.015841584158415842
name: Dot Precision@5
- type: dot_precision@10
value: 0.01722772277227723
name: Dot Precision@10
- type: dot_recall@1
value: 1.8836739119030395e-05
name: Dot Recall@1
- type: dot_recall@3
value: 3.852282962664283e-05
name: Dot Recall@3
- type: dot_recall@5
value: 7.907232140954174e-05
name: Dot Recall@5
- type: dot_recall@10
value: 0.00018073758516299118
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.01704492626324548
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.04188786735816444
name: Dot Mrr@10
- type: dot_map@100
value: 0.002251865468050825
name: Dot Map@100
- type: cosine_accuracy@1
value: 0.01881188118811881
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.03663366336633663
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.06435643564356436
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.1306930693069307
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.01881188118811881
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.013531353135313529
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.015643564356435644
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.01722772277227723
name: Cosine Precision@10
- type: cosine_recall@1
value: 1.8836739119030395e-05
name: Cosine Recall@1
- type: cosine_recall@3
value: 3.715905688573237e-05
name: Cosine Recall@3
- type: cosine_recall@5
value: 7.929088142504806e-05
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.0001757722267344924
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.01701867523723249
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.0418477919220494
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.0022453604762727357
name: Cosine Map@100
- type: dot_accuracy@1
value: 0.01881188118811881
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.03663366336633663
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.06435643564356436
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.1306930693069307
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.01881188118811881
name: Dot Precision@1
- type: dot_precision@3
value: 0.013531353135313529
name: Dot Precision@3
- type: dot_precision@5
value: 0.015643564356435644
name: Dot Precision@5
- type: dot_precision@10
value: 0.01722772277227723
name: Dot Precision@10
- type: dot_recall@1
value: 1.8836739119030395e-05
name: Dot Recall@1
- type: dot_recall@3
value: 3.715905688573237e-05
name: Dot Recall@3
- type: dot_recall@5
value: 7.929088142504806e-05
name: Dot Recall@5
- type: dot_recall@10
value: 0.0001757722267344924
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.01701867523723249
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.0418477919220494
name: Dot Mrr@10
- type: dot_map@100
value: 0.0022453604762727357
name: Dot Map@100
---
# SentenceTransformer based on BAAI/bge-small-en
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-small-en](https://huggingface.co/BAAI/bge-small-en). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-small-en](https://huggingface.co/BAAI/bge-small-en)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
```
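For illustration, here is a minimal sketch of how an equivalent module stack could be assembled by hand. The hosted checkpoint already bundles this configuration, so this is not needed for normal use:
```python
from sentence_transformers import SentenceTransformer, models

# Sketch only: rebuild the three-module stack shown above from its parts.
transformer = models.Transformer("BAAI/bge-small-en", max_seq_length=512, do_lower_case=True)
pooling = models.Pooling(
    transformer.get_word_embedding_dimension(),  # 384 for bge-small-en
    pooling_mode="cls",  # CLS-token pooling, matching the Pooling config above
)
model = SentenceTransformer(modules=[transformer, pooling, models.Normalize()])
```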
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Areeb-02/bge-small-en-MultiplrRankingLoss-30-Rag-paper-dataset")
# Run inference
sentences = [
    'Compare the top-5 retrieval accuracy of BM25 + MQ and SERM + BF for the NQ Dataset and HotpotQA.',
    "For the NQ Dataset, SERM + BF has a top-5 retrieval accuracy of 88.22, which is significantly higher than BM25 + MQ's accuracy of 25.19. For HotpotQA, SERM + BF was not tested, but BM25 + MQ has a top-5 retrieval accuracy of 49.52.",
    'The proof for Equation 5 progresses from Equation 20 to Equation 22 by applying the transformation motivated by Xie et al. [2021] and introducing the term p(R, x1:i-1|z) to the equation.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
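Because the model ends with a `Normalize()` module, every embedding is unit-length, so dot-product and cosine similarity produce identical scores. This is why the `dot_*` and `cosine_*` rows in the evaluation tables below mirror each other exactly.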
## Evaluation
### Metrics
#### Information Retrieval
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.0178 |
| cosine_accuracy@3 | 0.0436 |
| cosine_accuracy@5 | 0.0653 |
| cosine_accuracy@10 | 0.1248 |
| cosine_precision@1 | 0.0178 |
| cosine_precision@3 | 0.0158 |
| cosine_precision@5 | 0.016 |
| cosine_precision@10 | 0.0158 |
| cosine_recall@1 | 0.0 |
| cosine_recall@3 | 0.0 |
| cosine_recall@5 | 0.0001 |
| cosine_recall@10 | 0.0002 |
| cosine_ndcg@10 | 0.0163 |
| cosine_mrr@10 | 0.0423 |
| **cosine_map@100** | **0.0019** |
| dot_accuracy@1 | 0.0178 |
| dot_accuracy@3 | 0.0436 |
| dot_accuracy@5 | 0.0653 |
| dot_accuracy@10 | 0.1248 |
| dot_precision@1 | 0.0178 |
| dot_precision@3 | 0.0158 |
| dot_precision@5 | 0.016 |
| dot_precision@10 | 0.0158 |
| dot_recall@1 | 0.0 |
| dot_recall@3 | 0.0 |
| dot_recall@5 | 0.0001 |
| dot_recall@10 | 0.0002 |
| dot_ndcg@10 | 0.0163 |
| dot_mrr@10 | 0.0423 |
| dot_map@100 | 0.0019 |
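The four metric tables in this section appear to correspond to successive evaluation runs during training (consistent with `eval_strategy: steps` under Training Hyperparameters). As a rough guide, an evaluation of this kind could be reproduced with the sketch below; the `queries`, `corpus`, and `relevant_docs` mappings shown are hypothetical placeholders, not the actual held-out split:
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("Areeb-02/bge-small-en-MultiplrRankingLoss-30-Rag-paper-dataset")

# Hypothetical evaluation data; the ids and pairing are placeholders.
queries = {"q1": "How does Prompt-RAG differ from traditional vector embedding-based methodologies?"}
corpus = {"d1": "Prompt-RAG adopts a more direct and flexible retrieval process based on natural language prompts."}
relevant_docs = {"q1": {"d1"}}  # query id -> set of relevant corpus ids

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="rag-paper-dev")
results = evaluator(model)  # yields accuracy@k, precision@k, recall@k, NDCG, MRR, and MAP as tabulated above
```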
#### Information Retrieval
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.0198 |
| cosine_accuracy@3 | 0.0406 |
| cosine_accuracy@5 | 0.0653 |
| cosine_accuracy@10 | 0.1267 |
| cosine_precision@1 | 0.0198 |
| cosine_precision@3 | 0.0149 |
| cosine_precision@5 | 0.0149 |
| cosine_precision@10 | 0.0168 |
| cosine_recall@1 | 0.0 |
| cosine_recall@3 | 0.0 |
| cosine_recall@5 | 0.0001 |
| cosine_recall@10 | 0.0002 |
| cosine_ndcg@10 | 0.0168 |
| cosine_mrr@10 | 0.0425 |
| **cosine_map@100** | **0.0021** |
| dot_accuracy@1 | 0.0198 |
| dot_accuracy@3 | 0.0406 |
| dot_accuracy@5 | 0.0653 |
| dot_accuracy@10 | 0.1267 |
| dot_precision@1 | 0.0198 |
| dot_precision@3 | 0.0149 |
| dot_precision@5 | 0.0149 |
| dot_precision@10 | 0.0168 |
| dot_recall@1 | 0.0 |
| dot_recall@3 | 0.0 |
| dot_recall@5 | 0.0001 |
| dot_recall@10 | 0.0002 |
| dot_ndcg@10 | 0.0168 |
| dot_mrr@10 | 0.0425 |
| dot_map@100 | 0.0021 |
#### Information Retrieval
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.0188 |
| cosine_accuracy@3 | 0.0376 |
| cosine_accuracy@5 | 0.0644 |
| cosine_accuracy@10 | 0.1307 |
| cosine_precision@1 | 0.0188 |
| cosine_precision@3 | 0.0139 |
| cosine_precision@5 | 0.0158 |
| cosine_precision@10 | 0.0172 |
| cosine_recall@1 | 0.0 |
| cosine_recall@3 | 0.0 |
| cosine_recall@5 | 0.0001 |
| cosine_recall@10 | 0.0002 |
| cosine_ndcg@10 | 0.017 |
| cosine_mrr@10 | 0.0419 |
| **cosine_map@100** | **0.0023** |
| dot_accuracy@1 | 0.0188 |
| dot_accuracy@3 | 0.0376 |
| dot_accuracy@5 | 0.0644 |
| dot_accuracy@10 | 0.1307 |
| dot_precision@1 | 0.0188 |
| dot_precision@3 | 0.0139 |
| dot_precision@5 | 0.0158 |
| dot_precision@10 | 0.0172 |
| dot_recall@1 | 0.0 |
| dot_recall@3 | 0.0 |
| dot_recall@5 | 0.0001 |
| dot_recall@10 | 0.0002 |
| dot_ndcg@10 | 0.017 |
| dot_mrr@10 | 0.0419 |
| dot_map@100 | 0.0023 |
#### Information Retrieval
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.0188 |
| cosine_accuracy@3 | 0.0366 |
| cosine_accuracy@5 | 0.0644 |
| cosine_accuracy@10 | 0.1307 |
| cosine_precision@1 | 0.0188 |
| cosine_precision@3 | 0.0135 |
| cosine_precision@5 | 0.0156 |
| cosine_precision@10 | 0.0172 |
| cosine_recall@1 | 0.0 |
| cosine_recall@3 | 0.0 |
| cosine_recall@5 | 0.0001 |
| cosine_recall@10 | 0.0002 |
| cosine_ndcg@10 | 0.017 |
| cosine_mrr@10 | 0.0418 |
| **cosine_map@100** | **0.0022** |
| dot_accuracy@1 | 0.0188 |
| dot_accuracy@3 | 0.0366 |
| dot_accuracy@5 | 0.0644 |
| dot_accuracy@10 | 0.1307 |
| dot_precision@1 | 0.0188 |
| dot_precision@3 | 0.0135 |
| dot_precision@5 | 0.0156 |
| dot_precision@10 | 0.0172 |
| dot_recall@1 | 0.0 |
| dot_recall@3 | 0.0 |
| dot_recall@5 | 0.0001 |
| dot_recall@10 | 0.0002 |
| dot_ndcg@10 | 0.017 |
| dot_mrr@10 | 0.0418 |
| dot_map@100 | 0.0022 |
## Training Details
### Training Dataset
#### Unnamed Dataset
* Size: 1,010 training samples
* Columns: `anchor` and `positive`
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive |
  |:--------|:-------|:---------|
  | type    | string | string   |
* Samples:
  | anchor | positive |
  |:-------|:---------|
  | <code>What is the purpose of the MultiHop-RAG dataset and what does it consist of?</code> | <code>The MultiHop-RAG dataset is developed to benchmark Retrieval-Augmented Generation (RAG) for multi-hop queries. It consists of a knowledge base, a large collection of multi-hop queries, their ground-truth answers, and the associated supporting evidence. The dataset is built using an English news article dataset as the underlying RAG knowledge base.</code> |
  | <code>Among Google, Apple, and Nvidia, which company reported the largest profit margins in their third-quarter reports for the fiscal year 2023?</code> | <code>Apple reported the largest profit margins in their third-quarter reports for the fiscal year 2023.</code> |
  | <code>Under what circumstances should the LLM answer the questions?</code> | <code>The LLM should answer the questions based solely on the information provided in the paragraphs, and it should not use any other information.</code> |
* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
```json
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
```
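For reference, a sketch of how this loss could be instantiated with the parameters above; with in-batch negatives, every other positive in a batch serves as a negative for a given anchor:
```python
from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("BAAI/bge-small-en")
# scale=20.0 and cosine similarity match the parameters listed above.
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=util.cos_sim)
```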
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `num_train_epochs`: 10
- `warmup_ratio`: 0.1
- `fp16`: True
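A sketch of how these non-default values might be passed to `SentenceTransformerTrainingArguments` (the output path is a placeholder):
```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="outputs",  # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=10,
    warmup_ratio=0.1,
    fp16=True,
)
```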
#### All Hyperparameters