--- base_model: BAAI/bge-small-en datasets: [] language: [] library_name: sentence-transformers metrics: - cosine_accuracy@1 - cosine_accuracy@3 - cosine_accuracy@5 - cosine_accuracy@10 - cosine_precision@1 - cosine_precision@3 - cosine_precision@5 - cosine_precision@10 - cosine_recall@1 - cosine_recall@3 - cosine_recall@5 - cosine_recall@10 - cosine_ndcg@10 - cosine_mrr@10 - cosine_map@100 - dot_accuracy@1 - dot_accuracy@3 - dot_accuracy@5 - dot_accuracy@10 - dot_precision@1 - dot_precision@3 - dot_precision@5 - dot_precision@10 - dot_recall@1 - dot_recall@3 - dot_recall@5 - dot_recall@10 - dot_ndcg@10 - dot_mrr@10 - dot_map@100 pipeline_tag: sentence-similarity tags: - sentence-transformers - sentence-similarity - feature-extraction - generated_from_trainer - dataset_size:1010 - loss:MultipleNegativesRankingLoss widget: - source_sentence: How does Prompt-RAG differ from traditional vector embedding-based methodologies? sentences: - Prompt-RAG differs from traditional vector embedding-based methodologies by adopting a more direct and flexible retrieval process based on natural language prompts, eliminating the need for a vector database or an algorithm for indexing and selecting vectors. - By introducing a pre-aligned phrase prior to the standard SFT stage, LLMs are guided to concentrate on the aligned knowledge, thereby unlocking their internal alignment abilities and improving their performance. - The accuracy of GPT 3.5 on 2500 overall TeleQnA questions related to 3GPP documents is 60.1, while the accuracy of GPT 3.5 + Telco-RAG is 6.9 points higher. - source_sentence: Explain the concept of in-context learning as described in the paper 'An explanation of in-context learning as implicit Bayesian inference'. sentences: - The main theme of the paper is that language models can learn to perform many tasks in a zero-shot setting, without any explicit supervision. 
- In-context learning, as explained in the paper, is a process where a language model uses the context provided in the input to make predictions or generate outputs without explicit training on the specific task. The paper argues that this process can be understood as an implicit form of Bayesian inference. - The paper was presented in the 55th Annual Meeting of the Association for Computational Linguistics. - source_sentence: What is the purpose of the survey conducted by Huang et al. (2023)? sentences: - The purpose of the survey conducted by Huang et al. (2023) is to provide a comprehensive overview of hallucination in large language models, including its principles, taxonomy, challenges, and open questions. - The study of Human and American Translation Learning contributes to language development by understanding the cognitive processes involved in translating between languages, which can lead to improved teaching methods and translation technology. - Using profile data, triplet examples are constructed in the format of (x_i, x_i^-, x_i^+). The anchor example x_i is constructed as the combination of the content c_i and the corresponding label l_i. - source_sentence: Who is the first author of the paper and what is their last name? sentences: - The key findings are that Vul-RAG achieves the highest accuracy and pairwise accuracy among all baselines, substantially outperforming the best baseline LLMAO. It also achieves the best trade-off between recall and precision. - The first author of the paper is Nandan Thakur. Their last name is Thakur. - The paper was presented at the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP). - source_sentence: Compare the top-5 retrieval accuracy of BM25 + MQ and SERM + BF for the NQ Dataset and HotpotQA. sentences: - For the NQ Dataset, SERM + BF has a top-5 retrieval accuracy of 88.22, which is significantly higher than BM25 + MQ's accuracy of 25.19.
For HotpotQA, SERM + BF was not tested, but BM25 + MQ has a top-5 retrieval accuracy of 49.52. - The paper was presented at the 17th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. - The proof for Equation 5 progresses from Equation 20 to Equation 22 by applying the transformation motivated by Xie et al. [2021] and introducing the term p(R, x1:i-1|z) to the equation. model-index: - name: SentenceTransformer based on BAAI/bge-small-en results: - task: type: information-retrieval name: Information Retrieval dataset: name: Unknown type: unknown metrics: - type: cosine_accuracy@1 value: 0.01782178217821782 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.04356435643564356 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.06534653465346535 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.12475247524752475 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.01782178217821782 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.015841584158415842 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.016039603960396043 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.015841584158415842 name: Cosine Precision@10 - type: cosine_recall@1 value: 1.839902956558168e-05 name: Cosine Recall@1 - type: cosine_recall@3 value: 4.498766525563503e-05 name: Cosine Recall@3 - type: cosine_recall@5 value: 7.262670252004521e-05 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.00015079859335392304 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.016300874257683427 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.04234598459845988 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.0018766020656866668 name: Cosine Map@100 - type: dot_accuracy@1 value: 0.01782178217821782 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.04356435643564356 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.06534653465346535 name: Dot Accuracy@5 - type: dot_accuracy@10 value:
0.12475247524752475 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.01782178217821782 name: Dot Precision@1 - type: dot_precision@3 value: 0.015841584158415842 name: Dot Precision@3 - type: dot_precision@5 value: 0.016039603960396043 name: Dot Precision@5 - type: dot_precision@10 value: 0.015841584158415842 name: Dot Precision@10 - type: dot_recall@1 value: 1.839902956558168e-05 name: Dot Recall@1 - type: dot_recall@3 value: 4.498766525563503e-05 name: Dot Recall@3 - type: dot_recall@5 value: 7.262670252004521e-05 name: Dot Recall@5 - type: dot_recall@10 value: 0.00015079859335392304 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.016300874257683427 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.04234598459845988 name: Dot Mrr@10 - type: dot_map@100 value: 0.0018766020656866668 name: Dot Map@100 - type: cosine_accuracy@1 value: 0.019801980198019802 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.040594059405940595 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.06534653465346535 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.12673267326732673 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.019801980198019802 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.01485148514851485 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.014851485148514853 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.016831683168316833 name: Cosine Precision@10 - type: cosine_recall@1 value: 1.9670857914229207e-05 name: Cosine Recall@1 - type: cosine_recall@3 value: 3.554268094376118e-05 name: Cosine Recall@3 - type: cosine_recall@5 value: 6.67664165823309e-05 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.0001670844654494185 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.01679069935920913 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.04252396668238257 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.002057887757857092 name: Cosine Map@100 - type: dot_accuracy@1 value: 
0.019801980198019802 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.040594059405940595 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.06534653465346535 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.12673267326732673 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.019801980198019802 name: Dot Precision@1 - type: dot_precision@3 value: 0.01485148514851485 name: Dot Precision@3 - type: dot_precision@5 value: 0.014851485148514853 name: Dot Precision@5 - type: dot_precision@10 value: 0.016831683168316833 name: Dot Precision@10 - type: dot_recall@1 value: 1.9670857914229207e-05 name: Dot Recall@1 - type: dot_recall@3 value: 3.554268094376118e-05 name: Dot Recall@3 - type: dot_recall@5 value: 6.67664165823309e-05 name: Dot Recall@5 - type: dot_recall@10 value: 0.0001670844654494185 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.01679069935920913 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.04252396668238257 name: Dot Mrr@10 - type: dot_map@100 value: 0.002057887757857092 name: Dot Map@100 - type: cosine_accuracy@1 value: 0.01881188118811881 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.03762376237623762 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.06435643564356436 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.1306930693069307 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.01881188118811881 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.013861386138613862 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.015841584158415842 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.01722772277227723 name: Cosine Precision@10 - type: cosine_recall@1 value: 1.8836739119030395e-05 name: Cosine Recall@1 - type: cosine_recall@3 value: 3.852282962664283e-05 name: Cosine Recall@3 - type: cosine_recall@5 value: 7.907232140954174e-05 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.00018073758516299118 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 
0.01704492626324548 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.04188786735816444 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.002251865468050825 name: Cosine Map@100 - type: dot_accuracy@1 value: 0.01881188118811881 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.03762376237623762 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.06435643564356436 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.1306930693069307 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.01881188118811881 name: Dot Precision@1 - type: dot_precision@3 value: 0.013861386138613862 name: Dot Precision@3 - type: dot_precision@5 value: 0.015841584158415842 name: Dot Precision@5 - type: dot_precision@10 value: 0.01722772277227723 name: Dot Precision@10 - type: dot_recall@1 value: 1.8836739119030395e-05 name: Dot Recall@1 - type: dot_recall@3 value: 3.852282962664283e-05 name: Dot Recall@3 - type: dot_recall@5 value: 7.907232140954174e-05 name: Dot Recall@5 - type: dot_recall@10 value: 0.00018073758516299118 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.01704492626324548 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.04188786735816444 name: Dot Mrr@10 - type: dot_map@100 value: 0.002251865468050825 name: Dot Map@100 - type: cosine_accuracy@1 value: 0.01881188118811881 name: Cosine Accuracy@1 - type: cosine_accuracy@3 value: 0.03663366336633663 name: Cosine Accuracy@3 - type: cosine_accuracy@5 value: 0.06435643564356436 name: Cosine Accuracy@5 - type: cosine_accuracy@10 value: 0.1306930693069307 name: Cosine Accuracy@10 - type: cosine_precision@1 value: 0.01881188118811881 name: Cosine Precision@1 - type: cosine_precision@3 value: 0.013531353135313529 name: Cosine Precision@3 - type: cosine_precision@5 value: 0.015643564356435644 name: Cosine Precision@5 - type: cosine_precision@10 value: 0.01722772277227723 name: Cosine Precision@10 - type: cosine_recall@1 value: 1.8836739119030395e-05 name: Cosine Recall@1 - type: cosine_recall@3 value: 3.715905688573237e-05 
name: Cosine Recall@3 - type: cosine_recall@5 value: 7.929088142504806e-05 name: Cosine Recall@5 - type: cosine_recall@10 value: 0.0001757722267344924 name: Cosine Recall@10 - type: cosine_ndcg@10 value: 0.01701867523723249 name: Cosine Ndcg@10 - type: cosine_mrr@10 value: 0.0418477919220494 name: Cosine Mrr@10 - type: cosine_map@100 value: 0.0022453604762727357 name: Cosine Map@100 - type: dot_accuracy@1 value: 0.01881188118811881 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.03663366336633663 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.06435643564356436 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.1306930693069307 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.01881188118811881 name: Dot Precision@1 - type: dot_precision@3 value: 0.013531353135313529 name: Dot Precision@3 - type: dot_precision@5 value: 0.015643564356435644 name: Dot Precision@5 - type: dot_precision@10 value: 0.01722772277227723 name: Dot Precision@10 - type: dot_recall@1 value: 1.8836739119030395e-05 name: Dot Recall@1 - type: dot_recall@3 value: 3.715905688573237e-05 name: Dot Recall@3 - type: dot_recall@5 value: 7.929088142504806e-05 name: Dot Recall@5 - type: dot_recall@10 value: 0.0001757722267344924 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.01701867523723249 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.0418477919220494 name: Dot Mrr@10 - type: dot_map@100 value: 0.0022453604762727357 name: Dot Map@100 --- # SentenceTransformer based on BAAI/bge-small-en This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-small-en](https://huggingface.co/BAAI/bge-small-en). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. 
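For the semantic-search use case mentioned above, retrieval amounts to ranking corpus embeddings by cosine similarity against a query embedding. A minimal sketch with NumPy (random vectors stand in for this model's 384-dimensional embeddings so the snippet is self-contained; with real text you would obtain them via `model.encode`, and `sentence_transformers.util.semantic_search` performs the same ranking at scale):

```python
import numpy as np

def top_k_cosine(query_emb, corpus_embs, k=3):
    """Return the indices and cosine scores of the k best-matching corpus rows."""
    q = query_emb / np.linalg.norm(query_emb)
    c = corpus_embs / np.linalg.norm(corpus_embs, axis=1, keepdims=True)
    scores = c @ q                   # cosine similarity of each corpus row vs. the query
    top = np.argsort(-scores)[:k]    # best first
    return [(int(i), float(scores[i])) for i in top]

rng = np.random.default_rng(0)
corpus = rng.normal(size=(5, 384))               # stand-ins for document embeddings
query = corpus[2] + 0.01 * rng.normal(size=384)  # near-duplicate of document 2
hits = top_k_cosine(query, corpus)
print(hits[0][0])  # document 2 ranks first
```

Because this model ends with a `Normalize()` module, its embeddings are unit-length, which is why cosine and dot-product metrics in the tables below coincide.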
## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-small-en](https://huggingface.co/BAAI/bge-small-en)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Areeb-02/bge-small-en-MultiplrRankingLoss-30-Rag-paper-dataset")
# Run inference
sentences = [
    'Compare the top-5 retrieval accuracy of BM25 + MQ and SERM + BF for the NQ Dataset and HotpotQA.',
    "For the NQ Dataset, SERM + BF has a top-5 retrieval accuracy of 88.22, which is significantly higher than BM25 + MQ's accuracy of 25.19. For HotpotQA, SERM + BF was not tested, but BM25 + MQ has a top-5 retrieval accuracy of 49.52.",
    'The proof for Equation 5 progresses from Equation 20 to Equation 22 by applying the transformation motivated by Xie et al.
[2021] and introducing the term p(R, x1:i-1|z) to the equation.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Evaluation

### Metrics

#### Information Retrieval

* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0178     |
| cosine_accuracy@3   | 0.0436     |
| cosine_accuracy@5   | 0.0653     |
| cosine_accuracy@10  | 0.1248     |
| cosine_precision@1  | 0.0178     |
| cosine_precision@3  | 0.0158     |
| cosine_precision@5  | 0.016      |
| cosine_precision@10 | 0.0158     |
| cosine_recall@1     | 0.0        |
| cosine_recall@3     | 0.0        |
| cosine_recall@5     | 0.0001     |
| cosine_recall@10    | 0.0002     |
| cosine_ndcg@10      | 0.0163     |
| cosine_mrr@10       | 0.0423     |
| **cosine_map@100**  | **0.0019** |
| dot_accuracy@1      | 0.0178     |
| dot_accuracy@3      | 0.0436     |
| dot_accuracy@5      | 0.0653     |
| dot_accuracy@10     | 0.1248     |
| dot_precision@1     | 0.0178     |
| dot_precision@3     | 0.0158     |
| dot_precision@5     | 0.016      |
| dot_precision@10    | 0.0158     |
| dot_recall@1        | 0.0        |
| dot_recall@3        | 0.0        |
| dot_recall@5        | 0.0001     |
| dot_recall@10       | 0.0002     |
| dot_ndcg@10         | 0.0163     |
| dot_mrr@10          | 0.0423     |
| dot_map@100         | 0.0019     |

#### Information Retrieval

* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0198     |
| cosine_accuracy@3   | 0.0406     |
| cosine_accuracy@5   | 0.0653     |
| cosine_accuracy@10  | 0.1267     |
| cosine_precision@1  | 0.0198     |
| cosine_precision@3  | 0.0149     |
| cosine_precision@5  | 0.0149     |
| cosine_precision@10 | 0.0168     |
| cosine_recall@1     | 0.0        |
|
cosine_recall@3 | 0.0 | | cosine_recall@5 | 0.0001 | | cosine_recall@10 | 0.0002 | | cosine_ndcg@10 | 0.0168 | | cosine_mrr@10 | 0.0425 | | **cosine_map@100** | **0.0021** | | dot_accuracy@1 | 0.0198 | | dot_accuracy@3 | 0.0406 | | dot_accuracy@5 | 0.0653 | | dot_accuracy@10 | 0.1267 | | dot_precision@1 | 0.0198 | | dot_precision@3 | 0.0149 | | dot_precision@5 | 0.0149 | | dot_precision@10 | 0.0168 | | dot_recall@1 | 0.0 | | dot_recall@3 | 0.0 | | dot_recall@5 | 0.0001 | | dot_recall@10 | 0.0002 | | dot_ndcg@10 | 0.0168 | | dot_mrr@10 | 0.0425 | | dot_map@100 | 0.0021 | #### Information Retrieval * Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) | Metric | Value | |:--------------------|:-----------| | cosine_accuracy@1 | 0.0188 | | cosine_accuracy@3 | 0.0376 | | cosine_accuracy@5 | 0.0644 | | cosine_accuracy@10 | 0.1307 | | cosine_precision@1 | 0.0188 | | cosine_precision@3 | 0.0139 | | cosine_precision@5 | 0.0158 | | cosine_precision@10 | 0.0172 | | cosine_recall@1 | 0.0 | | cosine_recall@3 | 0.0 | | cosine_recall@5 | 0.0001 | | cosine_recall@10 | 0.0002 | | cosine_ndcg@10 | 0.017 | | cosine_mrr@10 | 0.0419 | | **cosine_map@100** | **0.0023** | | dot_accuracy@1 | 0.0188 | | dot_accuracy@3 | 0.0376 | | dot_accuracy@5 | 0.0644 | | dot_accuracy@10 | 0.1307 | | dot_precision@1 | 0.0188 | | dot_precision@3 | 0.0139 | | dot_precision@5 | 0.0158 | | dot_precision@10 | 0.0172 | | dot_recall@1 | 0.0 | | dot_recall@3 | 0.0 | | dot_recall@5 | 0.0001 | | dot_recall@10 | 0.0002 | | dot_ndcg@10 | 0.017 | | dot_mrr@10 | 0.0419 | | dot_map@100 | 0.0023 | #### Information Retrieval * Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) | Metric | Value | |:--------------------|:-----------| | 
cosine_accuracy@1 | 0.0188 | | cosine_accuracy@3 | 0.0366 | | cosine_accuracy@5 | 0.0644 | | cosine_accuracy@10 | 0.1307 | | cosine_precision@1 | 0.0188 | | cosine_precision@3 | 0.0135 | | cosine_precision@5 | 0.0156 | | cosine_precision@10 | 0.0172 | | cosine_recall@1 | 0.0 | | cosine_recall@3 | 0.0 | | cosine_recall@5 | 0.0001 | | cosine_recall@10 | 0.0002 | | cosine_ndcg@10 | 0.017 | | cosine_mrr@10 | 0.0418 | | **cosine_map@100** | **0.0022** | | dot_accuracy@1 | 0.0188 | | dot_accuracy@3 | 0.0366 | | dot_accuracy@5 | 0.0644 | | dot_accuracy@10 | 0.1307 | | dot_precision@1 | 0.0188 | | dot_precision@3 | 0.0135 | | dot_precision@5 | 0.0156 | | dot_precision@10 | 0.0172 | | dot_recall@1 | 0.0 | | dot_recall@3 | 0.0 | | dot_recall@5 | 0.0001 | | dot_recall@10 | 0.0002 | | dot_ndcg@10 | 0.017 | | dot_mrr@10 | 0.0418 | | dot_map@100 | 0.0022 | ## Training Details ### Training Dataset #### Unnamed Dataset * Size: 1,010 training samples * Columns: anchor and positive * Approximate statistics based on the first 1000 samples: | | anchor | positive | |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------| | type | string | string | | details | | | * Samples: | anchor | positive | |:---------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | What is the purpose of the MultiHop-RAG dataset and what does it consist of? 
| The MultiHop-RAG dataset is developed to benchmark Retrieval-Augmented Generation (RAG) for multi-hop queries. It consists of a knowledge base, a large collection of multi-hop queries, their ground-truth answers, and the associated supporting evidence. The dataset is built using an English news article dataset as the underlying RAG knowledge base. | | Among Google, Apple, and Nvidia, which company reported the largest profit margins in their third-quarter reports for the fiscal year 2023? | Apple reported the largest profit margins in their third-quarter reports for the fiscal year 2023. | | Under what circumstances should the LLM answer the questions? | The LLM should answer the questions based solely on the information provided in the paragraphs, and it should not use any other information. | * Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters: ```json { "scale": 20.0, "similarity_fct": "cos_sim" } ``` ### Training Hyperparameters #### Non-Default Hyperparameters - `eval_strategy`: steps - `per_device_train_batch_size`: 16 - `per_device_eval_batch_size`: 16 - `num_train_epochs`: 10 - `warmup_ratio`: 0.1 - `fp16`: True #### All Hyperparameters
Click to expand - `overwrite_output_dir`: False - `do_predict`: False - `eval_strategy`: steps - `prediction_loss_only`: True - `per_device_train_batch_size`: 16 - `per_device_eval_batch_size`: 16 - `per_gpu_train_batch_size`: None - `per_gpu_eval_batch_size`: None - `gradient_accumulation_steps`: 1 - `eval_accumulation_steps`: None - `learning_rate`: 5e-05 - `weight_decay`: 0.0 - `adam_beta1`: 0.9 - `adam_beta2`: 0.999 - `adam_epsilon`: 1e-08 - `max_grad_norm`: 1.0 - `num_train_epochs`: 10 - `max_steps`: -1 - `lr_scheduler_type`: linear - `lr_scheduler_kwargs`: {} - `warmup_ratio`: 0.1 - `warmup_steps`: 0 - `log_level`: passive - `log_level_replica`: warning - `log_on_each_node`: True - `logging_nan_inf_filter`: True - `save_safetensors`: True - `save_on_each_node`: False - `save_only_model`: False - `restore_callback_states_from_checkpoint`: False - `no_cuda`: False - `use_cpu`: False - `use_mps_device`: False - `seed`: 42 - `data_seed`: None - `jit_mode_eval`: False - `use_ipex`: False - `bf16`: False - `fp16`: True - `fp16_opt_level`: O1 - `half_precision_backend`: auto - `bf16_full_eval`: False - `fp16_full_eval`: False - `tf32`: None - `local_rank`: 0 - `ddp_backend`: None - `tpu_num_cores`: None - `tpu_metrics_debug`: False - `debug`: [] - `dataloader_drop_last`: False - `dataloader_num_workers`: 0 - `dataloader_prefetch_factor`: None - `past_index`: -1 - `disable_tqdm`: False - `remove_unused_columns`: True - `label_names`: None - `load_best_model_at_end`: False - `ignore_data_skip`: False - `fsdp`: [] - `fsdp_min_num_params`: 0 - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False} - `fsdp_transformer_layer_cls_to_wrap`: None - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None} - `deepspeed`: None - `label_smoothing_factor`: 0.0 - `optim`: adamw_torch - `optim_args`: None - 
`adafactor`: False - `group_by_length`: False - `length_column_name`: length - `ddp_find_unused_parameters`: None - `ddp_bucket_cap_mb`: None - `ddp_broadcast_buffers`: False - `dataloader_pin_memory`: True - `dataloader_persistent_workers`: False - `skip_memory_metrics`: True - `use_legacy_prediction_loop`: False - `push_to_hub`: False - `resume_from_checkpoint`: None - `hub_model_id`: None - `hub_strategy`: every_save - `hub_private_repo`: False - `hub_always_push`: False - `gradient_checkpointing`: False - `gradient_checkpointing_kwargs`: None - `include_inputs_for_metrics`: False - `eval_do_concat_batches`: True - `fp16_backend`: auto - `push_to_hub_model_id`: None - `push_to_hub_organization`: None - `mp_parameters`: - `auto_find_batch_size`: False - `full_determinism`: False - `torchdynamo`: None - `ray_scope`: last - `ddp_timeout`: 1800 - `torch_compile`: False - `torch_compile_backend`: None - `torch_compile_mode`: None - `dispatch_batches`: None - `split_batches`: None - `include_tokens_per_second`: False - `include_num_input_tokens_seen`: False - `neftune_noise_alpha`: None - `optim_target_modules`: None - `batch_eval_metrics`: False - `eval_on_start`: False - `batch_sampler`: batch_sampler - `multi_dataset_batch_sampler`: proportional
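The MultipleNegativesRankingLoss used for training (scale 20.0, cosine similarity) treats, for each anchor in a batch, its paired positive as the target and every other in-batch positive as a negative: the scaled cosine-similarity matrix is scored with cross-entropy against its diagonal. A minimal NumPy sketch of the objective (illustrative only, not the library's implementation):

```python
import numpy as np

def mnrl_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities; row i's label is column i."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                    # (batch, batch) similarity matrix
    # log-softmax cross-entropy, matching positive on the diagonal
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Perfectly matched pairs drive the loss toward zero;
# mismatched pairs are heavily penalized.
batch = np.eye(4)                                 # 4 orthogonal "embeddings"
print(mnrl_loss(batch, batch))                    # near zero
print(mnrl_loss(batch, np.roll(batch, 1, axis=0)))  # large
```

This is why the loss benefits from larger batch sizes: each anchor sees `batch_size - 1` negatives for free, with no explicit negative mining.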
### Training Logs
| Epoch  | Step | Training Loss | cosine_map@100 |
|:------:|:----:|:-------------:|:--------------:|
| 0      | 0    | -             | 0.0018         |
| 1.5625 | 100  | -             | 0.0019         |
| 3.0    | 192  | -             | 0.0020         |
| 1.5625 | 100  | -             | 0.0021         |
| 3.125  | 200  | -             | 0.0020         |
| 4.6875 | 300  | -             | 0.0021         |
| 5.0    | 320  | -             | 0.0020         |
| 1.5625 | 100  | -             | 0.0020         |
| 3.125  | 200  | -             | 0.0021         |
| 4.6875 | 300  | -             | 0.0022         |
| 1.5625 | 100  | -             | 0.0021         |
| 3.125  | 200  | -             | 0.0019         |
| 4.6875 | 300  | -             | 0.0022         |
| 6.25   | 400  | -             | 0.0022         |
| 7.8125 | 500  | 0.0021        | 0.0022         |
| 9.375  | 600  | -             | 0.0023         |
| 10.0   | 640  | -             | 0.0022         |

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.42.3
- PyTorch: 2.3.0+cu121
- Accelerate: 0.32.1
- Datasets: 2.20.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```