---
language: []
library_name: sentence-transformers
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:1793370
  - loss:CoSENTLoss
base_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
datasets: []
metrics:
  - pearson_cosine
  - spearman_cosine
  - pearson_manhattan
  - spearman_manhattan
  - pearson_euclidean
  - spearman_euclidean
  - pearson_dot
  - spearman_dot
  - pearson_max
  - spearman_max
widget:
  - source_sentence: ek wil bietjie moderne rock hoor
    sentences:
      - request datetime
      - turn wemo on
      - query cooking
  - source_sentence: skakel af die alarm vir woensdag ses v. m.
    sentences:
      - set alarm
      - turn hue light up
      - request weather
  - source_sentence: speel my top-gegradeerde pop liedjies asseblief
    sentences:
      - greeting
      - request fact
      - request datetime
  - source_sentence: is dit warm buite
    sentences:
      - request weather
      - play music
      - request transport
  - source_sentence: maak 'n speellys van al die eminem liedjies en speel dit met skommel
    sentences:
      - search recipe
      - recommend movie
      - play music
pipeline_tag: sentence-similarity
model-index:
  - name: >-
      SentenceTransformer based on
      sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: MiniLM dev
          type: MiniLM-dev
        metrics:
          - type: pearson_cosine
            value: 0.807743120621169
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.8111451989044506
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.8090992313100879
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.8112673840020295
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.8107892143621067
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.8137277702128023
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.7013144883870261
            name: Pearson Dot
          - type: spearman_dot
            value: 0.7113684320495312
            name: Spearman Dot
          - type: pearson_max
            value: 0.8107892143621067
            name: Pearson Max
          - type: spearman_max
            value: 0.8137277702128023
            name: Spearman Max
---

SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a sentence-transformers model finetuned from sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
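
The settings listed above can be verified at runtime. A minimal sketch (using only standard SentenceTransformer accessors; the model ID matches the Usage section below) that reads the sequence length and embedding dimensionality from the loaded model:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("philipp-zettl/MiniLM-amazon_massive_intent-similarity")

# max_seq_length comes from module (0), the Transformer
print(model.max_seq_length)  # 128
# the embedding dimension comes from module (1), the mean-pooling layer
print(model.get_sentence_embedding_dimension())  # 384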

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("philipp-zettl/MiniLM-amazon_massive_intent-similarity")
# Run inference
sentences = [
    "maak 'n speellys van al die eminem liedjies en speel dit met skommel",
    'play music',
    'recommend movie',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
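
Because the model scores utterances against intent labels, a natural application is zero-shot intent ranking: embed one utterance and a set of candidate labels, then pick the highest-scoring label. A sketch under that assumption; the query and candidate labels below are taken from the widget examples above, not from an official label set:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("philipp-zettl/MiniLM-amazon_massive_intent-similarity")

intents = ["play music", "request weather", "set alarm", "recommend movie"]
query = "is dit warm buite"  # Afrikaans: "is it warm outside"

# Embed the utterance and all candidate intent labels
query_emb = model.encode([query])
intent_embs = model.encode(intents)

# similarity() returns a [1, 4] tensor of cosine scores
scores = model.similarity(query_emb, intent_embs)
print(intents[scores.argmax().item()])  # expected: "request weather"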

Evaluation

Metrics

Semantic Similarity

Metric Value
pearson_cosine 0.8077
spearman_cosine 0.8111
pearson_manhattan 0.8091
spearman_manhattan 0.8113
pearson_euclidean 0.8108
spearman_euclidean 0.8137
pearson_dot 0.7013
spearman_dot 0.7114
pearson_max 0.8108
spearman_max 0.8137
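
The metric names above match those produced by sentence-transformers' EmbeddingSimilarityEvaluator, run here on the MiniLM-dev split (not published with this card). A minimal sketch of such an evaluation, assuming a small hypothetical set of scored pairs in place of the real dev split:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("philipp-zettl/MiniLM-amazon_massive_intent-similarity")

# Hypothetical scored pairs; the real MiniLM-dev split is not published here
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["is dit warm buite", "is dit warm buite", "ek wil bietjie moderne rock hoor"],
    sentences2=["request weather", "recommend movie", "play music"],
    scores=[1.0, 0.0, 1.0],
    name="MiniLM-dev",
)
print(evaluator(model))  # dict of pearson/spearman scores per similarity measure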

Training Details

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
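
Translated into code, these settings correspond to a SentenceTransformerTrainer run with CoSENTLoss. A minimal sketch, assuming a hypothetical two-pair dataset in place of the real ~1.79M-pair training set (which is not published in this card); eval_strategy is omitted because it additionally requires an evaluation dataset:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CoSENTLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")

# Hypothetical (sentence1, sentence2, score) pairs standing in for the
# real dataset (dataset_size: 1793370 in the metadata)
train_dataset = Dataset.from_dict({
    "sentence1": ["is dit warm buite", "is dit warm buite"],
    "sentence2": ["request weather", "play music"],
    "score": [1.0, 0.0],
})

args = SentenceTransformerTrainingArguments(
    output_dir="output",
    num_train_epochs=1,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    fp16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # the no_duplicates sampler above
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=CoSENTLoss(model),  # loss: CoSENTLoss from the metadata
)
trainer.train()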

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch  Step  Training Loss  Validation Loss  MiniLM-dev_spearman_cosine
0.0018 100 10.7509 - -
0.0036 200 9.8726 - -
0.0054 300 8.9837 - -
0.0071 400 7.3162 - -
0.0089 500 8.2842 - -
0.0107 600 6.2254 - -
0.0125 700 6.1004 - -
0.0143 800 5.8583 - -
0.0161 900 6.3118 - -
0.0178 1000 5.7908 2.6141 0.4045
0.0196 1100 5.6907 - -
0.0214 1200 5.6743 - -
0.0232 1300 5.5022 - -
0.0250 1400 5.0283 - -
0.0268 1500 5.2936 - -
0.0285 1600 5.2928 - -
0.0303 1700 5.5088 - -
0.0321 1800 5.3125 - -
0.0339 1900 5.7931 - -
0.0357 2000 5.5979 2.3256 0.5075
0.0375 2100 5.3222 - -
0.0393 2200 5.268 - -
0.0410 2300 5.264 - -
0.0428 2400 4.9437 - -
0.0446 2500 4.9219 - -
0.0464 2600 4.8656 - -
0.0482 2700 5.2733 - -
0.0500 2800 5.0311 - -
0.0517 2900 5.302 - -
0.0535 3000 5.3347 2.1545 0.6496
0.0553 3100 5.1241 - -
0.0571 3200 5.0232 - -
0.0589 3300 4.9932 - -
0.0607 3400 4.9651 - -
0.0625 3500 4.5226 - -
0.0642 3600 4.6666 - -
0.0660 3700 4.8979 - -
0.0678 3800 4.9139 - -
0.0696 3900 4.9241 - -
0.0714 4000 5.2878 2.1118 0.6948
0.0732 4100 5.0776 - -
0.0749 4200 4.934 - -
0.0767 4300 4.9012 - -
0.0785 4400 4.8835 - -
0.0803 4500 4.5886 - -
0.0821 4600 4.7829 - -
0.0839 4700 4.8057 - -
0.0856 4800 4.8761 - -
0.0874 4900 4.6787 - -
0.0892 5000 5.313 2.1114 0.6770
0.0910 5100 5.3036 - -
0.0928 5200 5.0731 - -
0.0946 5300 5.0052 - -
0.0964 5400 4.9494 - -
0.0981 5500 4.836 - -
0.0999 5600 4.6319 - -
0.1017 5700 4.667 - -
0.1035 5800 4.9578 - -
0.1053 5900 4.9473 - -
0.1071 6000 4.9897 3.0813 0.4424
0.1088 6100 5.1704 - -
0.1106 6200 4.8472 - -
0.1124 6300 4.8296 - -
0.1142 6400 4.8287 - -
0.1160 6500 4.6539 - -
0.1178 6600 4.2599 - -
0.1196 6700 4.5506 - -
0.1213 6800 4.6585 - -
0.1231 6900 4.7248 - -
0.1249 7000 4.6389 3.1390 0.5199
0.1267 7100 4.8133 - -
0.1285 7200 4.8838 - -
0.1303 7300 4.7375 - -
0.1320 7400 4.6357 - -
0.1338 7500 4.7807 - -
0.1356 7600 4.409 - -
0.1374 7700 4.5612 - -
0.1392 7800 4.3731 - -
0.1410 7900 4.622 - -
0.1427 8000 4.5574 2.6558 0.5814
0.1445 8100 4.6542 - -
0.1463 8200 4.7831 - -
0.1481 8300 4.6775 - -
0.1499 8400 4.61 - -
0.1517 8500 4.6416 - -
0.1535 8600 4.3096 - -
0.1552 8700 4.2629 - -
0.1570 8800 4.5151 - -
0.1588 8900 4.5301 - -
0.1606 9000 4.5731 2.8939 0.5675
0.1624 9100 4.4347 - -
0.1642 9200 4.648 - -
0.1659 9300 4.6076 - -
0.1677 9400 4.4229 - -
0.1695 9500 4.4785 - -
0.1713 9600 4.4252 - -
0.1731 9700 4.0223 - -
0.1749 9800 4.1593 - -
0.1767 9900 4.2946 - -
0.1784 10000 4.4888 2.7814 0.5852
0.1802 10100 4.3605 - -
0.1820 10200 4.5952 - -
0.1838 10300 4.709 - -
0.1856 10400 4.5743 - -
0.1874 10500 4.5539 - -
0.1891 10600 4.4427 - -
0.1909 10700 4.1095 - -
0.1927 10800 4.4079 - -
0.1945 10900 4.1667 - -
0.1963 11000 4.2273 3.3803 0.5663
0.1981 11100 4.3333 - -
0.1998 11200 4.5174 - -
0.2016 11300 4.4961 - -
0.2034 11400 4.5746 - -
0.2052 11500 4.731 - -
0.2070 11600 4.4485 - -
0.2088 11700 4.4099 - -
0.2106 11800 3.8921 - -
0.2123 11900 4.2423 - -
0.2141 12000 4.2641 3.0230 0.6300
0.2159 12100 4.2052 - -
0.2177 12200 4.2757 - -
0.2195 12300 4.8586 - -
0.2213 12400 4.5872 - -
0.2230 12500 4.4273 - -
0.2248 12600 4.5728 - -
0.2266 12700 4.4607 - -
0.2284 12800 4.1361 - -
0.2302 12900 4.4781 - -
0.2320 13000 4.145 2.7088 0.6617
0.2337 13100 4.3366 - -
0.2355 13200 4.2699 - -
0.2373 13300 4.3397 - -
0.2391 13400 4.6033 - -
0.2409 13500 4.2292 - -
0.2427 13600 4.3399 - -
0.2445 13700 4.5222 - -
0.2462 13800 4.2185 - -
0.2480 13900 3.9426 - -
0.2498 14000 4.2146 2.6014 0.6724
0.2516 14100 4.2534 - -
0.2534 14200 4.1765 - -
0.2552 14300 4.117 - -
0.2569 14400 5.0908 - -
0.2587 14500 4.488 - -
0.2605 14600 4.4429 - -
0.2623 14700 4.3688 - -
0.2641 14800 4.4857 - -
0.2659 14900 4.1763 - -
0.2677 15000 4.4425 2.6388 0.6842
0.2694 15100 4.4277 - -
0.2712 15200 4.3841 - -
0.2730 15300 4.4 - -
0.2748 15400 4.55 - -
0.2766 15500 4.4769 - -
0.2784 15600 4.3918 - -
0.2801 15700 4.554 - -
0.2819 15800 4.406 - -
0.2837 15900 4.0593 - -
0.2855 16000 4.3586 2.5251 0.7238
0.2873 16100 4.2308 - -
0.2891 16200 4.469 - -
0.2908 16300 4.2312 - -
0.2926 16400 4.2695 - -
0.2944 16500 4.5821 - -
0.2962 16600 4.5623 - -
0.2980 16700 4.1865 - -
0.2998 16800 4.4228 - -
0.3016 16900 4.0553 - -
0.3033 17000 3.7183 2.6050 0.7319
0.3051 17100 4.1849 - -
0.3069 17200 4.2975 - -
0.3087 17300 4.4272 - -
0.3105 17400 4.0634 - -
0.3123 17500 4.8608 - -
0.3140 17600 4.4146 - -
0.3158 17700 4.2655 - -
0.3176 17800 4.3814 - -
0.3194 17900 4.3972 - -
0.3212 18000 3.8868 2.4737 0.7500
0.3230 18100 4.434 - -
0.3248 18200 4.2213 - -
0.3265 18300 4.4632 - -
0.3283 18400 4.4001 - -
0.3301 18500 4.8262 - -
0.3319 18600 4.5022 - -
0.3337 18700 4.4148 - -
0.3355 18800 4.2182 - -
0.3372 18900 4.2127 - -
0.3390 19000 4.051 2.4633 0.7575
0.3408 19100 3.655 - -
0.3426 19200 4.2441 - -
0.3444 19300 4.3494 - -
0.3462 19400 4.1824 - -
0.3479 19500 4.3528 - -
0.3497 19600 5.6073 - -
0.3515 19700 4.8231 - -
0.3533 19800 4.5816 - -
0.3551 19900 4.5812 - -
0.3569 20000 4.637 2.1229 0.7945
0.3587 20100 4.2619 - -
0.3604 20200 4.5645 - -
0.3622 20300 4.7248 - -
0.3640 20400 4.5665 - -
0.3658 20500 4.5628 - -
0.3676 20600 4.8494 - -
0.3694 20700 4.4338 - -
0.3711 20800 4.3256 - -
0.3729 20900 4.4388 - -
0.3747 21000 4.158 2.3475 0.7732
0.3765 21100 3.962 - -
0.3783 21200 3.931 - -
0.3801 21300 4.0345 - -
0.3818 21400 4.319 - -
0.3836 21500 4.1329 - -
0.3854 21600 4.245 - -
0.3872 21700 4.518 - -
0.3890 21800 4.4653 - -
0.3908 21900 4.2777 - -
0.3926 22000 4.3358 2.1933 0.7845
0.3943 22100 4.2291 - -
0.3961 22200 3.8067 - -
0.3979 22300 4.2039 - -
0.3997 22400 4.0104 - -
0.4015 22500 4.2346 - -
0.4033 22600 4.0056 - -
0.4050 22700 5.6038 - -
0.4068 22800 5.1185 - -
0.4086 22900 4.924 - -
0.4104 23000 4.7841 1.9839 0.7956
0.4122 23100 4.7953 - -
0.4140 23200 4.4229 - -
0.4158 23300 4.6432 - -
0.4175 23400 4.5284 - -
0.4193 23500 4.7215 - -
0.4211 23600 4.7432 - -
0.4229 23700 5.0136 - -
0.4247 23800 4.7958 - -
0.4265 23900 4.6827 - -
0.4282 24000 4.6665 1.9663 0.7870
0.4300 24100 4.5074 - -
0.4318 24200 4.4189 - -
0.4336 24300 4.4586 - -
0.4354 24400 4.6421 - -
0.4372 24500 4.4281 - -
0.4389 24600 4.5153 - -
0.4407 24700 4.9942 - -
0.4425 24800 5.11 - -
0.4443 24900 4.7071 - -
0.4461 25000 4.6257 1.9461 0.7935
0.4479 25100 4.6576 - -
0.4497 25200 4.6103 - -
0.4514 25300 4.2066 - -
0.4532 25400 4.6869 - -
0.4550 25500 4.7575 - -
0.4568 25600 4.6081 - -
0.4586 25700 4.8144 - -
0.4604 25800 5.2007 - -
0.4621 25900 4.8367 - -
0.4639 26000 4.5258 1.9131 0.7993
0.4657 26100 4.4784 - -
0.4675 26200 4.5568 - -
0.4693 26300 4.2591 - -
0.4711 26400 4.4521 - -
0.4729 26500 4.4041 - -
0.4746 26600 4.4926 - -
0.4764 26700 4.1686 - -
0.4782 26800 4.6294 - -
0.4800 26900 4.6889 - -
0.4818 27000 4.5765 1.9539 0.7961
0.4836 27100 4.3427 - -
0.4853 27200 4.5275 - -
0.4871 27300 4.4186 - -
0.4889 27400 4.0163 - -
0.4907 27500 4.3204 - -
0.4925 27600 4.179 - -
0.4943 27700 4.3838 - -
0.4960 27800 4.2631 - -
0.4978 27900 4.7177 - -
0.4996 28000 4.5161 2.0116 0.7935
0.5014 28100 4.2861 - -
0.5032 28200 4.4123 - -
0.5050 28300 4.293 - -
0.5068 28400 4.2346 - -
0.5085 28500 4.3355 - -
0.5103 28600 4.4616 - -
0.5121 28700 4.2409 - -
0.5139 28800 4.2398 - -
0.5157 28900 4.7412 - -
0.5175 29000 4.5044 2.1008 0.7859
0.5192 29100 4.4556 - -
0.5210 29200 4.2938 - -
0.5228 29300 4.4962 - -
0.5246 29400 4.477 - -
0.5264 29500 4.2602 - -
0.5282 29600 4.4231 - -
0.5300 29700 4.2165 - -
0.5317 29800 4.3729 - -
0.5335 29900 4.2414 - -
0.5353 30000 4.9937 2.0884 0.7702
0.5371 30100 4.5737 - -
0.5389 30200 4.4517 - -
0.5407 30300 4.4178 - -
0.5424 30400 4.3514 - -
0.5442 30500 3.9723 - -
0.5460 30600 4.3707 - -
0.5478 30700 4.2235 - -
0.5496 30800 4.4278 - -
0.5514 30900 4.2914 - -
0.5531 31000 4.5636 2.3277 0.7454
0.5549 31100 4.4889 - -
0.5567 31200 4.3211 - -
0.5585 31300 4.404 - -
0.5603 31400 4.2117 - -
0.5621 31500 4.1126 - -
0.5639 31600 4.1737 - -
0.5656 31700 4.203 - -
0.5674 31800 4.1093 - -
0.5692 31900 4.0702 - -
0.5710 32000 4.4189 2.6265 0.7375
0.5728 32100 4.9817 - -
0.5746 32200 4.4736 - -
0.5763 32300 4.348 - -
0.5781 32400 4.5404 - -
0.5799 32500 4.2987 - -
0.5817 32600 4.0725 - -
0.5835 32700 4.5469 - -
0.5853 32800 4.4367 - -
0.5870 32900 4.3369 - -
0.5888 33000 4.2292 2.5687 0.7213
0.5906 33100 4.7929 - -
0.5924 33200 4.4123 - -
0.5942 33300 4.1699 - -
0.5960 33400 4.4021 - -
0.5978 33500 4.5257 - -
0.5995 33600 3.7222 - -
0.6013 33700 4.0746 - -
0.6031 33800 4.1399 - -
0.6049 33900 3.9957 - -
0.6067 34000 4.093 2.4645 0.7524
0.6085 34100 4.2929 - -
0.6102 34200 4.4765 - -
0.6120 34300 4.3871 - -
0.6138 34400 4.385 - -
0.6156 34500 4.1455 - -
0.6174 34600 3.7689 - -
0.6192 34700 3.6574 - -
0.6210 34800 4.2426 - -
0.6227 34900 4.293 - -
0.6245 35000 4.1368 2.4370 0.7765
0.6263 35100 3.6174 - -
0.6281 35200 4.7763 - -
0.6299 35300 4.3121 - -
0.6317 35400 4.1886 - -
0.6334 35500 4.3538 - -
0.6352 35600 4.0285 - -
0.6370 35700 3.4691 - -
0.6388 35800 4.2732 - -
0.6406 35900 4.2052 - -
0.6424 36000 4.0452 2.4680 0.7732
0.6441 36100 3.9032 - -
0.6459 36200 4.2608 - -
0.6477 36300 4.262 - -
0.6495 36400 4.1138 - -
0.6513 36500 4.248 - -
0.6531 36600 4.1163 - -
0.6549 36700 3.6375 - -
0.6566 36800 4.0768 - -
0.6584 36900 4.0268 - -
0.6602 37000 4.0129 2.6361 0.7702
0.6620 37100 3.7976 - -
0.6638 37200 4.2518 - -
0.6656 37300 4.5011 - -
0.6673 37400 4.4488 - -
0.6691 37500 3.9798 - -
0.6709 37600 4.027 - -
0.6727 37700 4.0342 - -
0.6745 37800 3.8229 - -
0.6763 37900 4.0573 - -
0.6781 38000 4.1739 2.4511 0.7935
0.6798 38100 4.57 - -
0.6816 38200 3.9108 - -
0.6834 38300 4.3569 - -
0.6852 38400 4.3775 - -
0.6870 38500 4.2887 - -
0.6888 38600 4.144 - -
0.6905 38700 4.5112 - -
0.6923 38800 3.5093 - -
0.6941 38900 3.9626 - -
0.6959 39000 4.024 2.4241 0.7868
0.6977 39100 4.0671 - -
0.6995 39200 3.9545 - -
0.7012 39300 4.0036 - -
0.7030 39400 4.3796 - -
0.7048 39500 4.2912 - -
0.7066 39600 4.1181 - -
0.7084 39700 4.1437 - -
0.7102 39800 3.8734 - -
0.7120 39900 3.7678 - -
0.7137 40000 4.2327 2.3937 0.7956
0.7155 40100 3.8276 - -
0.7173 40200 4.2885 - -
0.7191 40300 4.019 - -
0.7209 40400 4.6898 - -
0.7227 40500 4.2398 - -
0.7244 40600 4.317 - -
0.7262 40700 4.2543 - -
0.7280 40800 4.1048 - -
0.7298 40900 3.4243 - -
0.7316 41000 4.0587 2.2848 0.8035
0.7334 41100 4.2112 - -
0.7351 41200 4.0331 - -
0.7369 41300 4.2361 - -
0.7387 41400 4.3818 - -
0.7405 41500 4.1311 - -
0.7423 41600 4.0607 - -
0.7441 41700 4.1277 - -
0.7459 41800 3.8844 - -
0.7476 41900 3.6138 - -
0.7494 42000 3.7973 2.4197 0.8045
0.7512 42100 4.0854 - -
0.7530 42200 4.0926 - -
0.7548 42300 3.9821 - -
0.7566 42400 4.5564 - -
0.7583 42500 6.1707 - -
0.7601 42600 5.4598 - -
0.7619 42700 5.2202 - -
0.7637 42800 5.1402 - -
0.7655 42900 4.8446 - -
0.7673 43000 4.5341 1.9710 0.8181
0.7691 43100 5.0068 - -
0.7708 43200 5.0099 - -
0.7726 43300 4.7986 - -
0.7744 43400 5.0468 - -
0.7762 43500 5.135 - -
0.7780 43600 4.8018 - -
0.7798 43700 4.6291 - -
0.7815 43800 4.6119 - -
0.7833 43900 4.5318 - -
0.7851 44000 3.9703 1.9790 0.8211
0.7869 44100 4.461 - -
0.7887 44200 4.5536 - -
0.7905 44300 4.411 - -
0.7922 44400 4.5796 - -
0.7940 44500 4.7385 - -
0.7958 44600 4.6635 - -
0.7976 44700 4.4808 - -
0.7994 44800 4.5565 - -
0.8012 44900 4.4707 - -
0.8030 45000 3.9981 1.9823 0.8197
0.8047 45100 4.119 - -
0.8065 45200 4.4209 - -
0.8083 45300 4.3268 - -
0.8101 45400 4.2979 - -
0.8119 45500 4.413 - -
0.8137 45600 4.3317 - -
0.8154 45700 4.3683 - -
0.8172 45800 4.0769 - -
0.8190 45900 4.304 - -
0.8208 46000 4.0985 2.0490 0.8183
0.8226 46100 3.8719 - -
0.8244 46200 4.1843 - -
0.8262 46300 4.2131 - -
0.8279 46400 4.3327 - -
0.8297 46500 3.8533 - -
0.8315 46600 5.2854 - -
0.8333 46700 5.2465 - -
0.8351 46800 5.0221 - -
0.8369 46900 4.9466 - -
0.8386 47000 5.0361 1.8252 0.8360
0.8404 47100 4.3676 - -
0.8422 47200 4.619 - -
0.8440 47300 4.6412 - -
0.8458 47400 4.7874 - -
0.8476 47500 4.663 - -
0.8493 47600 4.7068 - -
0.8511 47700 4.5889 - -
0.8529 47800 4.3468 - -
0.8547 47900 4.4393 - -
0.8565 48000 4.5488 1.9117 0.8176
0.8583 48100 4.0933 - -
0.8601 48200 3.7754 - -
0.8618 48300 4.1346 - -
0.8636 48400 4.402 - -
0.8654 48500 4.0163 - -
0.8672 48600 4.3405 - -
0.8690 48700 4.7694 - -
0.8708 48800 4.4457 - -
0.8725 48900 4.3679 - -
0.8743 49000 4.3283 1.9392 0.8251
0.8761 49100 4.6855 - -
0.8779 49200 3.881 - -
0.8797 49300 4.1392 - -
0.8815 49400 4.4343 - -
0.8833 49500 4.4822 - -
0.8850 49600 4.3977 - -
0.8868 49700 4.5944 - -
0.8886 49800 4.4176 - -
0.8904 49900 4.5269 - -
0.8922 50000 4.4267 1.8965 0.8206
0.8940 50100 4.5109 - -
0.8957 50200 4.1775 - -
0.8975 50300 4.3453 - -
0.8993 50400 4.5443 - -
0.9011 50500 4.226 - -
0.9029 50600 4.3296 - -
0.9047 50700 4.1968 - -
0.9064 50800 4.2206 - -
0.9082 50900 4.2299 - -
0.9100 51000 4.0471 2.0479 0.8146
0.9118 51100 4.0832 - -
0.9136 51200 3.7516 - -
0.9154 51300 4.0545 - -
0.9172 51400 4.1281 - -
0.9189 51500 4.2336 - -
0.9207 51600 4.2511 - -
0.9225 51700 4.2588 - -
0.9243 51800 4.0719 - -
0.9261 51900 4.1847 - -
0.9279 52000 4.1445 2.1419 0.8128
0.9296 52100 3.9735 - -
0.9314 52200 3.8635 - -
0.9332 52300 4.1738 - -
0.9350 52400 4.07 - -
0.9368 52500 4.1008 - -
0.9386 52600 3.9628 - -
0.9403 52700 4.2895 - -
0.9421 52800 4.3393 - -
0.9439 52900 2.8535 - -
0.9457 53000 2.5506 2.1743 0.8116
0.9475 53100 2.1566 - -
0.9493 53200 2.0386 - -
0.9511 53300 1.8535 - -
0.9528 53400 1.8561 - -
0.9546 53500 1.3213 - -
0.9564 53600 1.0904 - -
0.9582 53700 1.2266 - -
0.9600 53800 0.9386 - -
0.9618 53900 0.8379 - -
0.9635 54000 0.9314 2.3331 0.8071
0.9653 54100 1.1145 - -
0.9671 54200 1.4435 - -
0.9689 54300 1.3226 - -
0.9707 54400 0.6677 - -
0.9725 54500 0.7357 - -
0.9743 54600 0.6854 - -
0.9760 54700 0.8408 - -
0.9778 54800 0.6291 - -
0.9796 54900 0.8203 - -
0.9814 55000 1.6263 2.4720 0.8104
0.9832 55100 0.95 - -
0.9850 55200 0.6462 - -
0.9867 55300 1.2467 - -
0.9885 55400 1.4926 - -
0.9903 55500 1.9608 - -
0.9921 55600 1.6415 - -
0.9939 55700 1.3258 - -
0.9957 55800 1.2157 - -
0.9974 55900 1.2391 - -
0.9992 56000 1.3474 2.5008 0.8111

Framework Versions

  • Python: 3.10.14
  • Sentence Transformers: 3.0.1
  • Transformers: 4.41.2
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.33.0
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}