---
language:
- multilingual
- zh
- ja
- ar
- ko
- de
- fr
- es
- pt
- hi
- id
- it
- tr
- ru
- bn
- ur
- mr
- ta
- vi
- fa
- pl
- uk
- nl
- sv
- he
- sw
- ps
library_name: sentence-transformers
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dataset_size:10K<n<100K
- loss:CoSENTLoss
base_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
widget:
- source_sentence: Bottomless Mug
  sentences:
  - You are always safe.
  - That trend isn't very known yet
  - Eleanor Clift göreve koşuyor.
- source_sentence: Tripp has a job.
  sentences:
  - They are having money problems.
  - Malignite aniden ortaya çıkar.
  - Mezarlar derin ormanlarda saklandı.
- source_sentence: There are rules
  sentences:
  - There are more villians than heros.
  - The directions should be read.
  - Mezarlar derin ormanlarda saklandı.
- source_sentence: K is a musician.
  sentences:
  - Klimt draws hotdogs.
  - Ed Wood hiç mahkemeye çıkmadı.
  - Çeçen Rusya yönetimi ele geçirdi.
- source_sentence: We moved closer.
  sentences:
  - Clinton is unaware of the process.
  - Nesil deneyimleri anlamsızdır.
  - Hormonların etkileri vardır.
pipeline_tag: sentence-similarity
model-index:
- name: >-
    SentenceTransformer based on
    sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
  results:
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: tr ling
      type: tr_ling
    metrics:
    - type: pearson_cosine
      value: 0.058743115070889876
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.059526247945378225
      name: Spearman Cosine
    - type: pearson_manhattan
      value: 0.04582145815494953
      name: Pearson Manhattan
    - type: spearman_manhattan
      value: 0.04331287037397966
      name: Spearman Manhattan
    - type: pearson_euclidean
      value: 0.04709170917685587
      name: Pearson Euclidean
    - type: spearman_euclidean
      value: 0.04407504959649961
      name: Spearman Euclidean
    - type: pearson_dot
      value: 0.08477622619519222
      name: Pearson Dot
    - type: spearman_dot
      value: 0.08243745050110735
      name: Spearman Dot
    - type: pearson_max
      value: 0.08477622619519222
      name: Pearson Max
    - type: spearman_max
      value: 0.08243745050110735
      name: Spearman Max
---
SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
This is a sentence-transformers model finetuned from sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 on the MoritzLaurer/multilingual-nli-26lang-2mil7 dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
- Maximum Sequence Length: 128 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: MoritzLaurer/multilingual-nli-26lang-2mil7
- Languages: multilingual, zh, ja, ar, ko, de, fr, es, pt, hi, id, it, tr, ru, bn, ur, mr, ta, vi, fa, pl, uk, nl, sv, he, sw, ps
Model Sources
- Documentation: [Sentence Transformers Documentation](https://sbert.net)
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
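The Pooling module above averages token embeddings across the attention mask (mean pooling). As a sketch of what that computes, not code from this card, the same embedding can be reproduced with the transformers library on the base checkpoint:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Base checkpoint of this model; swap in the fine-tuned model id once published.
model_name = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

sentences = ["We moved closer.", "Nesil deneyimleri anlamsızdır."]
encoded = tokenizer(sentences, padding=True, truncation=True, max_length=128, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state  # (batch, seq_len, 384)

# Mean pooling: average token embeddings, ignoring padding via the attention mask.
mask = encoded["attention_mask"].unsqueeze(-1).float()
embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(embeddings.shape)  # torch.Size([2, 384])
```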
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'We moved closer.',
    'Clinton is unaware of the process.',
    'Nesil deneyimleri anlamsızdır.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
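The same embeddings also support semantic search. Below is a small illustrative sketch using `sentence_transformers.util.semantic_search`; the corpus and query are made-up examples, and `sentence_transformers_model_id` is the same placeholder as above:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence_transformers_model_id")

corpus = [
    "There are rules",
    "The directions should be read.",
    "Tripp has a job.",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query_embedding = model.encode("Reading the instructions is required.", convert_to_tensor=True)

# Retrieve the top-2 most similar corpus sentences by cosine similarity.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], hit["score"])
```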
Evaluation
Metrics
Semantic Similarity
- Dataset: `tr_ling`
- Evaluated with `EmbeddingSimilarityEvaluator`
| Metric | Value |
|---|---|
| pearson_cosine | 0.0587 |
| spearman_cosine | 0.0595 |
| pearson_manhattan | 0.0458 |
| spearman_manhattan | 0.0433 |
| pearson_euclidean | 0.0471 |
| spearman_euclidean | 0.0441 |
| pearson_dot | 0.0848 |
| spearman_dot | 0.0824 |
| pearson_max | 0.0848 |
| spearman_max | 0.0824 |
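These correlations were computed with the `EmbeddingSimilarityEvaluator`. As a minimal sketch of how such an evaluation is run, with placeholder sentence pairs and gold scores standing in for the actual `tr_ling` split:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("sentence_transformers_model_id")

# Placeholder evaluation pairs with gold similarity scores (illustrative only).
sentences1 = ["Tripp has a job.", "There are rules"]
sentences2 = ["They are having money problems.", "The directions should be read."]
scores = [0.1, 0.6]

evaluator = EmbeddingSimilarityEvaluator(sentences1, sentences2, scores, name="tr_ling")
results = evaluator(model)
print(results)  # Pearson/Spearman correlations per similarity function
```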
Training Details
Training Dataset
MoritzLaurer/multilingual-nli-26lang-2mil7
- Dataset: MoritzLaurer/multilingual-nli-26lang-2mil7 at 510a233
- Size: 25,000 training samples
- Columns: `premise_original`, `hypothesis_original`, `score`, `sentence1`, and `sentence2`
- Approximate statistics based on the first 1000 samples:
| | premise_original | hypothesis_original | score | sentence1 | sentence2 |
|---|---|---|---|---|---|
| type | string | string | int | string | string |
| details | min: 4 tokens, mean: 29.3 tokens, max: 107 tokens | min: 4 tokens, mean: 15.62 tokens, max: 40 tokens | 0: ~34.50%, 1: ~33.30%, 2: ~32.20% | min: 4 tokens, mean: 28.28 tokens, max: 101 tokens | min: 4 tokens, mean: 15.39 tokens, max: 38 tokens |
- Samples:
| premise_original | hypothesis_original | score | sentence1 | sentence2 |
|---|---|---|---|---|
| N, the total number of LC50 values used in calculating the CV(%) varied with organism and toxicant because some data were rejected due to water hardness, lack of concentration measurements, and/or because some of the LC50s were not calculable. | Most discarded data was rejected due to water hardness. | 1 | N, CV'nin hesaplanmasında kullanılan LC50 değerlerinin toplam sayısı (%) organizma ve toksik madde ile çeşitlidir, çünkü bazı veriler su sertliği, konsantrasyon ölçümlerinin eksikliği ve / veya LC50'lerin bazıları hesaplanamaz olduğu için reddedilmiştir. | Atılan verilerin çoğu su sertliği nedeniyle reddedildi. |
| As the home of the Venus de Milo and Mona Lisa, the Louvre drew almost unmanageable crowds until President Mitterrand ordered its re-organization in the 1980s. | The Louvre is home of the Venus de Milo and Mona Lisa. | 0 | Venus de Milo ve Mona Lisa'nın evi olarak Louvre, Başkan Mitterrand'ın 1980'lerde yeniden düzenlenmesini emredene kadar neredeyse yönetilemez kalabalıklar çekti. | Louvre, Venus de Milo ve Mona Lisa'nın evidir. |
| A year ago, the wife of the Oxford don noticed that the pattern on Kleenex quilted tissue uncannily resembled the Penrose Arrowed Rhombi tilings pattern, which Sir Roger had invented--and copyrighted--in 1974. | It has been recently found out a similarity between the pattern on the recent Kleenex quilted tissue and the one of the Penrose Arrowed Rhombi tilings. | 0 | Bir yıl önce Oxford'un karısı, Kleenex kapitone dokudaki desenin 1974'te Sir Roger'ın icat ettiği -ve telif hakkı olan - Penrose Arrowed Rhombi tilings desenine benzediğini fark etti. | Yakın zamanda, son Kleenex kapitone dokudaki desen ile Penrose Arrowed Rhombi döşemelerinden biri arasında bir benzerlik bulunmuştur. |
- Loss: `CoSENTLoss` with these parameters: `{"scale": 20.0, "similarity_fct": "pairwise_cos_sim"}`
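For reference, CoSENT is a pairwise ranking objective over cosine similarities rather than a pointwise regression. Following the formulation in Su Jianlin's blog post (cited below), with the scale $\lambda = 20$ used here, the batch loss is

$$
\mathcal{L}_{\text{CoSENT}} = \log\left(1 + \sum_{\operatorname{sim}(i,j) > \operatorname{sim}(k,l)} \exp\!\big(\lambda\,[\cos(u_k, u_l) - \cos(u_i, u_j)]\big)\right)
$$

where the sum runs over every pair of sentence pairs whose gold scores order them, so any inversion in the predicted cosine ranking contributes to the loss.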
Evaluation Dataset
MoritzLaurer/multilingual-nli-26lang-2mil7
- Dataset: MoritzLaurer/multilingual-nli-26lang-2mil7 at 510a233
- Size: 5,000 evaluation samples
- Columns: `premise_original`, `hypothesis_original`, `score`, `sentence1`, and `sentence2`
- Approximate statistics based on the first 1000 samples:
| | premise_original | hypothesis_original | score | sentence1 | sentence2 |
|---|---|---|---|---|---|
| type | string | string | int | string | string |
| details | min: 5 tokens, mean: 30.3 tokens, max: 99 tokens | min: 6 tokens, mean: 15.11 tokens, max: 56 tokens | 0: ~34.50%, 1: ~29.90%, 2: ~35.60% | min: 6 tokens, mean: 29.94 tokens, max: 106 tokens | min: 5 tokens, mean: 15.29 tokens, max: 52 tokens |
- Samples:
| premise_original | hypothesis_original | score | sentence1 | sentence2 |
|---|---|---|---|---|
| But the racism charge isn't quirky or wacky--it's demagogy. | The accusation of prejudice based on a pedestrian kind of hatred. | 0 | Ama ırkçılık suçlaması tuhaf ya da tuhaf değil, bu bir demagoji. | Yaya nefretine dayanan önyargı suçlaması. |
| Why would Gates allow the publication of such a book with his byline and photo on the dust jacket? | Gates' byline and photo are on the dust jacket | 0 | Gates neden böyle bir kitabın basılmasına izin versin ki? | Gates'in çizgisi ve fotoğrafı toz ceketin üzerinde. |
| I am a nonsmoker and allergic to cigarette smoke. | I do not smoke. | 0 | Sigara içmeyen biriyim ve sigara dumanına alerjim var. | Sigara içmiyorum. |
- Loss: `CoSENTLoss` with these parameters: `{"scale": 20.0, "similarity_fct": "pairwise_cos_sim"}`
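To make the setup concrete, below is a minimal, hedged training sketch using the Sentence Transformers v3 trainer API. The split name `tr_mnli` and the column handling are assumptions for illustration; the card only states that the `sentence1`, `sentence2`, and `score` columns were used with `CoSENTLoss`.

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import CoSENTLoss

model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")

# Split name is an assumption; the dataset ships one subset per language/source pair.
dataset = load_dataset("MoritzLaurer/multilingual-nli-26lang-2mil7", split="tr_mnli")

# CoSENTLoss consumes two text columns plus a numeric score label.
train_dataset = dataset.select_columns(["sentence1", "sentence2", "score"])

loss = CoSENTLoss(model, scale=20.0)

trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```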
Training Hyperparameters
Non-Default Hyperparameters
- `eval_strategy`: epoch
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 64
- `learning_rate`: 2e-05
- `num_train_epochs`: 5
- `warmup_ratio`: 0.1
- `fp16`: True
- `load_best_model_at_end`: True
- `ddp_find_unused_parameters`: False
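These non-default values map directly onto `SentenceTransformerTrainingArguments`; a sketch (the `output_dir` is a placeholder):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder path
    eval_strategy="epoch",
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    learning_rate=2e-5,
    num_train_epochs=5,
    warmup_ratio=0.1,
    fp16=True,
    load_best_model_at_end=True,
    ddp_find_unused_parameters=False,
)
```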
All Hyperparameters
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 64
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 5
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: False
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss | tr_ling_spearman_max |
---|---|---|---|---|
0.0320 | 25 | 17.17 | - | - |
0.0639 | 50 | 16.4932 | - | - |
0.0959 | 75 | 16.5976 | - | - |
0.1279 | 100 | 15.6991 | - | - |
0.1598 | 125 | 14.876 | - | - |
0.1918 | 150 | 14.4828 | - | - |
0.2238 | 175 | 12.7061 | - | - |
0.2558 | 200 | 10.8687 | - | - |
0.2877 | 225 | 8.3797 | - | - |
0.3197 | 250 | 6.2029 | - | - |
0.3517 | 275 | 5.8228 | - | - |
0.3836 | 300 | 5.811 | - | - |
0.4156 | 325 | 5.8079 | - | - |
0.4476 | 350 | 5.8077 | - | - |
0.4795 | 375 | 5.8035 | - | - |
0.5115 | 400 | 5.8072 | - | - |
0.5435 | 425 | 5.8033 | - | - |
0.5754 | 450 | 5.8086 | - | - |
0.6074 | 475 | 5.81 | - | - |
0.6394 | 500 | 5.7949 | - | - |
0.6714 | 525 | 5.8079 | - | - |
0.7033 | 550 | 5.8057 | - | - |
0.7353 | 575 | 5.8097 | - | - |
0.7673 | 600 | 5.7986 | - | - |
0.7992 | 625 | 5.8051 | - | - |
0.8312 | 650 | 5.8041 | - | - |
0.8632 | 675 | 5.7907 | - | - |
0.8951 | 700 | 5.7991 | - | - |
0.9271 | 725 | 5.8035 | - | - |
0.9591 | 750 | 5.7945 | - | - |
0.9910 | 775 | 5.8077 | - | - |
1.0 | 782 | - | 5.8024 | 0.0330 |
1.0230 | 800 | 5.6703 | - | - |
1.0550 | 825 | 5.8052 | - | - |
1.0870 | 850 | 5.7936 | - | - |
1.1189 | 875 | 5.7924 | - | - |
1.1509 | 900 | 5.7806 | - | - |
1.1829 | 925 | 5.7835 | - | - |
1.2148 | 950 | 5.7619 | - | - |
1.2468 | 975 | 5.8038 | - | - |
1.2788 | 1000 | 5.779 | - | - |
1.3107 | 1025 | 5.7904 | - | - |
1.3427 | 1050 | 5.7696 | - | - |
1.3747 | 1075 | 5.7919 | - | - |
1.4066 | 1100 | 5.7785 | - | - |
1.4386 | 1125 | 5.7862 | - | - |
1.4706 | 1150 | 5.7703 | - | - |
1.5026 | 1175 | 5.773 | - | - |
1.5345 | 1200 | 5.7627 | - | - |
1.5665 | 1225 | 5.7596 | - | - |
1.5985 | 1250 | 5.7882 | - | - |
1.6304 | 1275 | 5.7828 | - | - |
1.6624 | 1300 | 5.771 | - | - |
1.6944 | 1325 | 5.788 | - | - |
1.7263 | 1350 | 5.7719 | - | - |
1.7583 | 1375 | 5.7846 | - | - |
1.7903 | 1400 | 5.7838 | - | - |
1.8223 | 1425 | 5.7912 | - | - |
1.8542 | 1450 | 5.7686 | - | - |
1.8862 | 1475 | 5.7938 | - | - |
1.9182 | 1500 | 5.7847 | - | - |
1.9501 | 1525 | 5.7952 | - | - |
1.9821 | 1550 | 5.7528 | - | - |
2.0 | 1564 | - | 5.7933 | 0.0682 |
2.0141 | 1575 | 5.65 | - | - |
2.0460 | 1600 | 5.7537 | - | - |
2.0780 | 1625 | 5.7098 | - | - |
2.1100 | 1650 | 5.7149 | - | - |
2.1419 | 1675 | 5.7585 | - | - |
2.1739 | 1700 | 5.7277 | - | - |
2.2059 | 1725 | 5.7482 | - | - |
2.2379 | 1750 | 5.7115 | - | - |
2.2698 | 1775 | 5.6895 | - | - |
2.3018 | 1800 | 5.7389 | - | - |
2.3338 | 1825 | 5.7161 | - | - |
2.3657 | 1850 | 5.7123 | - | - |
2.3977 | 1875 | 5.7322 | - | - |
2.4297 | 1900 | 5.7421 | - | - |
2.4616 | 1925 | 5.7615 | - | - |
2.4936 | 1950 | 5.7493 | - | - |
2.5256 | 1975 | 5.7298 | - | - |
2.5575 | 2000 | 5.7529 | - | - |
2.5895 | 2025 | 5.7318 | - | - |
2.6215 | 2050 | 5.7036 | - | - |
2.6535 | 2075 | 5.7158 | - | - |
2.6854 | 2100 | 5.7209 | - | - |
2.7174 | 2125 | 5.738 | - | - |
2.7494 | 2150 | 5.7337 | - | - |
2.7813 | 2175 | 5.713 | - | - |
2.8133 | 2200 | 5.7257 | - | - |
2.8453 | 2225 | 5.6958 | - | - |
2.8772 | 2250 | 5.7053 | - | - |
2.9092 | 2275 | 5.7246 | - | - |
2.9412 | 2300 | 5.7291 | - | - |
2.9731 | 2325 | 5.7139 | - | - |
3.0 | 2346 | - | 5.8510 | 0.0837 |
3.0051 | 2350 | 5.5715 | - | - |
3.0371 | 2375 | 5.6558 | - | - |
3.0691 | 2400 | 5.6441 | - | - |
3.1010 | 2425 | 5.6569 | - | - |
3.1330 | 2450 | 5.669 | - | - |
3.1650 | 2475 | 5.6361 | - | - |
3.1969 | 2500 | 5.6524 | - | - |
3.2289 | 2525 | 5.6773 | - | - |
3.2609 | 2550 | 5.6552 | - | - |
3.2928 | 2575 | 5.6807 | - | - |
3.3248 | 2600 | 5.6638 | - | - |
3.3568 | 2625 | 5.6582 | - | - |
3.3887 | 2650 | 5.658 | - | - |
3.4207 | 2675 | 5.6626 | - | - |
3.4527 | 2700 | 5.6802 | - | - |
3.4847 | 2725 | 5.6377 | - | - |
3.5166 | 2750 | 5.6752 | - | - |
3.5486 | 2775 | 5.6573 | - | - |
3.5806 | 2800 | 5.6963 | - | - |
3.6125 | 2825 | 5.7007 | - | - |
3.6445 | 2850 | 5.6746 | - | - |
3.6765 | 2875 | 5.6312 | - | - |
3.7084 | 2900 | 5.5596 | - | - |
3.7404 | 2925 | 5.7003 | - | - |
3.7724 | 2950 | 5.6739 | - | - |
3.8043 | 2975 | 5.655 | - | - |
3.8363 | 3000 | 5.6787 | - | - |
3.8683 | 3025 | 5.643 | - | - |
3.9003 | 3050 | 5.6412 | - | - |
3.9322 | 3075 | 5.758 | - | - |
3.9642 | 3100 | 5.6769 | - | - |
3.9962 | 3125 | 5.7206 | - | - |
4.0 | 3128 | - | 5.9125 | 0.0824 |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.0
- Transformers: 4.41.0
- PyTorch: 2.3.0+cu121
- Accelerate: 0.30.1
- Datasets: 2.19.1
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
CoSENTLoss
```bibtex
@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}
```