|
# kwang2049/TSDAE-twitterpara2nli_stsb |
|
This is a model from the paper ["TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning"](https://arxiv.org/abs/2104.06979). It adapts knowledge from the NLI and STSb data to the specific domain twitterpara. The training procedure for this model was:
|
1. Initialized with [bert-base-uncased](https://huggingface.co/bert-base-uncased); |
|
2. Unsupervised training on twitterpara with the TSDAE objective; |
|
3. Supervised training on the NLI data with cross-entropy loss; |
|
4. Supervised training on the STSb data with MSE loss. |
|
|
|
The pooling method is CLS-pooling. |
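
In CLS-pooling, the sentence embedding is the final hidden state of the first ([CLS]) token, rather than the mean over all token states. A minimal sketch of the operation with plain `transformers`, using `bert-base-uncased` purely to illustrate the pooling step (not the fine-tuned checkpoint):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Illustration only: bert-base-uncased stands in for the actual encoder.
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
encoder = AutoModel.from_pretrained('bert-base-uncased')

inputs = tokenizer(['This is the first sentence.'], padding=True, return_tensors='pt')
with torch.no_grad():
    outputs = encoder(**inputs)

# CLS-pooling: the hidden state of the first token is the sentence embedding.
cls_embedding = outputs.last_hidden_state[:, 0]  # shape: (batch_size, hidden_size)
```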
|
|
|
## Usage |
|
A convenient way to use this model is through [SentenceTransformers](https://github.com/UKPLab/sentence-transformers). Install it via:
|
```bash
pip install sentence-transformers
```
|
And then load the model and use it to encode sentences: |
|
```python
from sentence_transformers import SentenceTransformer, models

dataset = 'twitterpara'
model_name_or_path = f'kwang2049/TSDAE-{dataset}2nli_stsb'
model = SentenceTransformer(model_name_or_path)
model[1] = models.Pooling(model[0].get_word_embedding_dimension(), pooling_mode='cls')  # Note: this model uses CLS-pooling
sentence_embeddings = model.encode(['This is the first sentence.', 'This is the second one.'])
```
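
The embeddings can then be compared directly, for example with cosine similarity. A short sketch using the `util` helpers shipped with recent SentenceTransformers versions:

```python
from sentence_transformers import util

# Cosine similarity between the two embeddings computed above.
similarity = util.cos_sim(sentence_embeddings[0], sentence_embeddings[1])
print(float(similarity))
```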
|
## Evaluation |
|
To evaluate the model against the datasets used in the paper, please install our evaluation toolkit [USEB](https://github.com/UKPLab/useb): |
|
```bash
pip install useb  # Or git clone and pip install .
python -m useb.downloading all  # Download both training and evaluation data
```
|
And then do the evaluation: |
|
```python
import torch
from sentence_transformers import SentenceTransformer, models
from useb import run_on

dataset = 'twitterpara'
model_name_or_path = f'kwang2049/TSDAE-{dataset}2nli_stsb'
model = SentenceTransformer(model_name_or_path)
model[1] = models.Pooling(model[0].get_word_embedding_dimension(), pooling_mode='cls')  # Note: this model uses CLS-pooling

@torch.no_grad()
def semb_fn(sentences) -> torch.Tensor:
    return torch.Tensor(model.encode(sentences, show_progress_bar=False))

result = run_on(
    dataset,
    semb_fn=semb_fn,
    eval_type='test',
    data_eval_path='data-eval'
)
```
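
`result` holds the evaluation scores for the chosen dataset; the exact structure is defined by USEB, so inspect it (or see the USEB README) for the available keys:

```python
# The scores computed by USEB for the chosen dataset; exact keys
# depend on the USEB version.
print(result)
```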
|
|
|
## Training |
|
Please refer to [the page of TSDAE training](https://github.com/UKPLab/sentence-transformers/tree/master/examples/unsupervised_learning/TSDAE) in SentenceTransformers. |
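
For orientation, below is a minimal TSDAE pre-training sketch using the `DenoisingAutoEncoderDataset` and `DenoisingAutoEncoderLoss` classes that SentenceTransformers provides for this objective. The corpus and hyperparameters here are placeholders, not the exact settings used for this model:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, datasets, losses

# Placeholder in-domain corpus; replace with your own unlabeled sentences.
train_sentences = ['A first in-domain sentence.', 'A second in-domain sentence.']

# Fresh encoder with CLS-pooling, mirroring this model's setup.
word_embedding_model = models.Transformer('bert-base-uncased')
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(), pooling_mode='cls')
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# The dataset wrapper adds noise (word deletion) to each input sentence;
# the loss ties a decoder to the encoder and reconstructs the original.
train_dataset = datasets.DenoisingAutoEncoderDataset(train_sentences)
train_dataloader = DataLoader(train_dataset, batch_size=8, shuffle=True)
train_loss = losses.DenoisingAutoEncoderLoss(model, decoder_name_or_path='bert-base-uncased', tie_encoder_decoder=True)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    show_progress_bar=True,
)
```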
|
|
|
## Cite & Authors |
|
If you use the code for evaluation, feel free to cite our publication [TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning](https://arxiv.org/abs/2104.06979):
|
```bibtex
@article{wang-2021-TSDAE,
    title = "TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning",
    author = "Wang, Kexin and Reimers, Nils and Gurevych, Iryna",
    journal = "arXiv preprint arXiv:2104.06979",
    month = "4",
    year = "2021",
    url = "https://arxiv.org/abs/2104.06979",
}
```