---
license: apache-2.0
---

## Model Card: dstc11-simmc2.1-scut-bds-lab

**Team**: [scut-bds-lab](https://github.com/scut-bds)

## Recent Update

- 👏🏻 2022.10.10: The repository `dstc11-simmc2.1-scut-bds-lab` for [DSTC11 Track 1](https://github.com/facebookresearch/simmc2) was created.
- 👏🏻 2022.10.28: The models were released on Hugging Face; see [https://huggingface.co/scutcyr/dstc11-simmc2.1-scut-bds-lab](https://huggingface.co/scutcyr/dstc11-simmc2.1-scut-bds-lab) for details.

## Overview

The [SIMMC2.1](https://github.com/facebookresearch/simmc2) challenge aims to lay the foundations for real-world assistant agents that can handle multimodal inputs and perform multimodal actions. It comprises four tasks: Ambiguous Candidate Identification, Multimodal Coreference Resolution, Multimodal Dialog State Tracking, and Response Generation. We treat the joint input of the textual context, the tokenized objects, and the scene as the multimodal input, and we compare the performance of single-task training against multi-task joint training.

For Subtask 4, we additionally use the system belief state (act and slot values) as a prompt for response generation. Non-visual metadata is also incorporated by adding its embedding to the corresponding object representation.
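
As a purely hypothetical illustration of this idea (the special tokens and layout below are our own, not the repository's actual preprocessing format), the flattened Subtask 4 input could combine the dialog history, candidate object tokens, and the system belief state roughly as follows:

```python
# Hypothetical sketch of flattening the multimodal context into one sequence.
# The token names (<OBJ_12>, "=> Belief State:") and the belief-state string
# are illustrative only; the repository's preprocessing defines the real format.
dialog_history = "User : Do you have that jacket in a smaller size?"
object_tokens = "<OBJ_12> <OBJ_57>"            # candidate objects in the scene
system_belief = "REQUEST:GET [ size = S ] ()"  # act and slot values

model_input = f"{dialog_history} {object_tokens} => Belief State: {system_belief}"
print(model_input)
```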

## Model Date

The models were originally released in October 2022.

## Model Type

**mt-bart**, **mt-bart-sys**, and **mt-bart-sys-nvattr** share the same model framework (a Transformer encoder-decoder with multi-task heads) and are finetuned on [SIMMC2.1](https://github.com/facebookresearch/simmc2) from the pretrained [BART-Large](https://huggingface.co/facebook/bart-large) checkpoint. The [code repository](https://github.com/scutcyr/dstc11-simmc2.1-scut-bds-lab) also contains the code to finetune the models.
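
As a rough sketch of how one of the released checkpoints might be loaded with 🤗 Transformers (the local path below is hypothetical, and the multi-task heads for Subtasks 1–3 require the repository's own model classes rather than a vanilla Transformers class):

```python
# Minimal loading sketch, not the repository's training/inference pipeline.
# Assumptions: the checkpoint directory is hypothetical (one sub-folder per
# model variant), and it is readable as a standard BART-style seq2seq model.
import os

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint_dir = os.path.expanduser(
    "~/pretrained_model/dstc11-simmc2.1-scut-bds-lab/mt-bart-sys"  # hypothetical path
)
tokenizer = AutoTokenizer.from_pretrained(checkpoint_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint_dir)
```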

## Results

### devtest results

| Model | Subtask-1 Amb. Candi. F1 | Subtask-2 MM Coref F1 | Subtask-3 MM DST Slot F1 | Subtask-3 MM DST Intent F1 | Subtask-4 Response Gen. BLEU-4 |
|:----:|:----:|:----:|:----:|:----:|:----:|
| mt-bart-ensemble | 0.68466 | 0.77860 | 0.91816 | 0.97828 | 0.34496 |
| mt-bart-dstcla | 0.67589 | 0.78407 | 0.92013 | 0.97468 | |
| mt-bart-dstcla-ensemble | 0.67777 | 0.78640 | 0.92055 | 0.97456 | |
| mt-bart-sys | | | | | 0.39064 |
| mt-bart-sys-2 | | | | | 0.3909 |
| mt-bart-sys-ensemble | | | | | 0.3894 |
| mt-bart-sys-nvattr | | | | | 0.38995 |

### teststd results

The teststd results are provided in [teststd-result](https://github.com/scutcyr/dstc11-simmc2.1-iflytek/blob/main/results/teststd-result); each subfolder corresponds to one model.

## Using with Transformers

(1) First, download the model from Hugging Face using the following commands:

```bash
cd ~
mkdir pretrained_model
cd pretrained_model
git lfs install
git clone https://huggingface.co/scutcyr/dstc11-simmc2.1-scut-bds-lab
```

(2) Then clone our code using the following commands:

```bash
cd ~
git clone https://github.com/scutcyr/dstc11-simmc2.1-scut-bds-lab.git
```

(3) Follow the [README](https://github.com/scutcyr/dstc11-simmc2.1-scut-bds-lab#readme) to use the model.
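
As a quick, purely illustrative smoke test of response generation (reusing the `tokenizer` and `model` from the loading sketch in the Model Type section; the input string below does not follow the repository's real preprocessing format):

```python
# Purely illustrative: the real input format (object tokens, belief-state
# prompt, special tokens) is produced by the repository's preprocessing
# scripts; this dialog string is made up for demonstration only.
dialog_context = "User : Can I see the price of the red jacket on the left?"

inputs = tokenizer(dialog_context, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```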

## References

```
@inproceedings{kottur-etal-2021-simmc,
    title = "{SIMMC} 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations",
    author = "Kottur, Satwik and
      Moon, Seungwhan and
      Geramifard, Alborz and
      Damavandi, Babak",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.401",
    doi = "10.18653/v1/2021.emnlp-main.401",
    pages = "4903--4912",
}

@inproceedings{lee-etal-2022-learning,
    title = "Learning to Embed Multi-Modal Contexts for Situated Conversational Agents",
    author = "Lee, Haeju and
      Kwon, Oh Joon and
      Choi, Yunseon and
      Park, Minho and
      Han, Ran and
      Kim, Yoonhyung and
      Kim, Jinhyeon and
      Lee, Youngjune and
      Shin, Haebin and
      Lee, Kangwook and
      Kim, Kee-Eung",
    booktitle = "Findings of the Association for Computational Linguistics: NAACL 2022",
    month = jul,
    year = "2022",
    address = "Seattle, United States",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.findings-naacl.61",
    doi = "10.18653/v1/2022.findings-naacl.61",
    pages = "813--830",
}
```

## Acknowledgements

* We would like to express our gratitude to the authors of [Hugging Face's Transformers🤗](https://huggingface.co/) and its open-source community for their excellent design for using pretrained models.
* We would like to express our gratitude to [Meta Research | Facebook AI Research](https://github.com/facebookresearch) for the SIMMC2.1 dataset and the baseline code.
* We would like to express our gratitude to [KAIST-AILab](https://github.com/KAIST-AILab/DSTC10-SIMMC) for the basic research framework on SIMMC2.0.