|
---
language: code
tags:
- code
- gpt2
- generation
datasets:
- giulio98/xlcost-single-prompt
widget:
- text: "'''\nfunction to add two numbers\n'''\n###\n"
  example_title: "add two numbers"
model-index:
- name: codegen-350M-multi-xlcost
  results:
  - task:
      name: Code Generation
      type: code-generation
    dataset:
      name: "XLCost"
      type: code_eval_outputs
    metrics:
    - name: pass@1
      type: code_eval_outputs
      value: 3.325
    - name: pass@10
      type: code_eval_outputs
      value: 15
    - name: codebleu
      type: codebleu
      value: 20.18191
---
|
|
|
# CodeGen-350M-multi-xlcost-v2 |
|
|
|
CodeGen-350M-multi-xlcost-v2 is a CodeGen model fine-tuned on the Python split of the XLCost dataset using DeepSpeed.
|
|
|
## Usage |
|
|
|
You can load the CodeGen-350M-multi-xlcost-v2 model and tokenizer directly in `transformers`: |
|
|
|
```Python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("giulio98/codegen-350M-multi-xlcost-v2")
model = AutoModelForCausalLM.from_pretrained("giulio98/codegen-350M-multi-xlcost-v2")

# The prompt format is an EOS token, a docstring-style description and the "###" separator.
text = tokenizer.eos_token + "'''\n" + "function to add two numbers" + "\n'''\n" + "###\n"
input_ids = tokenizer(text, return_tensors="pt").input_ids

generated_ids = model.generate(input_ids, max_length=128)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```
|
Output: |
|
```Python
'''
function to add two numbers
'''
###
def add(a, b):
    return a + b
```
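
The pass@k numbers reported below rely on several generations per prompt. If you want multiple candidate completions, you can sample instead of decoding greedily. The sketch below reuses `model`, `tokenizer`, and `input_ids` from the snippet above; the sampling parameters are illustrative and are not necessarily the settings used for the reported metrics.

```Python
import torch

torch.manual_seed(0)

# Sample several candidate completions for the same prompt.
generated_ids = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
    max_length=128,
    num_return_sequences=5,
    pad_token_id=tokenizer.eos_token_id,
)
for candidate in tokenizer.batch_decode(generated_ids, skip_special_tokens=True):
    print(candidate)
    print("-" * 40)
```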
|
## Training |
|
|
|
The model was fine-tuned on [XLCost-single-prompt](https://huggingface.co/datasets/giulio98/xlcost-single-prompt), an improved version of the original XLCost dataset [xlcost-text-to-code](https://huggingface.co/datasets/codeparrot/xlcost-text-to-code). A sketch of loading the dataset is shown below, followed by the fine-tuning hyperparameters.
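
A minimal loading sketch with the `datasets` library; the configuration name `"Python"` is assumed from the dataset viewer and may differ.

```Python
from datasets import load_dataset

# Load the Python configuration of the fine-tuning dataset
# (config name assumed from the dataset viewer).
dataset = load_dataset("giulio98/xlcost-single-prompt", "Python", split="train")
print(dataset)
print(dataset[0].keys())
```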
|
|
|
| Hyperparameter              | Value   |
|-----------------------------|---------|
| Per-device train batch size | 16      |
| Context size                | 1024    |
| Training steps              | 259     |
| Gradient accumulation steps | 2       |
| Gradient checkpointing      | True    |
| Learning rate               | 1.8e-05 |
| Weight decay                | 0.1     |
| Warmup steps                | 35      |
| Schedule                    | linear  |
| ZeRO stage                  | 2       |
|
|
|
The DeepSpeed configuration is shown below:
|
```json
{
    "fp16": {
        "enabled": true,
        "loss_scale": 0,
        "loss_scale_window": 1000,
        "initial_scale_power": 16,
        "hysteresis": 2,
        "min_loss_scale": 1
    },
    "optimizer": {
        "type": "AdamW",
        "params": {
            "lr": 0.000018,
            "betas": [0.9, 0.999],
            "eps": 1e-8,
            "weight_decay": 0.1
        }
    },
    "scheduler": {
        "type": "WarmupLR",
        "params": {
            "warmup_min_lr": 0,
            "warmup_max_lr": 0.000018,
            "warmup_num_steps": 35
        }
    },
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {
            "device": "cpu",
            "pin_memory": false
        },
        "allgather_partitions": true,
        "allgather_bucket_size": 200000000,
        "overlap_comm": true,
        "reduce_scatter": true,
        "reduce_bucket_size": 200000000,
        "contiguous_gradients": true
    },
    "gradient_accumulation_steps": 2,
    "train_batch_size": 32,
    "train_micro_batch_size_per_gpu": 16,
    "gradient_clipping": 1,
    "wall_clock_breakdown": false
}
```
|
|
|
Training ran on a single V100 (16 GB) GPU and took 28 min 50 s.
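
For illustration only, the hyperparameters and DeepSpeed configuration above roughly correspond to a `transformers` `Trainer` setup such as the sketch below. The base checkpoint (`Salesforce/codegen-350M-multi`), the config file name `ds_config.json`, and the `tokenized_train` variable are assumptions; dataset tokenization is omitted.

```Python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Assumed base checkpoint; this card only states that it is a CodeGen model.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-multi")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-multi")

training_args = TrainingArguments(
    output_dir="codegen-350M-multi-xlcost-v2",
    per_device_train_batch_size=16,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    max_steps=259,
    learning_rate=1.8e-5,
    weight_decay=0.1,
    warmup_steps=35,
    lr_scheduler_type="linear",
    fp16=True,
    deepspeed="ds_config.json",  # the DeepSpeed configuration shown above
)

# `tokenized_train` is a placeholder for the XLCost-single-prompt train split,
# tokenized and chunked to the 1024-token context size.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,
)
trainer.train()
```

With a micro-batch size of 16 and 2 gradient-accumulation steps on one GPU, the effective batch size matches the `train_batch_size` of 32 in the DeepSpeed config.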
|
|
|
## Performance |
|
|
|
We evaluated the model on the first 400 samples of the [XLCost-single-prompt test split](https://huggingface.co/datasets/giulio98/xlcost-single-prompt/viewer/Python/test), comparing the output of the generated code against the expected output using the pass@k metric.
|
|
|
| Metric   | codegen-350M-multi-xlcost-v2 | codegen-350M-multi-xlcost | codegen-350M-mono (zero-shot) | codegen-350M-mono (one-shot) | codegen-350M-mono (few-shot) |
|----------|------------------------------|---------------------------|-------------------------------|------------------------------|------------------------------|
| pass@1   | 3.325%                       | 3.70%                     | 0.4%                          | 0.35%                        | 0.48%                        |
| pass@10  | 15%                          | 14.5%                     | 3.5%                          | 3%                           | 3.75%                        |
| CodeBLEU | 20.18%                       | None                      | 15.15%                        | 19.42%                       | 20.27%                       |
|
|
|
The [pass@k metric](https://huggingface.co/metrics/code_eval) gives the probability that at least one out of k generations passes the tests.
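
For reference, pass@k can be computed with the `code_eval` metric from the `evaluate` library. The toy example below is a sketch and is not the exact harness used to produce the numbers above.

```Python
import os

import evaluate

# code_eval executes untrusted, model-generated code; it must be enabled explicitly.
os.environ["HF_ALLOW_CODE_EVAL"] = "1"

code_eval = evaluate.load("code_eval")

# Toy example: one problem, two candidate completions, one test case.
candidates = [["def add(a, b):\n    return a + b", "def add(a, b):\n    return a - b"]]
test_cases = ["assert add(2, 3) == 5"]

pass_at_k, results = code_eval.compute(
    references=test_cases,
    predictions=candidates,
    k=[1, 2],
)
print(pass_at_k)  # e.g. {'pass@1': 0.5, 'pass@2': 1.0}
```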
|
|
|
## Citations |
|
```
@article{Nijkamp2022ACP,
  title={A Conversational Paradigm for Program Synthesis},
  author={Nijkamp, Erik and Pang, Bo and Hayashi, Hiroaki and Tu, Lifu and Wang, Huan and Zhou, Yingbo and Savarese, Silvio and Xiong, Caiming},
  journal={arXiv preprint},
  year={2022}
}
```