|
# alpaca_orca_open_llama: An Open_LLaMA-3B model trained on Alpaca dataset using Orca Research paper approaches |
|
|
|
|
|
# Dataset and Training |
|
|
|
We trained the OpenLLaMA-3B model on a custom Alpaca dataset created using the approaches described in the Orca Research Paper.
|
|
|
Please pay attention to how the system prompt is added and used for each instruction.
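
To illustrate the idea, a single training record after Orca-style augmentation might look like the sketch below. The field names and the sample response are illustrative assumptions, not the published dataset schema; the point is that every Alpaca instruction is paired with a system prompt that elicits step-by-step reasoning.

```python
# Illustrative training record after Orca-style augmentation.
# Field names and the sample response are assumptions for illustration only.
record = {
    "system_prompt": "You are an AI assistant. ... While performing the task think step-by-step and justify your steps.",
    "instruction": "Use the given data to calculate the median.",
    "input": "[7, 3, 8, 2, 10]",
    "output": "First sort the data: [2, 3, 7, 8, 10]. The middle value is 7, so the median is 7.",
}
```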
|
|
|
The training configurations are provided in the table below. |
|
|
|
Training ran on 4 x A600 (50G) GPUs, took around 20 hours, and cost about $66.
|
|
|
We used DeepSpeed with ZeRO Stage 3 for multi-GPU parallel training; a sketch of a matching DeepSpeed config follows the table.
|
|
|
|Parameter|Value|
|
|:-------------:|:-------------:| |
|
|**Batch Size**|16| |
|
|**train_micro_batch_size_per_gpu**|2| |
|
|**gradient_accumulation_steps**|2| |
|
|**Learning rate**|2e-5| |
|
|**Epochs**|3| |
|
|**Max length**|1024| |
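
For reference, a minimal DeepSpeed ZeRO Stage 3 configuration consistent with the table above might look like the following. The exact config used for training is not published here, so treat this as a sketch; the optimizer and precision sections in particular are assumptions.

```python
# Illustrative DeepSpeed ZeRO Stage 3 config matching the table above.
# The actual training config may differ; optimizer and bf16 sections are assumptions.
ds_config = {
    "train_batch_size": 16,                 # 4 GPUs x 2 micro-batch x 2 grad-accum
    "train_micro_batch_size_per_gpu": 2,
    "gradient_accumulation_steps": 2,
    "zero_optimization": {
        "stage": 3,
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
    "optimizer": {
        "type": "AdamW",
        "params": {"lr": 2e-5},
    },
    "bf16": {"enabled": True},              # assumption: mixed-precision training
}
```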
|
|
|
|
|
|
|
# Example Usage |
|
|
|
The example below shows how to use alpaca_orca_open_llama_3b.
|
|
|
```python |
|
import torch |
|
from transformers import LlamaForCausalLM, LlamaTokenizer |
|
|
|
# model weights on the Hugging Face Hub
|
model_path = r'psmathur/alpaca_orca_open_llama_3b' |
|
tokenizer = LlamaTokenizer.from_pretrained(model_path) |
|
model = LlamaForCausalLM.from_pretrained(model_path).cuda() |
|
tokenizer.bos_token_id, tokenizer.eos_token_id = 1,2 # see https://github.com/openlm-research/open_llama#preview-weights-release-and-usage |
|
|
|
# system prompt as provided by the Orca Research Paper
|
system = r'You are an AI assistant. User will you give you a task. Your goal is to complete the task as faithfully as you can. While performing the task think step-by-step and justify your steps.' |
|
instruction = r'Use the given data to calculate the median.' |
|
input = r'[7, 3, 8, 2, 10]' |
|
|
|
|
|
# build the prompt: prepend the system message, then the instruction and input
# (the '### Input:' delimiter is assumed to follow the Alpaca template)
prompt = f'{system}\n\n### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:'

tokens = tokenizer.encode(prompt)
|
|
|
tokens = torch.LongTensor(tokens).unsqueeze(0) |
|
instance = {'input_ids': tokens, |
|
'top_k': 50, |
|
'top_p': 0.9, |
|
'generate_len': 128} |
|
|
|
length = len(tokens[0]) |
|
with torch.no_grad(): |
|
rest = model.generate( |
|
input_ids=tokens, |
|
max_length=length+instance['generate_len'], |
|
use_cache=True, |
|
do_sample=True, |
|
top_p=instance['top_p'], |
|
top_k=instance['top_k'] |
|
) |
|
|
|
output = rest[0][length:] |
|
string = tokenizer.decode(output, skip_special_tokens=True) |
|
print(f'[!] Generation results: {string}') |
|
``` |
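
To make the snippet reusable, the steps above can be wrapped into a small helper. This is a sketch rather than part of the original card: it reuses the `model` and `tokenizer` loaded above, and the prompt variant without an `Input` block is an assumption.

```python
def generate(system, instruction, input=None, generate_len=128, top_k=50, top_p=0.9):
    # Build the prompt with the system message; drop the Input block when absent.
    if input:
        prompt = f'{system}\n\n### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:'
    else:
        prompt = f'{system}\n\n### Instruction:\n{instruction}\n\n### Response:'
    tokens = torch.LongTensor(tokenizer.encode(prompt)).unsqueeze(0).to(model.device)
    with torch.no_grad():
        out = model.generate(
            input_ids=tokens,
            max_length=tokens.shape[1] + generate_len,
            use_cache=True,
            do_sample=True,
            top_p=top_p,
            top_k=top_k,
        )
    # Decode only the newly generated tokens.
    return tokenizer.decode(out[0][tokens.shape[1]:], skip_special_tokens=True)

print(generate(system, instruction, input))
```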