fergos80/mistral-sigoIAspirantes-Orca-oass-500-gpu

Tags: Generated from Trainer, 4-bit precision


	---
	license: apache-2.0
	library_name: peft
	tags:
	- generated_from_trainer
	base_model: NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2
	model-index:
	- name: mistral-sigoIAspirantes-Orca-oass-500-gpu
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# mistral-sigoIAspirantes-Orca-oass-500-gpu

	This model is a PEFT adapter fine-tuned from [NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2](https://huggingface.co/NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2) on an unknown dataset.
	It achieves the following results on the evaluation set after the final training step (step 500):
	- Loss: 0.1371
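
Since the card lists `library_name: peft` and a base model, the adapter can presumably be loaded on top of that base model. The following is a minimal sketch, not the author's documented usage: the adapter repository ID is taken from the page title, and loading the base model in 4-bit through bitsandbytes is an assumption based on the "4-bit precision" tag. The function name `load_model` is hypothetical.

```python
# Sketch only: repository IDs come from the model card; the 4-bit
# bitsandbytes setup is an assumption, not confirmed by the card.
BASE_MODEL_ID = "NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2"
ADAPTER_ID = "fergos80/mistral-sigoIAspirantes-Orca-oass-500-gpu"


def load_model(load_in_4bit: bool = True):
    """Load the base model and attach the PEFT adapter (hypothetical helper)."""
    # Heavy imports are kept inside the function so the module itself
    # can be imported without a GPU or these libraries installed.
    import torch
    from peft import PeftModel
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    quant_config = None
    if load_in_4bit:
        # Assumed quantization settings; adjust to match the actual training setup.
        quant_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.float16,
        )

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID,
        quantization_config=quant_config,
        device_map="auto",
    )
    # Attach the fine-tuned adapter weights to the base model.
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, model = load_model()` downloads the 7B base model, so it needs a GPU with enough memory; pass `load_in_4bit=False` if bitsandbytes is not available.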

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2.5e-05
	- train_batch_size: 2
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 1
	- training_steps: 500
	- mixed_precision_training: Native AMP
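
As a rough illustration, the hyperparameters above can be written as `transformers.TrainingArguments` keyword arguments. This is a sketch under stated assumptions rather than the original training script: `fp16=True` is a guess at what "Native AMP" meant here (it could equally have been `bf16`), the batch sizes are assumed to be per device, and `build_training_arguments` and `linear_lr` are hypothetical helper names. `linear_lr` mirrors the linear warmup and decay rule to show how the learning rate evolves over the 500 steps.

```python
# Hyperparameters from the card, keyed by TrainingArguments argument names.
HYPERPARAMS = {
    "learning_rate": 2.5e-5,
    "per_device_train_batch_size": 2,  # assumption: card value is per device
    "per_device_eval_batch_size": 8,   # assumption: card value is per device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 1,
    "max_steps": 500,
    "fp16": True,  # assumption: "Native AMP" could also mean bf16
}


def build_training_arguments(output_dir: str):
    """Create TrainingArguments from the card values (hypothetical helper)."""
    from transformers import TrainingArguments  # imported lazily

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)


def linear_lr(step: int,
              base_lr: float = HYPERPARAMS["learning_rate"],
              warmup_steps: int = HYPERPARAMS["warmup_steps"],
              total_steps: int = HYPERPARAMS["max_steps"]) -> float:
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


if __name__ == "__main__":
    for s in (0, 1, 250, 500):
        print(s, linear_lr(s))
```

With a single warmup step, the rate reaches its peak of 2.5e-05 at step 1 and then decays linearly to zero at step 500.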

	### Training results

	| Training Loss | Epoch  | Step | Validation Loss |
	|:-------------:|:------:|:----:|:---------------:|
	| 1.6909        | 0.3521 | 25   | 1.2135          |
	| 1.0260        | 0.7042 | 50   | 0.8374          |
	| 0.7146        | 1.0563 | 75   | 0.6276          |
	| 0.5094        | 1.4085 | 100  | 0.4925          |
	| 0.3916        | 1.7606 | 125  | 0.3939          |
	| 0.3408        | 2.1127 | 150  | 0.3084          |
	| 0.1724        | 2.4648 | 175  | 0.2717          |
	| 0.2586        | 2.8169 | 200  | 0.2026          |
	| 0.1434        | 3.1690 | 225  | 0.1940          |
	| 0.1253        | 3.5211 | 250  | 0.1579          |
	| 0.1197        | 3.8732 | 275  | 0.1526          |
	| 0.0792        | 4.2254 | 300  | 0.1582          |
	| 0.0937        | 4.5775 | 325  | 0.1579          |
	| 0.0898        | 4.9296 | 350  | 0.1381          |
	| 0.0717        | 5.2817 | 375  | 0.1386          |
	| 0.0665        | 5.6338 | 400  | 0.1350          |
	| 0.0773        | 5.9859 | 425  | 0.1325          |
	| 0.0602        | 6.3380 | 450  | 0.1394          |
	| 0.0550        | 6.6901 | 475  | 0.1377          |
	| 0.0602        | 7.0423 | 500  | 0.1371          |
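
The table can be read programmatically to check two things: which evaluation step had the lowest validation loss, and how many optimizer steps make up one epoch. The sketch below uses only the numbers in the table; the variable and function names are illustrative. Converting steps per epoch into a number of training examples would additionally assume a single device and no gradient accumulation, neither of which the card states.

```python
# (training_loss, epoch, step, validation_loss), copied from the table above.
RESULTS = [
    (1.6909, 0.3521, 25, 1.2135),
    (1.0260, 0.7042, 50, 0.8374),
    (0.7146, 1.0563, 75, 0.6276),
    (0.5094, 1.4085, 100, 0.4925),
    (0.3916, 1.7606, 125, 0.3939),
    (0.3408, 2.1127, 150, 0.3084),
    (0.1724, 2.4648, 175, 0.2717),
    (0.2586, 2.8169, 200, 0.2026),
    (0.1434, 3.1690, 225, 0.1940),
    (0.1253, 3.5211, 250, 0.1579),
    (0.1197, 3.8732, 275, 0.1526),
    (0.0792, 4.2254, 300, 0.1582),
    (0.0937, 4.5775, 325, 0.1579),
    (0.0898, 4.9296, 350, 0.1381),
    (0.0717, 5.2817, 375, 0.1386),
    (0.0665, 5.6338, 400, 0.1350),
    (0.0773, 5.9859, 425, 0.1325),
    (0.0602, 6.3380, 450, 0.1394),
    (0.0550, 6.6901, 475, 0.1377),
    (0.0602, 7.0423, 500, 0.1371),
]


def best_checkpoint(rows):
    """Return (step, validation_loss) for the row with the lowest validation loss."""
    _, _, step, val_loss = min(rows, key=lambda r: r[3])
    return step, val_loss


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch as the mean of step / epoch."""
    ratios = [step / epoch for _, epoch, step, _ in rows]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    step, loss = best_checkpoint(RESULTS)
    print(f"lowest validation loss {loss} at step {step}")
    print(f"about {round(steps_per_epoch(RESULTS))} steps per epoch")
```

The lowest validation loss in the table (0.1325) occurs at step 425, slightly below the final value of 0.1371 at step 500, and the epoch column corresponds to about 71 steps per epoch, so the 500 steps cover roughly seven epochs.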


	### Framework versions

	- PEFT 0.10.1.dev0
	- Transformers 4.41.0.dev0
	- PyTorch 2.3.0+cu121
	- Datasets 2.19.0
	- Tokenizers 0.19.1