collapse_gemma-2-2b_hs2_replace_iter8_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. Its evaluation-set results (validation loss at each logged step) are listed in the table below.

Model description

More information needed

The following hyperparameters were used during training:

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.6108	0.0315	5	1.3097	239488
1.2048	0.0630	10	1.2514	488880
0.7736	0.0945	15	1.3428	739832
0.4942	0.1259	20	1.5487	988640
0.3684	0.1574	25	1.6597	1234208
0.2257	0.1889	30	1.8226	1477784
0.104	0.2204	35	2.0198	1730776
0.079	0.2519	40	2.1574	1971328
0.0504	0.2834	45	2.3647	2217856
0.0368	0.3148	50	2.4414	2465200
0.0362	0.3463	55	2.5177	2715224
0.0347	0.3778	60	2.5495	2963688
0.0318	0.4093	65	2.5692	3204352
0.0298	0.4408	70	2.5663	3455912
0.026	0.4723	75	2.5764	3694848
0.0277	0.5037	80	2.5583	3950488
0.0251	0.5352	85	2.5831	4197448
0.03	0.5667	90	2.6005	4438720
0.0247	0.5982	95	2.5882	4687496
0.024	0.6297	100	2.5853	4937840
0.0245	0.6612	105	2.6122	5185648
0.0259	0.6926	110	2.6367	5428648
0.0261	0.7241	115	2.6511	5673016
0.0276	0.7556	120	2.6375	5923456
0.0257	0.7871	125	2.6391	6177184
0.0255	0.8186	130	2.6434	6421672
0.025	0.8501	135	2.6282	6667984
0.0265	0.8815	140	2.6097	6917840
0.0258	0.9130	145	2.6087	7163648
0.0243	0.9445	150	2.6101	7416408
0.0237	0.9760	155	2.6211	7665640
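
In the table above, validation loss is lowest early in training and rises afterward while training loss keeps falling. The following is a minimal Python sketch for locating that point, assuming a hand-copied subset of the rows. The helper names `best_step` and `generalization_gap` are hypothetical, introduced here for illustration only, and are not part of the original training code or of any library.

```python
# Subset of rows copied from the training results table:
# (step, training_loss, validation_loss).
# The step-0 row has no training loss logged ("No log"), so it is None here.
rows = [
    (0, None, 1.3956),
    (5, 1.6108, 1.3097),
    (10, 1.2048, 1.2514),
    (15, 0.7736, 1.3428),
    (20, 0.4942, 1.5487),
    (50, 0.0368, 2.4414),
    (100, 0.0240, 2.5853),
    (155, 0.0237, 2.6211),
]


def best_step(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r[2])


def generalization_gap(row):
    """Validation loss minus training loss, or None if no training loss was logged."""
    _, train, val = row
    return None if train is None else val - train


best = best_step(rows)
final = rows[-1]
print(f"lowest validation loss {best[2]} at step {best[0]}")
print(f"gap at best step:  {generalization_gap(best):.4f}")
print(f"gap at final step: {generalization_gap(final):.4f}")
```

On these rows the minimum falls at step 10, which may be a more useful checkpoint than the final one if held-out loss is the selection criterion. Whether a checkpoint was saved at that step is not stated in this card.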