chessdevilai

This model is a fine-tuned version of EleutherAI/pythia-70m-deduped on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.7609
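
For a rough sense of scale, the loss can be converted to a perplexity. The sketch below assumes the reported value is the mean per-token cross-entropy in nats, which is the usual convention for causal language model fine-tuning with Transformers; the card does not state this explicitly.

```python
import math

# Evaluation loss reported in this card.
eval_loss = 0.7609

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")
```

If the loss was computed differently (for example, averaged per sequence), this conversion would not apply.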

Model description

More information needed

Intended uses & limitations

More information needed
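
Until usage is documented, the checkpoint can presumably be loaded like any other causal language model on the Hub. The following is a minimal sketch, not a documented usage pattern: it assumes the repository `Vasanth/chessdevilai` ships tokenizer files alongside the weights (otherwise the tokenizer of the base model `EleutherAI/pythia-70m-deduped` would be the natural fallback), and the helper name `generate` is hypothetical. The expected prompt format is not described in this card, so any prompt passed in is a guess.

```python
def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Greedy-decode a continuation of `prompt` with the fine-tuned checkpoint."""
    # Imported lazily so that defining this helper does not require
    # transformers or torch to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = "Vasanth/chessdevilai"  # repository ID of this model card
    # Assumption: the repository includes tokenizer files.
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding for reproducible output
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` downloads the weights on first use. Because the training data is undocumented, outputs should be checked manually before relying on them.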

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
num_epochs: 1
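
To make the `cosine` scheduler setting concrete, here is a small sketch of the learning rate it would produce over the run. It assumes no warmup (none is listed above) and a total of roughly 6188 optimizer steps, a figure estimated from the step and epoch columns of the training results rather than stated in the card. The function name `cosine_lr` is illustrative.

```python
import math

BASE_LR = 5e-05       # learning_rate from the list above
TOTAL_STEPS = 6188    # assumption: estimated from the training log, not stated
WARMUP_STEPS = 0      # assumption: no warmup is listed in the card


def cosine_lr(step: int) -> float:
    """Learning rate at `step` for linear warmup followed by half-cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half a cosine period: 1.0 at the start, 0.0 at the end of training.
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for step in (0, 1550, 3100, 4650, 6138):
    print(step, f"{cosine_lr(step):.2e}")
```

Under these assumptions the learning rate is close to zero in the final few hundred steps, which would be consistent with the validation loss flattening out near the end of training.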

Training results

Training Loss	Epoch	Step	Validation Loss
1.1912	0.0100	62	1.2654
1.1714	0.0200	124	1.1780
1.0771	0.0301	186	1.1419
1.0829	0.0401	248	1.1046
1.0113	0.0501	310	1.0850
1.152	0.0601	372	1.0701
1.0895	0.0701	434	1.0544
0.9123	0.0802	496	1.0484
1.0489	0.0902	558	1.0214
1.0312	0.1002	620	1.0252
0.9756	0.1102	682	1.0020
1.0125	0.1202	744	0.9940
1.0581	0.1303	806	0.9862
1.0726	0.1403	868	0.9809
0.9963	0.1503	930	0.9830
0.9309	0.1603	992	0.9653
0.8858	0.1703	1054	0.9538
1.1137	0.1803	1116	0.9472
0.9024	0.1904	1178	0.9411
0.9812	0.2004	1240	0.9396
0.9916	0.2104	1302	0.9254
0.9509	0.2204	1364	0.9334
0.8848	0.2304	1426	0.9439
0.8302	0.2405	1488	0.9175
1.0111	0.2505	1550	0.9158
1.0273	0.2605	1612	0.9182
0.8968	0.2705	1674	0.9116
0.8892	0.2805	1736	0.9098
0.7539	0.2906	1798	0.8896
0.811	0.3006	1860	0.8968
0.928	0.3106	1922	0.8875
0.8163	0.3206	1984	0.8821
0.9202	0.3306	2046	0.8820
1.0208	0.3407	2108	0.8811
0.8297	0.3507	2170	0.8823
0.8213	0.3607	2232	0.8736
0.8324	0.3707	2294	0.8698
0.7721	0.3807	2356	0.8735
0.9504	0.3908	2418	0.8705
0.858	0.4008	2480	0.8620
0.8791	0.4108	2542	0.8540
0.8411	0.4208	2604	0.8606
0.8845	0.4308	2666	0.8496
0.7752	0.4409	2728	0.8462
0.8598	0.4509	2790	0.8481
0.7935	0.4609	2852	0.8412
0.7352	0.4709	2914	0.8392
0.8153	0.4809	2976	0.8426
0.7371	0.4910	3038	0.8332
0.7136	0.5010	3100	0.8300
0.9777	0.5110	3162	0.8294
0.8336	0.5210	3224	0.8306
0.7546	0.5310	3286	0.8234
0.8436	0.5410	3348	0.8237
0.9316	0.5511	3410	0.8224
0.6996	0.5611	3472	0.8191
0.7417	0.5711	3534	0.8146
0.8528	0.5811	3596	0.8110
0.6861	0.5911	3658	0.8095
0.8401	0.6012	3720	0.8096
0.7056	0.6112	3782	0.8080
0.8643	0.6212	3844	0.8004
0.7575	0.6312	3906	0.8018
0.8133	0.6412	3968	0.8008
0.8221	0.6513	4030	0.7940
0.8004	0.6613	4092	0.7948
0.7002	0.6713	4154	0.7984
0.8425	0.6813	4216	0.7892
0.6777	0.6913	4278	0.7876
0.9178	0.7014	4340	0.7865
0.787	0.7114	4402	0.7844
0.6979	0.7214	4464	0.7829
0.7954	0.7314	4526	0.7825
0.7937	0.7414	4588	0.7792
0.7849	0.7515	4650	0.7790
0.7108	0.7615	4712	0.7782
0.831	0.7715	4774	0.7768
0.8242	0.7815	4836	0.7741
0.7472	0.7915	4898	0.7731
0.8171	0.8016	4960	0.7732
0.7857	0.8116	5022	0.7702
0.7925	0.8216	5084	0.7707
0.7134	0.8316	5146	0.7680
0.8401	0.8416	5208	0.7686
0.6919	0.8516	5270	0.7679
0.7689	0.8617	5332	0.7658
0.7899	0.8717	5394	0.7645
0.8457	0.8817	5456	0.7639
0.7738	0.8917	5518	0.7635
0.7943	0.9017	5580	0.7628
0.756	0.9118	5642	0.7625
0.8021	0.9218	5704	0.7619
0.7325	0.9318	5766	0.7615
0.7312	0.9418	5828	0.7613
0.8255	0.9518	5890	0.7613
0.794	0.9619	5952	0.7610
0.7392	0.9719	6014	0.7609
0.841	0.9819	6076	0.7609
0.7018	0.9919	6138	0.7609
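
The step and epoch columns allow a rough estimate of the run size, which the card otherwise leaves undocumented. The sketch below uses only the first and last rows of the table; the training-set size it derives assumes a single device and no gradient accumulation, neither of which is confirmed here.

```python
# (epoch, step, validation loss) from the first and last rows of the table.
first_row = (0.0100, 62, 1.2654)
last_row = (0.9919, 6138, 0.7609)

TRAIN_BATCH_SIZE = 8  # train_batch_size from the hyperparameters

# Steps per epoch, estimated from the last row (epoch values are rounded
# to four decimals, so this is approximate).
steps_per_epoch = last_row[1] / last_row[0]

# Assumption: single device, no gradient accumulation, so one optimizer
# step consumes exactly TRAIN_BATCH_SIZE examples.
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Evaluation ran every 62 steps; overall drop in validation loss.
val_loss_drop = first_row[2] - last_row[2]

print(f"steps per epoch ~ {steps_per_epoch:.0f}")
print(f"training examples ~ {approx_train_examples:.0f}")
print(f"validation loss drop = {val_loss_drop:.4f}")
```

The estimate comes out at roughly 6,188 steps per epoch, or on the order of 49,500 training examples under the stated assumptions.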

Framework versions

Transformers 4.41.2
Pytorch 2.3.0+cu121
Datasets 2.19.2
Tokenizers 0.19.1

Model repository: Vasanth/chessdevilai
