Commit
β’
047d292
1
Parent(s):
52c7b2a
GPT-3.5 HumanEval_R CodeForces2305 contamination based on https://arxiv.org/abs/2402.15938 (#28)
Browse files- GPT-3.5 HumanEval_R CodeForces2305 contamination based on https://arxiv.org/abs/2402.15938 (42e416f06292d2cdca4d5374feb3737679a43426)
- Add PR number + postprocessing (1e54760087228b8b8102b4ef47180ea90571f032)
Co-authored-by: Suryansh Sharma <[email protected]>
- contamination_report.csv +6 -0
contamination_report.csv
CHANGED
@@ -6,6 +6,9 @@ Anagrams 1;;GPT-3;;model;;3.0;;data-based;https://arxiv.org/abs/2005.14165;13
|
|
6 |
|
7 |
Anagrams 2;;GPT-3;;model;;7.0;;data-based;https://arxiv.org/abs/2005.14165;13
|
8 |
|
|
|
|
|
|
|
9 |
Cycled Letters;;GPT-3;;model;;1.0;;data-based;https://arxiv.org/abs/2005.14165;13
|
10 |
|
11 |
EdinburghNLP/xsum;;GPT-3.5;;model;0.0;;100.0;model-based;https://arxiv.org/abs/2308.08493;3
|
@@ -17,6 +20,9 @@ EdinburghNLP/xsum;;allenai/c4;;corpus;;;15.49;data-based;https://arxiv.org/abs/2
|
|
17 |
|
18 |
EleutherAI/hendrycks_math;;GPT-4;;model;100.0;;;data-based;https://arxiv.org/abs/2303.08774;11
|
19 |
|
|
|
|
|
|
|
20 |
RadNLI;;GPT-3.5;;model;0.0;0.0;0.0;model-based;https://arxiv.org/abs/2308.08493;8
|
21 |
RadNLI;;GPT-4;;model;0.0;0.0;0.0;model-based;https://arxiv.org/abs/2308.08493;8
|
22 |
|
|
|
6 |
|
7 |
Anagrams 2;;GPT-3;;model;;7.0;;data-based;https://arxiv.org/abs/2005.14165;13
|
8 |
|
9 |
+
CodeForces2305;;GPT-3.5-turbo;0613;model;;;0.0;model-based;https://arxiv.org/abs/2402.15938;28
|
10 |
+
CodeForces2305;;GPT-3.5-turbo;1106;model;;;0.0;model-based;https://arxiv.org/abs/2402.15938;28
|
11 |
+
|
12 |
Cycled Letters;;GPT-3;;model;;1.0;;data-based;https://arxiv.org/abs/2005.14165;13
|
13 |
|
14 |
EdinburghNLP/xsum;;GPT-3.5;;model;0.0;;100.0;model-based;https://arxiv.org/abs/2308.08493;3
|
|
|
20 |
|
21 |
EleutherAI/hendrycks_math;;GPT-4;;model;100.0;;;data-based;https://arxiv.org/abs/2303.08774;11
|
22 |
|
23 |
+
HumanEval_R;;GPT-3.5-turbo;0613;model;;;9.76;model-based;https://arxiv.org/abs/2402.15938;28
|
24 |
+
HumanEval_R;;GPT-3.5-turbo;1106;model;;;10.97;model-based;https://arxiv.org/abs/2402.15938;28
|
25 |
+
|
26 |
RadNLI;;GPT-3.5;;model;0.0;0.0;0.0;model-based;https://arxiv.org/abs/2308.08493;8
|
27 |
RadNLI;;GPT-4;;model;0.0;0.0;0.0;model-based;https://arxiv.org/abs/2308.08493;8
|
28 |
|