OSainz suryanshs16103 committed on
Commit 047d292 • 1 Parent(s): 52c7b2a

GPT-3.5 HumanEval_R CodeForces2305 contamination based on https://arxiv.org/abs/2402.15938 (#28)

- GPT-3.5 HumanEval_R CodeForces2305 contamination based on https://arxiv.org/abs/2402.15938 (42e416f06292d2cdca4d5374feb3737679a43426)
- Add PR number + postprocessing (1e54760087228b8b8102b4ef47180ea90571f032)


Co-authored-by: Suryansh Sharma <[email protected]>

Files changed (1)
  1. contamination_report.csv +6 -0
contamination_report.csv CHANGED
@@ -6,6 +6,9 @@ Anagrams 1;;GPT-3;;model;;3.0;;data-based;https://arxiv.org/abs/2005.14165;13
 
 Anagrams 2;;GPT-3;;model;;7.0;;data-based;https://arxiv.org/abs/2005.14165;13
 
+CodeForces2305;;GPT-3.5-turbo;0613;model;;;0.0;model-based;https://arxiv.org/abs/2402.15938;28
+CodeForces2305;;GPT-3.5-turbo;1106;model;;;0.0;model-based;https://arxiv.org/abs/2402.15938;28
+
 Cycled Letters;;GPT-3;;model;;1.0;;data-based;https://arxiv.org/abs/2005.14165;13
 
 EdinburghNLP/xsum;;GPT-3.5;;model;0.0;;100.0;model-based;https://arxiv.org/abs/2308.08493;3
@@ -17,6 +20,9 @@ EdinburghNLP/xsum;;allenai/c4;;corpus;;;15.49;data-based;https://arxiv.org/abs/2
 
 EleutherAI/hendrycks_math;;GPT-4;;model;100.0;;;data-based;https://arxiv.org/abs/2303.08774;11
 
+HumanEval_R;;GPT-3.5-turbo;0613;model;;;9.76;model-based;https://arxiv.org/abs/2402.15938;28
+HumanEval_R;;GPT-3.5-turbo;1106;model;;;10.97;model-based;https://arxiv.org/abs/2402.15938;28
+
 RadNLI;;GPT-3.5;;model;0.0;0.0;0.0;model-based;https://arxiv.org/abs/2308.08493;8
 RadNLI;;GPT-4;;model;0.0;0.0;0.0;model-based;https://arxiv.org/abs/2308.08493;8
 
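
The rows added in this commit are semicolon-delimited with eleven fields each. As a minimal sketch of how they could be read, the Python snippet below parses the four added rows with the standard `csv` module. The diff does not show the CSV header, so the names in `ASSUMED_COLUMNS` are hypothetical labels inferred from the row layout, in particular the reading of the three numeric fields as train, dev, and test percentages. The helper `parse_rows` is likewise illustrative and not part of the repository.

```python
import csv
import io

# Hypothetical column labels: the header row is not visible in the diff,
# so these names are assumptions used only for illustration.
ASSUMED_COLUMNS = [
    "dataset", "dataset_subset", "source", "source_subset", "source_type",
    "train", "dev", "test", "approach", "reference", "pr",
]

# The four non-blank rows added by this commit, copied verbatim from the diff.
ADDED_ROWS = """\
CodeForces2305;;GPT-3.5-turbo;0613;model;;;0.0;model-based;https://arxiv.org/abs/2402.15938;28
CodeForces2305;;GPT-3.5-turbo;1106;model;;;0.0;model-based;https://arxiv.org/abs/2402.15938;28

HumanEval_R;;GPT-3.5-turbo;0613;model;;;9.76;model-based;https://arxiv.org/abs/2402.15938;28
HumanEval_R;;GPT-3.5-turbo;1106;model;;;10.97;model-based;https://arxiv.org/abs/2402.15938;28
"""


def parse_rows(text):
    """Parse semicolon-delimited rows into dicts, skipping blank separator lines."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    return [dict(zip(ASSUMED_COLUMNS, row)) for row in reader if row]


rows = parse_rows(ADDED_ROWS)
for row in rows:
    # Empty strings mean the value was not reported for that split.
    print(row["dataset"], row["source"], row["source_subset"], row["test"])
```

Blank lines in the file separate groups of rows by dataset, which is why the parser drops empty rows instead of treating them as records.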