Update README.md
README.md
CHANGED
@@ -78,10 +78,10 @@ The training was executed on 1 x V100 (16GB) GPU for 6h 42m
 
 We evaluated the model on the first 400 samples of XLCOST's [XLCost-single-prompt test split](https://huggingface.co/datasets/giulio98/xlcost-single-prompt/viewer/Python/test) and comparing the outputs of the generated codes with respect to the expected output using pass@k metric.
 
-| Metric | codegen-350M-multi-xlcost |
-
-|pass@1 | 3.70% |
-|pass@10 | 14.5% |
+| Metric | codegen-350M-multi-xlcost | codegen-350M-mono(zero-shot) | codegen-350M-mono (one-shot) | codegen-350M-mono(few-shot)
+|--------|-----|-----|-----|-----|
+|pass@1 | 3.70% | 0.4% | 0.35% | 0.48% |
+|pass@10 | 14.5% | 3.5% | 3 % | 3.75% |
 
 The [pass@k metric](https://huggingface.co/metrics/code_eval) tells the probability that at least one out of k generations passes the tests.
 
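For readers of this change, the pass@k definition quoted in the README can be illustrated with a minimal sketch. It assumes the commonly used unbiased estimator, 1 - C(n-c, k) / C(n, k), where n samples are generated per problem and c of them pass. The function names `pass_at_k` and `mean_pass_at_k` are hypothetical and are not taken from the README or from the linked metric.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of generated samples (assumed parameter name)
    c: number of those samples that passed the tests
    k: number of attempts the metric allows
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k samples failed, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # Probability that a random size-k subset contains no passing sample.
    all_fail = comb(n - c, k) / comb(n, k)
    return 1.0 - all_fail


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) pairs per problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With one passing sample out of ten, `pass_at_k(10, 1, 1)` evaluates to 0.1 and `pass_at_k(10, 1, 10)` to 1.0, which is why pass@10 in the table is always at least as large as pass@1.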