Update app.py
app.py
CHANGED
@@ -137,6 +137,8 @@ To raise awareness of this issue, we show in this demo how much [StarCoder](http
 We found that **StarCoder memorized at least 8% of the training samples** we used, which highlights the high risks of LLMs exposing the training set. We provide a notebook to reproduce our results [here](https://colab.research.google.com/drive/1YaaPOXzodEAc4JXboa12gN5zdlzy5XaR?usp=sharing).
 
 To evaluate memorization of the training set, we can prompt StarCoder with the first tokens of an example from the training set. If StarCoder completes the prompt with an output that looks very similar to the original sample, we will consider this sample to be memorized by the LLM.
+
+⚠️Non responsiveness: We use Hugging Face Pro Inference solution to query StarCoder, which might be not available. If the demo does not work, please try later.
 """
 
 memorization_definition = """
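
The memorization check described in the text above (prompt with the first tokens of a training sample, then compare the completion with the original) could be sketched roughly as follows. This is only an illustration, not the code used in the demo or in the linked notebook. The `generate` callable, the whitespace tokenization, the `prefix_tokens` length, and the `threshold` value are all assumptions, and `difflib.SequenceMatcher` stands in for whatever similarity measure the authors actually use.

```python
from difflib import SequenceMatcher
from typing import Callable


def is_memorized(
    sample: str,
    generate: Callable[[str], str],  # hypothetical: maps a prompt to the model's completion
    prefix_tokens: int = 50,         # assumed prompt length, not taken from the demo
    threshold: float = 0.75,         # assumed similarity cutoff, not taken from the demo
) -> bool:
    """Return True if the completion closely matches the sample's true continuation."""
    # Whitespace splitting is a stand-in for the model's real tokenizer.
    tokens = sample.split()
    if len(tokens) <= prefix_tokens:
        # Nothing left to compare against once the prompt is taken.
        return False

    prompt = " ".join(tokens[:prefix_tokens])
    reference = tokens[prefix_tokens:]

    # Truncate the completion to the reference length so that extra
    # generated text does not lower the score.
    completion = generate(prompt).split()[: len(reference)]

    # Token-level similarity in [0, 1]; 1.0 means identical sequences.
    similarity = SequenceMatcher(None, completion, reference, autojunk=False).ratio()
    return similarity >= threshold
```

In this sketch, `generate` would wrap whatever inference endpoint is available; passing it in as a parameter keeps the comparison logic testable without network access.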