pminervini committed
Commit 7148b21
1 Parent(s): d176dfa
update
- app.py +1 -1
- src/display/about.py +1 -3
app.py
CHANGED
@@ -242,7 +242,7 @@ with demo:
                 leaderboard_table,
                 queue=True)
 
-        with gr.TabItem("
+        with gr.TabItem("About", elem_id="llm-benchmark-tab-table", id=2):
             gr.Markdown(LLM_BENCHMARKS_TEXT, elem_classes="markdown-text")
             print(f'dataset df columns: {list(dataset_df.columns)}')
             dataset_table = gr.components.Dataframe(
src/display/about.py
CHANGED
@@ -6,13 +6,11 @@ INTRODUCTION_TEXT = """
 The Hallucinations Leaderboard aims to track, rank and evaluate hallucinations in LLMs.
 
 Submit a model for automated evaluation on the [Edinburgh International Data Facility](https://www.epcc.ed.ac.uk/hpc-services/edinburgh-international-data-facility) (EIDF) GPU cluster on the "Submit" page.
-
 The backend of the Hallucinations leaderboard is based on the [Eleuther AI Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) --- more details in the "About" page.
-
 Metrics and datasets used by the Hallucinations Leaderboard were identified while writing our [awesome-hallucinations-detection](https://github.com/EdinburghNLP/awesome-hallucination-detection) page (you are encouraged to contribute to this list via pull requests).
 If you have comments or suggestions on datasets and metrics, please [reach out to us in our discussion forum](https://huggingface.co/spaces/hallucinations-leaderboard/leaderboard/discussions).
 
-For more information
+For more information about the leaderboard, check our [HuggingFace Blog article](https://huggingface.co/blog/leaderboards-on-the-hub-hallucinations).
 """
 
 LLM_BENCHMARKS_TEXT = f"""
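Note on the about.py hunk: besides completing the last sentence, the commit drops two blank lines inside INTRODUCTION_TEXT. If that string is rendered as Markdown (an assumption; the rendering call is not part of this diff), consecutive non-blank lines are joined into one paragraph, so the "Submit", "backend" and "Metrics" sentences would likely now render as a single paragraph. A minimal sketch of that effect, using shortened stand-in strings and a hypothetical helper `markdown_paragraphs` that is not part of the repository:

```python
import re

# Shortened stand-ins for the INTRODUCTION_TEXT body before and after the
# commit (hypothetical abbreviations, not the real sentences).
BEFORE = """
Leaderboard aims.

Submit a model.

The backend.

Metrics and datasets.
Comments or suggestions.

For more information
"""

AFTER = """
Leaderboard aims.

Submit a model.
The backend.
Metrics and datasets.
Comments or suggestions.

For more information about the leaderboard.
"""


def markdown_paragraphs(text: str) -> list[str]:
    """Approximate Markdown paragraph splitting: blank lines separate paragraphs."""
    blocks = re.split(r"\n\s*\n", text.strip())
    # Inside a paragraph, a single newline behaves like a space.
    return [" ".join(block.split()) for block in blocks]


before = markdown_paragraphs(BEFORE)
after = markdown_paragraphs(AFTER)
print(len(before), len(after))
```

This only models the blank-line rule; a real Markdown renderer also handles lists, headings and hard line breaks, which the sketch ignores.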