grg committed on
Commit
f0948c9
1 Parent(s): d7c0630

Dataset link opens in new tab
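
The change adds `target="_blank"` to each StickToYourRole dataset link so that it opens in a new tab. A minimal sketch of the resulting anchor follows. The second variant, with an explicit `rel` attribute, is an illustrative assumption and is not part of this commit. Current browsers already treat `target="_blank"` as implying `noopener`, so it would mainly matter for older ones.

```html
<!-- Anchor as changed in this commit: the dataset link opens in a new tab. -->
<a target="_blank" href="https://huggingface.co/datasets/flowers-team/StickToYourRole">&#129303; StickToYourRole dataset</a>

<!-- Hypothetical variant (NOT in the commit): explicit rel for older browsers,
     so the new tab gets no window.opener reference and no Referer header. -->
<a target="_blank" rel="noopener noreferrer" href="https://huggingface.co/datasets/flowers-team/StickToYourRole">&#129303; StickToYourRole dataset</a>
```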

Files changed (2)
  1. templates/about.html +2 -2
  2. templates/index.html +1 -1
templates/about.html CHANGED
@@ -216,7 +216,7 @@
  We adopt the <b>Schwartz Theory of Basic Personal Values</b>, which defines 10 values: Self-Direction, Stimulation, Hedonism, Achievement, Power, Security, Conformity, Tradition, Benevolence, and Universalism.
  To evaluate their expression we use the associated questionnaires: <b>PVQ-40</b>, and <b>SVS</b>.
  </p>
- <p>You can browse the questionnaires, population, and contexts used on our <a href="https://huggingface.co/datasets/flowers-team/StickToYourRole">&#129303; StickToYourRole dataset</a>. </p>
+ <p>You can browse the questionnaires, population, and contexts used on our <a target="_blank" href="https://huggingface.co/datasets/flowers-team/StickToYourRole">&#129303; StickToYourRole dataset</a>. </p>
  <p>
  The Stick to Your Role! leaderboard aims to provide an up-to-date comparison of recent LLMs based on their ability to coherently simulate popultions.
  It, in tandem with other minimal-context benchmarks, should enable you to choose the best-suited model for your usecase!
@@ -258,7 +258,7 @@
  <li> <b> chess </b>: "1. e4" is given as the initial message to all personas, but for each persona the Interlocutor model is instructed to simulate a different persona (instead of a human user) </li>
  <li> <b> grammar </b>: like chess, but "Can you check this sentence for grammar? \n Whilst Jane was waiting to meet hers friend their nose started bleeding." is given as the initial message.
  </ul>
- <p>You can browse the simulated population, questionnaires, and contexts used on our <a href="https://huggingface.co/datasets/flowers-team/StickToYourRole">&#129303; StickToYourRole dataset</a>.</p>
+ <p>You can browse the simulated population, questionnaires, and contexts used on our <a target="_blank" href="https://huggingface.co/datasets/flowers-team/StickToYourRole">&#129303; StickToYourRole dataset</a>.</p>
  </div>
  <div class="section" id="validation">
  <div class="section-title">Validation</div>
templates/index.html CHANGED
@@ -252,7 +252,7 @@
  The Stick to You Role! leaderboard focuses on the <b>stability of simulated personal values during role-playing</b>.
  We study the <b>coherence of a simulated population</b>.
  In contrast to evaluating each simulated persona separately, we evaluate personas relative to each other, i.e. as a population.
- You can browse the simulated population, questionnaires, and contexts used on our <a href="https://huggingface.co/datasets/flowers-team/StickToYourRole">&#129303; StickToYourRole dataset</a>.
+ You can browse the simulated population, questionnaires, and contexts used on our <a target="_blank" href="https://huggingface.co/datasets/flowers-team/StickToYourRole">&#129303; StickToYourRole dataset</a>.
  </p>
  <div class="table-responsive main-table">
  <!-- Render the table HTML here -->