posted an update Mar 15
KTO offers an easier way to preference train LLMs (only 👍👎 ratings are required). As part of #DataIsBetterTogether, I've written a tutorial on creating a preference dataset using Argilla and Spaces.

Using this approach, you can create a dataset that anyone with a Hugging Face account can contribute to 🤯

See an example of the kind of Space you can create following this tutorial here: davanstrien/haiku-preferences

🆕 New tutorial covers:
💬 Generating responses with open models
👥 Collecting human feedback (do you like this model response? Yes/No)
🤖 Preparing a TRL-compatible dataset for training aligned models
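
To give a rough idea of what that last step can look like, here is a minimal sketch (not the tutorial's actual code) that turns Yes/No annotations into the unpaired `prompt` / `completion` / `label` layout that, as far as I know, TRL's KTO trainer expects. The input field names (`prompt`, `response`, `rating`) and the helper `to_kto_records` are hypothetical and only for illustration; the real export from Argilla will likely look different.

```python
from typing import Any, Dict, List


def to_kto_records(annotations: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert Yes/No feedback rows into KTO-style records.

    Assumes each row has hypothetical keys "prompt", "response" and
    "rating" ("yes" / "no", or anything else if not yet annotated).
    """
    records = []
    for row in annotations:
        rating = row.get("rating")
        if rating not in ("yes", "no"):
            # Skip rows that have no usable thumbs up / thumbs down rating yet.
            continue
        records.append(
            {
                "prompt": row["prompt"],
                "completion": row["response"],
                # KTO only needs a boolean: was this response liked or not?
                "label": rating == "yes",
            }
        )
    return records


# Made-up example rows, purely for illustration.
annotations = [
    {"prompt": "Write a haiku about rain", "response": "Soft rain on the roof", "rating": "yes"},
    {"prompt": "Write a haiku about rain", "response": "Rain is wet. The end.", "rating": "no"},
    {"prompt": "Write a haiku about snow", "response": "Snow falls quietly", "rating": None},
]

records = to_kto_records(annotations)
```

If you have the `datasets` library installed, something like `Dataset.from_list(records)` should then give you a dataset you can push to the Hub and pass to a trainer.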

Check it out here:

I see

The notebooks and code currently only show how to generate the synthetic data and create a preference dataset annotation Space. The next steps would be to collect human feedback on the synthetic data and then use this to train a model. We will cover this in a future notebook.

Is there a future notebook with this content already?


Hopefully I'll have something to share for this soon! I still need to do some more annotating!