davanstrien
posted an update Aug 19
🚀 Introducing Hugging Face Similar: a Chrome extension to find relevant datasets!

✨ Adds a "Similar Datasets" section to Hugging Face dataset pages
🔍 Recommendations based on dataset READMEs
🏗️ Powered by https://huggingface.co/chromadb and https://huggingface.co/Snowflake embeddings.

You can try it here: https://chromewebstore.google.com/detail/hugging-face-similar/aijelnjllajooinkcpkpbhckbghghpnl?authuser=0&hl=en.

I am very happy to get feedback on whether this could be useful or not 🤗

Very useful extension! Thanks

image.png

96% similarity to Wikipedia. I love the idea, but the similarity estimation is far off.

original dataset
https://huggingface.co/datasets/kalomaze/Opus_Instruct_25k

At the moment, this relies on the dataset cards, so the similarity does indeed work better for longer dataset cards. I'm planning a version that will use the dataset contents directly to compute the similarity scores, which should hopefully work better!

thanks!