louisbrulenaudet 
posted an update Apr 9
LegalKit Retrieval, a binary search with scalar (int8) rescoring over French legal codes, is now available as a 🤗 Space.

This process is designed to be memory efficient and fast: the binary index is small enough to fit in memory, and the int8 index is loaded as a view, so it is read from disk on demand instead of being held fully in memory. Additionally, the binary index is much faster to search (up to 32x) than the float32 index, and the rescoring step is also extremely efficient.
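For readers who want to see the two stages side by side, here is a minimal pure-Python sketch of the idea, not the code running in the Space. It assumes sign-based binarization (1 bit per dimension instead of a 32-bit float, which is where the 32x size reduction comes from), a per-dimension linear mapping to int8, and rescoring with the float query against the int8 documents. The function names, the `rescore_multiplier` parameter, and the random toy corpus are all hypothetical and chosen for illustration.

```python
import random


def binarize(vec):
    """Pack the sign bits of a float vector into one int (1 bit per dimension)."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two packed binary vectors."""
    return bin(a ^ b).count("1")


def quantize_int8(vec, lo, hi):
    """Map each dimension linearly from [lo[i], hi[i]] to the int8 range [-128, 127]."""
    out = []
    for x, l, h in zip(vec, lo, hi):
        step = (h - l) / 255 or 1.0  # avoid division by zero on a constant dimension
        q = round((x - l) / step) - 128
        out.append(max(-128, min(127, q)))
    return out


def search(query, binary_index, int8_index, top_k=3, rescore_multiplier=4):
    """Stage 1: Hamming search on the binary index. Stage 2: int8 rescoring."""
    q_bits = binarize(query)
    # Stage 1: keep the top_k * rescore_multiplier closest documents by Hamming distance.
    candidates = sorted(
        range(len(binary_index)),
        key=lambda i: hamming(q_bits, binary_index[i]),
    )[: top_k * rescore_multiplier]
    # Stage 2: rescore only those candidates with float query x int8 document dot products.
    scored = [
        (sum(q * d for q, d in zip(query, int8_index[i])), i) for i in candidates
    ]
    scored.sort(reverse=True)
    return [i for _, i in scored[:top_k]]


# Toy corpus of random vectors (an assumption; real embeddings come from the model).
rng = random.Random(0)
docs = [[rng.gauss(0, 1) for _ in range(64)] for _ in range(200)]
lo = [min(col) for col in zip(*docs)]  # per-dimension calibration ranges
hi = [max(col) for col in zip(*docs)]
binary_index = [binarize(d) for d in docs]
int8_index = [quantize_int8(d, lo, hi) for d in docs]

# The query is a copy of document 17, so that document should rank first.
hits = search(docs[17], binary_index, int8_index, top_k=3)
print(hits)
```

Only the small binary index is scanned in full; the int8 vectors are touched for a handful of candidates, which is why they can stay on disk and be read on demand.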

This Space also showcases tsdae-lemone-mbert-base, a BERT-based sentence embedding model trained with the Transformer-based Sequential Denoising Auto-Encoder (TSDAE) method for unsupervised sentence embedding learning, with one objective: French legal domain adaptation.
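As a rough illustration of the denoising part of TSDAE: the method corrupts each input sentence and trains the model to reconstruct the original from the embedding of the corrupted copy. The sketch below shows only that corruption step, assuming random token deletion with a ratio of 0.6 and whitespace tokenization; `delete_tokens` is a hypothetical helper, and this is not the training code used for tsdae-lemone-mbert-base.

```python
import random


def delete_tokens(sentence, del_ratio=0.6, rng=random):
    """Return a noisy copy of `sentence` with roughly `del_ratio` of its tokens removed."""
    tokens = sentence.split()
    if not tokens:
        return sentence
    # Keep each token independently with probability 1 - del_ratio.
    kept = [t for t in tokens if rng.random() >= del_ratio]
    if not kept:
        # Always keep at least one token so the encoder has something to embed.
        kept = [rng.choice(tokens)]
    return " ".join(kept)


# Sample French sentence used purely as illustrative input.
sentence = "Le contrat est formé par la rencontre d'une offre et d'une acceptation"
noisy = delete_tokens(sentence, rng=random.Random(0))
print(noisy)
```

During training, the pair (noisy sentence, original sentence) serves as input and reconstruction target, so no labeled data is needed.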

Link to the 🤗 Space: louisbrulenaudet/legalkit-retrieval

Notes:
The SentenceTransformer model currently in use is in beta and may not be suitable for direct use in production.

Very cool! Check this out @tomaarsen

Very glad to see more uses of embedding quantization, great job.