---
license: mit
datasets:
- natural_questions
pipeline_tag: question-answering
---

# AdANNS: A Framework for Adaptive Semantic Search 💃

_Aniket Rege*, Aditya Kusupati*, Sharan Ranjit S, Alan Fan, Qingqing Cao, Sham Kakade, Prateek Jain, Ali Farhadi_

GitHub: https://github.com/RAIVNLab/AdANNS

arXiv: https://arxiv.org/abs/2305.19435

Adaptive representations can be utilized effectively in the decoupled components of clustering and searching for a better accuracy-compute trade-off (AdANNS-IVF).
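
To make the decoupling concrete, here is a minimal NumPy sketch of the idea: cluster the database using only the first few dimensions of each embedding, then rank the candidates in the probed clusters using a longer prefix. It assumes embeddings whose leading dimensions are usable on their own, as with the MRL models in this card. The function names, the parameters `d_cluster`, `d_search`, and `n_probe`, the spherical k-means routine, and the random stand-in data are all illustrative assumptions, not the authors' implementation (see the GitHub repository for that).

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def build_ivf(db, d_cluster, n_clusters, n_iters=10, seed=0):
    """Cluster the database with a simple spherical k-means on the
    first d_cluster dimensions only (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    low = l2_normalize(db[:, :d_cluster])
    centroids = low[rng.choice(len(low), size=n_clusters, replace=False)].copy()
    for _ in range(n_iters):
        assign = np.argmax(low @ centroids.T, axis=1)
        for c in range(n_clusters):
            members = low[assign == c]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                centroids[c] = members.mean(axis=0)
        centroids = l2_normalize(centroids)
    assign = np.argmax(low @ centroids.T, axis=1)
    return centroids, assign

def search_ivf(query, db, centroids, assign, d_cluster, d_search, n_probe=2, k=5):
    """Pick clusters with the d_cluster-dim prefix, then rank their
    members with the (typically longer) d_search-dim prefix."""
    q_low = query[:d_cluster] / np.linalg.norm(query[:d_cluster])
    probe = np.argsort(-(centroids @ q_low))[:n_probe]
    candidates = np.flatnonzero(np.isin(assign, probe))
    cand_vecs = l2_normalize(db[candidates, :d_search])
    q_high = query[:d_search] / np.linalg.norm(query[:d_search])
    scores = cand_vecs @ q_high
    order = np.argsort(-scores)[:k]
    return candidates[order], scores[order]

# Random vectors stand in for real 768-d embeddings (assumption for the demo).
rng = np.random.default_rng(0)
db = rng.standard_normal((1000, 768)).astype(np.float32)

centroids, assign = build_ivf(db, d_cluster=48, n_clusters=16)
ids, scores = search_ivf(db[7], db, centroids, assign, d_cluster=48, d_search=192)
print(ids, scores)
```

Clustering on a short prefix keeps centroid comparisons cheap, while the longer prefix is only paid for on the candidates inside the probed clusters. The two dimensions can be tuned independently.
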

We provide four BERT-Base models finetuned on Natural Questions with [Matryoshka Representation Learning](https://github.com/RAIVNLab/MRL) (MRL).

A vanilla pretrained BERT-Base has a 768-d representation (information bottleneck). Training with MRL forces the network to learn representations at multiple granularities nested within a single 768-d embedding. The granularities at which we finetune BERT-Base with the Matryoshka loss are specified in the folder name, e.g. for `dpr-nq-d768_384_192_96_48`, we have d=[48, 96, 192, 384, 768].

You can easily load an mrl-nq model as follows:
```
from transformers import BertModel
import torch

model = BertModel.from_pretrained('dpr-nq-d768_384_192_96_48')
```

## Citation
If you find this project useful in your research, please consider citing:
```
@inproceedings{rege2023adanns,
      title={AdANNS: A Framework for Adaptive Semantic Search},
      author={Aniket Rege and Aditya Kusupati and Sharan Ranjit S and Alan Fan and Qingqing Cao and Sham Kakade and Prateek Jain and Ali Farhadi},
      booktitle = {Advances in Neural Information Processing Systems},
      month = {December},
      year = {2023},
}
```