Is there a paper associated with this model?
Hello, I played around with this model and it seems to fall short of OpenFashionClip based on my eye tests in a Jupyter notebook. I wonder if this is because CLIP works differently and I need to structure my queries in a particular way. Aside from the blog, is there more I can read about how marqo-fashionSigLIP was made and benchmarked? Thanks.
Hi, thanks for the question. There is no paper at the moment, but the blog (https://www.marqo.ai/blog/search-model-for-fashion) has more information. The training method does have a paper and a repo:
https://github.com/marqo-ai/GCL
https://arxiv.org/pdf/2404.08535
In terms of performance, it is definitely worthwhile to build a data-driven evaluation on your own data. Our models do not require a prefix, and adding one can hurt performance. The model was benchmarked on 7 openly available datasets. If your data resembles those datasets, it should perform well; if your data is quite different, the results may differ. See https://github.com/marqo-ai/marqo-FashionCLIP for more details.
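As a rough illustration of the data-driven evaluation suggested above, here is a minimal sketch that computes recall@k from precomputed embeddings. The function names (`cosine`, `recall_at_k`) and the toy vectors are hypothetical and are not taken from the Marqo repo; in practice the vectors would come from whichever models you are comparing, run on your own labelled queries and items.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recall_at_k(query_vecs, item_vecs, relevant, k):
    """Fraction of queries whose single relevant item index is ranked in the top k."""
    hits = 0
    for query, rel in zip(query_vecs, relevant):
        # Rank item indices by similarity to the query, most similar first.
        ranked = sorted(
            range(len(item_vecs)),
            key=lambda i: cosine(query, item_vecs[i]),
            reverse=True,
        )
        if rel in ranked[:k]:
            hits += 1
    return hits / len(query_vecs)


# Toy data standing in for real text (query) and image (item) embeddings.
items = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
queries = [[0.9, 0.1], [0.1, 0.9]]
relevant = [0, 2]  # index of the correct item for each query

r1 = recall_at_k(queries, items, relevant, k=1)
r2 = recall_at_k(queries, items, relevant, k=2)
print(f"recall@1={r1:.2f} recall@2={r2:.2f}")
```

Running the same function on embeddings from two models, with the same queries and labels, gives a like-for-like number in place of an eye test.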
One question: which framework are you using to load the model and run inference?
Hi Jesse, thanks for the quick reply. I will take a read. I have followed the steps for Python using AutoModel as shown in the same code. Would there be a difference if I used Transformers.js or OpenCLIP? Additionally, it seems like the SigLIP model is strictly better than the CLIP variant in all benchmarks. Is there any benefit to using the Marqo/marqo-fashionCLIP model instead? Thanks again.
There is a small difference because the HF image preprocessor doesn't support the resizing functions needed for exact parity. However, we measured downstream performance and found the difference to be negligible in practice (<0.1%). If you have a choice, I would suggest loading the model with OpenCLIP. In our testing SigLIP was better. FashionCLIP is trained from a different base model, so its behavior can differ slightly, which might be beneficial. My suggestion is to start with SigLIP and perhaps test CLIP to see if there is a difference on your dataset, but SigLIP was better in all our testing.
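For reference, loading through OpenCLIP as suggested above might look like the sketch below. It assumes the `open_clip_torch` and `torch` packages are installed and that the model is addressed with OpenCLIP's `hf-hub:` prefix. The helper name `embed_texts` is hypothetical. The imports sit inside the function so that nothing is downloaded until it is actually called.

```python
# Assumed identifier: the Hugging Face repo name with OpenCLIP's "hf-hub:" prefix.
MODEL_ID = "hf-hub:Marqo/marqo-fashionSigLIP"


def embed_texts(texts):
    """Return L2-normalized text embeddings for a list of plain query strings."""
    # Local imports: defining this function does not require the packages.
    import torch
    import open_clip

    # Downloads the weights from the Hugging Face Hub on first use.
    model, _, _ = open_clip.create_model_and_transforms(MODEL_ID)
    tokenizer = open_clip.get_tokenizer(MODEL_ID)
    model.eval()

    with torch.no_grad():
        tokens = tokenizer(texts)
        return model.encode_text(tokens, normalize=True)


# Example call (needs network access); note that the query carries no prefix:
# features = embed_texts(["red floral summer dress"])
```

Images would go through the validation transform, the third value returned by `create_model_and_transforms`, and then `model.encode_image`.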
Thanks. I tried OpenCLIP today but got this error when trying to download the SigLIP model. However, when I replaced the model name with another popular one, OpenCLIP was able to download it. Is it possible that the server hosting the model is down?
FileNotFoundError: Failed to download any files for Marqo/marqo-fashionSigLIP. Last error: (MaxRetryError("HTTPSConnectionPool(host='cdn-lfs-us-1.hf.co', port=443): Max retries exceeded with url: /repos/9b/85/9b858a1ee4a09cea3f513581e420ba53410c35f849fbe6d80865157a4c686314/f51a245681b2a027c26c1684a89dbd27cbd2819fca2fc2d4c697208d33d46400?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27open_clip_pytorch_model.bin%3B+filename%3D%22open_clip_pytorch_model.bin%22%3B&response-content-type=application%2Foctet-stream&Expires=1729751689&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcyOTc1MTY4OX19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmhmLmNvL3JlcG9zLzliLzg1LzliODU4YTFlZTRhMDljZWEzZjUxMzU4MWU0MjBiYTUzNDEwYzM1Zjg0OWZiZTZkODA4NjUxNTdhNGM2ODYzMTQvZjUxYTI0NTY4MWIyYTAyN2MyNmMxNjg0YTg5ZGJkMjdjYmQyODE5ZmNhMmZjMmQ0YzY5NzIwOGQzM2Q0NjQwMD9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=tAO5oWKShwasTtJ4tIb0vGrA8kYp1U90VWTbGs62o5MNsIiOrOGxqccEDjsXzVlcOi3-wBaTp5iG-GhJnImHg-ZR6rXwLOeugY2ZZCS4hWVPMogoJH2QCSlTeUQl63sUvNA9DCR7YSNn8G5bezqF-hRF7MVkmhnVwL6uW-bd8HKXlj91C4tkfi0zG34MOBcIsPRS8vXFvYGxASWpcMCboKP092RjgbPuniJNw9LCyLhOjtWdPRRVu7bG7rLbi37zT-cXZ7BBWOF4njcsh9-Hat4ublXWEfsjc1rOuamSREzXhySTJ03uCix7ejvLT71Cj8AZNXQAuQdNEwj5okbQ__&Key-Pair-Id=K24J24Z295AEI9 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x1103166d0>: Failed to establish a new connection: [Errno 61] Connection refused'))"), '(Request ID: 4f4b5e2d-8864-4f1f-9fef-b9aabd0fc31e)')
I think that must be a Hugging Face issue, as I was able to download it just now.
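Since the failure above looks transient, one option is to wrap the download in a small retry loop with exponential backoff. The sketch below is generic and the helper name `with_retries` is hypothetical. It retries on `OSError`, which covers the `FileNotFoundError` shown in the traceback as well as Python's built-in connection errors.

```python
import time


def with_retries(fn, attempts=4, base_delay=1.0, retry_on=(OSError,), sleep=time.sleep):
    """Call fn(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            # Out of attempts: re-raise the last error to the caller.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * 2 ** attempt)


# Demonstration with a stand-in for a flaky download: fails twice, then succeeds.
calls = {"n": 0}


def flaky_download():
    calls["n"] += 1
    if calls["n"] < 3:
        raise FileNotFoundError("simulated transient download failure")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = with_retries(flaky_download, sleep=delays.append)
print(result, calls["n"], delays)
```

In real use, `fn` would be a zero-argument callable that performs the actual model load, for example a `lambda` around the OpenCLIP call.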