Update README.md
README.md
@@ -7,7 +7,7 @@ tags:
---
This is a quantized version of https://huggingface.co/laion/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K that is ready to use with [DeepSparse](https://github.com/neuralmagic/deepsparse). It achieves 71.1% one-shot accuracy on ImageNet.

## Setup for usage

[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1ZvU9ZSHJKSeJyH5bgxo_A-GSVIUcSt2E?usp=sharing)

First, install DeepSparse with extensions for CLIP:
```
@@ -21,7 +21,7 @@ wget -O buddy.jpeg https://raw.githubusercontent.com/neuralmagic/deepsparse/main
wget -O thailand.jpg https://raw.githubusercontent.com/neuralmagic/deepsparse/main/src/deepsparse/yolact/sample_images/thailand.jpg
```

For this model, the text model takes a second input holding the token lengths, so run this input override code before creating a text pipeline:
```python
import numpy as np
from deepsparse.clip import CLIPTextPipeline
@@ -40,7 +40,49 @@ def custom_process_inputs(self, inputs):
CLIPTextPipeline.process_inputs = custom_process_inputs
```

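If you want to confirm why the override is needed, an optional sanity check is to open the exported text model with the `onnx` package and list its graph inputs. This is only an illustrative sketch; the input names and shapes printed are whatever this export contains.

```python
import onnx
from huggingface_hub import snapshot_download

# Download the same snapshot used by the pipelines below
model_folder = snapshot_download(repo_id="neuralmagic/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K-quant-ds")

# Print every graph input of the quantized text model.
# The names and shapes depend entirely on the export, so treat this as informational.
textual = onnx.load(model_folder + "/textual.onnx")
for inp in textual.graph.input:
    dims = [d.dim_value if d.dim_value > 0 else d.dim_param for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)
```
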
## Text embedding pipeline

Here is an example of how to create and use a [DeepSparse pipeline for text embeddings](https://github.com/neuralmagic/deepsparse/blob/main/src/deepsparse/clip/text_pipeline.py).

```python
from deepsparse import Pipeline
from huggingface_hub import snapshot_download

# Download the model from HF
model_folder = snapshot_download(repo_id="neuralmagic/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K-quant-ds")

text_embed_pipeline = Pipeline.create(task="clip_text", model_path=model_folder + "/textual.onnx")

text = ["ice cream", "an elephant", "a dog", "a building", "a church"]

embeddings = text_embed_pipeline(text=text).text_embeddings
for i in range(len(embeddings)):
    print(embeddings[i].shape)
    print(embeddings[i])
```

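The returned embeddings are plain vectors, so they can be compared directly. Here is a minimal follow-up sketch, assuming each entry of `embeddings` from the snippet above is a NumPy array (flattened here in case of a leading batch dimension); the pairings are chosen just for illustration.

```python
import numpy as np

def cosine_similarity(a, b):
    # Flatten in case the pipeline returns arrays with a leading batch dimension
    a, b = np.ravel(a), np.ravel(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Reuses `embeddings` and the `text` list from the snippet above
print(cosine_similarity(embeddings[2], embeddings[1]))  # "a dog" vs. "an elephant"
print(cosine_similarity(embeddings[3], embeddings[4]))  # "a building" vs. "a church"
```
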
## Image embedding pipeline

Here is an example of how to create and use a [DeepSparse pipeline for image embeddings](https://github.com/neuralmagic/deepsparse/blob/main/src/deepsparse/clip/visual_pipeline.py).

```python
from deepsparse import Pipeline
from huggingface_hub import snapshot_download

# Download the model from HF
model_folder = snapshot_download(repo_id="neuralmagic/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K-quant-ds")

image_embed_pipeline = Pipeline.create(task="clip_visual", model_path=model_folder + "/visual.onnx")

images = ["basilica.jpg", "buddy.jpeg", "thailand.jpg"]

embeddings = image_embed_pipeline(images=images).image_embeddings
for i in range(len(embeddings)):
    print(embeddings[i].shape)
    print(embeddings[i])
```

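Because both pipelines return plain vectors, the text and image embeddings can already be related by hand, which is roughly what the zero-shot pipeline in the next section automates. The sketch below is illustrative only; it assumes the `embeddings` lists from the two snippets above were kept as `text_embs` and `image_embs`, and that the `text` and `images` lists are still defined.

```python
import numpy as np

# Stack the per-item embeddings into matrices (flatten in case of a leading batch dim)
text_matrix = np.stack([np.ravel(e) for e in text_embs])    # (num_texts, dim)
image_matrix = np.stack([np.ravel(e) for e in image_embs])  # (num_images, dim)

# L2-normalize so a dot product equals cosine similarity
text_matrix = text_matrix / np.linalg.norm(text_matrix, axis=1, keepdims=True)
image_matrix = image_matrix / np.linalg.norm(image_matrix, axis=1, keepdims=True)

# Score each image against every text prompt and pick the best match
scores = image_matrix @ text_matrix.T  # (num_images, num_texts)
for image_name, row in zip(images, scores):
    print(image_name, "->", text[int(row.argmax())])
```
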
## Zero-shot image classification pipeline

Since CLIP trained the text and image embedding models in tandem, we can generate embeddings for both and relate them to each other without retraining. Here is an example of how to create and use a [DeepSparse pipeline for zero-shot image classification](https://github.com/neuralmagic/deepsparse/blob/main/src/deepsparse/clip/zeroshot_pipeline.py).

```python
from deepsparse import Pipeline
from deepsparse.clip import (
@@ -58,8 +100,8 @@ images = ["basilica.jpg", "buddy.jpeg", "thailand.jpg"]

# Load the model into DeepSparse
pipeline = Pipeline.create(
    task="clip_zeroshot",
    visual_model_path=model_folder + "/visual.onnx",
    text_model_path=model_folder + "/textual.onnx"
)