metadata
license: bsd-3-clause
base_model:
- microsoft/resnet-50
pipeline_tag: image-feature-extraction
ResNet-50 Embeddings Only
This is a modified version of the standard ResNet-50 architecture in which the final fully connected layer, which performs the classification, has been removed.
The model therefore outputs the pooled feature vector that would normally feed the classifier, i.e. the image embedding.
NB: You may want to flatten the embeddings, as the output will otherwise have shape (1, 2048, 1, 1).
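As a minimal sketch of what flattening means here, the snippet below uses a zero-filled NumPy array as a stand-in for the model output (the names `raw`, `flat` and `squeezed` are illustrative only). It contrasts a reshape that keeps the batch dimension with `np.squeeze`, which also drops the batch dimension when the batch size is 1.

```python
import numpy as np

# Stand-in for the model output: 1 image, 2048 channels, 1x1 spatial size.
raw = np.zeros((1, 2048, 1, 1), dtype=np.float32)

# Keep the batch dimension and collapse everything else.
flat = raw.reshape(raw.shape[0], -1)
print(flat.shape)  # (1, 2048)

# np.squeeze removes every size-1 axis, including the batch axis here.
squeezed = np.squeeze(raw)
print(squeezed.shape)  # (2048,)
```

The reshape form is usually the safer choice if you later run batches of more than one image, since the result stays two-dimensional regardless of batch size.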
Example
import onnxruntime
from PIL import Image
from torchvision import transforms
def load_and_preprocess_image(image_path):
    # Define the same preprocessing as used in training
    preprocess = transforms.Compose(
        [
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
        ]
    )
    # Open the image file and force 3 channels (handles grayscale and RGBA images)
    img = Image.open(image_path).convert("RGB")
    # Preprocess the image
    img_preprocessed = preprocess(img)
    # Add batch dimension and convert to a NumPy array for ONNX Runtime
    return img_preprocessed.unsqueeze(0).numpy()
onnx_model_path = "resnet50_embeddings.onnx"
session = onnxruntime.InferenceSession(onnx_model_path)
input_name = session.get_inputs()[0].name
# Load and preprocess an image (replace with your image path)
image_path = "disco-ball.jpg"
input_data = load_and_preprocess_image(image_path)
# Run inference
outputs = session.run(None, {input_name: input_data})
# The output should be a single tensor (the embeddings)
embeddings = outputs[0]
# Flatten the embeddings
embeddings = embeddings.reshape(embeddings.shape[0], -1)
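One common use for flattened embeddings is comparing two images. The sketch below is an illustration rather than part of this model: `cosine_similarity` is a hypothetical helper, and the two arrays are random stand-ins for embeddings of shape (1, 2048) that you would normally obtain from `session.run` as above.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (hypothetical helper)."""
    a = np.asarray(a, dtype=np.float32).ravel()
    b = np.asarray(b, dtype=np.float32).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # Similarity is undefined for a zero vector; return 0.0 by convention.
        return 0.0
    return float(np.dot(a, b) / denom)


# Random stand-ins for two flattened embeddings (assumption: shape (1, 2048)).
rng = np.random.default_rng(0)
emb_a = rng.standard_normal((1, 2048)).astype(np.float32)
emb_b = rng.standard_normal((1, 2048)).astype(np.float32)

print(cosine_similarity(emb_a, emb_a))  # identical vectors score close to 1.0
print(cosine_similarity(emb_a, emb_b))  # unrelated vectors score lower
```

Values near 1 suggest similar images and values near 0 suggest unrelated ones. Whether cosine similarity is the right metric depends on your task.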