Salesforce/blip2-opt-2.7b - Deployment in a SageMaker Real-time Endpoint - GPU [Solved]
Hello!
I created this discussion to help you deploy this model to a SageMaker real-time endpoint and make predictions with it.
Note
- Environment: Studio - Data Science 3.0 - Python 3 - PublicInternetOnly
- Test image: https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg
I gathered different issues reported for similar image-to-text models; the following insights can help you deal with them:
Helpful information
- The latest GPU DLC versions for Hugging Face + PyTorch help avoid the mapping error "KeyError: 'blip-2'" raised by the default configuration in the HF model card.
- The "ml.g5.2xlarge" instance type helps reduce the model's latency and makes it easier to determine where an issue happens when invoking the endpoint.
- Different values for the parameters from _sanitize_parameters() in the prediction request can modify the prediction output.
- The model accepts URL strings as input data.
- Local images work for predictions when encoded in Base64 (see the examples below); passing local paths directly as input data does not.
- A local path as input data doesn't work: os.path.isfile() tries to find the path on the hosting instance where the model is running, but the image file is not pre-loaded there. The SageMaker hosting instance is different from the instance where the Studio application is running.
Deploying the model with a fixed configuration
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Hub model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'Salesforce/blip2-opt-2.7b',
    'HF_TASK': 'image-to-text'
}

# Create the Hugging Face Model class with a pinned DLC image.
# HuggingFace Inference Containers: https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-inference-containers
huggingface_model = HuggingFaceModel(
    # transformers_version='4.26.0',
    # pytorch_version='1.13.1',
    # py_version='py39',
    image_uri='763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:2.1.0-transformers4.37.0-gpu-py310-cu118-ubuntu20.04',
    env=hub,
    role=role,
)

# Deploy the model to a SageMaker real-time endpoint
predictor_03 = huggingface_model.deploy(
    initial_instance_count=1,  # number of instances
    instance_type='ml.g5.2xlarge'  # EC2 instance type
)
# Reference: the inference toolkit's handler service, which processes these requests inside the container
# https://github.com/aws/sagemaker-huggingface-inference-toolkit/blob/main/src/sagemaker_huggingface_inference_toolkit/handler_service.py
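If the Studio kernel restarts later, the endpoint keeps running and you can re-attach to it instead of redeploying. A minimal sketch, assuming you noted the endpoint name (the name shown here is hypothetical):

from sagemaker.huggingface.model import HuggingFacePredictor

# Re-attach to the running endpoint by name; use the value of
# predictor_03.endpoint_name from the deployment above (hypothetical name here).
predictor_03 = HuggingFacePredictor(endpoint_name='huggingface-pytorch-inference-2024-02-06-19-00-00-000')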
Invoking the Endpoint - Successful attempts - Image URL
In:
# Successful attempt with just the image URL
data_url = {'inputs':'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg'}
predictor_03.predict(data_url)
Out:
[{'generated_text': 'a woman sitting on the beach with a dog\n'}]
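The same request can also be sent without the SageMaker Python SDK through the low-level runtime client, which is useful for applications that only have boto3 available. A minimal sketch, assuming the endpoint deployed above:

import json
import boto3

runtime = boto3.client('sagemaker-runtime')
response = runtime.invoke_endpoint(
    EndpointName=predictor_03.endpoint_name,
    ContentType='application/json',
    Body=json.dumps(data_url),
)
json.loads(response['Body'].read())  # same [{'generated_text': ...}] payload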
In:
# Successful attempt with the image URL and modified parameters; see
# https://github.com/huggingface/transformers/blob/a1afec9e1759b0fdb256d41d429161cc15ecf500/src/transformers/pipelines/image_to_text.py#L76
data02 = {
    'inputs': 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg',
    'parameters': {'max_new_tokens': 3}
}
predictor_03.predict(data02)
Out:
[{'generated_text': 'a woman sitting'}]
In:
# Successful attempt with the image URL and a prompt to modify the output
data03 = {
    'inputs': 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg',
    'parameters': {'prompt': 'Dog helps to'}
}
predictor_03.predict(data03)
Out:
[{'generated_text': ' make a girl happy\n'}]
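The parameters can also be combined in a single request. This variant is an untested sketch based on the _sanitize_parameters() reference above; the prompt text is only an illustration:

data_combo = {
    'inputs': 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg',
    'parameters': {
        'prompt': 'Question: what is the dog doing? Answer:',  # hypothetical prompt
        'max_new_tokens': 20
    }
}
predictor_03.predict(data_combo)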
Invoking the Endpoint - Successful attempts - Image in local environment
Download the image to your local environment. A relative or absolute path to the image can be used.
Converting the image to a valid Base64-encoded string
In:
import base64

data04 = {}
with open('Dog_pic.jpeg', mode='rb') as file:
    img = file.read()
data04['inputs'] = base64.encodebytes(img).decode('utf-8')
data04['inputs'] = data04['inputs'].replace('\n', '')  # Required to avoid a 'non-base64 digit found' error when calling the endpoint.
data04['inputs']
Out:
'/9j/4AAQSkZJRgABAQEA8ADwAAD//gAcY21wMy4xMC4zLjNMcTMgMHgzZ...
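The two encoding steps can be wrapped in a small convenience helper; this function is hypothetical, not part of any SDK. It uses base64.b64encode(), which, unlike base64.encodebytes(), inserts no newline characters, so the replace() step becomes unnecessary:

import base64

def predict_local_image(predictor, image_path):
    # Encode a local image file as a Base64 string and send it to the endpoint.
    with open(image_path, mode='rb') as file:
        encoded = base64.b64encode(file.read()).decode('utf-8')  # no '\n' characters to strip
    return predictor.predict({'inputs': encoded})

predict_local_image(predictor_03, 'Dog_pic.jpeg')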
Validation step to confirm that the Base64-encoded string will be accepted (optional)
In:
import base64
from io import BytesIO

import PIL.Image

b64 = base64.b64decode(data04['inputs'], validate=True)
image = PIL.Image.open(BytesIO(b64))
image
Out: It should display the original image without any problem.
Invoking the endpoint
In:
predictor_03.predict(data04)
Out:
[{'generated_text': 'a woman sitting on the beach with a dog\n'}]
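When you finish testing, delete the endpoint and the model so the ml.g5.2xlarge instance stops incurring charges:

# Clean up: remove the model and the endpoint created above
predictor_03.delete_model()
predictor_03.delete_endpoint()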
Common issues
load_image() issue
The CloudWatch logs below appear when there is an issue with the input data format. The load_image() function raises the exception when the input data is not a valid URL starting with http:// or https://, a valid path to an image file, or a Base64-encoded string.
2024-02-06T19:13:17,141 [INFO ] W-Salesforce__blip2-opt-2.7-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.10/site-packages/transformers/image_utils.py", line 323, in load_image: Non-base64 digit found
2024-02-06T19:13:17,141 [INFO ] W-Salesforce__blip2-opt-2.7-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.10/site-packages/transformers/image_utils.py", line 326, in load_image
2024-02-06T19:13:17,141 [INFO ] W-Salesforce__blip2-opt-2.7-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - ValueError: Incorrect image source. Must be a valid URL starting with http:// or https://, a valid path to an image file, or a base64 encoded string. Got /9j/4AAQSkZJRgABAQEA8ADwAAD//gAcY21wMy4xMC4zLjNMcTMgMHgzZGVjN2ZlYwD/2wBDAAMB...
Code which raises the exception
else:
    if image.startswith("data:image/"):
        image = image.split(",")[1]

    # Try to load as base64
    try:
        b64 = base64.b64decode(image, validate=True)
        image = PIL.Image.open(BytesIO(b64))
    except Exception as e:
        raise ValueError(
            f"Incorrect image source. Must be a valid URL starting with `http://` or `https://`, a valid path to an image file, or a base64 encoded string. Got {image}. Failed with {e}"
        )
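Since this check lives in transformers itself, you can reproduce it locally before invoking the endpoint. A quick sketch, assuming transformers is installed in your Studio environment:

from transformers.image_utils import load_image

# Raises the same ValueError locally if the payload would be rejected server-side
image = load_image(data04['inputs'])
image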
Hope this helps!
Thank you for sharing, this is very helpful!