I would like to create a Python script that sends a POST request to the Hugging Face Inference API for an image-to-text model. The model is nlpconnect/vit-gpt2-image-captioning. I'm having trouble sending the image: the POST request returns a 400 error. The Python script is as follows:
import base64
import requests
import os

def query(API_TOKEN):
    model = 'Salesforce/blip-image-captioning-large'
    headers = {"Authorization": f"Bearer {API_TOKEN}"}
    image_path = "./demo.jpg"
    # Check if the image file exists
    if not os.path.isfile(image_path):
        return {"error": "Image file does not exist"}
    with open(image_path, "rb") as image_file:
        try:
            # Try to encode the image file
            encoded_string = base64.b64encode(image_file.read()).decode()
        except Exception as e:
            return {"error": f"Error encoding image: {str(e)}"}
    data = {
        "inputs": {
            "images": [encoded_string],  # using the base64 encoded string
            "texts": ["a photography of"]  # Optional, based on your current class logic
        }
    }
    try:
        # Try to send a request to the API endpoint
        response = requests.post(
            f'https://api-inference.huggingface.co/models/{model}',
            headers=headers,
            json=data
        )
    except Exception as e:
        return {"error": f"Error sending request: {str(e)}"}
    return response.json()
The function returns the error: {'error': ["Error in inputs: Invalid image: {'images': ['/9j/4AAQSkZJRgABAQEA8ADwAA...zm2Z8+UaGwKf/Z'], 'texts': ['a photography of']}"]}.
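As a sanity check on the encoding step, the visible prefix of the base64 string in the error can be decoded locally. This is a minimal sketch using only the standard library; only the first eight characters shown in the error message are used, since the rest is elided:

```python
import base64

# First eight base64 characters copied from the error message.
prefix = "/9j/4AAQ"
decoded = base64.b64decode(prefix)

# JPEG data begins with the bytes FF D8 FF (start-of-image marker).
is_jpeg = decoded.startswith(b"\xff\xd8\xff")
print(decoded.hex(), is_jpeg)  # prints: ffd8ffe00010 True
```

The prefix decodes to a JPEG header, which suggests the base64 step itself works and that the API may be rejecting the shape of the payload rather than the image data.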
I’m struggling to identify the source of my error. Could someone help me? Thank you!
The image I provided is demo.jpg, downloaded with: !wget https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg
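For comparison, the hosted Inference API examples for image tasks appear to send the raw image bytes as the request body, rather than a JSON object containing base64. Below is a minimal sketch of that variant, assuming the same endpoint and token as in the script above. `build_caption_request` and `query_raw` are hypothetical helper names, and the request is prepared separately so it can be inspected before anything is sent:

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/{model}"


def build_caption_request(image_bytes, api_token,
                          model="Salesforce/blip-image-captioning-large"):
    """Prepare a POST whose body is the raw image bytes (no JSON, no base64)."""
    request = requests.Request(
        "POST",
        API_URL.format(model=model),
        headers={"Authorization": f"Bearer {api_token}"},
        data=image_bytes,  # bytes are sent unchanged as the request body
    )
    return request.prepare()


def query_raw(image_path, api_token):
    """Read the image file and send it; returns the decoded JSON response."""
    with open(image_path, "rb") as image_file:
        prepared = build_caption_request(image_file.read(), api_token)
    with requests.Session() as session:
        response = session.send(prepared, timeout=30)
    return response.json()
```

Calling `query_raw("./demo.jpg", API_TOKEN)` would send the file as-is. Whether the text prompt ("a photography of") can be passed through this endpoint is not something this sketch addresses.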