Image question answering
This example shows how to use the Gemini API to answer questions about images of cats. It loads one image from a URL and another from a local file that is encoded as Base64.
Import necessary libraries
from google import genai
from google.genai import types
import requests
import base64
Create a client, replacing YOUR_API_KEY with your Gemini API key
client = genai.Client(api_key="YOUR_API_KEY")
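Hard-coding the key is fine for a quick test, but it is easy to commit by accident. A minimal sketch of reading it from the environment instead is shown here. The helper name `load_api_key` and the variable name `GEMINI_API_KEY` are assumptions made for illustration and are not part of the original example.

```python
import os


def load_api_key(env=None, var_name="GEMINI_API_KEY"):
    """Return the API key stored under var_name, or raise if it is missing.

    Hypothetical helper: the name and the default variable name are
    illustrative choices, not something the SDK requires.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


# Usage with the client from this example (left commented out so the
# sketch runs without the SDK installed):
# client = genai.Client(api_key=load_api_key())
```

Passing `env` explicitly makes the helper easy to exercise without touching the real environment.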
We'll start by using an image URL.
Load an image of a cat from a URL, failing early on an HTTP error
image_url = "https://cataas.com/cat"
image_response = requests.get(image_url, timeout=30)
image_response.raise_for_status()
image_content = image_response.content
Ask Gemini about the cat in the image
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=[
        "What breed of cat is this?",
        types.Part.from_bytes(data=image_content, mime_type="image/jpeg"),
    ],
)
print("Response from URL Image:\n", response.text)
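The request above labels the image as `image/jpeg` unconditionally. If the server might return another format, the MIME type can be taken from the HTTP `Content-Type` header instead. The following is a rough sketch under that assumption. The helper `mime_type_from_header` is hypothetical and not part of the original example.

```python
def mime_type_from_header(content_type, default="image/jpeg"):
    """Extract the bare MIME type from a Content-Type header value.

    Hypothetical helper: falls back to `default` when the header is
    missing and rejects anything that is not an image type.
    """
    if not content_type:
        return default
    # Drop parameters such as "; charset=binary" and normalize case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if not mime.startswith("image/"):
        raise ValueError(f"Expected an image, got {mime!r}")
    return mime


# Usage with the response fetched in this example (commented out so the
# sketch runs without network access):
# mime = mime_type_from_header(image_response.headers.get("Content-Type"))
# part = types.Part.from_bytes(data=image_content, mime_type=mime)
```

Raising on a non-image type surfaces cases such as an HTML error page being returned in place of the picture.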
Now we'll use a local image file.
Load a local image of a cat and encode it as Base64
with open("cat.jpg", "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read())
b64encode returns bytes, so decode them into a UTF-8 string
encoded_string = encoded_string.decode("utf-8")
Ask Gemini a question about the cat. Part.from_bytes expects raw bytes, so the Base64 string is decoded back to bytes before it is passed in
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=[
        "Is this cat fluffy?",
        types.Part.from_bytes(data=base64.b64decode(encoded_string), mime_type="image/jpeg"),
    ],
)
print("\nResponse from Base64 Image:\n", response.text)
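The Base64 step in this example is a round trip: the file bytes are encoded to text and then decoded back to the same bytes before being sent. The sketch below illustrates this with stand-in bytes instead of a real image. The helper names `to_base64_text` and `from_base64_text` are assumptions made for illustration.

```python
import base64


def to_base64_text(data: bytes) -> str:
    """Encode raw bytes as a Base64 string (ASCII-safe text)."""
    return base64.b64encode(data).decode("utf-8")


def from_base64_text(text: str) -> bytes:
    """Decode a Base64 string back into the original raw bytes."""
    return base64.b64decode(text)


# Stand-in bytes for illustration; a real image would be read from disk.
raw = b"\xff\xd8\xff\xe0 sample bytes, not a real JPEG"
encoded_string = to_base64_text(raw)

# The round trip returns exactly the bytes that were encoded.
assert from_base64_text(encoded_string) == raw
```

Base64 text is useful when the image has to travel through a text-only channel such as JSON. When the bytes are already in memory, they can be passed along directly.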
Running the Example
First, install the Google Gen AI SDK and requests
$ pip install google-genai requests
Download an example cat image (replace with your own if needed)
$ wget https://cataas.com/cat -O cat.jpg
Then run the program with Python
$ python gemini-cat.py
Response from URL Image:
This looks like a British Shorthair cat.
Response from Base64 Image:
Yes, this cat appears to be fluffy.