How Can I Send Files to Google's Gemini Models via API Call?


Overview

Currently, I use the GoogleGenerativeAI library to handle generative AI prompt requests in my application. Gemini promises to be a multi-modal AI model, and I'd like to enable my users to send files (e.g. PDFs, images, .xls files) alongside their AI prompts.

I was using the following workflow to enable people to upload a file and use it in a prompt:

  • Enable file selection from their local machine (e.g. PDFs, .doc, .xls formatted files).
  • Upload the file to Google Cloud Storage, get an accessible link to the newly-uploaded file.
  • Send the request to Gemini with the link to the file included in the prompt (where appropriate); a rough sketch of this flow follows the list.
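
For reference, here's a minimal sketch of that flow, assuming the @google-cloud/storage and @google/generative-ai npm packages; the bucket name, destination path, and model name are placeholders rather than my actual setup:

```javascript
// Minimal sketch of the previous flow: upload to Cloud Storage, get a link,
// then put the link in the prompt text. Bucket and paths are placeholders.
const { Storage } = require("@google-cloud/storage");
const { GoogleGenerativeAI } = require("@google/generative-ai");

const storage = new Storage();
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);

async function askAboutUploadedFile(localPath, prompt) {
  // 1. Upload the user's file to a Cloud Storage bucket.
  const bucket = storage.bucket("my-upload-bucket");
  const [file] = await bucket.upload(localPath, {
    destination: `uploads/${Date.now()}`,
  });

  // 2. Get a time-limited, accessible link to the uploaded object.
  const [url] = await file.getSignedUrl({
    action: "read",
    expires: Date.now() + 60 * 60 * 1000, // valid for one hour
  });

  // 3. Include the link in the prompt text and send it to Gemini.
  //    This is the step that no longer works: the model replies that it
  //    cannot fetch external URLs.
  const model = genAI.getGenerativeModel({ model: "gemini-pro" });
  const result = await model.generateContent(`${prompt}\n\nFile: ${url}`);
  return result.response.text();
}
```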

However, I'm now finding that this solution no longer works. Instead, I'm seeing responses like this:

I lack the ability to access external websites or specific files from given URLs, including the one you provided from Google Cloud Storage. Therefore, I'm unable to summarize the content of the file.

What I've Considered

  • Using several libraries client-side to convert each document type to text (e.g. a PDF-parsing library for PDFs) and using Gemini's image-handling model when an image is involved (see the sketch after this list). However, this means pulling in a lot of libraries, and multimodal handling is exactly what Gemini seems to promise to do for me / my users.
  • Pre-processing the uploaded files server-side (for example, sending them to Google's Document AI), turning each document into some kind of consistently-structured data, then using that data with the GoogleGenerativeAI library. Document AI calls are expensive, though, and again this seems like something Gemini is meant to handle.
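
As a point of comparison, here's a minimal sketch of the convert-to-text idea (shown server-side in Node for simplicity), assuming the pdf-parse npm package and the text-only gemini-pro model; file paths and the prompt are placeholders:

```javascript
// Sketch of the "extract text first, then prompt" workaround for PDFs.
const fs = require("fs");
const pdf = require("pdf-parse");
const { GoogleGenerativeAI } = require("@google/generative-ai");

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);

async function summarizePdf(pdfPath) {
  // Extract plain text from the PDF locally...
  const { text } = await pdf(fs.readFileSync(pdfPath));

  // ...then send only the extracted text to the text-only model.
  const model = genAI.getGenerativeModel({ model: "gemini-pro" });
  const result = await model.generateContent(
    `Summarize this document:\n\n${text}`
  );
  return result.response.text();
}
```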

My App's Stack (In Case it Matters)

  • Firebase / Google Cloud Functions
  • Vercel
  • Next.js

Can you suggest an approach that will let users include files in the requests my web app makes to Gemini?

Thanks in advance!


There are 2 best solutions below

Answer 1

The documentation on generating text from text-and-image input (multimodal) has an example of how to include image data in a request.

As Guillaume commented, this requires that you include your image data as a base64-encoded part in your request. While I haven't tested the JavaScript bindings myself yet, this matches my experience with the Dart bindings, where I also included images as base64-encoded parts.
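
A sketch of what that looks like with the @google/generative-ai JavaScript package, along the lines of the documentation example; the file path, MIME type, and prompt below are placeholders:

```javascript
// Send a local image to Gemini as a base64-encoded inlineData part.
const fs = require("fs");
const { GoogleGenerativeAI } = require("@google/generative-ai");

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);

// Read a local file and wrap it as a base64-encoded part of the request.
function fileToGenerativePart(path, mimeType) {
  return {
    inlineData: {
      data: fs.readFileSync(path).toString("base64"),
      mimeType,
    },
  };
}

async function describeImage() {
  const model = genAI.getGenerativeModel({ model: "gemini-pro-vision" });
  const result = await model.generateContent([
    "Describe what is in this image.",
    fileToGenerativePart("./photo.png", "image/png"),
  ]);
  console.log(result.response.text());
}
```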

Answer 2

"Gemini promises to be a multi-modal AI model"

The multi-modal capabilities of Gemini are currently limited, and they differ slightly depending on whether you are using the Google AI Studio version of the library or the Google Cloud Vertex AI version.

  • The Google AI Studio version only supports text plus images in JPEG, PNG, HEIC, HEIF, or WebP format, and these can only be sent inline. See https://ai.google.dev/api/rest/v1/Content#part
  • The Google Cloud Vertex AI version also supports these, but has a couple of additions: it additionally accepts video, and image data can be referenced by a Cloud Storage URI instead of being sent inline (sketched below).
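
For illustration, here is a rough sketch (as a plain JavaScript object) of what a Vertex AI generateContent request body could look like when it references an image in Cloud Storage rather than inlining base64 data; the bucket, object name, and prompt are placeholders, and the Google AI Studio endpoint does not accept this form:

```javascript
// Hypothetical Vertex AI request body using a fileData part that points at a
// Cloud Storage object instead of inline base64 data. Names are placeholders.
const requestBody = {
  contents: [
    {
      role: "user",
      parts: [
        { text: "Summarize what this image shows." },
        {
          fileData: {
            mimeType: "image/png",
            // gs:// URI of an object your service account can read
            fileUri: "gs://my-upload-bucket/uploads/diagram.png",
          },
        },
      ],
    },
  ],
};
```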

Currently, neither library supports other modalities, including PDFs, doc files, spreadsheets, etc. While these may be available in the future, they're not available today.