API access to Google's Gemini models
Install this plugin in the same environment as LLM.

```bash
llm install llm-gemini
```
Configure the model by setting a key called "gemini" to your API key:

```bash
llm keys set gemini
# Paste key here
```
You can also set the API key using the `LLM_GEMINI_KEY` environment variable.
Now run the model using `-m gemini-2.0-flash`, for example:

```bash
llm -m gemini-2.0-flash "A short joke about a pelican and a walrus"
```
> A pelican and a walrus are sitting at a bar. The pelican orders a fishbowl cocktail, and the walrus orders a plate of clams. The bartender asks, "So, what brings you two together?"
>
> The walrus sighs and says, "It's a long story. Let's just say we met through a mutual friend... of the fin."
You can set the default model to avoid the extra `-m` option:

```bash
llm models default gemini-2.0-flash
llm "A joke about a pelican and a walrus"
```
Other models are:

- `gemini-2.5-flash-preview-04-17` - Gemini 2.5 Flash preview
- `gemini-2.5-pro-exp-03-25` - free experimental release of Gemini 2.5 Pro
- `gemini-2.5-pro-preview-03-25` - paid preview of Gemini 2.5 Pro
- `gemma-3-27b-it` - Gemma 3 27B
- `gemini-2.0-pro-exp-02-05` - experimental release of Gemini 2.0 Pro
- `gemini-2.0-flash-lite` - Gemini 2.0 Flash-Lite
- `gemini-2.0-flash` - Gemini 2.0 Flash
- `gemini-2.0-flash-thinking-exp-01-21` - experimental "thinking" model from January 2025
- `gemini-2.0-flash-thinking-exp-1219` - experimental "thinking" model from December 2024
- `learnlm-1.5-pro-experimental` - "an experimental task-specific model that has been trained to align with learning science principles" - more details here.
- `gemini-2.0-flash-exp` - Gemini 2.0 Flash
- `gemini-exp-1206` - recent experimental #3
- `gemini-exp-1121` - recent experimental #2
- `gemini-exp-1114` - recent experimental #1
- `gemini-1.5-flash-8b-latest` - the least expensive
- `gemini-1.5-flash-latest`
Gemini models are multi-modal. You can provide images, audio or video files as input like this:

```bash
llm -m gemini-2.0-flash 'extract text' -a image.jpg
```
Or with a URL:

```bash
llm -m gemini-2.0-flash-lite 'describe image' \
  -a https://static.simonwillison.net/static/2024/pelicans.jpg
```
Audio works too:

```bash
llm -m gemini-2.0-flash 'transcribe audio' -a audio.mp3
```
And video:

```bash
llm -m gemini-2.0-flash 'describe what happens' -a video.mp4
```
The Gemini prompting guide includes extensive advice on multi-modal prompting.
Use `-o json_object 1` to force the output to be JSON:

```bash
llm -m gemini-2.0-flash -o json_object 1 \
  '3 largest cities in California, list of {"name": "..."}'
```
Outputs:

```json
{"cities": [{"name": "Los Angeles"}, {"name": "San Diego"}, {"name": "San Jose"}]}
```
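Because the output is guaranteed to be JSON, it can be consumed directly from a script. A minimal sketch in Python, assuming you have captured the command's stdout as a string (the `cities` key is specific to this prompt, not something the option itself guarantees):

```python
import json

# Stand-in for output captured from the llm command above
raw = '{"cities": [{"name": "Los Angeles"}, {"name": "San Diego"}, {"name": "San Jose"}]}'

data = json.loads(raw)
names = [city["name"] for city in data["cities"]]
print(names)  # ['Los Angeles', 'San Diego', 'San Jose']
```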
Gemini models can write and execute code - they can decide to write Python code, execute it in a secure sandbox and use the result as part of their response.
To enable this feature, use `-o code_execution 1`:

```bash
llm -m gemini-2.0-flash -o code_execution 1 \
  'use python to calculate (factorial of 13) * 3'
```
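The model writes and runs the Python itself; you can sanity-check the answer it should arrive at by computing the same expression locally (this is just local verification, not part of the plugin):

```python
import math

# (factorial of 13) * 3, the calculation from the prompt above
expected = math.factorial(13) * 3
print(expected)  # 18681062400
```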
Some Gemini models support Grounding with Google Search, where the model can run a Google search and use the results as part of answering a prompt.
Using this feature may incur additional requirements in terms of how you use the results. Consult Google's documentation for more details.
To run a prompt with Google search enabled, use `-o google_search 1`:

```bash
llm -m gemini-2.0-flash -o google_search 1 \
  'What happened in Ireland today?'
```
Use `llm logs -c --json` after running a prompt to see the full JSON response, which includes additional information about grounded results.
To chat interactively with the model, run `llm chat`:

```bash
llm chat -m gemini-2.0-flash
```
The plugin also adds support for the `gemini-embedding-exp-03-07` and `text-embedding-004` embedding models.
Run that against a single string like this:

```bash
llm embed -m text-embedding-004 -c 'hello world'
```
This returns a JSON array of 768 numbers.
The `gemini-embedding-exp-03-07` model is larger, returning 3072 numbers. You can also use variants of it that are truncated down to smaller sizes:
- `gemini-embedding-exp-03-07` - 3072 numbers
- `gemini-embedding-exp-03-07-2048` - 2048 numbers
- `gemini-embedding-exp-03-07-1024` - 1024 numbers
- `gemini-embedding-exp-03-07-512` - 512 numbers
- `gemini-embedding-exp-03-07-256` - 256 numbers
- `gemini-embedding-exp-03-07-128` - 128 numbers
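Conceptually, a truncated variant keeps only the leading dimensions of the full 3072-number vector. A minimal sketch of that idea in Python — note the L2 renormalization step is an assumption about how truncated embeddings are commonly used for similarity, not a description of this plugin's internals:

```python
import math

def truncate_embedding(vector, size):
    """Keep the first `size` dimensions, then L2-normalize the result."""
    head = vector[:size]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Stand-in for a full 3072-number embedding
full = [0.5, -0.25, 0.1, 0.9] * 768
small = truncate_embedding(full, 128)
print(len(small))  # 128
```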
This command will embed every `README.md` file in child directories of the current directory and store the results in a SQLite database called `embed.db` in a collection called `readmes`:

```bash
llm embed-multi readmes -d embed.db -m gemini-embedding-exp-03-07-128 \
  --files . '*/README.md'
```
You can then run similarity searches against that collection like this:

```bash
llm similar readmes -c 'upload csvs to stuff' -d embed.db
```
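Under the hood, a similarity search compares embedding vectors, typically by cosine similarity. A minimal illustration with toy vectors (not the plugin's actual code, just the underlying idea):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Parallel vectors score 1.0; orthogonal vectors score 0.0
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```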
See the LLM embeddings documentation for further details.
To set up this plugin locally, first check out the code. Then create a new virtual environment:

```bash
cd llm-gemini
python3 -m venv venv
source venv/bin/activate
```
Now install the dependencies and test dependencies:

```bash
llm install -e '.[test]'
```
To run the tests:

```bash
pytest
```
This project uses pytest-recording to record Gemini API responses for the tests.
If you add a new test that calls the API you can capture the API response like this:

```bash
PYTEST_GEMINI_API_KEY="$(llm keys get gemini)" pytest --record-mode once
```
You will need to have stored a valid Gemini API key using this command first:

```bash
llm keys set gemini
# Paste key here
```