
Multimodal support for pdf documents #1014

Open
@starkdg

Description


The multimodal support for images appears to work fine. However, adding PDF files results in an error: "Invalid argument provided to Gemini: 400 The document has no pages." (The PDF files themselves are fine; I tried multiple PDFs and they all produce the same error.)

The langchain-genai docs only explain how to pass multimodal data for images, but the how-to in the LangChain docs describes the following approach for PDFs: https://python.langchain.com/docs/how_to/multimodal_inputs/#documents-pdf. I get the error both with LangChain's how-to and with the code below, which uses the langchain-google module.
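
For reference, this is roughly the pattern I tried from that how-to, which base64-encodes the raw bytes of the PDF file rather than any extracted text (variable names are mine):

import base64
from langchain_core.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="models/gemini-2.0-flash")

# Base64-encode the raw bytes of the PDF file itself
with open("test.pdf", "rb") as f:
    pdf_data = base64.b64encode(f.read()).decode("utf-8")

message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe this document."},
        {"type": "file", "source_type": "base64", "mime_type": "application/pdf", "data": pdf_data},
    ]
)
response = llm.invoke([message])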

My question is: is this a feature that just isn't implemented yet, or is there some design limitation that makes it prohibitive? Does anyone know of a workaround, or am I just not doing it right?

The Google Gemini site at ai.google.dev appears to use the google.genai.Part type in the contents parameter, but I don't know how that relates to the langchain-genai module.
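
For comparison, the example on ai.google.dev (if I'm reading it correctly) passes the raw PDF bytes as a Part via the google-genai SDK, roughly like this:

import pathlib
from google import genai
from google.genai import types

client = genai.Client()  # picks up the API key from the environment
pdf_bytes = pathlib.Path("test.pdf").read_bytes()

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=[
        types.Part.from_bytes(data=pdf_bytes, mime_type="application/pdf"),
        "Describe this document.",
    ],
)
print(response.text)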

Gemini models are equipped to handle very large PDFs, so I would think this would be a valuable feature.

Thanks in advance for any pointers.

import base64
from PyPDF2 import PdfReader
from dotenv import load_dotenv
from langchain_core.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

load_dotenv()

pdf_file = "test.pdf"
llm = ChatGoogleGenerativeAI(model="models/gemini-2.0-flash")

# Extract the text of the first page and base64-encode that text
# (not the raw bytes of the PDF file)
reader = PdfReader(pdf_file)
page = reader.pages[0]
pdf_text = page.extract_text()
pdf_data = base64.b64encode(pdf_text.encode("utf-8")).decode("utf-8")

# Send the encoded data as an application/pdf file block
message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe this document."},
        {"type": "file", "source_type": "base64", "mime_type": "application/pdf", "data": pdf_data},
    ]
)

response = llm.invoke([message])
print(response.content)

Traceback (most recent call last):
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/langchain_google_genai/chat_models.py", line 192, in _chat_with_retry
    return generation_method(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/google/ai/generativelanguage_v1beta/services/generative_service/client.py", line 868, in generate_content
    response = rpc(
               ^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 294, in retry_wrapped_func
    return retry_target(
           ^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 156, in retry_target
    next_sleep = _retry_error_helper(
                 ^^^^^^^^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/google/api_core/retry/retry_base.py", line 214, in _retry_error_helper
    raise final_exc from source_exc
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 147, in retry_target
    result = target()
             ^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/google/api_core/timeout.py", line 130, in func_with_timeout
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/google/api_core/grpc_helpers.py", line 78, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InvalidArgument: 400 The document has no pages.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data/Projects/ai_agent_project/test_multimodal.py", line 25, in <module>
    response = llm.invoke([message])
               ^^^^^^^^^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/langchain_google_genai/chat_models.py", line 1255, in invoke
    return super().invoke(input, config, stop=stop, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 372, in invoke
    self.generate_prompt(
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 957, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 776, in generate
    self._generate_with_cache(
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 1022, in _generate_with_cache
    result = self._generate(
             ^^^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/langchain_google_genai/chat_models.py", line 1342, in _generate
    response: GenerateContentResponse = _chat_with_retry(
                                        ^^^^^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/langchain_google_genai/chat_models.py", line 210, in _chat_with_retry
    return _chat_with_retry(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/tenacity/__init__.py", line 336, in wrapped_f
    return copy(f, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/tenacity/__init__.py", line 475, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/tenacity/__init__.py", line 376, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/tenacity/__init__.py", line 398, in <lambda>
    self._add_action_func(lambda rs: rs.outcome.result())
                                     ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/tenacity/__init__.py", line 478, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/home/david/.local/share/venvs/venv-langchain/lib/python3.11/site-packages/langchain_google_genai/chat_models.py", line 204, in _chat_with_retry
    raise ChatGoogleGenerativeAIError(
langchain_google_genai.chat_models.ChatGoogleGenerativeAIError: Invalid argument provided to Gemini: 400 The document has no pages.
