Description
I have already built a RAG pipeline following the LangChain tutorial below, and it works well with Gemini 2.0 Flash:
https://python.langchain.com/docs/tutorials/rag/
Now that 2.5 Flash has launched, I changed the model name to gemini-2.5-flash, but it returns an empty result like the following (note finish_reason is MAX_TOKENS and output_tokens is 0):
content='' additional_kwargs={} response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'MAX_TOKENS', 'model_name': 'gemini-2.5-flash', 'safety_ratings': []} id='xxxxxxxxxxxxxxxxxxx' usage_metadata={'input_tokens': 39236, 'output_tokens': 0, 'total_tokens': 42307, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 3071}}
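Reading the usage_metadata above, the reasoning token count (3071) appears to consume essentially the entire max_output_tokens=3072 budget, leaving nothing for the visible answer. A quick sanity check over those numbers (plain dict copied from the output above, no API call; this is my reading of the metadata, not a documented guarantee):

```python
# Values copied verbatim from the usage_metadata printed above.
usage = {
    "input_tokens": 39236,
    "output_tokens": 0,
    "total_tokens": 42307,
    "output_token_details": {"reasoning": 3071},
}
max_output_tokens = 3072  # the cap passed to ChatGoogleGenerativeAI below

reasoning = usage["output_token_details"]["reasoning"]
visible = usage["output_tokens"]

print(f"reasoning tokens: {reasoning} of a {max_output_tokens}-token budget")
print(f"visible answer tokens: {visible}")
print(f"budget left for the answer: {max_output_tokens - reasoning}")
```

With only 1 token of headroom after reasoning, an empty `content` with finish_reason MAX_TOKENS would be the expected outcome.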
2.5 Flash works fine when I ask a question directly, without RAG:
# Imports for the snippet (HarmCategory/HarmBlockThreshold are re-exported by langchain_google_genai):
from langchain_google_genai import ChatGoogleGenerativeAI, HarmBlockThreshold, HarmCategory

llm = ChatGoogleGenerativeAI(
    model='gemini-2.5-flash',
    safety_settings={
        HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
        HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,
    },
    api_key=google_api_key,
    temperature=0.3,
    top_p=0.7,
    max_output_tokens=3072,
    timeout=40,
    max_retries=2,
)
import time

def generate(state: State):
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    messages = prompt.invoke({"input": state["question"], "context": docs_content})
    try:
        start_time = time.time()
        response = llm.invoke(messages)
        print(messages)
        print(response)  # empty content when run through the RAG chain
        elapsed_time = time.time() - start_time
        print(elapsed_time)
        print(llm.invoke('explain yourself in 500 words'))  # returns text as expected
    except Exception as e:
        print(f"Error: {e}")
        raise  # without this, the return below would hit an undefined `response`
    return {"answer": response.content}
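For what it's worth, the failure pattern above can be detected mechanically, so generate() could at least distinguish "empty/blocked answer" from "thinking consumed the whole budget". A minimal sketch (hit_reasoning_cap is my own helper name, not a library API; the fields mirror the response metadata printed above):

```python
def hit_reasoning_cap(content, response_metadata, usage_metadata):
    """Heuristic: empty answer + MAX_TOKENS finish, with all output spent on reasoning."""
    reasoning = usage_metadata.get("output_token_details", {}).get("reasoning", 0)
    return (
        not content
        and response_metadata.get("finish_reason") == "MAX_TOKENS"
        and usage_metadata.get("output_tokens", 0) == 0
        and reasoning > 0
    )

# The failing response from this report:
capped = hit_reasoning_cap(
    content="",
    response_metadata={"finish_reason": "MAX_TOKENS"},
    usage_metadata={"output_tokens": 0, "output_token_details": {"reasoning": 3071}},
)
print(capped)  # True
```

Inside generate() this could be called with response.content, response.response_metadata, and response.usage_metadata before returning, e.g. to retry with a larger max_output_tokens.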
Other info:
python 3.10
langchain 0.3.26
langchain-community 0.3.26
langchain-core 0.3.68
langchain-google-genai 2.1.6
google-ai-generativelanguage 0.6.18
langgraph 0.5.0