get_num_tokens_from_messages broken for Multi-modal messages #879

Open
Jflick58 opened this issue Apr 22, 2025 · 1 comment
Labels: enhancement (New feature or request)

Jflick58 commented Apr 22, 2025

Similar to #491

When trying to use get_num_tokens_from_messages with a ChatVertexAI model and multi-modal inputs, the token count from the Langchain method is wildly inflated (1369082) vs the GenAI SDK and Vertex AI Console token number (3358).

I believe this is because ChatVertexAI falls back on the default get_num_tokens_from_messages implementation inherited from BaseChatModel. That method uses get_buffer_string to flatten every part of a message into a single string: https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/messages/utils.py#L82

Unfortunately, that means the entire base64 string gets tokenized as text, whereas the GenAI SDK (and presumably the Vertex AI console) counts the underlying image bytes.

This makes it very difficult to track token usage and debug token limit exceeded errors.
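To see the scale of the inflation, note that base64 expands raw bytes by a factor of 4/3, and every one of those characters is then tokenized as text. A stdlib-only sketch (the ~4-characters-per-token figure is a rough assumption, not a property of the Gemini tokenizer):

```python
import base64

raw = b"\x00" * 100_000  # stand-in for ~100 KB of image bytes
encoded = base64.b64encode(raw).decode()

print(len(raw))      # 100000 bytes on disk
print(len(encoded))  # 133336 characters once base64-encoded
# Counted as text at a rough ~4 characters per token, the payload alone
# contributes tens of thousands of tokens instead of a small fixed image cost.
print(len(encoded) // 4)  # 33334
```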

Here is an example, with a sample image to use.

import os
import sys
import base64
import logging
from google import genai
from google.genai.types import HttpOptions, Part
from langchain_google_vertexai import ChatVertexAI
from langchain_core.messages import HumanMessage, SystemMessage

# Set up logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

def get_file_size_info(file_path):
    """Get file size info for an image file."""
    file_size = os.path.getsize(file_path)
    logging.info(f"Original file size: {file_size / 1024:.2f} KB")
    return file_size

def encode_image_base64(file_path):
    """Encode an image as base64."""
    with open(file_path, "rb") as f:
        image_bytes = f.read()
        encoded = base64.b64encode(image_bytes).decode()
        logging.info(f"Base64 encoded size: {len(encoded) / 1024:.2f} KB")
        logging.info(f"Base64 encoded length: {len(encoded)} characters")
        return encoded

def test_langchain_method(file_path):
    """Test how LangChain handles images and count tokens."""
    logging.info("\n===== Testing LangChain Method =====")
    
    # Initialize LangChain model
    llm = ChatVertexAI(model="gemini-2.0-flash-001")
    
    # Encode image
    encoded_image = encode_image_base64(file_path)
    
    # Create messages
    messages = [
        SystemMessage(content="You are a helpful assistant."),
        HumanMessage(
            content=[
                {
                    "type": "text",
                    "text": "Please analyze this image:"
                },
                {
                    "type": "image",
                    "source_type": "base64",
                    "data": encoded_image,
                    "mime_type": "image/png",
                }
            ]
        )
    ]
    
    # Count tokens (token_count stays None if counting fails)
    token_count = None
    try:
        token_count = llm.get_num_tokens_from_messages(messages)
        logging.info(f"LangChain token count: {token_count}")
    except Exception as e:
        logging.error(f"Error counting tokens: {e}")

    return token_count

def test_direct_genai_method(file_path):
    """Test how direct Google Generative AI handles images using the Gemini API."""
    logging.info("\n===== Testing Direct GenAI Method =====")
    
    client = genai.Client(http_options=HttpOptions(api_version="v1"))

    # Part.from_bytes expects raw bytes, so pass the file contents directly
    # rather than the base64-encoded string
    with open(file_path, "rb") as f:
        image_bytes = f.read()

    contents = [
        Part.from_bytes(data=image_bytes, mime_type="image/png"),
        "Please analyze this image",
    ]

    response = client.models.count_tokens(
        model="gemini-2.0-flash-001",
        contents=contents,
    )
    return response.total_tokens

def main():
    if len(sys.argv) < 2:
        print("Usage: python img_token_test.py <image_path>")
        sys.exit(1)
    
    file_path = sys.argv[1]
    if not os.path.exists(file_path):
        print(f"File not found: {file_path}")
        sys.exit(1)
    
    # Get file size info
    get_file_size_info(file_path)
    
    # Test both methods
    langchain_tokens = test_langchain_method(file_path)
    direct_tokens = test_direct_genai_method(file_path)
    
    # Print summary
    logging.info("\n===== Summary =====")
    logging.info(f"Image size: {os.path.getsize(file_path) / 1024:.2f} KB")
    logging.info(f"LangChain token count: {langchain_tokens}")
    
    if direct_tokens:
        logging.info(f"Direct GenAI token count: {direct_tokens}")
        logging.info(f"Difference: {abs(langchain_tokens - direct_tokens)} tokens")
        logging.info(f"Ratio between methods: {langchain_tokens / direct_tokens:.2f}")
    
    logging.info(f"Token ratio: {langchain_tokens / (os.path.getsize(file_path) / 1024):.2f} tokens per KB")

if __name__ == "__main__":
    main()

You'll need the following .env to run the example:

GOOGLE_CLOUD_PROJECT="project"
GOOGLE_CLOUD_LOCATION="us-central1"
GOOGLE_GENAI_USE_VERTEXAI=True

To run: uv run --env-file=.env img_token_test.py buddy-photo-pd61clsCVnY-unsplash.jpg


Noting this issue here. If I have time I'll open a PR to fix it, but I figured I'd file it in case someone else picks it up sooner. It seems we need to implement an overridden version of get_num_tokens_from_messages specific to ChatVertexAI.
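As a starting point for that override, the image payloads could be split out of the message content before any text-based counting happens. A stdlib-only sketch of the preprocessing step (the helper name split_parts and the plain-dict message shape are assumptions mirroring the content blocks in the example above; a real override would hand the decoded bytes to the Vertex AI count_tokens endpoint rather than just measuring them):

```python
import base64

def split_parts(content):
    """Separate text blocks from base64 image blocks, decoding the images
    back to raw bytes so the base64 text is never tokenized."""
    texts, images = [], []
    for block in content:
        if block.get("type") == "text":
            texts.append(block["text"])
        elif block.get("type") == "image" and block.get("source_type") == "base64":
            images.append(base64.b64decode(block["data"]))
    return texts, images

content = [
    {"type": "text", "text": "Please analyze this image:"},
    {
        "type": "image",
        "source_type": "base64",
        "data": base64.b64encode(b"\x89PNG" + bytes(96)).decode(),
        "mime_type": "image/png",
    },
]
texts, images = split_parts(content)
print(texts)           # ['Please analyze this image:']
print(len(images[0]))  # 100 raw bytes, not 136 base64 characters
```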

lkuligin (Collaborator) added the enhancement (New feature or request) label May 7, 2025