2 questions about con.sql feature (LLM usage) #11212

augcollet · 2025-05-14T13:39:14Z

Hello,

LLMs often struggle to produce valid Ibis code, largely because Ibis, unlike packages such as pandas, is not yet widely used.

To work around this, I am considering guiding the LLM to prefer the con.sql function, which lets it express certain processing steps directly in SQL (generally well understood by the models) while still getting an Ibis object back from each call.

My experiments led me to two questions.

First, I would like to be sure I understand the "dialect" parameter. If, for example, I use the Polars backend and call con.sql(<SQL>, dialect='postgres'), does this mean the SQL can be written in the Postgres dialect and is then "translated" internally for execution on Polars? If so, this would be ideal, because LLMs are very familiar with the Postgres dialect. More generally, it would be enough to have the LLM always write in the dialect it masters best (without even needing to know the actual backend), and Ibis would take care of translating the queries to the target backend on its own.

Second, when using con.sql, is there a way to register the resulting table in the backend under a chosen name, ideally the name of the variable it is assigned to? This would make it easy to chain calls that refer to any previously built table. Here's an example:

Here's the code if needed.

import ibis
import pandas as pd
import polars as pl

# In-memory Polars backend with a single 'clients' table
con = ibis.polars.connect(
    {
        'clients': pl.LazyFrame({
            'id': [1, 2, 3],
            'name': ['Alice', 'Bob', 'Charlie'],
            'age': [25, 30, 35],
            'active': [True, False, True],
            'date': pd.to_datetime(['2022-01-01', '2022-06-15', '2023-03-20']),
            'score': [95.5, 88.2, 92.1],
        })
    }
)
display(con.list_tables())

filtered_clients = con.sql("""
    select * from clients where age = 25
""", dialect='postgres')
# filtered_clients is not registered in list_tables() as 'filtered_clients', but under an autogenerated name
display(filtered_clients.execute(), con.list_tables())

con.sql("""
    select name, score from filtered_clients -- fails: table cannot be found
""", dialect='postgres').execute()

Thank you for your feedback.

Regards,
