DataTable performance enhancement #5953

fancidev · 2025-07-11T02:03:12Z

fancidev
Jul 11, 2025

The DataTable class is really handy for displaying tabular data, but I find it somewhat sluggish to scroll when populated with a moderate amount of data (say a few hundred rows by a few dozen columns).

A bit of searching suggests this is a known limitation. A popular suggestion is to use textual-fastdatatable, but that package is not perfect either. The main issues I find are that it offers a different, immutable programming interface, and that it requires pyarrow, which seems like overkill.

Playing with the built-in DataTable, it seems the amount of sluggishness scales mainly with the number of columns rather than with the number of rows. Since the number of columns in a table is usually bounded, this suggests that some sort of micro-optimization could help to improve the UI performance without changing any API or behavior.

I did some quick profiling using the (great) Python profiler py-spy. It seems much of the time is spent in tight loops whose overhead could be drastically reduced with a small amount of micro-optimization. I think this profiling-guided optimization approach could be promising.

In summary, I'd like to ask if you have any prior study on the performance of DataTable, and if you'd welcome a PR to improve the performance. I intend the PR to contain localized optimizations that retain the current API and behavior. I expect the PR to reduce the UI sluggishness "sufficiently" for moderate-sized data.

2025-07-11T02:03:30Z

github-actions[bot]
bot Jul 11, 2025

We found the following entry in the FAQ which you may find helpful:

Why doesn't Textual look good on macOS?

Feel free to close this issue if you found an answer in the FAQ. Otherwise, please give us a little time to review.

This project is developed and maintained by Will McGugan. Consider sponsoring Will's work on this project (and others).

This is an automated reply, generated by FAQtory


fancidev · 2025-07-12T07:19:44Z

fancidev
Jul 12, 2025
Author

I wrote a test script (call it dt.py) to demonstrate the performance. Use python dt.py to run interactively, or python dt.py -t to run an automated sequence of actions and output a timing summary. To profile the script, first pip install pyinstrument, and then run pyinstrument -o t1.html dt.py -t to output an HTML report. (The profiler is sampling-based but still adds about 100% overhead. Its statistics should nonetheless scale with the non-profiled execution.)

Test script:

import asyncio
import os
import sys
import time

from textual.app import App, ComposeResult
from textual.containers import Horizontal, Vertical
from textual.widgets import Button, DataTable, Label, ListItem, ListView


class MyApp(App):
    CSS = """
    #left { width: 30%; }
    #right { width: 70%; }
    """
    dimensions = [
        (10000, 1),
        (1000, 10),
        (500, 20),
        (200, 50),
        (100, 100),
        (50, 200),
        (20, 500),
        (10, 1000),
    ]

    def compose(self) -> ComposeResult:
        with Horizontal():
            with Vertical(id="left"):
                yield ListView(*[
                    ListItem(Label(f"{num_rows} x {num_columns}"))
                    for num_rows, num_columns in self.dimensions
                ], id="list")
                yield Button(f"PID={os.getpid()}", id="go")
            with Horizontal(id="right"):
                yield DataTable(id="table")

    def on_button_pressed(self):
        index = self.query_one(ListView).index
        num_rows, num_columns = self.dimensions[index]

        table: DataTable = self.query_one(DataTable)
        table.clear(columns=True)
        for j in range(num_columns):
            table.add_column(f"Column{j}", key=f"C{j}")
        for i in range(num_rows):
            table.add_row(*[
                f"Cell({i},{j})" for j in range(num_columns)
            ], key=f"R{i}")


def interactive_mode():
    MyApp().run()


async def headless_mode():
    app = MyApp()
    async with app.run_test() as pilot:
        for i, (num_rows, num_columns) in enumerate(app.dimensions):
            await pilot.click("#list", offset=(0, i))
            t1 = time.monotonic()
            await pilot.click("#go")
            t2 = time.monotonic()
            for j in range(0, num_rows, num_rows // 10):
                app.query_one(DataTable).move_cursor(row=j)
                await pilot.pause()
            t3 = time.monotonic()
            print(f"Dimension {num_rows} x {num_columns}: "
                  f"load={t2-t1:.2f}s "
                  f"scroll={t3-t2:.2f}s")


if __name__ == "__main__":
    if "-t" in sys.argv[1:]:
        asyncio.run(headless_mode())
    else:
        interactive_mode()


fancidev · 2025-07-12T09:24:52Z

fancidev
Jul 12, 2025
Author

The test script above produces the following statistics on my Apple M1 computer:

Dimension 10000 x 1: load=0.31s scroll=0.35s
Dimension 1000 x 10: load=0.23s scroll=0.48s
Dimension 500 x 20: load=0.25s scroll=0.82s
Dimension 200 x 50: load=0.42s scroll=1.90s
Dimension 100 x 100: load=0.99s scroll=2.75s
Dimension 50 x 200: load=3.11s scroll=3.49s
Dimension 20 x 500: load=15.58s scroll=8.07s
Dimension 10 x 1000: load=32.39s scroll=32.60s

After some profiling and code reading, the performance bottleneck appears to lie with the _render_cell() function, which is called once for each cell in a row. Each call in turn calls _get_row_renderables(), which recomputes every cell in the row. Therefore, if $m$ is the number of columns, rendering a row costs $O(m^2)$: there are $m$ calls, each doing $O(m)$ work. This becomes slow as the number of columns grows, and most of the work is redundant.
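To make the quadratic growth concrete, here is a minimal stand-in for the call pattern described above. It is not Textual's actual code: the function names only mirror the ones mentioned, and the cell "renderable" is just a string. It counts how many cell renderables get built while rendering a single row.

```python
# Stand-in for the call pattern described above; NOT Textual's actual code.
cell_builds = 0  # counts how many cell renderables get built


def get_row_renderables(row: list) -> list:
    """Build the renderable for every cell in the row (hypothetical helper)."""
    global cell_builds
    cell_builds += len(row)
    return [f"[{value}]" for value in row]


def render_cell(row: list, column: int) -> str:
    """Render one cell, but rebuild the whole row to get it."""
    return get_row_renderables(row)[column]


def render_row(row: list) -> list:
    """Render a row one cell at a time, as a per-cell loop would."""
    return [render_cell(row, column) for column in range(len(row))]


def count_builds(num_columns: int) -> int:
    """Return the number of cell builds needed to render one row."""
    global cell_builds
    cell_builds = 0
    render_row([f"Cell(0,{j})" for j in range(num_columns)])
    return cell_builds


for m in (10, 100, 1000):
    print(f"{m} columns -> {count_builds(m)} cell builds")
```

Under these assumptions, a 10-fold increase in columns gives a 100-fold increase in cell builds, which matches the shape of the timings above.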

A simple workaround is to add a cache for _get_row_renderables(), as is done for a few of the other methods. With this cache, the performance is massively improved (by roughly 40 times in the last test case, from about 32s to under 1s):

Dimension 10000 x 1: load=0.36s scroll=0.34s
Dimension 1000 x 10: load=0.24s scroll=0.51s
Dimension 500 x 20: load=0.22s scroll=0.48s
Dimension 200 x 50: load=0.25s scroll=0.72s
Dimension 100 x 100: load=0.36s scroll=0.66s
Dimension 50 x 200: load=0.39s scroll=0.59s
Dimension 20 x 500: load=0.68s scroll=0.50s
Dimension 10 x 1000: load=0.73s scroll=0.80s

I've submitted PR #5959 for this. Separately, there are a number of other changes that could improve the performance by a constant factor, but they involve less-localised refactoring. I might revisit them later.
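For illustration only, the caching idea can be sketched with functools.lru_cache on a stand-in function. This is not the code in the PR: ROWS, the update_count parameter, and the cache size are assumptions made up for this sketch. The counter is part of the cache key so that an edit to the table can invalidate stale entries.

```python
from functools import lru_cache

# Hypothetical stand-in table: row key -> cell values.
ROWS = {"R0": tuple(f"Cell(0,{j})" for j in range(1000))}
row_builds = 0  # counts full-row rebuilds


@lru_cache(maxsize=1024)
def get_row_renderables(row_key: str, update_count: int) -> tuple:
    """Build every cell in the row once per (row_key, update_count) pair."""
    global row_builds
    row_builds += 1
    return tuple(f"[{value}]" for value in ROWS[row_key])


def render_cell(row_key: str, column: int, update_count: int) -> str:
    """Render one cell; repeated calls for the same row hit the cache."""
    return get_row_renderables(row_key, update_count)[column]


# Rendering all 1000 cells of the row triggers a single row build.
rendered = [render_cell("R0", j, update_count=0) for j in range(1000)]
print(row_builds)

# Bumping the (assumed) update counter after an edit forces one rebuild.
ROWS["R0"] = ("changed",) + ROWS["R0"][1:]
print(render_cell("R0", 0, update_count=1), row_builds)
```

With the cache in place, rendering a row costs one row build plus $m$ cache lookups instead of $m$ row builds.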


fancidev · 2025-07-14T02:30:59Z

fancidev
Jul 14, 2025
Author

Regarding potential "constant factor" improvements: under Windows, the single largest time consumer, accounting for about 50% of the time, is console.size, which recomputes the console size every time it is called. This call appears to be rather expensive under Windows.

A workaround is to cache the computed size within the console. As a quick hack, simply setting console.size = console.size in DataTable._on_idle solves the problem. It is not obvious how to invalidate this cache, but it seems to get updated anyway when the Windows Terminal window is resized.
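The self-assignment trick can look odd, so here is a small stand-in that shows why it works. SlowConsole is a hypothetical class written for this sketch, not the real console object. It assumes, as the hack above implies, that the size setter stores a fixed value which the getter then returns without querying the terminal again.

```python
from typing import Optional, Tuple


class SlowConsole:
    """Hypothetical stand-in for a console whose size lookup is expensive."""

    def __init__(self) -> None:
        self._pinned: Optional[Tuple[int, int]] = None
        self.queries = 0  # number of simulated terminal queries

    def _query_terminal(self) -> Tuple[int, int]:
        """Pretend to ask the OS for the terminal size (the costly part)."""
        self.queries += 1
        return (120, 40)

    @property
    def size(self) -> Tuple[int, int]:
        """Return the pinned size if there is one, else query the terminal."""
        if self._pinned is not None:
            return self._pinned
        return self._query_terminal()

    @size.setter
    def size(self, new_size: Tuple[int, int]) -> None:
        """Pin the size so later reads skip the terminal query."""
        self._pinned = new_size


console = SlowConsole()
for _ in range(3):
    console.size  # each read queries the terminal
print(console.queries)

console.size = console.size  # one more query, then the value is pinned
for _ in range(1000):
    console.size  # no further queries
print(console.queries)
```

The right-hand side of the assignment reads the size once, and the setter stores it, so every later read is a plain attribute lookup.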

With the console.size problem out of the way, the remaining time consumers are various magic methods that I assume to be "overhead". They might be worth looking into after DataTable._get_row_renderables and console.size. For the time being, I simply subclass DataTable to patch the previous two points.


fancidev · 2025-07-14T03:07:43Z

fancidev
Jul 14, 2025
Author

I found out why console.size was called so frequently by DataTable. In _render_cell, the code calls "self.app.console.options.update*", which first retrieves the console dimensions and then overrides the width and/or height. This has two potential issues. First, the console dimension retrieval is redundant. Second, and more importantly, I think the screen dimensions should be retrieved via the app's driver rather than via the physical console.

The second point also explains why the app appears to update correctly even if console.size is hardcoded. The actual console window resize event is posted by the win32 driver under Windows, and the layouts are computed with that.

I ended up subclassing App and overriding its _on_idle method to cache console.size if there is a _resize_event pending.
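A rough sketch of that idea, using a stand-in class rather than Textual's App: SketchApp, pending_resize, and on_idle are hypothetical names for this example, and the real private attributes may behave differently. The cached size is refreshed only when a resize is pending (or on the very first idle), so ordinary reads never touch the terminal.

```python
class SketchApp:
    """Stand-in for an app with an idle hook; NOT Textual's App class."""

    def __init__(self) -> None:
        self.terminal_size = (80, 24)  # what the OS would report
        self.cached_size = None        # last size read from the terminal
        self.pending_resize = False    # set when a resize event arrives
        self.queries = 0               # number of simulated terminal queries

    def query_terminal(self):
        """Pretend to ask the OS for the terminal size (the costly part)."""
        self.queries += 1
        return self.terminal_size

    def resize(self, width, height):
        """Simulate the driver posting a resize event."""
        self.terminal_size = (width, height)
        self.pending_resize = True

    def on_idle(self):
        """Refresh the cached size only if a resize is pending or none is cached."""
        if self.pending_resize or self.cached_size is None:
            self.cached_size = self.query_terminal()
            self.pending_resize = False

    @property
    def size(self):
        """Return the cached size, querying only if nothing is cached yet."""
        if self.cached_size is None:
            self.cached_size = self.query_terminal()
        return self.cached_size


app = SketchApp()
app.on_idle()            # first idle: one query
for _ in range(100):
    app.size             # served from the cache
app.resize(100, 30)      # driver reports a new size
app.on_idle()            # resize pending: one more query
print(app.size, app.queries)
```

Tying the refresh to the resize event answers the invalidation question raised earlier: the cache is only ever stale between a resize and the next idle.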
