Summary and count cause performance issues on large datasets #37

@markbrough

Description

With very large datasets (e.g. 13m rows), summary and count appear to significantly slow down the response:

babbage/babbage/cube.py, lines 89 to 96 at 9416105:

# Count
count = count_results(self, prep(cuts,
                                 drilldowns=drilldowns,
                                 columns=[1])[0])
# Summary
summary = first_result(self, prep(cuts,
                                  aggregates=aggregates)[0].limit(1))

Without generating the summary and count, the response is returned 2-3 times faster.

It would be useful to make returning these properties optional, e.g. by adding an optional &simple parameter to the request; a rough sketch of that idea follows below.
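
As a minimal sketch only (the method signature and the simple keyword here are assumptions for illustration, not babbage's actual API), the quoted block could be guarded by an opt-out flag threaded down from the request layer:

# Sketch: assumes the quoted block lives inside the cube's aggregate method and
# that a hypothetical `simple` keyword is passed down when the client sends
# &simple on the query string.
def aggregate(self, aggregates=None, drilldowns=None, cuts=None,
              order=None, page=None, page_size=None, simple=False):
    ...
    count, summary = None, None
    if not simple:
        # Count (skipped entirely when simple=True)
        count = count_results(self, prep(cuts,
                                         drilldowns=drilldowns,
                                         columns=[1])[0])
        # Summary (skipped entirely when simple=True)
        summary = first_result(self, prep(cuts,
                                          aggregates=aggregates)[0].limit(1))
    ...

The endpoint could then leave the corresponding fields out of the response whenever the flag is set, so the two extra queries are never issued against the large table.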
