Skip to content

perf(weave): add pre group by ref filtering #4198

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 25, 2025

Conversation

gtarpenning
Copy link
Member

@gtarpenning gtarpenning commented Apr 21, 2025

Description

WB-24547

Add a ref filtering condition pre-group by.

Technically we do not need AND (hasAny(array_concat_agg(calls_merged.input_refs), ['ref']))) as we are guaranteed that the one call start row contain any refs, so any call starts included at the group by will be valid. Lets include it for now just to be safe, it should not add meaningful latency.

SELECT calls_merged.id AS id,
FROM calls_merged
WHERE calls_merged.project_id = 'UHJvamVjdEludGVybmFsSWQ6NDAwMDAyNDg='
        AND ((hasAny(calls_merged.output_refs, ['ref'])) OR length(calls_merged.output_refs) = 0)
        ----------------------------------------------------------------------------------------
        AND ((hasAny(calls_merged.input_refs, ['ref'])) OR length(calls_merged.input_refs) = 0)
        -------------------------------------------------------------------------------------
GROUP BY (calls_merged.project_id, calls_merged.id)
HAVING (((any(calls_merged.deleted_at) IS NULL))
        AND ((NOT ((any(calls_merged.started_at) IS NULL))))
        AND (hasAny(array_concat_agg(calls_merged.output_refs), ['ref'])))
        AND (hasAny(array_concat_agg(calls_merged.input_refs), ['ref'])))
ORDER BY any(calls_merged.started_at) ASC

Without this change we get frequent memory exceeded errors, trace link in ticket. This query is the single most errored query in prod the last week, with over 50 error queries (2.2% failure rate).

Screenshot 2025-04-23 at 6 00 50 PM

Testing

  • query unit tests
  • client tests
  • manual prod test, on a query that repeatedly fails with memory exceeded errors:
    (prod, then branch)
    Screenshot 2025-04-21 at 10 01 26 AM

@circle-job-mirror
Copy link

circle-job-mirror bot commented Apr 21, 2025

@gtarpenning gtarpenning changed the title perf(weave): add pre group by ref filtersing perf(weave): add pre group by ref filtering Apr 21, 2025
@gtarpenning gtarpenning marked this pull request as ready for review April 21, 2025 19:42
@gtarpenning gtarpenning requested a review from a team as a code owner April 21, 2025 19:42
@@ -1077,7 +1085,7 @@ def process_operand(operand: "tsi_query.Operand") -> str:
)


def process_op_name_filter_to_conditions(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better names

@neutralino1
Copy link
Contributor

Related to the query builder, I'm seeing double conditions in certain cases. It likely doesn't affect performance but makes for longer messier queries.
Example:

WHERE calls_merged.project_id = 'UHJvamVjdEludGVybmFsSWQ6NDEyOTg5ODE=' is applied three times below.

WITH filtered_calls AS
  (SELECT calls_merged.id AS id
   FROM calls_merged
   WHERE calls_merged.project_id = 'UHJvamVjdEludGVybmFsSWQ6NDEyOTg5ODE='
   GROUP BY (calls_merged.project_id,
             calls_merged.id)
   HAVING (((any(calls_merged.deleted_at) IS NULL))
           AND ((NOT ((any(calls_merged.started_at) IS NULL))))
           AND (any(calls_merged.parent_id) IS NULL)))
SELECT any(calls_merged.attributes_dump) AS attributes_dump,
       any(calls_merged.deleted_at) AS deleted_at,
       argMaxMerge(calls_merged.display_name) AS display_name,
       any(calls_merged.ended_at) AS ended_at,
       any(calls_merged.exception) AS
exception,
       calls_merged.id AS id,
       array_concat_agg(calls_merged.input_refs) AS input_refs,
       any(calls_merged.inputs_dump) AS inputs_dump,
       any(calls_merged.op_name) AS op_name,
       any(calls_merged.output_dump) AS output_dump,
       array_concat_agg(calls_merged.output_refs) AS output_refs,
       any(calls_merged.parent_id) AS parent_id,
       calls_merged.project_id AS project_id,
       any(calls_merged.started_at) AS started_at,
       any(calls_merged.summary_dump) AS summary_dump,
       any(calls_merged.trace_id) AS trace_id,
       any(calls_merged.wb_run_id) AS wb_run_id,
       any(calls_merged.wb_user_id) AS wb_user_id
FROM calls_merged
LEFT JOIN feedback ON (feedback.weave_ref = concat('weave-trace-internal:///', 'UHJvamVjdEludGVybmFsSWQ6NDEyOTg5ODE=', '/call/', calls_merged.id))
WHERE calls_merged.project_id = 'UHJvamVjdEludGVybmFsSWQ6NDEyOTg5ODE='
  AND calls_merged.project_id = 'UHJvamVjdEludGVybmFsSWQ6NDEyOTg5ODE='
  AND (calls_merged.id IN filtered_calls)
GROUP BY (calls_merged.project_id,
          calls_merged.id)
HAVING (any(feedback.trigger_ref) = 'weave-trace-internal:///UHJvamVjdEludGVybmFsSWQ6NDEyOTg5ODE=/object/monitor-test:sQijAwWRX6NmaIkMEpfOZhHbjqGw3UZwz70jD9msilI')
ORDER BY any(calls_merged.started_at) DESC
LIMIT 50
OFFSET 0

@gtarpenning
Copy link
Member Author

Related to the query builder, I'm seeing double conditions in certain cases. It likely doesn't affect performance but makes for longer messier queries. Example:

That is not good, investigating

@gtarpenning
Copy link
Member Author

AND calls_merged.project_id = 'UHJvamVjdEludGVybmFsSWQ6NDEyOTg5ODE='

@neutralino1

This is apparently by design, introduced 6 months ago in this pr. Looks like I stamped it :/. As you mentioned, it shouldn't hurt performance but it is indeed pretty ugly.

It should be entirely unrelated to the changes in this pr; I will put up another pr to fix that.

@gtarpenning
Copy link
Member Author

Fix pr for @neutralino1's comment: #4245

Copy link
Contributor

@neutralino1 neutralino1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

@gtarpenning gtarpenning merged commit a01a248 into master Apr 25, 2025
141 checks passed
@gtarpenning gtarpenning deleted the griffin/ref-filtering-optimization branch April 25, 2025 01:41
@github-actions github-actions bot locked and limited conversation to collaborators Apr 25, 2025
@gtarpenning gtarpenning restored the griffin/ref-filtering-optimization branch April 29, 2025 20:34
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants