Skip to content

Traces - Add traces unique count based on cardinality aggregation #2435

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

sthobis
Copy link
Contributor

@sthobis sthobis commented Apr 30, 2025

Description

  • Count unique traces using a cardinality aggregation on traceId.
  • Use the unique trace count values for both data_prepper and custom_data_prepper.
  • Display "warning" in the data_prepper view if there are more trace records than are currently shown.
Screen.Recording.2025-04-30.at.16.12.35.720p.mov

Issues Resolved

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@sthobis sthobis changed the title Add traces unique count based on cardinality aggregation Traces - Add traces unique count based on cardinality aggregation Apr 30, 2025
@TackAdam
Copy link
Collaborator

TackAdam commented May 1, 2025

Thank you for your contribution! Investigating an edge case on large data set where the cardinality count is differing between root spans and traces with this change {Root spans reading 57,794,167: Trace reading: 88,867,865}. Will update tomorrow.

Root spans:
RootSpans
Traces:
Traces

@synhershko
Copy link

Thank you for your contribution! Investigating an edge case on large data set where the cardinality count is differing between root spans and traces with this change {Root spans reading 57,794,167: Trace reading: 88,867,865}. Will update tomorrow.

@TackAdam probably due to this? https://www.elastic.co/docs/reference/aggregations/search-aggregations-metrics-cardinality-aggregation#_counts_are_approximate

I don't see a point in providing an actual number tbh. You could show an actual number if it's below 40k (the maximum you can get accurately enough) and otherwise show 10k+.

@synhershko
Copy link

BTW, is there a reason why the UI components and features used are different between the Data Prepper and Custom Source views? It looks like Custom Sources is enjoying the use of the updated components also used in Discovery etc. "Data Prepper" should be a specific case of a "Custom Source", I don't see a reason to differentiate the two?

@TackAdam
Copy link
Collaborator

TackAdam commented May 2, 2025

BTW, is there a reason why the UI components and features used are different between the Data Prepper and Custom Source views? It looks like Custom Sources is enjoying the use of the updated components also used in Discovery etc. "Data Prepper" should be a specific case of a "Custom Source", I don't see a reason to differentiate the two?

Custom source is an experimental feature introduced in version 2.17. It allowed us to iterate rapidly on the user experience, even with breaking changes. The team is actively working to bring this feature out of experimental status, at which point we can update and align the Data Prepper experience accordingly.

@TackAdam
Copy link
Collaborator

TackAdam commented May 2, 2025

@TackAdam probably due to this? https://www.elastic.co/docs/reference/aggregations/search-aggregations-metrics-cardinality-aggregation#_counts_are_approximate

I don't see a point in providing an actual number tbh. You could show an actual number if it's below 40k (the maximum you can get accurately enough) and otherwise show 10k+.

After meeting with UX, we agreed that displaying the exact count is not particularly useful and can be replaced with a "10k+" indicator. This change enables the use of the more efficient cardinality aggregation. As a result, the existing check here: https://github.com/opensearch-project/dashboards-observability/blob/main/public/components/trace_analytics/requests/traces_request_handler.ts#L161 can safely be removed, since the exact count is no longer required.

Next steps:

  1. Update the warning message to indicate that "10k+" results were retrieved for Traces / All Spans / Root Spans / Service Entry Spans.
  2. Remove the redundant count aggregation.

cc: @sthobis

@sthobis
Copy link
Contributor Author

sthobis commented May 3, 2025

< maxDisplayRows (10k) > maxDisplayRows (10k)
Screenshot 2025-05-03 at 17 54 27 Screenshot 2025-05-03 at 17 54 50

@sthobis sthobis force-pushed the sthobis/traces-unique-count branch from 77643e8 to fcf9064 Compare May 3, 2025 12:27
@synhershko
Copy link

I think the message is confusing still. The paranthesis showing 500, and "500 out of ..." is incorrect. The page size is 10 not 500. We should just say "Traces ({exact number})" or "Traces (10,000+)" and remove the yellow text.

@TackAdam
Copy link
Collaborator

TackAdam commented May 5, 2025

I think the message is confusing still. The paranthesis showing 500, and "500 out of ..." is incorrect. The page size is 10 not 500. We should just say "Traces ({exact number})" or "Traces (10,000+)" and remove the yellow text.

Thanks for the note! After review I am in agreement that the text should be adjusted. I was thinking along the lines of “Results out of {number under 10,000}” and “Results out of 10,000+” because the exact number is displayed next to the type already. I do think this callout is still necessary to indicate to the user that more than what is displayed was factored into the query call.

@TackAdam
Copy link
Collaborator

TackAdam commented May 5, 2025

@sthobis

Pending discussion on above, adjusting wording to “Results out of {number under 10,000}” and “Results out of 10,000+” per above comment.

Few minor bugs left to fix, but it’s looking good.

  1. The Data Prepper page is showing the exact result not the 10k+ indicator.
  2. Switching from Data Prepper to Custom Source (If Traces were selected earlier causes the “No Matches” indicator to show up beneath it, and the “Load more” functionality to not be rendered when reaching the last page) (Can also reproduce by going to Custom source “Traces” then switching to services page. Then back to Traces page.)

Video example of bugs:

Bug1+2.mov
Bug3.mov

Match data_prepper warning message to custom_data_prepper

Fix datagrid pagination/empty state on traces mode

Signed-off-by: Thobi Sinaga <[email protected]>
@sthobis
Copy link
Contributor Author

sthobis commented May 6, 2025

Updated

  • Update warning message wordings
  • Match data_prepper warning message to custom_data_prepper
  • Fix datagrid pagination/empty state on traces mode
Screen.Recording.2025-05-06.at.19.23.03.720p.mov

@TackAdam
Copy link
Collaborator

TackAdam commented May 6, 2025

@sthobis Great work! All bugs fixed.
Last adjustment is just for the text. We want to remove the row count from the start of the warning message for both cases {Custom Source {Spans/Traces} and DataPrepper}.
The reasoning is that we are already displaying the row count next to the mode.
Screenshot 2025-05-06 at 10 16 04 AM
Screenshot 2025-05-06 at 10 16 14 AM

IE fixing
{${rowCount} results out of ${totalCountText}}
to {Results out of ${totalCountText}}

Should be good to merge after those two minor changes!

@@ -8,6 +8,8 @@ export const NANOS_TO_MS = 1e6;

export const MILI_TO_SEC = 1000;

export const MAX_DISPLAY_ROWS = 10000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for moving this to constants

@TackAdam TackAdam merged commit f213bbb into opensearch-project:main May 7, 2025
17 of 18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants