Skip to content

Rewrite computeSelfTime to improve performance on Trace Statistics page #2767

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 26, 2025
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,6 @@

import memoizeOne from 'memoize-one';
import _uniq from 'lodash/uniq';
import DRange from 'drange';
import { Trace, Span } from '../../../types/trace';
import { ITableSpan } from './types';
import colorGenerator from '../../../utils/color-generator';
Expand Down Expand Up @@ -45,21 +44,47 @@ function getChildOfSpans(parentID: string, allSpans: Span[]): Span[] {

function computeSelfTime(span: Span, allSpans: Span[]): number {
if (!span.hasChildren) return span.duration;
// We want to represent spans as half-open intervals like [startTime, startTime + duration).
// This way the subtraction preserves the right boundaries. However, DRange treats all
// intervals as inclusive. For example,
// range(1, 10).subtract(4, 8) => range([1, 3], [9-10])
// length=(3-1)+(10-9)=2+1=3
// In other words, we took an interval of length=10-1=9 and subtracted length=8-4=4.
// We should've ended up with length 9-4=5, but we got 3.
// To work around that, we multiply start/end times by 10 and subtract one from the end.
// So instead of [1-10] we get [10-99]. This makes the intervals work like half-open.
const spanRange = new DRange(10 * span.startTime, 10 * (span.startTime + span.duration) - 1);
const children = getChildOfSpans(span.spanID, allSpans);
children.forEach(child => {
spanRange.subtract(10 * child.startTime, 10 * (child.startTime + child.duration) - 1);
});
return Math.round(spanRange.length / 10);

let spanSelfTime = span.duration;
let previousSpanEndTime = span.startTime;

const children = getChildOfSpans(span.spanID, allSpans).sort((a, b) => a.startTime - b.startTime);

const parentSpanEndTime = span.startTime + span.duration;

for (let index = 0; index < children.length; index++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the alternative algorithm I had in mind:

  • lastEnd=parent.start
  • selfTime = parent.duration
  • for each sorted child
    • if child.start > parent.end break
    • start = max(lastEnd, span.start)
    • end = min(parent.end, span.end)
    • selfTime = selfTime - (end-start)
    • lastEnd = end

I asked Gemini to implement it, it got this:

function calculateSelfTime(parent, children) {
  let lastEnd = parent.startTime;
  let selfTime = parent.duration;

  // Sort children by their start time
  children.sort((a, b) => a.startTime - b.startTime);

  const parentEndTime = parent.startTime + parent.duration;

  for (const child of children) {
    // If child starts after parent ends, no more overlap is possible
    if (child.startTime >= parentEndTime) {
      break;
    }

    const childEndTime = child.startTime + child.duration;

    // Calculate the effective start of the overlap
    const start = Math.max(lastEnd, child.startTime);

    // Calculate the effective end of the overlap
    const end = Math.min(parentEndTime, childEndTime);

    // Only subtract if there's actual overlap and it's within the parent's current "uncovered" time
    if (end > start) {
      selfTime -= (end - start);
    }

    // Update lastEnd to the maximum of its current value and the child's end time
    // This correctly handles overlapping children and ensures we don't double-count
    lastEnd = Math.max(lastEnd, end);
  }

  // Ensure selfTime doesn't go below zero due to floating point inaccuracies or edge cases
  return Math.max(0, selfTime);
}

Copy link
Contributor Author

@DamianMaslanka5 DamianMaslanka5 May 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is exactly the same algorithm, just implemented a bit differently.

Initially wanted to keep the code dumb, so I wanted to avoid using Math.max/min. But using Math.max/Math.min should be fine, because comments with examples are very helpful. Now I don't have any preference for which version is easier to understand.

Please let me know which version you prefer. I made the changes in 03532d7

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the biggest cost to come from sorting the children repeatedly, making the whole thing an O(N^3) complexity, while I think it's possible to do it with O(N^2) if we sort first, the construct the tree from that sorted order.

As for the algorithm, min/max makes it a lot easier (for me at lest) to reason as to what's happening, since it makes the logic linear, vs. reasoning about which edge case in the diagrams represent and whether all edge cases are actually covered. In my algo I only need to think about lastEnd boundary moving up and each span measuring against it, not against any "previous" spans

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the biggest cost to come from sorting the children repeatedly, making the whole thing an O(N^3) complexity, while I think it's possible to do it with O(N^2) if we sort first, the construct the tree from that sorted order.

computeSelfTime is no longer visible in a profiler, so even if it can be faster, I just don't think it is worth spending time on this, because any change to this function will not be visible. Also see #2767 (comment)

const child = children[index];

const spanEndTime = child.startTime + child.duration;
const spanStartsAfterParentEnded = child.startTime > parentSpanEndTime;
const spanEndsBeforePreviousSpan = spanEndTime < previousSpanEndTime;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the naming is confusing. Can we stick with parentXyz and childXyz to make it easier to follow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO these names are accurate.

parent |..................|
child    |.......|
child     |.....|                      - spanEndsBeforePreviousSpan is true, this span is skipped
child                         |......| - spanStartsAfterParentEnded is true, this span is skipped

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

current naming is just spanXyz, which is confusing because there are two spans involved in every step and it's not clear which one is referenced by spanXyz name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I misunderstood your initial comment, this is a general issue for the whole function, not just the lines you commented on? I see now what that I used span in the parameter of the function and span for children.

Can you confirm that these names are less confusing? Or if you have other suggestion it would be helpful if you could provide examples.

  • previousSpanEndTime => previousChildEndTime
  • spanEndTime => childEndTime
  • spanEndsBeforePreviousSpan => childEndsBeforePreviousChild
  • spanStartsAfterParentEnded => childStartsAfterParentEnded
  • spanStartsBeforePreviousSpanEnds => childStartBeforePreviousChildEnds
  • spanSelfTime => parentSelfTime?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's better

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I updated the names in 5935bd1

if (spanStartsAfterParentEnded || spanEndsBeforePreviousSpan) {
continue;
}

let nonOverlappingStartTime = child.startTime;
let nonOverlappingDuration = child.duration;

const spanStartsBeforePreviousSpanEnds = child.startTime < previousSpanEndTime;
if (spanStartsBeforePreviousSpanEnds) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add add diagrams to illustrate the scenarios? something like

// child(n-1):  |.......|
// child(n):         |......|

Copy link
Contributor Author

@DamianMaslanka5 DamianMaslanka5 May 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about using format like this? Also see my previous comment, looks like example for L59,L60 will be also useful.

parent |.....................|
child    |.......|
child        |.....|                  - spanStartsBeforePreviousSpanEnds is true
child                |.....|          - spanStartsBeforePreviousSpanEnds is false

And example for L75

parent |......................|
child                      |.....|        - last span included in self time calculation
child                       |.........|   - skipped

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, in every if() there is an implied configuration which could be represented by a diagram like that to make the logic easier to follow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the comments in 7b7286d

const diff = previousSpanEndTime - child.startTime;
nonOverlappingDuration = child.duration - diff;
nonOverlappingStartTime = previousSpanEndTime;
}
// last span which can be included in self time calculation, because it ends after parent span ends
else if (spanEndTime > parentSpanEndTime) {
const diff = spanEndTime - parentSpanEndTime;

nonOverlappingDuration = child.duration - diff;
spanSelfTime -= nonOverlappingDuration;
break;
}

spanSelfTime -= nonOverlappingDuration;
previousSpanEndTime = nonOverlappingStartTime + nonOverlappingDuration;
}

return spanSelfTime;
}

function computeColumnValues(trace: Trace, span: Span, allSpans: Span[], resultValue: StatsPerTag) {
Expand Down
Loading