Skip to content

feat(v2): recording rules of function names #4232

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Jun 20, 2025
Merged

Conversation

alsoba13
Copy link
Contributor

@alsoba13 alsoba13 commented Jun 6, 2025

This PR approaches recording rules of functions.

There's one part I'd like to change (marked as TODO), but as it mostly involves code style, and is not compromising performance yet, I prefer to introduce this PR with a clearer presented core logic.

You may want to read this before jumping to reviewing the code, as this may help:

Observer state

Data is organized and accessed sequentially this way:

Tenant Dataset Series Rows/profiles
tenant1 service_name1 fingerprint1 row1
row2
row3
fingerprint2 row4
service_name2 fingerprint3 row5
fingerprint4 row6
row7
tenant2
...
service_name3
...
fingerprint5
...
row8
...

Observer only see single rows (last column), and need to deduce if a context switch has occurred.
There are 3 types of context/states/scopes:

  • Tenant scope:
    • A tenant switch will cause a flush of the recorded metrics, and fetch and init new recording rules for the new tenant.
  • Dataset scope:
    • Symbols information is scoped to every dataset. During the compaction process, symbols within a dataset are gathered from different blocks and rewritten to a new block. Observe needs to detect dataset changes so symbolical information may be reset.
    • Tenant's rules are filtered and narrowed to those that target the specific dataset.
    • Dataset state holds lookup tables, that relate symbolical information to rules targeting those symbols.
  • Series scope:
    • Every batch of rows with the same fingerprint may match some rules. On a series switch, every recording rule is evaluated to know which ones are relevant to the series.

Observing symbols

We will try to avoid processing symbols that don't matter to any rule. With that purpose, row is Observed before symbols. Observe will compute the state of the series so we know if the series matches some rule with a FunctionName.

Observer will completely ignore symbols when the current Series state has no FunctionName rules. In a later series, if there's one rule that matches and has a FunctionName, symbols will be observed and pointers symbol-to-rule will be computed. As we only want to process symbols once for every new symbol rewritten, we will compute lookup tables of all rules that contain FunctionNames, not only the ones matching the current Series. The reason for this is that in a later series, a new rule with FunctionName may match and maybe all the symbols were already observed, and hence missed.

As a summary, requirements/properties we find:

  • We want to Observe row before observing symbols.
  • Symbols are only observed if we need to.
  • The observed row has old symbols, so we need to observe again to get the rewritten ones. For this reason, I included a Flush function, that re-observes the symbol. I'd like to get rid of this, or in case we keep this 2-step observer, rename it (maybe we can have a Observe(row, state) function instead)

@alsoba13 alsoba13 requested a review from a team as a code owner June 6, 2025 13:29
Comment on lines 75 to 76
// TODO: Find an alternative to this
Flush(ProfileEntry)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to find a way to avoid this. We want to observe row first, then symbols, then observe row again when symbols are rewritten. I tried some strategies but failed to make this interface simpler.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the most confusing I find the naming here:

  • Flush, is used already in the exporter and also there is flush().
  • Observe is not really observing yet until ObserveSymbols or Flush.

Maybe naming it Observe -> Prepare and Flush -> ObserveValues.

In terms of interface, I am not too sure if I would prefer a callback here, but it could be a possibility. It feels weird though with ObserveSymbols:

	if m.observer != nil {
		observeValues := m.observer.Prepare(r)
		defer observeValues()
	}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of deferred observation. Something like:

if observe := evaluate(entry); observe != nil {
    defer observe()
}

I'm curious how it performs on widely spread functions, e.g., Go runtime.mallocgc. What I had in mind is that we could resolve symbols once per aggregate (we'd need to aggregate all sample values first) or at dataset boundaries. That would also allow for slightly clearer contracts. I'm not proposing we implement this unless we encounter performance issues – just something that might be helpful in the future.

Copy link
Contributor

@simonswine simonswine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this @alsoba13.

You have done a good job making your implementation easy to follow and it is well structured. I also think you taken a lot of optimisations already into account, so I think we should merge this at this stage and get some real world usage to inform us what to do next.

I think before you merge, please make sure you cover the new code paths with some tests.

Comment on lines 90 to 91

// StacktraceFilterFunctionNameRegExp functionNameRegExp = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would remove those comments from here, to avoid confusing users.

Comment on lines 96 to 97

// SELF = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would remove those comments from here, to avoid confusing users.

Comment on lines 75 to 76
// TODO: Find an alternative to this
Flush(ProfileEntry)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the most confusing I find the naming here:

  • Flush, is used already in the exporter and also there is flush().
  • Observe is not really observing yet until ObserveSymbols or Flush.

Maybe naming it Observe -> Prepare and Flush -> ObserveValues.

In terms of interface, I am not too sure if I would prefer a callback here, but it could be a possibility. It feels weird though with ObserveSymbols:

	if m.observer != nil {
		observeValues := m.observer.Prepare(r)
		defer observeValues()
	}

@@ -593,6 +593,7 @@ func (f *Phlare) Run() error {
"service_git_ref": serviceGitRef(),
"service_repository": "https://github.com/grafana/pyroscope",
},
//TenantID: "pyroscope-tenant",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean me up

}

func NewRewriter(w *SymDB, r SymbolsReader) *Rewriter {
type SymbolsObserver interface {
// ObserveSymbols is called once new symbols has been rewritten. This method must not modify the symbols.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth noting that Prepare() should be called before

Copy link
Collaborator

@kolesnikovae kolesnikovae left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks Alberto for adding this! It looks great. I think we can iterate on the API and performance improvements in the future

@alsoba13 alsoba13 force-pushed the alsoba13/functions-poc branch from 537ee8f to 89a24b0 Compare June 18, 2025 10:08
@alsoba13 alsoba13 requested review from korniltsev and a team as code owners June 18, 2025 10:08
@alsoba13 alsoba13 changed the base branch from alsoba13/function-totals-rules-proto to main June 18, 2025 10:09
@alsoba13 alsoba13 enabled auto-merge (squash) June 20, 2025 10:53
@alsoba13
Copy link
Contributor Author

Tests to cover observeSymbols will be added in a following PR

@alsoba13 alsoba13 merged commit 7191356 into main Jun 20, 2025
24 checks passed
@alsoba13 alsoba13 deleted the alsoba13/functions-poc branch June 20, 2025 11:05
simonswine added a commit that referenced this pull request Jul 9, 2025
simonswine added a commit that referenced this pull request Jul 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants