Skip to content

Trace connection durations #5146

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

crocodile-dentist
Copy link
Contributor

@crocodile-dentist crocodile-dentist commented Jun 20, 2025

Description

It may be useful for some diagnostic purposes to know how long lived
an outbound hot connection has been. This patch introduces tracing
when a hot peer is demoted or errors out with the active time in seconds.

Checklist

Quality

  • Commit sequence makes sense and have useful messages, see ref.
  • New tests are added and existing tests are updated.
  • Self-reviewed the PR.

Maintenance

  • Linked an issue or added the PR to the current sprint of ouroboros-network project.
  • Added labels.
  • Updated changelog files.
  • The documentation has been properly updated, see ref.

@crocodile-dentist crocodile-dentist self-assigned this Jun 20, 2025
@crocodile-dentist crocodile-dentist requested a review from a team as a code owner June 20, 2025 15:30
@github-project-automation github-project-automation bot moved this to In Progress in Ouroboros Network Jun 20, 2025
@crocodile-dentist crocodile-dentist marked this pull request as draft June 20, 2025 15:35
@crocodile-dentist crocodile-dentist force-pushed the mwojtowicz/track-connection-durations branch from 425fb9a to 2a6e3e0 Compare June 20, 2025 15:58
@crocodile-dentist crocodile-dentist marked this pull request as ready for review June 20, 2025 17:06
@coot coot added connection-manager Issues / PRs related to connection-manager tracing Issues / PRs related to tracing labels Jun 23, 2025
Copy link
Contributor

@coot coot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The life time of inbound / outbound connection ought to be measured by the connection manager, as it knows enough about a connection to do that. It can also measure lifespan of a connection (which could be longer than either of them).

However, connection manager doesn't know about protocol temperature, so it cannot measure how long a peer was hot or remote hot - and this should be logged by peer selection actions and inbound governor respectively.

@crocodile-dentist
Copy link
Contributor Author

Are we actually interested in knowing the connection duration at the handler level? I thought about that and didn't find the metric useful enough in general and so decided against implementing re your first point. If we want to track outbound duration to big ledger peers, the connection duration might not give an accurate result if the connection was used via the inbound side strictly at least some of the time. This implementation will give the connection durations separately for outbound and inbound sides via patches in the inbound governor and peer state actions (via peer connection handle, so the duration can be exposed to peer selection governor easily in the future if we do decide to implement a max duration policy).

@crocodile-dentist crocodile-dentist force-pushed the mwojtowicz/track-connection-durations branch 6 times, most recently from 993087c to dee490c Compare June 26, 2025 09:36
A trace is emitted whenever a hot outbound peer is demoted
or closed (possibly due to an error), giving the duration in seconds of how long the peer
has been in hot mode.

Some other refactoring to aid comprehensibility
@crocodile-dentist crocodile-dentist force-pushed the mwojtowicz/track-connection-durations branch from dee490c to 8951682 Compare June 26, 2025 10:50
Copy link
Contributor

@karknu karknu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this to be usable in the way we discussed I think that the lifetime for the hot connections must be included in the DebugSt data too.

We are looking for the life time of the longest lived connections, and they may still be active. That is emitting a trace event for hot demotions won't capture these connections.

@crocodile-dentist crocodile-dentist force-pushed the mwojtowicz/track-connection-durations branch from 84598db to 692bf9f Compare June 27, 2025 10:46
@crocodile-dentist crocodile-dentist force-pushed the mwojtowicz/track-connection-durations branch from 692bf9f to c718863 Compare June 27, 2025 11:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
connection-manager Issues / PRs related to connection-manager tracing Issues / PRs related to tracing
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

3 participants