Skip to content

[Draft] Semantic Logging Support #6394

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

Aaronontheweb
Copy link
Member

Changes

Prototype semantic logging support and a major performance improvement to Akka.NET logging in general.

Designed to be source compatible with Akka.NET, but not binary compatible.

Work in progress

Checklist

For significant changes, please ensure that the following have been completed (delete if not relevant):

Latest dev Benchmarks

Include data from the relevant benchmark prior to this change here.

This PR's Benchmarks

Include data from after this change here.

@Aaronontheweb Aaronontheweb force-pushed the logging-internal-structs branch from eaf69a2 to 941cc17 Compare February 9, 2023 23:04
@Aaronontheweb
Copy link
Member Author

This PR isn't fully compiling yet and I've had to re-review my own code since I started writing this back in October, 2022, but I wanted to offer a high-level summary of these changes, their motivations, and goals.

Goals & Motivations

Logging has occasionally reared its head as a significant bottleneck in user-designed actors, mostly as a result of excessive allocations, stringification, and subsequent GC pressure.

Our current EventStream logging design is ideal for offloading most of the object allocation into a dedicated background actor, which solves many of those performance problems, but the "close to the metal" logging infrastructure (ILoggingAdapter) is designed in a largely inextensible way from the point of view of the ActorSystem. It uses a one-size-fits-all method that results in a large number of allocations inside user callsites and has to be customized on a per-instance basis if you want support for custom formatters for things like semantic logging.

Our goals with this PR are to help improve performance and extensibility while retaining our asynchronous log-consumption design and its benefits. In addition to those - we also want these new APIs to be at least source-compatible with the ones that are currently in-use through Akka.NET's internals as well as what users are doing in the wild - this will require a bit of planning but no reflection magic or any hacks of the sort.

Goals

  1. Reduce allocations inside all logging calls - namely unnecessary boxing and array allocations caused by the params object[] overloads all of the ILoggingAdapter interface methods call today.
  2. Make native semantic logging feasible - this means supporting the ability to decouple how an object is logged from how it's formatted at an infrastructure level, rather than making it a per-actor or per-logging call concern. I.E. no more needing to specify the SerilogLogFormatter in each individual actor that wants to use semantic logging.
  3. Make the front-end logging infrastructure (ILoggingAdapter) configurable across the ActorSystem - if you have a piece of context you want to be able to pass into all of your logs (i.e. the TraceId, if available), there should be a means of configuring this centrally in Akka.NET. Currently that is not the case in v1.4.
  4. Introduce all ILoggingAdapter API changes in a source-compatible way - this API is used heavily everywhere and I don't think it's feasible to have users update all of their logging calls. Therefore, we should make this a "compile and go" experience to the extent that it's achievable.

Binary Compatibility

Binary compat may not be achievable due to how "greedy" the existing params object[] methods are in the current API. In theory, we could make this binary compatible by putting all of the new "best practice" APIs on their own overloads and preserve the original API signatures - and that'd mean all users would need to manually upgrade to the new way of doing things in order to reap the benefits.

My present thinking is that if we can't have our cake and eat it too, we're better off taking some binary compat API risk and automatically upgrading the performance of everyone's existing infrastructure so long as they just compile their code with v1.5 installed, but I'm open to arguments to the contrary.

Design

I'll need to put a longer draft together in a subsequent comment, but basically we're making the internal logging state and the formatting strongly typed without boxing.

@Aaronontheweb
Copy link
Member Author

Design

In this design, we do the following.

All Logs Still Flow Through EventStream

We're going to keep our existing fundamental logging pipeline design, which works as follows:

existing-akkadotnet-logging-pipeline

This design has a number of advantages:

  1. Most major allocations (i.e. string formatting) happen inside Logger implementation, not your actors;
  2. LogEvents can be broadcast to multiple consumers - Loggers, actors, etc;
  3. Heavy duty log writing (i.e. transmitting logs to LogStash, writing to file) happens out of band from your actors; and
  4. There's no possibility of log writing having effects such as "global synchronization" on [Akka.NET](http://akka.net/) actors.

Point 4 is the most important facet of this design from a safety standpoint and is why we're going introduce any customizations that allow users to do foreground logging directly inside the actors. We've seen dozens of users absolutely footgun themselves by calling ILogger / Serilog et al directly inside actors without realizing that they were essentially synchronizing their entire ActorSystem around a single lock guarding the logging pipeline or around writing to a single log file. Orleans users have run into issues Microsoft.Extensions.Logging downstreams can cause a similar problems in their applications.

The ILoggingAdapter and the EventStream design we currently use is vastly safer, more fundamentally performant, and less prone to these types of problems than anything else in the .NET ecosystem. If you're not using ILoggingAdapter today, you are absolutely playing with fire.

Keeping this design, in addition to the benefits I've listed above, helps ensure continuity and compatibility between all prior versions of [Akka.NET](http://akka.net/) and v1.5.

LogEvent and Strongly Typed Logging Paylaods

Currently, our Akka.Event.LogEvent types look like this:

public abstract class LogEvent : INoSerializationVerificationNeeded
{
    /// <summary>
    /// Initializes a new instance of the <see cref="LogEvent" /> class.
    /// </summary>
    protected LogEvent()
    {
        Timestamp = DateTime.UtcNow;
        Thread = Thread.CurrentThread;
    }

    /// <summary>
    /// The exception that caused the log event. Can be <c>null</c>
    /// </summary>
    public Exception Cause { get; protected set; }

    /// <summary>
    /// The timestamp that this event occurred.
    /// </summary>
    public DateTime Timestamp { get; private set; }

    /// <summary>
    /// The thread where this event occurred.
    /// </summary>
    public Thread Thread { get; private set; }

    /// <summary>
    /// The source that generated this event.
    /// </summary>
    public string LogSource { get; protected set; }

    /// <summary>
    /// The type that generated this event.
    /// </summary>
    public Type LogClass { get; protected set; }

    /// <summary>
    /// The message associated with this event.
    /// </summary>
    public object Message { get; protected set; }

    /// <summary>
    /// Retrieves the <see cref="Akka.Event.LogLevel" /> used to classify this event.
    /// </summary>
    /// <returns>The <see cref="Akka.Event.LogLevel" /> used to classify this event.</returns>
    public abstract LogLevel LogLevel();
}

It's the object LogEvent.Message that is somewhat problematic here - this was done in order to keep designs simple back in the very earliest days of [Akka.NET](http://akka.net/), but the byproduct is that it results in a lot of boxing and array allocations inside the foreground ILoggingAdapter.Log operations, which can have a measurable impact on the way system and user actors execute.

So instead, we're going to restructure LogEvents to take a dependency on ILogContents - an interface that includes all necessary default type information along with the function required to format them.

public interface ILogContents
{
    public LogLevel LogLevel { get; }

    public Exception Exception { get; }
    
    /// <summary>
    /// The timestamp that this event occurred.
    /// </summary>
    public DateTime Timestamp { get; }

    /// <summary>
    /// The thread where this event occurred.
    /// </summary>
    public int ThreadId { get; }

    /// <summary>
    /// The source that generated this event.
    /// </summary>
    public LogSource LogSource { get; }

    
    /// <summary>
    /// Renders an underlying <see cref="LogEntry{TState}"/> in a non-generic fashion.
    /// </summary>
    /// <returns></returns>
    string Format();
}

We’re going to include a single default implementation of ILogContents - a readonly struct that accepts a generic state argument and a formatting function:

/// <summary>
/// Raw data payload produced from any logging call.
/// </summary>
/// <typeparam name="TState">The state for the specified log message.</typeparam>
public readonly struct LogEntry<TState> : ILogContents
{
    public LogEntry(LogLevel logLevel, TState message, Func<TState, Exception, string> formatter, 
        LogSource source, int threadId, DateTime timestamp, Exception exception = null)
    {
        LogLevel = logLevel;
        Message = message;
        Formatter = formatter;
        LogSource = source;
        ThreadId = threadId;
        Timestamp = timestamp;
        Exception = exception;
    }

    public LogLevel LogLevel { get; }
    
    public TState Message {get;}
    
    public Exception Exception { get; }
    
    /// <summary>
    /// The timestamp that this event occurred.
    /// </summary>
    public DateTime Timestamp { get; }

    /// <summary>
    /// The thread where this event occurred.
    /// </summary>
    public int ThreadId { get; }

    /// <summary>
    /// The source that generated this event.
    /// </summary>
    public LogSource LogSource { get; }

    public Func<TState, Exception, string> Formatter { get; }

    public string Format()
    {
        return Formatter(Message, Exception);
    }
}

The format string, however it’s defined, is included as part of the Formatter function.

The idea behind these changes are to help reduce allocations from the ILoggingAdapter callsites - for instance, today we have the following:

ILoggingAdapter.Info(string format, params object[] args);

So if I make the following call somewhere inside one of my actors:

_log.Info("Received [{0}] events from [{1}] at {2}", 1, Sender, DateTime.UtcNow);

This will allocate:

  1. 1 object[]
  2. Box an int and a DateTime

Under the LogEntry regime, here’s how this method call would look:

ILoggingAdapter.Info<T1, T2, T3>(string format, T1 param1, T2 param2, T3 param3);

And this would produce the following LogEntry:

 public LogEntry(LogLevel.Info, ValueTuple<int, IActorRef, DateTime> message, Func<ValueTuple<int, IActorRef, DateTime>, Exception, string> formatter, 
        LogSource source, int threadId, DateTime timestamp, Exception exception = null){}

No boxing, no array allocations. The formatter function would likely still be a string.Format under this scenario.

💡 It would also be possible to customize the `ILoggingContent` by explicitly calling the `ILoggingAdapter.Log` method, which can accept a custom state parameter and a custom formatter.

Make ILogMessageFormatter Configurable at the ActorSystem Level

A system-wide ILogMessageFormatter that can be configured via HOCON or Akka.Hosting - all ILogMessageFormatters will be required to handle your legacy string.Format calls but can also be optionally configured to support semantic logging.

N.B. The default ILogMessageFormatter will be able to support semantic logging but won't attempt to perform any cute optimizations, template caching, or whatever else. Configure a custom one like Serilog's if this matters to you. We won't accept pull requests adding this functionality either - not our concern.

All ILogMessageFormatters will be expected to return a method with the following signature:

Func<string, TState, Exception, string> :

  • A format string;
  • The primary state being formatted;
  • Optionally - an Exception to be formatted; and
  • returns a complete string that can be printed or exported.

To make this work, we have to change the way that ILoggingAdapter is currently instantiated today:

/// <summary>
/// Creates a new logging adapter using the specified context's event stream.
/// </summary>
/// <param name="context">The context used to configure the logging adapter.</param>
/// <param name="logMessageFormatter">The formatter used to format log messages.</param>
/// <returns>The newly created logging adapter.</returns>
public static ILoggingAdapter GetLogger(this IActorContext context, ILogMessageFormatter logMessageFormatter = null)
{
    var logSource = LogSource.Create(context, context.System);
    return new BusLogging(context.System.EventStream, logSource.Source, logSource.Type, logMessageFormatter ?? DefaultLogMessageFormatter.Instance);
}

This callsite is frequently called today, so the only real change here would be substituting the DefaultLogMessageFormatter static member with something that’s attached to the ActorSystem.

@Aaronontheweb
Copy link
Member Author

Benchmarks so far, bearing in mind we haven't made any internal optimizations yet. It's going to be ugly.

dev

BenchmarkDotNet=v0.13.2, OS=Windows 10 (10.0.19044.2486/21H2/November2021Update)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=7.0.100
  [Host]     : .NET 7.0.0 (7.0.22.51805), X64 RyuJIT AVX2
  Job-WCODYP : .NET 7.0.0 (7.0.22.51805), X64 RyuJIT AVX2

InvocationCount=1  UnrollFactor=1  
Method Mean Error StdDev Median Ratio RatioSD Gen0 Gen1 Gen2 Allocated Alloc Ratio
LogInfoNoParams 163.2 ns 3.25 ns 8.22 ns 160.3 ns 1.00 0.00 0.0110 0.0060 0.0010 64 B 1.00
LogInfo1Params 383.5 ns 19.88 ns 58.61 ns 401.0 ns 2.36 0.39 0.0250 0.0120 - 160 B 2.50
LogInfo2Params 401.8 ns 9.62 ns 28.21 ns 410.3 ns 2.47 0.22 0.0300 0.0150 - 192 B 3.00
LogInfoWithException 164.6 ns 3.26 ns 7.62 ns 160.9 ns 1.01 0.07 0.0100 0.0050 - 64 B 1.00
LogInfoWithException1Params 383.3 ns 20.53 ns 60.52 ns 398.9 ns 2.36 0.38 0.0260 0.0130 0.0010 160 B 2.50
LogInfoWithException2Params 408.3 ns 9.78 ns 28.82 ns 416.9 ns 2.51 0.22 0.0300 0.0150 - 192 B 3.00

This PR

BenchmarkDotNet=v0.13.2, OS=Windows 10 (10.0.19044.2486/21H2/November2021Update)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=7.0.100
  [Host]     : .NET 7.0.0 (7.0.22.51805), X64 RyuJIT AVX2
  Job-NPJPYK : .NET 7.0.0 (7.0.22.51805), X64 RyuJIT AVX2

InvocationCount=1  UnrollFactor=1  
Method Mean Error StdDev Median Ratio RatioSD Gen0 Gen1 Gen2 Allocated Alloc Ratio
LogInfoNoParams 345.2 ns 24.55 ns 72.40 ns 373.1 ns 1.00 0.00 0.0180 0.0090 0.0010 112 B 1.00
LogInfo1Params 419.5 ns 11.53 ns 33.99 ns 435.6 ns 1.29 0.37 0.0260 0.0130 - 168 B 1.50
LogInfo2Params 579.0 ns 51.58 ns 152.09 ns 589.1 ns 1.85 0.83 0.0310 0.0150 - 200 B 1.79
LogInfoWithException 339.4 ns 24.61 ns 72.56 ns 352.0 ns 0.98 0.04 0.0180 0.0090 0.0010 112 B 1.00
LogInfoWithException1Params 420.3 ns 11.60 ns 34.19 ns 434.2 ns 1.29 0.36 0.0260 0.0130 - 168 B 1.50
LogInfoWithException2Params 568.0 ns 45.92 ns 135.41 ns 582.7 ns 1.81 0.77 0.0310 0.0150 - 200 B 1.79

/// </summary>
public class LogMessage
/// <typeparam name="TState">The state for the specified log message.</typeparam>
public readonly struct LogEntry<TState> : ILogContents
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any cast from struct to interface is a boxing, using struct here instead of class is a tradeoff?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is something I'm debating - to make this whole system more performant, we have to either commit to a larger breaking change or backwards compatibility.

  1. More performance - use generics for log state all the way down and discard the old LogMessage types. We can make the impact on the ILoggingAdapter side of things minimal but all custom back-end logging infrastructure will be broken. Changes needed to fix will be minor but will require new source code.
  2. Backwards compat - this is what I opted for in this approach so far: preserving backwards compat to the extent that it's possible. That's why there's boxing and other issues.

I'm going to go down the more radical route and see what needs to be done here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it matters where the boxing happens - we 100% don't care if it happens in the consuming logger actor. We care a lot if it happens in the foreground.

@Aaronontheweb Aaronontheweb mentioned this pull request Feb 15, 2023
6 tasks
@Aaronontheweb Aaronontheweb deleted the logging-internal-structs branch February 17, 2023 21:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants