-
Notifications
You must be signed in to change notification settings - Fork 170
Restructure AggregationBinder and ComputeBinder #1378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restructure AggregationBinder and ComputeBinder #1378
Conversation
d738dd3
to
1e16908
Compare
ddb1c83
to
5f1336c
Compare
cb8049f
to
272220e
Compare
dbd2e61
to
cb6895f
Compare
src/Microsoft.AspNetCore.OData/Query/Expressions/QueryBinder.cs
Outdated
Show resolved
Hide resolved
d6c58fd
to
dff8b0e
Compare
@gathogojr based on the description, I'm trying to understand the high-level flow of data, expressions and transformations and what/whether dependencies exist between them. From looking at the signature, it appears that the different steps (flattening, BingGroupBy, BindSelect) can be executed independently so long as their respective results are combined at the end in the correct order, is this correct? Does the following diagram correctly capture this flow? graph TD
Input[Input: TransformationNode, Context]
InputQueryable[Input: IQueryable data source]
BindGroupBy(BindGroupBy)
BindSelect(BindSelect)
FlattenProperties(FlattenProperties)
GroupByResult[GroupByExpression]
SelectResult[SelectExpression]
FlatteingResult[FlatteningExpression]
CombineQuery("Compose(FlatteningExpression, GroupByExpression, SelectExpression)")
FinalQuery[FinalQueryExpression]
FinalResult[FinalResult]
ApplyQuery("Apply(Expression, IQueryable)")
Input --> BindGroupBy
Input --> BindSelect
Input --> FlattenProperties
InputQueryable --> FlattenProperties
BindGroupBy --> GroupByResult
BindSelect --> SelectResult
FlattenProperties --> FlatteingResult
GroupByResult --> CombineQuery
SelectResult --> CombineQuery
FlatteingResult --> CombineQuery
CombineQuery --> FinalQuery
FinalQuery --> ApplyQuery
InputQueryable --> ApplyQuery
ApplyQuery --> FinalResult
|
@habbes The diagram doesn't quite capture it correctly, though it almost does. |
Thanks for the clarification. I tried to capture the dependency if But are you saying that the expression returned by |
Is this other information (from flattening process) propagated in the context?: Correct. The context parameter in particular |
/// 3). Nested property with Next | ||
/// 4). Nested property without Next. | ||
/// However, Entity Framework doesn't allow to set different properties for the same type in two places in a lambda expression. | ||
/// Using new type with just new name to workaround that issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean something like:
s => new { "name" : s.Value } // where name is "some new name"
src/Microsoft.AspNetCore.OData/Query/Expressions/BinderExtensions.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One concern I have with the PR is whether we're exposing a rigid public interface that's designed with one primary implementation in mind (EF 6) and forces every implement of the interfaces to fit their customizations to match this paradigm. For example, it seems like we require custom implementations of IAggregationBinder
to also flatten references properties. Is flattening properties a core part of aggregation support or is it mainly a workaround for inefficient queries produces by classic EF? Is this something that every implementer might contend with? Can they skip flattening if they wanted to? or do something post-flattening?
How customizable are the interfaces? How composable are they? Do we expect any customization beyond overriding a few methods the built-in default implementations?
I was wondering whether we could have a model where the aggregation expression translation is broken down into a pipeline of phases. But I'm not sure if this is feasible or if the way aggregations work lend itself naturally to a pipeline model. You could think about and see if it makes sense.
In the pipeline model, each phase or component would have a method that accepts TransformationNode, QueryBinderContext
(and possibly IQueryable
) and transform the context or return a result that the next phase can consume. The BindGroupBy
would represent one phase, BindSelect
another, and even flattening would be another phase. We could create a common interface to represent these phases, and the aggregation binder would execute the phases in the order they appear in the pipeline. So the default pipeline would [Flatten, BindGroupBy, BindSelect]
. These built-in implementations could be public (and possibly overridable). The customer could customize the aggregation by introducing providing their own pipeline, they could derive a new pipeline from the default pipeline by inserting new phases, removing some, etc. Might require some additional work to validate the result to ensure it's still a valid aggregation expression (or assume the user knows what they're doing).
Again, not entirely sure if this works exactly here. It may also be that I misunderstand the current flow. But I'm just thinking of whether we're currently coupling and projecting internal workarounds into public APIs and making customizations rigid and clunky.
TContainer GroupByContainer { get; set; } | ||
|
||
/// <summary>Gets or sets the property container that contains the aggregation properties.</summary> | ||
TContainer Container { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel this would do with a better name. E.g AggregationContainer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KenitoInc This is a change I’d prefer to avoid for two key reasons:
- Scope & Impact – Renaming
Container
would be a sweeping change affecting many areas, making it difficult to justify and review. - Conceptual Clarity – The
Container
property is not limited to aggregation; it is also used in other contexts, such asComputeBinder
, where it stores computed properties. The same name is used inSelectExpand
feature. Naming itAggregationContainer
would introduce unnecessary ambiguity, making its broader purpose less clear.
For these reasons, keeping Container
as-is maintains both consistency and semantic correctness.
src/Microsoft.AspNetCore.OData/Properties/SRResources.Designer.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.AspNetCore.OData/Properties/SRResources.Designer.cs
Outdated
Show resolved
Hide resolved
/// </remakrs> | ||
internal class AggregationPropertyContainer : NamedProperty<object> | ||
/// </remarks> | ||
internal class AggregationPropertyContainer : NamedProperty<object>, IAggregationPropertyContainer<GroupByWrapper, AggregationPropertyContainer> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this class internal? I think we should allow this class to be extensible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KenitoInc cc. @habbes I have explained in QueryBinder.ConvertToAggregationPropertyContainerIfNeeded
method why the coupling between "Aggregation" and "SelectExpand" should be broken first to make it easier to publicly expose the default wrapper (GroupByWrapper
, FlatteningWrapper<T>
, ComputeWrapper
, EntitySetAggregationWrapper
, etc.) and container (AggregationPropertyContainer
) types - to support extending via inheritance, like you're suggesting here: https://github.com/OData/AspNetCoreOData/pull/1378/files#diff-b13ea755c3b30ddde5b8c435f2f908c3328849ce8febbfda51375405df9e1d8eR1725
The coupling between the two features it through the AggregationPropertyContainer
inheriting from the NamedProperty<T>
used in the SelectExpand
feature.
To rehash my points here...
- The
NamedProperty<T>
type, which inherits fromPropertyContainer
, was designed primarily forSelectExpand
scenarios. - It introduces a significant amount of logic that is not required for aggregation.
- Aggregation functionality only depends on the
Name
andValue
properties, as well as theToDictionaryCore
method. - The
ToDictionaryCore
method requires a booleanincludeAutoSelected
parameter, which is irrelevant in aggregation scenarios and is simply ignored. - The inheritance structure complicates making essential wrapper types (e.g.,
GroupByWrapper
,FlatteningWrapper<T>
) and container types (e.g.,AggregationPropertyContainer
) public. - Exposing these types as part of the public API is necessary to enable subclassing
AggregationBinder
, instead of forcing developers to implementIAggregationBinder
from scratch. - Due to the coupling via
NamedProperty<T>
inheritance, making theAggregationPropertyContainer
public would require exposing approximately 128 private types fromPropertyContainer
, which is impractical. - The current coupling doesn't give us much. You might notice that the
TestAggregationPropertyContainer
in the E2E tests project avoids the coupling and works perfectly fine.
To manage the change and mitigate risks, I proposed to expose the wrapper and container types public in a separate pull request. What this pull request achieves for now is to allow people to implement their own wrapper and container types to use with their custom implementations of aggregation binder.
src/Microsoft.AspNetCore.OData/Query/Container/AggregationPropertyContainer.cs
Show resolved
Hide resolved
src/Microsoft.AspNetCore.OData/Query/Expressions/AggregationBinder.cs
Outdated
Show resolved
Hide resolved
} | ||
|
||
private List<MemberAssignment> CreateSelectMemberAssigments(Type type, MemberExpression propertyAccessor, | ||
IEnumerable<GroupByPropertyNode> properties) | ||
private static Expression WrapDynamicCastIfNeeded(Expression propertyAccessExpression) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this returns an Expression
, it a good candidate for public virtual
method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can look into the methods in AggregationBinder
that can be extensible public virtual
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KenitoInc I’d prefer to defer making these methods public until we are ready to expose the default wrappers (e.g., GroupByWrapper
, FlatteningWrapper<T>
, ComputeWrapper<T>
) and container implementations (e.g., AggregationPropertyContainer
).
Reasons for Deferral:
- Dependency on Internal Types – Many of these methods rely on types that are not yet public. For example, while
CreateEntitySetAggregateExpression
could technically be exposed, it references other methods and types—such asEntitySetAggregationWrapper
—that are currently internal. - Incomplete Public API – Exposing these methods prematurely would create an inconsistent API, where some related types remain internal, making it difficult to extend or override the implementation cleanly.
- Planned Design Review – I plan to document and outline the necessary changes in a design proposal to ensure a well-thought-out approach before exposing the default wrapper and container types publicly. I mentioned this in the Next Steps section of the pull request description
For these reasons, making these methods public at this stage would be premature and introduce unnecessary complexity.
src/Microsoft.AspNetCore.OData/Query/Expressions/AggregationBinder.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.AspNetCore.OData/Query/Expressions/AggregationBinder.cs
Outdated
Show resolved
Hide resolved
5ae15b3
to
b84a2aa
Compare
b84a2aa
to
69cd5da
Compare
Here's my take...
I don't believe this is the case. Remember this is the implementation that we currently have out there and it's being applied against EF 6 and EF Core alike. We're not introducing anything new. Before this, we didn't even know that EF Core is able to generate efficient SQL queries even when the flattening process is skipped. But it's still the code we're running out there. I also mentioned that I don't consider to have done exhaustive investigation to determine conclusively that it's an unnecessary exercise when using EF Core. I don't particularly agree with the idea that the interface is rigid. If the flattening part is a big concern, we could remove it from the public API and retain
Before this refactor to support extensibility, I don't believe a lot of the contributors knew that the mitigation against inefficient queries was only required for classic EF. We haven't even exhaustively investigated that.
If someone returns null from the
I personally don't know anything that can be done post-flattening. Are you suggesting we should have a post-flattening "hook" that the implementer can ride on to do something we don't know?
Expound what you mean by interfaces being "customizable"? One can implement the interfaces from scratch and I have demonstrated that in the E2E tests. The E2E tests run against the default implementations of the interfaces alongside the custom implementations. The customizations I expect more of is not the interfaces being implemented from scratch but rather users wanting to override a few methods from the default implementations. I have mentioned in "Next Steps" that we'll expose the default wrapper and container types to support extensibility via inheritance.
Isn't this what we have more or less? The
I don't know how significantly different this proposal is to the current approach. Do you have something like the following in mind? Core Interface public interface IAggregationTranslator
{
int Order { get; }
Expression Process(IQueryable source, TransformationNode transformationNode, QueryBinderContext context);
} Implement the interface in components public class BindGroupBy : QueryBinder, IAggregationTranslator
{
public int Order { get => return 20; }
public IQueryable Process(IQueryable source, TransformationNode node, QueryBinderContext context)
{
// Implementation
}
}
public class BindSelect : QueryBinder, IAggregationTranslator
{
public int Order { get => return 30; }
public IQueryable Process(IQueryable source, TransformationNode node, QueryBinderContext context)
{
// Implementation
}
} Optional flattening component public int Order { get => return 10; }
public class FlattenReferencedProperties : QueryBinder, IAggregationTranslator
{
public IQueryable Process(IQueryable source, TransformationNode node, QueryBinderContext context)
{
// Implementation
}
} Pipeline interface public interface IAggregationPipeline
{
IQueryable Execute(IQueryable source, TransformationNode transformationNode, QueryBinderContext context);
} Pipeline implementation public class AggregationPipeline
{
private readonly List<IAggregationTranslator> components;
public AggregationPipeline(IEnumerable<IAggregationTranslator> components)
{
this.components = components.ToList();
}
public IQueryable Execute(IQueryable source, TransformationNode node, QueryBinderContext context)
{
IQueryable current = source;
foreach (var component in this.components.OrderBy(c => c.Order))
{
current = component.Process(current, node, context);
}
return current;
}
} Configure components as services services.AddTransient<IAggregationTranslator, FlattenReferencedProperties>();
services.AddTransient<IAggregationTranslator, BindGroupBy>();
services.AddTransient<IAggregationTranslator, BindSelect>();
services.AddTransient<IAggregationPipeline, AggregationPipeline>();
// .NET will inject the translator components into the constructor of the AggregationPipeline Aggregation binder public class AggregationBinder
{
private readonly IAggregationPipeline pipeline;
public AggregationBinder(IAggregationPipeline pipeline)
{
this.pipeline = pipeline;
}
public IQueryable Bind(IQueryable source, TransformationNode transformationNode, QueryBinderContext context)
{
return this.pipeline.Execute(source, transformationNode, context);
}
}
public static IQueryable ApplyBind(this AggregationBinder binder, IQueryable source, TransformationNode transformationNode, QueryBinderContext context, out Type resultClrType)
{
context.EnsureFlattenedProperties();
binder.Bind(source, transformationNode, context);
} Or did you have something different in mind? Can you illustrate? Do you consider the pipeline approach as something we should strive for in the first iteration?
This is a first attempt at refactoring the aggregation feature to support extensibility. I'm sure there are many ways we can slice and dice it. We can improve iteratively based on actual customer feedback. If you noticed, the base branch is actually release-8.x. We'd have an opportunity to iterate over both minor and major releases and that iteration can be driven by actual customer feedback. Premature optimization is the root of all evil. Let's not sacrifice simplicity, clarity, flexibility, and correctness for unknown gains too early. |
@gathogojr One of the reasons I would avoid having the Pipelines at this stage and why we should spend more time to plan them out is some of the issues on knowing how they should be organized especially if their inputs and outputs can change the output or outcome which flattening can. |
69cd5da
to
4ab4797
Compare
@gathogojr @marabooy. I like your interpretation of what a pipeline could look like. But I'm okay with not going that route. I'm okay with having an interface that only exposes groupby and select methods, since they are fundamental parts of the aggregation, then flattening could be part of the default implementation that could be overridden or provided as some helper method. |
096fed7
to
85d3d3c
Compare
This pull request restructures the
AggregationBinder
andComputeBinder
used in OData's LINQ translation process to improve extensibility.Restructuring
AggregationBinder
for ExtensibilityThe primary goal is to modularize the three transformation steps:
GroupBy
expressions – Converts$apply
grouping into LINQ GroupBy.Select
expressions – Maps aggregation results into structured output.This restructuring introduces well-defined interfaces (
IGroupByWrapper
,IAggregationPropertyContainer
,IFlattentingWrapper
, etc.) to enforce a type-safe contract, ensuring extensibility while preventing unintended mix-and-match of unrelated implementations.After an OData
$apply
expression is transformed into a parse tree represented by aTransformationNode
, it undergoes a three-step process to translate it into a LINQ expression tree:$apply
clause into the corresponding LINQGroupBy
expression.$apply
clause into the corresponding LINQSelect
expression.The following sections will explain each step in detail, using illustrations while also describing the APIs introduced to support extensibility..
Example OData Query
Consider the following OData query containing a
$apply
expression:Class diagram for the data model:

It's easier to cover the introduced interfaces by going over the second step - constructing
GroupBy
expression, before the first step - pre-flattening referenced properties.Constructing the
GroupBy
ExpressionTo support the construction of the
GroupBy
expression, we expose theBindGroupBy
method with the following signature:The
BindGroupBy
method takes the parse tree represented by theTransformationNode
and constructs a lambda expression representing the body of theGroupBy
expression. For the OData$apply
expression introduced earlier, the expression will look as follows:The process involves:
The wrapper used in constructing the
GroupBy
expression must inherit fromDynamicTypeWrapper
and implement theIGroupByWrapper<TContainer, TWrapper>
interface:TContainer
represents the aggregation property container, whileTWrapper
represents the groupby wrapper. This establishes a type-safe pairing.Similarly,
IAggregationPropertyContainer<TWrapper, TContainer>
is defined as:This bidirectional dependency ensures that each wrapper and container type can only interact with valid implementations.
Here's how implementations of the two interfaces would look:
Pre-flattening Referenced Properties
To support flattening of referenced properties, we expose the
FlattenReferencedProperties
method:The
AggregationFlatteningResult
type encapsulates the outcome of the flattening process.For the OData
$apply
expression above, the result of flattening the original query source would be:The wrapper used in flattening must inherit from
DynamicTypeWrapper
and implement theIGroupByWrapper
andIFlatteningWrapper<T>
interfaces:Constructing the
Select
ExpressionTo support the construction of the
Select
expression, we expose theBindSelect
method:The
BindSelect
method constructs a lambda expression representing the body of theSelect
expression. For the OData$apply
expression introduced earlier, the expression will look as follows:Bringing Everything Together
The
ApplyBind
method in BinderExtensions combines everything to construct the final modified query source. TheFlattenReferenceProperties
method is invoked conditionally, followed byBindGroupBy
andBindSelect
.For the OData
$apply
expression introduced earlier, the final query source will look as follows:Why is Flattening Referenced Properties Necessary?
Flattening referenced properties is crucial when navigation properties are referenced in the aggregate transformation. For example, consider the following OData query referencing a navigation property in the aggregate clause:
Comparing the LINQ expressions and SQL queries generated with and without flattening demonstrates the efficiency gained by flattening.
Flattened - LINQ Expression:
Non-flattened - LINQ Expression:
Flattened - SQL Query:
Non-flattened - SQL Query:
Notice how we end up with nested subqueries when we don't flatten the referenced navigation properties. The subqueries become more pronounced with more navigation properties in the
$apply
expression.Consider the following
$apply
expression referencing two navigation properties:Here's how the generated SQL query would look like when we don't take the navigation properties through the flattening process:
These nested subqueries are inefficient and would gradually lead to performance degradation.
Note: The nested subqueries behavior is currently reproducible with EF6. The EF Core team might have figured out how to generate efficient queries even when properties are not flattened. I'll create a ticket to scrutinize this further to determine whether to conditionally perform the flattening step.
Restructuring
ComputeBinder
for ExtensibilityOverview of
compute
transformationThe
compute
transformation feature enables the dynamic addition of computed or derived properties to an OData query result. This allows users to define expressions that generate new virtual properties based on existing data within the query pipeline. The compute transformation supports:+
), subtraction (-
), multiplication (*
), division (/
), etc.$apply
– Works in conjunction withgroupby
andaggregate
for advanced computations.The
ComputeBinder
class is responsible for binding an ODatacompute
transformation (represented as aComputeTransformationNode
) into a LINQ expression that gets applied dynamically to the query.Introducing
IComputeBinder
for extensibilityTo improve flexibility and maintainability, this restructuring introduces the
IComputeBinder
interface, which provides a contract for binding acompute
transformation. This allows custom implementations of the binder in case different computation strategies are needed.IComputeBinder
interface definition:The introduction of this interface decouples the ComputeBinder logic from a specific implementation, making it easier to extend or replace in future updates.
Steps in
compute
transformation bindingThe process of transforming an OData
compute
expression into a LINQ query projection involves the following steps:Instance
property is initialized to the original source object, allowing access to existing properties referenced in the compute expression.NamedPropertyExpression
, which pairs the computed property alias with its corresponding expression.Wrapper implementation for
compute
transformationTo support the compute transformation, the wrapper class used for computed properties must:
DynamicTypeWrapper
to support dynamic property additions.IGroupByWrapper
to maintain compatibility with$apply
transformations.IComputeWrapper<T>
to provide structured access to the source object and EDM model:The default implementation of the compute wrapper,
ComputeWrapper<T>
, extendsGroupByWrapper
and indirectly implementsIGroupByWrapper
by subclassingGroupByWrapper
:This ensures that:
IEdmEntityObject
is not strictly required, as demonstrated in theTestComputeWrapper<T>
used in tests.Testing
Unit tests and E2E tests were added to verify the changes. In addition, custom implementations of
IAggregationBinder
-TestAggregationBinder
andTestComputeBinder
- were added to demonstrate how to custom implementations ofIAggregationBinder
andIComputeBinder
can be created and injected into the DI container. In addition, E2E tests have been added to confirm that the default and custom implementations ofIAggregationBinder
andIComputeBinder
both achieve the same result. The E2E tests use both an EF Core in-memory and SqlServer database.Next Steps
Update 1
Following this investigation, which revealed that EF Core 6.x and above can successfully translate queries involving navigation properties, deep projections, and includes within aggregation clauses, the
IAggregationBinder
interface has been refactored. Specifically, the API responsible for flattening referenced properties has been moved into a newIFlatteningBinder
interface, which can now be optionally implemented:The default aggregation binder currently implements this new interface. Moving forward, we plan to gather more information about the providers our customers are using to determine whether query flattening should remain opt-in or be enabled by default. In the meantime, customers who wish to avoid the overhead of flattening can implement a custom aggregation binder that omits the
IFlatteningBinder
functionality.Here is an example:
Alternatively, consumers may implement
IAggregationBinder
from scratch, without implementingIFlatteningBinder
, for full control over the binding behavior.Additional Details
Fixes OData/WebApi#2108
Fixes #1416