Skip to content

Make it possible to have a deterministic order in serialized OpenApi files #1314

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
gturri opened this issue Aug 17, 2023 · 19 comments · May be fixed by #2308
Open

Make it possible to have a deterministic order in serialized OpenApi files #1314

gturri opened this issue Aug 17, 2023 · 19 comments · May be fixed by #2308
Assignees
Labels
Milestone

Comments

@gturri
Copy link

gturri commented Aug 17, 2023

Is your feature request related to a problem? Please describe.
I use OpenAPI.NET to serialize instances of OpenApiDocument and store them in a git repo.
So, ideally I would like that when I make a simple change to my API, I get a simple change in the serialized document, in order to have an easy to review git diff.

It is not the case currently, because serialization iterates on some Dictionary and that the order of the iteration is hence not deterministic.
For instance, if I add one endpoint to my API, I would like that my serialized OAS file looks like what it was at my previous generation, with just a few added lines. But in practice I may have more diffs, because my paths and my schemas may end up serialized in a different order.

Describe the solution you'd like
I can already achieve part of what I would like from my client code in the sense that I can get a deterministic order for the Schemas section, if I do

OpenApiDocument doc = BuildMyDoc();
doc.Components.Schemas = new SortedList<string, OpenApiSchema>(doc.Component.Schemas) // <-- that's the important line
doc.SerializeAsV3(new OpenApiYamlWriter(....));

It works rather fine, in the sense that it leads to a serialized schemas section where all schemas are order alphabetically (making it nice for my git diffs, and making the file easier to browse overall). However, it does not feel really "clean" in itself, and this approach does not work for other parts of the document. In particular for the paths; because they are represented by an OpenApiPaths which ends up extending a Dictionary and I could not find a virtual method or an attribute I could override to achieve a similar deterministic order.

Given this context, a solution that would be nice would be to edit IOpenApiSerializable in order to turn SerializeAsV3(IOpenApiWriter writer) into SerializeAsV3(IOpenApiWriter writer, bool enforceAlphabeticalOrder = false).
So for instance OpenApiExtensibleDictionary.SerializeAsV3 could be implemented like

var items = enforceAlphabeticalOrder ? new SortedList<string, T>(this) : this;
foreach(var item in items) {
  ...
}

Having this feature behind a boolean enforceAlphabeticalOrder could make it possible for users interested in this feature to use it, and users not interested in it would not suffer any performance penalty by leaving it to false.

Describe alternatives you've considered
As explained above, I tried to achieve this order from my client code, but faced some hard limitations

Additional context
If the proposed approach looks good to you, and if it could help, I guess I could try to propose a pull request to implement this.

@gturri
Copy link
Author

gturri commented Sep 8, 2023

FWIW (in case someone has the same issue and stumble on this feature request), I found a workaround, which is in a nutshell:

  • serialize the OpenApiDocument to a yaml string
  • deserialize this string to a Dictionary with YamlDotNet
  • sort the values of that Dictionary
  • then serialize that Dictionary to get the file I was looking for.

From a code point of view it looks like this:

var oas = ...

// Ensure the schemas are sorted. This can be done directly on the OpenApiDocument
oas.Components.Schemas = new SortedList<string, OpenApiSchema>(oas.Components.Schemas, StringComparer.OrdinalIgnoreCase);

// Now the trick to sort the paths
using var oasWriter = new StringWriter();
oas.SerializeAsV3(new OpenApiYamlWriter(oasWriter));
var yamlDoc = (Dictionary<object, object>) new YamlDotNet.Serialization.Deserializer().Deserialize(new StringReader(oasWriter.ToString()));
yamlDoc["paths"] = ToDictionaryWithSortedKeyStrings((Dictionary<object, object>) yamlDoc["paths"]);

// Now I can serialize back (with YamlDotNet this time)
using var textWriter = new StreamWriter(outPath);
new YamlDotNet.Serialization.Serializer().Serialize(textWriter, yamlDoc);


// A trick here is that we get a IDictionary<object, object> but we know that the keys are string, and we need to
// cast them in order to apply a deterministic order to it (because if I don't use StringComparer.OrdinalIgnoreCase
// I may end up with a different behavior on my laptop and on my CI
private static SortedList<string, object> ToDictionaryWithSortedKeyStrings(IDictionary<object, object> dictionary)
{
    var sorted = new SortedList<string, object>(StringComparer.OrdinalIgnoreCase);
    foreach (var kvp in dictionary)
    {
        sorted.Add(kvp.Key.ToString(), kvp.Value);
    }
    return sorted;
}

That being said, I guess it makes sense to leave this feature request open, because it would be much more convenient if this could be supported directly by OpenAPI.NET

@mikekistler
Copy link
Member

Related: dotnet/aspnetcore#59809

@captainsafia
Copy link
Member

@RachitMalik12 Can we slot this in for v2?

I don't think we necessarily have to change anything about the in-memory model to support sorting but the serialization step should attempt do lexogaphic sorting on the path values before seralizing the document.

Similarily, we should probably lexographic sort the elements when in each component type? For example, all schemas get sorted alphabetically by ID and all security schemes as well.

@RachitMalik12 RachitMalik12 added this to the v2-Preview14 milestone Mar 14, 2025
@darrelmiller
Copy link
Member

@captainsafia I think I would rather the default behaviour to be to maintain the order provided by the user, but we would be open to providing a Sort() method to reorder the dictionary.

@RachitMalik12
Copy link
Contributor

Relevant for all dictionary based collections

@Michael-Wamae Michael-Wamae self-assigned this Mar 26, 2025
@Michael-Wamae Michael-Wamae linked a pull request Apr 8, 2025 that will close this issue
@baywet
Copy link
Member

baywet commented Apr 8, 2025

My assumption here was that we'd provide either a comparer or a callback on the settings object so it can be used during the serialization.
But @Michael-Wamae 's interpretation was to implement sorting methods here.
@darrelmiller can you clarify please?

@darrelmiller
Copy link
Member

@baywet There are two distinct work items here.

  1. Change the behaviour of dictionaries/lists to maintain the order that items were added.
  2. Add a helper method to allow users to reorder the contents of the dictionaries/lists.

@baywet
Copy link
Member

baywet commented Apr 9, 2025

To maintain the order we could replace all the Dictionaries we have by OrderedDictionary I thought it was only available for net5+ but it goes all the way back to netstandard2.0, which is good news for us.

As for the helper, I'm assuming this is only relevant during serialization? there are no other scenarios which would require control over ordering from the user? correct?

We have other places where we use lists, they maintain order. We also have hashsets, they don't have an OrderedSet equivalent. Would you leave it as is? or also require to use an ordering collection instead?

@darrelmiller
Copy link
Member

@baywet OrderedDictionary sounds good. Yes I think the ordering is only necessary during serialization, but why does it matter if we just have a helper method? Are you thinking of doing this a different way, like in the writer settings?

I looked at some places where we have HashSet and I think we may need to change this too. We use it for Tags and considering Tags are used for generating documentation, the order might end up being significant to doc rendering. Are there any downsides to changing from HashSet? Is it just that we lose the uniqueness constraint?

@baywet
Copy link
Member

baywet commented Apr 9, 2025

@darrelmiller there's something I don't understand here. If we're talking about simply helper methods, and not updating the writer settings, how do you expect people to use them? reorder the DOM and THEN serialize?

As for hashsets, we could simply "implement it" by creating a new HashSet class that effectively stores everything in an OrderedDictionary<T,bool> under the hood, and discard the value all the time. The main downside is the memory footprint and maybe the and increased compute time as well.

@darrelmiller
Copy link
Member

yes. Most people will create the dom objects and then just serialize in the order they created. However, asp net are generating from code so they want to sort before serializing to minimize diffs if code gets reorganized

@baywet
Copy link
Member

baywet commented Apr 15, 2025

So they effectively ONLY need a call back during serialization time. It's ok if deserialization-serialization is NOT symmetric in terms of order (assuming no callback is passed). Which means we don't need to change collections. Correct?

@mikekistler
Copy link
Member

I can confirm that we only need to order these dictionaries during serialization.

How will this work for arrays, such as the global tags array? We will want to generate this in a deterministic order as well, probably sorted by tag name.

@darrelmiller
Copy link
Member

@baywet ASPNET only need it sorted at serialization time. However, we have had requests from other users to have deterministic ordering. I think we should fix the collections so we never have non-deterministic ordering. Having the collections sorted alphabetically is a distinct issue that is the ASP.NET requirement.

@baywet
Copy link
Member

baywet commented Apr 15, 2025

@mikekistler do you think you'd ever have a requirement to sort collections in a different way? (e.g. tags a specific way, but path items another)
As far as I can tell, all OpenAPI model objects have either a "main string property" (e.g. the tag itself), or string key (e.g. media type key, responses key, path items key, etc...). The only annoying exception are schemas. Ignoring that exception for a while, the serialization writer settings could have a property that's a IComparer and this could be used by the library during serialization. What do you think?

If we think this is not enough context to do the sorting, we could instead ONLY replace all collections by ones that maintain order, and let you do the sorting on the object model directly. Doing that would reduce the API surface, also fix the symmetry problem, and give you greater control.

Thoughts?

@mikekistler
Copy link
Member

I don't foresee a need for custom sorting options, but if possible I would like to allow a developer to override the sorted collection with an ordered collection. In other words, I think the two options of a) sorted lexicographically by the natural key, and b) a custom ordered dictionary should cover all the bases.

@mikekistler
Copy link
Member

Can you share which specific properties in the open api description you need the capability to be ordered or sorted?

Let's see:

  • Path items ordered by path
  • All properties of components ordered by key
  • Global tags array sorted by tag name
  • ServerObjects in servers ordered by ?? url ??
  • web hooks ordered by name (key)
  • securityRequirements in a security field ordered by name
  • content of requestBody or response object ordered by media-type (key)
  • headers of requestBody or response object ordered by header name (key)
  • links of response object ordered by link name (key)
  • example objects in examples field ordered by example name (key)
  • entries of an encoding field ordered by property name (key)

That's more than I expected, and I might have missed some.

@bkoelman
Copy link

All properties of components ordered by key

Please don't reorder these. They represent something functional that users care about. For example:

public class Employee
{
    public long Id { get; set; }
    public string DisplayName { get; set; }
    public decimal Salary { get; set; }
}

should be emitted as:

- Id
- DisplayName
- Salary

instead of:

- DisplayName
- Id
- Salary

Users expect to find Id first when debugging or reading log files. And that the properties represent what they declared. When ordered alphabetically in OAS, generated clients will send JSON requests with properties in the same alphabetical order. In our server, that results in hitting a sub-optimal code path. Because JsonReader is forward-only, we expect the type property to come first. The remainder of the properties is validated against that. Our suboptimal fallback code reads the request body twice: first to determine the type, and then again to read other properties and validate them against the type.

The remainder of the OAS file is technical, as it does not represent the business entities from an app, so the ordering should not matter much.

@captainsafia
Copy link
Member

Please don't reorder these. They represent something functional that users care about. For example:

This isn't about reorder properties on schemas. It's about ordering in the for top-level items under the components property in the OpenAPI document.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants