Skip to content

Include only PackageReferences included at runtime in deps.json #46218

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
May 16, 2025

Conversation

Forgind
Copy link
Contributor

@Forgind Forgind commented Jan 22, 2025

Fixes dotnet/roslyn#76797
Fixes #48944

PrivateAssets isn't what we should be looking for to determine that we don't need an assembly in the deps.json file. This instead looks for ExcludeAssets and IncludeAssets to see if it will be in the output directory and includes it in the deps.json if so.

Copy link
Contributor

Added needs-breaking-change-doc-created label because this PR has the breaking-change label.

When you commit this breaking change:

  1. Create and link to this PR and the issue a matching issue in the dotnet/docs repo using the breaking change documentation template, then remove this needs-breaking-change-doc-created label.
  2. Ask a committer to mail the .NET Breaking Change Notification DL.

Tagging @dotnet/compat for awareness of the breaking change.

@ericstj
Copy link
Member

ericstj commented Jan 23, 2025

I think this is the right change and better than the PrivateAssets approach. I do want to mark this as "breaking" though because it's possible for folks to notice. I would expect that in all the cases where folks notice it's really a no-op since I would expect all cases the SDK was just writing an empty entry.

The sort of things that folks might see - an entry goes away for a particular package, there's a package reference listed to a package but no definition. For the latter perhaps the SDK can erase that reference too.

One thing that would be great to include with this is to also remove the entries that were dropped by conflict resolution, which is effectively another "automated" way that the SDK applies the equivalent of ExcludeAssets.

@Forgind
Copy link
Contributor Author

Forgind commented Jan 24, 2025

One thing that would be great to include with this is to also remove the entries that were dropped by conflict resolution, which is effectively another "automated" way that the SDK applies the equivalent of ExcludeAssets.

Is the _ConflictPackageFiles item (filtered to just References) right? That seems to be created by the ResolvePackageFileConflicts task, and it only includes the 'losers.' I haven't found anything else that looks good yet.

@dsplaisted
Copy link
Member

I think it might be better if instead of, or in addition to this, we did the following: Remove any Libraries from the deps.json that have no assets under them. From the example in #39400:

image

NerdBank.GitVersioning is listed as a library under the .NET 7 target, but has no runtime assets or anything else under it. So maybe we could remove it regardless of whether ExcludeAssets was specified, since whatever the cause it doesn't contribute anything at runtime.

This would require diving into the DependencyContextBuilder code, understanding how it works, and updating it. We also might need to consider cases such as if a library has no assets but does list dependencies... do we need to check if it transitively brought in any assets?

Thoughts?

@baronfel
Copy link
Member

One thing I like about your addition @dsplaisted is that it makes the deps.json more purpose-built - the file only cares about libraries and assets that impact the runtime experience.

For users that need accurate recounting of compile-time library dependencies we should be promoting SBOMs, which are a separate artifact specifically for book-keeping purposes.

@dsplaisted
Copy link
Member

@ericstj @ViktorHofer @agocke Any feedback on my idea above on removing libraries from the deps.json file if they don't have any assets?

What are the different asset types possible, and how should we handle them and handle dependencies between libraries that don't have any assets?

@ericstj
Copy link
Member

ericstj commented Jan 27, 2025

@ericstj @ViktorHofer @agocke Any feedback on my idea above on removing libraries from the deps.json file if they don't have any assets?

Yes! this would be great. We've wanted these gone for some time as they also cause false positives for folks scanning the files.

What are the different asset types possible

The runtime cares about runtime, native, runtimeTargets, and resources @elinor-fung @jkoritzinsky

how should we handle them and handle dependencies between libraries that don't have any assets?

I think we just remove the library section and remove references to the removed section. This should ensure that that an entire branch from the package graph is removed. I can imagine if we're sloppy about identifying packages to remove it could lead to "orphaned" references - though I don't think the host would care about that since it doesn't attempt to construct a graph.

@dsplaisted
Copy link
Member

@ericstj How should we handle dependencies if we're removing packages without assets? Should we remove a package that has no assets but depends on a package which does? I think that could lead to a different kind of orphaning where we have a dependency of the removed package still in the file but no dependencies to it any more.

@ericstj
Copy link
Member

ericstj commented Jan 28, 2025

Should we remove a package that has no assets but depends on a package which does?

I'd correct this to say

package that has no assets but is the only package which depends on another package that has assets

This is not so uncommon if you consider a case where someone does ExcludeAssets on one package with a large package graph, but then updates one of that package's dependencies to a newer version with a direct reference (or some other graph) that's not excluded.

If you could constrain the removal in this way efficiently then I think it would be safer, however my hypothesis is that this is unlikely if you are identifying removal based on exclusion/conflict resolution. Package exclusion is specified by graph notation which flows to dependencies and conflict resolution is closure complete. So you could count on these mechanisms being graph complete for a given path.

I can see how if instead we remove packages that have no assets (regardless of how) then it would be safer to also check that they are not the only packages that reference (including indirectly) some other package with assets.

@dsplaisted
Copy link
Member

@Forgind Are you thinking of taking a look at this again? I think what we should try to do what I suggested here, ie remove libraries from the file that don't have any assets associated with them.

@Forgind
Copy link
Contributor Author

Forgind commented Mar 7, 2025

I talked with dsplaisted offline, and I think we're on the same page. I'm planning to start working on this again early next week.

@Forgind
Copy link
Contributor Author

Forgind commented Mar 10, 2025

@dsplaisted, can you look at that last commit (fa1bd9d) to see if you think that aligns with your understanding of what I should do, including the comment? If so, I'll turn the comment into code.

@marcpopMSFT marcpopMSFT requested a review from dsplaisted March 18, 2025 20:32
@dsplaisted
Copy link
Member

I haven't looked in detail at the graph algorithm code yet, but we are going to want some decent test coverage for this. We can probably have an end-to-end test or two (#39400 has a good example), and then unit tests where we test just the task with hand-modified assets files for different interesting cases.

@Forgind
Copy link
Contributor Author

Forgind commented Mar 26, 2025

I haven't looked in detail at the graph algorithm code yet, but we are going to want some decent test coverage for this. We can probably have an end-to-end test or two (#39400 has a good example), and then unit tests where we test just the task with hand-modified assets files for different interesting cases.

I added some tests. Let me know what you think!

Copy link
Member

@dsplaisted dsplaisted left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job with the graph algorithm code. I went through it and I'm pretty sure it's doing what we discussed.

I haven't yet looked at the tests in detail, but I suspect we will need more of them. We can start with the different cases that were mentioned in the discussions for this PR and linked issues, creating graphs that match those cases and then possibly thinking of other interesting variations.

private class ModifiableRuntimeLibrary
{
public RuntimeLibrary Library { get; set; }
public HashSet<string> Dependents { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's easy to get confused about what "dependent" means (libraries that depend on this one, ie the opposite of a dependency), so it would be good ta have a clarifying comment here.

Comment on lines 315 to 316
var references = runtimeLibraries.ToDictionary(lib => lib.Library.Name, lib => lib);
foreach (var reference in runtimeLibraries)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Reference" seems to mean something slightly different than "library". Is there a reason not to name these libraries and library instead of references and reference?

Comment on lines 327 to 328
var unprocessedReferences = runtimeLibraries.ToHashSet();
HashSet<ModifiableRuntimeLibrary> temp = new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for the separate temp HashSet? Originally I thought it was because you couldn't add to the unprocessedReference set while iterating over it, but now I see that you're not actually iterating over it, just popping an item out each time through the loop. Given that, could you remove the separate temp list?

If not, it would probably be a good idea to rename temp to something more descriptive and add a comment about why it's used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. That's exactly how I'd been thinking about it, but you're right that it wasn't actually being iterated over, so there's no need for a temp HashSet. Deleted it

@@ -399,11 +458,11 @@ public DependencyContext Build(string[] userRuntimeAssemblies = null)
targetInfo,
_compilationOptions ?? CompilationOptions.Default,
compilationLibraries,
runtimeLibraries,
runtimeLibraries.Select(library => library.Library),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the library and its dependencies are actually immutable, I think you need to recreate each library after the graph walk so that any dependencies that were removed don't show up as dangling links in the deps file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very confident about what the deps.json is supposed to look at, but it looks ok to me. This is a runtime change, which means it keeps them as compile-time dependencies. Here's an example from the test:
System.NotConflicting has both runtime can compile-time assets and looks like this:

"System.NotConflicting/4.0.0.0": {
  "runtime": {
    "System.NotConflicting.dll": {}
  },
  "compile": {
    "System.NotConflicting.dll": {}
  }
}

Meanwhile, System.NotConflicting.Reference doesn't and looks like this:

"System.NotConflicting.Reference/4.0.0.0": {
  "compile": {
    "System.NotConflicting.dll": {}
  },
  "compileOnly": true
}

(They should be more complicated if they were real.)

In other words, I think keeping the dependencies list as it was before is correct; it removes the runtime references while retaining the note that they're connected.

@dsplaisted
Copy link
Member

@Forgind Here are some test cases that I think we should cover in the GivenADependencyContextBuilder unit tests:

  • Leaf nodes
    • Direct reference to package with no assets: Project -> HasNoAssets
    • Indirect reference to package with no assets: Project -> HasAssets -> HasNoAssets
  • Dependency chains
    • Package with no assets references another package with no assets: Project -> HasNoAssets1 -> HasNoAssets2
    • Package with no assets references another package with assets: Project -> HasNoAssets -> HasAssets (they should all show up in the deps file)
  • Dependency graphs
    • Package with no assets references a package with assets which is also referenced by project:
      • Project -> HasNoAssets -> HasAssets
      • Project -> HasAssets
      • HasNoAssets should not appear in deps file
    • Package with no assets reference a package referenced by another package
      • Project -> HasNoAssets -> HasAssets1
      • Project -> HasAssets2 -> HasAssets1
      • HasNoAssets should not appear in deps file
    • Two packages with no assets reference a package with assets:
      • Project -> HasNoAssets1 -> HasAssets
      • Project -> HasNoAssets2 -> HasAssets
      • This is an interesting case. I think the behavior as implemented in the PR would be that exactly one of HasNoAssets1 and HasNoAssets2 would appear in the deps file.

There may also be some interesting test cases with packages that have compile-time assets but no runtime assets.

You will need an assets file for each of these test cases (it's called a lock file in these tests because they're really old). You could author them manually, but there are enough different cases and we may want to add more that I would probably create code where you could pass it a graph and it would create the assets file for you.

Forgind added 5 commits April 17, 2025 11:50
For ItDoesntCreateReferenceAssembliesWhenNoCompilationOptions, the issue is that the RuntimeLibraries are minimized to remove libraries with no assets and no (transitive) dependencies with assets. This removes all but one of the RuntimeLibraries, the one remaining being the main project. Per dsplaisted's suggestion, I then removed "Dependencies" that are absent from Libraries, which means removing all of its dependencies, so RuntimeLibraries should now have only one entry, and it should have no dependencies, hence the shift.
Copy link
Member

@dsplaisted dsplaisted left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good progress.

I think I'll want to go over the test cases added to GivenADependencyContextBuilder with you and make sure I understand them and that we are covering the right scenarios.

Comment on lines +327 to +328
// Rather than adding the main project's dependencies, we added references to them, i.e., Foo.Reference.dll
// instead of Foo.dll. This adds Foo.dll as a reference directly so another reference can be removed if it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment in GetReferenceLibraryName seems to indicate that we would only name a reference Foo.Reference.dll if there was a reference to a Foo DLL as well as a Foo package. Is this the case, or do we always end up adding .Reference to the end of the name?

Copy link
Contributor Author

@Forgind Forgind May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been a month since I wrote this, and it's a little confusing to me now. What I think was happening and I was trying to explain in this comment is that it sometimes comes out as Foo.Reference.dll and sometimes comes out as Foo.dll, exactly as you said, but we don't really care which because in either case, there's a reference from the main project to Foo, and we want to make sure to track that.

The issue is that we might have missed that there was a reference to Foo if it did come out as Foo.Reference.dll earlier, in which case there's an edge missing in our graph, and this adds the edge. If it was already added, we're adding it a second time and relying on HashSet logic to eliminate the duplicate.

I definitely had .Reference.dlls popping up fairly often when I was writing the unit tests. In practice, I think that if the main project has a dependency, that's always added twice, triggering the .Reference.dll logic, whereas other projects with dependencies are generally only added once. I may be wrong about that, though, as it's more from recollection than understanding code right now.


// Rather than adding the main project's dependencies, we added references to them, i.e., Foo.Reference.dll
// instead of Foo.dll. This adds Foo.dll as a reference directly so another reference can be removed if it
// isn't necessary because of a direct reference from the main project.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious, was this added for a specific scenario or test case that was failing?

Copy link
Contributor Author

@Forgind Forgind May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were definitely failing test cases. I think this fixed one or two of them where there were .Reference.dlls that weren't being seen as dependencies. It should reflect a real scenario, though. I can try commenting it out and seeing what fails, then investigating what that scenario reflects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The specific failing test was PackageWithNoAssetsReferencesPackageReferencesByOtherPackage. It has a structure like this:
main app -> A
main app -> B
A -> B
B has assets; A does not

In this case, since A does not have assets, and main app references B directly, it should be able to skip A and remove it from the deps.json, but without this piece, it doesn't.

{
if (libraries.TryGetValue(directReference.Name, out var dep))
{
dep.Dependents.Add(_mainProjectInfo.Name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like maybe it is fixing up what was missed (or added incorrectly) in the previous loop where we were originally building the dependents list. That code assumes that the name matches, but when we need to create a unique name it doesn't. Would it make sense to instead of fixing up the dependents afterwards to account for the name mismatch in the original loop?

It seems like if there's both a Foo.dll and a Foo.Reference.dll, then in the previous loop we may have added Foo.dll as a dependent when we should have added Foo.Reference.dll. This will now add the dependent that's missing, but there still might be an extra incorrect dependent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the opposite, actually; it tries to add Foo.Reference.dll when it should be adding Foo.dll. It stops short of actually adding it, though, because it looks in runtimeLibraries and doesn't find it, so it just shrugs and moves on.

I don't think it really matters to me whether we fix the dependents list in a second loop or just get it right in the first place, but it wasn't too hard to do, so I moved that logic up so we don't have the mistake in the first place.

@@ -0,0 +1,6644 @@
{
"locked": false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is more than 6000 lines long. It looks like in the tests you're loading it and then adding the test libraries you need, so can you instead use an assets file that's as simple as possible from a project with no package references as the starting point?

Copy link
Contributor Author

@Forgind Forgind May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, pretty much. I'm not really sure if there's a better way to do this. I found other tests that were using a different project.lock.json (and even wrote a couple tests against it), but it had some extra dependencies that I didn't want that would've just been confusing. (I think they also made it impossible to properly test a couple scenarios like "no references.")

[Fact]
public void DirectReferenceToPackageWithNoAssets()
{
DependencyContext dependencyContext = BuildDependencyContextFromDependenciesWithResources([], [], ["System.A"]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about how the references parameter is handled here... it looks like that's creating direct DLL references, and the tests may not model dependencies from the main project to a NuGet package, which is the more common scenario.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes; it's from the main project to any of its direct references. I'm not sure I understand the distinction you're making between a direct reference from the main project to a dll vs. a direct reference to a nuget package? Are you saying that with the nuget packages we'd be downloading it just before using it? I don't see how that distinction is represented in DependencyContextBuilder, though I may be missing something.

Dependencies = kvp.Value.Select(n => new PackageDependency(n)).ToList()
});

if (withResources.Contains(kvp.Key))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having resource assets is not the most common case, so instead we should probably model whether the libraries have runtime or compilation assets (and then resource assets if we have additional scenarios for that).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All three are handled in the same way in this part of the code, but they're added in different ways...I don't disagree with you, exactly, but I suspect it would take me a couple days to get that to work, and it wouldn't actually test anything we aren't already properly testing using resource assets.

Comment on lines 493 to 494
dependency => runtimeLibraries.Any(lib => lib.Library.Name.Equals(dependency.Name)) ||
compilationLibraries.Any(lib => lib.Name.Equals(dependency.Name))).ToList(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it ends up being O(N^2) or similar. Since I think it may not be very rare to have projects with hundreds or thousands of dependencies, we should probably use a lookup (hash or dictionary) to check if a dependency should be included rather than doing a list scan for each one.

var buildCommand = new BuildCommand(testAsset);
buildCommand.Execute().Should().Pass();

using (var depsJsonFileStream = File.OpenRead(Path.Combine(buildCommand.GetOutputDirectory("net9.0").FullName, "PackageWithoutAssets_ShouldNotShowUpInDepsJson.deps.json")))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid hardcoding the target framework in one place but not another place so this doesn't break for .NET 10 (or higher).

Suggested change
using (var depsJsonFileStream = File.OpenRead(Path.Combine(buildCommand.GetOutputDirectory("net9.0").FullName, "PackageWithoutAssets_ShouldNotShowUpInDepsJson.deps.json")))
using (var depsJsonFileStream = File.OpenRead(Path.Combine(buildCommand.GetOutputDirectory(ToolsetInfo.CurrentTargetFramework).FullName, "PackageWithoutAssets_ShouldNotShowUpInDepsJson.deps.json")))

@Forgind Forgind changed the base branch from release/9.0.3xx to main May 12, 2025 19:13
@dsplaisted dsplaisted merged commit e07a38c into dotnet:main May 16, 2025
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Infrastructure breaking-change Using this label will notify dotnet/compat and trigger a request to file a compat bug needs-breaking-change-doc-created untriaged Request triage from a team member
Projects
None yet
4 participants