-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Include only PackageReferences included at runtime in deps.json #46218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include only PackageReferences included at runtime in deps.json #46218
Conversation
Added When you commit this breaking change:
Tagging @dotnet/compat for awareness of the breaking change. |
I think this is the right change and better than the PrivateAssets approach. I do want to mark this as "breaking" though because it's possible for folks to notice. I would expect that in all the cases where folks notice it's really a no-op since I would expect all cases the SDK was just writing an empty entry. The sort of things that folks might see - an entry goes away for a particular package, there's a package reference listed to a package but no definition. For the latter perhaps the SDK can erase that reference too. One thing that would be great to include with this is to also remove the entries that were dropped by conflict resolution, which is effectively another "automated" way that the SDK applies the equivalent of ExcludeAssets. |
Is the _ConflictPackageFiles item (filtered to just References) right? That seems to be created by the ResolvePackageFileConflicts task, and it only includes the 'losers.' I haven't found anything else that looks good yet. |
I think it might be better if instead of, or in addition to this, we did the following: Remove any Libraries from the deps.json that have no assets under them. From the example in #39400: NerdBank.GitVersioning is listed as a library under the .NET 7 target, but has no runtime assets or anything else under it. So maybe we could remove it regardless of whether ExcludeAssets was specified, since whatever the cause it doesn't contribute anything at runtime. This would require diving into the DependencyContextBuilder code, understanding how it works, and updating it. We also might need to consider cases such as if a library has no assets but does list dependencies... do we need to check if it transitively brought in any assets? Thoughts? |
One thing I like about your addition @dsplaisted is that it makes the deps.json more purpose-built - the file only cares about libraries and assets that impact the runtime experience. For users that need accurate recounting of compile-time library dependencies we should be promoting SBOMs, which are a separate artifact specifically for book-keeping purposes. |
@ericstj @ViktorHofer @agocke Any feedback on my idea above on removing libraries from the deps.json file if they don't have any assets? What are the different asset types possible, and how should we handle them and handle dependencies between libraries that don't have any assets? |
Yes! this would be great. We've wanted these gone for some time as they also cause false positives for folks scanning the files.
The runtime cares about
I think we just remove the library section and remove references to the removed section. This should ensure that that an entire branch from the package graph is removed. I can imagine if we're sloppy about identifying packages to remove it could lead to "orphaned" references - though I don't think the host would care about that since it doesn't attempt to construct a graph. |
@ericstj How should we handle dependencies if we're removing packages without assets? Should we remove a package that has no assets but depends on a package which does? I think that could lead to a different kind of orphaning where we have a dependency of the removed package still in the file but no dependencies to it any more. |
I'd correct this to say
This is not so uncommon if you consider a case where someone does ExcludeAssets on one package with a large package graph, but then updates one of that package's dependencies to a newer version with a direct reference (or some other graph) that's not excluded. If you could constrain the removal in this way efficiently then I think it would be safer, however my hypothesis is that this is unlikely if you are identifying removal based on exclusion/conflict resolution. Package exclusion is specified by graph notation which flows to dependencies and conflict resolution is closure complete. So you could count on these mechanisms being graph complete for a given path. I can see how if instead we remove packages that have no assets (regardless of how) then it would be safer to also check that they are not the only packages that reference (including indirectly) some other package with assets. |
I talked with dsplaisted offline, and I think we're on the same page. I'm planning to start working on this again early next week. |
@dsplaisted, can you look at that last commit (fa1bd9d) to see if you think that aligns with your understanding of what I should do, including the comment? If so, I'll turn the comment into code. |
I haven't looked in detail at the graph algorithm code yet, but we are going to want some decent test coverage for this. We can probably have an end-to-end test or two (#39400 has a good example), and then unit tests where we test just the task with hand-modified assets files for different interesting cases. |
This one prunes out 3/7 references
I added some tests. Let me know what you think! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job with the graph algorithm code. I went through it and I'm pretty sure it's doing what we discussed.
I haven't yet looked at the tests in detail, but I suspect we will need more of them. We can start with the different cases that were mentioned in the discussions for this PR and linked issues, creating graphs that match those cases and then possibly thinking of other interesting variations.
private class ModifiableRuntimeLibrary | ||
{ | ||
public RuntimeLibrary Library { get; set; } | ||
public HashSet<string> Dependents { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's easy to get confused about what "dependent" means (libraries that depend on this one, ie the opposite of a dependency), so it would be good ta have a clarifying comment here.
var references = runtimeLibraries.ToDictionary(lib => lib.Library.Name, lib => lib); | ||
foreach (var reference in runtimeLibraries) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Reference" seems to mean something slightly different than "library". Is there a reason not to name these libraries
and library
instead of references
and reference
?
var unprocessedReferences = runtimeLibraries.ToHashSet(); | ||
HashSet<ModifiableRuntimeLibrary> temp = new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for the separate temp
HashSet? Originally I thought it was because you couldn't add to the unprocessedReference
set while iterating over it, but now I see that you're not actually iterating over it, just popping an item out each time through the loop. Given that, could you remove the separate temp
list?
If not, it would probably be a good idea to rename temp
to something more descriptive and add a comment about why it's used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. That's exactly how I'd been thinking about it, but you're right that it wasn't actually being iterated over, so there's no need for a temp HashSet. Deleted it
@@ -399,11 +458,11 @@ public DependencyContext Build(string[] userRuntimeAssemblies = null) | |||
targetInfo, | |||
_compilationOptions ?? CompilationOptions.Default, | |||
compilationLibraries, | |||
runtimeLibraries, | |||
runtimeLibraries.Select(library => library.Library), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the library and its dependencies are actually immutable, I think you need to recreate each library after the graph walk so that any dependencies that were removed don't show up as dangling links in the deps file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not very confident about what the deps.json is supposed to look at, but it looks ok to me. This is a runtime change, which means it keeps them as compile-time dependencies. Here's an example from the test:
System.NotConflicting has both runtime can compile-time assets and looks like this:
"System.NotConflicting/4.0.0.0": {
"runtime": {
"System.NotConflicting.dll": {}
},
"compile": {
"System.NotConflicting.dll": {}
}
}
Meanwhile, System.NotConflicting.Reference doesn't and looks like this:
"System.NotConflicting.Reference/4.0.0.0": {
"compile": {
"System.NotConflicting.dll": {}
},
"compileOnly": true
}
(They should be more complicated if they were real.)
In other words, I think keeping the dependencies list as it was before is correct; it removes the runtime references while retaining the note that they're connected.
@Forgind Here are some test cases that I think we should cover in the
There may also be some interesting test cases with packages that have compile-time assets but no runtime assets. You will need an assets file for each of these test cases (it's called a lock file in these tests because they're really old). You could author them manually, but there are enough different cases and we may want to add more that I would probably create code where you could pass it a graph and it would create the assets file for you. |
For ItDoesntCreateReferenceAssembliesWhenNoCompilationOptions, the issue is that the RuntimeLibraries are minimized to remove libraries with no assets and no (transitive) dependencies with assets. This removes all but one of the RuntimeLibraries, the one remaining being the main project. Per dsplaisted's suggestion, I then removed "Dependencies" that are absent from Libraries, which means removing all of its dependencies, so RuntimeLibraries should now have only one entry, and it should have no dependencies, hence the shift.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good progress.
I think I'll want to go over the test cases added to GivenADependencyContextBuilder with you and make sure I understand them and that we are covering the right scenarios.
// Rather than adding the main project's dependencies, we added references to them, i.e., Foo.Reference.dll | ||
// instead of Foo.dll. This adds Foo.dll as a reference directly so another reference can be removed if it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment in GetReferenceLibraryName
seems to indicate that we would only name a reference Foo.Reference.dll
if there was a reference to a Foo DLL as well as a Foo package. Is this the case, or do we always end up adding .Reference
to the end of the name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's been a month since I wrote this, and it's a little confusing to me now. What I think was happening and I was trying to explain in this comment is that it sometimes comes out as Foo.Reference.dll and sometimes comes out as Foo.dll, exactly as you said, but we don't really care which because in either case, there's a reference from the main project to Foo, and we want to make sure to track that.
The issue is that we might have missed that there was a reference to Foo if it did come out as Foo.Reference.dll earlier, in which case there's an edge missing in our graph, and this adds the edge. If it was already added, we're adding it a second time and relying on HashSet logic to eliminate the duplicate.
I definitely had .Reference.dlls popping up fairly often when I was writing the unit tests. In practice, I think that if the main project has a dependency, that's always added twice, triggering the .Reference.dll logic, whereas other projects with dependencies are generally only added once. I may be wrong about that, though, as it's more from recollection than understanding code right now.
|
||
// Rather than adding the main project's dependencies, we added references to them, i.e., Foo.Reference.dll | ||
// instead of Foo.dll. This adds Foo.dll as a reference directly so another reference can be removed if it | ||
// isn't necessary because of a direct reference from the main project. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious, was this added for a specific scenario or test case that was failing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were definitely failing test cases. I think this fixed one or two of them where there were .Reference.dlls that weren't being seen as dependencies. It should reflect a real scenario, though. I can try commenting it out and seeing what fails, then investigating what that scenario reflects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The specific failing test was PackageWithNoAssetsReferencesPackageReferencesByOtherPackage. It has a structure like this:
main app -> A
main app -> B
A -> B
B has assets; A does not
In this case, since A does not have assets, and main app references B directly, it should be able to skip A and remove it from the deps.json, but without this piece, it doesn't.
{ | ||
if (libraries.TryGetValue(directReference.Name, out var dep)) | ||
{ | ||
dep.Dependents.Add(_mainProjectInfo.Name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like maybe it is fixing up what was missed (or added incorrectly) in the previous loop where we were originally building the dependents list. That code assumes that the name matches, but when we need to create a unique name it doesn't. Would it make sense to instead of fixing up the dependents afterwards to account for the name mismatch in the original loop?
It seems like if there's both a Foo.dll and a Foo.Reference.dll, then in the previous loop we may have added Foo.dll as a dependent when we should have added Foo.Reference.dll. This will now add the dependent that's missing, but there still might be an extra incorrect dependent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the opposite, actually; it tries to add Foo.Reference.dll when it should be adding Foo.dll. It stops short of actually adding it, though, because it looks in runtimeLibraries and doesn't find it, so it just shrugs and moves on.
I don't think it really matters to me whether we fix the dependents list in a second loop or just get it right in the first place, but it wasn't too hard to do, so I moved that logic up so we don't have the mistake in the first place.
@@ -0,0 +1,6644 @@ | |||
{ | |||
"locked": false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is more than 6000 lines long. It looks like in the tests you're loading it and then adding the test libraries you need, so can you instead use an assets file that's as simple as possible from a project with no package references as the starting point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, pretty much. I'm not really sure if there's a better way to do this. I found other tests that were using a different project.lock.json (and even wrote a couple tests against it), but it had some extra dependencies that I didn't want that would've just been confusing. (I think they also made it impossible to properly test a couple scenarios like "no references.")
[Fact] | ||
public void DirectReferenceToPackageWithNoAssets() | ||
{ | ||
DependencyContext dependencyContext = BuildDependencyContextFromDependenciesWithResources([], [], ["System.A"]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about how the references parameter is handled here... it looks like that's creating direct DLL references, and the tests may not model dependencies from the main project to a NuGet package, which is the more common scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes; it's from the main project to any of its direct references. I'm not sure I understand the distinction you're making between a direct reference from the main project to a dll vs. a direct reference to a nuget package? Are you saying that with the nuget packages we'd be downloading it just before using it? I don't see how that distinction is represented in DependencyContextBuilder, though I may be missing something.
Dependencies = kvp.Value.Select(n => new PackageDependency(n)).ToList() | ||
}); | ||
|
||
if (withResources.Contains(kvp.Key)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having resource assets is not the most common case, so instead we should probably model whether the libraries have runtime or compilation assets (and then resource assets if we have additional scenarios for that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All three are handled in the same way in this part of the code, but they're added in different ways...I don't disagree with you, exactly, but I suspect it would take me a couple days to get that to work, and it wouldn't actually test anything we aren't already properly testing using resource assets.
dependency => runtimeLibraries.Any(lib => lib.Library.Name.Equals(dependency.Name)) || | ||
compilationLibraries.Any(lib => lib.Name.Equals(dependency.Name))).ToList(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it ends up being O(N^2) or similar. Since I think it may not be very rare to have projects with hundreds or thousands of dependencies, we should probably use a lookup (hash or dictionary) to check if a dependency should be included rather than doing a list scan for each one.
var buildCommand = new BuildCommand(testAsset); | ||
buildCommand.Execute().Should().Pass(); | ||
|
||
using (var depsJsonFileStream = File.OpenRead(Path.Combine(buildCommand.GetOutputDirectory("net9.0").FullName, "PackageWithoutAssets_ShouldNotShowUpInDepsJson.deps.json"))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid hardcoding the target framework in one place but not another place so this doesn't break for .NET 10 (or higher).
using (var depsJsonFileStream = File.OpenRead(Path.Combine(buildCommand.GetOutputDirectory("net9.0").FullName, "PackageWithoutAssets_ShouldNotShowUpInDepsJson.deps.json"))) | |
using (var depsJsonFileStream = File.OpenRead(Path.Combine(buildCommand.GetOutputDirectory(ToolsetInfo.CurrentTargetFramework).FullName, "PackageWithoutAssets_ShouldNotShowUpInDepsJson.deps.json"))) |
…as-extant-assemblies
Fixes dotnet/roslyn#76797
Fixes #48944
PrivateAssets isn't what we should be looking for to determine that we don't need an assembly in the deps.json file. This instead looks for ExcludeAssets and IncludeAssets to see if it will be in the output directory and includes it in the deps.json if so.