Skip to content

Create packages diff between the current state and a revision #3911

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

SandrineP
Copy link
Member

This will be used to install a revision.

Adds a parsing function to handle the different formats that can be found in the history file.
Adds a function to generate de diff between the current state and a given revision. This generates maps of packages installed and removed between the two states.

@SandrineP SandrineP added the release::bug_fixes For PRs fixing bugs label Apr 29, 2025
Copy link

codecov bot commented Apr 29, 2025

Codecov Report

Attention: Patch coverage is 94.40000% with 7 lines in your changes missing coverage. Please review.

Please upload report for BASE (main@e39ddf4). Learn more about missing BASE report.

Files with missing lines Patch % Lines
libmamba/src/core/history.cpp 94.11% 6 Missing ⚠️
libmamba/tests/src/core/test_history.cpp 95.65% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3911   +/-   ##
=======================================
  Coverage        ?   63.57%           
=======================================
  Files           ?      300           
  Lines           ?    37431           
  Branches        ?     2806           
=======================================
  Hits            ?    23797           
  Misses          ?    13580           
  Partials        ?       54           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@AntoinePrv AntoinePrv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. It is not a very easy part of Mamba to work with and one that can be improved. Some of my suggestions may look arbitrary because things are already be done in a similar way but I think we should aim to slowly improve our APIs.

Could you document what the PackageDiff and its methods does. We have the Solution that is also a form of "package difference" that needs to be installed in the environment, but I don't believe this is doing the same.

I am also wondering if include/api/install.hpp is the best header for it. My understanding is that the (extremely imperfect) include/api/ headers are a "mirror" of things you can do with the CLI (install, remove...) and the fact that this code is tested in test_history suggests it may be better to put it in history.hpp?

As a general rule (that is not fully enforced at the moment), we should have that include/api headers are not used anywhere except in include/api & src/api themselves.

{
std::string s_pkg;
std::string channel;
size_t pos_0 = s.rfind("/");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_t is C std::size_t (or std::string::size_t) is C++. auto is also possible.

Suggested change
size_t pos_0 = s.rfind("/");
std::size_t pos_0 = s.rfind("/");

{
std::string s_begin = s.substr(0, pos_0);
std::string s_end = s.substr(pos_0 + 1, s.size());
size_t pos = s_begin.rfind("/");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

{
channel = s_begin;
}
s_pkg = util::split(s_end, "::").back();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

util::split, splits everything and creates a temporary vector, this is more efficient:

Suggested change
s_pkg = util::split(s_end, "::").back();
s_pkg = std::get<1>(util::rsplit_once(s_end, "::"));

Comment on lines 1043 to 1044
channel = util::split(s, "::").front();
s_pkg = util::split(s, "::").back();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with util::split_once and util::rsplit_once


specs::PackageInfo pkg_info_builder(std::string s);
PackageDiff
get_revision_pkg_diff(std::vector<History::UserRequest> user_requests, int REVISION);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • int REVISION should be lowercase if it is a function parameter
  • it could have a more explicit name: target_revision? revision_count?
  • It should be explained what a negative revisions implies. If there should not be negative, then make it a positive (std::size_t for instance). I remember we talked about it being a bit hard to use std::optional in the CLI11 and that -1 as a default was reasonable, but (assuming negative have no meaning in the rest of the code), it should be checked as early as possible, cast to a positive only integer, and only used as such.

APIs are very important to explain what the code does and cannot do. Public APIs in particular cannot be changed later on without making a breaking release. To help design API, I pretend to make negotiations between the designer of the API and a user of that API (both being me and within libmamba). On the one end, if I introduce an API, it is to solve a particular problem, so as an API user I want to be able to call that new API as easily as possible with minimum effort (integer that can be negative). On the other, to improve APIs, I pretend that I am the maintainer of a library with just this API and multiple users. I should provide something that makes sense for the problem in general and should not accept data that is meaningless or too complex for me (a negative integer if I don't know what to do with it). Eventually, as a user of the new API, I make the necessary adjustments when calling the API. Maybe there is a bit of repetitive work, so I make a few helpers functions etc.

{
auto pkg_info = pkg_info_builder(ud);
auto name = pkg_info.name;
rev.removed_pkg[name] = pkg_info;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a whole learning topic but move is beneficial here

Suggested change
rev.removed_pkg[name] = pkg_info;
rev.removed_pkg[name] = std::move(pkg_info);

{
auto pkg_info = pkg_info_builder(ld);
auto name = pkg_info.name;
rev.installed_pkg[name] = pkg_info;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
rev.installed_pkg[name] = pkg_info;
rev.installed_pkg[name] = std::move(pkg_info);

* revisions to get the diff between the target revision and the current one. */
struct PackageDiff
{
std::map<std::string, specs::PackageInfo> removed_pkg_diff;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer std::unordered_map by default for associative containers, unless there is a property of std::map you need, which should be rare.

{
auto pkg_info = pkg_info_builder(ud);
auto name = pkg_info.name;
rev.removed_pkg[name] = std::move(pkg_info);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using .insert instead of [key] for inserting as this would overwrite existing key-values . Not sure if that's intended or not. Also operator[] will first create an empty element before overwriting after operator= so I think most of the time .insert (or .emplace in different cases) is better.

if (auto rev_iter = rev.installed_pkg.find(pkg_name);
rev_iter != rev.installed_pkg.end())
{
auto version = rev.installed_pkg[pkg_name].version;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is searching twice for the same package.
rev_iter is an iterator pointing to the found name/package pair, so you should use that once obtained, for example:

const auto version = rev_iter->second.version;

Similar to iter in the following block.

bool res = false;
if (auto rev_iter = rev.removed_pkg.find(pkg_name); rev_iter != rev.removed_pkg.end())
{
auto version = rev.removed_pkg[pkg_name].version;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto version = rev.removed_pkg[pkg_name].version;
const auto version = rev_iter->second.version;

revision.removed_pkg.erase(pkg_name);
bool lastly_removed = true; // whether last operation on package was a removal
lastly_removed = !handle_install(revision, pkg_name);
for (auto rev = ++revisions.begin(); rev != revisions.end(); ++rev)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (auto rev = ++revisions.begin(); rev != revisions.end(); ++rev)
for (auto rev = std::next(revisions.begin()); rev != revisions.end(); ++rev)

for (auto ld : r.link_dists)
{
auto pkg_info = pkg_info_builder(ld);
auto name = pkg_info.name;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto name = pkg_info.name;
const auto name = pkg_info.name;

for (auto ud : r.unlink_dists)
{
auto pkg_info = pkg_info_builder(ud);
auto name = pkg_info.name;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto name = pkg_info.name;
const auto name = pkg_info.name;


PackageDiff PackageDiff::get_revision_pkg_diff(
std::vector<History::UserRequest> user_requests,
int target_revision
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is target_revision allowed to be 0 or negative? Is user_requests allowed to be empty?
If there are requirements on these, consider adding assertions (using assert) at the beginnig of the function checking that these requirements are fullfilled.


PackageDiff pkg_diff{};

auto handle_install = [&pkg_diff](revision& rev, const std::string& pkg_name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor improvment: local lambdas that act as sub-functions can be const as their call function is already const (except if marked mutable but it's not the case here.

}
}
}
revisions.erase(revisions.begin());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.pop_front() would be clearer.

Note that with vector this is not very performant. .pop_back() is cheap however. Did you consider reversing the order of this vector to be able to .pop_back() progressively?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release::bug_fixes For PRs fixing bugs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants