US: NGI software demo runtime environment for services #614

Open
erictapen opened this issue Mar 18, 2025 · 18 comments
Assignees
Labels
infra Work on Ngipkgs itself, and related infrastructure user experience User story

Comments

@erictapen
Contributor

erictapen commented Mar 18, 2025

This is a subset of what #575 describes.

As a user of NGIpkgs running Ubuntu, I want to launch and experiment with an NGI application running as a service, in order to assess whether it is fit for purpose.

Target audience: small-business sysadmins

  • proficient with the Linux command line and plain text editing, have their own tools
  • will gladly run podman or VMs
  • good enough with Git
  • not familiar with Nix or functional programming
  • may be security conscious
  • will be reluctant to curl | sh and pollute their existing environment

Operating conditions:

  • uses some off-the-shelf major Linux distribution: Debian, Ubuntu, Fedora, Arch
    • NGI targets FLOSS and we can expect people interested in that to show up at NGIpkgs
    • also, macOS/WSL introduces more moving parts we’d have to care about. We can
  • for the purpose of this user story, pick one to run an integration test on
    • See https://github.com/numtide/nix-vm-test for NixOS VM tests on a Debian/Ubuntu/Fedora image
      • Advantage over GitHub Actions runners:
        • no re-build/re-download if nothing changed
        • full control over the source, runs on our own infrastructure
      • Disadvantages:
        • more involved setup
        • increases load on our infrastructure

Demo application:

  • Pick exactly one, which is simple to configure and doesn’t incur maintenance overhead
    • Example: Galene, doesn’t need secrets or persistent state
      • Still needs to be put into NGIpkgs
    • Example: Cryptpad
      • Need it for ourselves anyway, may as well make it nice
      • Just make sure we don’t spend more time on the application than on the user story

Acceptance criteria

Given I have picked an NGI application to experiment with,

When I follow the instructions on the project page,

Then the software will run on my machine, I can try out different settings, and remove the software without traces afterwards.

Implementation notes

The project page should guide through roughly these steps:

  1. Download an artifact with the NGIpkgs environment
    • For example an OCI image that has NixOS preconfigured to build a certain configuration
  2. Launch the environment, using pre-existing tools from my own system
    • For example using podman
  3. Interact with the applications from my own environment, e.g. browse the web interface
    • For example by mapping ports from the container to the host
  4. Customize the NGIpkgs environment, using my own tools
    • For example by mounting my locally edited configuration.nix file into the container and running nixos-rebuild inside the container
  5. Remove all artefacts
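The five steps above could look roughly like this with podman. This is only a sketch under assumptions: the image reference `example.org/ngipkgs-demo:latest`, the container name, the port, and the `DRY_RUN` switch are hypothetical placeholders, and no such image is published yet. By default the functions only print the commands they would run.

```shell
#!/usr/bin/env bash
# Sketch only: image reference, container name and port are hypothetical.
DEMO_IMAGE="${DEMO_IMAGE:-example.org/ngipkgs-demo:latest}"
DEMO_NAME="${DEMO_NAME:-ngipkgs-demo}"
DEMO_PORT="${DEMO_PORT:-8080}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands instead of running them

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = 1 ]; then printf '+ %s\n' "$*"; else "$@"; fi
}

# Steps 1-3: download the image, start it with systemd as init,
# publish the application port and mount the local configuration.nix.
demo_up() {
  run podman pull "$DEMO_IMAGE"
  run podman run --detach --name "$DEMO_NAME" --systemd=always \
    --publish "$DEMO_PORT:$DEMO_PORT" \
    --volume "$PWD/configuration.nix:/etc/nixos/configuration.nix:ro" \
    "$DEMO_IMAGE" /init
}

# Step 4: apply the locally edited configuration inside the container.
demo_rebuild() {
  run podman exec "$DEMO_NAME" nixos-rebuild switch
}

# Step 5: remove the container and the image again.
demo_down() {
  run podman rm --force "$DEMO_NAME"
  run podman rmi "$DEMO_IMAGE"
}
```

Calling `demo_up`, `demo_rebuild` and `demo_down` in that order walks through the whole lifecycle. Setting `DRY_RUN=0` would execute the commands for real.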

Ongoing design discussion: https://www.notion.so/nixos-foundation/1c059d49e1be804a9b68de6caab9c891?v=8b0d731ed6b04fd985c383fa878e8be5

@erictapen erictapen added infra Work on Ngipkgs itself, and related infrastructure user experience User story labels Mar 18, 2025
@erictapen erictapen self-assigned this Mar 18, 2025
@github-project-automation github-project-automation bot moved this to Needs refinement in Nix@NGI Mar 18, 2025
@erictapen erictapen changed the title US: Service demos US: Service demos for non-NixOS linux users Mar 18, 2025
@eljamm
Contributor

eljamm commented Mar 18, 2025

Note from @imincik on Matrix:

substituters can be configured in flake.nixConfig

@eljamm
Contributor

eljamm commented Mar 18, 2025

Instead of installing nix, couldn't we use a static version of Nix? If I'm not mistaken, this means users won't have to deal with the cleanup/uninstallation afterwards.

@erictapen erictapen moved this from Needs refinement to In progress in Nix@NGI Mar 18, 2025
@erictapen
Contributor Author

The biggest complexity of installing Nix is having to create build users and the Nix store, which is very system-dependent and involves a lot of edge cases. I would've assumed that the current install scripts (Determinate/official) already download a static binary?

@eljamm
Contributor

eljamm commented Mar 18, 2025

Yeah, it appears I was mistaken after all. I thought nixStatic could run without setting up the build users. I just tried it in a Docker Alpine image and it kept complaining about that. It did create the Nix store in ~/.cache, though, which was pretty neat.

@fricklerhandwerk
Contributor

@erictapen and I reworked the user story so that we can skip what were three rather involved steps (installing Nix, enabling flakes, etc.), which are well-known show stoppers for sysadmins that we can't do much about within our scope. We use implementation-agnostic wording, with the examples illustrating a particular implementation with OCI images, because that's something we can expect a certain fraction of sysadmins to work with habitually.

@eljamm @imincik What remains as interesting steps outside of @erictapen's UX design aspects would be

  • making sure we can actually create a container that runs a desired environment (nixos-generators can create lxc things, but I haven't tested it)
  • creating the NixOS configuration we want to have in these containers
  • adding that image to CI so we can render a download URL to ngi.nixos.org
  • writing instructions for getting all the moving parts into place on e.g. Debian or Fedora
  • picking or adapting examples, and enriching them with instructions for how to interact with the applications from outside the container

@erictapen, as for UX, to make the transition between 2. and 4. smoother we may want to invoke the image with all the port mappings and mounts already, and maybe even auto-rebuild on file change so there's no need to even interact with the container after launching it.

@imincik
Contributor

imincik commented Mar 19, 2025

@fricklerhandwerk, if we choose application (OCI) containers as the runtime for our demos, how do we want to run desktop apps and/or apps that require access to a GPU or sound device (for example Blink)?

@imincik
Contributor

imincik commented Mar 19, 2025

Also, note that NixOS system modules (as we have now) don't work for application containers.

@imincik
Contributor

imincik commented Mar 19, 2025

In the case of application containers, what are we going to do if an NGI program requires additional services to run (for example PostgreSQL, RabbitMQ, ...)? Where are they going to run? How do we configure them?

@fricklerhandwerk
Contributor

fricklerhandwerk commented Mar 19, 2025

if we choose application (OCI) containers as the runtime for our demos, how do we want to run desktop apps and/or apps that require access to a GPU or sound device (for example Blink)?

We won't do it yet and thus don't care -- yet. The point of this demo is to have an e2e interaction at all. Mid-to-long-term we'd need to wean people into booting a proper NixOS on their hardware.

Also, note that NixOS system modules (as we have now) don't work for application containers.

This probably needs a clarification in the user story: We don't necessarily need per-application containers, it would be just as good to have a proper environment that can reconfigure itself with the services we need, e.g. including the entire NGIpkgs checkout and Nix set up just the right way.

In the case of application containers, what are we going to do if an NGI program requires additional services to run (for example PostgreSQL, RabbitMQ, ...)? Where are they going to run? How do we configure them?

In the same container controlled by NixOS inside.

@imincik
Contributor

imincik commented Mar 20, 2025

@imincik and @eljamm , we were able to run a NixOS system in a container, using a docker (OCI) image built by nixos-generators and running it with podman.

My setup (@imincik )

  • flake.nix
{
  inputs = {
    nixpkgs.url = "nixpkgs/nixos-unstable";
    nixos-generators = {
      url = "github:nix-community/nixos-generators";
      inputs.nixpkgs.follows = "nixpkgs";
    };
  };

  outputs = { self, nixpkgs, nixos-generators, ... }: {
    packages.x86_64-linux = {
      docker = nixos-generators.nixosGenerate {
        system = "x86_64-linux";
        modules = [ ./configuration.nix ];
        format = "docker";
      };
    };
  };
}

  • configuration.nix
{ config, lib, pkgs, ... }:

{
  boot.isContainer = true;
  services.postgresql.enable = true;

  environment.systemPackages = [
    pkgs.bashInteractive
    pkgs.coreutils
  ];
}
  • build the OCI image and import it into podman
nix build .#docker
podman import ./result/tarball/nixos-system-x86_64-linux.tar.xz
  • run container
podman run -it --systemd=always <image-id> /init

The container starts and PostgreSQL is running, but there are multiple errors such as

You are in rescue mode. After logging in, type "journalctl -xb" to view
system logs, "systemctl reboot" to reboot, or "exit"
to continue bootup.

Cannot open access to console, the root account is locked.
See sulogin(8) man page for more details.

Press Enter to continue.
.266449] (nsncd)[269]: nscd.service: Failed to keep CAP_SYS_ADMIN: Operation not permitted
[285206.266792] mount[268]: mount: /run/wrappers: permission denied.
[285206.266890] mount[268]:        dmesg(1) may have more information after failed mount system call.
[285206.266987] (nsncd)[269]: nscd.service: Failed at step USER spawning /nix/store/ln83s0gf3xy29i7vcjlw0mi02hpag5my-nsncd-1.5.1/bin/nsncd: Operation not permitted
[285206.267175] systemd[1]: run-wrappers.mount: Mount process exited, code=exited, status=32/n/a
[285206.267316] systemd[1]: run-wrappers.mount: Failed with result 'exit-code'.
[285206.267481] systemd[1]: Failed to mount /run/wrappers.
[285206.267610] systemd[1]: Dependency failed for Create SUID/SGID Wrappers.
[285206.267735] systemd[1]: suid-sgid-wrappers.service: Job suid-sgid-wrappers.service/start failed with result 'dependency'.
[285206.268538] systemd[1]: nscd.service: Main process exited, code=exited, status=217/USER
[285206.268854] systemd[1]: nscd.service: Failed with result 'exit-code'.
[285206.269411] systemd[1]: Failed to start Name Service Cache Daemon (nsncd).
[285206.269529] systemd[1]: Dependency failed for Host and Network Name Lookups.
[285206.269623] systemd[1]: nss-lookup.target: Job nss-lookup.target/start failed with result 'dependency'.
[285206.269712] systemd[1]: Dependency failed for User and Group Name Lookups.
[285206.269804] systemd[1]: nss-user-lookup.target: Job nss-user-lookup.target/start failed with result 'dependency'.
[285206.269897] systemd[1]: systemd-journald-audit.socket: Cannot add dependency job, ignoring: Exec format error
[285206.270097] systemd[1]: systemd-journald-audit.socket: Cannot add dependency job, ignoring: Exec format error
[285206.275732] systemd[1]: systemd-journald-audit.socket: Cannot add dependency job, ignoring: Unit systemd-journald-audit.socket has a bad unit file setting.
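To get a quick overview of which units failed in a log like the one above, the output can be filtered for systemd's "Failed with result" lines. A small sketch, assuming the console output was saved to a file (the file name `boot.log` in the usage note is hypothetical):

```shell
#!/usr/bin/env bash
# Read a boot log on stdin and print each unit that systemd reported
# as "Failed with result ...", one unit per line, sorted and de-duplicated.
failed_units() {
  LC_ALL=C sed -n 's/^.*systemd\[1\]: \([^ :]*\): Failed with result .*$/\1/p' \
    | LC_ALL=C sort -u
}
```

Usage would be `failed_units < boot.log`. Lines such as "Failed to mount /run/wrappers." are skipped on purpose, because they name no unit.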

@imincik
Contributor

imincik commented Mar 20, 2025

Even though our experiment has proved that IT IS possible to run a "full" NixOS system with systemd as service manager in an OCI container, I still suggest using LXC, which is a technology properly designed for this use case.

@erictapen
Contributor Author

Using LXC for our service demos

I investigated this a bit and ended up with a container that successfully starts the AtomicData service inside. But there seems to be quite a bit of boilerplate involved on the user side to get a container running.

Find my code here: https://github.com/erictapen/ngipkgs/tree/lxc-demos, then build the demo images with nix build .#lxc-images

Pros

  • LXC allows for self-hosted container registries that are just statically hosted via HTTP. The output of .#lxc-images should already work as one, provided that it is hosted via HTTPS.

Cons

  • Users would need to manually create a storage backend once after they installed LXD: lxc storage create default dir
  • Users would need to manually create a container network and attach it to the instance: lxc network create ngi && lxc network attach ngi atomic-data-container

It looks like an LXC image can't specify defaults for this by itself.
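The one-time setup from the cons above could be wrapped in a small script so users don't have to type it by hand. This is a sketch, not something the branch provides: the `DRY_RUN` switch and the existence checks via `lxc storage show` and `lxc network show` are assumptions added here to make the script safe to re-run.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the commands quoted above; DRY_RUN is a hypothetical switch.
LXC_CONTAINER="${LXC_CONTAINER:-atomic-data-container}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands instead of running them

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = 1 ]; then printf '+ %s\n' "$*"; else "$@"; fi
}

# True only if the given query succeeds; always false in dry-run mode.
exists() {
  [ "$DRY_RUN" != 1 ] && "$@" >/dev/null 2>&1
}

setup_lxd() {
  # Create the storage pool and network only if they are missing.
  exists lxc storage show default || run lxc storage create default dir
  exists lxc network show ngi || run lxc network create ngi
  # Attach the network to the demo container.
  run lxc network attach ngi "$LXC_CONTAINER"
}
```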

@fricklerhandwerk
Contributor

Conclusion so far:

  • OCI via podman would be ideal in terms of UX but as @imincik noted may be too much work to get going smoothly.
  • LXC probably easier to do for us, but as @erictapen noted may be too fiddly for users
  • Next steps: @Erethon will look into fixing the OCI errors

@imincik imincik changed the title US: Service demos for non-NixOS linux users US: NGI Project demos for non-NixOS linux users Mar 25, 2025
@imincik imincik changed the title US: NGI Project demos for non-NixOS linux users US: NGI application demos for non-NixOS linux users Mar 25, 2025
@imincik
Contributor

imincik commented Mar 26, 2025

What about offering NGI software in multiple formats ?

  1. We start with the simplest one for us - "native Nix". This solution will provide access to all NGI software (server, desktop, libs) on all our target platforms (all Linux) with some downsides - installer, uninstall/system cleanup, .... But we have multiple options for improving the user experience. For example by building a universal Nix package for DEB and RPM distros with all dependencies included (Nix static build) - this can be an interesting task for Outreachy or SoN.

  2. In a second step, add an OCI container format suitable for running full NixOS, complementary to the "native Nix" format.

@eljamm
Contributor

eljamm commented Mar 26, 2025

For example by building a universal Nix package for DEB and RPM distros with all dependencies included (Nix static build) - this can be an interesting task for Outreachy or SoN

There is already a static Nix build, but you'd also need to set up the builders to use it (see comment above). That said, this seems like an effort that the distros themselves need to take as it would just increase our maintenance burden.

We start with the simplest one for us - "native Nix"

Do you mean from the installer or the distros' package manager? In both cases, this approach might lack a follow-up strategy by itself, as we discussed previously (see notes). As such, we'll still need a way to compose services, whether that's with VMs, OCI images or something else.

@imincik
Contributor

imincik commented Mar 26, 2025

There is already a static Nix build, but you'd also need to set up the builders to use it (see comment above). That said, this seems like an effort that the distros themselves need to take as it would just increase our maintenance burden.

I mean a DEB or RPM package containing a static Nix binary with all other configuration included (e.g. services, ...).

@fricklerhandwerk fricklerhandwerk changed the title US: NGI application demos for non-NixOS linux users US: NGI application demos for users new to Nix Mar 26, 2025
@imincik imincik changed the title US: NGI application demos for users new to Nix US: NGI application demo environment Mar 28, 2025
@imincik imincik changed the title US: NGI application demo environment US: NGI software demo runtime environment Mar 28, 2025
@erictapen erictapen changed the title US: NGI software demo runtime environment US: NGI software demo runtime environment for services Mar 31, 2025
@fricklerhandwerk
Contributor

fricklerhandwerk commented Mar 31, 2025

Following the discussion with @erictapen and @imincik, each application's page on ngi.nixos.org would show something like this

  1. Install Nix (tab per distro, e.g.)

    apt install nix
    
  2. Copy/Download the following file (with some collapsed annotations to explain what's going on)

    # default.nix
    let
      ngipkgs = import (builtins.fetchTarball "https://github.com/ngi-nix/ngipkgs/tarball/main") { };
    in
    ngipkgs.demo {
      services.cryptpad.enable = true;
    }
  3. Run this command (possibly extended or annotated to make it less magical):

    # this will spin up the VM
    $(nix-build)
  4. Instructions (ideally exactly the same as in the corresponding test) for interacting with the application
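One way to make step 3 "less magical": `nix-build` prints the store path of its result on stdout, so `$(nix-build)` simply executes whatever was built. The sketch below spells that out. It assumes, as in the proposal, that `default.nix` evaluates to an executable launcher (the `ngipkgs.demo` function is a proposal here, not a confirmed API). `NIX_BUILD` is a hypothetical override hook.

```shell
#!/usr/bin/env bash
# Sketch only: assumes default.nix evaluates to an executable launcher,
# as in the proposal above. NIX_BUILD is a hypothetical override hook.
launch_demo() {
  local expr="${1:-default.nix}" launcher
  # nix-build prints the store path of the build result on stdout;
  # --no-out-link avoids leaving a ./result symlink behind.
  launcher="$("${NIX_BUILD:-nix-build}" --no-out-link "$expr")" || return 1
  # Show what is about to be executed before running it.
  printf 'Launching %s\n' "$launcher"
  "$launcher"
}
```

Printing the path first lets users inspect the launcher before anything starts.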

Next steps could be pointing to documentation/guides for refactoring to more sophisticated setups, such as using npins or making configuration reusable across machines, etc. but this is for later user stories.

@erictapen
Contributor Author

We briefly talked about the problem of running software that is designed for server networks on somebody's local computer:

We can't expect users to have public IP addresses, so ACME won't work, and we can't expect them to set up domain name resolution. From that it follows:

  • We need to set up NAT port forwarding so that users can access guest ports
  • We need to keep ACME from connecting to Let's Encrypt and generate self-signed TLS certificates as a fallback
  • We need users to accept that self-signed certificate on their host
  • We need to check per project whether it's fine to run it without a proper domain name (e.g. localhost)
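To illustrate what the self-signed fallback means for the user, here is a sketch using plain `openssl` on the host. It is not how the demo environment would generate its certificate, only an example of what a localhost certificate looks like and how a client can trust it explicitly. The file names and the port in the usage note are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: generates a throwaway self-signed certificate for localhost.
# Requires OpenSSL 1.1.1 or newer for the -addext option.
make_localhost_cert() {
  local dir="${1:-.}"
  openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout "$dir/localhost.key" -out "$dir/localhost.crt" \
    -days 30 -subj "/CN=localhost" \
    -addext "subjectAltName=DNS:localhost,IP:127.0.0.1"
}
```

A client can then accept exactly this certificate instead of disabling verification, for example with `curl --cacert localhost.crt https://localhost:8443/`, where 8443 stands for whichever guest port is forwarded to the host.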
