Daemon architecture thoughts #1014

dylanPowers opened this issue Apr 5, 2015 · 14 comments

@dylanPowers
Member

These are some of the thoughts I've had on the ipfs daemon. In my time writing a linux init daemon for it I've felt the architecture to be very strange. I have two different proposals to shoot for and discuss.

Only have a system daemon

This idea is oriented around how most systems work. They have a system service and only certain users are granted read and/or write access. With this idea, users wouldn't run their own daemon nor have their own ipns entry. Permissions would be set up such that users must be in a specific group in order to write to the daemon and do things like pinning and publishing. Not everyone needs their own ipns entry. They just need some place to write to in the same way that traditional webservers work with multiple users. Alternatively, would it be even possible for a single daemon to be the authority on multiple ipns entries? I'm guessing that would require it to have multiple peer IDs. This idea also assumes that ipns would be the only hindrance and reason to stay away.

Have a system daemon and user daemons

This idea is oriented around the idea that every user on a system should have their own ipns entry, or at least the fact that a single system should be able to be the authority on multiple ipns entries. I assume there are more use cases than multiple ipns entries, but that's the first that popped into my head. The key to this is to split out the system stuff from the user stuff which also makes it more complicated.

System level:

  • Mounting /ipfs and /ipns
  • Gateway access

User level:

  • IPNS publishing
  • Pinning
  • Adding

It would be nice if the daemon had the following features to make running it as a system service cleaner:

  • Mounting needs to be done as root. Then we can reduce privileges when the mount has completed. This will make it so that people won't have to modify their fuse.conf to get the desired allow_other behavior.
  • /ipns/local doesn't make much sense for the system daemon. A globally writable /ipns/local makes even less sense. I noticed that I can't use chmod or chown within the /ipns directory either. Is this a fuse limitation, or have we simply not implemented that functionality yet?
  • Could mounting stop being a separate command and instead be done automatically by the daemon, enabled or disabled from the config (or with a --mount flag)?
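
A minimal sketch of the "mount as root, then reduce privileges" idea from the first bullet, written in Python rather than the daemon's own Go. The name `drop_privileges` and the injectable `os_mod` parameter are hypothetical and exist only for illustration; the ordering (supplementary groups, then gid, then uid) is the usual one for permanently giving up root.

```python
import os


def drop_privileges(uid, gid, os_mod=os):
    """Permanently drop from root to an unprivileged uid/gid.

    Hypothetical helper: call it only after the privileged work
    (here, the FUSE mount) has completed.
    """
    # Clear supplementary groups first; this still requires root.
    os_mod.setgroups([])
    # Change the group before the user: once setuid() has run, the
    # process no longer has permission to call setgid().
    os_mod.setgid(gid)
    os_mod.setuid(uid)
```

In a real daemon this would run right after the mount succeeds, with the uid and gid of a dedicated service account.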

Running the daemon as a user:

  • Improve the usability by having the daemon automatically start if it hasn't started when an ipfs command is given. The daemon can simply sit in the background.
  • Turn off the gateway.
  • Have /ipns/local mounted to their home directory.

Notes on all:

  • Lock down the localhost api servers so they aren't globally accessible by everyone on a system.

@whyrusleeping
Member

I think one thing we could do around having a 'system daemon' is to have the API port be served over a unix socket, that way we can set permissions on it (very much like docker does)
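
As a rough illustration of the permissions idea, here is a Python sketch that binds a Unix domain socket and restricts it to its owner and group. The function name `listen_unix` and the default mode are assumptions made for this example; nothing here reflects how ipfs or docker actually implement it.

```python
import os
import socket


def listen_unix(path, mode=0o660):
    """Bind a Unix domain socket and restrict who may connect to it."""
    # Remove a stale socket file left over from a previous run.
    if os.path.exists(path):
        os.unlink(path)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    # Tighten the umask while binding so the socket file is never
    # briefly visible with looser permissions.
    old_umask = os.umask(0o777 & ~mode)
    try:
        sock.bind(path)
    finally:
        os.umask(old_umask)
    # On Linux, connecting requires write permission on the socket file,
    # so 0o660 limits access to the owner and the owning group.
    os.chmod(path, mode)
    sock.listen(16)
    return sock
```

Combined with `os.chown` to a dedicated group, this would let membership in that group decide who can talk to the API.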

@jbenet
Member

jbenet commented Apr 8, 2015

They have a system service and only certain users are granted read and/or write access. With this idea, users wouldn't run their own daemon

this is definitely a way in which ipfs should be able to run. This is not the only way it should run. note that being able to poke at a system-wide ipfs node's repository would leak information about access patterns and use.

important: an ipfs node bears no direct relation to a unix user. an ipfs node may be owned by many users, or just one. ipfs currently places its repo in ~/.go-ipfs because it is convenient. but perhaps it may make sense to switch to /usr/local/ipfs/. or to check both places.

# try these in order
(every path in pwd)/.ipfs
~/.ipfs
/usr/local/ipfs
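
The lookup order above could be sketched roughly as follows. This is illustrative Python, not go-ipfs code: the function names are hypothetical, and "(every path in pwd)" is read here as the working directory followed by each of its parents, which is an interpretation.

```python
from pathlib import Path


def candidate_repo_paths(cwd, home, system=Path("/usr/local/ipfs")):
    """Yield repo locations in the proposed lookup order."""
    cwd = Path(cwd)
    # "(every path in pwd)/.ipfs": the working directory, then each parent.
    for directory in (cwd, *cwd.parents):
        yield directory / ".ipfs"
    yield Path(home) / ".ipfs"
    yield Path(system)


def find_repo(cwd, home, system=Path("/usr/local/ipfs")):
    """Return the first candidate that exists as a directory, or None."""
    for path in candidate_repo_paths(cwd, home, system):
        if path.is_dir():
            return path
    return None
```

A caller would typically pass `Path.cwd()` and `Path.home()`.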

Not everyone needs their own ipns entry.

Every distinct ipfs node needs a private key. the ipns entry comes for free with it.

Alternatively, would it be even possible for a single daemon to be the authority on multiple ipns entries?

this is already planned with the keystore.

While i generally am in favor of systemwide daemons, this is a security nightmare. I don't think it's wise to take on the challenge of providing per-user isolation. Thus, the daemon should only be "owned" by a single user. Think about this in terms of databases. it used to be the case that dbs came with full fledged user accounts and privileges (think mysql). today most simple dbs dont bother with that and just note that each process is owned by a single user. Thus: make it easy to run ipfs as a systemwide (not just single user) daemon but be clear to the user that no extra isolation is provided now. i.e. "it is single user".

another note: in the event many nodes are needed in one machine, it is possible to have multiple nodes/daemons share local blocks part of the repo. this of course leaks information.

The key to this is to split out the system stuff from the user stuff which also makes it more complicated.

yeah i dont want to go down this route for a while. it will be very difficult to get multi-user-isolation right and then work with it on every single command. if this is a real need it can come then.

Mounting needs to be done as root. Then we can reduce privileges when the mount has completed. This will make it so that people won't have to modify their fuse.conf to get the desired allow_other behavior.

i can mount fine as a non-root user (both osx and linux). and i haven't had to edit fuse.conf

/ipns/local doesn't make much sense for the system daemon. A globally writable /ipns/local makes even less sense.

agreed. ipns being writable in the whole system does not make sense if attempting to provide multi-user isolation.

Improve the usability by having the daemon automatically start if it hasn't started when an ipfs command is given. The daemon can simply sit in the background.

so far, when we've evaluated this, we've reached the conclusion that we do not want to start the daemon always, as sometimes we seek only to manipulate the local repo. (e.g. ipfs add ). it's a different use case.

Turn off the gateway.

very much agreed. this can be done currently by clearing the gateway config val. there was a goal to make a command to add ipfs gateway [ enable | disable ]

Lock down the localhost api servers so they aren't globally accessible by everyone on a system.
I think one thing we could do around having a 'system daemon' is to have the API port be served over a unix socket, that way we can set permissions on it (very much like docker does)

unix socket is a good call. multiaddr-net may already support it (so this might be a matter of just changing the multiaddr string), but i havent tested it.

@jbenet
Member

jbenet commented Apr 8, 2015

I should add that i think the way the daemon works right now is definitely clunky and will get better. see: https://github.com/ipfs/go-ipfs/labels/daemon%20%2B%20init

@dylanPowers
Member Author

I've been playing around with my linux system daemon some more and right now all my machines are following a centralized ipfs system daemon architecture like I first suggested. I'm also simulating per-user access by limiting read access on the config file. So that pattern is definitely nearly there. Just gotta lock up the security.

Mounting needs to be done as root. Then we can reduce privileges when the mount has completed. This will make it so that people won't have to modify their fuse.conf to get the desired allow_other behavior.

i can mount fine as a non-root user (both osx and linux). and i haven't had to edit fuse.conf

@jbenet If you're accessing as your own user there won't be problems. However, if you want to enable allow_other, so that system services and root can access the /ipfs and /ipns directories, that's where you'll have issues. On my distro user_allow_other is disabled by default, as it should be due to security considerations, causing

21:19:11.283 ERROR    ipnsfs: leveldb: closed <autogenerated>:24
Error: fusermount: "fusermount: option allow_other only allowed if 'user_allow_other' is set in /etc/fuse.conf\n", exit status 1
21:20:59.357 ERROR    ipnsfs: leveldb: closed <autogenerated>:24
Error: fusermount: "fusermount: option allow_other only allowed if 'user_allow_other' is set in /etc/fuse.conf\n", exit status 1

to be spat out when I try to mount with allow_other. Ideally we wouldn't require someone to set the system-wide user_allow_other flag and allow access by any service on the machine to /ipfs /ipns. If I remember correctly @hosh was pushing that feature for docker and I wanted it so I could play around with my webserver serving directly from those directories rather than proxying to the ipfs daemon gateway.
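
For anyone wanting to detect this situation before attempting the mount, a small Python sketch that checks whether the text of a fuse.conf file has an active `user_allow_other` line. The function name is hypothetical, and the parsing is assumed to be as simple as "strip comments, compare the remainder".

```python
def user_allow_other_enabled(conf_text):
    """Return True if fuse.conf text contains an active user_allow_other line."""
    for line in conf_text.splitlines():
        # Drop comments and surrounding whitespace.
        option = line.split("#", 1)[0].strip()
        if option == "user_allow_other":
            return True
    return False
```

It would be called with the contents of /etc/fuse.conf, for example `user_allow_other_enabled(open("/etc/fuse.conf").read())`.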

@Stebalien
Member

note that being able to poke at a system-wide ipfs node's repository would leak information about access patterns and use.

One can already poke at someone else's repository over bitswap so I don't think this is much of an issue. The only real defense against this is downloading blocks not requested by the user (as described by the paper).

but perhaps it may make sense to switch to /usr/local/ipfs/. or to check both places.

/var/lib/ipfs/

important: an ipfs node bears no direct relation to a unix user. an ipfs node may be owned by many users, or just one. ipfs currently places its repo in ~/.go-ipfs because it is convenient. but perhaps it may make sense to switch to /usr/local/ipfs/. or to check both places.

Allowing users to write to a shared directory is a recipe for disaster. The only sane way to do this is to use a global daemon.

While i generally am in favor of systemwide daemons, this is a security nightmare. I don't think it's wise to take on the challenge of providing per-user isolation. Thus, the daemon should only be "owned" by a single user. Think about this in terms of databases. it used to be the case that dbs came with full fledged user accounts and privileges (think mysql). today most simple dbs dont bother with that and just note that each process is owned by a single user. Thus: make it easy to run ipfs as a systemwide (not just single user) daemon but be clear to the user that no extra isolation is provided now. i.e. "it is single user".

Personally, I'd have a system-wide daemon, a thin daemon for managing mounts, and per-user daemons. The system-wide daemon would do bitswap, dht, merkledag (add, get, etc.), etc. but it would never interpret objects (no unixfs, tar, etc.). Users would connect to their per-user daemons which would in turn connect to the system daemon. These per-user daemons would interpret objects (unixfs, tar, etc.), manage keys, decrypt/encrypt, etc. The mount daemon would just proxy through to the user's daemon (starting it if necessary); fuse exposes the client's uid/gid so this should be reasonably easy.

Alternatively, we could have per-user mount daemons but then we lose the global namespace.

@jbenet
Member

jbenet commented Oct 9, 2015

One can already poke at someone else's repository over bitswap so I don't
think this is much of an issue. The only real defence against this is
downloading blocks not requested by the user (as described by the paper).

Not in all cases. this is the case by default now, but it wont be for long.
next you'll have private blocks, and later there will be policies
specifying what content is shared, and to whom.

Allowing users to write to a shared directory is a recipe for disaster.

Not if files are immutable and content addressed.

The only sane way to do this is to use a global daemon.

For some cases. There are many reasons to run multiple nodes per machine.
just as you might run multiple databases.

Personally, I'd have a system-wide daemon, a thin daemon for managing
mounts, and per-user daemons. The system-wide daemon would do bitswap, dht,
merkledag (add, get, etc.), etc. but it would never interpret objects (no
unixfs, tar, etc.). Users would connect to their per-user daemons which
would in turn connect to the system daemon. These per-user daemons would
interpret objects (unixfs, tar, etc.), manage keys, decrypt/encrypt, etc.
The mount daemon would just proxy through to the user's daemon (starting it
if necessary); fuse exposes the client's uid/gid so this should be
reasonably easy.

This seems much harder to get right IMO, and certainly to get right soon.
Of course one global daemon is much more efficient, but it does not capture
all the nuances of how data moves around and how data sharing policies will
work.

there is always a use case for user-based isolation and keeping things
completely separate. from there, we can optimize by using shared nodes the
way multiple IP hosts use a router/gateway. (they may not run all the
protocols, but certainly as FULL ipfs nodes, not stunted clients/proxies.
one of the main points is to make a distributed protocol where every entity
on the network can talk to any other).

@Stebalien
Member

Allowing users to write to a shared directory is a recipe for disaster.

Not if files are immutable and content addressed.

Sorry, I meant a shared normal directory (files that aren't immutable/content addressed). I was responding to your suggestion that ipfs look for the repo in both /usr/local/ipfs/ and ~/.go-ipfs.

The only sane way to do this is to use a global daemon.

For some cases. There are many reasons to run multiple nodes per machine. just as you might run multiple databases.

I meant that it's the only sane way to allow multiple users to read/write to a common directory (database).

Not in all cases. this is the case by default now, but it wont be for long. next you'll have private blocks, and later there will be policies specifying what content is shared, and to whom.

Ah. Sorry, I thought this was going to be done entirely in crypto (i.e., make the encrypted data public but keep the keys private).


My concerns with having no global daemon are:

  1. No global /ipfs mount. Not super important but it would be nice to have this.
  2. Disk usage (duplicate data storage).
  3. Network bandwidth. I assume that each user's daemon won't be running 24/7 so user daemons will often have to re-download content already present on the machine.
  4. Security. I'd have a global daemon run with reduced privileges running all the network facing services and then have the user daemons running behind the firewall verifying that the global daemon doesn't misbehave (checking the hashes).
  5. Ports. I don't want to have to allocate a port per user.
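
The hash check in point 4 could look roughly like this Python sketch. It assumes blocks are identified by a plain hex SHA-256 digest, which is a simplification (IPFS uses multihashes); the function name `block_matches` is hypothetical.

```python
import hashlib
import hmac


def block_matches(expected_sha256_hex, data):
    """Check that `data` hashes to the digest it was requested under."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())
```

A user daemon would run this on every block returned by the global daemon and discard anything that fails, so the global daemon never has to be trusted for integrity.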

@chris-martin

I'm a bit confused about the present state of the architecture. If an ipfs daemon belongs to each user, why isn't the default mount point within the user's home directory?

@Kubuxu
Member

Kubuxu commented Feb 28, 2016

@chris-martin It is so that paths are the same as in other URL approaches, e.g. fs:/ipfs/Qm....AAA.
If you have the fuse mount running you can just skip the fs:/.

My thoughts about system vs user daemons:
It is totally feasible to have a full system daemon and small user daemons on Unix-based systems using Unix domain sockets.
The system daemon would be responsible for communication, storage and fuse; user daemons would run the gateway and API ports.

Communication would take place over Unix sockets, which allow user identification and thus per-user resource limits.
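
To illustrate the user identification part: on Linux, the kernel reports the credentials of the process at the other end of a Unix socket via `SO_PEERCRED`. The Python sketch below is Linux-specific, and the function name `peer_credentials` is made up for this example.

```python
import socket
import struct


def peer_credentials(conn):
    """Return (pid, uid, gid) of the process on the other end (Linux only)."""
    # struct ucred is a signed pid followed by unsigned uid and gid.
    fmt = "iII"
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize(fmt))
    return struct.unpack(fmt, raw)
```

A system daemon could call this on each accepted connection and key quotas or permissions off the returned uid.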

The problem is localhost communication. Even the full loopback range (127.0.0.0/8) is small enough to allow port scanning and breaking into other people's API ports, which would have to be chosen either deterministically based on the UID or assigned by the system daemon. This means the API would need either a token scheme with permissions blocked by default, or an API key.
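
A sketch of the two pieces mentioned here, a port derived from the UID and a secret token guarding the API, in Python. `BASE_PORT`, `PORT_SPAN` and all function names are hypothetical values picked for illustration only.

```python
import hmac
import secrets

BASE_PORT = 20000   # hypothetical start of the per-user port range
PORT_SPAN = 10000   # hypothetical size of that range


def api_port_for_uid(uid):
    """Map a UID to a predictable API port inside the assumed range."""
    return BASE_PORT + (uid % PORT_SPAN)


def new_api_token():
    """Generate a random secret the user's client must present."""
    return secrets.token_hex(32)


def token_is_valid(expected, presented):
    """Compare tokens in constant time to avoid timing leaks."""
    return hmac.compare_digest(expected, presented)
```

Because the port is predictable (and two UIDs that differ by a multiple of `PORT_SPAN` would collide), the token is what actually protects the API; the port mapping is only a convenience.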

@NeoTheFox

NeoTheFox commented Sep 8, 2016

This is my take with systemd on a system-wide daemon using ipfs as it is right now:

Add ipfs user
useradd -d /var/ipfs -m -G fuse -r ipfs

Systemd service

[Unit]
Description=IPFS daemon

[Service]
Environment=IPFS_PATH=/var/ipfs/.ipfs
User=ipfs
Group=ipfs
ExecStart=/usr/bin/ipfs daemon
Restart=on-failure

[Install]
WantedBy=default.target

This can be enhanced to auto-start using the socket activation mechanism provided by systemd.

@Stebalien
Member

This can be enhanced to auto-start using socket mechanism provided by systemd.

Yes but that would require support from IPFS.
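
For context on what that support involves: under systemd's socket activation protocol, the service receives already-bound sockets as inherited file descriptors starting at 3, announced through the `LISTEN_PID` and `LISTEN_FDS` environment variables. A minimal Python sketch of the receiving side follows (function names are hypothetical; this is not how ipfs does or would do it):

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor in systemd's protocol


def inherited_fds(environ, pid):
    """Return the descriptors systemd passed to this process, if any."""
    # systemd sets LISTEN_PID to the intended recipient and LISTEN_FDS
    # to the number of sockets it passed.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def activated_sockets():
    """Wrap each inherited descriptor as a socket object."""
    return [socket.socket(fileno=fd)
            for fd in inherited_fds(os.environ, os.getpid())]
```

The daemon would then accept connections on these sockets instead of binding its own listeners.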

Add ipfs user

You can also use sysusers.d for this: u ipfs - "ipfs daemon" /var/lib/ipfs

Systemd service

  1. With ipfs, I believe you need to set KillSignal=SIGINT if you want it to shut down gracefully, but I'm not sure.
  2. You can also harden this quite a bit:
[Service]
# ...
# No running SUID programs
NoNewPrivileges=true
# No reading/writing from/to user home directories
ProtectHome=true
# No writing system files (does not allow IPFS to update itself).
ProtectSystem=full
# No shared `/tmp` (really, you should almost never need shared `/tmp`).
PrivateTmp=true
# No access to physical device files (`/dev/sda` etc.)
PrivateDevices=true

@NeoTheFox

But wouldn't ProtectHome prevent other users, even those in the ipfs group, from using ipfs?

@Stebalien
Member

@NeoTheFox That prevents the IPFS daemon from accessing normal users' home directories (i.e., /home). It doesn't prevent normal users from using IPFS.

@jbenet
Member

jbenet commented Sep 9, 2016

Btw, you can change the mount points in the config.
