From 579e611fc9c2c8a1b35ff6d351a51da139514eb6 Mon Sep 17 00:00:00 2001
From: Pavel 'LEdoian' Turinsky
Date: Sat, 2 Dec 2023 23:47:40 +0100
Subject: [PATCH] Systemd-gripes first flush

This is going to be an ever-evolving list of *my* experiences. Currently,
there are a lot of details missing :-/
---
 content/systemd-gripes.rst | 213 +++++++++++++++++++++++++++++++++++++
 1 file changed, 213 insertions(+)
 create mode 100644 content/systemd-gripes.rst

diff --git a/content/systemd-gripes.rst b/content/systemd-gripes.rst
new file mode 100644
index 0000000..0a49b28
--- /dev/null
+++ b/content/systemd-gripes.rst
@@ -0,0 +1,213 @@
+My gripes with systemd
+@@@@@@@@@@@@@@@@@@@@@@
+
+:slug: systemd-gripes
+:date: 2023-11-13 21:50:36
+:tags: systemd, linux
+:category: programming
+:keywords: systemd, cgroup, programming, bug
+:lang: en
+:translation: false
+:status: draft
+
+This is mainly me complaining about systemd, in very concrete instances. I
+will probably extend it in the future, whenever some noteworthy bug appears
+again.
+
+I think I could love systemd…
+=============================
+
+Now, don't get me wrong. The *idea* behind systemd is a nice one: it is
+awesome that we can track processes with cgroups and have fine-grained
+control over them. In fact, as of now, my server uses this extensively to
+limit the resources of various services, which allows me to always SSH into
+it, even when the services try to use all the CPU and RAM.
+
+The idea of everything using sockets, with services starting asynchronously
+and blocking on the sockets of services that have not started yet, also
+feels like a neat hack. (I do not know the downsides yet; it feels like
+there are some, but whatever.)
+
+If this worked well, I would probably be happy. This is also the feeling I
+get from the uselessd project, and I am sad it died.
+
+… but I don't
+=============
+
+The actual systemd implementation, however, has quite a few issues and a
+rather steady influx of new ones. (They do get fixed, but they still make
+it into official releases, which makes for a painful experience for
+bleeding-edge users.)
+
+Many of my gripes feel like a lack of quality control: either the devs
+forgot to consider some use cases, or they did not test thoroughly enough.
+In a few cases, the changes seem to have been merged without much thought.
+That would be fine for a toy project, but such a project IMHO should not
+have become the main implementation of a critical part of the system.
+
+I also have the feeling that the project is too big for the devs'
+capabilities, so seeing the impact of a change is hard. This can be
+rephrased as "the devs are too incompetent to handle a project this
+big/tangled". [#reach]_
+
+Also, a lot has been written by other people:
+
+- TODO
+- TODO
+- TODO
+
+I do not necessarily agree with everything on the pages linked above, but
+I am definitely not alone. However, this page serves to list my own views
+and experiences, so I will try not to repeat what has already been said.
+
+Here come the concrete cases:
+
+Disabling the user instance
+---------------------------
+
+Applies to version: 247.3-7+deb11u4
+
+A common systemd deployment spawns the user instance automatically when a
+user logs in. There does not seem to be a documented way of disabling it,
+but masking ``default.target`` in the user instance seems to do the job.
+(Another way is masking the ``user@XXX.service`` in the system instance,
+if you have the rights.)
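+
+For concreteness, this is roughly what the masking looks like; a sketch of
+an undocumented behaviour rather than a supported interface, and the UID
+is just an example::
+
+    $ systemctl --user mask default.target   # in the user instance
+    $ sudo systemctl mask user@1000.service  # in the system instance
+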
+Firstly, this is undocumented (and might break in the future). Secondly,
+if you then try to find out the current default target, it throws a lovely
+bullshit error::
+
+    $ systemctl --user get-default
+    Failed to get default target: Operation not possible due to RF-kill
+
+You know what 'wat' means? The message was transferred through D-Bus, and
+I am pretty sure the kernel did not return ERFKILL; this is some internal
+confusion of systemd's.
+
+OOM-kill kills the whole session
+--------------------------------
+
+Applies to version: TODO! 252? (fixed in 252.4, IIRC)
+
+I was so mad at this one.
+
+Running out of memory is a quite common condition for me, because I happen
+to use a few programs that leak memory. I am well aware of the risk that
+the kernel's OOM killer may kill some vital process, but I also know that
+pretty much every time it is the leaking program that gets shot, so it is
+not a big deal.
+
+Then a new systemd landed. It started monitoring OOM kills, so that it
+could terminate and fail any unit that ran out of memory. The idea is
+probably not to leave a service running with only some of its processes.
+
+That seems reasonable. For services. The session is *also* a unit. No
+special handling.
+
+Yes, *any* runaway process would kill the *entire* session. Or rather,
+systemd would, proactively.
+
+The issue report contained quite a few strong words, and the systemd devs
+said such language was unwarranted. I am siding with the reporter: sending
+SIGKILL to everything in a session is not sane behaviour, IMHO.
+
+.. TODO: link to github
+
+
+Hardcoded user.slice in logind's PAM module
+-------------------------------------------
+
+Applies to version: TODO
+
+The PAM module, which takes care of creating the cgroup hierarchy, is
+hardcoded to put everything under ``user.slice``.
+
+.. TODO: link source code
+
+This effectively prevents service/monitoring access over SSH from being in
+the correct place in the hierarchy, i.e. somewhere under ``system.slice``.
+(Well, you could skip the PAM module, but then everything ends up under
+``sshd.service``.)
+
+In my use case, I wanted to allocate a tiny amount of resources to the
+monitoring (which logs in as a system account over SSH) and have the rest
+available to the users, so that I can monitor the machine even when the
+users use up all available memory or forkbomb it. Can't do.
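+
+To be fair, the capping half of the setup is expressible; what is not is
+getting the monitoring session *out* of ``user.slice``. A minimal sketch
+of the kind of drop-in I mean, assuming cgroup v2 and a systemd recent
+enough to support template drop-ins (the limits are made-up examples)::
+
+    # /etc/systemd/system/user-.slice.d/50-limits.conf
+    # Applies to every user-UID.slice, i.e. to all user sessions at once.
+    [Slice]
+    MemoryMax=80%
+    CPUWeight=100
+
+But since pam_systemd puts the monitoring session under ``user.slice`` as
+well, it competes with the very users it is supposed to watch.
+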
+Generators
+----------
+
+Applies to version: TODO
+
+Systemd wants almost everything to be a unit. Unfortunately, not everything
+natively is one, so systemd needs generators to create the units ad hoc.
+However, generators run very early, when many facilities of systemd are
+not available yet, which makes for a very hacky feel. Namely: there is no
+logging, so generators need to log to the kernel's ring buffer, and they
+*cannot* be disabled in any way.
+
+Quite naturally, generators form another layer in the stack, and as such
+they can introduce bugs.
+
+For me, the pain point was when the fstab generator (which creates
+``.mount`` units) did not parse the ``noauto`` or ``_netdev`` flag
+correctly, thus hanging every single boot of my machine for 90 seconds,
+waiting for a drive that I expected not to be available.
+
+When I learned the nature of the problem, I pulled systemd out of the
+system. (I was already aware of many other deficiencies of systemd at that
+time, removing it sounded like fun, and surprisingly the first random
+attempts at removing systemd worked well enough, so I did not revert that
+state.)
+
+Spooky action at a distance w.r.t. logging from user instance
+--------------------------------------------------------------
+
+Applies to version: TODO
+
+Services have logs, and logs are to be read. User services also have logs,
+but systemd *sometimes* fails at the reading part. Namely, this
+
+.. TODO: finish this section. Notes to self: assert and broken login,
+   D-bus and weird interactions, various quality of modules, generator
+   and fstab.
+
+Can systemd fix this?
+=====================
+
+I think it might be possible, since most of these issues stem from not
+caring very much about the various use cases. So an end-to-end test suite
+covering the different use cases could help avoid many of these problems.
+
+I have no idea how many people work on systemd or what the community looks
+like, so I don't know whether the test suite or other remediations would
+be viable.
+
+Do I use systemd?
+=================
+
+Tough question. I use several machines at different stages of systemd
+eviction:
+
+- As said, my server uses systemd quite extensively. Each service has its
+  own user, whose user instance runs the service, and the system instance
+  manages resource limits on the user instances (``user-XXXX.slice`` from
+  the system instance's POV). A few services for which I wanted to have
+  the logs accessible run under s6.
+- My desktop is the complete opposite: the system part is pretty stable,
+  so it is only started with a trivial shell script (without supervision
+  or service management). The user services run under s6-rc. [#stability]_
+  There is no systemd anywhere.
+- My laptops and other computers usually run the system-wide systemd, but
+  often have a heavily tweaked user instance – either masked altogether,
+  or with a non-default setup of the slices and services, or at least with
+  some services masked. And I am very much not afraid to kill the user
+  systemd on the slightest issue, because I have debugged the s6-rc
+  solution.
+
+The main reason for me to use systemd is that I want to use the pre-built
+Arch packages (esp. on low-end hardware) and have them work without
+rebuilding. (I am considering creating a build server to build packages
+for my other machines, but it is a rather low priority and thus has not
+happened yet.)
+
+And systemd somehow manages to only suck sometimes :-)
+
+-------
+
+.. [#reach] This is a rather bold statement. If you disagree even after
+   reading *the whole article*, please reach me at ``__, I would like to
+   know your POV.
+
+.. [#stability] The requirement for a service manager is Linux's sound
+   system and its many quirks. I found out that when PipeWire (or one of
+   its companions) crashes, it sometimes brings down other services (MPD),
+   which is unpleasant. On the other hand, the OpenSSH daemon does not
+   crash, like, ever, and IP configuration is a oneshot, so there is no
+   need for supervision on the system part.