From 579e611fc9c2c8a1b35ff6d351a51da139514eb6 Mon Sep 17 00:00:00 2001
From: Pavel 'LEdoian' Turinsky
Date: Sat, 2 Dec 2023 23:47:40 +0100
Subject: [PATCH] Systemd-gripes first flush

This is going to be an ever-evolving list of *my* experiences. Currently,
there are a lot of details missing :-/
---
 content/systemd-gripes.rst | 213 +++++++++++++++++++++++++++++++++++++
 1 file changed, 213 insertions(+)
 create mode 100644 content/systemd-gripes.rst

diff --git a/content/systemd-gripes.rst b/content/systemd-gripes.rst
new file mode 100644
index 0000000..0a49b28
--- /dev/null
+++ b/content/systemd-gripes.rst
@@ -0,0 +1,213 @@
+My gripes with systemd
+@@@@@@@@@@@@@@@@@@@@@@
+
+:slug: systemd-gripes
+:date: 2023-11-13 21:50:36
+:tags: systemd, linux
+:category: programming
+:keywords: systemd, cgroup, programming, bug
+:lang: en
+:translation: false
+:status: draft
+
+This is mainly me complaining about systemd, in very concrete instances. I
+will probably extend it in the future, whenever some noteworthy bug appears
+again.
+
+I think I could love systemd…
+=============================
+
+Now, don't get me wrong. The *idea* behind systemd is a nice one: it is
+awesome that we can track processes with cgroups and have fine-grained
+control over them. In fact, as of now, my server uses this extensively to
+limit the resources of various services, which allows me to always SSH into
+it, even when the services try to use all the CPU and RAM.
+
+The idea of everything using sockets, with services starting asynchronously
+and blocking on the sockets of services that have not started yet, also
+feels like a neat hack. (I do not know the downsides yet; it feels like
+there are some, but whatever.)
+
+If this worked well, I would probably be happy. This is also the feeling I
+get from the uselessd project, and I am sad it died.
+
+… but I don't
+=============
+
+The actual systemd implementation, however, has quite a few issues and a
+rather steady influx of new ones. (They do get fixed, but they still make
+it into official releases, which makes for a painful experience for
+bleeding-edge users.)
+
+Many of my gripes feel like a lack of quality control: either the devs
+forgot to consider some use cases, or they did not test thoroughly enough.
+In a few cases, the changes seem to have been merged without much thought.
+That would be fine for a toy project, but such a project IMHO should not
+have become the main implementation of a critical part of the system.
+
+I also have the feeling that the project is too big for the devs'
+capabilities, so seeing the impact of a change is hard. This can be
+rephrased as "the devs are too incompetent to handle a project this
+big/tangled". [#reach]_
+
+Also, a lot has been written by other people:
+
+- TODO
+- TODO
+- TODO
+
+I do not necessarily agree with everything on the pages linked above, but
+I am definitely not alone. However, this page serves to list my own views
+and experiences, so I will try not to repeat what has already been said.
+
+Here come the concrete cases:
+
+Disabling the user instance
+---------------------------
+
+Applies to version: 247.3-7+deb11u4
+
+A common systemd deployment spawns the user instance automatically when a
+user logs in. There does not seem to be a documented way of disabling it,
+but masking ``default.target`` in the user instance seems to do the job.
+(Another way is masking the ``user@XXX.service`` in the system instance,
+if you have the rights.)
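+
+For concreteness, this is roughly what the masking looks like; a sketch of
+an undocumented behaviour rather than a supported interface, and the UID
+is just an example::
+
+    $ systemctl --user mask default.target   # in the user instance
+    $ sudo systemctl mask user@1000.service  # in the system instance
+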
+Firstly, this is undocumented (and might break in the future). Secondly,
+if you then try to find out the current default target, it throws a lovely
+bullshit error::
+
+    $ systemctl --user get-default
+    Failed to get default target: Operation not possible due to RF-kill
+
+You know what 'wat' means? The message was transferred through D-Bus, and
+I am pretty sure the kernel did not return ERFKILL; this is some internal
+confusion of systemd's.
+
+OOM-kill kills the whole session
+--------------------------------
+
+Applies to version: TODO! 252? (fixed in 252.4, IIRC)
+
+I was so mad at this one.
+
+Running out of memory is a quite common condition for me, because I happen
+to use a few programs that leak memory. I am well aware of the risk that
+the kernel's OOM killer may kill some vital process, but I also know that
+pretty much every time it is the leaking program that gets shot, so it is
+not a big deal.
+
+Then a new systemd landed. It started monitoring OOM kills, so that it
+could terminate and fail any unit that ran out of memory. The idea is
+probably not to leave a service running with only some of its processes.
+
+That seems reasonable. For services. The session is *also* a unit. No
+special handling.
+
+Yes, *any* runaway process would kill the *entire* session. Or rather,
+systemd would, proactively.
+
+The issue report contained quite a few strong words, and the systemd devs
+said such language was unwarranted. I am siding with the reporter: sending
+SIGKILL to everything in a session is not sane behaviour, IMHO.
+
+.. TODO: link to github
+
+
+Hardcoded user.slice in logind's PAM module
+-------------------------------------------
+
+Applies to version: TODO
+
+The PAM module, which takes care of creating the cgroup hierarchy, is
+hardcoded to put everything under ``user.slice``.
+
+.. TODO: link source code
+
+This effectively prevents service/monitoring access over SSH from being in
+the correct place in the hierarchy, i.e. somewhere under ``system.slice``.
+(Well, you could skip the PAM module, but then everything ends up under
+``sshd.service``.)
+
+In my use case, I wanted to allocate a tiny amount of resources to the
+monitoring (which logs in as a system account over SSH) and have the rest
+available to the users, so that I can monitor the machine even when the
+users use up all available memory or forkbomb it. Can't do.
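+
+To be fair, the capping half of the setup is expressible; what is not is
+getting the monitoring session *out* of ``user.slice``. A minimal sketch
+of the kind of drop-in I mean, assuming cgroup v2 and a systemd recent
+enough to support template drop-ins (the limits are made-up examples)::
+
+    # /etc/systemd/system/user-.slice.d/50-limits.conf
+    # Applies to every user-UID.slice, i.e. to all user sessions at once.
+    [Slice]
+    MemoryMax=80%
+    CPUWeight=100
+
+But since pam_systemd puts the monitoring session under ``user.slice`` as
+well, it competes with the very users it is supposed to watch.
+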
+Generators
+----------
+
+Applies to version: TODO
+
+Systemd wants almost everything to be a unit. Unfortunately, not everything
+natively is one, so systemd needs generators to create the units ad hoc.
+However, generators run very early, when many facilities of systemd are
+not available yet, which makes for a very hacky feel. Namely: there is no
+logging, so generators need to log to the kernel's ring buffer, and they
+*cannot* be disabled in any way.
+
+Quite naturally, generators form another layer in the stack, and as such
+they can introduce bugs.
+
+For me, the pain point was when the fstab generator (which creates
+``.mount`` units) did not parse the ``noauto`` or ``_netdev`` flag
+correctly, thus hanging every single boot of my machine for 90 seconds,
+waiting for a drive that I expected not to be available.
+
+When I learned the nature of the problem, I pulled systemd out of the
+system. (I was already aware of many other deficiencies of systemd at that
+time, removing it sounded like fun, and surprisingly the first random
+attempts at removing systemd worked well enough, so I did not revert that
+state.)
+
+Spooky action at a distance w.r.t. logging from user instance
+--------------------------------------------------------------
+
+Applies to version: TODO
+
+Services have logs, and logs are to be read. User services also have logs,
+but systemd *sometimes* fails at the reading part. Namely, this
+
+.. TODO: finish this section. Notes to self: assert and broken login,
+   D-bus and weird interactions, various quality of modules, generator
+   and fstab.
+
+Can systemd fix this?
+=====================
+
+I think it might be possible, since most of these issues stem from not
+caring very much about the various use cases. So an end-to-end test suite
+covering the different use cases could help avoid many of these problems.
+
+I have no idea how many people work on systemd or what the community looks
+like, so I don't know whether the test suite or other remediations would
+be viable.
+
+Do I use systemd?
+=================
+
+Tough question. I use several machines at different stages of systemd
+eviction:
+
+- As said, my server uses systemd quite extensively. Each service has its
+  own user, whose user instance runs the service, and the system instance
+  manages resource limits on the user instances (``user-XXXX.slice`` from
+  the system instance's POV). A few services for which I wanted to have
+  the logs accessible run under s6.
+- My desktop is the complete opposite: the system part is pretty stable,
+  so it is only started with a trivial shell script (without supervision
+  or service management). The user services run under s6-rc. [#stability]_
+  There is no systemd anywhere.
+- My laptops and other computers usually run the system-wide systemd, but
+  often have a heavily tweaked user instance – either masked altogether,
+  or with a non-default setup of the slices and services, or at least with
+  some services masked. And I am very much not afraid to kill the user
+  systemd on the slightest issue, because I have debugged the s6-rc
+  solution.
+
+The main reason for me to use systemd is that I want to use the pre-built
+Arch packages (esp. on low-end hardware) and have them work without
+rebuilding. (I am considering creating a build server to build packages
+for my other machines, but it is a rather low priority and thus has not
+happened yet.)
+
+And systemd somehow manages to only suck sometimes :-)
+
+-------
+
+.. [#reach] This is a rather bold statement. If you disagree even after
+   reading *the whole article*, please reach me at ``__, I would like to
+   know your POV.
+
+.. [#stability] The requirement for a service manager is Linux's sound
+   system and its many quirks. I found out that when PipeWire (or one of
+   its companions) crashes, it sometimes brings down other services (MPD),
+   which is unpleasant. On the other hand, the OpenSSH daemon does not
+   crash, like, ever, and IP configuration is a oneshot, so there is no
+   need for supervision on the system part.