My gripes with systemd
@@@@@@@@@@@@@@@@@@@@@@

:slug: systemd-gripes
:date: 2023-11-13 21:50:36
:tags: systemd, linux
:category: programming
:keywords: systemd, cgroup, programming, bug
:lang: en
:translation: false
:status: draft

This is mainly me complaining about systemd, in very concrete instances. I
will probably extend this in the future whenever some noteworthy bug appears
again.

I think I could love systemd…
=============================

Now, don't get me wrong. The *idea* behind systemd is a nice one: it is
awesome that we can track processes with cgroups and have fine-grained
control over them. In fact, as of now, my server uses this extensively to
limit the resources of various services, which lets me always SSH to it,
even when services try to use all the CPU and RAM.

The idea of everything using sockets, starting asynchronously and waiting on
sockets of services that have not started yet, also feels like a neat hack
(I do not yet know the downsides; it feels like there are some, but
whatever).

If this worked well, I would probably be happy. This is also the feeling I
get from the uselessd project, and I am sad it died.

… but I don't
=============

The actual systemd implementation, however, has quite a few issues and a
rather steady influx of new ones. (They do get fixed, but they still make it
into official releases, making for a painful experience for bleeding-edge
users.)

Many of my gripes feel like a lack of quality control: either the devs
forgot to consider some use cases, or they did not test thoroughly enough.
In a few cases, changes seem to have been merged without much thought. That
would be fine for a toy project, but IMHO it should not have become the main
implementation of a critical system component.

I also have the feeling that the project is too big for the devs'
capabilities, so seeing the impact of changes is hard.
This can be rephrased as "the devs are too incompetent to handle a project
this big/tangled". [#reach]_

Also, a lot has been written by other people:

- TODO
- TODO
- TODO

I do not necessarily agree with everything on the pages linked above, but I
am definitely not alone. However, this page serves to list my own views and
experiences, so I will try not to repeat what has already been said.

Here come the concrete cases:

Disabling the user instance
---------------------------

Applies to version: 247.3-7+deb11u4

A common systemd deployment spawns the user instance automatically on login.
There does not seem to be a documented way of disabling it, but masking
``default.target`` in the user instance seems to do the job. (Another way is
masking ``user@XXX.service`` in the system instance, if you have the
rights.)

Firstly, this is undocumented (and might break in the future). Secondly, if
you try to find out the current default target, it throws a lovely bullshit
error::

    $ systemctl --user get-default
    Failed to get default target: Operation not possible due to RF-kill

You know what 'wat' means? The message was transferred over D-Bus, and I am
pretty sure the kernel did not return ERFKILL; this is some internal
confusion of systemd's.

OOM-kill kills the whole session
--------------------------------

Applies to version: TODO! 252? (fixed in 252.4 iirc)

I was so mad at this one. Out of memory is a quite common condition for me,
because I happen to use a few programs that leak memory. I am well aware of
the risk that the kernel's OOM killer may kill some vital process, but I
also know that pretty much every time it is the leaking program that gets
shot, so it's not a big deal.

Then the new systemd landed. It started monitoring OOM kills, so that it
could terminate and fail the unit which ran out of memory. The idea is
probably not to leave the other processes of the same service running after
one of them was killed. That seems reasonable. For services.

The session is *also* a unit. No special handling.
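For the record, the masking described in the "Disabling the user instance"
section above boils down to something like this (a sketch; the behaviour is
undocumented and may change between versions, and the UID is just an
example)::

    # Inside a session: mask the user manager's default target, so the
    # user instance starts nothing.
    systemctl --user mask default.target

    # Or, from the system instance (needs root): mask the whole user
    # instance for one UID; 1000 is a placeholder.
    systemctl mask user@1000.service

Unmasking (``systemctl --user unmask default.target``) should restore the
original behaviour.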
Yes, *any* runaway process would kill the *entire* session. Or rather,
systemd would, proactively. This prompted quite a few strong words in the
issue report, and the systemd devs said such language was unwarranted. I am
siding with the reporter: sending SIGKILL to everything in a session is not
a sane behaviour, imho.

.. TODO: link to github

Hardcoded user.slice in logind's PAM module
-------------------------------------------

Applies to version: TODO

The PAM module, which takes care of creating the cgroup hierarchy, is
hardcoded to put everything under ``user.slice``.

.. TODO: link source code

This effectively prevents putting service/monitoring access over SSH in the
correct place in the hierarchy, i.e. somewhere under ``system.slice``.
(Well, you could skip the PAM module, but then everything ends up under
``sshd.service``.)

In my use case, I wanted to allocate a tiny amount of resources to the
monitoring (which logs in as a system account over SSH) and have the rest
available to users, so that I can monitor the machine even when the users
use all available memory or forkbomb it. Can't do.

Generators
----------

Applies to version: TODO

Systemd wants almost everything to be a unit. Unfortunately, not everything
natively is, so systemd needs generators to create the units ad hoc.
However, generators run very early, so many facilities of systemd are not
available, making for a very hacky feel. Namely: there is no logging, so
generators need to log to the kernel's ring buffer, and they *cannot* be
disabled in any way.

Quite naturally, generators form another layer in the stack, and as such,
they can introduce bugs. For me, the pain point was when the fstab generator
(which creates ``.mount`` units) did not parse the ``noauto`` or ``_netdev``
flag correctly, thus hanging every single boot of my machine for 90 seconds,
waiting for a drive that I expected not to be available. When I learned the
nature of the problem, I pulled systemd out of the system.
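For illustration, the kind of ``/etc/fstab`` entry involved looks something
like this (the device and mountpoint are made up; the flags are the ones the
generator mis-parsed)::

    # A network share that is not always reachable: ``noauto`` says do not
    # mount it at boot, ``_netdev`` says it needs the network when mounted.
    //nas/media  /mnt/media  cifs  noauto,_netdev  0  0

With these flags, no boot should ever wait on that device.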
(I was already aware of many other deficiencies of systemd at that time,
this sounded like fun, and surprisingly the first random attempts at
removing systemd worked well enough, so I did not revert that state.)

Spooky action at a distance w.r.t. logging from user instance
-------------------------------------------------------------

Applies to version: TODO

Services have logs, and logs are to be read. User services also have logs,
but systemd *sometimes* fails at the reading part.

.. TODO: this assert, broken login, D-Bus and weird interactions, various
   quality of modules, generator and fstab

Can systemd fix this?
=====================

I think it might be possible, since most of these issues stem from not
caring very much about various use cases. So an end-to-end test suite
covering the different use cases could help avoid many of these problems. I
have no idea how many people work on systemd and what the community looks
like, so I don't know whether the test suite or other remediations would be
viable.

Do I use systemd?
=================

Tough question. I use several machines at different stages of
systemd-eviction:

- As said, my server uses systemd quite extensively. Each service has its
  own user, whose user instance runs the service; the system instance
  manages resource limits on the user instances (``user-XXXX.slice`` from
  the system instance's POV). A few services, for which I wanted to have
  logs easily accessible, run under s6.
- My desktop is the complete opposite: the system part is pretty stable, so
  it is only started with a trivial shell script (without supervision or
  service management). The user's services run under s6-rc. [#stability]_
  There is no systemd anywhere.
- My laptops and other computers usually run a system-wide systemd, but
  often have a very tweaked user instance – either masked altogether, or
  with a non-default setup of the slices and services, or at least with some
  services masked.
  And I am very much not afraid to kill the user systemd on the slightest
  issue, because I have debugged the s6-rc solution.

The main reason for me to use systemd is that I want to use Arch's pre-built
packages (esp. on low-end hardware) and have them work without rebuilding.
(I am considering setting up a build server to build packages for my other
machines, but it is a rather low priority and thus has not happened yet.)
And systemd somehow manages to only suck sometimes :-)

-------

.. [#reach] This is a rather bold statement. If you disagree even after
   reading *the whole article*, please reach me at ``__, I would like to
   know your POV.

.. [#stability] The requirement for a service manager comes from Linux's
   sound system and its many quirks. I found out that when PipeWire (or one
   of its companions) crashes, it sometimes brings down other services
   (MPD), which is unpleasant. On the other hand, the OpenSSH daemon does
   not crash, like, ever, and IP configuration is a oneshot, so there is no
   need for supervision on the system part.