Simplify capability management for containers
Currently, if you want to set specific capabilities on a container, you need to set both
drop_capabilities: false
docker_options: "--cap-add=FOO"
in your services.yml, and even this fails unless the container runs as root!
The reason is that "podman run --user" leaves the containerized process with an empty current capability set: the added capabilities remain in the bounding set, but the process no longer holds them, e.g.:
# podman run --cap-drop=all --rm registry.git.autistici.org/ai3/docker/cap-test:master
Current: =
Bounding set =
# podman run --cap-drop=ALL --cap-add=NET_BIND_SERVICE --rm registry.git.autistici.org/ai3/docker/cap-test:master
Current: = cap_net_bind_service+eip
Bounding set =cap_net_bind_service
# podman run --user=1000 --cap-drop=ALL --cap-add=NET_BIND_SERVICE --rm registry.git.autistici.org/ai3/docker/cap-test:master
Current: =
Bounding set =cap_net_bind_service
This ultimately means that you have to set file capabilities on the binary inside the container image, so that the process regains the capabilities allowed by the bounding set when it is executed. This is, uh, not very practical.
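For illustration, the file-capability workaround would look roughly like the sketch below, run as root at image build time. The function name grant_file_caps and the example path are hypothetical, and it assumes setcap is available in the image (on Debian it ships in libcap2-bin).

```shell
#!/bin/sh
# Hypothetical build-time helper: attach file capabilities to a binary so
# that a non-root process regains them on exec(), provided they are still
# in the bounding set. Must run as root; needs setcap in the image.
grant_file_caps() {
    binary="$1"   # path to the server binary (placeholder)
    caps="$2"     # e.g. cap_net_bind_service
    # "+ep" marks the capability as permitted and effective on the file.
    setcap "${caps}=+ep" "$binary"
}

# Example invocation during the image build (path is a placeholder):
#   grant_file_caps /usr/local/bin/myserver cap_net_bind_service
```

Every image that needs a capability would have to carry a step like this, which is the impractical part.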
So, there are a couple of things we should do:
- make capabilities a primary attribute of the container object in services.yml
- have systemd do the setuid() and set capabilities, drop the --user flag from podman
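As a rough sketch of the first point: if a container listed its capabilities directly (say as a "capabilities" attribute, which is an assumed name), the tooling could derive the podman flags from that list instead of relying on drop_capabilities and docker_options. The helper cap_flags below is hypothetical and only covers the flag generation, not the systemd side.

```shell
#!/bin/sh
# Hypothetical helper: translate a list of capability names into podman
# flags. Everything is dropped first, then only the listed ones are added.
cap_flags() {
    flags="--cap-drop=ALL"
    for cap in "$@"; do
        flags="$flags --cap-add=$cap"
    done
    printf '%s\n' "$flags"
}

# Example: a container that declares NET_BIND_SERVICE.
cap_flags NET_BIND_SERVICE
# prints: --cap-drop=ALL --cap-add=NET_BIND_SERVICE
```

With no arguments the helper emits just --cap-drop=ALL, so a container that declares nothing keeps today's default of dropping everything.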