DEV Community

olaboyejo

Linux Capabilities Use Cases - systemd

Introduction

The last post was a discussion on capabilities sets and the bits that make up the sets. We also saw how we can examine the capabilities sets of threads and processes. In this post, we will be looking at the application of capabilities to processes managed by systemd.

systemd

systemd is the default system and service manager on most modern Linux distributions. When run as the first process on boot (PID 1), it manages the startup and operation of userspace services. As the init system, the systemd process runs with an effective user ID of 0.

(base) boye@hp7940m1:~$ ps -fp 1
UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 Aug16 ?        00:00:41 /sbin/init splash
(base) boye@hp7940m1:~$ 

Process Examination of systemd

To explore the capabilities sets of the processes managed by systemd, and the configuration settings that change the default behaviour, we will be using a small Python script for convenience. The script (cap_display) is on GitHub, in a repository that also holds some of the other files we will be using for demonstration. The README.md file has the descriptions and installation instructions for these files.

The cap_display script scrapes the /usr/include/linux/capability.h file to create a mapping between capability names and their bit values. It also reads the hexadecimal representation of the capabilities set for a PID in /proc/PID/status file and uses the mapping to print a human-friendly representation of the capabilities set.
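
To make the idea concrete, here is a minimal sketch of that approach. It is not the cap_display source; the function names and the three-line sample header are our own, chosen so the example runs without the kernel headers installed.

```python
import re

# Matches lines such as "#define CAP_NET_BIND_SERVICE 10" and skips
# macros whose value is not a plain integer (for example CAP_LAST_CAP).
_DEFINE_RE = re.compile(r"^#define\s+(CAP_[A-Z_]+)\s+(\d+)\s*$", re.MULTILINE)


def parse_capability_header(text):
    """Return {bit_number: lowercase_name} parsed from header text."""
    return {int(num): name.lower() for name, num in _DEFINE_RE.findall(text)}


def decode_mask(hex_mask, names):
    """List the names of the bits set in hex_mask, highest bit first."""
    mask = int(hex_mask, 16)
    set_bits = [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]
    return [names.get(bit, f"unknown_{bit}") for bit in reversed(set_bits)]


# A short excerpt in the style of capability.h (hypothetical sample input).
SAMPLE_HEADER = """\
#define CAP_CHOWN            0
#define CAP_NET_BIND_SERVICE 10
#define CAP_LAST_CAP         CAP_CHECKPOINT_RESTORE
"""

if __name__ == "__main__":
    names = parse_capability_header(SAMPLE_HEADER)
    print(decode_mask("0000000000000401", names))
```

On a real system the header text would come from reading /usr/include/linux/capability.h, and the mask from the Cap* lines of /proc/PID/status.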

(capabilities_show) (base) boye@hp7940m1:~/Documents/dev/capabilities_show$ cap_display --pid 1
Name:   systemd
Tgid:   1
Pid:    1
PPid:   0
Uid:    0       0       0       0
Gid:    0       0       0       0

CapInh:  0000000000000000        None
CapPrm:  0000003fffffffff        cap_audit_read,cap_block_suspend,cap_wake_alarm,cap_syslog,cap_mac_admin,cap_mac_override,cap_setfcap,cap_audit_control,cap_audit_write,cap_lease,cap_mknod,cap_sys_tty_config,cap_sys_time,cap_sys_resource,cap_sys_nice,cap_sys_boot,cap_sys_admin,cap_sys_pacct,cap_sys_ptrace,cap_sys_chroot,cap_sys_rawio,cap_sys_module,cap_ipc_owner,cap_ipc_lock,cap_net_raw,cap_net_admin,cap_net_broadcast,cap_net_bind_service,cap_linux_immutable,cap_setpcap,cap_setuid,cap_setgid,cap_kill,cap_fsetid,cap_fowner,cap_dac_read_search,cap_dac_override,cap_chown
CapEff:  0000003fffffffff        cap_audit_read,cap_block_suspend,cap_wake_alarm,cap_syslog,cap_mac_admin,cap_mac_override,cap_setfcap,cap_audit_control,cap_audit_write,cap_lease,cap_mknod,cap_sys_tty_config,cap_sys_time,cap_sys_resource,cap_sys_nice,cap_sys_boot,cap_sys_admin,cap_sys_pacct,cap_sys_ptrace,cap_sys_chroot,cap_sys_rawio,cap_sys_module,cap_ipc_owner,cap_ipc_lock,cap_net_raw,cap_net_admin,cap_net_broadcast,cap_net_bind_service,cap_linux_immutable,cap_setpcap,cap_setuid,cap_setgid,cap_kill,cap_fsetid,cap_fowner,cap_dac_read_search,cap_dac_override,cap_chown
CapBnd:  0000003fffffffff        cap_audit_read,cap_block_suspend,cap_wake_alarm,cap_syslog,cap_mac_admin,cap_mac_override,cap_setfcap,cap_audit_control,cap_audit_write,cap_lease,cap_mknod,cap_sys_tty_config,cap_sys_time,cap_sys_resource,cap_sys_nice,cap_sys_boot,cap_sys_admin,cap_sys_pacct,cap_sys_ptrace,cap_sys_chroot,cap_sys_rawio,cap_sys_module,cap_ipc_owner,cap_ipc_lock,cap_net_raw,cap_net_admin,cap_net_broadcast,cap_net_bind_service,cap_linux_immutable,cap_setpcap,cap_setuid,cap_setgid,cap_kill,cap_fsetid,cap_fowner,cap_dac_read_search,cap_dac_override,cap_chown
CapAmb:  0000000000000000        None
(capabilities_show) (base) boye@hp7940m1:~/Documents/dev/capabilities_show$ 

Above is an examination of PID 1 using the script. Here we can see that the UID and GID are 0, along with the effective, permitted and bounding capabilities sets for the process. They have all their bits set, which is expected for a process with EUID=0.
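
As a quick cross-check of "all their bits set", the sketch below parses the five Cap* fields as they appear in a /proc/PID/status file and compares them against a mask with the 38 lowest bits set. The helper name read_cap_sets is our own, and the sample text simply repeats the hexadecimal values from the PID 1 output above.

```python
def read_cap_sets(status_text):
    """Return {"CapInh": int, ...} from the text of a /proc/<pid>/status file."""
    sets = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("CapInh", "CapPrm", "CapEff", "CapBnd", "CapAmb"):
            sets[key] = int(value.strip(), 16)
    return sets


# The five Cap* values reported for PID 1 in the output above.
PID1_STATUS = """\
CapInh:\t0000000000000000
CapPrm:\t0000003fffffffff
CapEff:\t0000003fffffffff
CapBnd:\t0000003fffffffff
CapAmb:\t0000000000000000
"""

if __name__ == "__main__":
    caps = read_cap_sets(PID1_STATUS)
    # 0x3fffffffff has its 38 lowest bits set: capabilities 0 through 37.
    full = (1 << 38) - 1
    for name, mask in caps.items():
        note = "all 38 bits set" if mask == full else "not full"
        print(name, f"{mask:016x}", note)
```

To inspect a live process, the same function can be given the contents of /proc/PID/status read from disk. Kernels that define more capabilities than this one will show a wider mask for a fully privileged process.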

We will be looking at three configuration options in systemd unit files that can influence the capabilities set for services managed by systemd. These are the User, AmbientCapabilities and CapabilityBoundingSet options.

Default behaviour for processes managed by systemd

We will be exploring the behaviour of processes managed by systemd from the perspective of a dummy service unit that tells the current time. This service is a Go application (DayTimeServer), which you can find in the post series repository. The code is adapted from one of the examples in this book.

We run our cutting edge service by running the executable and specifying a TCP port number, as shown below:

(base) boye@hp7940m1:~/go/bin$ ./DayTimeServer 1021
Fatal Error: listen tcp :1021: bind: permission denied
(base) boye@hp7940m1:~/go/bin$ 

No surprises at the failure above: by default, an unprivileged user cannot bind to port numbers below 1024. The service is started below on TCP port 1029 instead.
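
The same restriction can be observed from any language. As a minimal sketch (the helper try_bind is our own name, not part of DayTimeServer), the Python below attempts a bind and reports whether the kernel allowed it:

```python
import socket


def try_bind(port, host="127.0.0.1"):
    """Attempt to bind a TCP socket; return (succeeded, detail)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except PermissionError as exc:
            # EACCES: the caller is not allowed to bind to this port.
            return False, f"permission denied: {exc}"
        except OSError as exc:
            # Any other failure, for example the port is already in use.
            return False, f"bind failed: {exc}"
        return True, f"bound to port {sock.getsockname()[1]}"


if __name__ == "__main__":
    for port in (1021, 1029):
        print(port, try_bind(port))
```

Run as a regular user on a default configuration, we would expect the first attempt to be refused and the second to succeed; run as root, both should succeed.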

(base) boye@hp7940m1:~/go/bin$ ./DayTimeServer 1029

(base) boye@hp7940m1:~/go/bin$ nc localhost 1029
2021-08-17 21:29:47.570860654 +1200 NZST m=+50.203298522
(base) boye@hp7940m1:~/go/bin$  

We get the date and time using the nc utility to probe the server.
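
For readers who want to reproduce the exchange without the Go binary, here is a small Python stand-in for both sides. It is only a sketch under our own assumptions (the function names are hypothetical and the time format differs from the Go output above): the server sends the current time to each client and closes the connection, and the client reads until the connection closes, much as nc does.

```python
import socket
import threading
from datetime import datetime


def serve_daytime(listener, connections=1):
    """Send the current local time to each client, then close the connection."""
    for _ in range(connections):
        conn, _addr = listener.accept()
        with conn:
            stamp = datetime.now().astimezone().isoformat()
            conn.sendall(stamp.encode() + b"\n")


def fetch_daytime(host, port):
    """Connect, read until the server closes, and return the text."""
    with socket.create_connection((host, port), timeout=5) as sock:
        chunks = []
        while True:
            data = sock.recv(1024)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode().strip()


def demo():
    # Port 0 asks the kernel for any free unprivileged port.
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        worker = threading.Thread(target=serve_daytime, args=(listener,))
        worker.start()
        reply = fetch_daytime("127.0.0.1", port)
        worker.join()
    return reply


if __name__ == "__main__":
    print(demo())
```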

Unprivileged Port

We now run the server as a systemd managed service using the unit file configuration below.

(base) boye@hp7940m1:/etc/systemd/system$ cat daytimeServer.service 
[Unit]
Description=Cutting Edge DayTime Announcement Service

[Service]
Type=simple
ExecStart=/home/boye/go/bin/DayTimeServer 3000

[Install]
WantedBy=multi-user.target

We can see the process is running on TCP port 3000.

The service process, just like systemd itself, has all the bits in its effective capabilities set enabled.

Privileged Port

We will now attempt to run the same service on the privileged port 1021, the port that failed earlier when we ran the binary as a regular user.

(base) boye@hp7940m1:/etc/systemd/system$ cat daytimeServer.service 
[Unit]
Description=Cutting Edge DayTime Announcement Service

[Service]
Type=simple
ExecStart=/home/boye/go/bin/DayTimeServer 1021

[Install]
WantedBy=multi-user.target

We can see that it runs successfully, and examining the process with cap_display shows that it has all the effective capability bits set.

In summary, services managed by systemd run as root with all capabilities enabled by default.

systemd unit file with User option

We will now run the same binary as a service, but this time we will set the user to a non-privileged user. Below is the systemd service unit file.

(base) boye@hp7940m1:/etc/systemd/system$ cat daytimeServer_user.service 
[Unit]
Description=Cutting Edge DayTime Announcement Service

[Service]
Type=simple
ExecStart=/home/boye/go/bin/DayTimeServer 3001
User=boye

[Install]
WantedBy=multi-user.target

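We can see from the output above that the process has no effective capability bits set. With User=boye, systemd switches to the unprivileged user before starting the program, and a process that changes all of its user IDs from 0 to a non-zero value loses its permitted and effective capabilities.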

systemd unit file with AmbientCapabilities option

Now the boss has just informed us that our biggest competitor in the daytime service app business has raised the bar: they allow their service to be run on privileged ports. Using the AmbientCapabilities option in the service unit file, we can add this feature to our service.

(base) boye@hp7940m1:/etc/systemd/system$ cat daytimeServer_user_net_bind.service 
[Unit]
Description=Cutting Edge DayTime Announcement Service

[Service]
Type=simple
ExecStart=/home/boye/go/bin/DayTimeServer 105
User=boye
AmbientCapabilities=CAP_NET_BIND_SERVICE

[Install]
WantedBy=multi-user.target

In the unit file above, we are attempting to run the service on TCP port 105 and we are setting the CAP_NET_BIND_SERVICE bit in the ambient capabilities set. That capability empowers an unprivileged user to bind to network ports below 1024.
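
To connect the option back to the hexadecimal fields in /proc/PID/status, the sketch below turns capability names into the mask we would expect to see for them. The table lists only a few bit numbers from linux/capability.h, and mask_for is our own helper name.

```python
# Bit numbers for a handful of capabilities, as defined in
# linux/capability.h. Only the entries needed here are listed.
CAP_BITS = {
    "CAP_CHOWN": 0,
    "CAP_NET_BIND_SERVICE": 10,
    "CAP_NET_ADMIN": 12,
    "CAP_NET_RAW": 13,
}


def mask_for(names):
    """Return the 16-digit hex mask with the bit for each named capability set."""
    mask = 0
    for name in names:
        mask |= 1 << CAP_BITS[name.upper()]
    return f"{mask:016x}"


if __name__ == "__main__":
    # CAP_NET_BIND_SERVICE is bit 10, so the mask is 1 << 10 = 0x400.
    print(mask_for(["CAP_NET_BIND_SERVICE"]))
```

A service started with only CAP_NET_BIND_SERVICE in its ambient set would therefore be expected to report 0000000000000400 in its CapAmb field.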

We can see from the output above that even though the process is owned by a regular user, using the AmbientCapabilities option allows the process to bind to a privileged port number. The cap_display output for the process ID shows that the cap_net_bind_service bit in the effective capabilities set has been enabled.

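The ambient capabilities set is what carries these privileges from the parent process (systemd, PID 1) to the service process: capabilities in the ambient set survive the execve of a program that is not running as root and has no file capabilities of its own.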
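
As a rough illustration of that transfer, the sketch below models the rules that capabilities(7) gives for how the sets change across execve for a process that is not root. The function name and parameters are our own, every set is an integer bit mask, and the model deliberately ignores set-user-ID and set-group-ID programs.

```python
def caps_after_execve(inheritable, bounding, ambient,
                      file_permitted=0, file_inheritable=0,
                      file_effective=False):
    """Return the new (permitted, effective, ambient) sets after execve.

    Simplified model for a non-root process. The file_* arguments
    describe the file capabilities of the program being executed.
    """
    # A program that carries file capabilities clears the ambient set.
    privileged_file = bool(file_permitted or file_inheritable or file_effective)
    new_ambient = 0 if privileged_file else ambient
    new_permitted = ((inheritable & file_inheritable)
                     | (file_permitted & bounding)
                     | new_ambient)
    new_effective = new_permitted if file_effective else new_ambient
    return new_permitted, new_effective, new_ambient


if __name__ == "__main__":
    cap_net_bind_service = 1 << 10
    all_caps = (1 << 38) - 1

    # Ambient set holds CAP_NET_BIND_SERVICE; the binary has no file capabilities.
    with_ambient = caps_after_execve(cap_net_bind_service, all_caps,
                                     cap_net_bind_service)
    # Same binary, but nothing in the ambient set.
    without_ambient = caps_after_execve(cap_net_bind_service, all_caps, 0)

    print([f"{m:016x}" for m in with_ambient])
    print([f"{m:016x}" for m in without_ambient])
```

In this model the capability reaches the effective set only through the ambient set, which matches what we observed: with User=boye alone the service had no effective capabilities, and adding AmbientCapabilities gave it exactly one.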

systemd unit file with CapabilityBoundingSet option

The final option we will be exploring in this post is particularly useful for situations where we don't want to change the user for the process but we want the process to be limited to just the required privileges. The CapabilityBoundingSet option in the service unit file serves as a limiting set of what can be transferred from the parent to the child process.

We will look at the effects by using the service unit file below which effectively disables all the bits in the bounding capabilities set.

(base) boye@hp7940m1:/etc/systemd/system$ cat daytimeServer_limited.service 
[Unit]
Description=Cutting Edge DayTime Announcement Service

[Service]
Type=simple
ExecStart=/home/boye/go/bin/DayTimeServer 3002
CapabilityBoundingSet=

[Install]
WantedBy=multi-user.target

From the cap_display output we can see that, in spite of the process running as root, it has no effective capabilities set. Attempting to use a privileged port will therefore meet with failure, as shown below.

(base) boye@hp7940m1:/etc/systemd/system$ cat daytimeServer_limited_privilege_port_attempt.service 
[Unit]
Description=Cutting Edge DayTime Announcement Service

[Service]
Type=simple
ExecStart=/home/boye/go/bin/DayTimeServer 108
CapabilityBoundingSet=

[Install]
WantedBy=multi-user.target
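
A minimal sketch of why this attempt fails, assuming the simplified rule from capabilities(7) for a root process executing a program without file capabilities: the new permitted and effective sets are the union of the inheritable and bounding sets. The names below are our own, and the sets are modelled as integer bit masks.

```python
ALL_CAPS = (1 << 38) - 1          # the 38 capabilities shown for PID 1 earlier
CAP_NET_BIND_SERVICE = 1 << 10    # bit 10 in linux/capability.h


def root_caps_after_execve(inheritable, bounding):
    """Return (permitted, effective) for a root process after execve.

    Simplified model: the program has no file capabilities, so both
    sets become the union of the inheritable and bounding sets.
    """
    permitted = inheritable | bounding
    return permitted, permitted


def can_bind_privileged_port(effective):
    """True when CAP_NET_BIND_SERVICE is in the effective set."""
    return bool(effective & CAP_NET_BIND_SERVICE)


if __name__ == "__main__":
    cases = {
        "default bounding set": ALL_CAPS,
        "CapabilityBoundingSet= (empty)": 0,
        "bounding set with only CAP_NET_BIND_SERVICE": CAP_NET_BIND_SERVICE,
    }
    for label, bounding in cases.items():
        _permitted, effective = root_caps_after_execve(0, bounding)
        print(f"{label}: effective={effective:016x} "
              f"can_bind={can_bind_privileged_port(effective)}")
```

The last case suggests the more practical use of the option: listing only the capabilities a root service actually needs, so that everything else is out of reach.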

We have just seen three options that modify the default behaviour, under which systemd confers all privileges on the processes it starts and manages. The User option runs a service as an unprivileged user with no capabilities, the CapabilityBoundingSet option decreases the capabilities that can be passed to a child process running as root, and the AmbientCapabilities option is useful for giving specific privileges to a child process that belongs to an unprivileged user.

The next post will explore the docker use case.
