Capabilities

Published: 19-08-2018

Updated: 28-08-2018

By: Maxime de Roucy

tags: capability system

Capabilities are “tags” that can be applied to processes and files to grant a process additional privileges. For example, a process that has CAP_NET_BIND_SERVICE in its “effective” set can bind to ports below 1024 even if it is not running as root.

Sources:

theory

See the algorithm section to make sense of what follows.

thread/process

man capabilities:

Each thread has three capability sets containing zero or more of the above capabilities:

Permitted
This is a limiting superset for the effective capabilities that the thread may assume. It is also a limiting superset for the capabilities that may be added to the inheritable set by a thread that does not have the CAP_SETPCAP capability in its effective set. If a thread drops a capability from its permitted set, it can never reacquire that capability (unless it execve(2)s either a set-user-ID-root program, or a program whose associated file capabilities grant that capability).
Inheritable
This is a set of capabilities preserved across an execve(2). Inheritable capabilities remain inheritable when executing any program, and inheritable capabilities are added to the permitted set when executing a program that has the corresponding bits set in the file inheritable set. Because inheritable capabilities are not generally preserved across execve(2) when running as a non-root user, applications that wish to run helper programs with elevated capabilities should consider using ambient capabilities, described below.
Effective
This is the set of capabilities used by the kernel to perform permission checks for the thread.
Ambient (since Linux 4.3)
This is a set of capabilities that are preserved across an execve(2) of a program that is not privileged. The ambient capability set obeys the invariant that no capability can ever be ambient if it is not both permitted and inheritable. The ambient capability set can be directly modified using prctl(2). Ambient capabilities are automatically lowered if either of the corresponding permitted or inheritable capabilities is lowered. Executing a program that changes UID or GID due to the set-user-ID or set-group-ID bits or executing a program that has any file capabilities set will clear the ambient set. Ambient capabilities are added to the permitted set and assigned to the effective set when execve(2) is called.
Capability bounding set

The capability bounding set is a security mechanism that can be used to limit the capabilities that can be gained during an execve(2). From Linux 2.6.25, the capability bounding set is a per-thread attribute. The bounding set is inherited at fork(2) from the thread’s parent, and is preserved across an execve(2). The bounding set is used in the following ways:

  • During an execve(2), the capability bounding set is ANDed with the file permitted capability set, and the result of this operation is assigned to the thread’s permitted capability set. The capability bounding set thus places a limit on the permitted capabilities that may be granted by an executable file.
  • (Since Linux 2.6.25) The capability bounding set acts as a limiting superset for the capabilities that a thread can add to its inheritable set using capset(2). This means that if a capability is not in the bounding set, then a thread can’t add this capability to its inheritable set, even if it was in its permitted capabilities, and thereby cannot have this capability preserved in its permitted set when it execve(2)s a file that has the capability in its inheritable set.

Rewritten succinctly:

See the algorithm section to make sense of the above.
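The ambient invariant quoted above (a capability can only be ambient if it is both permitted and inheritable, and it is lowered automatically when either of those is lowered) can be illustrated with a small bitmask model. This is only a sketch: the `Thread` class and its method names are hypothetical, and it models three sets as plain integers rather than calling prctl(2) or capset(2).

```python
from dataclasses import dataclass

CAP_NET_BIND_SERVICE = 1 << 10  # bit 10, i.e. mask 0x400


@dataclass
class Thread:
    """Hypothetical model of three per-thread capability sets as bitmasks."""
    permitted: int = 0
    inheritable: int = 0
    ambient: int = 0

    def raise_ambient(self, cap: int) -> None:
        # Invariant: every bit of cap must be both permitted and inheritable.
        if (self.permitted & self.inheritable & cap) != cap:
            raise PermissionError("capability must be permitted and inheritable")
        self.ambient |= cap

    def drop_permitted(self, cap: int) -> None:
        # Lowering a permitted capability also lowers it in the ambient set.
        self.permitted &= ~cap
        self.ambient &= ~cap

    def drop_inheritable(self, cap: int) -> None:
        # Lowering an inheritable capability also lowers it in the ambient set.
        self.inheritable &= ~cap
        self.ambient &= ~cap


t = Thread(permitted=CAP_NET_BIND_SERVICE, inheritable=CAP_NET_BIND_SERVICE)
t.raise_ambient(CAP_NET_BIND_SERVICE)     # allowed: permitted and inheritable
t.drop_inheritable(CAP_NET_BIND_SERVICE)  # ambient is lowered as a side effect
print(hex(t.ambient))
```

The model deliberately leaves out the effective set and the bounding set; it is only meant to make the invariant visible.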

file

man capabilities:

Since kernel 2.6.24, the kernel supports associating capability sets with an executable file using setcap(8). The file capability sets are stored in an extended attribute (see setxattr(2) and xattr(7)) named security.capability. Writing to this extended attribute requires the CAP_SETFCAP capability. The file capability sets, in conjunction with the capability sets of the thread, determine the capabilities of a thread after an execve(2).

The three file capability sets are:

Permitted (formerly known as forced)
These capabilities are automatically permitted to the thread, regardless of the thread’s inheritable capabilities.
Inheritable (formerly known as allowed)
This set is ANDed with the thread’s inheritable set to determine which inheritable capabilities are enabled in the permitted set of the thread after the execve(2).
Effective
This is not a set, but rather just a single bit. If this bit is set, then during an execve(2) all of the new permitted capabilities for the thread are also raised in the effective set. If this bit is not set, then after an execve(2), none of the new permitted capabilities is in the new effective set. Enabling the file effective capability bit implies that any file permitted or inheritable capability that causes a thread to acquire the corresponding permitted capability during an execve(2) (see the transformation rules described below) will also acquire that capability in its effective set. Therefore, when assigning capabilities to a file (setcap(8), cap_set_file(3), cap_set_fd(3)), if we specify the effective flag as being enabled for any capability, then the effective flag must also be specified as enabled for all other capabilities for which the corresponding permitted or inheritable flags is enabled.

Rewritten succinctly:

See the algorithm section to make sense of the above.

Note: capabilities cannot be applied to a symlink. Capabilities applied to one hard link also apply to every other hard link pointing to the same inode (i.e. capabilities are attached to the inode).
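Since the file capability sets live in the security.capability extended attribute, they can also be read without getcap. The sketch below decodes that attribute in Python. The binary layout (a little-endian `magic_etc` word carrying the revision and the effective bit, followed by permitted/inheritable pairs of 32-bit words) is my reading of the kernel's `linux/capability.h` header and should be treated as an assumption; `parse_file_caps` and `read_file_caps` are hypothetical helper names.

```python
import os
import struct

VFS_CAP_REVISION_MASK = 0xFF000000
VFS_CAP_FLAGS_EFFECTIVE = 0x000001
# Assumed number of 32-bit words per set for each known revision.
_WORDS_PER_SET = {0x01000000: 1, 0x02000000: 2, 0x03000000: 2}


def parse_file_caps(raw: bytes) -> dict:
    """Decode the raw bytes of a security.capability extended attribute."""
    (magic_etc,) = struct.unpack_from("<I", raw, 0)
    words = _WORDS_PER_SET[magic_etc & VFS_CAP_REVISION_MASK]
    permitted = inheritable = 0
    for i in range(words):
        # Each entry is a (permitted, inheritable) pair of 32-bit words.
        perm, inh = struct.unpack_from("<II", raw, 4 + 8 * i)
        permitted |= perm << (32 * i)
        inheritable |= inh << (32 * i)
    return {
        "permitted": permitted,
        "inheritable": inheritable,
        # The file "effective" set is a single bit, not a mask.
        "effective": bool(magic_etc & VFS_CAP_FLAGS_EFFECTIVE),
    }


def read_file_caps(path: str) -> dict:
    # os.getxattr follows symlinks by default, so the attribute is read
    # from the inode the path finally resolves to.
    return parse_file_caps(os.getxattr(path, "security.capability"))
```

On a file without capabilities, `os.getxattr` raises `OSError`, which this sketch does not catch.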

algorithm

man capabilities:

During an execve(2), the kernel calculates the new capabilities of the process using the following algorithm:

> ```
> P'(ambient)     = (file is privileged) ? 0 : P(ambient)
> 
> P'(permitted)   = (P(inheritable) & F(inheritable)) |
>                   (F(permitted) & cap_bset) | P'(ambient)
> 
> P'(effective)   = F(effective) ? P'(permitted) : P'(ambient)
> 
> P'(inheritable) = P(inheritable)    [i.e., unchanged]
> ```
> 
> where:
> 
> * P: denotes the value of a thread capability set before the execve(2)
> * P': denotes the value of a thread capability set after the execve(2)
> * F: denotes a file capability set
> * cap\_bset: is the value of the capability bounding set (described below)
> * a privileged file: is one that has capabilities or has the set-user-ID or set-group-ID bit set.
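
The transformation rules quoted above can be tried out on plain bitmasks. The following is a minimal sketch, not the kernel's implementation: `ThreadCaps`, `FileCaps` and `execve_caps` are hypothetical names, "has capabilities" is approximated by "any file set is non-empty", and the special handling of root (UID 0) is ignored.

```python
from dataclasses import dataclass

CAP_NET_BIND_SERVICE = 1 << 10  # bit 10, i.e. mask 0x400


@dataclass(frozen=True)
class ThreadCaps:
    permitted: int = 0
    inheritable: int = 0
    effective: int = 0
    ambient: int = 0


@dataclass(frozen=True)
class FileCaps:
    permitted: int = 0
    inheritable: int = 0
    effective: bool = False  # a single bit, not a set
    setid: bool = False      # set-user-ID or set-group-ID bit present

    @property
    def privileged(self) -> bool:
        # Approximation: a file "has capabilities" if any of its sets is non-empty.
        return bool(self.permitted or self.inheritable or self.effective or self.setid)


def execve_caps(p: ThreadCaps, f: FileCaps, cap_bset: int) -> ThreadCaps:
    """Apply the P -> P' rules from the man page quote to bitmask values."""
    ambient = 0 if f.privileged else p.ambient
    permitted = (p.inheritable & f.inheritable) | (f.permitted & cap_bset) | ambient
    effective = permitted if f.effective else ambient
    return ThreadCaps(permitted, p.inheritable, effective, ambient)


# A thread with cap_net_bind_service inheritable executes a file set to "+ei".
shell = ThreadCaps(inheritable=CAP_NET_BIND_SERVICE)
binary = FileCaps(inheritable=CAP_NET_BIND_SERVICE, effective=True)
after = execve_caps(shell, binary, cap_bset=0x3FFFFFFFFF)
print(hex(after.permitted), hex(after.effective), hex(after.inheritable))
# prints: 0x400 0x400 0x400
```

With an unprivileged file (`FileCaps()`) and the same thread, the permitted and effective sets come out empty while the inheritable set stays at 0x400, which is why inheritable capabilities alone do nothing until a file opts in.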

# tools

[getcap](http://man7.org/linux/man-pages/man8/getcap.8.html) displays the capabilities of a file.

```console
max@mde-oxalide % getcap /usr/bin/true
max@mde-oxalide % getcap /usr/bin/named
/usr/bin/named = cap_net_bind_service+ei
```

setcap sets the capabilities of a file.

max@mde-oxalide % sudo setcap cap_net_bind_service+ei /usr/bin/named
max@mde-oxalide % getcap /usr/bin/named
/usr/bin/named = cap_net_bind_service+ei

getpcaps displays the effective, inheritable and permitted capabilities of a process (but not the ambient set or the capability bounding set).

max@mde-oxalide % getpcaps $$
Capabilities for `18205': = cap_net_bind_service+i
max@mde-oxalide % getpcaps `pgrep named`
Capabilities for `2395': = cap_net_bind_service+eip

To see all the capability sets of a process, read the file /proc/<pid>/status. capsh converts the masks into a readable format.

max@mde-oxalide % grep '^Cap' /proc/`pgrep named`/status
CapInh: 0000000000000400
CapPrm: 0000000000000400
CapEff: 0000000000000400
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
max@mde-oxalide % capsh --decode=0000000000000400
0x0000000000000400=cap_net_bind_service
max@mde-oxalide % capsh --decode=0000003fffffffff
0x0000003fffffffff=cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read
max@mde-oxalide % capsh --decode=0000000000000000
0x0000000000000000=

root@mde-oxalide # getpcaps 18675
Capabilities for `18675': = cap_net_bind_service+eip
root@mde-oxalide # grep '^Cap' /proc/18675/status
CapInh: 0000000000000400
CapPrm: 0000000000000400
CapEff: 0000000000000400
CapBnd: 0000003fffffffff
CapAmb: 0000000000000400

In the second example, we can see that getpcaps displays neither the ambient capabilities nor the bounding set.
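The masks in /proc/<pid>/status can also be decoded without capsh, since each bit position maps to one capability. Here is a rough sketch in Python; `decode_caps` and `parse_status` are hypothetical helpers, and the name table simply follows the bit order shown in the `capsh --decode=0000003fffffffff` output above (38 capabilities on this kernel), so newer capabilities are not covered.

```python
# Capability names indexed by bit number, in the order capsh printed them above.
CAP_NAMES = [
    "cap_chown", "cap_dac_override", "cap_dac_read_search", "cap_fowner",
    "cap_fsetid", "cap_kill", "cap_setgid", "cap_setuid", "cap_setpcap",
    "cap_linux_immutable", "cap_net_bind_service", "cap_net_broadcast",
    "cap_net_admin", "cap_net_raw", "cap_ipc_lock", "cap_ipc_owner",
    "cap_sys_module", "cap_sys_rawio", "cap_sys_chroot", "cap_sys_ptrace",
    "cap_sys_pacct", "cap_sys_admin", "cap_sys_boot", "cap_sys_nice",
    "cap_sys_resource", "cap_sys_time", "cap_sys_tty_config", "cap_mknod",
    "cap_lease", "cap_audit_write", "cap_audit_control", "cap_setfcap",
    "cap_mac_override", "cap_mac_admin", "cap_syslog", "cap_wake_alarm",
    "cap_block_suspend", "cap_audit_read",
]


def decode_caps(mask_hex: str) -> list[str]:
    """Return the names of the bits set in a hexadecimal capability mask."""
    mask = int(mask_hex, 16)
    # Bits beyond the table above are silently ignored in this sketch.
    return [name for bit, name in enumerate(CAP_NAMES) if mask & (1 << bit)]


def parse_status(text: str) -> dict[str, list[str]]:
    """Decode every Cap* line found in the content of /proc/<pid>/status."""
    sets = {}
    for line in text.splitlines():
        if line.startswith("Cap"):
            key, _, value = line.partition(":")
            sets[key] = decode_caps(value.strip())
    return sets


sample = "CapInh: 0000000000000400\nCapAmb: 0000000000000000\n"
print(parse_status(sample))
# prints: {'CapInh': ['cap_net_bind_service'], 'CapAmb': []}
```

To inspect a live process, the text passed to `parse_status` would be the content of its status file, for example read with `open("/proc/self/status").read()`.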

setpriv runs a program with modified capabilities.

root@mde-oxalide # systemd-run --uid=1000 -p AmbientCapabilities=CAP_NET_BIND_SERVICE sleep 100
Running as unit: run-r61c601b7eb6e4110b41f2e66a68dd8bc.service
root@mde-oxalide # grep '^Cap' /proc/`pgrep sleep`/status
CapInh: 0000000000000400
CapPrm: 0000000000000400
CapEff: 0000000000000400
CapBnd: 0000003fffffffff
CapAmb: 0000000000000400
root@mde-oxalide # systemctl stop run-r61c601b7eb6e4110b41f2e66a68dd8bc.service
…
root@mde-oxalide # systemd-run --uid=1000 -p AmbientCapabilities=CAP_NET_BIND_SERVICE setpriv --inh-caps +net_bind_service --ambient-caps -net_bind_service sleep 1000
Running as unit: run-r65ff234603e7466497b55cd0766b5b46.service
root@mde-oxalide # grep '^Cap' /proc/`pgrep sleep`/status
CapInh: 0000000000000400
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
root@mde-oxalide # systemctl stop run-r65ff234603e7466497b55cd0766b5b46.service

practice

PAM

We will make the processes started when user “max” logs in carry cap_net_bind_service in their inheritable set (the “non-systemd” method).

Before the changes:

max@mde-oxalide % ssh localhost
Last login: Mon Aug 27 17:53:49 2018 from ::1
max@mde-oxalide % getpcaps $$
Capabilities for `2871': =
max@mde-oxalide % logout
Connection to localhost closed.

Changes:

root@mde-oxalide # cd /etc/pam.d
root@mde-oxalide # cp system-login /tmp
root@mde-oxalide # vim system-login
root@mde-oxalide # diff -Naur /tmp/system-login system-login
--- /tmp/system-login   2018-08-27 17:42:18.126690766 +0200
+++ system-login        2018-08-27 17:42:52.090086225 +0200
@@ -3,6 +3,7 @@
 auth       required   pam_tally.so         onerr=succeed file=/var/log/faillog
 auth       required   pam_shells.so
 auth       requisite  pam_nologin.so
+auth       optional   pam_cap.so
 auth       include    system-auth

 account    required   pam_access.so

root@mde-oxalide # echo 'cap_net_bind_service max' > /etc/security/capability.conf

After the changes:

max@mde-oxalide % ssh localhost
Last login: Mon Aug 27 17:54:26 2018 from ::1
max@mde-oxalide % getpcaps $$
Capabilities for `2887': = cap_net_bind_service+i
max@mde-oxalide % logout
Connection to localhost closed.

Caution: this does not work if the session is managed by systemd. Here everything works because the session is handled entirely by ssh (the ssh login does not use pam_systemd.so).

We will now allow named (which listens on port 53) to run as “max”. We enable the “effective” bit and add cap_net_bind_service to the inheritable set of the file /usr/bin/named.

max@mde-oxalide % sudo setcap cap_net_bind_service+ei /usr/bin/named

Test:

max@mde-oxalide % echo '' | sudo tee /etc/security/capability.conf
max@mde-oxalide % ssh localhost
Last login: Mon Aug 27 17:53:49 2018 from ::1
max@mde-oxalide % getpcaps $$
Capabilities for `3165': =
max@mde-oxalide % named -g -c ~/.config/named.conf 2>&1 | grep 'permission denied'
27-Aug-2018 18:03:47.010 could not listen on UDP socket: permission denied
27-Aug-2018 18:03:47.019 could not listen on UDP socket: permission denied
^C
max@mde-oxalide %  
…
max@mde-oxalide % echo 'cap_net_bind_service max' | sudo tee /etc/security/capability.conf
max@mde-oxalide % ssh localhost
Last login: Mon Aug 27 18:02:08 2018 from ::1
max@mde-oxalide % getpcaps $$
Capabilities for `3233': = cap_net_bind_service+i
max@mde-oxalide % named -g -c ~/.config/named.conf 2>&1 | grep 'permission denied'
^C
max@mde-oxalide %

systemd

We will make the processes started when user “max” logs in carry cap_net_bind_service in their inheritable set. Because the gdm session loads the pam_systemd.so module when “max” logs in, many processes have the user@1000.service unit as their parent (1000 is the UID of my user “max”).

In a systemd unit you can set the ambient capabilities (AmbientCapabilities) and the capability bounding set (CapabilityBoundingSet). Setting the ambient capabilities also sets the permitted, inheritable and effective capabilities. You cannot set only the inheritable capabilities: it is all or nothing.

root@mde-oxalide # systemctl cat test.service
# /etc/systemd/system/test.service
[Unit]

[Service]
User=1000
ExecStart=/usr/bin/sleep 3600
AmbientCapabilities=CAP_NET_BIND_SERVICE

root@mde-oxalide # grep '^Cap' /proc/`pgrep sleep`/status
CapInh: 0000000000000400
CapPrm: 0000000000000400
CapEff: 0000000000000400
CapBnd: 0000003fffffffff
CapAmb: 0000000000000400

So I set the ambient capabilities (and therefore the inheritable ones), then removed the ambient capabilities with setpriv, taking care to keep only the inheritable capabilities.

root@mde-oxalide # systemctl cat test.service
# /etc/systemd/system/test.service
[Unit]

[Service]
User=1000
ExecStart=/usr/bin/setpriv --inh-caps +net_bind_service --ambient-caps -net_bind_service /usr/bin/sleep 3600
AmbientCapabilities=CAP_NET_BIND_SERVICE

root@mde-oxalide # grep '^Cap' /proc/`pgrep sleep`/status
CapInh: 0000000000000400
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000

Applied to user@1000.service, this gives every systemd service of user “max” the net_bind_service capability in its inheritable set.

max@mde-oxalide % systemctl cat user@1000.service
# /usr/lib/systemd/system/user@.service
…
# /etc/systemd/system/user@1000.service.d/ansible.conf
[Service]
ExecStart=
ExecStart=-/usr/bin/setpriv --inh-caps +net_bind_service --ambient-caps -net_bind_service /usr/lib/systemd/systemd --user
AmbientCapabilities=CAP_NET_BIND_SERVICE

max@mde-oxalide % systemctl show -p ExecStart,AmbientCapabilities user@1000.service
ExecStart={ path=/usr/bin/setpriv ; argv[]=/usr/bin/setpriv --inh-caps +net_bind_service --ambient-caps -net_bind_service /usr/lib/systemd/systemd --user ; ignore_errors=yes ; start_time=[n/a] ; stop_time=[n/a] ; pid=0 ; code=(null) ; status=0/0 }
AmbientCapabilities=cap_net_bind_service

max@mde-oxalide % grep '^Cap' /proc/`systemctl --user show --property=MainPID --value gnome-terminal-server.service`/status
CapInh: 0000000000000400
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000