- Information about all processes, even about short-lived ones, should be logged.
- We should have information about the full path to the executable file for all running processes.
- Within reason, we should not need to modify or recompile our code for different kernel versions.
- We need information about containers: in Kubernetes and Docker environments we must be able to tell which pod/container a process belongs to.
A note on the cgroup ID. Strictly speaking, the kernel knows nothing about "containers" or "pods". From its point of view there are only cgroups and namespaces; notions such as "container", "pod" and "image" exist at the level of higher-level APIs, such as the Docker API. What the kernel can give us is the cgroup ID of a process, and that ID can then be mapped to the container ID that Docker reports.
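As a quick illustration of this mapping (a sketch; the exact path format depends on the container runtime and the cgroup driver), the cgroup membership of a process can be inspected from user space:
# show the cgroup membership of a process (here: the current shell)
cat /proc/$$/cgroup
# under Docker with the cgroupfs driver, the output typically contains
# a path like /docker/<container-id>; that ID can be matched against
# the IDs reported by the Docker API:
docker ps --no-trunc --format '{{.ID}} {{.Names}}'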
Let's talk about the common Linux APIs that can help with this task. To keep the story simple, we will focus on processes created with the execve system call. A more complete solution would, in addition, have to monitor processes created with the fork/clone system calls and their variants, as well as the results of execveat calls.
Simple solutions implemented in user mode
- Reading /proc. Because of the short-lived process problem, this method is not particularly suitable for us (a small demonstration of the problem follows this list).
- The netlink process connector. Through netlink, the kernel notifies us about process events, reporting the PID of the process concerned. The details, such as the path to the executable file, then have to be read from /proc, which brings back the short-lived process problem: the process may already be gone by the time we look.
- The Linux audit API. This is a kernel subsystem designed for security auditing, and it can report process launches. Its first drawback is that only one user-space program can consume the audit API at a time, and that slot is usually taken by tools such as auditd or osquery. Multiplexers such as go-audit can, in theory, mitigate this problem. But in the case of enterprise-class solutions, you cannot know in advance whether customers are using such tools, and if they do, which ones. Nor is it possible to know ahead of time which security controls that work directly with the audit API are being used by clients. The second drawback is that the audit API knows nothing about containers, even though this issue has been discussed for many years.
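Here is a minimal sketch of why polling /proc fails for short-lived processes: even a tight sampling loop usually never sees a process such as /bin/true, which starts and exits in a fraction of a millisecond.
# sample the PID list from /proc in a tight loop, in the background
( for i in $(seq 1 500); do ls /proc; done ) > snapshots.txt &
sampler=$!
# meanwhile, start a short-lived process and remember its PID
/bin/true & short=$!
wait "$sampler"
# count how many snapshots contain the short-lived PID (usually zero)
grep -cx "$short" snapshots.txt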
Simple kernel-mode debugging tools
These mechanisms are based on installing kernel "probes" of various types and handling their events directly, without eBPF.
▍Tracepoints
Tracepoints are sensors statically compiled into specific locations of the kernel. Each such sensor can be enabled independently of the others, and when enabled it emits a notification whenever execution reaches the place in the kernel code where it is embedded. The kernel contains several tracepoints that suit us, fired at various points of the execve system call: sched_process_exec, open_exec, sys_enter_execve, sys_exit_execve. To get this list, I ran the command cat /sys/kernel/tracing/available_events | grep exec and filtered the result using information gleaned from the kernel source. These tracepoints suit us better than the mechanisms described above, since they let us observe short-lived processes. But none of them reports the full path to the process's executable file when exec is given a relative path. In other words, if the user runs a command like cd /bin && ./ls, we get the path as ./ls, not as /bin/ls. Here's a simple example:
# enable the sched_process_exec tracepoint
sudo -s
cd /sys/kernel/debug/tracing
echo 1 > events/sched/sched_process_exec/enable
# run ls via a relative path
cd /bin && ./ls
# the sched_process_exec event has been recorded;
# go back and check what path was logged
cd -
cat trace | grep ls
# disable the tracepoint
echo 0 > events/sched/sched_process_exec/enable
▍Kprobe/kretprobe probes
Kprobes allow you to extract debugging information from almost any place in the kernel. They are like special breakpoints in kernel code that report information without stopping execution. Unlike tracepoints, a kprobe can be attached to a wide variety of functions, and its code will be triggered during the execution of the execve system call. But I did not find in the call graph of execve any function whose parameters include both the PID of the process and the full path to its executable file. As a result, we face the same "relative path problem" as with tracepoints. Here you could, relying on the peculiarities of a particular kernel, "tweak" something: after all, kprobes can read data from the kernel call stack. But such a solution will not work reliably across kernel versions, so I do not consider it.
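For reference, a kprobe can be attached by hand through the same tracing filesystem, without writing any kernel code. This is a sketch: the probed symbol name is an assumption, since on recent x86-64 kernels the syscall entry point is usually called __x64_sys_execve, and the exact name varies between kernels, which is precisely the fragility discussed above.
sudo -s
cd /sys/kernel/debug/tracing
# register a kprobe on the execve entry point (symbol name varies by kernel)
echo 'p:my_execve __x64_sys_execve' >> kprobe_events
echo 1 > events/kprobes/my_execve/enable
# trigger it and inspect the collected events
/bin/ls > /dev/null
cat trace | grep my_execve
# clean up
echo 0 > events/kprobes/my_execve/enable
echo '-:my_execve' >> kprobe_events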
▍Using eBPF programs with tracepoints and kprobe/kretprobe probes
The idea here is that tracepoints or probes fire as before, but instead of ordinary event handlers, the code of eBPF programs is executed.
This approach opens up new possibilities: we can now run arbitrary code in the kernel whenever the execve system call is made. In theory, this should let us extract any information we need from the kernel and send it to user space. There are two ways to get this kind of data, but neither of them meets the requirements stated above.
- Reading kernel data structures, primarily task_struct and linux_binprm. This requires knowing the layout of those structures, which changes between kernel versions, and that contradicts our requirement of not depending on a particular kernel. For example, in a handler attached to sched_process_exec, an eBPF program could try to reconstruct the full path by walking the chain of dentry structures starting from bprm->file->f_path.dentry. But an eBPF program cannot contain unbounded loops, which makes such a traversal awkward, and in any case the code ends up tied to the data structures of a specific kernel.
- Using eBPF helper functions. Their advantage is that helpers are a stable API. Their drawback is that no existing helper returns the full path to the executable file of a process. (Helpers do solve some related problems; for example, an eBPF program can obtain the cgroup ID that we need for container identification.) A small bpftrace illustration of this approach follows.
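As promised, a small illustration (assuming bpftrace is installed). The one-liner below attaches an eBPF program to the execve entry tracepoint and prints the PID, the cgroup ID, and the path argument. Note that the path is printed exactly as it was passed to execve, so the relative-path problem is plainly visible.
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_execve {
  printf("pid=%d cgroup=%d file=%s\n", pid, cgroup, str(args->filename));
}'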
"Hackish" solutions
- Using LD_PRELOAD to intercept the exec family of functions in libc. This does not work for statically linked executables, nor for programs that invoke the system call directly, bypassing libc. Moreover, to resolve relative paths we would have to intercept not only execve, but also fork/clone and chdir, in order to track each process's current directory. And the interception happens before execve itself runs, so we would also report exec attempts that end up failing. In this respect the approach is worse than the eBPF-based ones, where the eBPF program can observe the completed call.
- Attaching to processes with ptrace. This approach is extremely invasive and slow. A variation is to combine ptrace with a seccomp filter that returns SECCOMP_RET_TRACE: the seccomp filter matches execve calls, and on each execve the kernel stops the process and notifies the tracer, which can then inspect the call (see the strace illustration after this list). But this, too, interferes far too much with the processes being monitored.
- Using AppArmor. You can write an AppArmor profile that forbids processes to run executable files. If you use this profile in complain (learning) mode, AppArmor will not actually prohibit the execution of processes, but will only issue notifications about violations of the rules specified in the profile. If you attach such a profile to every running process, you get a working, but very unattractive and too "hackish" solution. It is probably not worth using this approach.
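To get a feel for the ptrace-based approach without writing code, one can use strace, which is itself built on ptrace, to log the exec events of a process tree. This is an illustration, not a production monitoring setup; note that the logged call still shows the relative path.
# follow child processes (-f) and log only execve calls
strace -f -e trace=execve -o exec.log sh -c 'cd /bin && ./ls > /dev/null'
# the log records execve("./ls", ...), not /bin/ls
grep execve exec.log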
Other solutions
I will say right away that none of these solutions meets our requirements; nevertheless, here they are:
- The ps utility. This tool simply reads /proc and, as a result, suffers from the same problems as reading /proc directly.
- execsnoop from the BCC toolkit, based on eBPF. It attaches kprobe/kretprobe probes to execve, so, unlike the /proc-based tools, it does catch short-lived processes, but it suffers from the relative-path problem described above. In addition, execsnoop is built on BCC, which compiles eBPF programs at runtime and therefore requires the kernel headers to be installed (a usage example follows this list).
- Another variant of execsnoop, also based on eBPF. Its drawback is the same reliance on kprobe probes, so the relative-path problem remains.
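For completeness, here is what using execsnoop looks like (an assumption about packaging: on Debian/Ubuntu the BCC version is typically installed as execsnoop-bpfcc; in other distributions it may live at /usr/share/bcc/tools/execsnoop):
# terminal 1: start the tracer (requires root)
sudo execsnoop-bpfcc
# terminal 2: launch a process via a relative path
cd /bin && ./ls > /dev/null
# execsnoop catches even short-lived processes, but reports
# the path as it was passed to execve, i.e. ./ls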
In the future it may become possible to use the eBPF helper function get_fd_path, which is not yet available. After it is added to the kernel, it will be useful for our problem: it would let us obtain the full path to the executable file of a process without reading information from kernel data structures.
Outcome
None of the APIs we've reviewed is perfect. Below are some guidelines on which approaches to use for getting information about processes, and when to use them:
- The simplest approach is to use auditd or go-audit. It will work if you are comfortable with the limitations of the audit API: only one user-space program can consume it at a time, so you have to be sure that no other tools, including security controls that work directly with the audit API, already occupy it. Remember also that the audit API knows nothing about containers.
- If you just need a quick way to watch processes being launched, for example while debugging on your own machine, use execsnoop. Its drawback is the relative-path problem.
- If you need to satisfy all the requirements listed at the beginning of this article, including container identification, you will have to build a solution of your own: eBPF programs attached to tracepoints, eBPF maps, perf buffers... A story about all this is worthy of a separate article. The most important thing to remember when choosing this method of monitoring processes is the following. If you use eBPF programs, check the possibility of their static compilation, which will allow you not to depend on the kernel headers; it is precisely this dependence that we are trying to avoid with this method. Using this method also means that you cannot work with kernel data structures and that you cannot use frameworks like BCC that compile eBPF programs at runtime.
- If you are not interested in short-lived processes and the previous recommendations do not suit you, use netlink capabilities together with /proc (a small example follows).
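As an example of the netlink route (an assumption: the forkstat utility, which is built on the netlink process connector, is available in your distribution):
# report exec and exit events delivered via the netlink process connector
sudo forkstat -e exec,exit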
How do you organize monitoring of running processes in Linux?