A Practical Tour of eBPF in the Linux Kernel: Observability, Security and Networking
Luca Cavallin
If you build systems or operate clusters and keep bumping into the limits of agents, iptables, or kernel modules, eBPF is the safer, faster, and more dynamic foundation you've been looking for. It lets you run small, verified programs inside the Linux kernel so you can observe and influence system behavior at runtime. In practice, that unlocks a new wave of observability, security, and networking tools without the operational risk of custom kernel modules.
What eBPF is - and why it matters now
eBPF began life as an evolution of the classic Berkeley Packet Filter, a tiny in-kernel bytecode engine that tools like tcpdump used for fast packet filtering. Modern eBPF generalizes that idea: you write a small function in restricted C or Rust, compile it to bytecode, and ask the kernel to load it. Before the program is allowed to run, the verifier symbolically executes it to prove safety properties. Your code must never dereference invalid pointers, must bound its loops, must return the right kind of value for the hook it targets, and must always run to completion. If it passes, the kernel can JIT-compile the bytecode into native instructions, which is a big part of why eBPF programs are so fast.
Crucially, eBPF does not replace kernel modules - it gives you a controlled path to extend kernel behavior without writing or shipping a module. You load a program at runtime, attach it to an event - perhaps a network receive hook, a kernel function call, or a security decision point - and later detach it when you're done. You can even pin programs and data structures in a dedicated virtual filesystem (bpffs at /sys/fs/bpf) so they survive beyond the lifetime of the loader process. Compared to modules, there's markedly less friction and far less operational risk. Combine that with the performance benefits of running in-kernel and avoiding context switches, and you get a technology that fits today's cloud-native reality: high-volume telemetry, low-latency policy decisions, and a need to move quickly without compromising safety.
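As a quick sketch of what pinning looks like in practice (the program ID and pin path below are illustrative), you mount bpffs if it isn't already mounted and ask bpftool to pin a loaded program:
sudo mount -t bpf bpf /sys/fs/bpf
sudo bpftool prog show
sudo bpftool prog pin id 42 /sys/fs/bpf/my_prog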
Your first taste of eBPF
The basic rhythm of working with eBPF is simple: load, attach, observe, detach. The quickest way to feel that loop is with a tiny demo attached to something you can immediately see, like process execution.
Here is a zero-setup bpftrace example that counts execve() calls per process name:
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_execve { @[comm] = count(); }'
If you want a slightly deeper look using BCC (the BPF Compiler Collection), this Python snippet ties a small in-kernel program to a kprobe and keeps a per-PID counter in a map:
from bcc import BPF

prog = r"""
BPF_HASH(exec_count, u32, u64);

int on_execve(void *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 *val = exec_count.lookup(&pid);
    u64 one = 1;
    if (val) { (*val)++; } else { exec_count.update(&pid, &one); }
    return 0;
}
"""

b = BPF(text=prog)
# get_syscall_fnname() resolves the architecture-specific symbol for the
# execve syscall entry (e.g. __x64_sys_execve), which is reliably kprobe-able.
b.attach_kprobe(event=b.get_syscall_fnname("execve"), fn_name="on_execve")
print("Counting execve() per PID... Ctrl-C to stop.")
try:
    b.trace_print()
except KeyboardInterrupt:
    pass
for k, v in b.get_table("exec_count").items():
    print(f"PID {k.value}: {v.value}")
These examples introduce BPF maps, the in-kernel key/value stores that hold shared state across events. Maps are what make eBPF practical: they store counters and caches, carry configuration and policy, and provide ring buffers that stream events to user space.
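For comparison with the BPF_HASH macro above, here is a sketch of the same per-PID counter map declared in libbpf-style C, using the BTF-based map syntax (the name and size are illustrative):
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, __u32);    /* PID */
    __type(value, __u64);  /* execve() count */
} exec_count SEC(".maps");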
From source code to running inside the kernel
An eBPF program feels like a tiny, specialized function. You choose a hook and write a function with a signature that matches its context - xdp_md for XDP, the event record for a tracepoint or kprobe/fentry, or the socket buffer for networking hooks. You compile with clang -target bpf, producing an object that contains instructions, map definitions, optional debug info, and often BTF (BPF Type Format) metadata. It's common to inspect that object with bpftool or llvm-objdump to check what you built and to see roughly what the verifier will reason about.
Loading the program triggers verification. If the checks pass, the kernel may retain the original bytecode, a translated representation produced during verification, and a JIT-compiled native image tailored to your CPU. You then attach the program to an event: at XDP on a network interface to act before the stack; at TC inside the stack for classification and shaping; at fentry/kprobe for tracing function entry; at tracepoints for stable event sites; or at LSM hooks for security decisions. For attachment lifetime, modern practice prefers BPF links, which give each attachment an explicit handle; you detach by closing or unlinking that handle, and you pin the link in bpffs if the attachment needs to outlive your loader. If you need programs and maps to outlive your process, pin them in bpffs under predictable paths as well. And when you write XDP code, always remember the essential bounds checks: validate headers using the xdp_md->data and xdp_md->data_end pointers before you touch anything in the packet.
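To make those bounds checks concrete, here is a minimal XDP sketch (not production code) that drops IPv4 packets and passes everything else; the verifier will reject it without the header-length check:
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("xdp")
int xdp_drop_ipv4(struct xdp_md *ctx)
{
    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    struct ethhdr *eth = data;

    /* Prove to the verifier that the Ethernet header lies within the packet. */
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;

    if (eth->h_proto == bpf_htons(ETH_P_IP))
        return XDP_DROP;

    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";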
Maps, event streaming, and discoverability
Maps come in many flavors - hash, array, per-CPU variants, LRU caches, LPM tries, queues and stacks, and the dedicated ring buffer map. Older examples often rely on perf buffers via perf_event_open() to stream events into user space. Newer code tends to use the ring buffer, which simplifies coordination by using a single file descriptor and a straightforward producer/consumer model. Operationally, bpftool is invaluable for listing loaded programs, inspecting immutable tags, dumping translated and JITed instructions, creating and examining maps, and reading BTF data that describes types and function prototypes. Pinning programs and maps into bpffs at /sys/fs/bpf makes them easy to find across processes and restarts.
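Here is a kernel-side sketch of the ring buffer pattern (the event layout and buffer size are illustrative, and vmlinux.h is the BTF-derived header covered in the CO-RE section below): the program reserves space, fills in an event, and submits it for user space to consume:
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

struct event {
    __u32 pid;
    char comm[16];
};

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 256 * 1024);   /* ring buffer size in bytes */
} events SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_execve")
int stream_execve(void *ctx)
{
    struct event *e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);
    if (!e)
        return 0;                      /* buffer full: drop this event */
    e->pid = bpf_get_current_pid_tgid() >> 32;
    bpf_get_current_comm(&e->comm, sizeof(e->comm));
    bpf_ringbuf_submit(e, 0);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";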
Program types and where they hook in
Each eBPF program has a type, and that type dictates the context you receive, the helpers you may call, and the return codes that are legal. In tracing, you have kprobes and kretprobes that follow kernel functions, tracepoints that the kernel exposes as stable events, and BTF-enabled fentry/fexit hooks that attach to function entry and exit with lower overhead. You can instrument user space with uprobes and uretprobes, and you can make fine-grained decisions in the security path by attaching BPF LSM programs to kernel security hooks. On the networking side, XDP sits in the driver's receive path where you can parse headers and decide to pass, drop, redirect, or transmit packets, while TC lets you classify and shape traffic within the stack. Socket and cgroup hooks bring policy close to processes, and the flow dissector and other subsystems provide more specialized entry points. The headline benefit for networking is predictable latency: replacing sprawling iptables chains with compiled datapaths that implement service load-balancing and network policy enforcement.
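In libbpf-style C, the SEC() annotation is what tells the loader which hook a function targets. As a small compilable sketch (the hook names do_unlinkat and file_open are just examples, vmlinux.h is the BTF-derived header described in the next section, and the LSM program additionally needs a kernel with the BPF LSM enabled):
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("fentry/do_unlinkat")            /* BTF-enabled function-entry tracing */
int BPF_PROG(trace_unlink, int dfd, struct filename *name)
{
    bpf_printk("unlink called");
    return 0;
}

SEC("lsm/file_open")                 /* security decision point */
int BPF_PROG(observe_open, struct file *file)
{
    return 0;                        /* 0 allows; -EPERM would deny */
}

char LICENSE[] SEC("license") = "GPL";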
CO-RE, BTF, and libbpf: portability without per-host builds
Compiling on the target host is great for exploration, but it's painful for production. CO-RE - Compile Once, Run Everywhere - solves this by using BTF type information to adapt a precompiled object to different kernels at load time. Most modern distribution kernels expose canonical BTF at /sys/kernel/btf/vmlinux. Your eBPF object contains relocation records where it references kernel types or fields. When you load it, libbpf resolves those relocations against the host's BTF so that structure layout differences don't break your program. A common workflow is to generate a vmlinux.h header from BTF and then compile with -O2 -g -target bpf so the loader has all the information it needs. For example:
bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h
clang -O2 -g -target bpf -c prog.c -o prog.bpf.o
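Inside the program itself, CO-RE-aware accessors such as BPF_CORE_READ are what record those relocations; a small sketch (the kprobe target and the fields read are just for illustration):
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

SEC("kprobe/wake_up_new_task")
int BPF_KPROBE(on_new_task, struct task_struct *task)
{
    /* Field offsets are relocated against the host kernel's BTF at load time. */
    int parent_tgid = BPF_CORE_READ(task, real_parent, tgid);
    bpf_printk("new task, parent tgid %d", parent_tgid);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";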
The toolchain also generates "skeletons" via bpftool gen skeleton - tiny C headers that wrap your maps, programs, and attachments and make the user-space loader almost declarative: open, load, attach, read. The practical result is powerful: you can ship a single artifact that provides reliable observability across heterogeneous Linux kernels with a minimal operational footprint.
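A loader built on a skeleton is only a few lines; here is a sketch assuming the skeleton was generated with bpftool gen skeleton prog.bpf.o > prog.skel.h (the prog_bpf names are illustrative and follow the object's file name):
#include <stdio.h>
#include <unistd.h>
#include "prog.skel.h"

int main(void)
{
    struct prog_bpf *skel = prog_bpf__open_and_load();  /* open, load, verify */
    if (!skel) {
        fprintf(stderr, "failed to open and load BPF object\n");
        return 1;
    }
    if (prog_bpf__attach(skel)) {                       /* attach via BPF links */
        fprintf(stderr, "failed to attach programs\n");
        prog_bpf__destroy(skel);
        return 1;
    }
    printf("attached; press Ctrl-C to exit\n");
    pause();                                            /* wait while programs run */
    prog_bpf__destroy(skel);                            /* detach and free */
    return 0;
}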
The verifier, explained plainly
Think of the verifier as a meticulous safety reviewer that makes custom in-kernel code viable in production. It symbolically executes your program, tracks pointer provenance, and proves that all memory accesses are within bounds. It insists that you check pointers before dereferencing, that your loops are provably bounded or unrolled, that you call helper functions with arguments of the proper types, and that you return the correct value for the hook you're attached to - XDP_PASS, XDP_DROP, XDP_TX, XDP_REDIRECT, or XDP_ABORTED in the case of XDP. It also checks the license string in your object because some helpers are reserved for GPL-licensed programs. When verification fails, ask for a verbose log; it reads like a conversation with a very picky reviewer and quickly teaches the idioms the kernel accepts.
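One way to surface that log when you load through libbpf is to install a print callback that forwards everything, including the verifier's complaints, to stderr (a sketch):
#include <stdarg.h>
#include <stdio.h>
#include <bpf/libbpf.h>

static int print_all(enum libbpf_print_level level, const char *fmt, va_list args)
{
    return vfprintf(stderr, fmt, args);
}

int main(void)
{
    libbpf_set_print(print_all);
    /* ... open and load your BPF object here; on a verifier rejection,
       the verifier's log shows up on stderr ... */
    return 0;
}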
Real-world payoffs for observability, security, and networking
For observability, eBPF can watch system calls, file opens, credential changes, and kernel function invocations; enrich events with process lineage, cgroup and container identity, and namespaces; and stream this data to user space in near real time. You get rich insight without patching applications or altering their configuration. For security, BPF LSM puts allow/deny decisions directly in the kernel's security hooks with awareness of container and workload identity, not just uid/gid. Tooling such as Tetragon blends deep tracing with policy so suspicious behavior is detected and stopped early. For networking, XDP handles early fast-path decisions - packet drops, redirects, basic forwarding - while TC applies classification and shaping in the stack. Together, they enable eBPF-based CNIs that implement service load-balancing, network policy, and even coordinate node-to-node encryption with lower, more predictable latency than long iptables chains.
When not to use eBPF
eBPF is powerful, but it isn't the right choice for every problem. If a user-space hook or library call satisfies the need, keep it simple. If the event rate is low and latency is not a concern, the complexity of an in-kernel program may not pay off. If you require long-running or blocking work that doesn't fit within the constraints of eBPF (outside specific sleepable program types), push that logic to user space. And if you're on legacy kernels without headers or BTF, you may need to upgrade, install the BTF package, or ship minimal BTF data before you can benefit from CO-RE.
Common pitfalls and how to debug them
Most stumbling blocks fall into familiar categories. Verifier rejections usually mean you need more explicit bounds checks, fewer ambiguous control-flow paths, or smaller, bounded loops; avoid pointer arithmetic on untrusted data without checks. If your system is missing BTF, install the kernel's BTF package or ship a minimal BTF and regenerate vmlinux.h. Hitting map resource limits often points to RLIMIT_MEMLOCK or overly large maps; right-size them and consider LRU variants for caches. For attachment lifetime, prefer BPF links, and pin them in bpffs when an attachment must outlive your loader. And if performance looks suspiciously low, confirm that the JIT is enabled with bpftool feature and turn it on before benchmarking.
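For example, a quick way to check and enable the JIT (output varies by kernel and bpftool version):
sudo bpftool feature probe kernel | grep -i jit
sysctl net.core.bpf_jit_enable
sudo sysctl -w net.core.bpf_jit_enable=1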
The bpf() syscall and the surrounding plumbing
Although libraries hide the details, everything flows through the bpf() syscall. You use it - typically via libbpf or another wrapper - to create maps, load programs, and establish attachments. Older tracing code may also call perf_event_open() to connect perf events to your program; modern approaches lean on BPF links for attachment lifetime and on ring buffers for streaming data. Reading results is straightforward: consume events from the buffer or perform a map lookup. And because programs and maps can be pinned into bpffs at /sys/fs/bpf, they are easy to discover and reuse across processes and restarts. As your systems grow, BPF-to-BPF calls help you refactor logic across functions, and globals in .rodata or .bss let you tweak behavior at load time without recompiling.
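As a sketch of that last pattern (the variable name is illustrative), a load-time constant lives in .rodata on the BPF side and is filled in by the loader before the object is loaded, so the verifier sees the final value:
/* in the .bpf.c file */
const volatile __u64 sample_every = 1;   /* ends up in .rodata */

/* in the user-space loader, with a skeleton:
 *   struct prog_bpf *skel = prog_bpf__open();
 *   skel->rodata->sample_every = 100;
 *   prog_bpf__load(skel);
 */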
Summary
eBPF is a safe, high-performance mechanism for running custom logic inside the Linux kernel. You compile a small program; the verifier proves it safe; the kernel JITs it; and you attach it to events to change system behavior at runtime. Maps carry shared state, ring buffers stream events, BTF and CO-RE provide portability, and libbpf plus skeletons make loaders small and robust. With hooks across tracing, networking, and security - XDP, TC, kprobes/fentry, and LSM - you can instrument applications without touching their code or configuration and enforce precise policy with rich context. From Kubernetes data planes to syscall tracking and preventative security, eBPF has moved from niche trick to mainstream platform. The next step is hands-on: pick a tiny demo that resonates, run it, and iterate - load, attach, observe, detach - until the patterns become second nature.