Reversing eBPF using IDA

eBPF was introduced in the Linux kernel to add powerful monitoring capabilities. It allows you to quickly hook any syscall, or any kernel or userland function, to produce statistics, logs, and so on. eBPF programs are compiled to a low-level bytecode executed by a virtual machine inside the Linux kernel; with the CO-RE (Compile Once - Run Everywhere) approach, the same compiled object can be loaded on different kernel versions. eBPF is a RISC register machine with eleven 64-bit registers, a program counter, and a fixed-size 512-byte stack. Ten registers (R0 to R9) are general-purpose and read-write, and one (R10) is a read-only frame pointer. The program counter is implicit: a program can only jump to an offset relative to it.
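To make the register machine more concrete, here is a minimal Python sketch that decodes a single eBPF instruction. It assumes the classic 8-byte little-endian encoding (8-bit opcode, two 4-bit register fields, signed 16-bit offset, signed 32-bit immediate); the `Insn` and `decode` names are ours and only serve the illustration.

```python
import struct
from typing import NamedTuple


class Insn(NamedTuple):
    opcode: int  # 8-bit operation code
    dst: int     # destination register index (low nibble of byte 1)
    src: int     # source register index (high nibble of byte 1)
    off: int     # signed 16-bit offset, used by jumps and memory accesses
    imm: int     # signed 32-bit immediate value


def decode(raw: bytes) -> Insn:
    """Decode one 8-byte little-endian eBPF instruction."""
    opcode, regs, off, imm = struct.unpack("<BBhi", raw)
    return Insn(opcode, regs & 0x0F, regs >> 4, off, imm)


# 0xb7 is the 64-bit "mov register, immediate" opcode: here r1 = 42.
mov = decode(bytes.fromhex("b70100002a000000"))

# 0x05 is an unconditional jump; its target is relative to the
# implicit program counter (here 3 instructions backwards).
jump = decode(bytes.fromhex("0500fdff00000000"))
```

The jump carries no absolute address, only a signed offset, which is what “the program counter is implicit” means in practice.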

But you can’t do whatever you want in an eBPF program. When a program is loaded, the kernel runs a verification step, and this verifier is the target of most eBPF vulnerabilities. For example, the verifier rejects arbitrary memory reads: to read memory from an eBPF program you need to go through a helper function such as bpf_probe_read or bpf_probe_read_user.

Currently, there are 165 helper functions, covering many different tasks. For example, you can write to userland memory using bpf_probe_write_user, or send a signal to the current process using bpf_send_signal, or to the current thread using bpf_send_signal_thread (an interesting basis for new pranks).
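Since helpers are invoked through a call instruction whose immediate holds the helper ID, a first triage of raw bytecode can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the helper IDs below are quoted from memory of include/uapi/linux/bpf.h, so check them against your own kernel headers.

```python
import struct

BPF_CALL = 0x85  # BPF_JMP | BPF_CALL

# Small hand-picked subset of helper IDs (assumption: values recalled
# from include/uapi/linux/bpf.h, to be verified before relying on them).
SENSITIVE_HELPERS = {
    4: "bpf_probe_read",
    36: "bpf_probe_write_user",
    109: "bpf_send_signal",
    112: "bpf_probe_read_user",
    117: "bpf_send_signal_thread",
}


def find_sensitive_calls(code: bytes):
    """Return (instruction index, helper name) for each flagged call."""
    hits = []
    for pc in range(0, len(code) - 7, 8):
        opcode, regs, _off, imm = struct.unpack_from("<BBhi", code, pc)
        # A source register field of 0 denotes a helper call;
        # other values are used for other kinds of calls.
        if opcode == BPF_CALL and (regs >> 4) == 0 and imm in SENSITIVE_HELPERS:
            hits.append((pc // 8, SENSITIVE_HELPERS[imm]))
    return hits


# r1 = 42; call helper 36; exit
sample = bytes.fromhex("b70100002a000000" "8500000024000000" "9500000000000000")
calls = find_sensitive_calls(sample)
```

The second half of a 16-byte wide-immediate load has a zero opcode byte, so walking the code in 8-byte steps does not mistake it for a call.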

It’s not surprising to see more and more security researchers using eBPF for offensive purposes:

  • bad-ebpf is a collection of eBPF programs that perform PID hiding, process hijacking …
  • ebpfkit is an entire rootkit implemented in eBPF
  • pamspy is a credential stealer

All these programs rely on libbpf. So we focused on how to reverse an eBPF program loaded by libbpf, to determine whether it is malicious.

As IDA users, we wanted a simple way to produce C code from a program that uses libbpf.

We used the latest version of pamspy as the sample to reverse.

Extracting eBPF code

eBPF programs handled by libbpf are compiled with LLVM to produce an ELF binary, which is then embedded in the host program.

The first step is to find this embedded ELF header (it can easily be obfuscated, but that is beyond the scope of this blog post):

That’s indeed an interesting function:

This function initialises the libbpf structure used to load the eBPF program: it declares the program’s name, a pointer to the ELF header, and the size of the program.

Here, we need to extract 4008 bytes, starting at the ELF header, to get the eBPF program.
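The extraction itself can be scripted. The following Python sketch lists every offset where the ELF magic appears in a binary and carves a fixed number of bytes from a chosen offset; the function names are ours, and the file names in the usage note are hypothetical.

```python
ELF_MAGIC = b"\x7fELF"


def find_elf_offsets(blob: bytes):
    """Return every offset at which the ELF magic appears in blob."""
    offsets = []
    pos = blob.find(ELF_MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(ELF_MAGIC, pos + 1)
    return offsets


def carve(blob: bytes, offset: int, size: int) -> bytes:
    """Extract exactly size bytes starting at offset."""
    chunk = blob[offset:offset + size]
    if len(chunk) != size:
        raise ValueError("blob is too short for the requested size")
    return chunk


# Tiny synthetic example: a host "ELF" with a second ELF embedded in it.
demo = ELF_MAGIC + b"A" * 10 + ELF_MAGIC + b"B" * 6
demo_offsets = find_elf_offsets(demo)
demo_chunk = carve(demo, demo_offsets[1], 8)
```

On a real sample you would read the file into `blob` (for example from a file named `pamspy`), pick the offset matching the pointer seen in the disassembly, and write `carve(blob, offset, 4008)` to a new file such as `pamspy.bpf.o`. The first hit is normally the host binary's own header, and the magic may also show up by chance in unrelated data, so the offset should be confirmed rather than assumed.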

Now we have our original ELF with eBPF bytecode inside.

Disassemble eBPF

Unfortunately, IDA will fail to open it because it doesn’t recognise the ELF machine ID 247 (EM_BPF). In IDA, processor plugins are responsible for supporting new architectures. Fortunately for us, there is an IDA processor for eBPF: eBPF_processor. This is an up-to-date version of the one made by Clément Berthaux for a challenge (I suppose for a SSTIC challenge ;-) ), with a lot of additions!
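The ID 247 lives in the `e_machine` field of the ELF header, a 16-bit value at offset 18. As a quick sanity check before opening an extracted file, a short Python sketch (the function name is ours) can read it directly:

```python
import struct

EM_BPF = 247  # ELF machine ID for eBPF objects


def elf_machine(header: bytes) -> int:
    """Return the e_machine field from the start of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # Byte 5 of the identification block (EI_DATA) gives the byte order:
    # 1 means little-endian, 2 means big-endian.
    endian = "<" if header[5] == 1 else ">"
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return machine


# Synthetic 20-byte header prefix: 64-bit, little-endian, e_machine = 247.
demo_header = b"\x7fELF\x02\x01\x01" + bytes(9) + b"\x01\x00\xf7\x00"
is_bpf = elf_machine(demo_header) == EM_BPF
```

If the check fails on a carved file, the offset or size used for the extraction is probably wrong.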

After loading this plugin, IDA will disassemble it perfectly using this new engine.

Decompile eBPF

The famous Hex-Rays decompiler is only available for a restricted set of processors, while the Ghidra decompiler engine supports many more. This is why we developed Yagi.

Yagi is an integration of the Ghidra decompiler in IDA. However, Ghidra’s main branch doesn’t support eBPF. Fortunately, a security researcher implemented eBPF support for Ghidra: eBPF-for-Ghidra.

In Yagi v1.5.0 we added support for eBPF.

So after adding the signatures of the 165 bpf helpers to help the decompilation process, here is the result:

Voila! Enjoy!