DEF CON 29 - PatH - Warping Reality: Creating and Countering the Next Generation of Linux Rootkits
Aug 5, 2021 17:39 · 7377 words · 35 minute read
- Good everyone, my name is Pat, I’m PathToFile on Twitter, Github, Discord, and most other places.
00:09 - And this talk is about Creating and Countering the Next Generation of Linux rootkits using eBPF.
00:15 - So today we’re gonna start with an overview on what Linux Kernel rootkits are, and we’re gonna cover why rootkits are such a powerful tool for attackers, but why they’re so dangerous to use.
00:26 - Next, we’re gonna introduce eBPF, and we’re gonna discuss how it can enable an attacker to have all the best parts of a kernel rootkit without any of the risks.
00:36 - Then finally, we’re gonna cover how to detect and prevent eBPF based rootkits before they take over as the preferred rootkit type for attackers.
00:45 - So firstly, what are kernel rootkits? Well, once an attacker compromises a machine, they’re going to want to maintain that access.
00:54 - So perhaps they exploited a vulnerability in a web application or use some stolen credentials.
00:59 - These holes can be closed. And when they are, the attacker is gonna want a way to regain access to that machine, preferably with root privileges and preferably in a way that is undetectable to security systems or systems administrators.
01:13 - This is the role of a rootkit. And in terms of access, there’s no better place to put a rootkit than in the kernel.
01:20 - When a program wants to list the files in a directory, it’ll use a syscall to ask the kernel to read the data from the hard drive on its behalf.
01:29 - If a rootkit can hook or intercept this call then it can simply remove any sensitive file from the directory listing before passing it back to the program.
01:38 - This same technique can be used to hide files, processes, network connections; really, it can be used to hide almost anything from user space programs.
01:48 - Now by living in the kernel the rootkit also has the ability to tap into all network traffic before any firewall and has the ability to launch processes with root privileges or alter the privileges of existing processes.
02:00 - So this all sounds incredibly useful for an attacker, so what’s the problem? Well, by running code in the kernel it’s very easy to turn a small mistake into a very big problem.
02:12 - There aren’t any guard rails or safety lines in the kernel.
02:15 - Once code is running there, it has the ability to read and write to almost anything.
02:21 - This means if there’s a bug in your code and you write to the wrong part of memory, you’re likely to crash the kernel, and if you crash the kernel you crash the entire system.
02:31 - And it could be even worse: if the kernel happened to be writing to a hard drive when you broke things, you could end up corrupting that disk, effectively breaking the entire system.
02:42 - Doing this will almost certainly bring in administrators and incident responders to determine what happened.
02:47 - So this is far from an ideal outcome for an attacker.
02:51 - Now, even if the rootkit developer was very careful, a kernel update has the ability to alter what a hooked function looks like or what a kernel object looks like.
03:03 - And all of this just increases the likelihood of a disaster occurring.
03:08 - This means that a rootkit developer often has to test the rootkit against every single kernel version that it plans to be deployed upon.
03:16 - So the good parts of kernel rootkits sound really good for an attacker, but the risks are often too high to make it viable.
03:24 - If only there was a way to keep all the advantages of a kernel rootkit, but have the safety and portability of a user space program.
03:34 - So, how about we add JavaScript-like capabilities to the Linux Kernel? Now to some people, this quote might sound like the wildest thing they’ve ever heard.
03:45 - But when Thomas Graf from Isovalent made it, he wasn’t talking literally about putting JavaScript in the Kernel.
03:51 - What he was talking about was introducing a way to run a certain type of code that has the visibility of the Kernel, but with the ease, safety and portability of user space systems such as JavaScript programs.
04:05 - And really what he was talking about was eBPF.
04:10 - So what is eBPF? So eBPF stands for extended Berkeley Packet Filter, but it has grown so much from the original BPF, particularly in the last two years, that any comparison to this classic version isn’t really relevant today.
04:28 - So what it is, it’s a system within the Linux Kernel that allows you to create programmable trace points known as eBPF programs.
04:37 - These programs can be attached to network interfaces to observe network traffic or the entry or exit points of Kernel functions, including syscalls and can even actually attach to user space programs and functions.
04:51 - If this sounds like the same places as our kernel rootkit, you’d be correct.
04:56 - But unlike the Kernel rootkit, eBPF programs are guaranteed to be safe from crashing the system.
05:02 - And they’re even portable across Kernel versions and even system architectures.
05:08 - So to explain how eBPF can achieve this, let’s have a look at how eBPF programs get written and loaded.
05:16 - So we’ll start with writing the eBPF program.
05:20 - So these are typically written in a restricted version of C or Rust, and there’s an example of one in the bottom left.
05:27 - So the programs have variables, loops, IF statements, all the standard parts of the language, but they’re heavily restricted in what external functions they can call.
05:36 - And it’s limited to only a number of BPF helper functions.
05:41 - Now, instead of compiling this code into native assembly, eBPF programs actually get compiled into what’s called BPF bytecode.
05:49 - This is a fairly simple, straightforward instruction set, but the most important thing about this is that the BPF bytecode is independent of the architecture or kernel version that it was compiled on.
06:02 - Now, once this bytecode has been compiled, it is now ready to be sent to the Kernel.
06:08 - So this code is sent to the Kernel using a user space program called a loader, which makes use of the BPF syscall to send it up into the Kernel.
06:18 - Now, technically non root users can load some eBPF programs on some kernel configurations, but these programs are extremely limited in what they can do, so they’re out of scope for this sort of talk.
06:31 - So for the purpose of this talk, this loading has to occur from either the root user or systems administrator.
06:39 - Now the Kernel doesn’t just blindly trust this bytecode, so the kernel runs what’s called the BPF verifier, which checks every branch and every possible value of every possible variable in this code to make sure that it is not doing things such as trying to read invalid memory, or slow the system down by being too big or complex, or do really anything else that might cause the kernel to crash.
07:07 - So this is where eBPF gets its safety guarantee, because only code that passes all of the verifier’s extensive checks is allowed to actually be loaded and run.
07:18 - Now, once code has passed the verifier, the kernel will then actually run a compiler to convert the bytecode into the native instructions that match that machine’s architecture and kernel version.
07:30 - So if it’s an x86 machine, it’ll compile it to x86, or if it’s (indistinct), for example.
07:36 - Now, by running these native instructions, eBPF code can run as fast and as efficiently as regular kernel code for that machine.
07:45 - But that’s not all this compiler does; it actually also dynamically looks up the addresses of the BPF helper functions or any Kernel objects that the code is using, and it will actually patch the instructions to match that specific kernel version.
08:05 - And so this is where the code can be portable, because this compilation step knows exactly what the format of that helper function is and what that object looks like for that specific kernel.
08:16 - And so by patching the instructions as it goes to compile it, it means that that code will be specifically designed to run on that system.
08:25 - Now once these programs are compiled, they are attached to either the network interface or the kernel function that they need to be attached to, where they will run once for every packet or function call.
08:38 - Now programs can’t retain state from one run to the next, but they can make use of a global key value store called an eBPF map.
08:46 - And so programs can actually store their state in that between runs, and then the next time it runs, it can read that state and then pick up where it left off.
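The map-based state keeping described here can be modelled outside the kernel. The following is a simulation only, not eBPF code: a plain Python dict stands in for an eBPF map, and a function stands in for a program that runs once per packet. The names `packet_counts` and `on_packet` are hypothetical.

```python
# Simulation only: a dict stands in for an eBPF map, and a Python function
# stands in for an eBPF program that is invoked once per packet.
packet_counts = {}  # hypothetical map: source IP -> packets seen so far


def on_packet(src_ip):
    """Runs once per packet; its local variables vanish when it returns."""
    seen = packet_counts.get(src_ip, 0)  # read state left by earlier runs
    packet_counts[src_ip] = seen + 1     # store state for the next run
    return packet_counts[src_ip]


for ip in ["10.0.0.5", "10.0.0.5", "192.0.2.9"]:
    on_packet(ip)

print(packet_counts)
```

Nothing survives in the function itself between calls; everything that needs to persist lives in the shared key-value store, which is the property the talk relies on.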
08:54 - Now that is an extremely quick overview on what eBPF is; I’ll have links to much more in-depth documentation at the end of this.
09:03 - But for now I wanna go into what an attacker with the privileges to load and run eBPF programs can do, and how they can use it to achieve the same rootkit functionality as a regular kernel rootkit. So, the first thing we’re gonna cover is using eBPF to warp the network reality.
09:23 - So this is a diagram of a fairly standard web service set up, it has two network interfaces.
09:30 - So on the left, we have the internet facing network interface.
09:35 - Where a firewall only allows traffic to and from a website that’s listening on, say port 443.
09:42 - Now on the right, we have the administrators’ access.
09:44 - So this is via a separate network interface that is attached to an internal VPN network.
09:50 - And there’s an SSH server listening on this internal section.
09:54 - And that’s how the administrators, when they wanna access the machine they will go through the internal network and SSH onto this machine.
10:00 - But to make things interesting, let’s say this SSH connection requires multi-factor authentication.
10:07 - Now an attacker who’s gained access to this machine will probably want the ability to connect in from the internet, but they still wanna be able to gain the same privileged access to the host that is supposed to be limited to only the internal VPN side.
10:23 - So eBPF programs have the ability to read and write all network packets across all interfaces.
10:29 - And before the firewall has the ability to block the connection.
10:32 - What this means is, if a connection were to come in from an attacker’s IP address on the internet, even to a closed port,
10:41 - then eBPF can actually alter both the destination and source IP addresses and ports to make it look like this traffic is actually coming from a fake IP address that matches the internal VPN side.
10:54 - It can then route the traffic into the internal interface, into the SSH service.
10:59 - And so from the perspective of SSH this looks like just a regular connection from the internal systems.
11:07 - And not only will SSH see this traffic as regular; if an administrator is using tools such as Wireshark or netstat or tcpdump, then from the perspective of these tools, the network connection also only appears to be coming from the fake IP address on the internal network.
11:26 - And they will have no idea that the connection is actually being routed from the internet.
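To make the address rewriting concrete, here is a user space sketch of the arithmetic involved, not the eBPF program from the talk. It patches the source address of a bare 20-byte IPv4 header and recomputes the header checksum. All addresses are illustrative documentation ranges, and the function names are hypothetical.

```python
import socket
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words, then inverted."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(src: str, dst: str) -> bytes:
    """Build a minimal 20-byte IPv4 header (protocol TCP) with a valid checksum."""
    fields = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0,
                         socket.inet_aton(src), socket.inet_aton(dst))
    return fields[:10] + struct.pack("!H", ipv4_checksum(fields)) + fields[12:]


def rewrite_source(header: bytes, new_src: str) -> bytes:
    """Replace the source address and fix up the header checksum."""
    patched = bytearray(header)
    patched[12:16] = socket.inet_aton(new_src)  # bytes 12..15: source address
    patched[10:12] = b"\x00\x00"                # zero the checksum field first
    patched[10:12] = struct.pack("!H", ipv4_checksum(bytes(patched)))
    return bytes(patched)


original = build_header("203.0.113.7", "198.51.100.10")
patched = rewrite_source(original, "10.8.0.5")
print(socket.inet_ntoa(patched[12:16]), ipv4_checksum(patched))
```

A checksum computed over a header that already carries a valid checksum comes out as zero, which is how the result can be verified. A real redirect would also have to update the TCP checksum, since it covers the addresses through the pseudo-header; that step is left out of this sketch.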
11:33 - Now this isn’t the only tactic eBPF can employ because it has the ability to read and write network packets.
11:41 - It has the ability to see any network packets before any other system.
11:44 - So it has the ability to receive command and control information from even a port that nothing is listening on, and then it can just silently drop those packets, so no security system will know that that command and control data has actually reached the system.
12:00 - Then while eBPF cannot create its own connections, it has the ability to clone existing packets.
12:07 - So it could clone some existing, legitimate traffic that’s, say, going to the website, then alter the destination IP address to be the attacker’s IP address, and then alter the actual data within the packet to be whatever it wants, and then it can send it off to the attacker.
12:26 - So this technique could be used to exfiltrate arbitrary data from the machine.
12:31 - Then finally, eBPF programs can be attached to the user space programs.
12:37 - So for example, it could be attached to the website and hook into the functions that do the TLS encryption and decryption for the website.
12:46 - Now we’ll explain in more detail how that function hooking works in the next section, but what this does enable, is it actually enables eBPF to change out the data underneath the encrypted TLS connection.
12:59 - So that even an external network monitor would only see legitimate TLS traffic going to and from the website.
13:10 - And it doesn’t actually know that eBPF might be reaching underneath the TLS and swapping out the website’s data to be some exfiltrated data from the system.
13:22 - So altering data across the network is only one type of malicious behavior eBPF can do.
13:28 - The real strength actually lies in its kernel function hooking and even syscall interception.
13:35 - Because it’s this ability that allows it to warp reality around files, processes, and even users.
13:42 - So going back to my SSH example, it’s not enough to just be able to connect to the service.
13:48 - If logging on requires a valid password and multifactor authentication, then it’s unlikely the attacker will be able to easily log on.
13:56 - But what if there was a way to make SSH ignore this multi-factor requirement or even the username and password requirement, and just allow anybody to log onto the system? SSH knows that there’s extra requirements such as multi-factor due to configuration files that are in the /etc/pam.d folder.
14:19 - And when it goes to authenticate a username and password, it’ll look inside the /etc/passwd and /etc/shadow files to make sure that the supplied username and password are correct.
14:31 - So is there a way that eBPF can lie about the contents of all of these files? So, yes.
14:38 - And to explore how eBPF can do this, let’s first quickly revisit how user space programs actually read files using syscalls.
14:47 - So when a process wants to read a file, it actually makes two syscalls to the Kernel.
14:53 - So the first is to open or openat, and this will check that the file that the program wants to open actually exists, and that the user that’s wanting to read the file is actually allowed to do so.
15:05 - Now, if they are, the Kernel will return what’s called a file descriptor number or FD number, which is simply a reference to that file for that process.
15:15 - Then the process will make a second syscall, this time to read, asking the kernel to read the file that matches the supplied FD number it got from the open call.
15:27 - And it’ll actually give a memory buffer to the kernel to fill in with the file’s content.
15:33 - The kernel will then look up that FD number, make sure it’s a valid number, then grab the file and copy the file’s data into that process’s buffer, before returning to the user space process.
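The two-step sequence above can be seen directly from user space. This is a small illustration using Python's thin wrappers over the open and read syscalls; the file name and contents are made up for the example.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "readme")
with open(path, "wb") as f:
    f.write(b"this is real data\n")

fd = os.open(path, os.O_RDONLY)  # first syscall: the kernel hands back an FD number
data = os.read(fd, 4096)         # second syscall: the kernel fills a buffer for that FD
os.close(fd)

print(fd, data)
```

The only link between the two calls is the FD number, which is why anything that wants to tamper with a specific file's contents has to watch both calls.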
15:45 - What this means is, if we have four different eBPF programs, we can observe what’s going on, and we can actually watch what is being sent both to and from these two different syscalls.
16:01 - What this means is we’re able to track what FD number corresponds to what file name.
16:07 - And we can even actually read the data contained within the file before the user space program does, by reading the contents of the buffer after the syscall has exited, which would be that eBPF program at the bottom.
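The bookkeeping across the four hook points can be sketched as a simulation. This is plain Python, not eBPF: dicts stand in for maps, and the four functions stand in for programs on the entry and exit of open and read. Every name here is hypothetical.

```python
# Simulation only: dicts stand in for eBPF maps shared by four hook programs.
pending_open = {}  # pid -> filename seen at open entry
fd_names = {}      # (pid, fd) -> filename, recorded at open exit
pending_read = {}  # pid -> (fd, buffer) seen at read entry
observed = []      # (filename, bytes) captured at read exit


def enter_open(pid, filename):
    pending_open[pid] = filename


def exit_open(pid, ret_fd):
    filename = pending_open.pop(pid, None)
    if filename is not None and ret_fd >= 0:
        fd_names[(pid, ret_fd)] = filename


def enter_read(pid, fd, buf):
    if (pid, fd) in fd_names:  # only follow FDs we know the name of
        pending_read[pid] = (fd, buf)


def exit_read(pid, ret_len):
    entry = pending_read.pop(pid, None)
    if entry is not None and ret_len > 0:
        fd, buf = entry
        observed.append((fd_names[(pid, fd)], bytes(buf[:ret_len])))


# A simulated process 1234 opens and reads a file.
buf = bytearray(64)
enter_open(1234, "/etc/passwd")
exit_open(1234, 3)
enter_read(1234, 3, buf)
buf[:5] = b"root:"  # stands in for the kernel filling the buffer
exit_read(1234, 5)
print(observed)
```

The entry hooks only record arguments; the exit hooks are where the return value becomes known, so that is where the FD is tied to a name and where the filled buffer can be inspected.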
16:21 - But reading buffers isn’t the only thing eBPF can do; it can also write to them.
16:28 - So let’s have a look at this basic example.
16:31 - On the left is a very simple user space program, it is looking to open the file called read me, and then it’s asking the Kernel using the read syscall to read the data from that file into a buffer called buffer.
16:46 - Now on the right is the eBPF program that is attached to the exit of the read syscall.
16:53 - So after the file has been opened, and after the user space program has asked the Kernel to read the file, the kernel will read the file into the buffer, but before the user space program gets control again, our eBPF program is going to run.
17:08 - So you can see at the start of this program that it’s using the bpf_probe_read_user function to read the contents of that buffer, which at this stage will include the file data.
17:20 - But there is also a bpf_probe_write_user function.
17:25 - Now this actually allows us to alter the data within that buffer, and then write it back into user space memory before the user space program sees it.
17:36 - What this means is, once the eBPF program exits and control is returned to the user space program, the program will think that the buffer contains the file data, when in actuality it contains the fake data that’s been put there by eBPF.
17:52 - This bpf_probe_write_user call can be used to overwrite any user space buffer, pointer, or string that gets passed into or out of syscalls or kernel functions.
18:04 - So things like changing what program is getting launched by execve, or reading and altering netlink data; heaps of different stuff can be possible with this call.
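The effect of overwriting a buffer after the syscall has filled it can be modelled in user space. This sketch is not eBPF: a `bytearray` stands in for the process's read buffer, and `patch_buffer` is a hypothetical helper. It assumes the replacement is the same length as the original text, since a hook that only writes into the buffer does not change the byte count the syscall already reported.

```python
def patch_buffer(buf: bytearray, length: int, find: bytes, replace: bytes) -> int:
    """Overwrite each occurrence of `find` within buf[:length]; return the count."""
    if len(find) != len(replace):
        raise ValueError("replacement must be the same length as the original")
    count = 0
    start = buf.find(find, 0, length)
    while start != -1:
        buf[start:start + len(find)] = replace  # in place: the buffer size is unchanged
        count += 1
        start = buf.find(find, start + len(find), length)
    return count


buf = bytearray(b"this is real data\n")  # what the kernel put in the buffer
patched = patch_buffer(buf, len(buf), b"real", b"fake")
print(patched, bytes(buf))
```

The caller still holds the same buffer and the same length; only the bytes inside it differ from what is on disk.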
18:13 - So another thing eBPF can do is bypass the syscall altogether.
18:18 - And instead, just pretend that the function ran and return an arbitrary error code or return value.
18:26 - This can be done using the fmod_ret type of eBPF program.
18:30 - While these can’t be attached to every function, they can be attached to every syscall, at least on newer kernels.
18:37 - So for example, in the example in the top right, this program simply pretends to write to a file, and it returns the expected success code to indicate that the file was actually written, but the file is never actually written and the write syscall is never actually called.
18:56 - Now, if the goal is to prevent a process from discovering or stopping the rootkit, a more drastic option can be to simply kill the process.
19:05 - So by using the bpf_send_signal helper function, the program can send an unstoppable SIGKILL signal, which will immediately instruct the kernel to start tearing down that process, regardless of whether it wants to or not.
19:19 - Now, this is a pretty drastic action that could probably be noticed, but it’s certainly a possible way to prevent an action from occurring.
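Why SIGKILL is described as unstoppable can be shown from user space on a POSIX system. This is an illustration of the signal's semantics only; it uses ordinary process signalling from Python, not the eBPF helper. The child process ignores SIGTERM and survives it, but cannot do the same for SIGKILL.

```python
import signal
import subprocess
import sys

# The child ignores SIGTERM, reports that it is ready, then sleeps.
child_code = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)
child = subprocess.Popen([sys.executable, "-c", child_code], stdout=subprocess.PIPE)
child.stdout.readline()              # wait until the SIGTERM handler is in place

child.send_signal(signal.SIGTERM)    # the child is ignoring this one
still_alive = child.poll() is None

child.send_signal(signal.SIGKILL)    # cannot be caught, blocked, or ignored
child.wait(timeout=10)
child.stdout.close()
print(still_alive, child.returncode)
```

On POSIX, `subprocess` reports death by signal as a negative return code, so the second value identifies the signal that ended the process.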
19:29 - Now, killing every process that attempts to open any file is gonna be a quick way to having a really bad time.
19:36 - So thankfully, eBPF programs have lots of ways to tailor an action based on who or what is performing it.
19:43 - So eBPF programs can do different things based upon the process name, the user ID, the value of arguments being passed or returned from that function.
19:52 - And it can even take cues from other eBPF programs.
19:56 - So for example, it could start tampering with the read calls on a file only after a connection from a specific IP address has occurred.
20:10 - So looking back to our SSH example, to bypass multifactor, eBPF can simply overwrite the data being read from the PAM configuration files to remove any mention of multifactor.
20:22 - It could then even overwrite the data in /etc/passwd and /etc/shadow to insert a fake user account and password, which would enable logging in to the machine with a completely fake account.
20:36 - By having eBPF only target the SSH process, it means that for an administrator using tools like cat or vim, or a security tool, or even file forensics looking at the actual file on disk,
20:49 - all of these will only show the normal unedited file.
20:53 - Only SSH will be presented with the warped reality version of this data.
21:00 - Okay, so now it’s time for some demos. The first demonstration that we’re gonna go through is the ability to replace text within arbitrary files.
21:12 - So we can see in the shell on the top, right, that there is a folder with a file called file.
21:18 - And when we look inside this file, we see the text, this is real data.
21:23 - So now we’re gonna go ahead and load up the first of our eBPF rootkits into this shell on the left.
21:29 - So this rootkit is going to look for any time a process reads the file named file, and it’s gonna replace any data that contains the text real with the word fake.
21:39 - But specifically it’s not going to affect every process, it’s only going to affect children of this specific process ID.
21:47 - Which we can see, that matches with our shell on the right.
21:51 - So, we’re gonna go ahead, our rootkit is now started, and now back to our shell, when we go to read this file, we can see that the data has been changed and it says, this is fake data.
22:04 - And in fact, the log from the rootkit actually says that it did detect that this process started, which was cat, and it replaced the text that that process read.
22:16 - But what is interesting is if we go to this shell now on the bottom right: because this is a different shell with a different process ID, even though we’re in the same folder and we look at the same file, it sees the unaltered data.
22:33 - So this is a technique that has many, many uses.
22:37 - The example that we’ve used in the presentation is that this is a way that you can add a user into the /etc/passwd file, but only do that for an SSH process and not for any auditing software or system administrator having a look at that file.
23:02 - So the next demonstration we’re gonna go through is the ability to stealthily enable a user to use Sudo to (indistinct).
23:14 - So typically on this machine, if we have a look at this shell in the bottom right, this is running as the user called lowpriv.
23:22 - And if lowpriv wanted to become root using sudo, what we can see…
23:29 - Oh, lowpriv is not in the sudoers file.
23:33 - And in fact, if we use the root shell on the top right, we can double and triple confirm that lowpriv is not in this list.
23:42 - So first off we can use the sudo -l command to say, hey, what privileges does sudo believe this user to have? And when we run it, we see user lowpriv is not allowed to run sudo, and in fact, we can even, at the most basic level, just go and read the /etc/sudoers file and look for the user lowpriv, and sure enough, that user is not in there.
24:09 - So this user lowpriv is definitely not able to become the root user using sudo.
24:16 - That is until we run the second of our eBPF rootkits on the left.
24:24 - So this one is going to look for any time anything, specifically sudo, is going to try to open the /etc/sudoers file.
24:34 - And it’s just going to alter the text in that file to say, hey, lowpriv is actually in there and they do actually have those privileges.
24:42 - But it’s going to do this not just whenever sudo is running, but only when sudo is being run specifically by lowpriv.
24:52 - So now this rootkit has started, if lowpriv were to use sudo again and say whoami? Well, look at this, it can become root.
25:03 - In fact, it didn’t even need to enter its password.
25:06 - So how does that work? Well, so if as lowpriv, we now ask sudo to tell us, hey, what privileges do you think I have? Well, we can see that when sudo is running as lowpriv it says, hey, lowpriv has the ability to do anything it wants without even needing to enter any password.
25:32 - But even with this rootkit running, if we were to check these permissions as a different user, it still says, hey, lowpriv is not allowed, and if we check that file, it still says that lowpriv is not in there.
25:46 - So this file is altered not just whenever a sudo process is running, but only when it is a sudo process being run by that lowpriv user.
26:03 - The last example that we’re gonna go through is the ability to kill arbitrary processes as a sort of self protection idea.
26:15 - So, typically an administrator can use a tool called bpftool, and this lists the eBPF programs that are running on that system.
26:29 - And it’s got the ability to see what programs are running.
26:32 - So for example, we can see a lot of eBPF programs that actually relate to systemd currently running on this system.
26:39 - And bpftool has the ability to list the running programs, to dump out the instructions, and even to see what process IDs are actually related to that eBPF program.
26:53 - So, this is a good way for an administrator to see what eBPF programs are running, to potentially discover something like an eBPF rootkit.
27:03 - So we’re gonna load up a rootkit that is running on the left.
27:09 - And now, if we attempt to use bpftool to list the programs, oh, the process just gets killed before that information is shown.
27:18 - So this is pretty extreme, but this demonstrates that eBPF has the ability to sort of protect itself by just killing any process that is attempting to do any sort of investigation.
27:43 - So now we’re gonna cover some other features of eBPF and then get into some limitations of it.
27:48 - So three features that we haven’t yet covered, but I think are definitely worth mentioning.
27:53 - Firstly, on some network cards, you can actually run the eBPF programs on the network card hardware itself, instead of in the kernel.
28:03 - So for regular developers, this is great because this can drastically increase the packet processing speeds.
28:09 - But from a rootkit perspective, this is interesting to know, because what this means is that if an eBPF program is running on the network card, any packet alterations it makes will occur after the Linux kernel has tentatively scanned that packet for anything malicious.
28:27 - So if you wanted to send a packet to a malicious IP address, but you address it to a benign IP address, and then only once it’s left the kernel and is inside the network card do you alter it to the dodgy IP address, well, any security system that’s running in the kernel won’t see that alteration.
28:45 - So, secondly, up until recently eBPF programs that are attached to Kernel functions or syscalls required their user space loader to continue to run once the program is up and running.
28:58 - And in fact, if that user space loader were to exit, then the kernel would just assume that you also want to stop running the eBPF programs and shut them down.
29:07 - Now, in newer kernels there’s been the introduction of these fentry and fexit types of BPF programs.
29:14 - Now these actually have the ability to be pinned to this /sys/fs/bpf folder, which means a special file gets created under this folder, one for each BPF program, and then for as long as this file remains there, the loader is free to exit, delete itself, be completely removed, and the eBPF programs will continue to run.
29:36 - Then when you want to stop them, you just delete the files.
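Since pinned programs show up as files, one simple check a defender could run is to list what sits under the BPF filesystem. This is a minimal sketch assuming the conventional mount point `/sys/fs/bpf`; the function name `list_pinned` is hypothetical, and reading the directory normally requires elevated privileges.

```python
import os


def list_pinned(bpffs_root="/sys/fs/bpf"):
    """Return the sorted paths of every file found under the BPF filesystem root."""
    found = []
    # os.walk silently yields nothing if the root is missing or unreadable.
    for dirpath, _dirnames, filenames in os.walk(bpffs_root):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)


for path in list_pinned():
    print(path)
```

An empty result does not prove that nothing is loaded: programs can stay alive without being pinned, and a rootkit that is already running could tamper with the directory listing itself.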
29:41 - So finally, the thing worth mentioning is, as we said before, the BPF verifier puts strict limits on how complex an individual eBPF program can be, but there exists a mechanism to chain multiple programs together using a helper function called bpf_tail_call.
30:01 - Now this requires some preparation and making use of those eBPF maps we mentioned, but the end result is the system as a whole can be much more complex than what a single eBPF program is allowed to be.
30:14 - So for example, on the right, these are four code graphs from four different eBPF programs that actually just make up one half of that text-replacing rootkit that we just demonstrated.
30:26 - Now each individual program is pushing the limit on what a single eBPF program is allowed to do, but all combined the system as a whole can be much, much more complex.
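The chaining idea can be modelled as a simulation. This is plain Python, not eBPF: a dict stands in for the program-array map, each function stands in for one small program, and state is carried in a shared `ctx` dict, whereas real programs would pass state through maps. All names are hypothetical.

```python
# Simulation only: a dict stands in for a program-array map.
prog_array = {}


def tail_call(index, ctx):
    """Jump to the program stored at `index`; if the slot is empty, carry on."""
    target = prog_array.get(index)
    if target is None:
        return ctx  # a failed tail call lets the caller continue
    return target(ctx)


def stage_find(ctx):
    """First small program: locate the text, then hand over to the next stage."""
    ctx["offset"] = ctx["data"].find(b"real")
    return tail_call(1, ctx)


def stage_replace(ctx):
    """Second small program: overwrite the text found by the first stage."""
    if ctx["offset"] >= 0:
        off = ctx["offset"]
        ctx["data"][off:off + 4] = b"fake"
    return tail_call(2, ctx)  # slot 2 is empty, so the chain ends here


prog_array[0] = stage_find
prog_array[1] = stage_replace

result = tail_call(0, {"data": bytearray(b"this is real data")})
print(bytes(result["data"]))
```

Each stage stays small enough to be checked on its own, while the chain as a whole does more work than any single stage.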
30:40 - Now there’s a number of limitations to writing a rootkit using eBPF.
30:45 - So the first one is, when using the bpf_probe_write_user function to overwrite the buffer, there is a small window of time between when the syscall fills the data inside the buffer and when eBPF overwrites it.
31:00 - Now this timing doesn’t matter in single-threaded programs, because the current execution is not gonna return to the user space program until after eBPF has done its thing, but in a multi-threaded program, a second thread could be constantly reading the contents of that buffer and actually read what is the true data from the syscall before eBPF has a chance to tamper with it.
31:25 - Now, the first major issue with using eBPF as a rootkit is that programs don’t persist across a reboot.
31:33 - So what this means is, when a machine restarts, the user space loader needs to run again to load and attach all those eBPF programs back into the kernel.
31:45 - Now, the second major thing is that eBPF programs can’t write to kernel memory, because this would almost certainly break those safety guarantees of eBPF.
31:55 - What this means is, if a security tool is running in the kernel, such as auditd or a Linux security module, these are gonna be unaffected by eBPF tampering.
32:05 - But one thing to note about these is that while a security product might be running in the kernel, they’re usually administered by user space tools.
32:14 - So if a rootkit were to disable the security tool in the kernel, but then lie to the user space controller about the current running status of the system, that might be enough to fool the system into thinking that it’s more secure than it actually is.
32:33 - Okay, let’s talk about the defensive side of things.
32:36 - So we’re gonna start with file forensics. So if you’re looking to detect files that contain eBPF code, there’s a couple of things to think about.
32:44 - So for starters, the file that gets generated by the compiler, when compiling a program to BPF bytecode is actually an ELF file.
32:53 - Where the byte code is inside a named section within that ELF.
32:59 - So what this means is that tools such as readelf or objdump can actually be used to parse these files and extract the BPF bytecode.
33:08 - So for example, the eBPF program on the top right is being attached to the execve tracepoint, that’s the tp syscalls execve at the top.
33:19 - What happens is, when this gets compiled, that tp syscalls execve actually becomes the name of the section inside the ELF that contains the raw bytes.
33:28 - So you can use a tool then to read that section and extract the eBPF bytecode.
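As a rough illustration of reading named sections without external tools, here is a minimal parser that assumes a little-endian ELF64 image, which is the common layout for BPF object files. It is a sketch, not a replacement for readelf; `elf64_sections` is a hypothetical name.

```python
import struct


def elf64_sections(blob: bytes):
    """Return {section name: section bytes} for a little-endian ELF64 image."""
    if blob[:4] != b"\x7fELF" or blob[4] != 2 or blob[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    # ELF header: only the section header table fields are needed here.
    (_, _, _, _, _, _, shoff, _, _, _, _, shentsize, shnum, shstrndx) = \
        struct.unpack_from("<16sHHIQQQIHHHHHH", blob, 0)
    # Each section header: name offset, type, flags, addr, offset, size, ...
    headers = [struct.unpack_from("<IIQQQQIIQQ", blob, shoff + i * shentsize)
               for i in range(shnum)]
    # The section-name string table is itself a section.
    strtab_off, strtab_size = headers[shstrndx][4], headers[shstrndx][5]
    strtab = blob[strtab_off:strtab_off + strtab_size]
    sections = {}
    for name_off, sh_type, _, _, offset, size, *_ in headers:
        name = strtab[name_off:strtab.index(b"\0", name_off)].decode()
        # Type 8 is SHT_NOBITS: the section occupies no bytes in the file.
        sections[name] = b"" if sh_type == 8 else blob[offset:offset + size]
    return sections
```

Given the bytes of a compiled object, for example read from a hypothetical `prog.bpf.o`, the returned dict maps each section name to its raw contents, so the bytecode for a program is whatever sits under its attach-point section name.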
33:34 - Now it’s important to note, that’s just the object from the compiler, that’s not the user space loader.
33:40 - Because what gets sent to the kernel is just the BPF bytes.
33:45 - Now it’s important to note that a lot of loaders are gonna be written using this libbpf library.
33:53 - Because this is a library that is actually part of the Linux source tree and makes it a lot easier to read, and write and manage eBPF programs.
34:03 - Now, if a loader is using libbpf, what this library will actually do is embed the entire ELF object from the compiler inside the user space loader.
34:14 - So you end up with an ELF inside of an ELF.
34:17 - And so therefore if you’re wishing to extract the bytecode, you would first need to look at the read-only data inside the loader, extract the ELF from there, and then parse that ELF to find the correct section, to get the BPF bytecode.
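A crude first pass at spotting an ELF inside of an ELF is to look for a second ELF magic number somewhere past the start of the file. This is a heuristic sketch with a hypothetical function name; the four magic bytes can also appear by coincidence, so each hit is only a candidate that still needs to be parsed.

```python
ELF_MAGIC = b"\x7fELF"


def embedded_elf_offsets(blob: bytes):
    """Return the offsets past 0 where another ELF magic appears in the image."""
    offsets = []
    pos = blob.find(ELF_MAGIC, 1)  # skip offset 0: that is the outer ELF itself
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(ELF_MAGIC, pos + 1)
    return offsets


sample = ELF_MAGIC + b"\x02\x01\x01" + b"A" * 57 + ELF_MAGIC + b"inner"
print(embedded_elf_offsets(sample))
```

Each offset can then be treated as the start of a candidate inner object and handed to an ELF parser to see whether it contains BPF program sections.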
34:31 - Now, once you do extract the bytecode, the biggest thing to look for would be evidence of that bpf_probe_write_user function.
34:40 - It might be more difficult to automatically tell if a network packet altering program is malicious or not, but I can’t imagine too many legitimate use cases of this bpf_probe_write_user function.
34:53 - Now I haven’t touched at all on what BPF byte code instructions actually look like, but the instruction code that comes out of the first compiler is that example on disk.
35:08 - So the 85, and then the 24 matches with the bpf_probe_write_user function.
35:13 - But if you remember, the bytecode is architecture and kernel version agnostic.
35:18 - And so what happens is when this bytecode gets sent into the kernel as we explained, the kernel will actually patch that 24 to match up with what is the correct address for that kernel.
35:28 - So if you’re looking at the bytecode that’s being stored inside the kernel, you would actually then need to dynamically look up that memory address to determine that it is the bpf_probe_write_user function.
35:40 - And then if you’re looking at the native code after the JIT compiler inside the kernel, then this is definitely just gonna look like a regular kernel instruction.
35:48 - So you would need to, again, dynamically look up that memory address to determine that it is the bpf_probe_write_user function.
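For the on-disk case, the "85 then 24" pattern can be searched for mechanically. The sketch below assumes little-endian eBPF bytecode as it comes out of the compiler, where every instruction slot is 8 bytes (opcode, registers, 16-bit offset, 32-bit immediate), a helper call has opcode 0x85 with a source register of 0, and the immediate holds the helper ID (36, or 0x24, for bpf_probe_write_user). The function name is made up for this example. It would not work on bytecode dumped from the kernel or on JIT output, for the reasons just described.

```python
import struct

BPF_CALL_OPCODE = 0x85        # BPF_JMP | BPF_CALL
PROBE_WRITE_USER_ID = 36      # 0x24, helper ID of bpf_probe_write_user


def find_probe_write_user_calls(bytecode: bytes) -> list:
    """Return the instruction indexes that call bpf_probe_write_user."""
    hits = []
    for off in range(0, len(bytecode) - 7, 8):
        opcode, regs, _offset, imm = struct.unpack_from("<BBhi", bytecode, off)
        src_reg = regs >> 4  # a non-zero source register marks other call kinds
        if opcode == BPF_CALL_OPCODE and src_reg == 0 and imm == PROBE_WRITE_USER_ID:
            hits.append(off // 8)
    return hits
```

A non-empty result is not proof of malice on its own, but it is a strong reason to look at the program more closely.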
35:59 - So to protect a running system, I think one of the strongest defenses would be to monitor that bpf syscall, which, you could use eBPF to monitor eBPF.
36:10 - So monitoring for what programs are loading and running eBPF programs, is gonna be a really good tactic.
36:17 - Because realistically, this should only be a small number of known programs that are actually interacting with eBPF.
36:25 - Now, if a program actually sounds suspicious because eBPF has intercepted the syscall, you could actually extract the program’s bytecode, and then send it to somewhere else to be analyzed where you could detect (indistinct) suspicious behavior in the code.
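The allowlisting idea behind this can be sketched independently of how the events are captured. The example below assumes some monitor (for instance an eBPF program on the bpf syscall entry) already delivers `(pid, comm, cmd)` tuples; that event shape, the function name and the loader names in the usage note are all assumptions for illustration. The only kernel fact used is that `BPF_PROG_LOAD` has the value 5 in the kernel's `enum bpf_cmd`.

```python
BPF_PROG_LOAD = 5  # value of BPF_PROG_LOAD in the kernel's enum bpf_cmd


def suspicious_loads(events, allowed_loaders):
    """Return (pid, comm) for program loads by processes not on the allowlist.

    events: iterable of (pid, comm, cmd) tuples seen at bpf() syscall entry.
    allowed_loaders: set of process names expected to load eBPF programs.
    """
    return [
        (pid, comm)
        for pid, comm, cmd in events
        if cmd == BPF_PROG_LOAD and comm not in allowed_loaders
    ]
```

For example, with an allowlist of `{"known-agent"}`, a `BPF_PROG_LOAD` from a process called `known-agent` passes silently, while the same command from any other process name is returned for follow-up, such as extracting and analyzing its bytecode. Matching on process name alone is easy to spoof, so a real monitor would want a stronger identity than this.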
36:41 - So a lot of the things I’ve covered so far assume that an eBPF rootkit is not already installed and tampering with the system.
36:50 - Because if it is already running, it actually would have the ability to either block or hide from a bunch of these user space programs that are scanning files or attempting to load an eBPF program.
37:03 - But even kernel rootkits have a hard time hiding from memory forensics.
37:07 - Which, particularly if the machine is virtualized, the memory can be acquired from underneath the kernel, at the hypervisor or even the physical level.
37:18 - Volatility is the name of an excellent memory forensics tool.
37:21 - And in fact, at this year’s Black Hat, the team is releasing some new plugins specifically around acquiring and analyzing Linux tracing forensics.
37:30 - Now, as I’m prerecording this talk, I don’t know exactly what they’re gonna cover, but I did speak briefly with a number of the team members before I recorded this, and I’m really excited to actually get a hold of and play with the plugins that they’re producing, because they sound incredibly interesting.
37:48 - So one final prevention could be to just straight up disable any use of eBPF within the kernel.
37:56 - Now this requires you to recompile the kernel with the relevant flags disabled and by doing so, you’d lose all the advantages of using eBPF for your own reasons, but this is definitely an option for some.
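To see where a given kernel stands, its build configuration can be checked for the BPF options. The talk does not name the flags, so the three option names below are an assumption about which ones are relevant; they are real kernel options, with `CONFIG_BPF_SYSCALL` being the one that gates the bpf syscall itself. The function name is made up for this sketch, which only parses config text and does not change anything.

```python
BPF_OPTIONS = ("CONFIG_BPF_SYSCALL", "CONFIG_BPF_JIT", "CONFIG_BPF_EVENTS")


def bpf_options_enabled(config_text: str) -> dict:
    """Map each BPF-related option to True if the config text enables it."""
    enabled = {opt: False for opt in BPF_OPTIONS}
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_X is not set".
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key in enabled:
            enabled[key] = value in ("y", "m")
    return enabled
```

On many distributions the text to feed in can be read from a `config-<version>` file under `/boot`, though where the config lives, and whether it is shipped at all, varies between systems.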
38:10 - So additionally, at the moment there are some discussions going on within the community about how to cryptographically sign eBPF programs in the same way that you can sign kernel modules.
38:22 - Doing so would allow a system to load only trusted eBPF programs, but prevent unknown or untrusted programs from being loaded and used.
38:32 - Now, implementing this is definitely nontrivial, particularly due to that compilation step, but some smart people are really looking at this.
38:40 - And so in the future, this may end up being the best defense against eBPF-based rootkits.
38:48 - Before we finish quickly, what else can eBPF do? So, firstly, eBPF now runs on Windows.
38:56 - So in May Microsoft released the start of a project on GitHub called eBPF for Windows.
39:02 - It’s in the early stages at the moment, it’s only got the network observability side of things and not the function hook or syscall hooking.
39:10 - But a lot of people including myself, are really interested in seeing how this project evolves.
39:15 - So if you’re interested in Windows at all, I would highly recommend checking out this project.
39:22 - Now, another thing I wanted to mention is that warping reality, isn’t just for attackers.
39:27 - So these same ideas around altering file or network data are also incredibly useful to reverse engineers.
39:35 - Either doing malware analysis or even bug hunting.
39:38 - So for example, it’s not uncommon for malware to perform a series of checks to determine if it’s actually running on a victim machine, or if it’s running in an analysis Sandbox, such as Cuckoo.
39:50 - Now, the malware will check things such as the number of CPU cores, the machine uptime, the number of files within the temp folder.
39:58 - It might even actually look at the manufacturers of the network cards to determine if that is a real card or a virtual machine.
40:04 - So thanks to eBPF we can fake the responses to all of these questions.
40:08 - And in fact, we can fake them only for the malware, so we don’t accidentally break some critical piece of software that’s running inside the Sandbox.
40:19 - So now at the end of this talk, I’m gonna release a collection of eBPF programs and loaders that I’ve called bad BPF.
40:28 - So these programs demonstrate a number of the techniques we’ve discussed and demonstrated today, and they should have enough documentation and comments to help you understand exactly how they work.
40:39 - They cover a range of actions, from hijacking execve calls to load arbitrary programs, to allowing a user to become sudo, all the way to a program that replaces arbitrary text in arbitrary files.
40:50 - Which, because basically everything in Linux is a file, can be used to hide kernel modules, add fake users to /etc/passwd, or fake the MAC address of the network card.
41:04 - So we’ve covered a lot today and honestly, the internals of eBPF and using eBPF defensively could be entire talks on their own.
41:12 - But I hope you’ve at least learnt things about how kernel rootkits are great, but they’re also incredibly risky.
41:19 - And I hope you’ve learnt how eBPF can remove that risk while keeping the same ability to hide data from administrators and provide backdoor access to a machine.
41:28 - And I really hope you’ve come away with some ideas on how to detect and prevent eBPF rootkits from being deployed.
41:34 - Because I think the safety and portability is gonna mean that we’re definitely gonna start seeing actual eBPF rootkits appear in the wild before too long.
41:43 - Now there’s a lot of links on this page. I think if you’re interested in eBPF, I would absolutely recommend checking out the community website and Slack; there’s a bunch of really cool people sitting on that Slack who are really great at helping people learn more about the system and answering any questions that people have.
41:59 - Now, there’s also been some other offensive eBPF talks in the past, if you’re interested in the offensive side.
42:05 - So including (indistinct) just the other day from the Datadog people.
42:09 - So I would definitely recommend checking those talks out, if you’re interested in this.
42:15 - Finally, I’ve got some thanks. Thank you very much to Cory for being incredibly supportive as I’ve delved into the quirks of eBPF.
42:22 - And definitely thanks to Maybe for helping me workshop some of the more ridiculous ideas that I had when I was designing this talk, and then definitely thank you to my family.
42:32 - This recording was done during the middle of a pretty hectic time, involving the pandemic and (indistinct) lots of fun.
42:39 - I’ve been very lucky to have a partner that’s supporting me as I ramble into a camera that is many, many, many miles away from DEFCON.
42:49 - So with that, I will end this talk with a picture of my dog, thanks for watching.
42:54 - I’ll be around on the Discord, if you have any other questions, otherwise, feel free to reach out to me on Twitter, email, GitHub et cetera.
43:02 - And thank you for watching.