DEF CON 29 - Jeff Dileo - Instrument and Find Out: Parasitic Tracers for High Level Languages
Aug 5, 2021 17:39 · 4093 words · 20 minute read
- Hi and welcome to Instrument and Find Out: Writing Parasitic Tracers for High Level Languages.
00:06 - I’m Jeff and I’m at NCC Group. I like to hack on stuff and do various things. For the purpose of this talk,
00:12 - that means programs, languages, runtimes, memory, and bytes.
00:16 - But first up, a notice. By viewing this presentation, you agree to indemnify and hold harmless the presenter in the event you decide to take any of his advice and find yourself unable to sleep at four in the morning due to “language demons”.
00:27 - So just as an outline of the structure of this talk, I’m gonna talk about the background of what led me to this work, what parasitic tracers are, how to design them for tracing high level language runtimes, looking at Ruby as a case study, and then some concluding thoughts.
00:48 - So first about me, I’ve done a fair amount of work with dynamic instrumentation and tracing, from Java bytecode to various stuff in Android and Linux, both in userland and in the kernel with eBPF.
01:02 - Generally I do a lot of this stuff mostly for reversing and learning stuff and also to kind of script up existing things to do other things.
01:12 - So for dynamic instrumentation, just as a quick refresher, this generally means function hooking or instruction instrumentation.
01:21 - The latter of which mostly means that you modify bytecode or assembly to be different bytecode or assembly that does something different.
01:30 - Whereas with function hooking, you are generally doing something that hijacks control flow directly to go somewhere else.
01:38 - Dynamic tracing can refer to dynamically enabling or disabling existing logging functionality, but for our purposes this mostly means adding enhanced logging functionality that wasn’t there before.
01:49 - I’ve also been recently doing tracing with Frida for Ruby which is what this talk is about.
01:56 - So some background on Ruby and myself. A little while back, I had to do some Ruby bytecode transformation work, converting more modern bytecode to an older format, translating newer opcodes into equivalent older ones,
02:14 - so that a decompiler that only knew the older format would work.
02:18 - And that worked quite nicely for me at the time.
02:20 - More recently a colleague and I were looking at Ruby’s dRuby protocol.
02:26 - We were writing a scanner for it in Ruby of all things.
02:31 - We gave a talk on this at NorthSec earlier this year.
02:35 - There were some weird issues that came up, and I spent a lot of time debugging this and going through Ruby’s internal C source code to find out that you don’t want to call IO#read on a Socket object.
02:47 - Instead you just want to call recv. This led me to start writing this parasitic low level Ruby tracer.
02:54 - So what are parasitic tracers? Well, what are tracers? A tracer is an enhanced logger that basically dumps everything you might want about program state, running code, et cetera.
03:06 - And a parasite is a highly specialized, unwanted organism that symbiotically lives on, off of, or inside of another organism that it is completely adapted to.
03:16 - So a parasitic tracer is a combination of these two.
03:19 - It’s basically a tracer that’s specially adapted to the target process that it hooks onto and injects itself into.
03:25 - And makes use of internal functionality that wasn’t really intended to be you know, accessible.
03:30 - So the tracing part of this is just kind of the goal.
03:33 - I want to write a tracer for Ruby to better understand it, but the parasitic part is more of an implementation detail.
03:40 - Chances are, you’ve done this if you’ve ever used LD_PRELOAD to inject code into something.
03:45 - So why would you write these things? Well, to get a better understanding of where the higher level abstractions meet the lower level implementations in, say, runtimes and things.
03:54 - So for reversing, or debugging, or just plain performance analysis.
03:58 - You could also be writing one of these things mostly to avoid having to maintain a fork of the actual code base if you kind of want to maintain the tracer out of tree.
04:08 - Because you can just do it on the process itself and not have to recompile it against the whole code base.
04:14 - So some examples of these parasitic tracers would be Frida’s Java Bridge API which is actually really two of them.
04:21 - One for Android and one for the JVM itself.
04:24 - They provide basically an API for hooking into higher level Java operations, but in ways that weren’t really intended to be allowed by the platform.
04:34 - So in Android it’s totally hooking the runtime and for the JVM it’s using some of the JVM instrumentation APIs, but it’s definitely doing stuff that doesn’t involve those in weird ways.
04:46 - And so whereas a normal vanilla Java agent that uses those things wouldn’t really qualify as a parasitic tracer because it’s using kind of public APIs specifically for this purpose.
04:58 - The way that Frida does it is a little bit more invasive.
05:02 - But let’s just say that if you’re crawling around the memory of a process or intercepting its syscalls, chances are you just have a tracer,
05:07 - you know, like strace. But if you are hooking functions inside the process itself or calling functions from inside the process, then you’re doing some parasitic stuff.
05:18 - So let’s talk about designing these things for high-level language runtimes.
05:21 - So first some prereqs, you’re gonna need some means to actually hook the code or instrument it.
05:26 - Generally, ideally, one that allows you to kind of remove those hooks or re-add them at runtime.
05:32 - You could do this with a debugger and breakpoints especially a scripted debugger.
05:35 - You could do this with an instrumentation toolkit like Frida, which is what I generally do these days.
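As a rough Ruby-level analogy for hooks that can be removed and re-added at runtime, here is a minimal sketch. The talk does this at the native level with Frida or a debugger; the `MethodHook` class and the `Greeter` target below are hypothetical names made up for illustration and are not part of any tool mentioned in the talk.

```ruby
# Hypothetical helper: wraps an instance method so a callback runs before it,
# and can restore the original method again at runtime.
class MethodHook
  def initialize(klass, name, &on_enter)
    @klass = klass
    @name = name
    @on_enter = on_enter
    @original = klass.instance_method(name) # UnboundMethod for the original body
  end

  # Install the hook: replace the method with a wrapper that calls on_enter first.
  def attach
    original = @original
    on_enter = @on_enter
    @klass.send(:define_method, @name) do |*args, &blk|
      on_enter.call(self, args)
      original.bind(self).call(*args, &blk)
    end
    self
  end

  # Remove the hook: put the original method body back.
  def detach
    @klass.send(:define_method, @name, @original)
    self
  end
end

# Hypothetical target class for the demonstration.
class Greeter
  def greet(name)
    "hi #{name}"
  end
end

seen = []
hook = MethodHook.new(Greeter, :greet) { |_obj, args| seen << args }

hook.attach
first = Greeter.new.greet("a")   # hooked: the callback records the arguments
hook.detach
second = Greeter.new.greet("b")  # unhooked: the callback is not invoked
hook.attach
third = Greeter.new.greet("c")   # hooked again

puts seen.inspect
```

The wrapper only forwards positional arguments and a block, which is enough for the sketch; a real hook would also need to consider keyword arguments and method visibility.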
05:41 - You also need a way to invoke existing functionality that’s in that code, in that process.
05:46 - So generally speaking you do that with a debugger or with Frida.
05:50 - The debugger would be something like the expression syntax for calling functions.
05:57 - But the thing is, you need to know what you’re gonna call. So the hierarchy of what you want to prefer is, ideally, public APIs that aren’t going to change all that often.
06:08 - Then internal APIs with symbols. Then internal APIs that don’t have symbols but that you can get handles on fairly easily.
06:16 - Say if the pointers to them are passed into other functions you can just catch them there.
06:21 - Then after that, you’re probably just gonna want to opt for re-implementing stuff locally yourself.
06:25 - And then finally, all the way at the end, if you need to reuse existing code that’s inside the process and you can’t find a good way to locate it,
06:33 - you might need to search for bytecode sequences and match on them.
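The preference order above can be sketched as a resolver that tries the most stable sources first and only scans for byte patterns as a last resort. Everything here is a hypothetical stand-in: the symbol tables are plain hashes, the "process image" is a short byte string, and the signature (with `nil` as a wildcard) is invented for illustration.

```ruby
# Hypothetical stand-ins for what an instrumentation toolkit would provide.
PUBLIC_EXPORTS   = { "public_api_fn" => 0x1000 }.freeze  # exported, stable API
INTERNAL_SYMBOLS = { "internal_fn" => 0x2000 }.freeze    # not exported, but has a symbol
CAPTURED_HANDLES = {}                                    # pointers caught as arguments in other hooks

# Fake "process image" and a made-up byte signature; nil matches any byte.
IMAGE = [0x90, 0x55, 0x48, 0x89, 0xe5, 0x41, 0x57, 0xc3].pack("C*").freeze
SIGNATURES = { "unnamed_fn" => [0x55, 0x48, nil, 0xe5] }.freeze

# Scan the image for the first offset where every non-nil pattern byte matches.
def scan_pattern(image, pattern)
  bytes = image.bytes
  (0..bytes.length - pattern.length).find do |off|
    pattern.each_with_index.all? { |b, i| b.nil? || bytes[off + i] == b }
  end
end

# Try the cheapest, most stable sources first; signature scanning comes last.
def resolve(name)
  return [:public_api, PUBLIC_EXPORTS[name]] if PUBLIC_EXPORTS.key?(name)
  return [:internal_symbol, INTERNAL_SYMBOLS[name]] if INTERNAL_SYMBOLS.key?(name)
  return [:captured_handle, CAPTURED_HANDLES[name]] if CAPTURED_HANDLES.key?(name)
  if SIGNATURES.key?(name)
    offset = scan_pattern(IMAGE, SIGNATURES[name])
    return [:pattern_scan, offset] if offset
  end
  nil # unresolved: re-implementing the logic locally is a design choice, not a lookup
end

# Simulate another hook having seen a function pointer passed as an argument.
CAPTURED_HANDLES["callback_fn"] = 0x3000

%w[public_api_fn internal_fn callback_fn unnamed_fn missing_fn].each do |name|
  puts "#{name}: #{resolve(name).inspect}"
end
```

Re-implementing functionality locally, which the talk ranks ahead of pattern scanning, is not modelled because it replaces the lookup entirely.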
06:38 - But moving on, the first step to owning a target is recon and that is the first step to designing a parasitic tracer.
06:45 - And you’re going to need to be doing some reverse engineering.
06:49 - Really to understand the internals of what it is you’re gonna be mucking around with.
06:53 - And as you do so, you’ll learn more. You may actually have sources, like I did when looking into Ruby, CRuby.
07:01 - But you still need to know what’s actually going on at the native level, especially with the way that your instrumentation itself works, because at that point C doesn’t really matter anymore.
07:10 - Optimizations can elide functions or lead to weird situations where garbage values are passed to functions that don’t process them anyway, and you need to be careful when handling those inputs, stuff like that.
07:25 - And then additionally, all these kinds of runtimes heavily rely on super implementation-defined behavior.
07:32 - And so you need to be really careful about how you’re interacting with their code from your code.
07:37 - After that you’re gonna wanna identify all of the things you’re gonna wanna hook on or call into to build up all of your, whatever it is you’re going to get out of the runtime or the language.
07:49 - And then next is actually doing all of the hooking and calling of those things.
07:54 - You’re gonna hook all the functionality. You’re gonna extract all of the relevant data that you can get.
07:58 - You’re gonna start invoking you know, function calls that are in the thing to get other pieces of data out of it, et cetera.
08:05 - And then after that you’re gonna kind of bring it all together and orchestrate all that in what I like to call puppeteering to bring together all your hooks, have them coordinate with one another.
08:16 - Possibly be managed by some sort of injected thread or whatnot.
08:21 - But at this point, you’re mostly building up from there to have better interop between your own hooks and better interop between the actual platform you are messing around with.
08:32 - So in this case for me it was Frida, which is JavaScript.
08:35 - So basically a JavaScript-to-Ruby bridge, more or less.
08:39 - So ideally you start small and build big. You compose together a larger set of hooks from a smaller set of modular pieces.
08:48 - You’re in a good position to do this because you’re hooking on to a full program that already exists and runs on its own.
08:54 - So mostly you just need to make sure that you don’t break it with what you’re doing and injecting into it.
08:59 - But other than that, the thing will continue to run on its own just fine.
09:03 - So the next thing about this layering stuff is that you can take advantage of layering abstract calls on top of implementations with version-specific behaviors.
09:17 - So for example, if you have a pointer to a struct and, between two different versions of the binary, the field you want inside of that struct is at different offsets, you need to have some functionality to be able to handle that.
09:29 - But the pointer to the start of it is still the same.
09:32 - So you can do that with per-version implementations, or version-based switches kind of like ifdefs, or both.
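A minimal sketch of that layering, assuming a table of per-version struct layouts behind a generic accessor. The `LAYOUTS` table, the field names, and the offsets are all invented for illustration and are not real CRuby struct offsets.

```ruby
# Hypothetical per-version layouts: field name => byte offset inside the struct.
# The numbers are invented for illustration and are not real CRuby offsets.
LAYOUTS = {
  "2.6" => { flags: 0, body: 16 },
  "3.0" => { flags: 0, body: 24 },
}.freeze

# Generic layer: callers ask for a field by name and never see raw offsets.
class StructView
  def initialize(base_address, ruby_version)
    key = ruby_version.split(".").first(2).join(".") # "3.0.2" -> "3.0"
    @layout = LAYOUTS.fetch(key) { raise ArgumentError, "unsupported version #{ruby_version}" }
    @base = base_address
  end

  # Address of a field = same base pointer + version-specific offset.
  def field_address(name)
    @base + @layout.fetch(name)
  end
end

base = 0x7f00_0000_1000
old_view = StructView.new(base, "2.6.8")
new_view = StructView.new(base, "3.0.2")
puts format("2.6 body at 0x%x, 3.0 body at 0x%x",
            old_view.field_address(:body), new_view.field_address(:body))
```

The pointer to the start of the struct is identical in both views; only the table entry selected by the version differs, which keeps the version switch in one place.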
09:39 - But let’s talk about Ruby. So Ruby is a scripting language, that’s right.
09:44 - The most interesting thing about Ruby is that it’s super object oriented and every time you try to access something on an object that’s actually a method call and all the method calls are basically handled via sending messages.
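A short illustration of that point, using only core Ruby. The `Point` and `Recorder` classes are hypothetical examples, not code from the talk.

```ruby
# Attribute access is a method call: attr_reader defines a method named `x`.
class Point
  attr_reader :x

  def initialize(x)
    @x = x
  end
end

p1 = Point.new(3)
direct  = p1.x               # looks like field access, is a method call
via_msg = p1.public_send(:x) # the same call, spelled as an explicit message send

# Operators are methods too, so they can also be sent as messages.
sum = 1.public_send(:+, 2)

# Unknown messages land in method_missing, which makes the dispatch visible.
class Recorder
  attr_reader :messages

  def initialize
    @messages = []
  end

  def method_missing(name, *args)
    @messages << [name, args]
    nil
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

rec = Recorder.new
rec.anything(1, 2)
puts rec.messages.inspect
```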
09:59 - Ruby is super featureful, but it doesn’t really have any good low level introspection or tracing capabilities.
10:06 - It does have this thing called TracePoint which is an API for various events that go on as Ruby executes.
10:13 - But it can’t really intercept method arguments or native function parameters.
10:17 - It can’t really provide information on bytecode execution.
10:20 - And it doesn’t really provide all that useful information from any time you’re switching back and forth between Ruby and native code.
10:27 - This is mostly an artifact of the fact that Ruby is a language and CRuby is an implementation, and so this bytecode stuff, this lower level stuff, is kind of an implementation detail.
10:38 - And this API needs to theoretically work across multiple different Ruby implementations.
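For reference, this is roughly what the Ruby-level TracePoint API looks like in use. It is a minimal sketch that records `:call` and `:return` events for a hypothetical `add` method; it shows the kind of event-level information the API exposes and makes no claim about how ruby-trace itself is built.

```ruby
# Hypothetical method to trace.
def add(a, b)
  a + b
end

events = []
tracer = TracePoint.new(:call, :return) do |tp|
  # return_value is only valid on return events, so guard the access.
  detail = tp.event == :return ? tp.return_value : nil
  events << [tp.event, tp.method_id, detail]
end

# Block form: the tracer is active only while the block runs.
tracer.enable { add(1, 2) }

events.each { |e| puts e.inspect }
```

Nothing in these events describes individual bytecode instructions or the transitions into and out of native code, which is the gap the talk is pointing at.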
10:43 - But really the CRuby implementation should have better tracing stuff, given that it basically functions similarly to Java.
10:51 - And Java has a very well defined and extensive API for instrumentation.
10:57 - So I wrote this thing called ruby-trace which is a Frida-based CLI tool for instrumenting Ruby and kind of dumping everything that goes on as it executes.
11:07 - So it hooks all the opcodes. The interesting thing about that is that the implementations of the opcode handlers are kind of all a bunch of labeled goto spots in a giant state machine.
11:18 - They’re not really their own function so they don’t have your standard calling convention preludes.
11:25 - And then separately, Ruby has a bunch of C functions to call Ruby methods and do a bunch of stuff around handling the methods
11:33 - and tying them to objects, both for native code to Ruby and Ruby back to native code calls.
11:38 - So I hook all that stuff and I hook all the transition between Ruby and native code.
11:43 - And then hook those native functions and et cetera.
11:47 - And then separately it supports kind of hooking into Ruby’s internal exception handling mechanisms.
11:55 - What it pulls out of that is basically all the arguments of all kinds, even the special internal ones for the opcodes.
12:03 - And then it basically Ruby-inspects everything, which is a stringification,
12:07 - kind of like repr in Python. The one problem with that is that many times values aren’t fully initialized, or the Ruby VM itself isn’t fully initialized.
12:17 - You need to be very careful about how you try to call things on things that aren’t fully initialized.
12:22 - So it handles a lot of that. Trying to be very careful about when it’s safe to actually send the inspect method over and doing alternative fallback approaches when it can’t.
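A Ruby-level sketch of the "try inspect, otherwise fall back" idea. The tool in the talk makes this decision from native hooks, where the stakes are presumably higher than a Ruby exception; `safe_inspect` and `HalfBuilt` are hypothetical names used only for illustration.

```ruby
# Hypothetical helper: try #inspect, but never let a misbehaving object take
# the tracer down; fall back to a description that needs no cooperation from it.
def safe_inspect(obj)
  obj.inspect
rescue StandardError => e
  # Look the class up via Kernel directly, in case the object overrides #class.
  klass = Kernel.instance_method(:class).bind(obj).call
  "#<#{klass} (inspect failed: #{e.class})>"
end

# Simulates a half-initialized object whose #inspect touches missing state.
class HalfBuilt
  def inspect
    @name.upcase # @name is nil here, so this raises NoMethodError
  end
end

puts safe_inspect([1, 2])
puts safe_inspect(HalfBuilt.new)
```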
12:33 - It dumps out the bytecode whenever it sees something like a method or a block being defined.
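CRuby can produce a similar bytecode listing from inside Ruby through `RubyVM::InstructionSequence`, which gives a feel for what such a dump contains. This is a minimal sketch with a hypothetical `foo` method; the exact opcodes printed vary between Ruby versions, and `RubyVM` is specific to CRuby.

```ruby
source = <<~RUBY
  def foo(x)
    x + 1
  end
RUBY

# Compile without running, then ask CRuby for a human-readable listing.
iseq = RubyVM::InstructionSequence.compile(source)
listing = iseq.disasm
puts listing
```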
12:40 - It dumps all the return values for opcodes and the native functions it hooks.
12:43 - It gets all sorts of other metadata and takes all sorts of things to make it human-readable.
12:49 - It supports Ruby 2.6 through 3.0, and I assume once 3.1 comes out it won’t be too much effort to get it working on 3.1.
12:59 - I have a sort of generic implementation with a couple of version-specific behaviors and switches, and then separately, at the lower level, anytime I need to deal with Ruby structs from C, I just have a version-specific set of structs to pull fields from, essentially using Frida’s CModule API.
13:18 - Another cool thing that it does is that it actually makes use of the TracePoint API, but not in the way you would expect.
13:23 - It’s just that the TracePoint API has a very good way of controlling whether or not it’s enabled based on various aspects.
13:31 - And so whenever a TracePoint is enabled, that turns ruby-trace’s tracing on.
13:36 - And whenever it’s disabled, the tracing turns off.
13:39 - It gives you fine-grained control to very minutely trace certain pieces of execution.
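On the Ruby side, the switch described here is just a TracePoint being enabled and disabled around the code of interest. A minimal sketch, assuming a tracer whose body does nothing; how an external tool observes this state is not described in the talk and is not shown here.

```ruby
# A tracer whose body does nothing; only its enabled/disabled state matters.
switch = TracePoint.new(:call) { }

states = []
states << switch.enabled?   # false: nothing is being traced yet

switch.enable
states << switch.enabled?   # true: a tool keyed on this state would trace here
[3, 1, 2].sort              # the region of interest
switch.disable

states << switch.enabled?   # false again: the tracing window is closed

# Block form: enabled only for the duration of the block.
switch.enable { states << switch.enabled? }
states << switch.enabled?

puts states.inspect
```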
13:45 - I have a bunch of test cases for various bytecode sequences that seem to cover a greater span of, and more detail on, edge cases than Ruby’s own internal opcode test suite.
13:56 - Although not necessarily some of the other ones.
14:00 - I also implement support for dead Ruby opcodes that shouldn’t even exist anymore for some reason.
14:06 - But basically ruby-trace is kind of like its own CRuby bytecode interpreter because of how it works.
14:12 - So as a demo, let’s just switch to this view.
14:18 - So I have some Ruby code here that defines a TracePoint tracer, and then in the middle of this big block is actually a stringified block of Ruby code with this foo method and then some calls into it.
14:32 - And then it redefines Symbol’s triple equal operator, then calls a lot of those same things over again.
14:39 - And then it compiles that code from the string and then evaluates it under the tracing.
14:49 - So in this case the tracer that’s being used doesn’t really do anything.
14:52 - So when we run this code it just kind of spits things out.
14:55 - The more interesting thing is that the not found: wat on the left side gets replaced with a symbol on the right side because of that triple equal. When it hits the comparison for the symbol,
15:05 - it instantly matches, so the wat string in the last case will hit the symbol check against foo, symbol foo, and that will just pass.
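The demo script itself is not reproduced in the transcript, so the following is only a reconstruction of its general shape under stated assumptions: the `foo` body, the input values, and the loosened `Symbol#===` are guesses, and `Rational(3, 1)` stands in for the talk's BigDecimal value to avoid an extra dependency.

```ruby
SOURCE = <<~'RUBY'
  def foo(x)
    case x
    when "hello" then "string hello"
    when 1       then "integer 1"
    when 3.0     then "float 3.0"
    when :foo    then "symbol foo"
    else              "not found: #{x}"
    end
  end

  inputs = ["hello", 1, Rational(3, 1), "wat"]
  before = inputs.map { |v| foo(v) }

  class Symbol
    alias_method :original_case_eq, :===
    # Deliberately loose match: also claim the string "wat".
    def ===(other)
      original_case_eq(other) || other == "wat"
    end
  end

  after = inputs.map { |v| foo(v) }
  [before, after]
RUBY

tracer = TracePoint.new(:call) { } # does nothing; stands in for the demo's tracer
iseq = RubyVM::InstructionSequence.compile(SOURCE)
before, after = tracer.enable { iseq.eval }

puts "before: #{before.inspect}"
puts "after:  #{after.inspect}"
```

Before the redefinition the `"wat"` input ends up in the `else` branch; afterwards the same input is claimed by the `:foo` clause, which mirrors the left and right columns described in the talk.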
15:15 - So now let’s run this under ruby-trace.
15:30 - And basically ruby-trace dumps out a whole bunch of stuff.
15:34 - One of the first things you can see is the instruction sequence part from that compilation.
15:41 - You see it there and then you see the call to the eval on it.
15:44 - And then inside of that, the first thing that happens is the foo method is defined.
15:48 - And so it dumps out all of the bytecode of that foo method.
15:54 - We can see a bunch of values from it. Next it adds that to the class it’s in.
16:02 - Then we see the first call into foo from the hello string.
16:08 - And then we run through that check operation.
16:11 - So the first thing we see happen is a call into this opt_case_dispatch, which is a special case bytecode generated for switches that don’t have special types in them, only simple types.
16:25 - And basically it optimizes so that all of the cases get added into a single Ruby hash and then it checks if the value is a member of the hash.
16:35 - But it first checks a bunch of things about the object coming in to make sure it’s a simple type that the comparison would work in the first place.
16:41 - So in this case hello is a simple string.
16:45 - It’s in there. It takes the hello path. We move on.
16:49 - The next thing is we have one. It takes the one path and so forth.
16:55 - But then we eventually see this BigDecimal 3.0 value, which you’ll see represented variously as 0.3e1.
17:06 - And that thing will get passed into foo. And the problem is that, because it is not a simple type, it will fall through.
17:17 - And the way that this works is that the optimization is just a quick check first.
17:21 - And it is kind of a guard on top of what the rest of the switch implementation will be, which is the series of subsequent if else checks.
17:28 - That’s just how Ruby does it. And so it falls through and then starts doing all of the if else’s.
17:34 - It doesn’t match like a string, it doesn’t match a whole bunch of stuff.
17:37 - And then eventually we see a bunch of operations where it’s trying to compare against the float value, and the BigDecimal has to do a bunch of math conversions to get the stuff out for the comparison.
17:51 - And so then it eventually does the comparison and sees that, you know, 0.3e1 is equal to the 3.0 float.
17:59 - And so then the checkmatch passes. That’s the comparison for the branch.
18:03 - And then it jumps to the code that is part of that segment of the branch.
18:07 - We continue doing all this for the rest of the values.
18:11 - We see in this case the string wat doesn’t match anything so it ends up in the else path.
18:16 - But it was a simple type so it actually goes to the else path directly.
18:21 - And then we see the code that redefines Symbol’s triple equal method.
18:27 - And from this point on, things are gonna get a little bit weird.
18:30 - So we start seeing that all of these opt_case_dispatch calls end up falling through because triple equals has been redefined.
18:37 - And so basically there’s a short circuit in the implementation where Ruby says, “well, if any of these core equals things has been redefined on any of the core types, such as Symbol, it just can’t bother to do any comparisons anymore one way or the other.” And it’s faster to just give up and have them go through all of the checks one at a time.
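The two-stage behavior walked through in this demo can be modelled in plain Ruby: a hash probe as a guard for simple values, and a sequential series of `===` checks as the fallback. This is a toy model and not CRuby's implementation; `CaseTable`, the `SIMPLE_TYPES` list, and the `redefined` flag are assumptions made for illustration.

```ruby
# Toy model of the two-stage `case` lookup described in the talk. This is not
# CRuby's implementation; SIMPLE_TYPES and the `redefined` flag are assumptions.
SIMPLE_TYPES = [String, Integer, Float, Symbol, NilClass, TrueClass, FalseClass].freeze

class CaseTable
  def initialize(clauses, default)
    @clauses = clauses   # ordered [literal, result] pairs
    @fast = clauses.to_h # hash used by the fast path
    @default = default
  end

  # Returns [path_taken, result].
  def lookup(value, redefined: false)
    simple = SIMPLE_TYPES.any? { |t| value.instance_of?(t) }
    if simple && !redefined
      # Fast path: one hash probe decides between a hit and the else branch.
      return @fast.key?(value) ? [:fast_hit, @fast[value]] : [:fast_else, @default]
    end
    # Slow path: sequential `literal === value` checks, in source order.
    @clauses.each { |lit, result| return [:slow_hit, result] if lit === value }
    [:slow_else, @default]
  end
end

table = CaseTable.new([["hello", :str], [1, :int], [3.0, :flt], [:foo, :sym]], :none)

puts table.lookup("hello").inspect                  # simple value, found in the hash
puts table.lookup("wat").inspect                    # simple value, straight to else
puts table.lookup(Rational(3, 1)).inspect           # not simple, falls through to === checks
puts table.lookup("hello", redefined: true).inspect # guard disabled, sequential checks
```

Passing `redefined: true` models the short circuit from the talk: once the guard is considered unsafe, every lookup pays for the sequential checks, even for values the hash could have answered.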
18:56 - And so we run through this all one after another.
18:59 - And then we get to the end with all the values.
19:02 - And they get percolated up from all the functionality.
19:09 - So future work. I have to implement support for Ractors, Ruby’s new multi-VM, in-process concurrency model.
19:19 - Right now I’m just relying on the one global Ruby VM internal to the process.
19:24 - And then just generally keeping up with the Ruby versions.
19:26 - The code will be available here at our GitHub repo shortly after this presentation airs.
19:32 - But in conclusion, it’s been really fun working on this.
19:35 - Although it’s been pretty tiring because of all the tiring that goes on with Ruby and various other things that can fail when you’re messing around in its insides.
19:43 - But I think that all of these techniques pretty much apply to other high-level languages and runtimes.
19:49 - Some good examples are Python, Node, Golang, and Haskell.
19:53 - And I really think people should be trying to build some of these things.
19:57 - So to paraphrase Arlo Guthrie, “You know, if one person, just one person does it, they may think he’s really sick.
20:10 - And three people do it, three, can you imagine three people writing parasitic tracers? They may think it’s an organization.
20:17 - And can you, can you imagine 50 people, I said 50 people, writing these tracers? Friends, they may think it’s a movement.” And that’s what it is.
20:27 - I’d like to thank Addison, my partner in crime on the dRuby stuff that led us down this rabbit hole and to me doing this work.
20:34 - And a wise man once said, you can’t hide secrets from the future using math.
20:38 - I believe that is true, but I also believe it is true that you simply can’t hide from the future.
20:43 - I would take questions, but this is a recording so there are no questions to be had.
20:50 - So instead I will answer a question about why on my intro page I used an image from Pokémon Crystal and not Pokémon Ruby.
21:02 - Well to answer that question. I do not like Ruby lang.
21:07 - I do not like its yuppy gang. I do not like its simple keys.
21:11 - I do not like its optional parentheses. I do not like its method send.
21:16 - I do not like its begin and end. I do not like its magic verbs.
21:20 - I do not like its methodic key words. I do not like its IO dot read.
21:24 - I do not like its lackluster speed. I do not like its deal open jit.
21:29 - I do not like strong params permit. I do not like its if unless.
21:34 - I do not like its dependency mess. I do not like its case in when.
21:38 - I do not like its require middle men. I do not like its polymorphism.
21:44 - I do not like its object fanaticism. Its object nil gives me pain.
21:50 - That Ruby lang is profane. Thank you.