DEF CON 29 - Kelly Kaoudis, Sick Codes - Rotten code, aging standards, & pwning IPv4 parsing
Aug 5, 2021 17:39 · 7287 words · 35 minute read
- [Sick Codes] Hey, everyone. Welcome to rotten code, aging standards, and pwning IPv4 parsing across nearly every mainstream programming language.
00:12 - Let’s get started. Sort of starting off with a meme here I got from JF or Joe Slowik’s Twitter account.
00:19 - Everyone, please stay calm. We’re releasing PoCs for the unpatched vulnerabilities so you can better evaluate security posture.
00:25 - I thought this was fantastic and I asked Joe if it was cool if I could put it in the presentation.
00:32 - Just as a quick disclaimer. None of the research today was paid for.
00:34 - It’s all in good faith. Nothing today represents our employers, future, past or present.
00:39 - None of us are under any gag orders apart from vulnerabilities that are still pending, which you’ll see on the next slide.
00:44 - All of the content in the slides is creative commons zero, and all the trademarks logos, blah, blah, blah, are property of the respective owners.
00:52 - So what will we be discussing today? So we’ll be discussing today a litany of CVEs.
00:56 - These are CVEs that we accumulated over the course of about the last, I’d say nine months, with some of them still pending because they’re quite large and they take a bit of time to fix, obviously.
01:06 - So, two of them we’ll be releasing today, and they’ve never been seen before.
01:11 - And we confirmed with all the vendors and everything like that, that it’s cool to talk about them, including the Oracle one we’ve mentioned below.
01:17 - We’ll get into that a bit soon. So the format of today’s talk is how did we find this type of vulnerability? How you as a researcher or a hacker can horizontally scale your vulnerabilities, whether that be for good or for bad.
01:31 - Some proof of concepts for some of the vulnerabilities.
01:33 - The exploitability of a vulnerability of this caliber or type and further vectors or attacks or research that can be conducted by anyone who is willing to get involved.
01:46 - Some takeaways we want you to take out of this is that every day is patch Tuesday.
01:50 - Patch Tuesday is a meme. No longer exists, except for maybe at Microsoft and Oracle.
01:55 - And how do you patch an entire class of vulnerabilities this size? And just a quick apologies to anyone who has to leave the conference early, because some of these vulnerabilities might affect some of your code bases.
02:05 - And lastly, we just wanna show you how to exponentially magnify some of your research, threat models or thought models to magnify those attacks, and once again, horizontally scaling your research, so applying it across all different types of languages.
02:20 - And also, the ability to pick up existing vulnerabilities and apply that across existing code bases, whether that be finding a vulnerability in one language and then applying it in the next language and so on and so forth.
02:33 - So I just wanna start with a quick picture of a typical NIST CVE scorecard, basically 9.8.
02:42 - That’s the maximum you can get for a vulnerability, excluding scoping changed or unchanged.
02:48 - And this is what you can expect from a pretty high caliber vulnerability where you can get remote code execution or server side request forgery or things like that.
02:57 - And just quickly as an intro, you’re listening currently to Sick Codes and you’ll be hearing from Kelly Kaoudis soon.
03:04 - We’ll be presenting work based on some of the work that we did with Johnjhacking, Nick Sahler, Victor Viale, Cheng Xu, and Harold Hunt.
03:14 - So I just wanna start with the octal numbering system.
03:17 - Most people will understand this numbering system, but basically there’s no eight or nine.
03:22 - You can kinda visualize this numbering system as in counting to 10, except you’re counting with closed fists and you’re counting from zero and you’ve only got eight fingers, therefore you’re counting for the maximum of seven, and then you just start from the next number up.
03:35 - So that would be 10, which would be one and zero.
03:37 - And how do you get from this numbering system to what we know as decimal? Well, the easiest way to do it is you just go from left to right on a number that’s being prefixed with the appropriate octal formatting.
03:50 - You basically start from left to right, and then raise eight to the power of each position.
03:53 - So this number here has three digits, therefore we go two, one, and zero.
03:57 - So eight to the power of two, eight to the power of one, eight to the power of zero, and then we multiply that by each digit in each box.
04:03 - So one times eight to the power of two, zero times eight to the power of one, and then zero times eight to the power of zero.
04:10 - And this one’s really easy because one times eight to the power of two.
04:13 - Eight to the power of two is 64, so one times 64.
04:15 - And the other two multiply by zero or nothing.
04:18 - Therefore the answer is just adding it all up.
04:20 - You get 64. Now let’s go to a more interesting case.
04:25 - We got the one at the front again. So we’ve got 64.
04:27 - Second one there, eight times two is 16. And then the last one there, anything to the power of zero is one, therefore seven times one is seven.
04:35 - Add it all up, you got 87. And then this is what we traditionally know as decimal or base 10.
04:42 - This is what we’ve got 10 fingers for. One times, this is exactly what you do when you’re normally counting numbers as you got your hundreds, your tens, and your ones.
04:49 - So ten to the power of two is 100, to the power of one is 10, 10 to the power of zero is zero.
04:54 - Pardon me, it’s one. And then you got one times a hundred is 100, and then you got two times 10 is 20, and seven times one is seven.
05:03 - So you get 127. That’s your typical base 10 or your real life numbering system that you currently know as decimal or regular numbers.
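The positional arithmetic walked through above can be sketched in a few lines of Python. This is only an illustration and was not shown in the talk; the function name `positional_value` is made up for the example.

```python
def positional_value(digits: str, base: int) -> int:
    """Sum digit * base**position, counting positions from the right, starting at 0."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        digit = int(char)
        if digit >= base:
            # For example, 8 and 9 do not exist in base eight.
            raise ValueError(f"digit {digit} does not exist in base {base}")
        total += digit * base ** position
    return total


print(positional_value("100", 8))   # 1*64 + 0*8 + 0*1 = 64
print(positional_value("127", 8))   # 1*64 + 2*8 + 7*1 = 87
print(positional_value("127", 10))  # 1*100 + 2*10 + 7*1 = 127
```

The same three digits, 127, give 87 or 127 depending only on which base the reader assumes, which is the ambiguity the rest of the talk builds on.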
05:12 - To go backwards, to get from decimal to octal, we do remainders.
05:19 - So 127 divided by eight is 15 remainder seven.
05:22 - Keep the remainder on the right-hand side. 15 over eight is one remainder seven, and then one over eight is zero remainder one, and then the octal number is the right-hand digits, read from the bottom up and lined up next to each other, 177.
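The remainder method can be written out as a short Python sketch. The helper name `to_octal_digits` is hypothetical and exists only for this example.

```python
def to_octal_digits(value: int) -> str:
    """Convert a non-negative integer to octal digits by repeated division by 8."""
    if value < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    if value == 0:
        return "0"
    remainders = []
    while value > 0:
        value, remainder = divmod(value, 8)
        remainders.append(str(remainder))
    # The first remainder is the least significant digit, so read them in reverse.
    return "".join(reversed(remainders))


print(to_octal_digits(127))  # 127 -> 15 r 7 -> 1 r 7 -> 0 r 1, read upwards: 177
print(to_octal_digits(87))   # 127
```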
05:41 - So notably from that example, you would’ve obviously been looking at the 127, a really recognizable number.
05:47 - This is check host. If anyone doesn’t know, it’s an awesome website, Russian website, I believe.
05:51 - There’s no CAPTCHA or anything like that. You just plug in the IP address.
05:54 - You can ping it. You can mainly, one of the things I get out of it is you can discern between DBIP, MaxMind, and I think it’s IP2address or something like that.
06:02 - You can maximize your (indistinct) by comparing the three and calculating where a database is getting its geolocation from in some cases.
06:13 - So 0177, if you put that in your address bar, you’ll actually go through to your home address.
06:18 - But if you put that without the zero in the front, which we’ll allude to later, you actually get to a Brazilian IP.
06:24 - And I’m not gonna blur this IP address out ‘cause it’s pretty well known, but 177.0.0.1 is somewhere in the middle of Brazil.
06:32 - And then the reverse of this is the resolution of 0127, which goes to 87.0.0.1, and that’s some random one in Italy.
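To make the two readings concrete, here is a minimal Python sketch that interprets each dotted component in two ways: the classic C `inet_aton` convention (a leading `0x` means hexadecimal, a leading `0` means octal) and a decimal-only reading that silently ignores the leading zero. Both function names are assumptions made up for this illustration; they are not taken from any library mentioned in the talk.

```python
def octet_c_style(text: str) -> int:
    """Read one component the way inet_aton-style parsers do."""
    if text.lower().startswith("0x"):
        value = int(text, 16)
    elif len(text) > 1 and text.startswith("0"):
        value = int(text, 8)   # raises ValueError on digits 8 or 9
    else:
        value = int(text, 10)
    if not 0 <= value <= 255:
        raise ValueError(f"component out of range: {text!r}")
    return value


def octet_decimal_only(text: str) -> int:
    """Read one component as decimal, silently dropping any leading zero."""
    return int(text, 10)


def rewrite(host: str, read_octet) -> str:
    return ".".join(str(read_octet(part)) for part in host.split("."))


for host in ("0177.0.0.1", "0127.0.0.1"):
    print(host, "->", rewrite(host, octet_c_style), "or", rewrite(host, octet_decimal_only))
# 0177.0.0.1 -> 127.0.0.1 or 177.0.0.1
# 0127.0.0.1 -> 87.0.0.1 or 127.0.0.1
```

When a filter uses one reading and the component that actually makes the connection uses the other, the same string names two different hosts.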
06:41 - I think before we were looking at this, that was owned by Level Three Communications, but it looks like someone else has actually taken up the roster on that IP address there, which is quite interesting.
06:50 - So this actually, this vulnerability, as you can see where it’s kind of heading, is based on the fact that some parsing libraries, I’d say all of them, pretty much, have some sort of nuances in the way that they parse IP addresses, and that Daniel Stenberg, who runs the curl project, he fixed this in 2018 on the Linux or GNU side where 177 with a zero in the front, which means octal, would actually resolve to the funny address that’s located there in Brazil.
07:17 - And he called it a funny host, which I thought was quite hilarious.
07:21 - And the part of this presentation we’re alluding to in the terms of rotten code is that a lot of people have different ideas of how code works and implementations and standards and changing schema, things like that.
07:35 - It’s not a blame game, but a lot of languages have turned bugs into features or things that they’ve added along the way have actually been the incorrect way to add them or there’s typos or there’s manipulation or even there might be code that’s being inserted maliciously through, I guess, rogue commits.
07:55 - But another part we wanna allude to is how well tested are some of the libraries you’ve been using and how much do you trust those libraries? This is an interesting excerpt we got from Wikipedia on the term octal.
08:06 - There’s a lot of different ways you can prefix numbers to represent octal numbers.
08:10 - A lot of people might remember the backslash 73 as being the most common way of referring to octal, whereby 0o is actually typical of octal and then 0x is typical of hexadecimal, and that’s true in C.
08:25 - And then furthermore, that was actually taken up by Python, OCaml, Haskell, Raku, Ruby, TCL, PHP, and then we’ve got ECMAScript, which is actually the rule structure, or standard, behind JavaScript.
08:39 - And we’ll get into that a bit later on, but there’s actually some confusion there based around the standards so that ECMA 6 and ECMA 3 and ECMA 5 are all different in the way that they should or shouldn’t parse octal values.
08:55 - And this has been “discouraged,” in quotation marks.
08:58 - That does not mean that it’s been employed, but I’ll just quickly say for the record, you can use “use strict” to avoid this kind of instance.
09:07 - So a lot of IoT devices, they don’t have the sort of capability or even memory allowed or memory allocated to the device where it can actually realistically or reliably parse an IP address.
09:20 - Parsing IP addresses should actually be the same sort of meme as parsing HTML.
09:25 - IP addresses are a bit complicated in that if you do a Wireshark, you’ll see the MAC address pop up, you’ll see the IP address pop up in hexadecimal, but what you want to see is octal, okay? So that’s a little thing that a lot of people are missing is the octal usage of numbers.
09:38 - Everything boils back down to libc. So depending which operating system you’re on, Windows can, we’ll keep Windows aside for the moment, but all of it’s based on translating network addresses from hexadecimal into something readable, into all that 16-bit network representation, 32-bit, et cetera, et cetera, or even into the octal format.
09:58 - The reason is for using the, the reason we do that is to obviously use it in other programs.
10:04 - So programs to program communication. So this is really funny.
10:09 - Me yelling at my computer because my code isn’t doing what I want it to do, and then on the other side, my code is doing exactly what it was told to do.
10:19 - In this picture here, we’ve got a… This is just a Wireshark dump just to sort of emphasize the packet itself and what you can actually find out of the packet.
10:30 - It lets you click on the different parts of the packet and identify what’s there.
10:33 - Obviously if you’re a bit more savvy with the network address translation, you can just see the bytes in themselves, or the other way to do it is just to click on the source address.
10:41 - For example, 10.8.160.235. I believe I’m behind a VPN.
10:46 - And then you can see each individual byte there.
10:49 - The four bytes are represented as the actual integer values, 10, 8, 160, 235, and that’s the full IP address.
10:57 - But where did the dots come in from? Well, the dots come in from translating.
11:01 - So they’re not actually part of the packet.
11:04 - That’s where kind of the issue lies is that most of the translational errors come from people not understanding that or skipping parts or just human error.
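As a small illustration of the point that the dots only exist in the textual form, the following Python sketch starts from the four raw bytes of the source address mentioned above and derives the familiar notation from them. The variable names are arbitrary.

```python
# The four address bytes as they would sit in the IPv4 header.
raw = bytes([10, 8, 160, 235])

print(raw.hex())  # 0a08a0eb, the hexadecimal view a packet dump shows

# Dotted decimal is just one way of printing those bytes.
dotted = ".".join(str(byte) for byte in raw)
print(dotted)  # 10.8.160.235

# The same four bytes are also a single 32-bit number in network byte order.
as_integer = int.from_bytes(raw, "big")
print(as_integer)
```

Every textual form has to be translated back into these four bytes before it reaches the wire, and that translation step is where the parsers discussed here disagree.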
11:14 - But the only real reason that people use octal is kind of like just basically for testing out SSRF.
11:21 - And that brings us to our first entry to the story here, which is the nodejs project private-ip, which has got something like 12,000 weekly downloads.
11:29 - We came across this about nine months ago when John Jackson and Harold came across this vulnerability when they were working.
11:38 - And one of the vulnerability reports they received was in relation to SSRF.
11:43 - So someone was able to request files that belonged to the server when in the real world, they weren’t actually supposed to receive those files.
11:51 - And the way they used that was using SSRF techniques involving submitting octal numbers to the module and getting to a public facing module and getting back private files that belong to the server.
12:05 - And just for reiteration here, this has 12,000 weekly downloads to this project.
12:09 - The project hadn’t made updates for about three years, and what it previously had was a bunch of regular expressions in a form that I would probably say is unmaintainable.
12:20 - Some people would beg to differ, but the maintainability of regex in this format is quite hard for someone to just come along as a newbie, particularly if they don’t know regex or if they’re not that strong or familiar with it.
12:31 - But there’s better ways to do this. What we thought at the time, a better way to do it was using a big set of ranges.
12:38 - So you can see here, we got let private ranges, we’ve got a bunch of private ranges here that we allocated.
12:46 - And just here just quickly just showing you the private internets.
12:49 - This is just a really classic document from 1996 where they put the reserved ranges as specified.
12:56 - These were the three basic ones at the start.
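A range-based check along those lines might look like the Python sketch below. It covers only the three RFC 1918 blocks, not the longer list of ranges the patched package used, and it deliberately accepts only plain decimal octets. The names `RFC1918_BLOCKS`, `to_int` and `is_rfc1918` are assumptions for this example, not the package's API.

```python
# (network, prefix length) for the three private blocks defined in RFC 1918.
RFC1918_BLOCKS = (("10.0.0.0", 8), ("172.16.0.0", 12), ("192.168.0.0", 16))


def to_int(dotted: str) -> int:
    """Convert a strictly decimal dotted quad to a 32-bit integer."""
    parts = dotted.split(".")
    if len(parts) != 4:
        raise ValueError("expected four components")
    value = 0
    for part in parts:
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"not a decimal component: {part!r}")
        if len(part) > 1 and part[0] == "0":
            raise ValueError(f"leading zero is ambiguous: {part!r}")
        octet = int(part)
        if octet > 255:
            raise ValueError(f"component out of range: {part!r}")
        value = (value << 8) | octet
    return value


def is_rfc1918(dotted: str) -> bool:
    """Return True if the address falls inside one of the RFC 1918 blocks."""
    address = to_int(dotted)
    for network, prefix in RFC1918_BLOCKS:
        mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
        if (address & mask) == (to_int(network) & mask):
            return True
    return False


print(is_rfc1918("10.8.160.235"))  # True
print(is_rfc1918("172.32.0.0"))    # False, just outside 172.16.0.0/12
```

Compared with one large regular expression, a table of ranges is easier to review and to extend when another reserved block needs to be added.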
13:00 - As you clearly see, point number six, security’s not addressed in this memo.
13:04 - Some examples of where public IP addresses may or may not be used is like arrival and departure terminal displays at airports, which we now know clearly need TCP/IP, and then cash registers.
13:22 - Obviously they use TCP/IP connected to the web now.
13:26 - Money machines, all that sort of stuff. So they got some of this wrong.
13:30 - Banks weren’t supposed to be connected to the net, and now everyone’s literally banking on the net.
13:35 - So, and then yeah, two of the addresses that we had in the tests were actually false.
13:43 - They’re actually incorrect ‘cause they’re actually private IPs, so they shouldn’t have been in the public IP address test case.
13:49 - For the original package private IP, we removed those, obviously, and then we added a bunch more test cases there.
13:59 - This is the new IPv6 regex from Vasco, another maintainer, from libp2p, I think. They added IPv6.
14:11 - As you can see, regex is regex. You either love it or hate it.
14:15 - I love it. However, in this kind of context, it’s not really maintainable to a level that we’d expect for an application that’s used by X amount of thousand dependencies, and that’s probably the reason why the original package was not updated for three years.
14:34 - And regex. So here we’re looking at the second module that we found was netmask.
14:42 - It’s the one we added. You’re looking at 3 million weekly downloads.
14:45 - This is a big one. This is a lot bigger than that other one.
14:48 - I’d say in magnitude, an order of magnitude larger.
14:54 - 9.8 critical for this one with Harold and John Jackson.
15:00 - So here’s the first CVE. Insufficient regex in private-ip 1.0.5 was insufficiently filtering ranges that were actually supposed to be private ranges.
15:13 - And that’s why people could actually submit URLs to applications using this package and actually get files that belong to the server, which is obviously disastrous.
15:24 - And then after that, because we’d actually entered the netmask project in there, someone actually came and found another vulnerability, and then one thing led to another, and we actually found out that netmask itself was vulnerable, which was hilarious.
15:39 - And you can see here that it’s pretty much exactly the same vulnerability.
15:43 - And then another researcher named RyotaK came in halfway and actually found another CVE, because the complete fix wasn’t actually correct.
15:57 - And this was only scored a 5.3, ‘cause I think it was only out for, like, two days, although it is the same impact as the first one.
16:05 - - [Kelly] So we’re in tmux in a container. I just wanted to run through the netmask CVE.
16:10 - So this is the last vulnerable version before we started making changes with the maintainer.
16:15 - And we’re just picking an arbitrary CIDR block, 31.0.0.0/8.
16:20 - First address in this block is 31.0.0.1. Last address.
16:26 - And so then this block should contain 31.5.5.5.
16:31 - However, this block should not contain 031.5.5.5 because 031 is 25 in decimal, if we convert it properly.
16:41 - Now we’ll see the same thing in coffee. I wanted to show coffee because netmask itself was originally written in coffee and it’s transpiled to node.
16:53 - Okay, that’s the same. And unfortunately that is also the same.
16:58 - Now, why does this happen? Underneath in the netmask library, this function parseInt is being called.
17:06 - And so parseInt, if it’s not passed a base as the second argument, will just chop that leading zero right off and pretend it got a decimal.
17:14 - When 031 should be 25, it’ll just come out as 31.
17:19 - And we’re gonna see the same thing in coffee, unfortunately, even though coffee has the advantage of not accepting leading zero notation otherwise, right? So we should be using 0o0 if we really want octal in coffee.
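The effect Kelly describes can be imitated in Python, standing in for the JavaScript behaviour rather than reproducing netmask's actual code. `parse_lenient` mimics a parser that drops the leading zero and reads decimal, `parse_octal_aware` treats a leading zero as octal, and `block_contains` is a plain prefix comparison; all three names are hypothetical.

```python
def parse_lenient(dotted: str) -> int:
    """Drop leading zeros and read every component as decimal (the buggy reading)."""
    value = 0
    for part in dotted.split("."):
        value = (value << 8) | int(part, 10)
    return value


def parse_octal_aware(dotted: str) -> int:
    """Read zero-prefixed components as octal, everything else as decimal."""
    value = 0
    for part in dotted.split("."):
        base = 8 if len(part) > 1 and part.startswith("0") else 10
        value = (value << 8) | int(part, base)
    return value


def block_contains(network: str, prefix: int, address: int) -> bool:
    """Check whether a 32-bit address shares the block's first `prefix` bits."""
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    return (address & mask) == (parse_lenient(network) & mask)


# 31.0.0.0/8 should contain 31.5.5.5 but not 031.5.5.5 (which is 25.5.5.5).
print(block_contains("31.0.0.0", 8, parse_lenient("31.5.5.5")))       # True
print(block_contains("31.0.0.0", 8, parse_lenient("031.5.5.5")))      # True, the bug
print(block_contains("31.0.0.0", 8, parse_octal_aware("031.5.5.5")))  # False
```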
17:33 - All right, so what? So back in 2018 again, Nicolas Gregoire responds to curl guy tweeting about octal.
17:43 - He says, yeah, bad octal conversions are a legit way that we can use to get around anti-SSRF filters.
17:49 - If you haven’t heard of Nicolas, he’s associated with PortSwigger as an official Burp Suite trainer.
17:54 - He teaches people how to use one of the most popular web app hacking tools in the world.
17:59 - And he references his talk from AppSec Euro 2015 and says, yeah, if you’re interested, check out pages 24 to 28 of my slides.
18:07 - So it’s pretty clear that this is an accepted way to get to a place where we can achieve SSRF.
18:13 - Once we can bypass input validation, we can potentially execute local code.
18:18 - Around the time we were doing this research in March, the node netmask package had about 238 million lifetime downloads.
18:26 - So it’s a bit more popular than private IP.
18:29 - It had around 2.8 million people downloading it directly every week.
18:33 - About 170 packages public in npm declare it as a dependency.
18:38 - 289,515 packages on GitHub declared it as a dependency last we checked the dependency graph, as well.
18:45 - So it’s pretty popular for a node package, I’d say.
18:49 - In Ax’s original write-up on BleepingComputer, he mentioned, perhaps rightfully, the Perl component Net::Netmask, which inspired node netmask.
18:59 - Net::Netmask had the identical flaw, and its maintainer Joelle was very quick to file a CVE and push out a fix pretty much as soon as the article came out or slightly before, which is very commendable.
19:13 - And it also kind of lit a fire under us. We were having a good time looking at things, or at least Sick, John, and I were.
19:20 - So then we started looking at other CIDR parsers and trying to see, well, okay, what else could there be out there in the world that does this exact same thing and doesn’t quite work the way the maintainer intended? So the last time this package was updated was 2016, which is fairly reasonable.
19:38 - CIDR parsers don’t exactly change a whole lot.
19:41 - This is another package, a bit like Net::Netmask, which inspired node netmask within the Perl ecosystem.
19:47 - So we’ve installed the vulnerable version. We’ve got our proof of concept on the right.
19:51 - And the deal with this package is you’re supposed to use a guard function is_ipv4 before you call isprivate_ipv4 or ispublic_ipv4.
20:02 - And we didn’t do that because it wasn’t documented that you needed to do that before the fix.
20:07 - The documentation was changed as part of the fix.
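The guard-function pattern can be sketched as follows. This is a Python illustration of the call order only, not the Perl package's implementation; the function names are loosely modelled on the ones mentioned above and the private check is reduced to 10.0.0.0/8 to keep the example short.

```python
import re

# One strictly decimal octet, 0 to 255, with no leading zeros.
_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
_STRICT_IPV4 = re.compile(_OCTET + r"(?:\." + _OCTET + r"){3}")


def is_ipv4(text: str) -> bool:
    """Guard: accept only four plain decimal octets."""
    return _STRICT_IPV4.fullmatch(text) is not None


def is_private_ipv4_unchecked(text: str) -> bool:
    """Classifier that assumes its input was already validated (simplified to 10/8)."""
    return int(text.split(".")[0]) == 10


def is_private_ipv4(text: str) -> bool:
    """Run the guard first so ambiguous input never reaches the classifier."""
    if not is_ipv4(text):
        raise ValueError(f"not a strict dotted-decimal IPv4 address: {text!r}")
    return is_private_ipv4_unchecked(text)


# Without the guard, 012.0.0.1 is read as 12.0.0.1 and reported as not private,
# while an octal-aware resolver would connect to 10.0.0.1.
print(is_private_ipv4_unchecked("012.0.0.1"))  # False, a false negative
print(is_ipv4("012.0.0.1"))                    # False, the guard rejects it
```

Folding the guard into the public function, as in `is_private_ipv4` here, removes the dependency on callers having read the documentation.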
20:10 - So we’re just gonna run it and show you the output here real quick.
20:13 - And so like with netmask and everything else, the idea here is an attacker could submit input data which is an octal which should be private and ends up pointing to something that is a false negative, which results in SSRF bypass.
20:29 - And the same thing with things that should be public and are not.
20:32 - So there’s a couple of things that do work here, a couple of things that don’t work here.
20:38 - If you wanted to exploit this, you would just sort of have to experiment to see what was going on.
20:43 - But the trick is the guard function. - [Sick Codes] If you’re a little bit confused, you can actually go to sick.codes, and in the releases section, you’ll find a big writeup about the original netmask vulnerability and how it affects you.
20:56 - It’s actually quite a large writeup. We go into all the details about that vulnerability.
21:01 - And here’s sort of a table where you can compare netmask to decimal and then the reverse of that and what you should expect.
21:10 - But what I’d like to point out is the comments section. So in the comments section, “Is there a reason for the double check?” So that was actually a small zero day that came through in the comments section.
21:26 - Another one in the comments section was padding for an anonymous user.
21:31 - So IP address, I mean, these are just zero days getting dropped into the comments section, which we, I didn’t approve them at start, but I did actually end up approving them ‘cause it’s been enough time.
21:42 - So you can zero pad your octals. You can do all sorts of crazy stuff.
21:47 - And then Lorens, “The only reasonable thing is to consider the IP invalid.
21:53 - Trying to convince some people that it should be considered is futile.” And I tend to agree with Lorens because only IP addresses without eights or nines are in the octal range alphabet, so you literally are missing 20% of your entire address space.
22:14 - So, yeah. And one comment from Tony Chung.
22:19 - I tried to get in contact with Tony. He was one of the guys that mentioned to us early on that Java is probably worth looking at, and we did actually end up looking at that, and emailed Tony.
22:29 - And Tony, I hope you hear this, but yeah, appreciate the heads up.
22:35 - I couldn’t approve your comment at the time because it was technically a zero day, but we’ve been in touch with him.
22:40 - Hopefully ping back. - [Kelly] When you’re initially learning about a type of vulnerability, it may make sense to try and find it everywhere you can.
22:49 - Like if you’re trying to figure out how reflected XSS works, it can be interesting to try that on basically everything you come across.
22:56 - And so we applied that to octal a bit. We knew we had some leads to go on from some of the things people were saying, and there are just so many vulnerable CIDR parsers that even when we were putting this talk together, we were still finding more.
23:10 - So not only is it possible that there are more out there that are vulnerable like this, it is quite likely.
23:19 - - [Sick Codes] So if you didn’t really get what’s going on at the moment, basically we’re just applying the same CVE to a whole bunch of different languages, a Python, Perl, Golang, Java, JavaScript in the form of nodejs, and another Perl one.
23:34 - And here’s the one by Dave. So we opened an issue at the Chromium project stating that the V8 engine is wrong because eight doesn’t actually exist in the octal alphabet.
23:47 - After the number seven is 10, and that’s relating to decimal only.
23:53 - So it’s apparently working as intended. And we opened up an issue on ECMA, and they’ve said they don’t wanna fix the issue because it’s too old.
24:04 - It’s not even a bug, apparently. But yeah, so you have to be, just have to deal with it.
24:14 - - [Kelly] Going over to the Mozilla developer reference for a second to check out what it says about these errors that we were discussing with the Chromium team, we see a couple of things here.
24:23 - So firstly, as mentioned by the Chromium team, octal literals, the leading zero format literals which represent octal integers, will throw an error in strict mode, which is nice.
24:32 - But this doesn’t exactly help us with either netmask or with the behavior of parseInt, which still allows zero prefix literals input as strings or integers even in strict mode.
24:42 - So the second thing is JavaScript mixes zero prefix literals that get interpreted as decimal with zero prefix literals that get interpreted as octal.
24:52 - And we find this one of the weirdest footguns we’ve encountered so far with JavaScript aside from parseInt’s behavior.
24:58 - To explain this a little bit, since the only digits in base eight are zero through seven, this means there are some extra zero prefixed integer literals allowed by the ECMA spec, and they get interpreted as decimal.
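The mixed behaviour described here can be emulated in a few lines of Python. This is a sketch of how non-strict JavaScript is understood to treat zero-prefixed integer literals, not code from any engine; the function name is invented for the example.

```python
def legacy_zero_prefixed_value(literal: str) -> int:
    """Emulate sloppy-mode JavaScript handling of a zero-prefixed integer literal."""
    if not (literal.isascii() and literal.isdigit()):
        raise ValueError("expected ASCII digits only")
    if len(literal) < 2 or literal[0] != "0":
        raise ValueError("expected a zero-prefixed literal")
    if any(char in "89" for char in literal):
        # A digit outside base eight makes the whole literal decimal.
        return int(literal, 10)
    return int(literal, 8)


print(legacy_zero_prefixed_value("017"))  # 15, read as octal
print(legacy_zero_prefixed_value("018"))  # 18, read as decimal
print(legacy_zero_prefixed_value("08"))   # 8, read as decimal
```

Neighbouring literals such as 017 and 018 land in different bases, which is why the speakers call this a footgun.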
25:09 - - [Sick Codes] After the chromium project, we opened one at nodejs itself to see if they could do something about it.
25:15 - Mozilla apparently doesn’t treat it as illegal octal constant.
25:19 - That’s because it isn’t. There’s like, there’s only seven numbers.
25:24 - And so V8’s rejected it, nodejs rejected it, and pretty much then the last opportunity we had to raise the issue is with ECMA itself.
25:38 - And I said that these numbers don’t exist. They said websites will break.
25:45 - And the end of the discussion is no one has claimed this doesn’t make sense, just that we can’t change it because it will break a bunch of websites.
25:54 - So the bug has moved into feature status. So here we’ll present a Rust CVE, Golang CVE, a Python one, and some Oracle stuff.
26:10 - - [Kelly] So Cheng sort of decided for Rust that they would just reject zero prefixed octets, which is likely the right decision.
26:19 - So leading zero on IP string interpreted as octal literals.
26:24 - A simple demo here. We see that 0127 becomes 127 when it shouldn’t be.
26:31 - Yeah, this may cause security vulnerabilities in certain cases.
26:38 - The specification also allows hex formats in IP strings.
26:42 - Yeah, so the end result of this is only decimal is allowed in Rust, which is better for user input.
26:51 - Yeah, it’s not at all obvious that we should parse octal IP addresses.
26:54 - It seems exceedingly unlikely to come up outside of security advisories.
26:58 - Yeah. (laughs) So this made it into 1.53.0, which is super cool.
27:04 - Congrats, Cheng. Fixing the comments, fixing the code and referencing the same RFC draft as the Python thread did, interestingly.
27:13 - So, which says that octal is a rare format for user input for IP addresses and we shouldn’t parse it.
27:20 - Yeah, if number starting with zero is not none, then we don’t, you know, we don’t wanna mess with that octet.
27:27 - So tests to make sure that octal things are error, hex things are error, and let’s just run the little proof of concept from our CVE.
27:36 - So this is the version of Rust from before the one with Cheng’s fix in it.
27:42 - I’m gonna compile it real quick. And then you’ll notice that we’re changing individual octets at a time.
27:49 - So second octet, third octet, fourth octet, and then the first octet.
27:54 - So we’ve got 026, 026, 093, and then 099 at the very bottom, and none of these are translated correctly.
28:01 - You know, same parsing logic applying to all the octets.
28:05 - With Golang, folks have been talking about limiting the size of allowed octets for a while.
28:11 - So Golang would allow you to just throw a crap ton of zeros in front of something and allow it to still be treated as decimal.
28:17 - So, like, 00000192 would still be 192. But really all you need is three digits to get to 255, which is the max size in decimal for an octet.
28:28 - So when our writeup came out, people started talking about it in the same thread, and Golang eventually decided to mimic what Rust did and return an error on zero prefix input, which is good because it avoids the ambiguity of allowing both octal and decimal, especially when zero prefixed octets are probably just a footgun that most would rather not have.
28:56 - So here’s the change. It made it into 1.17.
29:00 - Thanks, Roland. Docs change. We now reject IPv4 addresses which contain decimal components, octets with leading zeros.
29:18 - And then this is just a little check on that first digit, I guess.
29:22 - Rejection on zero components with leading zeros.
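The decision both Rust and Go arrived at, rejecting any zero-prefixed component instead of guessing its base, can be sketched in Python like this. It illustrates the rule only and is not a port of either standard library; `parse_ipv4_strict` is a hypothetical name.

```python
def parse_ipv4_strict(text: str) -> tuple:
    """Parse dotted-decimal IPv4, rejecting leading zeros, hex and short forms."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("expected exactly four components")
    octets = []
    for part in parts:
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"not a decimal component: {part!r}")
        if len(part) > 1 and part[0] == "0":
            # Covers 0127 as well as padded forms such as 00000192.
            raise ValueError(f"leading zero rejected: {part!r}")
        value = int(part)
        if value > 255:
            raise ValueError(f"component out of range: {part!r}")
        octets.append(value)
    return tuple(octets)


print(parse_ipv4_strict("192.168.0.1"))  # (192, 168, 0, 1)

for bad in ("0127.0.0.1", "00000192.168.0.1", "0x7f.0.0.1", "127.1"):
    try:
        parse_ipv4_strict(bad)
    except ValueError as error:
        print(bad, "rejected:", error)
```

Rejecting outright means a filter and a resolver can no longer disagree about what a zero-prefixed component means, because neither accepts it.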
29:25 - Boom. Here’s our proof of concept from the CVE.
29:30 - And we’re looking at octets one at a time again.
29:33 - The last two little sections here are just trying hexadecimal.
29:37 - We’re trying some of the things that we knew worked with netmask and kind of shouldn’t have.
29:43 - So we’re trying, you know, some CIDR parsers will also parse IP addresses that are just a single integer, and so that’s what the last one is.
29:52 - Second to last is just throwing a random hexadecimal octet in there for grins.
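For the single-integer form, the relationship to dotted notation is simple arithmetic on the 32-bit value. A brief Python sketch, with a made-up helper name:

```python
def int_to_dotted(value: int) -> str:
    """Render a 32-bit integer as a dotted quad, most significant byte first."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits")
    return ".".join(str(byte) for byte in value.to_bytes(4, "big"))


print(int_to_dotted(2130706433))  # 127.0.0.1
print(int_to_dotted(0x7F000001))  # 127.0.0.1, the same number written in hex
```

A filter that only inspects dotted strings never sees `127` in the input `2130706433`, so parsers that accept this form need the same scrutiny as the octal case.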
29:57 - So we’ve got it built. We’re running it.
29:59 - Yeah, so the hexadecimal ones don’t work, which is encouraging, but the octal ones do.
30:04 - That leading zero on those octets there in octal format just gets stripped off, and so just the same as decimal is how they end up being treated, which is not what we want. Of all the languages we looked at that we found to be vulnerable, I think Java was my favorite.
30:24 - So we’re running 11, which is the LTS version.
30:27 - So compiling it real quick. Yeah. So we’re looking at one octet at a time again, and up to 255, we are interpreting as decimal.
30:43 - We’re just stripping that leading zero right off.
30:45 - And then 0256 becomes 174. Hexadecimal, on the other hand, seems to be fine.
30:52 - Fewer dotted quads, which is a format I noticed in the documentation, seems to work out fine.
30:58 - What else? getByAddress. InetAddress.getByAddress seems to work fine.
31:01 - java.net.URI seems to be fine. java.net.URL seems to be fine.
31:05 - It’s just InetAddress.getByName. And funny thing about stopping interpreting as decimal at 255.
31:13 - Probably somewhere there’s something that’s looking at three digits out of an octet and trying to figure out what it is.
31:20 - So after some back and forth with the Oracle security bug triage team, where we were like, well, it’s broken, and they were like, well, we’re not really sure why, and also it performs differently in different environments,
31:31 - we simply weren’t sure ourselves whether the issue resides with the underlying C libraries per operating system or with the JVM or with the Java compiler.
31:40 - - [Sick Codes] And obviously the question comes in there where Java actually is supposed to run in a virtual machine, AKA JVM.
31:47 - The difference is basically that if you run Java code just on any operating system without a JVM, obviously you’ll get different results, and the benefit of that is that you can see the vulnerability being played out between applications.
32:00 - However, it won’t actually play out in the real world unless you’re using Java outside of the JVM.
32:05 - - [Kelly] So we tested it against a couple of different JVMs just to see what would happen and as well across different operating systems.
32:12 - So JVM wise, we tested using HotSpot and also against GraalVM, and we saw our proof of concept code performed consistently.
32:20 - However, we did see different results across different operating systems.
32:24 - - [Sick Codes] That’s where obviously the security team from Oracle decided to just say, it’s not really a bug, and
32:31 - the responsibility is actually downstream, related to libc, and does not affect Oracle’s product.
32:38 - And we asked them if it was cool to actually discuss that, and they said yes, so that’s where we are.
32:43 - - [Kelly] So these are just some examples of CIDR parsers where we took the time to investigate and discovered unique ways for handling octal.
32:52 - The point we’re trying to make in this talk is very few people actually handle the octal IP address format correctly.
32:59 - Some of the other languages which we didn’t look at, but could have, include Lua and Ruby, and maybe someone else with interest can come along behind us and take a peek at some of the other things out there.
33:12 - During this work, we came across a wide variety of articles and blog posts and just opinions about how IPv4 parsing should actually happen, but not so much in the way of a single formal definition. From a discussion around a reference IPv4 parser and a blog post that Dave Anderson wrote back in December: for such a simple-seeming idea, it is hard in general for people to agree on what an IPv4 address actually should be.
33:37 - Well, I’ve read the spec, but most of it is cursed, so I’ll just implement this subset which fits my definition of acceptable.
33:43 - The textual representation of IPv4 still doesn’t have a single standard.
33:48 - Grammars or other specifications for the IPv4 address format do come up occasionally in RFCs like 1918, which is the private internet one that we talked about earlier, and a variety of drafts that were simply never ratified for one reason or another, and incidentally also in the definition for IPv6.
34:06 - The representation has evolved over time, and this in addition to people simply not knowing where to turn has led to a bunch of slightly different definitions that mostly interoperate, but sometimes just clash in fabulous ways.
34:19 - A handful of the reference definitions out there simply say something like dotted decimal form and just leave the rest as an exercise to the reader.
34:27 - If you want to truly parse IPv4 addresses, this is the bullshit you have to put up with.
34:33 - And we do. IPv4 is still the most widely adopted IP address format currently.
34:39 - IPv6 has, according to the Internet Society, been operationally deployed somewhere since at least 2002 and became standardized just in 2017.
34:47 - Currently only around 35% of the internet even supports IPv6, according to Google stats.
34:54 - So until IPv6 achieves wider adoption, we’ll just have to keep on with IPv4, as well.
34:59 - Pretty much anything we use to parse IP addresses, whether it was originally intended to parse IP addresses or not, is just gonna have some interesting ifs, ands, and gotchas.
35:08 - So back in Cheng’s Rust pull request, he mentioned RFC draft 6943, which is from all the way back in 2013.
35:17 - If the parties involved in a security decision use different algorithms to compare identifiers, then failure scenarios ranging from denial of service to elevation of privilege can result.
35:27 - The authors do a decent job of walking through exactly what we’ve shown in this talk.
35:32 - And so here’s a little table. They’re using a couple of different terms, false positive and false negative.
35:38 - A false positive means two identifiers were treated as equal, but should refer to two different things, like the octal literals parsing as decimal literals in netmask and other places.
35:47 - For example, 0177 equaling 177. So on the other hand, a false negative would happen when two things should have been equal, but were not, like 0177 not equaling 127 when compared.
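The two failure modes can be sketched in a few lines of Python. This is a simplified illustration: parse_octal_aware, parse_strip_zeros and same_host are hypothetical names, the parsers only handle four-part dotted addresses, and hexadecimal and range checks are left out.

```python
def parse_octal_aware(text: str) -> tuple:
    """Treat a zero-prefixed octet as octal, as the C inet_aton convention does."""
    octets = []
    for part in text.split("."):
        base = 8 if len(part) > 1 and part.startswith("0") else 10
        octets.append(int(part, base))
    return tuple(octets)


def parse_strip_zeros(text: str) -> tuple:
    """Ignore leading zeros and read every octet as decimal."""
    return tuple(int(part, 10) for part in text.split("."))


def same_host(a: str, b: str, parser) -> bool:
    """Compare two textual addresses after parsing both with one parser."""
    return parser(a) == parser(b)


# False positive: the zero-stripping parser says these are equal,
# but an octal-aware consumer sends the first one to 127.0.0.1.
print(same_host("0177.0.0.1", "177.0.0.1", parse_strip_zeros))   # True

# False negative: the zero-stripping parser says these differ,
# but an octal-aware consumer treats both as 127.0.0.1.
print(same_host("0177.0.0.1", "127.0.0.1", parse_strip_zeros))   # False
print(same_host("0177.0.0.1", "127.0.0.1", parse_octal_aware))   # True
```

The problem only shows up when the component making the security decision and the component making the connection use different parsers.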
36:00 - This draft has existed since before a lot of the code that we demoed today.
36:04 - Not everyone reads spec before trying to implement something or even knows it would be useful.
36:10 - But it does come in handy as a researcher when you’re trying to understand how something would behave ideally.
36:15 - Back in 2019, Python decided they wanted to remove any and all checks for octal.
36:22 - Just, you know, it’s a rare IP format. Who really needs that? Gosh.
36:26 - So here’s the pull request where it happened.
36:30 - So here’s changing the tests. No more error message.
36:35 - No more ambiguous rejection. Leading zeros are ignored and are no longer assumed to specify octal octets.
36:46 - Yeah, that’s the good part. And so we’re gonna look at the issue where this pull request is discussed, and which was reopened following our Net::Netmask report, here again in a second.
36:57 - And so they’re actually referencing an RFC which rightly says octal is a rare format.
37:01 - You shouldn’t handle it. Unfortunately, that leads to inconsistencies with other programming languages, utilities written in other programming languages, and the underlying operating system.
37:11 - So you can’t really ignore it, even though you’d rather. Now that we’ve got octal, we’ve sort of all got octal.
37:18 - So in this case, having octal off by default would be a better outcome, but that wasn’t what they did.
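One possible reading of "octal off by default", sketched in Python: reject any zero-prefixed octet as ambiguous unless the caller explicitly opts in to octal. The parse_ipv4 function and its allow_octal flag are hypothetical names for illustration, not any library's API.

```python
def parse_ipv4(text: str, allow_octal: bool = False) -> tuple:
    """Parse a four-part dotted IPv4 string; zero-prefixed octets are opt-in."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four octets: {text!r}")
    octets = []
    for part in parts:
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"non-numeric octet: {part!r}")
        if len(part) > 1 and part.startswith("0"):
            if not allow_octal:
                # Ambiguous input: refuse rather than guess decimal or octal.
                raise ValueError(f"leading zero not permitted: {part!r}")
            value = int(part, 8)    # raises ValueError for digits 8 and 9
        else:
            value = int(part, 10)
        if value > 255:
            raise ValueError(f"octet out of range: {part!r}")
        octets.append(value)
    return tuple(octets)


print(parse_ipv4("10.0.0.1"))                       # (10, 0, 0, 1)
print(parse_ipv4("010.0.0.1", allow_octal=True))    # (8, 0, 0, 1)
```

Either way, the input is never silently reinterpreted: the caller gets an error or the octal value they asked for.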
37:28 - So we’re gonna look at Python 3.8.0, which is a vulnerable version, and we’re gonna run this little proof of concept which Victor and Sick wrote here.
37:37 - We’re gonna run it with sudo so that the network calls to ping work.
37:42 - So you’ll notice that Linux is handling 010 correctly, but Python is not, just like we saw in the pull request.
37:50 - Yeah. We’ve got our suspect IP, right? And we’ve translated it using the IP address call and then we’ve fed it into ping twice, once as itself and once as translated into the bad IP.
38:08 - Bad IP stays bad. Suspect gets translated correctly.
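A way to check how a given Python interpreter treats such an address, without sending any traffic, might look like the sketch below. It is not the proof of concept from the demo: the helper names and the sample address 010.8.8.8 are chosen here for illustration. The standard library call ipaddress.ip_address is real; on an affected release it is expected to strip the leading zero, while newer releases raise ValueError for it.

```python
import ipaddress


def how_python_reads(text: str) -> str:
    """Return the address as ipaddress parses it, or 'rejected' on error."""
    try:
        return str(ipaddress.ip_address(text))
    except ValueError:
        return "rejected"


def how_octal_aware_reads(text: str) -> str:
    """Return the address as an octal-aware resolver would interpret it."""
    octets = []
    for part in text.split("."):
        base = 8 if len(part) > 1 and part.startswith("0") else 10
        octets.append(str(int(part, base)))
    return ".".join(octets)


suspect = "010.8.8.8"   # illustrative address, assumed for this sketch
print("ipaddress sees:  ", how_python_reads(suspect))
print("octal-aware sees:", how_octal_aware_reads(suspect))
```

If the first line prints 10.8.8.8 while the second prints 8.8.8.8, the interpreter and the operating system disagree about where that string points.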
38:13 - We got this far because we worked as a team and we were able to pick up across time zones.
38:18 - None of us got paid to do this. We’ve all been collaborating in our off hours, whenever those happened to be.
38:23 - We’d love it if people were a bit more skeptical of the battle-hardiness of their dependencies in the future.
38:29 - We’d love it if there were a formal definition for the IPv4 address format, or at the very least for how to handle IP address octets in base eight.
38:37 - We’d love it if ECMA rejected zero prefix literals containing decimal digits.
38:42 - But even if those spec changes happened, there would still be code out there where something’s just slightly wrong.
38:47 - Spec can guide future researchers or programmers in understanding the differences between the ideal world and whatever issues have become features, but it’s not gonna magically make things perfect.
38:58 - If it’s critical for code to behave a certain way, it may be useful to ensure the behavior of your underlying dependencies matches your expectations before you rely on them.
39:08 - One thing we did find really useful was the ability to report these issues to the security teams at all the various organizations and at least discuss and get pointed in a different direction if necessary.
39:18 - If we hadn’t been able to get in contact with the folks at Rust and Golang and Oracle and so on, these vulnerabilities would continue to be live in the most current versions of these languages and libraries.
39:29 - So thank you so much to everyone who was receptive to working with us to fix these issues, and thank you for watching our talk.