DEF CON Safe Mode - Michael Stay - How we recovered XXX,000 in Bitcoin from an encrypted zip file Q&A

Aug 24, 2020 04:35

  • (indistinct) I get to cheat (coughing) We’re now live. We’re live. All right. So we’re here with Mike Stay, another fantastic speaker for a virtual “DEF CON Safe Mode.” He is covering how he recovered a six-digit sum worth of Bitcoin from an encrypted Zip file. And I guess if you just wanna quickly go into your talk, spend just a minute or two, and then we’ll start asking you some questions. - Yeah, sure. So, short summary is I used to work as a reverse engineer back in the late ’90s.

00:51 - I broke the Zip encryption that was used by Info-ZIP, which was the open source version. And so everybody except (indistinct) based their encryption on that, particularly WinZip, which had like 95% of the market at the time. And yeah, so then 20 years later, somebody locked up their Bitcoin in a Zip file that they made on their Linux box and forgot the password. So they came to me and said, “Hey, I found your paper. What’s the current state of the art? - (laughing) - Can you help me with this?” - And this is your first time talking at DEF CON, right? - It is, yes. - Yep.

So we’ve- 01:32 - - (coughing) - Given him fair warning, but there is a tradition for first-time speakers at DEF CON: they get to take a shot with us on stage. This Q&A session is the closest thing to a stage. So thank you, Mike, for providing DEF CON with some wonderful content, and cheers to you, man. - Thank you for having me. - Alright. Okay. So I was actually kind of surprised. I had never thought about Zip encryption as being something that would be difficult to get around. So you did go through a couple of different types of encryption. I wasn’t surprised that early Word was as difficult as it was, but later on, the 40-bit encryption that was just really difficult to brute force, that one kind of surprised me.

02:22 - Have you worked with any other types of encryption that have been surprisingly difficult to get into for being such a legacy, weird, proprietary protocol? - Let’s see. - There were a couple where they clearly knew enough to be dangerous, then completely screwed it up, like early WordPerfect. The founder of the company had broken that one himself, and then when they released their new version saying, “Oh, now we’re using strong crypto, nobody will be able to break this,” we went in and found that they took the password and then ran it through DES in the wrong way and got out some vector, and then just XORed their file with it. It was something ridiculous like that. So they had DES, but they didn’t use it right. - It was so close.

03:22 - - Just not quite. - Yeah. There were ones like Microsoft Access 97, I think, was one where they had RC4 encryption, but it was a fixed key. And so they would RC4-encrypt the file with this fixed key. And then you’d go to this offset in the file and look up the password. And it was just sitting there in plaintext. - (indistinct) - Oops. - Some of the details might be off, it’s been 20 years. Yeah (indistinct). - So go ahead.

03:57 - - While we wait for people to come up with some really good technical questions to throw at you, I’m gonna do one. Alright. So let’s say that I don’t know everything there is to know about encryption out there. Let’s say I ask you to do a similar thing to what you did in your talk, and I know that my password starts with a word and has some unknown thing after that. Are there things that I can provide you, that I might know about the password, that will help you get through this, or is the encryption- - Yes (indistinct) - Dictionary attack, right? - If the product you’re using has strong crypto, and the guys that built it knew what they were doing, then pretty much a dictionary attack is the only option you’ve got left.

04:44 - And so there is specialized attack software that you can get. One of them is called Hashcat. It’s built for running on GPU farms. That was what we were originally looking at, maybe writing a Hashcat module for this. But it’s really designed for processing a key space. And so you can give it a dictionary. You can then say, take this, and then do all alphanumeric strings up to length six after it, or take this and try all different capitalizations, replace vowels with numbers, say, an I goes to a one and an O to a zero and so on, E to a three. Whenever you do that sort of thing, there are these rule sets where you can say, okay, Hashcat, this is what you’re going to start with.
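As a rough illustration of the rule-based candidate generation described here, the following Python sketch expands a base word with vowel-to-digit substitutions, capitalization variants, and a short suffix. It is not Hashcat's rule or mask syntax; the `LEET` table and all function names are hypothetical, chosen only for this example.

```python
from itertools import product
import string

# Hypothetical substitution table (I -> 1, O -> 0, E -> 3), for illustration only.
LEET = {"i": "1", "o": "0", "e": "3"}


def leet_variants(word):
    """Yield the word with each substitutable vowel either kept or replaced."""
    pools = [(c, LEET[c.lower()]) if c.lower() in LEET else (c,) for c in word]
    for combo in product(*pools):
        yield "".join(combo)


def case_variants(word):
    """Yield every upper/lower-case combination of the letters in the word."""
    pools = [(c.lower(), c.upper()) if c.isalpha() else (c,) for c in word]
    for combo in product(*pools):
        yield "".join(combo)


def suffixes(max_len, alphabet=string.ascii_lowercase + string.digits):
    """Yield every string over the alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for combo in product(alphabet, repeat=n):
            yield "".join(combo)


def candidates(base_words, max_suffix_len, alphabet=string.ascii_lowercase + string.digits):
    """Apply the rules in order: substitutions, capitalization, then a suffix."""
    for word in base_words:
        for substituted in leet_variants(word):
            for cased in case_variants(substituted):
                for tail in suffixes(max_suffix_len, alphabet):
                    yield cased + tail
```

The candidate count multiplies with every rule, which is why what the owner remembers about the password matters so much: each remembered detail removes a factor from the search.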

05:38 - And these are the rule sets that I want you to use when processing it, because this is, to the best of my memory, what the password looked like. On the other hand, if you’re doing something like “correct horse battery staple” from XKCD, you’ve got too much entropy. And that’s really the way to protect your files: just make it longer, right? Because if you go from 26 characters, which is all lowercase letters, to 97, you’ve roughly quadrupled the alphabet; that’s adding about two bits per character to the entropy, right? So if you’ve got a length-eight password, 26 to the eighth covers all possible lowercase passwords there. But if you go up to 97, all printable characters, that’s only adding two bits per password. I’m sorry, two bits per character.

06:34 - So on a length-eight password, let’s see, 26 is about five bits, since two to the fifth is 32. And two times eight is 16. So it’s like adding three characters to the length of your password. Adding printable characters to a password of the same length is just adding a few more bits. But if you go and add a whole bunch more characters to your password, make it a long one, that’ll make it really secure. - And so- - A passphrase instead of a short, random string. - Yeah.
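The arithmetic in this answer can be checked with a few lines of Python. This is a minimal sketch; the function names are made up for the example, and 97 is the speaker's figure for the printable alphabet.

```python
import math


def bits_per_char(alphabet_size):
    """Entropy contributed by one uniformly random character."""
    return math.log2(alphabet_size)


def extra_bits(length, small_alphabet, large_alphabet):
    """Bits gained by widening the alphabet while keeping the length fixed."""
    return length * (bits_per_char(large_alphabet) - bits_per_char(small_alphabet))


def equivalent_extra_chars(length, small_alphabet, large_alphabet):
    """How many extra small-alphabet characters would add the same entropy."""
    return extra_bits(length, small_alphabet, large_alphabet) / bits_per_char(small_alphabet)


if __name__ == "__main__":
    print(f"lowercase: {bits_per_char(26):.2f} bits per character")
    print(f"printable: {bits_per_char(97):.2f} bits per character")
    print(f"gain on 8 characters: {extra_bits(8, 26, 97):.1f} bits")
    print(f"same as adding {equivalent_extra_chars(8, 26, 97):.1f} lowercase characters")
```

Widening the alphabet on an eight-character password buys roughly the same entropy as three more lowercase letters, so lengthening the password is the cheaper way to gain strength.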

07:09 - Even if you’re using English words, right, if you make it a passphrase rather than a password, that’ll make it really not so vulnerable to a dictionary attack. There may be other attacks if the crypto’s bad, but if the crypto is good, then it’ll protect you. - So just, this is entirely for my own curiosity. So after you broke through the Zip file, you got the password that you could use to decrypt the Zip file. - No, we didn’t recover the password. - Ooh, you didn’t recover the password. Okay. - Yeah. So the way Zip works is it derives a 96-bit key from the password- - Okay - It was the 96-bit key that we recovered.

07:47 - Now, if we wanted the password, we could take those 96 bits and then go launch a Hashcat attack using a dictionary, and some other stuff that others have worked out to get a few of the initial characters. - That’s where it fits into the type of password cracking that many of us are familiar with (indistinct) on the river, okay. - (indistinct) - If you’ve got the 96 bits, then there’s something you can do with the dictionary attack, to see whether the initialization process gives you those bits or not. - That was great. Yeah. What I was going to ask is if you got far enough to see if a dictionary attack would actually work.
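To make the "initialization process" concrete, here is a minimal Python sketch of the traditional Zip key setup as documented in PKWARE's APPNOTE: three 32-bit keys (96 bits) start from fixed constants and are updated once per password byte. The `find_password` helper and the word list are hypothetical; this illustrates the check and is not the tooling used in the talk.

```python
def _make_crc_table():
    """Standard reflected CRC-32 table (polynomial 0xEDB88320)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = _make_crc_table()


def crc32_step(crc, byte):
    """Advance a raw CRC-32 register by one byte (no pre/post inversion)."""
    return CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)


def update_keys(keys, byte):
    """One step of the traditional Zip key schedule."""
    k0, k1, k2 = keys
    k0 = crc32_step(k0, byte)
    k1 = ((k1 + (k0 & 0xFF)) * 134775813 + 1) & 0xFFFFFFFF
    k2 = crc32_step(k2, k1 >> 24)
    return (k0, k1, k2)


def keys_from_password(password: bytes):
    """Derive the 96-bit internal state from a password."""
    keys = (0x12345678, 0x23456789, 0x34567890)
    for b in password:
        keys = update_keys(keys, b)
    return keys


def find_password(target_keys, wordlist):
    """Hypothetical dictionary check: return the first word that yields the keys."""
    for word in wordlist:
        if keys_from_password(word) == target_keys:
            return word
    return None
```

Because the recovered keys already decrypt the archive, a search like this is only needed if the password itself is wanted, and for a long random password it is unlikely to finish.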

08:21 - I guess on the Zip file in less time than you spent brute forcing it. But if you didn’t- - He suspected it was on the order of 20-something characters- So probably not. - Or more, taking quite a while to brute force. - Will this technique that you went through work for any encrypted Zip file, or- - Yeah. - Yeah. - This will work on any Zip file. So my original attack back in the late ’90s required five files in the archive with the same password. This one we were able to get away with two, because we also knew the timestamp.

09:04 - So if you’ve got the timestamp and you’ve got two files, then this will work on any of them. - So how does the number of files affect the crackability of a Zip file? - Well, suppose you don’t have the timestamp. - Okay. - Okay - Info-ZIP was meant to run on Unix machines as well as Windows machines. So in PKZIP, they just allocated some memory and used whatever bytes were there. - Yep. - Those were the random bytes. In Info-ZIP, on many Unix machines, it would initialize the bytes to zero.

09:39 - And so there would be no randomness there. So they used the process ID and the timestamp to get a little bit of entropy, and then fed the XOR of those two into C’s rand function and generated a bunch of bytes. But they thought maybe that’s too weak, right? There were some known-plaintext attacks, and they’re like, well, if they brute force the timestamp and the process ID, then they can derive the rest of these bytes. And so they took the password and encrypted those bytes once. And that’s what they used as the random bytes when they encrypted that along with the rest of the file.

10:18 - But when they encrypted it twice, because of the way the Zip cipher works, it produced the same stream byte twice at the beginning. So it encrypted it once and then it decrypted it for the first byte of each file. - So when you say that is it- - When you have (indistinct) files in the archive, I have every 10th output of C’s rand function, and 40 bits were enough to figure out the 31-bit internal state of C’s rand function. So once I knew the internal state of C’s rand function, I could generate those first 10 bytes of each file. And then I would do a bunch of bit guesses.
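The cancellation described here can be reproduced with a small Python model of the traditional Zip stream cipher (key update and stream byte as documented in PKWARE's APPNOTE). The class and function names are hypothetical, and the 10-byte header is a stand-in for the random prefix; this is a sketch of the effect, not Info-ZIP's actual source.

```python
def _make_crc_table():
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = _make_crc_table()


class ZipCrypto:
    """Minimal model of the traditional Zip cipher, keyed by a password."""

    def __init__(self, password: bytes):
        self.k0, self.k1, self.k2 = 0x12345678, 0x23456789, 0x34567890
        for b in password:
            self._update(b)

    def _update(self, byte):
        self.k0 = CRC_TABLE[(self.k0 ^ byte) & 0xFF] ^ (self.k0 >> 8)
        self.k1 = ((self.k1 + (self.k0 & 0xFF)) * 134775813 + 1) & 0xFFFFFFFF
        self.k2 = CRC_TABLE[(self.k2 ^ (self.k1 >> 24)) & 0xFF] ^ (self.k2 >> 8)

    def _stream_byte(self):
        t = (self.k2 | 2) & 0xFFFF
        return ((t * (t ^ 1)) >> 8) & 0xFF

    def encrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for p in data:
            out.append(p ^ self._stream_byte())
            self._update(p)  # the state advances with the plaintext byte
        return bytes(out)


def double_encrypt(password: bytes, header: bytes) -> bytes:
    """Encrypt the header, reset the cipher, then encrypt the result again."""
    once = ZipCrypto(password).encrypt(header)
    return ZipCrypto(password).encrypt(once)
```

Both passes start from the same key state, so the first stream byte is identical and XORs away: `double_encrypt(pw, hdr)[0] == hdr[0]` for any password. From the second byte on, the two passes have absorbed different bytes and their streams diverge.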

10:56 - And because of the way the cipher was designed, not all 96 bits were used when producing each output byte of the stream. So I’d guess like 40-something bits up front, and then because I had five files there and I knew what those bytes had to be, I could filter all of those guesses. I could say, I’ve got to know which of these 40-bit guesses are correct before moving on to the next stage. - Got it. - And so by having five files, I could both derive the internal state of C’s rand function and filter my guesses and finish one stage before moving on to the next one. And so it was a parallel divide-and-conquer attack. In this case, I only had two files.

11:38 - So even though I was making a 40-something-bit guess, I only had two bytes to filter it with. So that meant two to the 24th wrong key guesses- - (laughing) - Went to the next stage and I had to guess more. And so it just kept getting bigger, bigger, and bigger, up to two to the 60-something, before I could start paring it down at the other end. - So just for clarification, it’s resetting the stream cipher every single time that it encrypts a separate file, and that’s why you’re able to do this? - Yes, yeah - Okay - So it starts over again with the same password, resets it to the original state, and then starts from there. Because you want to be able to extract a single file from the Zip file without having to extract all of them. - That makes sense.
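The filtering arithmetic in this exchange is simple enough to sketch. Assuming a 40-bit guess (standing in for the speaker's "40-something") and one known byte per file, each known byte eliminates about 8 bits' worth of wrong guesses. The function name is hypothetical.

```python
def log2_surviving_guesses(guess_bits, known_bytes):
    """Approximate log2 of the wrong guesses that pass the filter.

    Each known byte is matched by a wrong guess with probability 1/256,
    so it removes about 8 bits from the candidate set.
    """
    return max(guess_bits - 8 * known_bytes, 0)


if __name__ == "__main__":
    for files in (5, 2):
        bits = log2_surviving_guesses(40, files)
        print(f"{files} files: about 2^{bits} guesses carried into the next stage")
```

With five files the filter is as wide as the guess, so each stage can be finished before the next begins. With two files about 2^24 wrong guesses survive each stage, and the work compounds from stage to stage.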

12:18 - - An answer to one of the questions we got about the new attack: “Is there an acceleration in having more files in the archive?” - Absolutely. In the original attack, the more files you had, the faster it went. And so this is just a refinement of the original attack. But certainly having more files means more bits to filter with and getting rid of false positives earlier. - And sort of closely associated with that: would you know if this kind of attack works with the other encrypted archives, like 7-Zip and RAR? - Most archive software now uses AES. Like, RAR5 switched to AES-256.

12:56 - So this isn’t going to work against anything except Zip files. - Going for best standards, I like to see that. (laughing) - Even WinZip switched to AES a while ago. - So fair enough. We had another question: “Do you know if your client was the legit owner of the Bitcoin?” - I can’t be certain, but we looked him up online. We knew his real name, and we looked him up online and found that he had reason to be owning Bitcoin. - Didn’t seem too shady. It wasn’t someone reaching out across the dark web from- - It was part of his employment that he would be dealing with Bitcoin. - Fair enough. - Now that makes sense. So now this is really interesting.

13:49 - Do you expect, with putting this out here and providing this talk, to get more of these requests to crack more things? If you do get more of these requests, do you have an answer prebuilt for how you might respond? - As far as breaking into Bitcoin wallets? Yeah. When I first wrote this up on my blog, we got a whole bunch of requests, and for most of them, I had to say, “Nope, sorry, the best you can do is a dictionary attack.” Many of them said, “I bought Bitcoin with a credit card ages ago, but now I can’t find my wallet. Can you help me? I’ve got my credit card records.” So no, we need a little more than that.

14:36 - The one that was most interesting was the guy who claimed that his hard drive had crashed that had Bitcoin on it. And so we were working with him to get some data recovery, but after a while, it became clear that he was perhaps schizophrenic or delusional, that he believed that someone was cheating him out of his Bitcoin and it was stolen. Anyway, it was... - Interesting- - (indistinct) - There are about four situations where we could potentially recover a wallet. One of them is if you printed out or wrote down the seed phrase for generating the 128-bit key, right? When you generate it, the wallet software always says, “Keep this in a secure place,” right? And it’s this 30-odd-word phrase that’ll generate the 128-bit key. So if you’ve got that, you can recover the key, you can recover all your Bitcoin.

15:36 - The next case is if you have had damage to your hard drive, right? If the hard drive crashes, then the data in the sector is probably okay. And even if the data in the sector is bad, we only need eight bytes to be okay, the part that has the encrypted key in it, right? So if we can recover that data, then we can probably recover your wallet. If you have the wallet software, you don’t have the original phrase, but you used a weak password, then we can try the dictionary attack approach. - Right. - (indistinct) And then the least probable: there have been wallets with security flaws that make them susceptible to breaking more easily.

16:23 - And if you happened to use one of those back when they were being used, most of them have been fixed since then. But if you happened to use one that had a flaw, then we could try to exploit that flaw. - (indistinct) - This was an attack on a Zip file, but you’re talking directly about Bitcoin wallets. Do they also- - Yes, Bitcoin wallets - Do they also use some Zip-like structure? Have you attacked the Bitcoin wallets themselves- - A Bitcoin wallet takes the private key information that you sign your transactions with and the password, generates a symmetric key from the password and some salt, and then encrypts the private key. So that private key is really what gets you access to the Bitcoin.
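The password-plus-salt scheme described here, and the dictionary attack on it, can be sketched with Python's standard library. Everything specific below is an assumption for illustration: the key derivation (PBKDF2-HMAC-SHA256), the iteration count, and the check value, which stands in for "try to decrypt the stored private key" because the standard library has no block cipher. Real wallets use their own formats and parameters.

```python
import hashlib
import hmac
import os

ITERATIONS = 10_000  # hypothetical work factor, kept small so the sketch runs quickly


def derive_key(password: bytes, salt: bytes) -> bytes:
    """Stretch a password and salt into a 256-bit symmetric key."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS, dklen=32)


def check_value(key: bytes) -> bytes:
    """Stand-in for a decryption test: a tag that only the right key reproduces."""
    return hmac.new(key, b"wallet-check", hashlib.sha256).digest()


def make_wallet(password: bytes) -> dict:
    """Hypothetical wallet record holding the salt and the key check value."""
    salt = os.urandom(16)
    return {"salt": salt, "check": check_value(derive_key(password, salt))}


def dictionary_attack(wallet: dict, wordlist):
    """Try each candidate password against the stored salt and check value."""
    for word in wordlist:
        candidate = check_value(derive_key(word, wallet["salt"]))
        if hmac.compare_digest(candidate, wallet["check"]):
            return word
    return None
```

The salt prevents precomputed tables and the iteration count slows every guess, but neither helps if the password sits in the attacker's word list, which is why this approach only works against weak passwords.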

17:12 - What we can do is try to either recover the private key by means of that really long phrase, regenerating that same private key, or we can attack the password if you’ve got the wallet, so that we can decrypt the private key that you had stored, or we can attack some flaw in the cipher where, for instance, when they were coming up with the symmetric key, they didn’t use the entropy properly. And so there’s a much smaller key space that we would have to brute force. There are very few of those that were out there, but there were some, so there’s a possibility we could do that. - So I like asking this question of people who know this technology really well. Feel free to tell me no, you’re not gonna answer this question, but do you yourself hold any value in any cryptocurrencies? Because you seem to understand how it works.

18:12 - - I don’t, because, I mean, when you pay taxes, you pay taxes in dollars because the government says you have to pay taxes in dollars. So there is this built-in necessity to own dollars at some point. There is no built-in necessity to own Bitcoin or any other cryptocurrency, right? And for Bitcoin and Ethereum, I think that proof of work has shown itself to be susceptible to attacks like Sybil attacks and 51% attacks. Bitcoin Cash and Ethereum Classic have both suffered 51% attacks. They were rebuffed eventually, but if Google wanted to deploy their whole infrastructure, they could completely own Bitcoin. - (laughing) - There are existing companies that could do that, not to mention nation states, right? If the U.S. wanted to take it down,

19:21 - they’ve got this thing in Twilla here in Utah that they could deploy against taking down Bitcoin. So my personal take, and we designed a system to do this, is that you need to use a consensus algorithm with true finality, that’s proof of stake in bandwidth. And then after a certain point, when you have enough witnesses, you say this block is finalized and it can’t ever change. Ethereum is trying to move in that direction with their proof of stake algorithms. - (indistinct) - I’ve heard of proof of stake, but the finality piece is new to me.

20:02 - - You’ve definitely given me a few pieces already that I’m going to need to go Google. - (laughing) - (indistinct) - So we got another question, it’s sort of a meta question. One of the people that watched your talk had a little bit of a struggle following your math. They understood all the aspects individually that you talked about, but zooming back out, they seemed to lose pieces in their head. And they want to know, how do you juggle this? And are you aware that some folks that follow your talk might have difficulty zooming in and out like that? - Yeah. So.

20:43 - I had some options when doing this talk. One was to go really deep and really hard on the technical stuff. And the other one was to give enough background and the basic idea of how this attack played out and the challenges we faced. And so I chose to be less detailed for the sake of the story, rather than- - Yeah - Go deep into it. - (indistinct) - If anyone has any technical questions, take them offline. I’ll be happy to talk through them with you, point at lines of code and that sort of thing. - That’s great. That’s awesome. Is there- - As far as keeping it in my head, I would have to wake up and then come down and reload everything.

21:25 - I had stuff on whiteboards all over my office, pictures. It was a process. I would even have to remind myself about what was going on because I couldn’t keep it all going at once. And it was a month-long process of trying to think through over and over again how things were going wrong and what I might be able to do to fix it. So if you don’t get it from one short 45-minute talk, I certainly don’t blame you. - (laughing) - Makes sense. - Did you discuss this at the end? I’m sorry, I missed this point.

22:00 - Did you actually get any compensation for this work? - We did. Yeah. So when he first talked to us, we said we’d like so much up front, and we estimate that the total cost will be about this much. We took longer than we said we would. We expected it to be done in three months. That was October, so November, December, January. It was April before we actually got the key back.

22:28 - But because of all of the extra cryptanalytic work that I did, it took a tenth of the time on the hardware. So the hardware cost ended up being only roughly 10, 15 grand, as opposed to the hundred grand that we thought it would take at the beginning. So he gave us a big bonus afterwards, which was nice. - I actually missed this. Another one of the speaker goons mentioned that it was on AWS. And they want to know if the 10 to 15 was about what you were expecting for compute cost.

23:04 - - No, we were expecting it to take far more, right? The original estimate was around two to the 64th work, which is comparable to finding a collision in SHA-1. Sorry, SHA-1 was 160; MD5. - (indistinct) - And there was some recent work where, to find a collision, they had to deploy an enormous amount of work to do it. I guess MD5 they were able to do because of flaws in the hash function.

23:44 - SHA-1 took roughly 100K of GPU time to break. And so we were estimating it would be comparable to do this. - And is this what your company does, like data recovery? Or is it specific to- - Not originally. Originally we were working on a distributed operating system. We could get clients interested if we could get in the door, but it was right at the time when cryptocurrency was taking off, and we didn’t have to talk to anybody- - (laughing) - To get in.

24:19 - Then we started doing some consulting work there, built up a team of about 20 Scala developers that were the cream of the crop, top of their field, and then built the RChain cryptocurrency platform. RChain started having some financial troubles, so we allowed them to hire the devs early. We had a contract that we’d hold onto them for a while, and then they could hire them after they’d worked for us for a year, but we let them hire them early. So they’ve taken over the dev team.

24:57 - And then we started working on some other things, and this particular consulting job came up at a nice time and it was a whole lot of fun, so it’s okay. But right now we’re looking for any interest in consulting work that people have. - So that was exactly what I wanted to segue into. Now that you’ve done this talk, do you have another research item on your to-do list that you’re trying to aim at? - Sure. At the moment I’m doing some consulting work for the Ethereum Foundation. I’ve got some consensus algorithm research that I’m working on. We got access to GPT-3, so we’re building- - How cool! - An adventure game, kind of like AI Dungeon but with more structure, using GPT-3- - That’s awesome. - We’ve got various ideas for voice assistants that will be able to carry out tasks, like call somebody at a restaurant rather than figure out every different restaurant’s online ordering system. You just have your assistant call them up and have a conversation. GPT-3 seems to be able to have conversations. So maybe we can use that.

26:12 - - This is probably a really small piece of what you’re doing, but I used to be incredibly into MUDs. So an adventure game that’s generated by GPT-3 sounds interesting. - Yeah. So we’re working on the room generation for quests. We’ll have things like, you have to convince this character to give it to you. He’s got desires and needs. So, you know, you’ll have to be role-playing while you’re playing this game, interacting with these characters. - Yeah.

26:42 - Well, there is a person that goes by the handle EvilMog who is running a DEF CON CTF MUD right now, just a shout-out to him. - We are really close to being at time. There are- - (laughing) - A lot of questions over here about specific pieces, specific technologies. I think I’d like for people to bring those to you on a less moderated basis. So we’ll let those go for now. Before I let you go, I want to know what is the thing that you would like us to take away from this? Is there a final idea that we should walk away from your talk with? - That attacks on cryptography only get better, right? At the time MD5 was proposed, 128 bits, a 64-bit attack was inconceivable.

27:37 - And yet within five to 10 years, they were able to attack that one. And then SHA. There are attacks on AES with the biclique attack. They’ve now broken, I think, seven or eight of the 10 rounds. If there is something that you need to keep secure, choose the best software and have a plan for upgrading your crypto in any products that you put crypto into, because the attacks are gonna get better. You’ll need at some point to transition from the broken system to a new one.
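One common way to leave room for that transition is to store an algorithm identifier next to every protected value, so old records can be recognized and re-protected later. The Python sketch below shows the idea for password verifiers; the scheme names, parameters, and record layout are hypothetical and not taken from the talk.

```python
import hashlib
import hmac
import os

# Hypothetical registry: "v1" is a legacy scheme, "v2" is the current one.
SCHEMES = {
    "v1": lambda pw, salt: hashlib.pbkdf2_hmac("sha1", pw, salt, 1_000),
    "v2": lambda pw, salt: hashlib.pbkdf2_hmac("sha256", pw, salt, 100_000),
}
CURRENT = "v2"


def make_record(password: bytes, scheme: str = CURRENT) -> dict:
    """Store the scheme tag alongside the salt and digest."""
    salt = os.urandom(16)
    return {"scheme": scheme, "salt": salt, "digest": SCHEMES[scheme](password, salt)}


def verify_and_upgrade(password: bytes, record: dict):
    """Verify under the record's own scheme, then re-protect if it is outdated."""
    expected = SCHEMES[record["scheme"]](password, record["salt"])
    if not hmac.compare_digest(expected, record["digest"]):
        return False, record
    if record["scheme"] != CURRENT:
        record = make_record(password)  # migrate while the secret is available
    return True, record
```

When a scheme is broken, adding a new entry and changing `CURRENT` is enough for records to migrate the next time each one is successfully used.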

28:10 - And so that will come up during the lifetime of your product. So be thinking about it. - That’s definitely good advice. (laughing) Fallible, have you got any more questions you want to sneak in under the hood? - No, I think I’m good. I really appreciate the work that you’ve done here. And thank you for coming to present and giving your time- - Thank you so much for your time. - To do this Q&A session. There are some more people that have some more questions coming in. If you’re willing to do so, if you would put your contact information in the Track 1 channel in Discord here, we will get that out there.

28:48 - Folks can look you up if you’re willing to be available for that. I also recommend you put all of your company information if you’re willing to do so, because that’s a good way for people to find you for those contracts you were talking about. - Great, thank you very much. - Alright. Take care.