DEF CON 29 - Eugene Lim, Glenice Tan, Tan Kee Hock - Hacking Humans with AI as a Service
Aug 5, 2021 17:35 · 4654 words · 22 minute read
- Hello, everyone. Thank you for attending our session.
00:03 - My name is Glenice. And together with my team, we will be presenting, Hacking Humans with AI as a Service.
00:09 - We are part of a research team of four from Singapore.
00:12 - We work at the Government Technology Agency’s Cyber Security Group, or CSG for short.
00:17 - Let me introduce to you our team. Firstly, we have Eugene, who has been working at CSG for about one and a half years.
00:24 - He specializes in Application Security and Vulnerability Research.
00:28 - He is also an avid white hat hacker, and many will know him by his handle, spaceraccoon.
00:33 - Next up, this is me. I’m a part of the Red Team and Social Engineering Team in CSG.
00:38 - Apart from human hacking, I’m also interested in web and cloud security.
00:42 - Along with Kee Hock, who will be presenting with us today,
00:45 - we led the AI phishing exercises. Kee Hock has been at CSG for about two and a half years, where he contributes to our Red Team and Cyber Engineering Capabilities.
00:54 - On top of that, he participates actively in Capture the Flag competitions.
00:59 - Last but not least, Timothy is a member of the Mobile Penetration Testing Team at CSG.
01:04 - And also a (indistinct) in Red Team exercises.
01:06 - He led the Technical Defense Research against AI generated texts.
01:11 - All right, let’s get on to the exciting stuff.
01:13 - Here’s what we will cover today, starting with an introduction to human hacking.
01:17 - We will also explore some challenges faced when conducting social engineering exercises.
01:22 - Next, we’ll have a quick look at the recent advances in AI and how the challenges faced in phishing could be addressed with AI as a service.
01:30 - Then we will examine the viability of human hacking using AI as a service with Tuna Fish, a service pipeline we developed for simulated phishing exercises.
01:39 - Finally, we’ll go into the defenses against AI phishing, such as automated AI detection, before summing up important takeaways from our presentation.
01:48 - Let’s begin with an introduction to social engineering, the old-school method.
01:53 - Social engineering is the psychological manipulation of people into performing actions they usually would not.
01:59 - Three common influencing tactics include authority, where an attacker impersonates someone with the right to exercise power.
02:06 - They may also attempt to manipulate the decision making process by creating a sense of urgency.
02:11 - Providing context specific information tends to lower people’s guard as the attacker attempts to exploit the pattern targets are comfortable in.
02:19 - All this aims to exploit our blind spots and break security checks.
02:23 - Here are three common ways social engineers can deliver their attack.
02:26 - Firstly, email phishing, which is the main focus of our presentation today.
02:30 - I believe many of us here are familiar with it.
02:33 - If you haven’t received a phishing email before, chances are you have, just unknowingly.
02:37 - Phishing can also be done via a phone call, and that is known as vishing.
02:42 - Lastly, impersonation often allows social engineers to enter premises they shouldn’t have access to.
02:47 - Social engineering is practiced for many different reasons. Malicious actors want your money and information, so they create a pretext in an attempt to bait you.
02:55 - But it can also be used by authorized actors, such as Red Team engineers, and those involved in raising awareness and conducting security training.
03:03 - Speaking of security training and exercises, here are some statistics.
03:08 - According to a 2020 report by Terranova Security, almost one fifth of employees in the targeted companies click on simulated phishing email links, even after going through a training program previously.
03:19 - This rate increased to 43% in the study on spear-phishing emails against more general users.
03:26 - How are phishing emails sent to the victims? Let’s now take a look at an email phishing campaign from a Red Team engineer’s perspective.
03:35 - Here is a typical manual phishing workflow in a simulated phishing exercise.
03:39 - The Red Team operator would typically start with intelligence gathering on the target.
03:43 - This is also commonly referred to as OSINT.
03:46 - This includes looking at your social media presence.
03:49 - OSINT can be done manually or with the help of tools.
03:53 - Then with the understanding of the targets, the operators work on crafting a suitable phishing email.
03:58 - The amount of time required for context generation varies depending on each target.
04:02 - The Red Team operator will also craft their email based on different weapons of influence, such as, authority, scarcity, and so on.
04:11 - We can see that the entire workflow is manual and dependent on the operators.
04:16 - Even though spear-phishing often yields good results, it may require significant time and effort to analyze the results from OSINT, contextualize them and brainstorm possible pretexts.
04:27 - These efforts increase with each new target and these challenges can be broadly categorized as behavior analysis and text generation.
04:36 - Interestingly, these are also strengths of AI.
04:40 - Capabilities of AI have improved over the past few years and we now have a greater understanding and more realistic expectation of what AI can do and cannot do.
04:49 - For a start, here are some tools that are meant to be used by recruiters and promoters to automatically analyze public information such as LinkedIn profiles.
04:58 - Coming from an attacker’s perspective, we could repurpose them to perform personality analysis.
05:03 - Although the results could only provide a general direction, it is a good foundation for the Red Team engineers to work on.
05:10 - Humantic AI was used in our phishing pipeline, but many of these tools have a freemium model that allows anyone to register and use the API right away.
05:18 - Next up, text generation. Last year, OpenAI released a text-in, text-out interface for accessing the state-of-the-art GPT-3 language model family.
05:27 - The initial release of the API generated a lot of hype, such as an article in “The Guardian” that was claimed to be written by the GPT-3 API.
05:35 - While the amount of curation and input is unknown, it was clear that GPT-3 possessed significant capabilities in natural language generation.
05:43 - In practical terms, the GPT-3 API represents a major step up in accessibility.
05:50 - Based on the estimates by Lambda Labs, training GPT-3 would have taken a really long compute time and millions of dollars (indistinct).
05:56 - But users can now use it via the API for just a few cents per thousand tokens.
06:03 - OpenAI provides four language models of different quality and price, from Ada to Davinci.
06:08 - The instruct-series beta models for Curie and Davinci were recently released, optimized to understand and follow text instructions.
06:15 - As a simple example, we could input the following instructions.
06:19 - Explain quantum physics to a six year old and here’s what the API returned to the instruction.
06:24 - How realistic are the emails generated by OpenAI? With these few sentences, we instruct the model to create an email to encourage the target to open an attached document.
06:34 - Right. Nicely done there with a proper email format.
06:38 - With curated context and detailed instructions,
06:41 - the model may generate coherent email content for the consideration of the Red Team operators.
06:47 - When generating context, the temperature parameter in OpenAI comes in handy.
06:52 - It controls the randomness of the text generated: the lower the temperature, the more deterministic and repetitive the results.
06:58 - Thus, with the same prompt, a different type of output can be obtained.
07:02 - This will likely ease the process of pretext generation.
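As a rough illustration of what a temperature setting does, here is a minimal sketch of standard temperature scaling applied to a softmax. It is not OpenAI's implementation, which the talk does not describe. The word list and scores are made up for the example, and the function names are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature.

    temperature must be greater than zero.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical scores for four candidate next words (not real model output).
words = ["meeting", "report", "invoice", "update"]
logits = [2.0, 1.5, 0.5, 0.1]

low = softmax_with_temperature(logits, 0.2)   # sharp: top word dominates
high = softmax_with_temperature(logits, 1.5)  # flat: more variety

rng = random.Random(0)
picks_low = [words[sample_index(low, rng)] for _ in range(10)]
picks_high = [words[sample_index(high, rng)] for _ in range(10)]
print(picks_low)
print(picks_high)
```

At the low setting almost all of the probability mass sits on the highest-scoring word, so repeated runs with the same prompt look alike. At the higher setting the mass spreads out, which is why the same prompt can yield different outputs.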
07:07 - Having introduced the viability of an AI phishing pipeline, how are the results in the real world? I’ll let Kee Hock elaborate on this and deliver the rest of our presentation.
07:15 - Over to you, Kee Hock. - Thanks, Glenice.
07:20 - Hello, everyone. I’m Kee Hock. I will be sharing more on how we can use AI as a service in Red Team operations, particularly in the conduct of phishing campaigns.
07:30 - In addition, I’ll be sharing our team’s work on detecting synthetic texts and also to highlight some good governance practices in which both AI service suppliers and consumers can adopt.
07:45 - Previously, Glenice did a good job of highlighting some of the AI services available in the market.
07:51 - What you see in this diagram is the revised phishing workflow after we applied various AI services, during the different stages of the phishing campaign.
08:00 - I’ll be focusing on the key differences in the workflow.
08:05 - Firstly, Red Team operators will perform reconnaissance on the target, such as, gathering the target’s LinkedIn profile, Twitter profile, blog posts.
08:18 - Next, we’ll feed this open source information into Humantic AI, which will perform personality analysis on the target.
08:26 - For phishing context generation, we repurposed the Humantic AI API demo to perform personality analysis.
08:34 - The service is meant to be used by recruiters and sales people to automatically analyze public information such as LinkedIn profiles, to produce a personality report and generate communication advice.
08:47 - The Humantic AI API will return a JSON output, as shown in this snippet here.
08:52 - It contains the personality analysis results of the target.
08:56 - Just a point to note, Humantic AI is just one of many AI services which you can adopt in this pipeline.
09:02 - Depending on your preferences, you can choose to integrate other personality analysis services.
09:09 - We thought it would be cool to use these services with a different spin, especially from a Red Teaming perspective.
09:15 - These services are extremely useful, as they help us quickly analyze the targets and allow us to craft a more personalized instruction set to feed OpenAI’s GPT-3.
09:27 - Next, we parse the JSON output into a simple set of plaintext instructions and feed it to OpenAI’s GPT-3 model.
09:34 - Essentially, the plaintext instructions describe the target to the model and how to approach them.
09:40 - Depending on a team’s requirements, the logic to parse the JSON output can be customized to fit a certain context.
09:47 - For example, you may want the AI to generate a text message instead of an email.
09:52 - In this example, we are simply asking the model to craft an email, to convince the target to take interest in our company.
10:00 - We ask the model to be formal and objective.
10:04 - Finally, we have the output from OpenAI’s GPT-3.
10:08 - At this point, depending on the team’s comfort level, you may choose to edit the output to replace some of the elements of the content.
10:14 - For example, you may choose to change the sender to another team or department.
10:19 - In our experiment, which we will share later on, we made edits to the sender and the subject title.
10:24 - We refrained from editing the main email content.
10:28 - As you can see, the generated email is coherent and fairly convincing, applying authority and consistency in its instructions.
10:37 - However, there are a few quirks that require human edits.
10:40 - For example, you may want to add in a phishing link or a URL containing the phishing link.
10:46 - Nevertheless, this demonstrates the need for a human in the loop for this AI phishing pipeline.
10:54 - Now, the phishing content is generated and ready to be delivered.
10:58 - For the subsequent steps, it’ll be the same as any other phishing workflow like the one Glenice shared earlier.
11:04 - So, in our case, each phishing email sent to the target has a unique phishing link, which helps to track if the target clicked on that phishing link.
11:13 - You may choose to use any phishing framework to perform the delivery of the phishing emails.
11:18 - In our case, we built a custom pipeline with a custom phishing tracking system.
11:23 - This was to accommodate some of the constraints we faced when conducting the phishing exercises.
11:28 - Before I talk about the experiments, I would like to share that, although we built our pipeline with a custom user interface and backend, the portability of the APIs meant that we could easily integrate this into any existing tools, such as the Gophish open source phishing framework.
11:43 - This highlights how AI as a service offers a step up in accessibility from open source language models, rather than having to worry about compute and server-side generation.
11:54 - Adding AI capabilities simply requires an API key and a (indistinct) request.
12:01 - We put the revised workflow to the test for three months.
12:04 - We conducted two types of experiments with over 200 targets across multiple authorized phishing exercises.
12:12 - The first type of experiment, which we call the Type One Experiment, is designed to investigate the effectiveness of convincing the targets to click on the phishing link in a phishing email.
12:23 - In our Type One Experiment, each target receives two emails.
12:27 - One will be generated by the AI, while the other will be generated manually by the Red Team operator.
12:32 - The second type of experiment, which we call the Type Two Experiment, is designed to investigate the effectiveness of convincing the targets to open a malicious attachment in the email.
12:43 - For Type Two Experiment, the target group is further divided into two subgroups.
12:47 - One subgroup will receive AI generated phishing content, while the other will receive phishing content generated manually by the Red Team operator.
12:59 - Next, I’ll elaborate more on the setup for the Type One Experiment.
13:02 - Essentially, we conducted it in two stages. Stage one is a mass phishing exercise to identify targets who are susceptible to phishing.
13:12 - Stage two is a spear-phishing exercise to harvest credentials from the targets identified from stage one.
13:20 - For each stage, we collect a few key metrics.
13:23 - First, we measure the phishing success rate, which is the percentage of targets clicking on the phishing link.
13:56 - Next, from the victims who click on the links, we observe the interaction and measure the percentage of victims that interacted with the phishing site.
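The two metrics described here can be written down in a few lines. This is a minimal sketch with hypothetical function names and made-up counts. It is not the team's tracking system, and the numbers are not taken from their results table.

```python
def percentage(part, whole):
    """Return part/whole as a percentage, or 0.0 when the group is empty."""
    if whole == 0:
        return 0.0
    return 100.0 * part / whole


def campaign_metrics(targets, clicked, interacted):
    """Compute the two metrics for one exercise.

    success_rate: share of targets who clicked the phishing link.
    interaction_rate: share of those clickers who interacted with the site.
    """
    return {
        "success_rate": percentage(clicked, targets),
        "interaction_rate": percentage(interacted, clicked),
    }


# Hypothetical counts, chosen only for illustration.
example = campaign_metrics(targets=40, clicked=6, interacted=3)
print(example)
```

Note that the second metric uses the number of clickers as its denominator, not the number of targets, so an exercise with no clicks has no meaningful interaction rate.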
14:04 - The table you see on the right shows the sample size for each phishing exercise conducted using the Type One Experiment setup.
14:10 - Notice that there are some rows under the human column that are empty, especially for stage two.
14:16 - This is because there are no victims who fell prey in stage one for the content generated by human operators.
14:46 - Next, we will show the results from our Type One Experiment, for stage one mass phishing campaign.
14:52 - The results are encouraging. To interpret this chart, we take a look at the bar chart first.
14:57 - For example, each column represents an exercise.
15:00 - In exercise A, 20% of the targets clicked on the phishing link from the phishing content generated by AI, while none fell prey to the phishing content generated by humans.
15:11 - Then we move down. We look at a corresponding pie chart.
15:14 - For exercise A, 80% of victims who clicked on the phishing link interacted with the phishing site.
15:20 - In our case, they interacted with the forms on our phishing site.
15:24 - While AI outperformed the human counterparts in exercise A and C, exercise B tells us a slightly different story.
15:32 - In the bar chart for exercise B, we see that human generated content had a 9.4% phishing success rate, while the AI generated content only had an 8.55% phishing success rate.
15:44 - In terms of raw numbers, the difference in percentages translated to just one more victim who fell prey to the human generated content.
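To see how two percentages that look different can come down to a single person, here is a small consistency check. It assumes that both emails went to the same group (as in the Type One setup, where each target receives both) and searches for group sizes where two click counts one apart round to 8.55% and 9.4%. The actual sample size is in the speakers' table, which is not reproduced in this transcript, so the result below is only arithmetic on the quoted figures. The function name is hypothetical.

```python
def matching_group_sizes(rate_low, rate_high, gap=1, max_size=300, digits=2):
    """Find (n, k, k + gap) where k/n and (k + gap)/n round to the quoted rates."""
    matches = []
    for n in range(1, max_size + 1):
        for k in range(0, n - gap + 1):
            low = round(100.0 * k / n, digits)
            high = round(100.0 * (k + gap) / n, digits)
            if low == rate_low and high == rate_high:
                matches.append((n, k, k + gap))
    return matches


candidates = matching_group_sizes(8.55, 9.4)
print(candidates)  # [(117, 10, 11)]
```

Under these assumptions, the only fit up to 300 targets is a group of 117 with 10 clicks versus 11 clicks, which is consistent with the "one more victim" remark.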
15:54 - Now, we move on to stage two results. Just want to reiterate that stage two victims were identified from stage one.
16:01 - Basically, those who clicked on a phishing link in stage one will be our targets for stage two.
16:06 - The only difference here is that we are conducting a spear-phishing exercise, which means that the phishing content is more personalized using the revised workflow.
16:15 - In this chart, for exercise A, we see considerable success in convincing the victims to click on the phishing link.
16:21 - And 33.3% of them attempted to submit their credentials.
16:26 - Exercise B, results were more interesting because in stage one, human generated content outperformed the AI counterpart.
16:33 - But in stage two, AI outperformed the human generated content.
16:38 - For exercise C, our phishing site is (indistinct) by most browsers as a deceptive site.
16:43 - We postulate that the corresponding phishing emails are likely to be caught by email filters or any other defensive controls.
16:50 - Thus, the experiment kind of stopped for exercise C.
16:56 - Finally, let’s talk about the Type Two Experiment, whereby we investigated the effectiveness of phishing content generated by AI and by humans in convincing the targets to open malicious documents attached to the email.
17:08 - For the Type Two Experiment, we only conducted a stage one mass phishing campaign for both target groups.
17:14 - Do note that the target groups receiving either type of phishing content are mutually exclusive.
17:21 - For Type Two Experiment, we are looking at the percentage of targets who open the malicious document.
17:26 - In this exercise, the AI generated content convinced more targets to open the malicious document.
17:36 - From our experiment, we’ve made three key observations.
17:39 - Firstly, the revised phishing workflow using AI as a service definitely helped to reduce the time required for phishing content curation and context analysis.
17:47 - This allows Red Team operators to focus on other tasks, such as performing OSINT investigation on targets, which can be difficult to automate.
17:56 - Next, we observed that AI generated content was more convincing compared to the human counterparts.
18:01 - However, this is still not conclusive, as there may be other variables at work.
18:05 - Nonetheless, this is an indication that AI does have a lot of potential in innovating Red Team operations.
18:12 - Lastly, we observed that there’s a varying degree of governance relating to the access of AI services in the market.
18:17 - For example, during our research, some of the AI services could be accessed by simply registering for a trial account, while services like OpenAI were much stricter in terms of usage, as they validated an individual’s use case when they attempted to register for an account.
18:39 - Having demonstrated the viability of an AI phishing pipeline, we decided to explore potential defenses against it.
18:46 - This is an important question, as access to AI as a service grows over time.
18:51 - In the next segment, I’ll be sharing more about the possible defenses against AI generated phishing content.
18:57 - Kudos to Timothy for the heavy lifting on this aspect of the research.
19:06 - Naturally, the first thing that came into our mind is to detect synthetic text.
19:10 - However, detecting synthetic texts remains a hard problem.
19:14 - There are three main ways that we can go about solving this.
19:17 - Simple classifiers where we define certain rules based on tags or using MLBs and LT classifiers.
19:25 - Fine-tuning based detection, where we train a new model based on inputs from different kinds of language models, such as GPT-3, GPT-2, BERT, and so on.
19:35 - Zero-shot detection. Using this method, the model learns a classifier on a set of labels, and then evaluates on a different set of labels that the classifier has never seen before.
19:46 - We have decided on this approach as it gives us more flexibility, as we will be able to use it to predict other sets of language models.
19:55 - The white paper for GLTR has given us an insightful solution to tackle our problem.
20:01 - Again, we want to highlight that detecting synthetic text still depends on the model that’s used to generate the text and the model used for text detection.
20:12 - Giant Language Model Test Room, or GLTR for short, is a project based on a collaboration between the MIT-IBM Watson AI Lab and HarvardNLP.
20:23 - Based on the white paper released by them, the approaches to detect synthetic text were the probability of a word given the previous words in the sequence,
20:31 - the absolute rank of a word, and the entropy of the predicted distribution.
20:37 - Our research used the same metrics along with GPT-3 to evaluate if an email is written by a language model or a human.
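To make the three measures concrete, the sketch below computes them for a single position in a text, given a model's predicted distribution over the next word. The distribution is a made-up toy example rather than GPT-3 or GPT-2 output, and `gltr_metrics` is a hypothetical helper, not a function from the GLTR codebase.

```python
import math


def gltr_metrics(distribution, actual_word):
    """Return (probability, rank, entropy) for one position in a text.

    distribution maps each candidate word to the model's predicted probability.
    """
    ranked = sorted(distribution.items(), key=lambda item: item[1], reverse=True)
    # Probability the model assigned to the word that actually appears.
    probability = distribution.get(actual_word, 0.0)
    # Rank is 1-based; None means the word was not among the candidates.
    rank = next(
        (i for i, (word, _) in enumerate(ranked, start=1) if word == actual_word),
        None,
    )
    # Shannon entropy (in nats) of the predicted distribution.
    entropy = -sum(p * math.log(p) for p in distribution.values() if p > 0)
    return probability, rank, entropy


# Hypothetical predicted distribution for the word after "Please find the".
predicted = {"attached": 0.70, "enclosed": 0.15, "following": 0.10, "latest": 0.05}

print(gltr_metrics(predicted, "attached"))  # likely word: high probability, rank 1
print(gltr_metrics(predicted, "latest"))    # less likely word: low probability, rank 4
print(gltr_metrics(predicted, "budget"))    # word the model did not list: rank None
```

The intuition is that text sampled from a language model tends to sit at high probabilities and low ranks under a similar model, while human writing more often picks words the model considers unlikely.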
20:44 - We have also forked the GLTR repo and used the GPT-3 model, as shown in the screenshot.
20:56 - While deciding the approach of detecting synthetic texts by GPT-3, one major challenge our team faced was that we do not have direct access to the model.
21:04 - Without direct access, we are limited in the parameters that we are able to control.
21:09 - We are also limited in the data that is returned from generated texts, such as the top 100 logprobs or choice of words.
21:17 - We have decided to extend GLTR in our research as the data we can get from GPT-3 has transferable patterns from GPT-2, which was used in GLTR.
21:26 - With that, we will look at the findings from our research.
21:31 - In our tests, we compared GPT-3, GPT-2, GPT-2 Tuned and human samples.
21:38 - Our GPT-2 Tuned model is based on an email corpus that our team has harvested.
21:44 - For the GPT-2 and GPT-2 Tuned models, we followed the settings that were described in the white paper.
22:01 - We found that for human samples, humans frequently use words that are out of the top 100 predictions from the GPT-3 model.
22:10 - Looking at the KDE graph reinforces the conclusion that humans frequently use words that are out of the top 100 predicted words from GPT-3.
22:20 - The graph for humans also looks more distributed compared to GPT-3.
22:24 - From our research, we can conclude that evaluating a probability for a sequence of texts is a good indicator of whether the text is synthetic or written by humans.
22:34 - However, there are limitations in this approach, as it is heavily dependent on the model that’s used to predict the sequence of texts and the model used to generate the text.
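One simple way to turn the "out of the top 100" observation into a number is to count how often a text's words fall outside the detector model's top-k predictions. The following is a minimal sketch under that assumption. The rank lists are invented for illustration and are not data from the talk, and the function name is hypothetical.

```python
def fraction_outside_top_k(ranks, k=100):
    """Share of words whose rank is beyond k, or missing from the model's list.

    A rank of None stands for a word the model did not return at all, which
    is what happens when an API only exposes a limited number of candidates.
    """
    if not ranks:
        return 0.0
    outside = sum(1 for r in ranks if r is None or r > k)
    return outside / len(ranks)


# Hypothetical per-word ranks for two short texts (not data from the talk).
human_like_ranks = [1, 3, None, 250, 12, None, 7, 480]
model_like_ranks = [1, 1, 2, 5, 1, 3, 1, 9]

print(fraction_outside_top_k(human_like_ranks))  # 0.5
print(fraction_outside_top_k(model_like_ranks))  # 0.0
```

As the speakers note, any such score depends on both the model that generated the text and the model used to score it, so a threshold that works for one pairing may not carry over to another.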
22:46 - Next, we put our proof of concept GPT-3 detection tool to the test.
22:51 - In this slide, you will see three email samples.
22:54 - Only one is crafted manually by hand, while the other two are generated by OpenAI.
22:59 - Do not worry if you cannot figure out which one is generated by AI.
23:02 - Even for our team, we find it difficult to differentiate which content is generated by AI and which isn’t.
23:09 - Let’s take a look at sample A. This is the visual output from the tool for sample A. The ratio between the probabilities of the top predicted word and the following word is at 0.833.
23:22 - Then we look at sample B. The ratio is at 0.84.
23:27 - An observation from the GLTR team is that, when using the tool to detect synthetic text, text generated by humans tends to have more highlights.
23:39 - This is because human generated content tends to be more random in terms of the choice of words, which often fall outside the top 100 predicted words by GPT-3, as highlighted earlier.
24:11 - Finally, we look at sample C. The ratio is at 0.55. The relatively lower ratio is an indication that it is human generated content, which in this case is true.
24:26 - On top of technical defenses, we also look at the non-technical aspect.
24:28 - We found that the Model AI Governance Framework released by Singapore is a useful reference to address key ethical and governance issues when deploying AI solutions.
24:35 - We will not be covering the content of the framework, but rather we extracted the relevant details from it. Both suppliers and consumers can adopt the following: use the proposed implementation and self-assessment guide from the framework, establish policies for explanation and practice general disclosure of use, perform ethical evaluation, and implement clear roles and responsibilities for the ethical deployment of AI.
25:07 - Consumers can adopt the human-in-the-loop approach for AI-augmented decision-making. Suppliers can ensure traceability and auditability of use, and also enforce acceptable use policies.
25:22 - Finally, for our conclusion, we will sum up by sharing three key points for everyone to take away.
25:29 - Firstly, the rapid growth of AI as a service has placed advanced, cost-effective AI text generation capabilities in the hands of the global market.
25:38 - They can be used by both authorized and malicious actors, but ultimately the tools can also be used to build defenses against AI generated texts.
25:48 - Current approaches are brittle and model dependent.
25:51 - AI assisted human detection of AI generated text could be more effective.
25:57 - Lastly, decision makers have the responsibility to implement sound strategies governing the supply and consumption of advanced AI as a service. Tightening the usage of advanced AI as a service can potentially reduce the likelihood of abuse. We’ve reached the end of our sharing, and we hope that there are takeaways for you from our work. Do catch us live at DEF CON, and we are more than willing to discuss our research with you.
26:24 - Thank you.