Season 3 / Episode 65
Even the best hackers are human, and humans are inescapably unique. Forensic Linguistics, Behavioral Signatures and Cultural Captchas can help defenders identify and (maybe) catch even the best of hackers.
Born in Israel in 1975, Ran studied Electrical Engineering at the Technion Institute of Technology, and worked as an electronics engineer and programmer for several High Tech companies in Israel.
In 2007, he created the popular Israeli podcast Making History, which had over 14 million downloads as of Oct. 2019.
Author of 3 books (all in Hebrew): Perpetuum Mobile: About the history of Perpetual Motion Machines; The Little University of Science: A book about all of Science (well, the important bits, anyway) in bite-sized chunks; Battle of Minds: About the history of computer malware.
Cyber Security Research - Lead, PwC United Kingdom
Matt leads the research capability on the Ethical Hacking team at PwC. Prior to joining PwC, he worked in law enforcement, leading a technical R&D team. His research interests include air-gap bypasses, RF security, online social engineering, and exploit development.
How do you catch a hacker who’s covered their tracks?
Security experts have all kinds of ways of tracking cybercriminals.
Sometimes it doesn’t take much. If a hacker isn’t careful, like Gary McKinnon, whose story we told in an earlier episode of Malicious Life, they may leave behind personal data like an IP address, or a girlfriend’s email address. Jon Johansen, the teenager behind DeCSS, openly bragged about his role in creating illegal software. All anybody had to do to get to him was, well, write an email.
Advanced actors, such as nation-states, are tougher to pin down. When Github was attacked, it was the insight of one independent security professional to use traceroute–an ancient and somewhat obscure system command that tracks internet packets. When Sony, Google and the Democratic National Committee were hacked, circumstantial evidence–such as the nature of the attack, and the victims and systems that were targeted–played heavily into attributing those responsible.
But let’s suppose we’ve got a case where no such clues are available. We’re dealing with a really good hacker. They’ve listened to every Malicious Life episode, and so they know all the pitfalls to avoid. They design new malware from scratch, route their attack paths through various proxy servers, and so on.
How do we begin to approach catching such a hacker?
Human Side Channels
At this year’s Black Hat, I spoke with one cybersecurity expert who’s got a theory.
[Matt] Hi. So my name is Matt Wixey. I lead cyber-security research for PwC’s cyber-security practice in the UK. My role is to look at emerging attack vectors and kind of unconventional approaches to security for both attack and defense and to kind of explore emerging technologies as well.
So I have kind of quite an odd background getting into cyber-security. So my first degree was in English language and literature. So yeah, a bit of an odd path to cyber-security and I kind of really got interested in linguistics there. I started off being really interested in etymology which is kind of the origin of words, but then into linguistics. Then I worked in law enforcement for a few years as well.
Matt’s experience in linguistics & law enforcement combined to inform a concept he calls “human side channels.” It describes a category of information that cybersecurity investigators often use in order to track and attribute the sources of a well-executed hack, where simpler means of doing so are not available.
[Matt] So it’s a term that I’ve coined to describe unintentional leakage as a result of human behavior. So the term kind of comes from computer side channels, which are unintentional leakage in kind of primitive computer output. So things like sound and light and electromagnetic radiation.
So kind of very interesting and very active field in security and human side channels are a development of that looking at humans as kind of like a – a sort of bio computer I guess. So the fact that kind of when humans produce some kind of output, whether that’s writing or speech or typing or something like that, that there will be kind of an unintentional leakage which could be used to attribute activities to a certain individual.
There are three groups of human side channels: forensic linguistics, behavioral signatures and cultural captchas.
Let’s start with forensic linguistics, and how cyber investigators can use this practice to catch hackers much as law enforcement investigators do to catch criminals.
On December 2nd, 1949, police unlocked the door to a tiny wooden outhouse behind a run-down tenement building in West London. In it, they found the cold bodies of a woman and her one-year-old daughter.
Back in police custody, the case’s prime suspect was asked if he committed the murders. He said yes.
What would seem like the end of this story was, ultimately, far from it. Timothy Evans, the man who admitted to murdering his wife and one-year-old daughter, is remembered seventy years later because his case was not straightforward at all. Its evidence was contradictory, its timeline confusing, and its characters idiosyncratic.
Timothy Evans was not a simple man. His father had abandoned the family before his birth. Due to a chronic skin condition he missed years of schooling, and by adulthood was largely illiterate. He was an alcoholic to boot, bad with money, and highly temperamental. According to some accounts, he also had a tendency to make up stories about himself.
The spouse would already be suspect number one in an investigation of this kind, but Evans was a uniquely good suspect. His stories changed–not just a little, but a lot–as new evidence was presented to him. And he wasn’t particularly shy about confessing to the gruesome acts he was charged with.
Despite the gravity of the crimes, a bevy of conflicting evidence, and another major suspect–John Christie, his downstairs neighbor–Evans’ trial took just three days, and the jury took just forty minutes in their deliberation. Really, it’s easy to understand why. How long would it take you to find a man guilty, who had at one point openly admitted to it?
Timothy Evans, just 25 years old, was hanged in March, 1950.
Of course, he didn’t actually do it. But that only came to light three years later, when three bodies were found stuffed in the kitchen pantry of Evans’ downstairs neighbor, John Christie.
The Evans case was seminal in U.K. legal history, in part because of criticisms leveled at the police overseeing the investigation. Many pointed out issues in how they handled key evidence, and how they drew out the confession.
[Matt] A linguist called Jan Svartvik later looked at the statements that Timothy Evans had given to the police and he – by analyzing those linguistic features in the statements, he found that there were two different writing styles in those statements.
One of them he could attribute to Timothy Evans who was being interviewed. But there was another one which was particularly present in passages which were incriminating and what Svartvik found was that basically the police officers who had interviewed Evans, rather than just kind of writing down word for word what he had said, they had actually influenced what he had said to make him seem guilty of the crime.
The transcripts of Evans’ confession were supposed to be word-for-word. Later analysis demonstrated evidence that they were, in fact, not. Evans, as we mentioned, was uneducated, hardly literate. Some particularly important passages in his statements to police sounded out-of-character–the kind of language more likely to come from a police officer than an illiterate man.
The Timothy Evans case demonstrated to the world how useful forensic linguistics can be in law enforcement. Had it been around in the early ‘50s, Evans likely wouldn’t have been sentenced to death. Instead, he and the women Christie went on to murder after wrongly escaping blame became the martyrs from whose cases a new field was born.
[Matt] So that the linguist Svartvik, he then wrote a paper on this and coined the term “forensic linguistics” and that was in kind of 1968 I think that he coined – that term was coined. Since then, it has kind of been used to some degree in kind of law enforcement investigations in the real world.
It’s also used for like plagiarism investigations in academia. It was used to try and work out if Shakespeare had actually written Shakespeare’s plays. Same with the Federalist Papers and more recently with J.K. Rowling, with the first book after Harry Potter, where it was kind of written under a pseudonym. Forensic linguistics was used to show that it actually was Rowling who wrote it.
Forensic linguistics rests on one, inescapable aspect of human nature: that we’re all different, unique, and therefore distinguishable from one another in our speech.
[Matt] So the kind of theory that because we will have a unique upbringing and experiences and education, that we have a unique way of looking at the world and a unique way of …
[Ran] Expressing ourselves.
[Matt] Expressing ourselves. Exactly, yeah, and kind of the way we kind of format it. So forensic linguistics is kind of an example with that, particularly something called stylometry, which is writing style. So it’s looking at very kind of granular features of someone’s writing, picking those out and that would be things like average word length, average sentence length, the construction of sentences, particular kinds of terms and phrase and words, which taken together can be used to potentially identify your writing style amongst the set of writing styles.
So if I had kind of 10 pieces of writing that I know you’ve done for instance and then I had a piece of text and I wanted to find out whether or not you had written it, I could kind of pick out the features from those 10 pieces and compare them against that unknown text to see what the kind of probability is that you had written it.
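The comparison Matt describes can be sketched in a few lines of Python. This is a minimal illustration, not the method used in his research: the three features, the `style_features`, `profile` and `distance` names, and the choice of the Canberra distance are all assumptions made for the example. Real stylometry uses far more features and proper statistical models.

```python
import re
from statistics import mean

def style_features(text):
    """Return a few content-independent style features (assumes non-empty text)."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [
        mean(len(w) for w in words),   # average word length
        len(words) / len(sentences),   # average sentence length, in words
        text.count(",") / len(words),  # commas per word
    ]

def profile(known_texts):
    """Average the feature vectors of texts known to come from one author."""
    vectors = [style_features(t) for t in known_texts]
    return [mean(column) for column in zip(*vectors)]

def distance(a, b):
    """Canberra distance: each feature contributes between 0 and 1,
    so features on different scales are weighted comparably."""
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom:
            total += abs(x - y) / denom
    return total

# Hypothetical sample data, invented for this sketch.
known = [
    "We go in. We get out. No one sees us.",
    "I saw him. He ran off. I let him go.",
]
unknown_a = "He came in. I hid. We got out fast."
unknown_b = ("Notwithstanding considerable difficulties, the committee "
             "ultimately determined, following extensive deliberation, "
             "that further investigation was warranted.")

author = profile(known)
score_a = distance(author, style_features(unknown_a))
score_b = distance(author, style_features(unknown_b))
```

A lower score means the unknown text sits closer to the author's known style. Turning such a score into the kind of probability Matt mentions would need a reference population of other writers, which this sketch leaves out.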
An illiterate man experiencing the murders of his wife and daughter will likely express himself differently than a police officer attempting to impersonate that man. Unintentional leakage occurred when the officers working Evans’ case tried putting words in his mouth, but ended up giving themselves away.
So forensic linguistics can, in some cases, attribute testimony to a speaker by identifying unintentional leakage. The question now is whether we can find unintentional leakages in code, to identify the author of a malware.
According to Matt we can, because forensic analysis is not specific to any one form of writing. It applies to any expression of language, if adjusted properly for context.
[Ran] And that’s even without regards to the content of the actual written piece. It could be just any kind of written material that I wrote just based on the style.
[Matt] Absolutely, yes. So it’s pretty much not to do with content at all. It’s just about the very kind of granular linguistic aspects of it.
Forensic linguistics experts track spelling, grammar, vocabulary, sentence structure and word choice.
When cybersecurity experts try to catch a hacker who’s covered their tracks well, they use these same techniques. For example, a peculiarity in source code that also showed up in another malware program might be useful in tying two distinct malicious programs to the same perpetrator.
[Matt] some of the use cases for defense would include things like detecting sock puppet accounts. So social media accounts that are under different usernames but run by the same person for instance and obviously that’s something that’s very kind of relevant socially at the moment in terms of politics and kind of manipulating consensus and that sort of thing.
Another would be threat intelligence and incident response. So if you have a threat actor who is sending out spear-phishing emails and maybe they’re running different campaigns into different organizations, maybe even with different pretexts for the social engineering, if you kind of collect those spear-phishing emails together, you can still look for common features and potentially attribute them to the same individual who wrote them.
Forensic analysis is also useful in sussing out hackers trying to mask their identities through deception. When the DNC was hacked in 2016, somebody calling themselves Guccifer 2.0 claimed responsibility. In an interview with Vice magazine, the individual claimed to be Romanian–and, you know, definitely not Russian. By 2018, forensic evidence tied the persona to an officer in Russia’s Military Intelligence Directorate, and confirmed what everybody expected in the first place: that Guccifer 2.0 was a decoy–a story meant to confuse news reporting, deflect blame, and troll investigators.
The next type of human side channel–behavioral signatures–is something cybersecurity investigators have been weaponizing for a long time now. It’s particularly useful against the biggest and most powerful hacking groups in the world, like APT1.
With all the hundreds of APTs out there, you’ve got to be pretty badass to earn the title of APT1. Even the Russian groups that hacked the DNC in 2016 only get to be APT28 and 29.
APT1 lived up to its title, though. For over a decade beginning in 2002, they successfully siphoned valuable proprietary information from over 1,000 American companies, including the United States Steel Corp, Westinghouse Electric and Lockheed Martin.
As corporation after corporation fell victim to similar breaches, some behavioral patterns started to emerge as investigators drew closer and closer to an identifiable perpetrator.
[Matt] So it’s based on a technique called “case linkage analysis” which again is a real world investigation technique. It can be kind of repurposed. So case linkage analysis is essentially kind of behavioral profiling. It’s separate from offender profiling. So offender profiling kind of looks at the features of a crime and tries to infer something about the person who committed it.
Behavioral profiling or case linkage analysis takes very granular aspects of a crime scene, of what was done during a crime and compares it to another crime to see how similar they are and then to, as a result, try and work out how likely it is that the same offender committed both of those crimes.
Just as each of us has a certain way of expressing ourselves with language–that can give us away in an investigation–each of us has certain ways that we behave, in both day to day life and in cyberspace. No two crimes are the same because no two criminals are the same.
[Matt] So in burglary for instance, in a burglary case, you might look at did the offender use a crowbar to get into the door. If so, was it at the top of the door or the bottom of the door?
Like a burglar who always breaks into homes with a crowbar, through the bottom of the front door, APT1 had its own, unique calling card. As the security firm Mandiant uncovered in a report it published about the campaign, APT1 had their own, custom method for communicating with a malware it planted inside a network. As Mandiant’s report explains, firewalls can be effective at keeping malware outside the network from communicating with systems inside the network – but less so when it comes to malware already inside the network trying to communicate with a command and control outside of it.
Using a spear-phishing email, APT1 would install a relatively simple and thin-featured backdoor inside the victim network, called a “beachhead”. The beachhead backdoor would then retrieve an HTML webpage from the C&C server. These webpages would contain special HTML tags that the backdoor malware would then attempt to interpret as actual commands. So an HTML comment field, for example, could be used, instead, as a scripting command.
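From a defender's point of view, the first step in spotting this kind of channel is simply to look at what is hiding in the comments of pages fetched by a suspicious process. The sketch below uses Python's standard `html.parser` module to list every HTML comment in a page. The `CommentCollector` and `extract_comments` names and the sample page are assumptions made for illustration; the episode does not describe the actual format of APT1's commands, so the example makes no attempt to interpret them.

```python
from html.parser import HTMLParser

class CommentCollector(HTMLParser):
    """Collects the text of every HTML comment in a document."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # Called by HTMLParser once for each <!-- ... --> block.
        self.comments.append(data.strip())

def extract_comments(html):
    collector = CommentCollector()
    collector.feed(html)
    collector.close()
    return collector.comments

# Hypothetical page content, invented for this sketch.
page = "<html><!-- hello --><body><p>x</p><!--cmd--></body></html>"
found = extract_comments(page)
```

An analyst could then review `found` for comments that look out of place on an otherwise ordinary page, and compare them across incidents.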
For their particular use of this kind of backdoor, APT1 earned a nickname: Comment Crew. Investigators were able to use this particular leakage as evidence linking one corporate hack to another, and another, and another. By 2014 it had become clear that over 1,000 hacks of U.S. companies were not disparate, independent events, but in fact a coordinated effort by the People’s Liberation Army of China.
While attack methods are the most common sources of information in cyber investigations, there are other behavioral side channels that are even more interesting. Even the most minute features of the way a hacker types can give them away, if they’re not careful. An investigator operating a honeypot can track, for example, the speed with which somebody types, or the little choices they make when navigating around a system.
[Matt] So what I did was I got 10 volunteers who were kind of pen testers and students and that kind of thing. I got each of them to attack two virtual machines that I set up. These virtual machines were configured with deliberate privilege escalation vulnerabilities and kind of had like interesting fake data and that sort of thing.
I asked these 10 volunteers to attack each VM to try and escalate their privileges, to poke around the file system, to ex-filtrate the data and that kind of thing.
While they were doing that, I was logging their keystrokes over SSH. Once the attacks had finished, I took all those keystrokes and separated them into commands. So looking at the command switches that they were using and tools they were using and used that to kind of – to try and see if I could link together where one of those volunteers had attacked both machines, if you see what I mean.
To analyze behavioral signatures, you first identify a class of activity to focus on. For example: what kinds of terminal commands an individual uses when navigating around the computer system. It seems like such a small thing to focus on, but two different people, even when they’re trying to achieve the same goal, are likely going to take different paths to get there. If you gather enough of these little data points, you have a dataset to work with.
Next, you apply what’s called a “similarity coefficient,” and use statistical tools to yield the probability that any two attacks were carried out by the same volunteer. The math involved in reaching these probabilities is complicated, but the idea is not. The more commonalities between two different attacks, the more likely it is that the same malicious actor carried them out.
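As a rough illustration of the idea, the sketch below reduces each logged session to a set of tool and switch features, then compares two sessions with the Jaccard coefficient, one common similarity coefficient. The episode does not say which coefficient or features Matt used, so the Jaccard choice, the `command_features` and `jaccard` names, and the sample sessions are all assumptions.

```python
def command_features(session_lines):
    """Turn logged command lines into a set of (tool, switch) features.

    Any argument starting with "-" is treated as a switch; paths and
    other arguments are ignored, since they depend on the target machine.
    """
    features = set()
    for line in session_lines:
        tokens = line.split()
        if not tokens:
            continue
        tool = tokens[0]
        features.add((tool, ""))  # the tool itself, regardless of switches
        for token in tokens[1:]:
            if token.startswith("-"):
                features.add((tool, token))
    return features

def jaccard(a, b):
    """Jaccard coefficient: shared features divided by all features seen."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical keystroke logs, invented for this sketch.
vm1 = ["ls -la /home", "sudo -l", "find / -name id_rsa", "cat /etc/passwd"]
vm2_first = ["ls -la /var", "sudo -l", "find / -name id_rsa"]
vm2_second = ["ls -l", "id", "uname -a", "ps aux"]

same_style = jaccard(command_features(vm1), command_features(vm2_first))
other_style = jaccard(command_features(vm1), command_features(vm2_second))
```

Here `same_style` comes out much higher than `other_style`, which is the signal an investigator would look for. This sketch stops at the raw score; converting it into a probability that two attacks share an author takes the additional statistical step described above.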
[Matt] So yeah, we had pretty good results for that. So depending on the kind of behavior type we were looking at, we got between 91 percent and 99 percent accuracy.
If you’ve ever filled out an online form, you’ve probably experienced a captcha. They’re those little boxes with obscured letters that you have to read and then type out. They’re used in order to prove that you’re a human, not a bot.
Cultural captchas are perhaps the strangest human side channel of the three we’re discussing in this episode. They refer to the little cultural signifiers that each of us knows implicitly, but only because of where and when we grew up. Like ordinary captchas, they might be a useful means of sniffing out an online disguise.
Let’s take a hypothetical. Say, I, Ran Levi, am trying to pass as an American instead of as an Israeli, which is the real case. How would you be able to tell that I’m lying? Well, for one thing, my accent will probably give me away. But who knows? Maybe I just have a peculiar speech impediment.
One thing you could do is to take me to a KFC restaurant and ask me to order a meal for the two of us. Since I’m used to using the metric system and not the Imperial units of measurement used in the US, I’ll probably order enough wings and mashed potatoes to feed an average family. This might indicate that I’m not, in fact, an American. If you’re wondering – yes, this actually happened to me on my first visit to the States. And don’t get me started on your ridiculously sized coffee cups.
In the cyber world, you can try the same kind of thing. For example: much has been made of social media accounts created by Russian intelligence agents to influence Western politics. Obviously, the reason why these accounts work is because they mask their true origins effectively. They use profile pictures stolen from elsewhere on the web. And, rather than jumping in with inflammatory content from day one, they might begin by posting ordinary material over a long period of time. An account with a long history will seem more legitimate.
So how do you catch a Russian impersonating a Westerner, if they’ve covered all of their tracks?
[Matt] But say you have an account from the UK for instance that’s posting very strong views about Brexit. It would be a good example and maybe trying to kind of manipulate consensus around Brexit or to influence conversations. How do we know that that account is actually from the UK?
So there are things you can look at in terms of metadata and that kind of thing. But obviously those can be spoofed. So cultural captchas, as I call them, is potentially a way to try and at least give an indication as to whether an account is actually from the country it claims to be and it’s using something called – or something I call kind of cultural references that for whatever reason haven’t really spread beyond their country of origin.
So there was actually a really interesting example in the film Inglourious Basterds, the Quentin Tarantino film. There’s a scene where – I think it’s Michael Fassbender is kind of undercover as a Nazi officer in France and he’s trying not to give himself away and he signals to the bartender for three glasses and he holds his first three fingers up. That actually gives him away because in Germany, when people signal three, they put their first two fingers and their thumb up.
So that’s kind of a – obviously that wouldn’t work now as a cultural captcha because everyone knows about it. But that’s kind of an example of little kind of cultural differences that if you’re not from a particular country, you haven’t spent a lot of time in that country, you wouldn’t necessarily know.
Cultural captchas are very niche, and don’t really factor into common cybersecurity practice today. It doesn’t mean they aren’t useful, just that we’re only just starting to develop these kinds of methods.
[Matt] It needs more research, yeah. It’s kind of an initial idea and again, as with linguistics and as with the behavioral stuff, it’s something I’m kind of trying to get people to get involved with and see whether or not it could work.
How do we catch a hacker who’s covered all their tracks? A hacker who’s created their malware from scratch, spoofed their identity and routed their connection through a maze of servers that you simply can’t track them through.
By analyzing human side channels, we may just have a few newer ways of identifying even the cleverest of malicious actors. If we can capture their malicious program, we can use forensic analysis to piece together clues as to what kind of person or persons may have written it. If we can cross-examine multiple attacks and find similarities between them, it might reveal details about how our attacker tends to operate. Maybe they left little clues, without even realizing it, identifying them as being of a particular nationality.
These analysis methods all rely on the same underlying truth: that hackers are people, and people are unique and imperfect. There is no such thing, really, as covering all your tracks. Even the best hackers leave some clues behind. It’s the job of cyber security experts to develop the tools and methods necessary for sniffing out those clues, however well-hidden they are.