China Vs. Github

Centuries before Aristotle, Cleopatra or Jesus Christ walked the Earth, the Chinese were building walls. In ancient times, they were defenses between warring ethnically Chinese tribes. A thousand years later, walls were erected to keep out warmongering Mongols. Today, the Great Wall of China remains a standing relic of those times.

But walls are about more than just military defense: they are a means for enforcing separation between peoples. We say communities are “gated” when they isolate groups of common tradition, or wealth. Artificial borders such as the Berlin Wall create teams on either side, separating the “good” from the “bad”. For China, walls have always been about not just kinetic, but ideological war. The Mongols were violent conquerors, sure, but they were also seen as barbaric, brutish–their mere presence threatened to taint the more pure Chinese way of life. In this sense, the Great Wall kept out not just foreign armies but foreign cultures.

The Great Wall of our era was built with very much the same purpose in mind. This wall, however, is perhaps even more imposing, more threatening than the real Great Wall of China is. And it’s not an earthen wall. It’s not a stone wall. It is a firewall.

You probably have a firewall installed on your computer, or your router at home. It functions perhaps less like a wall than a gate, or a door: filtering incoming data to allow legitimate network packets through, and keeping unwanted, malicious packets from making chaos on your machine. Just as your front door or front gate allows you to discern who can come into your home and when, so too do firewalls try their best to give you full control over what information is allowed entry onto your computer.

For two decades now, China has viewed their entire national internet in much the same way as you would your home, or your laptop. Beginning in 1997, the Chinese government imposed strict laws on internet usage by its citizens. In 2003, it began the “Golden Shield Project”, a massive cyber surveillance and censorship infrastructure project. Golden Shield would be, essentially, a firewall: through common hacker methods such as IP blocking, DNS spoofing and URL filtering, the Chinese government could gain a firm grasp over what internet resources its citizens would be privy to. That project was completed in 2006, and it has resulted in what is now cheekily referred to as “The Great Firewall of China”.
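To make those three methods a little more concrete, here is a minimal sketch in Python. It is only an illustration of the general ideas, not a description of how Golden Shield is actually built: every address, domain and keyword below is a made-up placeholder, and the function names are hypothetical.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical blocklists, invented purely for illustration.
BLOCKED_IPS = {"203.0.113.7"}
POISONED_DOMAINS = {"blocked.example": "198.51.100.1"}
BLOCKED_KEYWORDS = ("forbidden-topic",)


def resolve(domain: str, real_dns: dict) -> Optional[str]:
    """DNS spoofing: hand back a bogus address for poisoned domains."""
    if domain in POISONED_DOMAINS:
        return POISONED_DOMAINS[domain]
    return real_dns.get(domain)


def allow_connection(dest_ip: str) -> bool:
    """IP blocking: refuse traffic headed to a blacklisted address."""
    return dest_ip not in BLOCKED_IPS


def allow_url(url: str) -> bool:
    """URL filtering: refuse requests whose URL contains a banned keyword."""
    parts = urlsplit(url)
    text = (parts.netloc + parts.path + "?" + parts.query).lower()
    return not any(word in text for word in BLOCKED_KEYWORDS)
```

In this sketch, a lookup for "blocked.example" quietly returns the wrong address, a connection to 203.0.113.7 is refused outright, and a URL containing "forbidden-topic" never gets through.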

The problem with China’s internet front gate is how exclusive it is about what kinds of guests are allowed in. Where Mongol “barbarians” were once the enemy that needed to be kept from infecting China’s citizenry, today it is quite often democracy–freedom of speech, thought, even free markets–that is perceived as a threat to the government’s authoritarian, one-party rule. Any news or opinion not overtly friendly to the government is shut out before citizens can be exposed to those alternative views. American companies like Google and Amazon are not allowed to operate within China–instead, equivalent companies like Baidu and Alibaba are created within national borders, offering equivalent products and services. In a sense, the Great Firewall has created less of an internet than an “intranet”.

But, like any wall, rebel and miscreant actors find holes in the structure, or ways to break it down. For China, those rebels were two organizations–a nonprofit called “Great Fire”, and the New York Times–and their means of entry was Github.

Great Fire is a nonprofit whose sole purpose is to expose and circumvent China’s Great Firewall. Perhaps their most common tactic over the years has been to find ways of delivering otherwise banned Western information sources–whether it be the BBC or Google–to Chinese citizens. The New York Times got on China’s bad side when, in October of 2012, they published an article detailing the “hidden fortune” of the nation’s premier, Wen Jiabao. Chinese censors blocked the paper, but instead of bowing out, the Times began working around them to reinstate their Chinese-language paper.

But how did Github become the vehicle for these two political organizations?

With almost 30 million users, Github is the worldwide industry standard website for programmers to upload and share source code. Both Western- and Eastern-hemisphere entities, including many Chinese businesses, rely on the site not just as a repository, but as a means of cross-business communication. Therefore, Github is one of those few Western-based websites that China can’t so easily get rid of. Think about it: China can ban Facebook and YouTube because it can build its own Chinese Facebook (WeChat) and its own Chinese YouTube (Youku Tudou)–along the way helping grow its internal economy, while taking only a minor penalty to international business. Github is different, because Chinese businesses rely on it to communicate with non-Chinese businesses. If China were to pull their usual tricks with Github, it simply wouldn’t work. On January 21st of 2013 the government tried just that–using the Great Firewall’s DNS hijacking function to block all traffic to Github flowing in and out of China. Businesses and programmers around the country protested, and the site was reinstated only a few days later.

So Github is kind of untouchable, making it a useful bridge through China’s Firewall. But more than just a bridge over water, Great Fire and the New York Times realized Github could also be a Trojan Horse.

Here’s Ross Rustici, an author and Senior Director of Intelligence Services at Cybereason:

[Ross] So GitHub is a repository of malicious information from the Chinese perspective. The problem with GitHub is the host code and host capabilities that anybody can use and so you’re going from a Whac-A-Mole problem of I’m going to go to GitHub. I’m going to download as a private Chinese citizen the code to get around the Great Firewall or a new VPN that the Great Firewall currently isn’t blocking.

Crucially, Github is served over HTTPS, short for HTTP Secure. HTTP, or Hyper Text Transfer Protocol, is the means by which data is sent between a browser and a website. HTTPS adds a layer of encryption to the data transfer process, making it more difficult for malicious actors to peek in. Because Github operates under HTTPS, the specific page being requested is encrypted, so from the outside it’s practically impossible to distinguish one Github page from another. So the Great Firewall can’t tell the difference between the Github base for Great Fire, and the Github base of a Chinese company–it all looks the same, Github “miscellaneous”. To get rid of Great Fire, therefore, would require taking down the whole site.
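Here is a rough way to picture what an outside observer can and cannot see. The sketch below is a simplification under stated assumptions: it treats the hostname as visible (it generally leaks through the DNS lookup and the start of the encrypted handshake) and the path as hidden. The function name and the two Github paths are hypothetical examples, not real repositories.

```python
from urllib.parse import urlsplit


def visible_to_observer(url: str) -> dict:
    """Return the parts of a request a network observer can plausibly see."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        # Over HTTPS the path and query travel inside the encrypted channel,
        # so only the hostname is assumed to be observable.
        return {"host": parts.hostname, "path": None}
    # Plain HTTP exposes the full request, path included.
    return {"host": parts.hostname, "path": parts.path or "/"}


# Two very different (made-up) pages on the same site...
activist = visible_to_observer("https://github.com/example-activist/mirror")
business = visible_to_observer("https://github.com/example-company/app")
# ...look identical from the outside.
same_from_outside = activist == business
```

Under these assumptions, both requests reduce to “someone is talking to github.com”, which is why blocking one page means blocking them all.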

Knowing this, China’s infiltrators used Github primarily to host “mirror sites”–near-identical copies of websites otherwise blocked by the Great Firewall. Creating mirror sites isn’t necessarily a malicious act in itself: many businesses create copies of their websites to spread out large amounts of traffic–like opening up another lane on a highway. Mirror sites can also be a way to offer faster speeds to internet users in different geographic locations: instead of pushing people in Argentina and Japan to access the same site, you create two versions of the same site, one hosted in Argentina and the other in Japan. The New York Times, instead, created a mirror site on Github to offer Chinese-language news to Chinese citizens, against the will of the Chinese government. Great Fire hosts a number of mirror sites through Github, and posts links to proxy servers Chinese citizens can use to circumvent the Great Firewall and access whatever portions of the internet they wish.

[Ross] So the Chinese constantly have to find the new technology, a new circumvention and go after the individuals or they can go after the data broker which is GitHub in this case. You’re not going to prevent people from creating these capabilities. You’re not going to ever be able to prevent roughly 20 percent of the population from being willing to use them.

What you can do is make it very difficult for the people who create them to disseminate them to the ones that want to use them.

Not everyone outside of China was so happy about these political acts, though. Some Github members criticized the Times and Great Fire for politicizing an otherwise peaceful, apolitical forum, and exposing the whole website to the risk of a Chinese attack. Really, whether you liked it or not, this was activists using an internationally beloved website as a shield for their own anti-Chinese political goals. Did the ends justify the means? Either way, the means sure did work.

Here China had a gaping hole in its otherwise indestructible wall. With no obvious way of patching it up, they tried something unique: standing at the threshold of the opening, waiting, watching as people came through.

At just around 8:00 p.m., on January 26th of 2013–only five days after trying and failing to ban the website within their borders–the first Chinese-based Github hack occurred: a “man-in-the-middle” operation. Man-in-the-middle works much like it sounds: the hackers position themselves between users and the website, where they can watch traffic flowing in and out from within their borders. In this case, the Chinese-backed hackers managed to do this by faking SSL certificates. You can think of an SSL certificate like a computer ID–what verifies, say, a company as the company they claim to be. The “Secure” in HTTPS refers to this technology: “Secure Sockets Layer”, or SSL, since succeeded by Transport Layer Security, or TLS. It’s a handy tool that allows organizations to keep their data from prying eyes, and enables machines to detect malicious knock-off websites before you insert your credit card information into what you thought was “Paypal.com” but was actually “Paypel.com”.

Faking an SSL certificate isn’t usually an effective tactic, because browsers only trust certificates vouched for by recognized certification authorities. So this man-in-the-middle plan had a fault: the fake Github ID was really, obviously fake. Users who visited the URLs in question would see their browser windows pop up with a warning that they were visiting a potentially malicious site. Firefox and Google Chrome users who had visited Github at least once before simply would not be allowed to continue on.
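The same kind of check a browser performs can be sketched with Python’s standard `ssl` module. This is an illustration of certificate verification in general, not a reconstruction of what any particular browser did in 2013; the two function names are hypothetical.

```python
import socket
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Build a client context that refuses unverified certificates."""
    # The default context requires a certificate that chains to a trusted
    # certification authority and whose name matches the requested host.
    return ssl.create_default_context()


def fetch_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect to host and return its certificate details.

    Raises ssl.SSLCertVerificationError if the certificate is not signed
    by a trusted authority or does not match the hostname.
    """
    context = strict_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

A forged, self-made certificate would be expected to fail at the `wrap_socket` step, which is roughly the moment a browser throws up its warning page.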

Those who ignored or didn’t notice their warnings gave the Chinese government free view of their web activity on the site, as well as their personal information including IP addresses and login credentials. But here’s the thing: most people who use Github–a website for programmers–are pretty good with computers. Github’s operators began to see a flood of reports of invalid SSL certificates. Within an hour, the hackers were booted out of the network.

A postmortem report from Great Fire acknowledged that while the nature of the hack pointed to China, the specific identity and motivation of the hacker or hackers in this case remained unknown. The prevailing theory is that the attack came in response to a petition on whitehouse.gov, titled: “People who help internet censorship, builders of Great Firewall in China for example, should be denied entry to the U.S.” The petition was published on January 25th, one day before the man-in-the-middle attack, and linked to a list of names of people who purportedly had contributed work towards the Great Firewall. That list was hosted on Github. Among the thousands of comments on this list–most of them in Chinese–was yet another list with even more names of supposed Firewall builders, including contact information for these individuals such as personal websites and email addresses.

Due to the blatantly obvious and ineffective nature of the attack, analysts posited that perhaps it wasn’t the Chinese government, but an unknown person or group of people included on those Github lists–an actor without the resources and knowhow of the whole government, but with the motivation to monitor and perhaps stop traffic to this portion of the site. Nobody knew for certain.

By early 2013, Chinese officials must have been racking their brains, trying to figure out some way to patch up the hole in their Great Wall. There was no evident way to take down these Github pages without taking down the whole site, or spying without being caught. I mean, what could they do now, short of nuking the whole site?

“Hmm, nuking…tell me more,” they said.

After failing in their first attack, China retreated. Sometime thereafter, the leadership decided their defense infrastructure wasn’t enough: with a wall you could keep out most foes, but you couldn’t get rid of them entirely. What they needed was to go on the offensive.

Two years later, the People’s Republic of China would come back with more force than anyone would’ve imagined, wielding a weapon that harnessed the strength of millions of computers at once. It was a cannon so powerful it could break through any wall, take down any website. You may have even helped it fire. Yes, you.

This weapon would come to be known as “the Great Cannon”.

One fifth of the entire world’s population–1.35 billion people–live within China’s national borders. That’s as many people as you can find in all the Americas put together, plus Europe and Australia. Brazil, the U.S., Germany, Chile, Mexico, England–throw them all in the same pot and you get no more people than live in China alone.

Such a large number of people connotes a great amount of power for China’s government, which not only oversees 1.35 billion people but actively controls them under an authoritarian rule. That’s why The Great Firewall of China is such an imposing thing: it’s a toll stop along the most popular internet highway in the world. Chinese leadership bears the power to do whatever they want with their highway: to impose whatever toll they deem fit, or close off whatever lanes they wish. But in 2015, the leadership took it one step further, installing an officer’s station at the gate. With a traffic cop manning the influx of cars, they now had the ability to divert one lane of traffic to whatever destination they wished–perhaps a highway exit totally out of everyone’s way. This was the Cannon.

[Ross] So the Chinese have used a couple of different attack methodologies with the Great Firewall. One is using the Great Firewall because it essentially has the ability to hijack any IP address in China. So it is arguably the highest concentration DOS you could possibly do as a single website.

Because theoretically, the Great Firewall would generate traffic for every single IP address in China and direct that towards a website. That’s really what the Great Cannon started out as was – we have this latent bandwidth capability. Let’s just blast websites off the face of the internet.

On March 26th, 2015, the end of that highway exit led to Github.com. The sheer number of drivers headed straight towards Github caused a huge crash that lasted for days.

The Great Cannon isn’t The Great Firewall: they are distinct mechanisms, one offensive in nature and one defensive. However, they share a good deal of the same source code. And, just like a toll stop and a traffic officer, the two programs are co-located–that is, they’re housed in the same place, built by the same people.

But the Great Cannon is just an artillery mechanism, a means of firing–it still requires an ammunition supply.

[Ross]: Yeah. So Baidu is the best – the Google of China. It is predominantly a Chinese internet Goliath that has its hand in just about everything. So it’s the number one search engine in China. It has a PayPal type equivalent payment system for electronic payments. It has its own chat program that is very popular. It is an ecosystem onto itself within China and it has a massive user base on a continual basis.

Baidu is the fourth most visited website on the Internet, for the same reason that the Great Firewall is imposing and the Great Cannon is dangerous. You may not have ever visited Baidu yourself–maybe you’re not even really sure what it is. But because China’s population is so large, and because their Firewall prevents its citizenry access to unwanted Western companies, purely Chinese-oriented companies like Baidu can become some of the biggest in the world before they even consider catering to the West. There is, however, a devil’s bargain to operating an internet company in China: if the government controls the internet, then they in a very real sense control the companies that operate over it. Unlike in the States, private entities do not hold a legal right to counter any government intervention into their product or holdings. That’s how the company behind the world’s fourth most visited website became enmeshed in a hacking scandal they may have played no part in–ammunition in the Great Cannon they likely had no prior knowledge of–and yet, with all their power and influence, had no way of stopping.

A prerequisite to piecing together the second Github hack is that you really have to grasp how Baidu is Chinese Google, in just about every way. You probably know by now that Google isn’t only operating when you type www.Google.com into your browser search–it’s also there powering other websites’ search functions, targeted ads and analytics data. Google is to the internet what Michael Jordan is to basketball: basketball exists outside of Michael Jordan, but for many people, when they think “basketball” they think “Michael Jordan”. So in the way that Google is almost synonymous with the internet in our minds here, Baidu is the same over there. If you’re a Chinese speaker, no matter what website you’re on, chances are you’re interacting with Baidu services in one way or another.

One of the foremost services Baidu offers is their equivalent to Google Analytics, a tool called “Tongji” that allows websites to track visitor behaviors. On March 26th, 2015, by visiting websites served by Baidu’s analytics tool, a huge number of unwitting Chinese-language internet users became ammunition for the Great Cannon. Put another way: if that first Github hack was a man-in-the-middle attack, the second Github hack was a man-on-the-side. It worked like this…

You’re an innocent, Chinese-speaking citizen, accessing the internet from…let’s say Taiwan. (By the way, an important note here: most of the traffic used in the Great Cannon’s attack came from countries neighboring China, such as Taiwan, rather than within China itself. Anyway…) You type in a URL to visit an online clothing store you’re into–and, like any other day, the webpage on your browser takes a second or two to load. By typing in the URL and hitting enter, you sent a request to a server to access the website, and during the loading time, your connection to that website is being established. What you’re loading is more than just pictures of shirts and shoes, though. Every time you’ve ever visited this online store, a Javascript file from Tongji also loaded as part of the deal. It’s not a visible aspect of the site, so its presence has never even entered your conscious mind. It’s your shopping site’s administrators who use Tongji–to collect data on you and their other shoppers.

Everything I’ve said thus far is boring, right?

What you aren’t aware of today–on March 26th, 2015–is that as your online shopping store loads, as your signal crosses over the internet, into China, where the server hosting your website is located, a passive cyber infrastructure tool just perked up its ears. The Great Cannon has just noticed that you made a query for a Tongji Javascript file. Instead of sending you that file as is, however, the Cannon injects a fake copy of that script, with three malicious injected packets of data.

[Ross] What those extra lines of JavaScript did was essentially tell that web browser that’s now displaying Baidu normally to also try to connect to GitHub’s IP address. So while you’re typing away, doing your research on Baidu or checking WeChat or whatever you’re doing, your computer is sending connection requests to GitHub unbeknownst to you because of this little JavaScript injection that they did on the backbone of the Chinese internet.
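The swap itself is simple enough to sketch. What follows is a toy model in Python of the decision being made on the wire, not the Great Cannon’s actual code: the script marker, the target address and both script bodies are placeholders invented for illustration.

```python
# Hypothetical names throughout: the analytics marker, the target and the
# script contents are placeholders, not the real artifacts from the attack.
ANALYTICS_MARKER = "analytics.example/h.js"
TARGET = "https://target.example/"

GENUINE_SCRIPT = "/* visitor-tracking code */"
FORGED_SCRIPT = f"/* keep requesting {TARGET} in the background */"


def script_for(request_url: str, intercepted: bool) -> str:
    """Return the script body a browser ends up running for one request."""
    if intercepted and ANALYTICS_MARKER in request_url:
        # The forged reply is returned in place of the genuine script;
        # the rest of the page still loads normally.
        return FORGED_SCRIPT
    return GENUINE_SCRIPT
```

In this model, a request for anything other than the analytics script passes through untouched, which is part of why nobody browsing would notice a thing.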

Of course, you don’t see any of this. Maybe your browser is running a bit slowly, but the difference is negligible, if not unnoticeable. The clothing store’s webpage loads on-screen and you continue with your day. Maybe you go on and buy some new pants, or some socks. Pretty boring. All you did, of course, was become a pawn in an international cybercrime.

If you really want to understand how big Baidu is, consider this: the Great Cannon is configured to act only on 1.75% of requests that run through it. That’s a highway stop which diverts less than one in every fifty cars it lets pass. And the Cannon only acts on the first packet of a query, meaning the large majority even of an infected computer’s data remains untouched. Making use of only this tiny percentage of a tiny percentage of a single company’s traffic, the Great Cannon managed to take down Github for a period of over 24 hours, and continue to make trouble for a week thereafter. Imagine if the Great Cannon took full advantage of Baidu’s traffic, or used more than just Baidu as a resource pool. You could probably take down, like, the United States of America!
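To put that 1.75% in perspective, here is the arithmetic as a few lines of Python. The rate comes from the figure above; the one-million-request volume in the usage example is a made-up round number, since real traffic figures aren’t given here.

```python
from fractions import Fraction

# 1.75% of requests, expressed exactly to avoid rounding surprises.
RATE = Fraction(175, 10000)


def diverted(requests: int) -> int:
    """How many of the given requests would be hijacked at this rate."""
    return int(requests * RATE)


def one_in_n() -> float:
    """The rate expressed as 'one in every N requests'."""
    return float(1 / RATE)


hijacked = diverted(1_000_000)   # hypothetical volume of one million requests
odds = round(one_in_n(), 2)      # roughly one request in every 57
```

So out of a hypothetical million requests, 17,500 would be turned against the target while the other 982,500 sail through as usual.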

But perhaps equally frightening is the precedent set by the Great Cannon. If this really was a government-sponsored effort, it would mean that China was using largely non-Chinese computers to carry out an attack on a non-Chinese company. The Great Firewall was always limited to China’s own national population–here, not only was the government shifting from being censors to being hackers, but these hackers were manipulating foreign entities to attack independent organizations outside their legal purview. International cybercrime at its juiciest.

For the record, representatives of the People’s Republic never accepted responsibility for the Github affair. When questioned, a spokesperson for China’s Foreign Ministry expressed his view that, quote, “it is quite odd that every time a website in the US or any other country is under attack, there will be speculation that Chinese hackers are behind it.” Another spokesperson responded to journalists at Vice by saying: “we hope that instead of making accusations without solid evidence, all relevant parties can take a more constructive attitude and work together to address cyber issues.” Notice how neither of these responses was an explicit denial.

Were these legitimate pleas of innocence, or attempts to deflect guilt? Some private cyber security researchers decided to take the investigation into their own hands.

Alright Malicious Life listeners, I have a problem: it appears somebody has started a rumor that “Malicious Life” is a bad podcast. How can I find who did this? You probably instinctively know what I should do. First, I ask the person I heard it from: who told you? I find that person, then ask that person who told them. And so on, and so on. Eventually my investigation turns into a long game of telephone: Dave heard it from Sarah who heard it from George who heard it from Dani who heard it from Eliad, yada yada yada.

Then, at the very end of the line I catch the culprit. It was Nate Nelson–our very own senior producer! Why Nate, why?!

Alright, that was just an analogy–everybody knows ‘Malicious Life’ is a good podcast. The reason for the analogy is that rumors work a lot like internet routing does. When you do something online, it might seem very simple: visiting Facebook.com, sending an email–what’s there to it? The internet can feel like a black box, where things just happen: you type in Facebook so Facebook is there. But we know, of course, that it’s much more complicated under the hood. Your request to access Facebook’s website bounces through a whole lot of routers on its way to Facebook’s server, then a whole lot of routers again on its way back to your monitor. It’s like a rumor, where each router passes the information on to the next, creating a chain of information transfer. The whole time, you’re not even aware of these routers existing, where they’re located or how many are involved in the chain. It usually doesn’t matter much.

Traceroute is a quite clever way to track down a malicious agent, by taking advantage of this chain of causality. Just as I had to find Dave, Sarah, George, Dani and Eliad before I caught the perpetrator of that malicious rumor (damn you, Nate!), traceroute tracks backwards over a routing network to find the point at which a malicious procedure was initiated. It does so using a specific category of information included in any data packet: time-to-live, or TTL.

Time-to-live is just what it sounds like: a limit after which a data packet will be terminated, counted in practice not in seconds but in hops from one router to the next. The point of TTL is to avoid router loops. Say there’s a problem along a routing network, because one of the routers is misconfigured or non-functioning for some reason or another. If you send a request over the network but there’s one broken link in the chain, it will never be able to reach a website’s servers. This is where the TTL of your data packet will kick in. A typical TTL value is 64. 64 is the measure of how many lives your packet has: it jumps to its first router, 63, then to the next, 62, then 61, and so on. If you sent your packet with TTL 64 and it reaches its destination with TTL 50, you know it made 14 stops along its way. 64 should be more than enough to get your data where it needs to go.
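That countdown is easy to model. The sketch below simulates it in Python over an imaginary list of routers; it sends no real packets, and the function and router names are invented for the example.

```python
from typing import List, Optional, Tuple


def send(ttl: int, routers: List[str]) -> Tuple[Optional[str], int]:
    """Pass a packet along the routers, spending one life per hop.

    Returns (router where the packet expired, 0) if it ran out of lives,
    or (None, remaining TTL) if it made it past every router.
    """
    for router in routers:
        ttl -= 1  # each router decrements the TTL before forwarding
        if ttl == 0:
            return router, 0  # out of lives: this router discards the packet
    return None, ttl


# Fourteen imaginary routers between you and the website.
path = [f"router-{n}" for n in range(1, 15)]
expired_at, remaining = send(64, path)   # delivered, with TTL 50 left over
```

Start with 64 lives, pass 14 routers, and the packet arrives with 50 to spare, which matches the arithmetic above.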

If one router in the path is problematic, your packet won’t simply proceed as normal. Say a misconfiguration sends it around in a loop: it reaches one router, TTL 61, gets handed to the next, 60, and then handed right back. 59, 58. Every hop costs a life. Eventually, the TTL value will reach zero, at which point the router holding the packet gives up on it. That router then sends what’s called a “Time Exceeded” message back to your computer: your request failed.

Where usually a failed request might be a bad thing, cyber security researchers can leverage traceroute to discover the identities of routers in a network. That’s because, in reporting the TTL 0 error, a router identifies itself. It says “Hi, how are you today? My name is Router X. It appears that, under my watch, your request to reach malicious.life has failed. Sorry!” And so, you know where your information has traveled.

Traceroute utilizes TTL to its advantage, by setting the value purposely low, specifically in order to get those error reports back. First, you send a packet with a TTL of 1. The router receives the information, but can’t pass it on, as its TTL has reached zero. The first router comes back to you to say hi. Then you send another packet through the network with TTL 2. Then 3, 4, 5, until your request actually goes through, at which point you’ve identified the whole chain. Clever, right?
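Here is that procedure as a small Python simulation. A real traceroute sends actual packets and typically needs elevated privileges, so this sketch runs against a pretend network instead: the router names are invented, and `probe` stands in for sending one packet and waiting for an error report.

```python
from typing import List, Optional


def probe(ttl: int, routers: List[str]) -> Optional[str]:
    """Send one pretend packet; return the router that reports it expired.

    Returns None when the packet outlives every router and is delivered.
    """
    for router in routers:
        ttl -= 1
        if ttl == 0:
            return router  # this router "says hi" with a Time Exceeded error
    return None


def traceroute(routers: List[str], max_ttl: int = 30) -> List[str]:
    """Discover the chain by probing with TTL 1, 2, 3... until delivery."""
    discovered = []
    for ttl in range(1, max_ttl + 1):
        reporter = probe(ttl, routers)
        if reporter is None:
            break  # the request actually went through; the chain is complete
        discovered.append(reporter)
    return discovered


hops = traceroute(["home-router", "isp-gateway", "backbone"])
```

Each round reveals exactly one more router, so three routers take three failed probes plus one that finally gets through.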

Robert Graham, a cybersecurity blogger, was one of the researchers who attempted to use traceroute to identify the Github hacker. When he initiated the procedure, he discovered a few shady clues right away. First: the requests sent from his computer took almost three times longer to process once they moved from the U.S.-based server at TTL 8 to the Chinese server at TTL 9. Second, and most importantly: at TTL 16, his requests hit a firewall. It seems the hackers had thought ahead, and put a safeguard in place in case anyone tried to trace their server.

To get around the firewall, Graham had to write some clever code. He designed his program to first send data with normal TTL values, therefore establishing a connection with the hacker’s server. Only once the bridge had been built did the program conduct traceroute, sending packets of smaller TTL values and tracking the pathway step by step.

Up through TTL 11, Graham’s queries failed. Then, at TTL 12, his data got through. The response also came back with a strange new file attached. This was the forged Javascript file users around the world had received from the Great Cannon! The IP address for the machine at TTL 12 read 61.135.185.140. Using publicly available IP tracking tools, the number was found to trace back to…China Unicom, a major telecommunications operator based in Beijing. Citizen Lab–the group that first came up with the term “Great Cannon”–later used Graham’s same methodology to locate the source at both Unicom and China Telecom, another major telecommunications provider based in Beijing.

[Nate] And how do you go from spotting the source of the problem to be Baidu or a Chinese telecom company and make the jump between that and the government itself?

[Ross] So given the fact that the target was GitHub in response to these particular domains, the fact that China has done DDoS against GitHub in the past coupled with the fact that it was the Chinese backbone that was doing this targeting, this site, specifically the New York Times part of it, this close to the international uproar over those particular articles. There was a lot of additional contextual data that said this probably isn’t some malicious actor in China. This is probably the Chinese government.

Evidently, China hijacks Baidu as a host body for its malware, and uses major telecom providers in the capital city as that malware’s means of travel. In political terms, this is what you might refer to as “the cost of doing business in China”.

[Ross] There has been no consequences and they will continue to do this when they see things that they view cross the line.

If you expected there to be a next part to the story: the programming community fights back, the U.S. government intervenes, China gets its comeuppance…there’s none of that. By now, if anything, there’s an understanding that this sort of thing does, and will continue to, go on behind closed doors.

[Ross] At the end of the day, it’s really hard for anybody to have the moral high ground when it comes to hacking these days and in the bilateral dialogue and the international dialogues that are happening around hacking internet security, cyber-security, the Chinese and the Russians push a very segmented version of what the internet should be based off of a law, a legal system that involves content control and allow each country to regulate things, how they see fit. It’s at direct odds with what the United States has been pushing forward.

Really, there hasn’t been much at all made about the story I told today in the wider world. It made news in tech circles, but rarely elsewhere. Part of the reason this never became a big story: Github knew what they were doing, and turned potentially devastating hacks into manageable problems. The reason why big hacking stories become big is often incompetence on the side of the victim: Target, Aramco, Ashley Madison–places where executives barely know their zeros from their ones until the problem slaps them in the face. Github is a programming community–they’re the last people you want to mess with online, and they showed that simply caring a little bit more about your cyber defense can go a long way to staying out of the news and staying in business.

In fact, one good thing did come out of the Chinese Github attacks: in the time since, Github began beefing up their security even further. By 2018, the site had put powerful safeguards in place that would, in theory, be able to protect them from even the strongest DDoS attacks.

What happens next? In February 2018, Github gets hit with the largest DDoS attack in recorded history.

More on that in a future episode of Malicious Life.

China Vs. Github

Season 3 / Episode 33

Hosted By

Ran Levi

Special Guest

Ross Rustici

Senior Director for Intelligence Services at Cybereason