Adding a Ghost User to Our Encrypted Communications Is No 'Crocodile Clip'
Note: This is part three of a four-part series in which security expert Jon Callas breaks down the fatal flaws of a recent proposal to add a secret user, the government, to our encrypted conversations. Part two can be found here.
The latest intelligence community proposal for circumventing encryption suggests a scheme that would enable surveillance on otherwise securely encrypted communications by secretly adding an extra user, the government, to a conversation.
The GCHQ authors insist that the proposal does not "break encryption." They're wrong. As multiple computer scientists and security experts have explained, the "ghost user" encryption back door won't work as promised. It is hard enough to build secure tools; it is impossible to build technology that is impenetrable for everyone except the "right people."
In this third post of a four-part series on the GCHQ "Ghost User" proposal, I address a technological assertion that the authors make in their essay: the idea that adding a secret listener to a conversation is just like attaching "crocodile clips" to a phone wire. This analogy is a rhetorical device designed to normalize the idea of a secret listener-in. It is wildly inaccurate from a technological point of view, and it serves to obscure major flaws in the GCHQ authors' idea: the "ghost user" proposal won't keep conversations private from unauthorized attackers, and criminals and terrorists will readily be able to defeat the tool.
Crocodile Clips Are Nearly Extinct
The GCHQ authors wax nostalgic, insisting that all they're asking for is for the world "to go back a few decades" and enable investigators to apply "virtual crocodile clips" to internet communication platforms. After all, long ago, police could sneak over to the phone wires running outside of your home and convert a two-way call into a three-way call (with the government as the third party) using crocodile clips. Crocodile clips, called alligator clips here in the United States, are small metal clips that one can attach temporarily to an electrical connection.
The metaphor may be simple and easy to understand, but it doesn't work here. True, the early telephone system could be wiretapped with crocodile clips and a tape recorder. But this method quickly proved ineffective, to say the least. Since the early 1990s, when the first digital phone switches were deployed, wiretaps have been conducted centrally via special interception technology that phone companies are obligated to use under federal law. As charming as the crocodile clip metaphor is, for almost three decades the reality of wiretapping has been that it happens through, yes, a mandated backdoor in the telephone architecture.
More importantly, the Internet is radically different in design from the phone network. The telephone system is centralized: an expensive, top-down network connects to cheap, dumb devices in our homes and offices. In contrast, the Internet is decentralized: instead of a centralized network connecting dumb end-user devices, it's a dumb network connecting smart devices. Today's smartphones are pocket computers that replace our music players and televisions and, by the way, also make phone calls.
That difference in design means that the centralized wiretapping that works on the old telephone system will not work for the Internet. Communications on the Internet can be sent at any time, take different paths, change their paths while in motion, and arrive out of sequence.
This brings me to the most important point: On the decentralized Internet, there is no place to put virtual crocodile clips except on our personal smartphones, computers, and other end-point devices. When Alice and Bob talk to each other, you can either listen on Alice's device or Bob's. There is no one path the communication follows, no one network router or switch that carries all of Alice's and Bob's conversation.
Crocodile in the Machine
Now you might argue that modern Internet communications are far more centralized than they used to be. Today's "client-server" architecture is a lot more like the phone network than the decentralized Internet used to be. Communications travel, not from Alice to Bob, but from Alice to Facebook, then to Bob. WhatsApp, the GCHQ proposal's ripest target for a "ghost user" implementation with its 1.5 billion users securely communicating via end-to-end encryption, operates over Facebook servers. So, if you're GCHQ, Facebook seems to be a good place to add a secret listener.
That assumption is not accurate, though, because that's not the way today's cryptography works. Surveillance of otherwise encrypted conversations cannot take place purely at the centralized office (Facebook) because all the work, from encryption to emojis, is done on the users' devices, not Facebook's network. Eavesdroppers have to put the "ghost user" code on cell phones and computers, too. This gives targets an easy way to evade interception: look for and delete that kind of code on their own devices.
This is a bit complicated; bear with me while I explain why this is true.
Two's company, three's a conference
Today's cryptography works point-to-point, between two parties, even when there are more than two people in the conversation. Without belaboring the math, here's a common scenario. When you and I want to talk to each other securely, the communications software on each of our smartphones selects a random number that will serve as a key. The number my phone chooses is what I want you to use as an encryption key when you talk to me, and the number your phone chooses is the key I will use to talk to you. The largest technical problem of encryption is how we tell each other our respective keys, which cryptographers call "key exchange."
There are a number of ways to do key exchange, and they all follow a similar path. The most common one we use today (called Diffie-Hellman) works like this: I do some math on my key. I send you the result of that math. You take the thing I sent you along with your key, and do some more math. You then send that result back to me. At the end of this dance, I know your key and you know mine. Importantly, we can do this dance completely in public and still, no one else knows our keys.
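For readers who want to see the "math" in that dance, here is a minimal sketch of textbook Diffie-Hellman. It is an illustration under stated assumptions, not how any particular app does it: the prime `P`, the base `G`, and the function names are all chosen for this example, and the parameters are far too weak for real use. In this textbook form, each side sends a public value, and both sides end up deriving the same shared secret, from which working keys can then be derived.

```python
import secrets

# Toy public parameters, assumed for illustration only.
# Real systems use vetted groups or elliptic curves.
P = 2**127 - 1  # a Mersenne prime, far too small for real use
G = 3           # public base, chosen arbitrarily for this sketch

def make_keypair():
    """Pick a random private number and do 'some math' on it."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)              # safe to send in the open
    return private, public

def shared_secret(my_private, their_public):
    """Combine my private number with the value the other side sent me."""
    return pow(their_public, my_private, P)

# Each side generates its own numbers and publishes only the public half.
alice_private, alice_public = make_keypair()
bob_private, bob_public = make_keypair()

# Both sides arrive at the same secret without ever transmitting it.
alice_secret = shared_secret(alice_private, bob_public)
bob_secret = shared_secret(bob_private, alice_public)
print(alice_secret == bob_secret)
```

An eavesdropper sees only `alice_public` and `bob_public`; recovering the secret from those values is what is believed to be computationally infeasible at realistic parameter sizes.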
This key exchange is a miracle of the modern age and it is what makes private Internet communications possible. The encrypted communications protocol that we all use most, TLS, is implicitly two-party. The cryptographic models that describe TLS's security guarantees depend explicitly on the fact that there are only two parties involved. Authentication interfaces and user-visible indicators also make the same assumption.
If that's true, you might ask, how do group chats on our texting apps work securely? Well, many of the present apps (like WhatsApp, iMessage, Signal, and others) emulate a multi-party conversation by encrypting each message for each device of each participant. While the exact details vary from one app to the next, the basic principle is that I am in a two-party conversation with each of the people in the chat at the same time, and they are also in a two-party conversation with each other. Alice is talking to Bob, Alice is talking to Carol, Bob is talking to Carol, all at the same time. So when Alice sends a message, she's sending it to both Bob's and Carol's devices, even though to all three, it appears they're all in the same "room."
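The fan-out described above can be sketched in a few lines. This is a simplified model, assuming one device per person and a pre-shared key for every pair; `toy_cipher`, `pair_keys`, and `send_to_group` are hypothetical names for this example, and the XOR cipher is a stand-in that must never be used for real encryption.

```python
import hashlib
import secrets
from itertools import combinations

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream (toy only).
    Applying it a second time with the same key decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

members = ["alice", "bob", "carol"]

# One independent key per pair: alice-bob, alice-carol, bob-carol.
pair_keys = {
    frozenset(pair): secrets.token_bytes(32)
    for pair in combinations(members, 2)
}

def send_to_group(sender: str, text: str) -> dict:
    """Encrypt the same message separately for every other member."""
    return {
        recipient: toy_cipher(pair_keys[frozenset((sender, recipient))], text.encode())
        for recipient in members
        if recipient != sender
    }

envelopes = send_to_group("alice", "hello room")
for recipient, ciphertext in envelopes.items():
    key = pair_keys[frozenset(("alice", recipient))]
    print(recipient, toy_cipher(key, ciphertext).decode())
```

One "group" message becomes two separate ciphertexts here, one per recipient, which is why the sender's device has to know exactly who the recipients are.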
This fact leads to a raw truth: in order to have end-to-end encryption with multiple ends, each and every end has to know about all the other ends. When a new person joins the conversation, every device of every person in the conversation must establish a new, encrypted two-party connection with the new person. Let's say GCHQ joins the conversation between Alice, Bob, and Carol. Alice's, Bob's, and Carol's cell phones each must now make an encrypted connection to GCHQ. Even under developing message encryption standards, every participant's device requires a complete and accurate roster of the group and their keys.
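To make the bookkeeping concrete, here is a small sketch that counts the two-party connections before and after a fourth party joins. It assumes one device per person (real apps multiply this by the number of devices), and `pairwise_sessions` is a name invented for this example.

```python
from itertools import combinations

def pairwise_sessions(roster):
    """Every pair of participants needs its own two-party connection."""
    return {frozenset(pair) for pair in combinations(roster, 2)}

before = pairwise_sessions(["alice", "bob", "carol"])
after = pairwise_sessions(["alice", "bob", "carol", "ghost"])

# The connections that exist only because the newcomer joined.
new_sessions = after - before

print(len(before), len(after))                   # 3 6
print(sorted(sorted(s) for s in new_sessions))
# [['alice', 'ghost'], ['bob', 'ghost'], ['carol', 'ghost']]
```

Every one of the new connections has an existing participant's device on one end, so none of them can be set up without that device taking part.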
Ghostly Footsteps
The GCHQ authors know this. But a government's exceptional access endpoint must be invisible to everyone in the conversation lest the participants stop speaking freely. What good is a wiretap that everyone knows about? So the proposal includes something else: the app has to lie to the user.
Currently, secure apps alert participants when someone joins or leaves a call, and they all make different choices about what and how much to tell their users. Signal's app notifies everyone about every update to the participant list. Apple's iMessage system tells the account owner about a change to their own devices; if I get a new phone, for example, my laptop and my tablet both display that there is a new device in my account, and thus in my conversations. Not every app has always told people about device changes. These kinds of notifications are integral to maintaining a secure communications network in the modern age. Without coming out and saying it, the GCHQ essay proposes that this security best practice would have to stop.
Even if that happened, participants would still be able to find out when they've made new connections to the spooks. An app might suppress user notification of a "ghost user" joining a conversation, but the device still has all the information it needs for a technologically savvy person to find out whether an interloper is there. After all, in order to transmit the messages of interest, the smartphone must be connected in a two-party conversation with that ghost and have the ghost's key stored in its memory. Like the Wizard of Oz, a government agent may be behind the curtain, but they must be in the same room as Dorothy and her friends.
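The gap between what an app shows and what a device stores can be illustrated with a toy model. Everything here is hypothetical: the `Device` class, its fields, and the `notify` flag are inventions for this sketch and do not describe any real app's internals.

```python
from dataclasses import dataclass, field

@dataclass
class Device:
    # Names the app shows to the user in the participant list.
    displayed_roster: set = field(default_factory=set)
    # Keys the device must actually hold in order to encrypt to each peer.
    session_keys: dict = field(default_factory=dict)

    def add_peer(self, name: str, key: bytes, notify: bool = True) -> None:
        """Store the peer's key; optionally suppress the user-visible notice."""
        self.session_keys[name] = key
        if notify:
            self.displayed_roster.add(name)

    def undisclosed_peers(self) -> set:
        """Peers the device encrypts to but never showed to the user."""
        return set(self.session_keys) - self.displayed_roster

phone = Device()
phone.add_peer("bob", b"bob-key")
phone.add_peer("carol", b"carol-key")
phone.add_peer("ghost", b"ghost-key", notify=False)  # the silent addition

print(phone.undisclosed_peers())  # {'ghost'}
```

Suppressing the notification changes only what is displayed; the stored key remains on the device, and that is the artifact a savvy person could look for.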
In sum, both the network architecture of smart devices on a decentralized network and the mathematics of encryption force the ghost to be on the participants' devices. The crocodile metaphor describes a situation where the eavesdropper is not present, yet listening in. In reality, that situation breaks down completely, leaving nothing but a nostalgic, rhetorical spin.
In my next essay, I tie all of these threads together and show how a "ghost user" will inevitably be exposed, rendering the proposal worthless.
Further Reading
As you have noticed, this issue is incredibly complex and hard to summarize: communications worked in fundamentally different ways before and after the computer revolution, and again before and after the Internet. Here are some additional resources:
Steven M. Bellovin, Matt Blaze, Susan Landau, Stephanie K. Pell, ""
Vassilis Prevelakis and Diomidis Spinellis, ""
Whitfield Diffie and Susan Landau, ""