I was recently asked to contribute to a set of essays being assembled in honor of the Electronic Privacy Information Center‘s 20th anniversary. Here’s a draft:
A. Michael Froomkin
Laurie Silvers & Mitchell Rubenstein Distinguished Professor
University of Miami School of Law
Identity Management looms as one of the privacy battlegrounds of the coming decade. The very term is contested. In its most minimal form it means little more than keeping secure track of login credentials, passwords, and other identity tokens. The more capacious version envisions an ‘identity ecosystem’ in which people’s tools carefully measure out the information they reveal, and in which we all have a portfolio of identities and personae tailored to circumstances. What is more, in this more robust vision, many transactions and relationships that currently require verification of identity move instead to a default of only requiring that a person demonstrate capability or authorization.
A privacy-protective Identity Management architecture matters because the drift towards strong binding between identity and online activities enables multiple forms of profiling and surveillance by both the public and private sectors. Moving to a better system would make a substantial part of that monitoring and data aggregation more difficult. Thus, a privacy-protective Identity Management ecosystem has value on its own or as a complement to a more comprehensive reform of privacy protection, whether EU-style or otherwise. Importantly, given present trends, a reformed ID ecosystem would protect privacy against private monitoring and against illicit public sector surveillance also.
In the US the present and future of privacy seems to fall somewhere between grim and apocalyptic. The NSA seeks to capture all digital data. Law enforcement agencies club together to share surveillance data in fusion centers. Corporate data brokers find new ways to collect and use personal data. Yet, it seems all too likely that data-gathering will remain largely unencumbered by EU-style privacy regulation for the foreseeable future. Data privacy is being squeezed by a technological pincer composed of multiple advances in data collection on the one hand and rapid advances in data collation on the other. Big Data gets bigger and faster, and is composed of an ever-wider variety of information sources collected and shared by corporations and governments.
The catalog of threats to privacy runs from the capture of internet-based communications, to location and communications monitoring via cellphones and license plate tracking. Effective facial recognition is on the horizon. Both public and private bodies increasingly deploy cameras in public, and process and store the results; increasingly too they share data – or at least the private sector shares with the government, whether willingly or otherwise. Plus, as people become more used to (and more dependent on) electronic social and economic intermediaries such as Facebook, Twitter, Instegram, Amazon, and Google, they themselves become key sources of data that others can use to track and correlate their movements, associations, and even ideas – not to mention those of the people around them.
In an environment of increasingly pervasive surveillance of communications, transactions, and movements, the average US person is almost defenseless. Legal limits on data collection tend to lag technical developments. As regards private-sector collection, the dominant largely laisser-faire theory of contract means that privacy routinely falls in the face of standard-form extractions of consent. As regards data collection in public and also data use and re-use, First Amendment considerations might make it difficult to outlaw the repetition of many true facts not obtained in confidence. Furthermore, there is relatively little the average person can do about physical privacy in daily lives. Obscuring license plates is illegal in most states. Many states also make it a crime to wear a mask in public, although the constitutionality of that ban is debatable. Most cell phones are locked, rooting them is neither simple nor costsless, nor does it make it possible to solve all the privacy issues.
Electronic privacy has for years seemed to be an area where privacy tools might make significant dent in data collection and surveillance. Unfortunately, cryptography’s potential is yet to be realized; disk encryption software now ships as an option with major operating systems, but encrypted email remains a specialist item. Cell phones leak information not just via location tracking but through the apps and uses that make the devices worthwhile to most users. Estimates suggest that when one counts senders and recipients, one company – Google – sees half the emails sent nationally. And we now know beyond a reasonable doubt that the NSA has adopted a vacuum cleaner policy towards both electronic communications and location data.
One of the first papers I wrote about privacy, back in 1995, contrasted four types of communications in which the sender’s identity was at least partially hidden. Listed in declining order of privacy protection they were: (1) traceable anonymity, (2) untraceable anonymity, (3) untraceable pseudonymity, and (4) traceable pseudonymity. Encouraging untraceable anonymity has for years seemed to me be one of the best routes to the achievement of electronic privacy. “Three can keep a secret if two of them are dead”: If people could transact and communicate anonymously, then the exchange would by its nature remain outside the ever-expanding digital dossiers. But even though we have increasingly reliable privacy-enhanced communications through systems like Tor, and even though at least a segment of the public has demonstrated an appetite for semi-anonymous cryptocurrency (cf. the Bitcoin fiasco), the fact remains that for most people most of the time, anonymous electronic communication, much less anonymous transactions, are further and further out of reach because tracking and correlating technologies are getting better all the time. Whether due to the use of MAC numbers to track equipment, cookies and browser fingerprints to track software and its users, or cross-linking of location data with other data captures be it phones, faces, loyalty cards, self-surveillance, the fact is that anonymity is on the ropes even before we get to the various impediments in the US, and even more in other countries, to real anonymity.
A focus on Identify Management involves a shift from anonymity to pseudonymity. Plus, if one is being realistic about the legal environment, any robust identity management likely will have substantial traceability in it. Useful, attractive, Identity Management tools can only exist if we first create a legal and standards-based infrastructure that supports them. In the US, at least, the legal piece of that infrastructure will require action by the federal government. Although actors within the Obama Administration have signaled support for strong identity management in the “National Strategy for Trusted Identities in Cyberspace (NSTIC)“, not all parts of this Administration are speaking in unison. Worse, the early signs are the NSTIC implementation will fall far short of its potential.
NSTIC is almost unique among recent government pronouncement about the regulation of the Internet domestically.1 The typical government report on cyberspace is long on the threats of cyber-terrorism, money laundering, and (sometimes) so-called cyber-piracy (unlicenced digital copying), and gives at most lip service to the importance of privacy and individual data security. The exceptions are reports on the dangers of ID theft – which seem mostly to stress caution in Internet rather than secure software – and NSTIC itself. NSTIC envisions an “Identity Ecosystem” guided by four key values:
- Identity solutions will be privacy-enhancing and voluntary
- Identity solutions will be secure and resilient
- Identity solutions will be interoperable
- Identity solutions will be cost-effective and easy to use
These are good goals, and to realize them would be a substantial achievement. Even if it is limited to cyberspace – in other words, even if it does not directly address the problems of surveillance in the physical world – in this list lie the seeds for an ‘ecosystem’ based on enabling law and voluntary standards that could very substantially enhance data privacy by allowing people to compartmentalize their lives and by creating obstacles to marketers and others stitching those compartments together.
The problem that NSTIC could solve is that without some sort of intervention both the interests of marketers, law enforcement, and (in part as a result) hardware and software designer most frequently tend towards making technology surveillance-friendly and towards making communications and transactions easily linkable. If we each have only one identity capable of transacting, and if our access to communications resources, such as ISPs and email, requires payment – or even just authentication – then all too quickly everything we do online is at risk of being joined to our dossier. The growth of real-world surveillance, and the ease with which cell phone tracking and face recognition will allow linkage to virtual identities, only adds to the potential linkage. The consequences are that one is, effectively, always being watched as one speaks or reads, buys or sells, or joins with friends, colleagues, co-religionists, fellow activists, or hobbyists. In the long term, a world of near-total surveillance and endless record-keeping is likely to be one with less liberty, less experimentation, and certainly far less joy (except maybe for the watchers).
Robust privacy-enhancing identities – pseudonyms – could put some breaks on this totalizing future. But in order for identities to genuinely serve privacy in a new digital privacy ecosystem, these roles need to have capabilities to transact, at least in amounts large enough to purchase ISP and cell phone services. And we need a standards that ensure our hardware does not betray our identities: using different identities on the same computer or the same cell phone must not result in the easy collapse of multiple identities into one. Thus, given the current communications infrastructure, computers and phones must have a way of alternating among multiple identities, down to the technical (MAC, IPv6, and IMEI number) level.
In its most robust form, we would have true untraceable pseudonymity powered by payer-anonymous digital cash. But even a weaker form, one that built in something as ugly as identity escrow – ways in which the government might pierce the identity veil when armed with sufficient cause and legal process – would still be a substantial improvement over the path we are on. It is possible to imagine the outlines of a privacy-hardened identity infrastructure that fully caters to all but the very most unreasonable demands of the law enforcement and security communities. In this ecosystem, we would each have a root identity, as we do now, and we would normally use that identity for large financial transactions. In addition, however, everyone would have the ability to create limited-purpose identities that would be backed up by digital certificates issued by an ID guarantor – a role banks for example might be happy to play. Some of these certificates would be ‘attribute’ certs, stating that the holder is, for example, over 18, or a veteran, or a member of the AAA for 2015. Others would be capability certs, much like credit cards today, stating that the identity has an annual pass to ride the bus, or has a credit line to draw on. (There could be limits on the size of the credit line if there are money laundering concerns, although several banks already offer an option of throw-away credit card numbers for people concerned about using their credit cards online; those cards, however, carry the name of the underlying card-holder while in a privacy-enhanced ID system they would not need to.) We might define a flag that distinguished between personae that are anchored to a real identity and those that are not; the anchored ones would deserve more trust, even if we didn’t know who was behind them.
In time, we would learn to interact online through virtualized compartments – configurable persona. Doing so would enable a stricter, cryptographically enforced, separation between work, home, and play. It would also provide for defense in depth against identity theft – if someone, say, broke into one’s Facebook persona, the attacker would be able to leverage this to the work persona. Furthermore, there would be less need for tight security controls imposed at work to limit (or monitor) private personae – already an increasing problem with corporate-issued cell phones and laptops.
Even this – a much watered-down recipie for limited privacy – is a tall order in today’s United States. It is hard enough to persuade even democratic governments of the virtues of free speech, and even harder to find any enthusiasm for the freer speech that comes from strong pesudonyms. When one gets to the even freer speech that comes from untraceable anonymity, governments get cold feet – and when money is involved, the opposition is only stronger.
The Obama Administration’s National Strategy for Trusted Identities in Cyberspace (NSTIC) raised hopes that the US government might swing its weight towards the design of legal and technical architectures designed to simultaneously increase online security while reducing the privacy costs increasingly imposed as a condition of even access to online content. At present those hopes have yet to be realized. There is much to be done.
- The caveat is important: the US government often seems more willing to talk of anonymization on the Internet as potentially empowering tool for dissidents abroad than for citizens at home. [↩]