Algorithmic Bias: How the Clinton Campaign May Have Lost the Presidency or Why You Should Care

1 December 2016, 1717 EST

This post is a co-authored piece:

Heather M. Roff, Jamie Winterton and Nadya Bliss of Arizona State’s Global Security Initiative

We’ve recently been informed that the Clinton campaign relied heavily on an automated decision aid to inform senior campaign leaders about likely scenarios in the election. This algorithm—known as “Ada”—was a key component, if not “the” component, in how senior staffers formulated campaign strategy. Unfortunately, we know little about the algorithm itself. We do not know all of the data that was used in the various simulations it ran, or what its programming looked like. Nevertheless, we can be fairly sure that demographic information, prior voting behavior, prior election results, and the like were among the variables, as these are stock in trade for any social scientist studying voting behavior. What is more interesting, however, is that we are fairly sure there were other, less straightforward variables that ultimately contributed to Clinton’s inability to see the potential losses in states like Wisconsin and Michigan, and the near loss of Minnesota.

To see why “Ada” didn’t live up to her namesake (Ada, Countess of Lovelace, the progenitor of computing), we have to delve into what an algorithm is, what it does, and how humans interact with its findings. This matters for many of us trying to understand not merely what happened this election, but also how increasing reliance on algorithms like Ada can fundamentally shift our politics and blind us to the limitations of big data. Let us begin, then, at the beginning.

For the non-tech savvy, an algorithm is a procedure with certain characteristics: finiteness (it terminates or halts); definiteness (every step is precisely specified); effectiveness (each operation can be performed in a finite length of time); and at least one input and at least one output. It is as simple as that. The algorithm itself can be highly complex, with many steps or subroutines, but in the abstract it is a written procedure.
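To make those properties concrete, here is a minimal sketch in Python (our own toy illustration, nothing to do with Ada): a tiny vote-tallying procedure that takes an input, follows precisely specified steps, completes each step in finite time, halts, and produces an output.

```python
def tally_votes(ballots):
    """A toy algorithm with the properties above: one input (a list of
    ballots), precisely specified steps, each finite, and it halts with
    one output (a dictionary of counts)."""
    counts = {}
    for ballot in ballots:                          # finiteness: one pass over a finite list
        counts[ballot] = counts.get(ballot, 0) + 1  # definiteness: an exact update rule
    return counts                                   # output

print(tally_votes(["A", "B", "A", "C", "A"]))       # {'A': 3, 'B': 1, 'C': 1}
```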

Where we get into trouble is in the procedural knowledge representation. Beyond thinking about variables like race, sex, party identification, zip code, past voting behavior and the like, we need to think about how these variables interact with one another, as well as the assumptions we make about them when formulating or designing the algorithm. We also want algorithms to have internal and external validity: we want them to possess the characteristics above (internal validity), and we want their outputs to have high fidelity with the real world (external validity). That fidelity can be as basic as pattern matching, or more complex, as in Clinton’s case, where the task was prediction.
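As a purely hypothetical illustration (the variables, weights, and interaction term below are our invention, not anything we know about Ada), two variables can each look benign on their own while their interaction drives the prediction:

```python
# Hypothetical weights and variables, invented for illustration only.
def turnout_score(past_turnout, enthusiasm, interaction_weight=0.8):
    score = 0.4 * past_turnout + 0.3 * enthusiasm             # main effects
    score += interaction_weight * past_turnout * enthusiasm   # interaction term
    return score

# Identical past turnout, very different predictions once the interaction kicks in.
print(turnout_score(past_turnout=0.9, enthusiasm=0.9))  # ~1.28
print(turnout_score(past_turnout=0.9, enthusiasm=0.2))  # ~0.56
```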

So what happened with Clinton’s Ada? We submit there are three problems: heuristics, computational complexity, and overly demanding cybersecurity. Heuristics, particularly in simple search algorithms, act like a rule of thumb. For instance, when I have lost my car keys, I usually undertake a search for them in my house. My rule of thumb is to look first in the common places where I usually deposit my keys upon entering. Next, I might expand my search to countertops generally, or tables, or desks. If that fails, I update my search again, this time perhaps retracing my steps. Each of these ways of looking for my keys is a heuristic. It is a process, but one that carries a level of uncertainty in its outcome. We are willing to wager that some of the heuristics used in Ada’s simulations were the result of faulty reasoning and assumptions about voter behavior, apathy or discontent.
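The lost-keys example can be written down as a minimal heuristic search (a sketch of our own, with hypothetical locations): an ordered list of guesses that often succeeds quickly, but comes with no guarantee of finding the keys at all.

```python
# Hypothetical locations; the ordering is the heuristic, not a guarantee.
def find_keys(search_order, keys_location):
    for step, location in enumerate(search_order, start=1):
        if location == keys_location:
            return f"found in '{location}' after {step} looks"
    return "not found: the heuristic never considered the right place"

heuristic_order = ["hook by the door", "kitchen counter", "desk", "coat pocket"]
print(find_keys(heuristic_order, "kitchen counter"))  # quick success
print(find_keys(heuristic_order, "car ignition"))     # the heuristic fails outright
```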

Second is a deeper problem about computational complexity. We cannot do these complex ideas full justice here, but the issue is the very nature of figuring out what people will do in an interdependent and non-rational world. In terms of algorithm design, we think Clinton’s Ada was bounded in ways that kept her from predicting what would happen, because prediction in this election was essentially a problem of exponential complexity (an NP-complete problem in computer-science-speak, one for which no polynomial-time solution has been found). An NP-complete problem is not unsolvable, but as far as anyone knows, solving it exactly requires time that grows exponentially in the number of inputs. Typically, the approach is to develop an approximation algorithm, one that can quickly find a reasonable solution. The approximation can be well bounded (meaning you have a guarantee that your solution is within some multiple of the optimal one), or it can be unbounded, in which case the quality of the solution is not well defined (that would be our guess here). Given the need for an approximation to this kind of NP-complete problem of predicting voter behavior in key battleground states, as well as the reliance on heuristics, it is likely that Ada did not actually have the kind of confidence needed to predict accurately. The algorithm may have yielded a potential solution, but the humans on the other side may not have understood the validity of that solution. In short, the predictions Ada made didn’t match reality.
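To make the idea of a bounded approximation concrete, here is a small sketch (our own example, unrelated to Ada’s internals) of the classic greedy approximation for vertex cover, an NP-complete problem: the greedy answer is guaranteed to be at most twice the size of the optimal cover, which is exactly the kind of guarantee an unbounded approximation lacks.

```python
# Vertex cover is NP-complete, yet this greedy matching heuristic always
# returns a cover at most 2x the size of an optimal one (a *bounded*
# approximation). The graph below is a made-up toy example.
def approx_vertex_cover(edges):
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            # Take both endpoints: any optimal cover must include at least
            # one of them, which is where the factor-of-two bound comes from.
            cover.update((u, v))
    return cover

toy_edges = [(1, 2), (2, 3), (3, 4), (4, 1), (2, 4)]
print(approx_vertex_cover(toy_edges))  # {1, 2, 3, 4}; an optimal cover here is {2, 4}
```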

Finally, there is also something worrisome about the way the campaign used Ada: it was secured to the point that only a few top advisors had access to it, and it was kept away from other computers and internet connections for cybersecurity purposes. The Clinton campaign knew it had to protect Ada from hackers, nation-state adversaries, and even the press to ensure that its data sources and outputs were pristine and reliable. Manipulating Ada would’ve meant affecting campaign strategy at all levels, and in ways that may not have been immediately apparent. But did the campaign’s efforts to ensure the cybersecurity of a critical strategic component actually end up being a vulnerability?

Few people knew of Ada’s existence, much less how the algorithm worked. The insular environment provided an effective barrier against manipulation (or so we assume), but it also shielded Ada’s outputs from feedback and criticism. Harsh restrictions on access created a false sense of security and potentially limited the campaign’s understanding of where Ada may have gone wrong. Cybersecurity isn’t just about building walls to keep bad things out; it must also include resilience, the ability to operate under adverse circumstances. By attempting to give Ada the ultimate security, the campaign failed to understand the fragility of its algorithm, which may then have led to fatal errors in strategy.

While algorithmic bias is a hotly debated topic, these debates usually center on secretive algorithms that significantly impact the fate of an individual through racial profiling, housing discrimination, or judicial bias.  Ada’s flaws and their consequences show that the effects of algorithmic bias can happen to any of us, regardless of our status or power. Perhaps this is the necessary impetus to demand diversity in development, more responsible algorithms, and less secrecy behind their inner workings.