Not-So-Risky Business

Using Risk-Limiting Audits to Keep Elections Safe

Introduction

To err is human. Whether it’s quality control at a factory, sending a meal back to the kitchen, a robust return policy, or the human capacity to forgive, our society has developed a myriad of ways to catch and correct our mistakes. Do we have an equivalent method to catch errors in our elections, whether malicious or otherwise? Turns out, we do: auditing! Today we’ll be exploring how we currently double-check our vote counts and why we should incorporate new techniques that will make it easier than ever to keep our elections safe and secure.

What we do now

Currently, 43 states, plus Washington, DC, require some type of post-election audit to check the accuracy of the initially reported tally. Of these states, the vast majority use a so-called "traditional" post-election audit. Typically, this means manually examining a fixed percentage of the paper ballot records to make sure they match the counts produced by the voting system, which in many cases is electronic.

Below, we have simulated a traditional post-election audit. This is a simulated election between red and green where there were 100 ballots cast. You can fiddle with both the margin of victory and the percentage of ballots you'd like to manually check. The yellow squares represent the ballots that were randomly chosen to audit. Play around with the sliders to see what happens in different scenarios.

Did you notice anything interesting? What happens if green wins in a landslide? Regardless of the margin of victory, a percentage-based audit always counts the same number of ballots. That's not very clever! Even when one side wins in a blow-out, election administrators are legally obligated to count every last ballot in that fixed percentage. Imagine doing this for a large state like California. Even if green won with 99% of the vote, California counties would have to audit 1% of the total ballots cast. That’s about 137,000 ballots to examine by hand, if we use 2008 turnout numbers (when 13.7 million ballots were cast).
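To see why the margin never matters here, the arithmetic of a fixed-percentage audit can be sketched in a few lines. This is only an illustration: the function name `fixed_percentage_sample` is our own, and rounding up to a whole ballot is an assumption rather than a rule taken from any state's law.

```python
import math

def fixed_percentage_sample(total_ballots: int, audit_percent: float) -> int:
    """Ballots to hand-check under a fixed-percentage audit.

    Rounding up to a whole ballot is an assumption made for this sketch.
    """
    return math.ceil(total_ballots * audit_percent / 100)

# The winner's share is never an input, so a squeaker and a landslide
# cost exactly the same amount of hand counting.
for winner_share in (0.51, 0.75, 0.99):
    sample = fixed_percentage_sample(13_700_000, 1)
    print(f"winner share {winner_share:.0%}: hand-check {sample:,} ballots")
```

Every line of output shows the same count, which is the weakness the rest of this article addresses.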

Working smarter, not harder

Several years ago, Professor Philip Stark at the University of California, Berkeley, came up with a new method to achieve a higher level of statistical confidence in election results while counting significantly fewer ballots. This technique is called a “risk-limiting audit,” and election security experts consider it to be the gold standard.

There are several different types of risk-limiting audits. They differ primarily in how they are carried out, but all offer the same strong statistical guarantee in the result. These guarantees are provided by something called the sequential probability ratio test (SPRT).

First, we have what is called a null hypothesis: we assume that the initial count is correct. The alternative hypothesis is the opposite assumption: that for whatever reason (dust in the scanner lens, voter error, Russian interference), the initially reported count is incorrect.

Next, we begin randomly choosing and examining data points. In our case, these are the hand-marked paper ballots cast during the election. Each ballot that we examine that lists the supposed winner gives us more confidence in the initial result; in other words, it favors the null hypothesis. Likewise, each ballot we see that was cast for the loser dampens our confidence; it favors the alternative hypothesis.

Think of this like the old adage, “two steps forward, one step back”. We might not always take a sample that gives us confidence, but on the balance, the totality of samples will give us more confidence in our initial result than not, assuming the initial result is, in fact, correct.
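To make "two steps forward, one step back" concrete, here is a minimal sketch of how a running confidence score might be updated ballot by ballot. It assumes a two-candidate race and the ballot-polling form of the test usually called BRAVO, which this article does not spell out. The names `update_ratio` and `reported_share` are our own, chosen for illustration.

```python
def update_ratio(ratio: float, for_winner: bool, reported_share: float) -> float:
    """Multiply the running score by one ballot's likelihood ratio.

    Assumption: each ballot is weighed as "the winner really got
    reported_share of the vote" versus "the race was actually a tie" (50%).
    """
    if for_winner:
        return ratio * (reported_share / 0.5)    # a step forward
    return ratio * ((1 - reported_share) / 0.5)  # a step back

ratio = 1.0  # start neutral
for for_winner in (True, True, False, True):
    ratio = update_ratio(ratio, for_winner, reported_share=0.8)
    print(round(ratio, 3))
```

With a reported share of 80%, each ballot for the winner multiplies the score by 1.6 and each ballot for the loser multiplies it by 0.4, so the score drifts upward over time as long as the reported result is right.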

So how do we know when to stop collecting samples? We want to do so at the point when we are confident that the supposed winner is the correct one. (We can never be 100 percent sure, of course.) Here, we use a value called the risk limit, which is just the error rate you are willing to tolerate. For example, a risk limit of 10% means that if the reported outcome is wrong, there is at least a 90% chance that the audit will catch it. You can tune this value to your liking based on your risk profile and resource constraints.
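One way to picture the stopping rule is as a finish line for the running score. The sketch below assumes the usual sequential-test threshold of 1 divided by the risk limit, along with the same two-candidate, tie-versus-reported-share setup; neither detail is stated in this article, and the function name is hypothetical. It asks a simple question: in the luckiest case, where every sampled ballot is for the winner, how many ballots must we see before we can stop?

```python
def min_winner_ballots_to_stop(reported_share: float, risk_limit: float) -> int:
    """Fewest consecutive winner ballots that push the score past 1 / risk_limit."""
    if not 0.5 < reported_share <= 1.0:
        raise ValueError("reported_share must be above 0.5 and at most 1.0")
    if not 0.0 < risk_limit < 1.0:
        raise ValueError("risk_limit must be between 0 and 1")
    threshold = 1 / risk_limit  # assumed stopping threshold
    ratio, ballots = 1.0, 0
    while ratio < threshold:
        ratio *= reported_share / 0.5  # each winner ballot raises the score
        ballots += 1
    return ballots

print(min_winner_ballots_to_stop(0.8, 0.10))   # landslide, 10% risk limit
print(min_winner_ballots_to_stop(0.61, 0.10))  # narrower win, same risk limit
```

A stricter (smaller) risk limit raises the finish line, and a narrower margin shrinks each step, so both make the audit longer.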

The margin of victory will also determine how many ballots we need to count. If the margin of victory is greater, then it would be harder for any individual ballot to affect the results, so we can start with more confidence in the outcome. The sequential probability ratio test takes into account both confidence in the result and margin of victory when it calculates how many ballots to count.
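For a rough feel of how margin and risk limit trade off, the sketch below uses a standard textbook approximation for the average length of a sequential test (Wald's approximation). It is an estimate under the assumptions of a two-candidate race and a correct reported result, not the exact calculation behind any particular audit tool, and the name `expected_ballots` is ours.

```python
import math

def expected_ballots(reported_share: float, risk_limit: float) -> int:
    """Approximate average number of ballots to sample before stopping.

    Assumes two candidates, a correct reported result, and sampling with
    replacement. Uses Wald's approximation, so treat it as a rough guide.
    """
    if not 0.5 < reported_share < 1.0:
        raise ValueError("reported_share must be strictly between 0.5 and 1")
    p = reported_share
    # Average gain in log-score per sampled ballot.
    drift = p * math.log(p / 0.5) + (1 - p) * math.log((1 - p) / 0.5)
    return math.ceil(math.log(1 / risk_limit) / drift)

for share in (0.52, 0.55, 0.61, 0.80):
    print(f"winner share {share:.0%}: about {expected_ballots(share, 0.10)} ballots")
```

The estimate climbs steeply as the winner's share approaches 50%, which matches the intuition that close races need far more checking than landslides.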

One last thing about a risk-limiting audit that is somewhat strange is the concept of sampling with replacement. After you check each ballot, you put it back in exactly the same spot before choosing another ballot. This means that you may choose the same ballot more than once! Why would we do this? We already know that a ballot is accurate once we’ve checked it, so there shouldn’t be any need to check it again!

Remember, we are most concerned with the percentage of votes cast for the supposed winner. Have you ever been at the pool when the lifeguards check the chlorine levels? They use a small vial, no matter if it’s a neighborhood pool or a giant, Olympic-sized one. This is because the amount of water in the pool isn’t important, but rather the percentage of chlorine. Similarly, taking a sample with a teaspoon and a giant bucket will both give the same percentage of chlorine in the pool.

If we were to remove the ballot once we’ve checked it, we would establish a relationship between our samples. Simply picking one ballot changes the probabilities of selecting all the other ballots, and changing these probabilities mucks with the assumptions of the SPRT.
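The difference between the two ways of drawing can be sketched with Python's standard `random` module, where `choices` draws with replacement and `sample` draws without. The 80/20 ballot list and the seed are made up for illustration.

```python
import random

ballots = ["green"] * 80 + ["red"] * 20  # a made-up 100-ballot election
rng = random.Random(2024)                # fixed seed so the run is repeatable

# With replacement: every draw is an independent pick from all 100 ballots,
# so the chance of drawing green is 80% every single time.
with_replacement = rng.choices(range(len(ballots)), k=30)

# Without replacement: each pick removes a ballot from the pool, so the
# odds for every later draw shift depending on what came before.
without_replacement = rng.sample(range(len(ballots)), k=30)

print("distinct ballots drawn with replacement:   ", len(set(with_replacement)))
print("distinct ballots drawn without replacement:", len(set(without_replacement)))
```

Drawing without replacement always yields 30 distinct ballots, while drawing with replacement may repeat some. Those repeats are the price of keeping every draw independent of the ones before it.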

Now let’s look at the actual steps required to conduct a risk-limiting audit. As noted earlier, the different types of risk-limiting audit differ mainly in how samples are collected. The method a given election administrator chooses depends largely on the voting equipment they use. Different voting systems have different capabilities for identifying and storing individual ballots, which is critical to the risk-limiting audit math because it affects our ability to sample with replacement.

The method we will examine today is called “ballot polling.” This means we can select any ballot for examination, regardless of where or how it was cast.

The simulation below represents a heavily simplified election. We want to illustrate the process of conducting a risk-limiting audit, so we’ve fixed the values so you can focus on the steps. In this election, 100 votes were cast, 10 at each of 10 precincts. Green won the election with 80% of the vote and the recorded votes reflect this reality. We also fix a risk limit of 10%. The slider represents how many votes we’ve counted during our audit. Move the slider to walk through the audit step by step. You’ll see how every ballot we examine changes our confidence level, and you can see how many ballots it takes to reach the appropriate confidence threshold.

[Interactive audit simulation: Precincts 1 through 10, with 10 ballots each]

A couple of quick things to point out here. The first is that we use multiplication, as it allows us to "accelerate" our progress as we gain confidence. Another is that we can look at the same ballot more than once. This is pretty counterintuitive! Why would we need to verify something if we’ve already manually double-checked it? The answer is that we want to treat every ballot exactly the same on every draw. If we removed ballots after checking them, each remaining ballot would become slightly more likely to be picked with every draw, which could bias our results.
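The walk-through above can be sketched end to end in code. This is a simplified illustration, not the code behind the interactive: it assumes a two-candidate race, the tie-versus-reported-share multiplication described earlier, and a stopping threshold of 1 divided by the risk limit. The function name, the `max_draws` safety cap, and the seed are all our own choices.

```python
import random

def run_ballot_polling_audit(ballots, reported_winner, reported_share,
                             risk_limit, rng, max_draws=10_000):
    """Draw ballots with replacement until the score clears the threshold.

    Returns (draws, ratio). draws is None if the cap is reached first,
    which in a real audit would mean escalating to a full hand count.
    """
    threshold = 1 / risk_limit  # assumed stopping threshold
    ratio = 1.0
    for draws in range(1, max_draws + 1):
        ballot = rng.choice(ballots)  # the ballot stays in the pool
        if ballot == reported_winner:
            ratio *= reported_share / 0.5        # confidence grows
        else:
            ratio *= (1 - reported_share) / 0.5  # confidence shrinks
        if ratio >= threshold:
            return draws, ratio
    return None, ratio

# The simplified election: 100 ballots, green wins with 80%, risk limit 10%.
ballots = ["green"] * 80 + ["red"] * 20
draws, ratio = run_ballot_polling_audit(
    ballots, "green", reported_share=0.8, risk_limit=0.10, rng=random.Random(7)
)
print(f"stopped after {draws} ballots with a score of {ratio:.2f}")
```

Changing the seed changes which ballots are drawn, so the number of ballots examined varies from run to run, but with these settings the audit can never stop before the fifth ballot.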

The Power of RLAs

Now that we’ve seen how different types of audit are conducted, it’s time to put them to the test. Here we’ll see the real power of risk-limiting audits. You’ll have three sliders at your disposal, giving you the ability to set the total number of votes cast, the margin of victory, and the percentage of votes to check for the traditional type of audit. Play around with the sliders. Do you notice anything interesting?

[Interactive comparison: number of ballots audited by a traditional, fixed-percentage audit versus number of ballots audited by an RLA]

Now, try moving the “total ballots cast” slider to the end (14 million). Next, move the margin of victory slider to 11%. Finally, move the traditional audit percentage to ‘3’. This represents the 2008 Presidential race in California, where Barack Obama won 61% of the 13.7 million votes cast. A risk-limiting audit would need to check just 97 ballots to have a high level of confidence in the result (a 90% chance of catching an error). Meanwhile, the traditional 3% audit selected on the slider would mean checking about 411,000 ballots.

How is this possible? The sequential probability ratio test is our secret weapon. As we saw before, with a traditional audit, we always count the same number of ballots. With SPRT, the greater the margin of victory, the fewer ballots we have to count to achieve a similar level of confidence in the result. This is the true power of risk-limiting audits. With significantly fewer ballots to check, election officials can spend their resources auditing other races farther down the ballot and upgrading voting equipment.
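The contrast can be sketched numerically. As before, this is a hedged illustration: the RLA figure uses a textbook approximation (Wald's) for a two-candidate race with a correct reported result, so it lands in the same neighborhood as the 97 ballots quoted above without necessarily matching the interactive exactly. Both function names are our own.

```python
import math

def traditional_audit_size(total_ballots: int, audit_percent: float) -> int:
    """Ballots checked by a fixed-percentage audit (rounded up, by assumption)."""
    return math.ceil(total_ballots * audit_percent / 100)

def approx_rla_size(reported_share: float, risk_limit: float) -> int:
    """Rough average ballot count for a ballot-polling RLA (Wald's approximation)."""
    p = reported_share
    drift = p * math.log(p / 0.5) + (1 - p) * math.log((1 - p) / 0.5)
    return math.ceil(math.log(1 / risk_limit) / drift)

# Same race (61% for the winner, 10% risk limit), three electorate sizes.
for total in (100_000, 1_000_000, 13_700_000):
    fixed = traditional_audit_size(total, 3)
    rla = approx_rla_size(0.61, 0.10)
    print(f"{total:>10,} ballots cast: 3% audit checks {fixed:>7,}, RLA about {rla}")
```

The fixed-percentage column grows with the electorate while the RLA column stays put, because the total number of ballots cast never appears in the RLA estimate.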

Conclusion

So, what have we learned? Here are some key takeaways:

  • The procedure to conduct a ballot-polling, risk-limiting audit is incredibly simple, as is the math behind it.
  • Because we return ballots to the pool after we’ve checked them, the number of ballots we need to check depends on the margin of victory, not the total number of ballots cast.
  • Risk-limiting audits allow us to adjust the amount of risk we want to take on for a given municipality’s risk profile and tolerance.

We hope you’ve enjoyed this whirlwind tour of risk-limiting audits! But this lesson is just the beginning. There are many other great resources at your disposal, some of which are listed below: