Monday, January 12, 2009

Why we keep questioning the recount 

Suppose you are an election official in a precinct. You are collecting ballots with a machine that measures people as they come through the door. At one point your machine breaks down. You replace the machine with a backup, but you fail to turn that machine on for a while. Later it is turned on, and you record your totals.

Later there is a recount, and in the process you discover the error by finding the extra, uncounted ballots. Naturally, since by all appearances they are valid votes, you count them and include them in your totals.

This is not fiction. Ballots were thus discovered in Maplewood, Precinct 6 during the recount process for the Coleman-Franken race.* Folks at the Uptake who reported the incident noted in that article that the Election Night returns in that precinct favored Franken, 45.4% to 39.2%. If the 171 ballots are like the remainder, you'd expect 45.4% of the 171 to go to Franken and 39.2% to go to Coleman, which would give Al 78 and Norm 67. You can just imagine how bummed Norm would be; that finding is expected to cost him 11 votes.

Visiting the precinct results, however, we find Franken added 91 votes from that precinct and Coleman 54. Coleman is out 37 votes, not 11. The Coleman campaign cannot believe its luck.
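
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The numbers are the ones quoted above; the variable names are mine and purely illustrative.

```python
# Figures quoted in the post; names are illustrative, not from any official data file.
found_ballots = 171
franken_share = 0.454   # Election Night share in Maplewood P-6
coleman_share = 0.392

# What you'd expect if the found ballots split like the rest of the precinct.
expected_franken = round(found_ballots * franken_share)
expected_coleman = round(found_ballots * coleman_share)
expected_margin = expected_franken - expected_coleman

# What the recount actually added.
actual_franken, actual_coleman = 91, 54
actual_margin = actual_franken - actual_coleman

print("expected:", expected_franken, expected_coleman, expected_margin)
print("actual:  ", actual_franken, actual_coleman, actual_margin)
```

The gap between the two margins is the surprise the rest of this post tries to put a probability on.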

How likely is this to happen? To understand this I use the heuristic developed during the Rossi-Gregoire recount battle:
...assume that we have a large bag of marbles, they are either red (Rossi votes) or blue (Gregoire votes), there are a total of 856,963 (505,836 blue + 351,127 red) marbles in the bag (it's a BIG bag of marbles) - these numbers are inclusive of the 'new' ballots discovered (336 apparently), or 'enhanced' during the process.

These marbles are uniformly distributed in the bag - like people, they are all mixed up together and there is no formulaic method to tell where one of the 336 new ballots came from or what precincts or demographics characterize those ballots that required 'enhancing'.

We don't know anything about the 171 people who showed up in Maplewood while the machine was down. There isn't any reason to assume they should be distributed differently than the 1100 or so who had their votes already counted. So for this the binomial distribution should work as a method of asking the question "how likely is it, if the bag is 45% blue marbles (Franken), that in a pull of 171 marbles from the bag I would get 91 blue marbles?"

The answer is 0.6%, or about 168 to 1. That's about on a par with the odds you got in March 2008 on the Tampa Bay Rays winning the World Series. The Rays didn't beat those odds. Franken did. As we'll see, he won lots of longshots.
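
Here is a rough sketch of that binomial calculation in Python, using only the standard library. It assumes the setup described above (171 draws, a bag that is 45% Franken, 91 Franken ballots drawn). The function name `binom_pmf` is my own, and the upper-tail figure is an extra I've added for comparison, not something the original calculation used.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent draws with success rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, k, p = 171, 91, 0.45   # draws, Franken ballots found, assumed Franken share

exact = binom_pmf(k, n, p)                                # exactly 91 Franken ballots
tail = sum(binom_pmf(j, n, p) for j in range(k, n + 1))   # 91 or more (added for comparison)
odds_against = (1 - exact) / exact

print(f"P(exactly {k}) = {exact:.4f}")
print(f"P({k} or more) = {tail:.4f}")
print(f"odds against exactly {k}: about {odds_against:.0f} to 1")
```

The exactly-91 probability lands near the 0.6% quoted above. The 91-or-more probability is somewhat larger, since it also counts even more lopsided draws; which of the two is the better yardstick is a judgment call.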

------------------------------------------------------

Now that is purely a forensic exercise. Critics of this piece will say that "well, you can't prove what happened in that precinct." And I can't. I'm not saying it's fraud. You can't tell that from this type of analysis. I'm just saying it's pretty unusual to get that draw of ballots in that particular precinct. If there was no other story, I would probably shrug it off as a curiosity.

But because the race is so close and because there have been other stories, I wondered: Could we use that type of analysis elsewhere? I haven't had a chance yet to drill down to the precinct level across the state. But the 87 counties of Minnesota make an interesting level of analysis. Overall the recount added 1572 votes for the two top candidates. You are right to wonder: How did they miss so many ballots? Wasn't there double counting? We'll get to that in a bit. But the idea is that sometimes machines miss ballots, and sometimes they misread them. Machines do make mistakes; if they didn't, we'd never need a hand recount.

The added totals were 1056 to Franken and 516 to Coleman (total 1572; I'm not talking about any vote changes for Barkley and the rest.) 786 of these came from the dreaded "fifth pile" of wrongly rejected absentee ballots that the two campaigns agreed to.** The Coleman campaign is fighting for more. Given that those ballots broke 481-305 to Franken, again in a race where you'd think they'd go about 50-50, one might hope that the remaining ones, currently frozen out and awaiting adjudication, might lean towards Coleman.

As to the remainder, they still broke Franken's way 575 to 211. That's pretty staggering. Just think about that a minute. You flip a coin 786 times and it comes up heads 575 times. Would you think it's a fair coin? I took the post-election review county totals (which had Coleman +215) and the final recount totals from Monday (which, skipping the absentees, had Coleman -49), and used the same calculation as I just did for Maplewood P-6. The larger the number of ballots the better this calculation is, as the random draw story I'm using depends on not appreciably changing the proportion of Franken and Coleman votes pre- and post-recount. You can view the spreadsheet here.
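
The coin-flip question can be put in numbers with a short Python sketch. This treats the 786 non-absentee additions as fair coin flips, which is only the rhetorical benchmark from the paragraph above; the county-by-county calculation in the spreadsheet uses each county's own vote share instead. Variable names are illustrative.

```python
from math import comb, sqrt

flips, heads = 786, 575   # non-absentee added ballots, and those that went to Franken

# Exact chance of 575 or more heads in 786 fair flips, using integer arithmetic.
favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
tail = favorable / 2**flips

# How many standard deviations above an even split the result sits.
z = (heads - flips * 0.5) / sqrt(flips * 0.25)

print(f"P({heads}+ heads in {flips} fair flips) = {tail:.3e}")
print(f"standard deviations above even: {z:.1f}")
```

Under the fair-coin assumption the probability is vanishingly small. The more relevant question is how it looks against each county's actual partisan lean, which is what the county calculation tries to answer.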

Some counties cannot be used this way because the manual recount showed fewer votes; it's hard to undraw a marble from a bag. Those changes weren't too large, as you can see on the spreadsheet. Franken lost nine votes and Coleman seven in Washington County (Franken had 44% of the two-party vote); Franken lost ten and Coleman five in Clay County (Franken 48% of two-party share.)

Of the remainder (all county percentages are for two-party vote share), here are the five with the largest impact that added at least ten votes:

Five others had very low randomness probabilities (Cass, Kandiyohi, Pine, Polk, and Sherburne), but the changes there were six or fewer. In short, the result came from changes in three very Democratic counties that are implausibly tilted towards Franken. One of them by chance? Maybe. All three? About as likely as your next two NBA champions being the Timberwolves and Oklahoma City. Franken hit more than one longshot.

---------------------------------

So where did it all go wrong for Norm? Did the Coleman campaign do a poorer job on its challenges than the Franken campaign? One hopes not. I could go back and look at the challenges for patterns and I might, but the randomness assumption is harder to hold for challenges.

Could it be this just shows that the larger cities have higher error rates in counting votes than the other areas? That is, could this all be fine and I'm just getting fooled by randomness, as it were? Yes, I suppose that's possible, though given the vitriol hurled at anyone who cast aspersions on Twin Cities election officials, it would be quite an admission. Should Secretary of State Ritchie initiate a review of why Hennepin, Ramsey and St. Louis counties had such large errors in counting? You'll forgive me if I don't hold my breath.

There is also the disturbing question about absentee ballots. Why did they break so badly against Coleman? Did the Franken campaign do a better job of kicking the Coleman-rich absentee ballots to the litigation phase than the Coleman campaign did? I don't know. One way to get at this might be to take the absentee ballot data by precinct, compare it to the vote shares in those precincts and see which were which.

Or it may be that, as I mentioned earlier, the challenge process favored Franken either by aggressiveness, legal skill, dumb luck, or something more nefarious. We may look at that process too. There are many places to look. But the point of this article is simple -- one has to wonder whether the recount got the vote right. I suspect it's the question nobody will really ever answer.

*--I have wondered how the election officials missed this on election night. Why didn't someone reconcile the count of votes on the tape to the count of signatures? It's not important to my story, so I've skipped over this. Maybe someone already has an explanation.

**--On our radio show Saturday, Sarah Janecek pointed, inter alia, to that decision as one of the things that will be addressed in the contest phase of the recount before a judicial panel. The datum offered might be evidence of the effect of that decision, in allowing one side to skew the absentee ballots. But I would not say more about this unless I knew the overall distribution of absentee balloting.

*** -- UPDATE (11:15am): Gary Gross points out to me that in the Times chat, someone mentioned the possibility that the ballot errors are due to senior citizens "that overwelmingly voted for Franken." Well, no. Over 65s broke 43-42 for Coleman, according to a STrib exit poll.
