TheGreatMutato 98 points ago +98 / -0

Former coal miner here. I've also spent some time analyzing the NYT data. The sheer number of inconsistencies in this data is astonishing. The first major flaw I see is that the JSON dumps use floating point for the vote shares. This makes precise analysis all but impossible. I suppose you could parse it as fixed point and do some tricky arithmetic to get more accurate results, but I haven't had the time to mess with that yet. This is assuming NYT is serializing these floats from fixed point values on their backend, which is unlikely.
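The fixed-point idea can be sketched in Python with the standard library. This is a minimal illustration, not the linked script: the sample payload and the candidate keys `trump` and `biden` are made up, and only the `votes` and `vote_shares` names come from this comment.

```python
import json
from decimal import Decimal

# Hypothetical snapshot; the values and candidate keys are invented.
raw = '{"votes": 1000000, "vote_shares": {"trump": 0.497, "biden": 0.491}}'

# parse_float=Decimal passes the literal text "0.497" to Decimal, so the share
# is stored exactly as printed instead of as the nearest binary double.
snap = json.loads(raw, parse_float=Decimal)


def candidate_votes(snapshot, name):
    """Exact product of the integer total and the decimal share."""
    return snapshot["votes"] * snapshot["vote_shares"][name]


trump = candidate_votes(snap, "trump")
biden = candidate_votes(snap, "biden")
print(trump, biden)
```

This only removes binary representation error. Any rounding applied to the shares before they were serialized is still baked into the numbers.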

I took a different approach to analyzing this data. u/PedeInspector had a much simpler and more elegant method which shows the votes switching. I think my analysis is different in that it just points out how utterly nonsensical this data is.

There appears to be a new vote snapshot every 20 minutes. As I'm writing this, there are about 568 of these in my Pennsylvania dump. My idea was to convert the data into a much more reasonable form and analyze the deltas between the number of votes the data is reporting (votes - prev_votes) and the actual number of votes Trump and Biden received per snapshot (votes * vote_shares[candidate] - prev_votes * prev_vote_shares[candidate]). If there is indeed an inconsistency between the total number of votes and the rate at which the vote shares are increasing, this should identify it.
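A rough Python sketch of that comparison follows. It is not the linked script: the input layout (a list of dicts with `votes` and `vote_shares`), the candidate keys, and the choice to take `expected_delta` as the sum of the per-candidate deltas are all assumptions, and the demo snapshots are invented.

```python
def snapshot_deltas(snapshots, candidates=("trump", "biden")):
    """Compare the change in total votes with the change implied by shares."""
    rows = []
    for i in range(1, len(snapshots)):
        prev, cur = snapshots[i - 1], snapshots[i]
        row = {"id": i, "vote_delta": cur["votes"] - prev["votes"]}
        expected = 0
        for name in candidates:
            # Candidate count = total votes * share, rounded to whole votes.
            cur_votes = round(cur["votes"] * cur["vote_shares"][name])
            prev_votes = round(prev["votes"] * prev["vote_shares"][name])
            row[f"{name}_votes"] = cur_votes
            row[f"{name}_delta"] = cur_votes - prev_votes
            expected += cur_votes - prev_votes
        row["expected_delta"] = expected
        rows.append(row)
    return rows


# Invented snapshots, for illustration only.
demo = [
    {"votes": 1000, "vote_shares": {"trump": 0.50, "biden": 0.50}},
    {"votes": 1010, "vote_shares": {"trump": 0.49, "biden": 0.51}},
]
rows = snapshot_deltas(demo)
print(rows[0])
```

A row where `vote_delta` and `expected_delta` disagree is one the share arithmetic cannot reproduce from the reported totals.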

After running my script, it seems there are a lot of minor inconsistencies and some huge ones. The minor ones can be explained away by skeptics as floating point rounding errors (which they may be in some cases). The ones that concern me are the big ones like the one below. Here's a sample from Pennsylvania:

    {
      id: 447,
      vote_delta: 9975,
      expected_delta: 16417,
      trump_votes: 3272692,
      trump_delta: 4967,
      biden_votes: 3220119,
      biden_delta: 11450,
      time: 1604626360
    }

Here we can see the JSON dump is underreporting the number of votes actually received by Trump and Biden, with a huge portion of these unreported votes going to Biden. Our script sees 16,417 votes coming in, but the data only reports 9,975 votes. Trump receives 4,967 votes, whereas Biden receives 11,450. In other words, Biden is receiving more votes than the total number of votes reported for this particular update.
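How large a gap share rounding alone could produce depends on how many decimal places the shares are published with, which is not stated here. A rough sketch, assuming three decimal places (an assumption) and approximating the state total as the sum of the two candidates' counts from the sample:

```python
def share_rounding_bound(total_votes, decimals=3, candidates=2):
    """Rough worst-case drift in summed candidate deltas from share rounding."""
    half_step = 0.5 * 10 ** -decimals       # max error of one rounded share
    per_snapshot = total_votes * half_step  # max error in one candidate's count
    per_delta = 2 * per_snapshot            # a delta subtracts two such counts
    return candidates * per_delta


# Figures taken from the sample record above.
trump_votes, biden_votes = 3272692, 3220119
vote_delta, expected_delta = 9975, 16417

approx_total = trump_votes + biden_votes    # ignores any other candidates
gap = expected_delta - vote_delta
bound = share_rounding_bound(approx_total)  # decimals=3 is an assumption
print(gap, round(bound))
```

Under those assumptions the gap in the sample is 6,442 votes and the bound is roughly 12,986. Each extra decimal place in the published shares would shrink the bound tenfold, so the real precision of the feed matters a great deal for this check.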

This seems to indicate that the vote_shares percentages are increasing in a manner that is incongruent with the total number of votes reported (this is basically what u/PedeInspector and u/TrumanBlack concluded as well). Perhaps the data is including votes for Jo Jorgensen, but not listing her as a candidate in the vote_shares object? Maybe, but it still wouldn't account for these huge discrepancies... unless Jo Jorgensen is way more popular than we realize.

There are cases where it swings in favor of Trump too. I've been looking at this for a few hours now and I honestly can't make sense of any of this data. I think NYT should give an explanation. If their explanation is "we are just reporting the numbers given to us", then the election officials need to give us an explanation. This data is unacceptable. If one of my employees had written a backend and deployed it to production with results like these, he would be fired, and that's with a lot less at stake.

Before looking at this data I originally just thought some states' numbers needed to be called into question and investigated. Now I'm thinking the entire election needs to be re-done. It's irresponsible to do otherwise.

Whatever this so-called "glitch" is, it needs to be identified. Publish the code for the Dominion voting system and let the coal miners audit and test it.

Script here: https://pastebin.com/dJn8DXQa

hitchitch2013 26 points ago +26 / -0

MAN! Some of you guys are so smart!

IvIA6A 16 points ago +16 / -0

Nice analysis. The data is indeed fucky, and I agree with your assessment. We need a new, in-person vote. One day, no mail-in bullshit. Paper only, signatures and voter ID required. You cheat, or try to cheat, 4 years in prison and you never get to vote again.

nowrongwrong 8 points ago +8 / -0

My relatives are all simple folk. Is there any way to turn this script into discrete datapoints that can be plotted on a chart so it can be visualized?
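One possible route, sketched under assumptions: dump the per-snapshot records to CSV so any spreadsheet can chart them. The record layout is taken from the sample in the parent comment; the function name and the choice of columns are hypothetical.

```python
import csv
import io


def rows_to_csv(rows, fields=("time", "vote_delta", "expected_delta")):
    """Serialize selected fields of each record as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# The one record quoted in the parent comment, trimmed to a few fields.
sample = [
    {"id": 447, "vote_delta": 9975, "expected_delta": 16417, "time": 1604626360}
]
csv_text = rows_to_csv(sample)
print(csv_text)
```

Each row is one snapshot, so plotting `vote_delta` and `expected_delta` against `time` gives two lines that can be compared by eye.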

purple_nitrile 5 points ago +5 / -0

I'm very jelly of your skills.