As a web developer, I'd say there's a plausible explanation for oscillating counts like this. If the election sites are frequently hitting a few different caching servers to fetch data and one of them returns an old count, then depending on how the servers are set up they could temporarily show old data until the databases become consistent. To prove this you'd have to examine the frontend and backend code. I'd put the odds of this being the explanation at extremely small, but it's possible that something like this, or some other technical issue, is the reason.
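To make the caching idea concrete, here is a toy simulation, not a model of any real election system. Everything in it is made up for illustration: the two cache nodes, their refresh intervals, the round-robin balancer, and the 100,000-vote increments. It only shows that a count which never decreases at the source can still appear to go down to a reader who is routed between caches of different freshness.

```python
from itertools import cycle


class CacheNode:
    """Hypothetical cache server that keeps its own copy of the count."""

    def __init__(self, name, refresh_every):
        self.name = name
        self.refresh_every = refresh_every  # refreshes only every N ticks
        self.cached = 0

    def maybe_refresh(self, tick, source_count):
        # Pull the current count from the source only on refresh ticks.
        if tick % self.refresh_every == 0:
            self.cached = source_count


def simulate(ticks=8):
    nodes = [CacheNode("fast", 1), CacheNode("slow", 4)]
    balancer = cycle(nodes)  # round-robin between the two caches
    source_count = 0
    seen = []
    for tick in range(ticks):
        source_count += 100_000  # the true count only ever grows
        for node in nodes:
            node.maybe_refresh(tick, source_count)
        seen.append(next(balancer).cached)  # what the website would display
    return seen


readings = simulate()
print(readings)
```

Every second request lands on the slow node, so the displayed count drops back whenever that node's copy is older than the fast node's.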
When you say “extremely small”, are you talking about the caching issue as the alternate cause? In a normal world I’d give your explanation dramatically higher probability...
Always great to get other ideas/explanations.
When you think about the implementation of what NYT is showing (the client side is still accessible), is there any way to get a sense of how they might have generated the graph? That might give clues if the graph generation happens in client-side JS. I’m not a web developer. What do you think?
I have no real knowledge of how media outlets get their data from states, what technology is involved and how this industry works, what the server setup is, etc.
The source cited on the website is Edison Research, which appears to be this company: https://www.edisonresearch.com/ I have no idea where they get their data, where and how it's transferred to the NYT, etc.
You can look at the timeseries object in the API output to examine the data.
Look at timestamp 2020-11-04T05:07:23Z.
Biden had a 52.4% to 46% lead with 3,572,807 votes in and an estimated 76% of the votes processed (I assume that's what eevp stands for).
**The next timestamp, 2020-11-04T05:12:38Z, shows Biden at 48.2% and Trump at 50.2% with 3,199,165 votes in (fewer than at the previous timestamp), still at an estimated 76% of the votes processed.**
**The timestamp after that, 2020-11-04T05:26:21Z, shows 3,390,813 votes in (also fewer than at 2020-11-04T05:07:23Z), with the estimate still at 76%.**
The next timestamp at 2020-11-04T05:26:48Z has the estimated vote count go to 80% with 3,782,386 votes.
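The four timestamps above can be checked mechanically. This is a minimal sketch using only the numbers quoted in this post. The field names `timestamp`, `votes` and `eevp` are my assumption about how the timeseries entries are keyed, so adjust them to whatever the real JSON uses. It flags any entry whose total is below the highest total seen earlier.

```python
from datetime import datetime

# Values quoted in the post; the key names are assumed, not confirmed.
timeseries = [
    {"timestamp": "2020-11-04T05:07:23Z", "votes": 3_572_807, "eevp": 76},
    {"timestamp": "2020-11-04T05:12:38Z", "votes": 3_199_165, "eevp": 76},
    {"timestamp": "2020-11-04T05:26:21Z", "votes": 3_390_813, "eevp": 76},
    {"timestamp": "2020-11-04T05:26:48Z", "votes": 3_782_386, "eevp": 80},
]


def parse_ts(text):
    """Parse timestamps of the form 2020-11-04T05:07:23Z."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")


def find_regressions(rows):
    """Return (row, earlier_high) pairs where the total fell below an earlier high."""
    ordered = sorted(rows, key=lambda r: parse_ts(r["timestamp"]))
    flagged = []
    high = None
    for row in ordered:
        if high is not None and row["votes"] < high["votes"]:
            flagged.append((row, high))
        else:
            high = row  # new running maximum
    return flagged


regressions = find_regressions(timeseries)
for row, high in regressions:
    drop = high["votes"] - row["votes"]
    print(f'{row["timestamp"]}: {row["votes"]:,} votes, '
          f'{drop:,} below {high["timestamp"]}')
```

Running the same function over the full timeseries would show whether these two entries are the only ones of this kind.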
The bolded data in the middle is odd. Why did the NYT API give out old data while the estimated vote total stayed at 76%? Was it the New York Times or Edison that might have screwed up? There's more than one explanation. Here are a few possibilities.
It was some attempt at rigging votes, and somehow they screwed up the process. It's impossible to know if it was Edison, NYT, or the state that did this. A mathematician diving into the timestamps here would probably be better able to tell if the numbers make sense.
Some data entry somewhere in the process was wrong, and some intern entered old data by accident in their election reporting system.
Software bugs. Running a big live event like this is really hard. You can run tests and hope everything is working, but you're under a lot of pressure, with what I assume are a lot of moving parts, while your website is under extreme traffic load.
Somewhere in the process from state to Edison to NYT, some caching server somehow served up old data. I don't know election technology setups at all, so it's tough to comment further.
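As a first pass at checking whether the numbers make sense, the percentages and totals quoted above can be turned into implied per-candidate counts. This is only rough arithmetic: the shares are rounded to one decimal place in the post, so the counts are approximate, and the helper name `implied_counts` is mine, not anything from the NYT data.

```python
# (timestamp, total votes in, vote shares) as quoted in the post.
snapshots = [
    ("2020-11-04T05:07:23Z", 3_572_807, {"Biden": 0.524, "Trump": 0.460}),
    ("2020-11-04T05:12:38Z", 3_199_165, {"Biden": 0.482, "Trump": 0.502}),
]


def implied_counts(total, shares):
    """Approximate per-candidate votes from a total and rounded shares."""
    return {name: round(total * share) for name, share in shares.items()}


before = implied_counts(snapshots[0][1], snapshots[0][2])
after = implied_counts(snapshots[1][1], snapshots[1][2])
changes = {name: after[name] - before[name] for name in before}

for name, delta in changes.items():
    print(f"{name}: {before[name]:,} -> {after[name]:,} ({delta:+,})")
```

With these rounded inputs both implied totals come out lower at the second timestamp, Biden's by much more than Trump's. That by itself doesn't distinguish between the possibilities listed above.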
All I can really contribute is this. Here is the page that graph appears on. https://www.nytimes.com/interactive/2020/11/03/us/elections/results-virginia-president.html
This appears to be the API endpoint for NYT where that page populates the data from. https://static01.nyt.com/elections-assets/2020/data/api/2020-11-03/race-page/virginia/president.json
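If you want to pull the data yourself, something like the following should work, assuming the endpoint is still being served. I'm not relying on the exact nesting of the JSON here: the helper `find_key` (my own name) simply searches the whole document for the first key called `timeseries`. The `sample` dict at the bottom is a made-up stand-in so the search can be tried offline.

```python
import json
from urllib.request import urlopen

URL = ("https://static01.nyt.com/elections-assets/2020/data/api/"
       "2020-11-03/race-page/virginia/president.json")


def find_key(node, key):
    """Depth-first search for the first value stored under `key`."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def load_timeseries(url=URL):
    """Download the JSON and return its first `timeseries` object, if any."""
    with urlopen(url, timeout=30) as resp:
        return find_key(json.load(resp), "timeseries")


# Offline check with a made-up nesting (not the real document layout).
sample = {"data": {"races": [{"timeseries": [{"timestamp": "2020-11-04T05:07:23Z"}]}]}}
print(find_key(sample, "timeseries"))
```

Calling `load_timeseries()` needs network access and returns `None` if no `timeseries` key is found.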