Yeah statistics are gonna blow this shit apart
WE are the Party of Facts
Watch them bring in some MIT guy with decades of experience to tell the court that this is 99.999% unlikely lol
this is all so fucking juicy that it's hard to sleep
Anyone wanna bet we're still gonna see Twitter disclaimers after people with 4 MIT degrees prove the maths beyond doubt?
Is it?
from what i've read . . . still learning the formula
I’m gonna need a serious ELI5 on it. I’ve read about it several times but can’t say I understand how it works. My brain no likey mathy.
i keep reading it and still have 'mathy issues'
KEK Patriot!
Imagine you're looking at the first digit of the numbers 1 through 20. A bit more than half of them (11 out of 20, or 55%) start with one.
Now imagine you're looking at the first digit of the numbers from 1 to 50. About 22% of them start with one.
Now imagine you're looking at the first digit of the numbers from 1 to 99. About 11% of them start with one, the same share as every other digit.
So we have a trend here. In most situations, one is a more common first digit than the other digits. Only in very specific situations is it just as likely as the others.
Now imagine that you're looking at the first digit of the numbers from 1 to 200. It's back up to about 55% of those numbers starting with one. So you can say that the odds of a number starting with one vary between roughly 11% and 55%, depending on the range of numbers you're looking at.
There's more to it than that, but that should at least give you an idea of the basic concept.
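If you'd rather check those percentages than take them on faith, here's a minimal Python sketch that counts them by brute force. The function names are made up for this example. The exact shares come out close to the figures quoted above: 55% for 1 to 20, 22% for 1 to 50, about 11% for 1 to 99, and 55.5% for 1 to 200.

```python
def leading_digit(n: int) -> int:
    """Return the first (most significant) digit of a positive integer."""
    return int(str(n)[0])


def share_starting_with(digit: int, upper: int) -> float:
    """Fraction of the integers 1..upper whose first digit is `digit`."""
    hits = sum(1 for n in range(1, upper + 1) if leading_digit(n) == digit)
    return hits / upper


# The four ranges used in the explanation above.
for upper in (20, 50, 99, 200):
    share = share_starting_with(1, upper)
    print(f"1 to {upper}: {share:.1%} start with 1")
```

Swap the `1` for any other digit to see how its share moves as the range grows.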
Thanks for taking your time to explain buddy! Makes sense.
Approach it like this: think about counting to 100. Benford’s Law basically just talks about how often each first digit appears. So as you count, keep nine separate tallies; each time a number starts with “1” (such as 11, 15, 19, etc.) you make a tally mark in the “ones” column. Same for “twos,” etc.
Let’s count to 3 first. One tally mark in the ones column, twos column, and threes column, but no tally marks in the fours through nines columns. We see that 1 appeared the most often (but was tied with 2 and 3, which also appeared a single time).
If we count to 10, then “1” appears as first digit twice, and all the other columns only get one tally mark. So 1 appeared most often but everything else was tied for second place.
Now count to 20. The numbers 10 through 19 all result in tally marks for the “ones column”; altogether, you get eleven tally marks in the ones column. There are two tally marks in the twos column (for “2” and “20”). Every other column only gets one tally mark. So the ones column is in first place (has the most tally marks), the twos column is in second place, and all the other columns are tied for third place.
See the pattern? You can do this out to 100 if you want. A smaller digit is never behind a larger one in the tally, and most of the time it is ahead.
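The tally-mark exercise is easy to sketch in Python if you don't feel like doing it by hand. This is just an illustration with a made-up function name, not anything official: it keeps the nine columns described above and fills them in while counting from 1 up to whatever number you pick.

```python
from collections import Counter


def first_digit_tallies(upper: int) -> dict[int, int]:
    """Count from 1 to `upper` and tally the first digit of each number."""
    tallies = Counter(int(str(n)[0]) for n in range(1, upper + 1))
    # Always report all nine columns, including the ones still at zero.
    return {digit: tallies.get(digit, 0) for digit in range(1, 10)}


# The same stopping points as the walkthrough, plus 100.
for upper in (3, 10, 20, 100):
    print(f"count to {upper}: {first_digit_tallies(upper)}")
```

Counting to 20 gives eleven marks in the ones column, two in the twos column, and one everywhere else, matching the walkthrough.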
If a sloppy cheater makes up numbers without thinking of Benford’s Law, the numbers will seem “random”/“natural”, but further scrutiny will reveal that they are fishy/improbable. So let’s just pick the numbers 27358, 524, 18327, 937, etc. These look “random”/“natural” to the cheater. The key oversight is that the cheater doesn’t think about how “natural” the SET OF NUMBERS is. So 524 and 5673 look “natural” individually, but taken together you have an improbably high frequency of “5” as the first digit.
Did you forget to add 5673 to your pool of numbers or am I misunderstanding?
I didn’t phrase that part very well. I shouldn’t have said “so let’s pick”; I should have just said “don’t these numbers look random?”
In the last sentence I should have made up another number besides 524. How about 57?
I got ya. Thanks for the clarification!
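For anyone who wants to try the "how natural is the SET of numbers" check themselves, here's a rough Python sketch. It uses the standard Benford formula, where the expected share of first digit d is log10(1 + 1/d), so about 30% for 1 and under 5% for 9. The function names are invented for this example, and the five numbers are just the made-up ones from this thread (with 57 swapped in as suggested), so treat it as a toy: a real check needs a lot of numbers that span several orders of magnitude.

```python
import math
from collections import Counter


def benford_expected(digit: int) -> float:
    """Expected share of `digit` as a first digit under Benford's Law."""
    return math.log10(1 + 1 / digit)


def observed_shares(numbers: list[int]) -> dict[int, float]:
    """Share of each first digit (1-9) actually seen in `numbers`."""
    tallies = Counter(int(str(abs(n))[0]) for n in numbers if n != 0)
    total = sum(tallies.values())
    return {d: tallies.get(d, 0) / total for d in range(1, 10)}


# Toy sample taken from the made-up numbers in this thread.
made_up = [27358, 524, 18327, 937, 57]
seen = observed_shares(made_up)

for d in range(1, 10):
    print(f"digit {d}: expected {benford_expected(d):.1%}, seen {seen[d]:.1%}")
```

In this tiny sample two of the five numbers start with 5, which is far above the roughly 8% Benford's Law expects for that digit. With only five numbers that proves nothing by itself, but it shows the kind of comparison being described.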
It's a forensic accounting technique that is admissible in court as evidence the data has been manipulated.
I don't know if you can convict solely on that, but with the rest of the shit they uncovered, I wouldn't worry too much
Source that it’s admissible in court? Is there a precedent of it being held as valid in court?
i don't believe it's ever been used before, but it will be now
This is the statistical model they used to bust ENRON, used in forensic accounting to detect fraud.
that is what i have learned thus far Patriopede
Good old ENRON. I'm adding an American election to the ole list of things ENRON's case has impacted.
It’s a good claim but the odds are pretty stacked against us... note that all these election issues should be closed by December 8th
It would be much easier to cancel out invalid absentee ballots than to explain complex mathematical concepts to judges lol
my birthday is the 6th, the clarification for Electoral College is the 8th
i believe God & Trump are going to give me a sweet gift