Yeah, it's a bit annoying, but the obits seem to follow a consistent naming convention, so you could parse them via the file names. That's what I was going to do.
http://files.usgwarchives.net/pa/allegheny/obits/
You're right to be cautious though. I'm not sure who is hosting it; I just came across it in another comment. The about and history page for the project is not particularly informative: http://usgwarchives.net/projecthistory.htm
And whois just shows GoDaddy...
I was looking for official state-maintained death records but was not able to find much. Maybe I'm not looking in the right place?
Here's the dataset for the obituaries by county per USGW archives
I think it's a list of all registered voters: https://www.pavoterservices.pa.gov/pages/purchasepafullvoterexport.aspx
But you're right, the data seems to be only up to the 2020 General Primary. There's no value for 2020 General Election...
Needs to be upvoted. Was looking for this data set for cross reference! Thanks so much!
You can modify the link to get the data set for other states as well.
Data seems legitimate. It's broken up by county and is more comprehensive than the WI data set (this one has birthdays etc).
Pretty big, numbers are reasonable and seem up to date (number of lines corresponds with registration count).
942321 ALLEGHENY FVE 20201109.txt
311062 YORK FVE 20201109.txt
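If anyone wants to repeat the line-count check without wc, here is a rough Python sketch. It assumes the county exports sit in one directory and match the pattern `* FVE *.txt`, which I am only inferring from the two file names above. It counts raw lines, so if the files carry a header row (I have not confirmed that) the registration count would be one less.

```python
from pathlib import Path


def count_lines(path):
    """Count lines by streaming the file, so a huge export is never held in memory."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def count_all(directory=".", pattern="* FVE *.txt"):
    """Map each matching file name to its line count (the pattern is an assumption)."""
    return {p.name: count_lines(p) for p in sorted(Path(directory).glob(pattern))}


if __name__ == "__main__":
    for name, n in count_all().items():
        print(n, name)
```

The per-county totals can then be compared by hand against the published registration counts.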
I am writing some scripts to evaluate the data. What are some ideas for evaluation? The data set fields are listed below. Unfortunately, there's no birthday to use as a cross-reference against death certificates.
What I am checking for now is whether a voter was registered in a previous year and never voted except in Nov 2020 via absentee.
'Voter Reg Number'
'LastName'
'FirstName'
'MiddleName'
'Suffix'
'PhoneNumber'
'EmailAddress'
'Address1'
'Address2'
'MailingAddress1'
'MailingAddress2'
'MailingCityStateZip'
'HouseNumber'
'StreetName'
'UnitType'
'UnitNumber'
'ZipCode'
'Jurisdiction'
'DistrictCombo'
'Ward'
'Congressional'
'State Senate'
'State Assembly'
'Court of Appeals'
'Multi-Jurisdictional Judge'
'County'
'County Supervisory'
'Municipality'
'Aldermanic'
'School'
'High School'
'Sanitary'
'Technical College'
'Representational School'
'State'
'District Attorney'
'Circuit Court'
'First Class School'
'Incorporation'
'Voter Status'
'Voter Status Reason'
'ApplicationDate'
'ApplicationSource'
'IsPermanentAbsentee'
'Voter Type'
Plus one column per election from Feb 2006 to Nov 2020, each recording how the vote was submitted (At Poll, Absentee, or None/NaN).
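To separate the per-election columns from the voter fields listed above, something like this might work. It is only a sketch, assuming the election columns are named as an English month immediately followed by a four-digit year (e.g. `November2020`). The helper name `election_columns` is mine, not anything from the data set.

```python
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
# Assumed naming scheme: month name immediately followed by a 4-digit year.
ELECTION_RE = re.compile(r"^(%s)(\d{4})$" % "|".join(MONTHS))


def election_columns(header):
    """Return (column_name, year, month_number) for each election-style column."""
    found = []
    for name in header:
        m = ELECTION_RE.match(name.strip())
        if m:
            found.append((name, int(m.group(2)), MONTHS.index(m.group(1)) + 1))
    return found
```

Anything that does not match (LastName, Voter Status, and so on) is simply skipped, so the same header list can be passed in unchanged.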
Thank you so much for doing this!
Not sure you will see this, and apologies for perhaps asking too much, but would you be able to provide a CSV of the dataset as you did for the previous set you posted ($345k one)?
I'm running some scripts over the data, and a CSV version would be a bit friendlier to work with.
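In the meantime, a do-it-yourself conversion could look roughly like the sketch below. It assumes the .txt file is tab-delimited and UTF-8 encoded, neither of which I have confirmed. The function name `tsv_to_csv` is just for illustration.

```python
import csv


def tsv_to_csv(src, dst, encoding="utf-8"):
    """Stream a tab-delimited file into a CSV file; returns the number of rows written."""
    rows = 0
    with open(src, newline="", encoding=encoding) as fin, \
         open(dst, "w", newline="", encoding=encoding) as fout:
        # QUOTE_NONE: treat quote characters in the source as ordinary text (assumption).
        reader = csv.reader(fin, delimiter="\t", quoting=csv.QUOTE_NONE)
        writer = csv.writer(fout)  # quotes any field that contains a comma
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows
```

It goes row by row, so the size of the file should not matter much.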
Quick tip: the file is huge, so if you just want a quick look and you're familiar with the command line, use head (cat prints the whole file, so pipe it through head or less). You can also write parts of the file out to another file to work in small batches.
By default head prints the first ten lines, but you can pass -n with a number to print just one or two lines.
head -n 1 2152_5757.txt
Voter Reg Number LastName FirstName MiddleName Suffix PhoneNumber EmailAddress Address1 Address2 MailingAddress1 MailingAddress2 MailingCityStateZip HouseNumber StreetName UnitType UnitNumber ZipCode Jurisdiction DistrictCombo Ward Congressional State Senate State Assembly Court of Appeals Multi-Jurisdictional Judge County County Supervisory Municipality Aldermanic School High School Sanitary Technical College Representational School State District Attorney Circuit Court First Class School Incorporation Voter Status Voter Status Reason ApplicationDate ApplicationSource IsPermanentAbsentee Voter Type November2020 August2020 May2020 April2020 February2020 April2019 February2019 November2018 October2018 August2018 June2018 May2018 April2018 February2018 January2018 December2017 April2017 February2017 November2016 August2016 April2016 February2016 December2015 November2015 October2015 September2015 July2015 June2015 May2015 April2015 February2015 November2014 October2014 September2014 August2014 May2014 April2014 February2014 December2013 November2013 October2013 September2013 May2013 April2013 February2013 December2012 November2012 August2012 June2012 May2012 April2012 February2012 November2011 October2011 August2011 July2011 May2011 April2011 February2011 November2010 September2010 April2010 February2010 April2009 February2009 November2008 September2008 April2008 February2008 April2007 February2007 November2006 September2006 April2006 February2006
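If you would rather stay in Python than use the shell, a rough equivalent of head plus the "small batches" idea is sketched below. The function names are mine, and UTF-8 encoding is an assumption (undecodable bytes are replaced rather than raising).

```python
from itertools import islice


def head(path, n=10, encoding="utf-8"):
    """Return the first n lines of a file without reading the rest."""
    with open(path, encoding=encoding, errors="replace") as f:
        return [line.rstrip("\n") for line in islice(f, n)]


def copy_lines(src, dst, start=0, count=1000, encoding="utf-8"):
    """Copy `count` lines starting at 0-based line `start` from src into dst."""
    with open(src, encoding=encoding, errors="replace") as fin, \
         open(dst, "w", encoding=encoding) as fout:
        chunk = list(islice(fin, start, start + count))
        fout.writelines(chunk)
    return len(chunk)
```

For example, `head("2152_5757.txt", 1)` should give the header line, and `copy_lines` lets you carve out a small sample file to test scripts on.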
Oh, I am working on something similar for the WI dataset that was posted. That dataset seems better suited for checking voter history because it has a column for every election by month and year, and each value is NaN, Absentee, or At Poll.
The thing to check, though, is that the voter's registration date is earlier than 2020. Otherwise every field except Nov 2020 would be NaN anyway, simply because they registered just before the Nov election.
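Here is roughly how I would express that check per row. It is a sketch with several assumptions: rows arrive as dicts keyed by the header names (e.g. from csv.DictReader), `ApplicationDate` contains a standalone four-digit year (the real date format is not confirmed), and an empty cell, "nan", or "none" all mean no vote. The function names are hypothetical.

```python
import re

NO_VOTE = {"", "nan", "none"}  # assumed spellings of "did not vote"
YEAR_RE = re.compile(r"(?<!\d)(\d{4})(?!\d)")  # a 4-digit group not inside a longer number
ELECTION_RE = re.compile(
    r"^(January|February|March|April|May|June|July|August|"
    r"September|October|November|December)\d{4}$")


def registration_year(row):
    """Pull a year out of ApplicationDate, or None if no standalone year is found."""
    m = YEAR_RE.search(row.get("ApplicationDate") or "")
    return int(m.group(1)) if m else None


def only_vote_is_absentee_2020(row, target="November2020"):
    """True if registered before 2020 and the only recorded vote is an absentee one in `target`."""
    year = registration_year(row)
    if year is None or year >= 2020:
        return False
    if (row.get(target) or "").strip().lower() != "absentee":
        return False
    for col, val in row.items():
        if not isinstance(col, str) or col == target or not ELECTION_RE.match(col):
            continue  # skip non-election fields and the target column itself
        if (val or "").strip().lower() not in NO_VOTE:
            return False  # voted in some earlier election
    return True
```

A row that passes is only a candidate for a closer look. Someone who registered years ago and voted for the first time in 2020 would match too.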