It is often said in genealogy (or any field of research, really) that “extraordinary claims require extraordinary evidence”. Fair enough. But how does one know what is an extraordinary claim? We can extrapolate from our own experience and society, but our ancestors often lived in very different times and circumstances. What might seem extraordinary or strange to us might have been quite common to them and vice versa.
I'm a moderator on the Mathematica.SE site, so I'm aware of the social network analysis capabilities in Mathematica, even though I've never really used them. Those capabilities, and a recent post on Stephen Wolfram's blog, got me thinking about the genealogical equivalent.
The big sites like Ancestry.com, FamilySearch.org and FindMyPast.co.uk have all collected literally millions of trees and built up profiles of tens of millions of individuals and families. It seems obvious that one could build up statistical statements about what was most common historically, and then people could compare their current hypotheses about their own ancestors to these statements to help decide if they are making “extraordinary claims”. I’ll give some examples to show what I mean. Bear in mind that the following are all completely made up, but are the sort of thing I had in mind:
15% of men and 16% of women born in 18th century England married more than once; only 2% of either men or women married more than twice. (OK, so my great^4 grandmother probably didn't marry five times, and those are different Alice Smiths...)
94% of non-noble people born in Kent between 1650 and 1800 married someone born within twenty miles of them. Fewer than 1% married someone born more than 50 miles away. (OK, so I can probably rule out that baptism from Lancashire as being for my great^5 grandfather who married and died in Dorset...)
Only 10% of men born in the 1700s in England who subsequently married did so for the first time after they turned 30, and if their parents married before the age of 25, only 3% married after the age of 30. (Hmm, so my great^5 uncle is probably this guy, not the one who married eight years later. So his wife is most likely to be this woman, not that one...)
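To make the idea concrete, here is a minimal sketch of how a statement like the first one could be computed. Everything in it is an assumption for illustration: the record layout, the `share_with_more_than` helper and the eight made-up people are mine, and none of the sites mentioned above is known to expose data in this form.

```python
# Hypothetical, made-up records: (person_id, sex, number_of_marriages).
# A real data set would also need birth place and period to filter on.
records = [
    ("p1", "M", 1),
    ("p2", "F", 1),
    ("p3", "M", 2),
    ("p4", "F", 3),
    ("p5", "M", 1),
    ("p6", "F", 1),
    ("p7", "M", 1),
    ("p8", "F", 2),
]


def share_with_more_than(records, sex, threshold):
    """Fraction of people of the given sex with more than `threshold` marriages.

    Returns None when the sample contains nobody of that sex.
    """
    counts = [n for _, s, n in records if s == sex]
    if not counts:
        return None
    return sum(1 for n in counts if n > threshold) / len(counts)


if __name__ == "__main__":
    for sex in ("M", "F"):
        print(sex, "married more than once:", share_with_more_than(records, sex, 1))
        print(sex, "married more than twice:", share_with_more_than(records, sex, 2))
```

With millions of profiles rather than eight, the same kind of count, restricted to a place and period, would give the sort of baseline rate I have in mind.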
So my question is: has anyone done this, even over a restricted field such as a particular century or country? Surely there is a statistically-minded Mormon over at FamilySearch or wherever who has thought of doing this? While the exceptional cases clearly can still have happened, it would surely help people formulate their hypotheses if they had an idea how likely or unlikely certain outcomes might have been.
Lots of good responses in the answers and comments. I agree that the data are very dirty. But I'm an economist and thus used to dealing with messy, poorly measured data. The question is whether the errors in the data are enough to materially skew the kinds of statistical statements I have in mind. I can imagine that there would be some skew away from correct conclusions that happened to be unusual (the thrice-married woman, the 17th-century couple born 100 miles apart, etc.). But for a lot of things, the errors in people’s trees would cancel each other out. I think this is the difference between a statistical approach and a historical approach.
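A small simulation can illustrate the distinction between errors that cancel and errors that skew. This is only a sketch under invented assumptions: the `simulate` function, the marriage-age distribution, the 15% true remarriage share and the 30% chance of a tree missing a remarriage are all made up and are not estimates from any real data.

```python
import random


def simulate(n=100_000, true_share=0.15, miss_prob=0.3, seed=0):
    """Compare zero-mean errors with one-sided errors on made-up data."""
    rng = random.Random(seed)
    sum_true_age = sum_noisy_age = 0.0
    true_remarried = recorded_remarried = 0
    for _ in range(n):
        # Assumed "true" age at first marriage.
        age = rng.gauss(26, 4)
        sum_true_age += age
        # Zero-mean error: equally likely to be too high or too low.
        sum_noisy_age += age + rng.gauss(0, 2)

        remarried = rng.random() < true_share
        true_remarried += remarried
        # One-sided error: a remarriage is sometimes missed (for example
        # recorded as two different people), but never invented.
        if remarried and rng.random() >= miss_prob:
            recorded_remarried += 1
    return {
        "mean_age_true": sum_true_age / n,
        "mean_age_noisy": sum_noisy_age / n,
        "share_true": true_remarried / n,
        "share_recorded": recorded_remarried / n,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the average marriage age barely moves, because the errors pull in both directions, while the recorded share of remarriages comes out too low, because the errors only pull one way. Whether real tree errors look more like the first kind or the second is exactly the open question.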