Are there any tools to search unstructured, digital data?


While searching for ancestors in early America (late 1700s), I have found multiple records that have been OCR'd and made available online. These are extremely useful; however, all that data is "dumped" online with no structure. Because these data dumps are usually large and full of common names, it can take a while to find the people I'm looking for.

Example: Searching for Johan Neu

Are there any tools to assist in:

  1. Finding these records through name searches?
  2. Finding names within these records?


Posted 2016-02-04T14:15:57.777

Reputation: 141

Noremac, welcome. Can you clarify your question a little, as the example you provided is text-searchable in a browser? Are you, for example, looking for a search engine built specifically for records like this? – CRSouser – 2016-02-04T15:04:02.897


Since this is potentially a question where every answer could be just as valid as another -- see What types of questions should I avoid asking? -- I'd like the users answering to please review the guidelines about constructive subjective questions, and write answers that explain “why” and “how” -- highlighting the advantages and disadvantages of whatever tool you suggest.

– Jan Murphy – 2016-02-04T18:06:13.077

@CRSouser The text is searchable in a browser, which makes it great for finding what I'm looking for. I do, however, want to take it a step further and be able to search in a more structured way, like FamilySearch's or Ancestry's searches of indexed data, where the fields are labelled. – Noremac – 2016-02-05T04:31:08.743



Your best bet is probably Data Drop from Wolfram Alpha.

You create an online "databin" into which you deposit your data, then use the Wolfram language to analyse it.

The system takes some getting used to. There is also a fee after an initial trial period.

If a lot of your data is graphical or scanned and you want the same system to perform the OCR too, then this may not be for you, although Wolfram's Mathematica can perform OCR.
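Independent of any particular tool, the second task in the question (finding names within an OCR'd text dump) can be roughed out in a few lines of Python. This is only a minimal sketch, not a feature of Data Drop: it assumes the dump is already plain text, and it uses the standard library's `difflib` to tolerate small OCR misreadings. The function name `find_name`, the 0.8 threshold, and the sample lines are all invented for illustration; only the name "Johan Neu" comes from the question.

```python
import re
from difflib import SequenceMatcher


def find_name(text, name, threshold=0.8):
    """Return (line_number, matched_text, score) for each line that
    contains a near match of `name`.

    `threshold` is an arbitrary similarity cut-off (0 to 1); lower it to
    catch worse OCR errors at the cost of more false positives.
    """
    target = name.lower()
    width = len(target.split())  # compare windows with as many words as the name
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Keep runs of letters only, so digits and punctuation are ignored.
        words = re.findall(r"[^\W\d_]+", line)
        best = None
        for i in range(len(words) - width + 1):
            candidate = " ".join(words[i:i + width])
            score = SequenceMatcher(None, target, candidate.lower()).ratio()
            if score >= threshold and (best is None or score > best[1]):
                best = (candidate, score)
        if best:
            hits.append((line_no, best[0], round(best[1], 2)))
    return hits


# Invented sample lines standing in for an OCR'd record dump.
sample = """Johan Neu, farmer, arrived Philadelphia 1752
Jacob Miller and wife Anna
Johann Nen, witness to the will of Peter Schmidt
"""

for hit in find_name(sample, "Johan Neu"):
    print(hit)
```

The advantage over a browser's Ctrl+F is that misread spellings still surface; the disadvantage is that common names will produce many hits, so the threshold usually needs tuning per record set.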


Posted 2016-02-04T14:15:57.777

Reputation: 1 102

Nice, I hadn't thought of using that for genealogy before. Is there a simple example or tutorial you can point to that is applicable to his example? – CRSouser – 2016-02-04T15:27:41.890

I'm not aware of any tutorials - I would have referenced them in the answer. I've been experimenting with Wolfram as a means of storing my own tree - an alternative to GEDCOM with much better analysis features. But I can see a great use for it in searching, for example, census records to find specific people. I keep threatening myself with trying that on a wet Sunday afternoon. – Chenmunka – 2016-02-04T15:46:38.687