I'm writing my thesis at the moment, and for some time - for lack of a proper alternative - I've stuck with "unstructured data" to refer to natural, free-flowing text, e.g. Wikipedia articles.
This nomenclature has bothered me from the very beginning, since it opens a debate that I don't want to get into. Namely, "unstructured" implies that natural language lacks structure, which is not true - syntax being the most obvious example. It also gives a negative impression, since it is the opposite of "structured", which is generally seen as positive. This debate is not the focus of my thesis, though the "unstructured" aspect itself plays an important role.
I completely agree with the writer of this article, but he proposes no alternative except for "rich data", which doesn't cover my point. The point I'm trying to make is that the text lacks a traditional database-like (e.g. tabular) structure, in which every piece of data has a clear data type and semantics that are easy for computer programs to interpret. Of course I'd like to condense this definition into a single term, but so far I've been unsuccessful in coming up with one, or in finding an acceptable taxonomy in the literature.