This puzzle is resolved when you realize that recursive embedded grammar is a feature that is not present in ancient languages, and appears only well after the evolution of writing. When you need to handle recursion, the case systems and complex morphology of pre-literate language become unnatural.
All modern fully embedded grammars are essentially the same: they are described by a context-free replacement grammar, which allows adjectives, adverbs, and verb arguments to be replaced by multi-word phrases that serve the same role. The reason is not that this is a fundamental defining property of language. The reason is that the qualitative ideas behind context-free grammars were invented in Greek and Roman times, and Cicero and Aristotle explicitly advocated writing this way.
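To make "context-free replacement grammar" concrete, here is a minimal sketch in Python. The rules, the word lists, and the function names (`noun_phrase`, `verb_phrase`, `sentence`) are all my own toy assumptions, not a description of any real language. The point is only that a noun phrase can be replaced by a longer phrase containing a whole clause, and the result still fills the same slot.

```python
from itertools import cycle

# Toy vocabulary (hypothetical, chosen only for illustration).
NOUNS = ["dog", "cat", "mouse"]
VERBS = ["chased", "saw"]

def noun_phrase(depth, nouns, verbs):
    """NP -> 'the' N              when depth == 0
       NP -> 'the' N 'that' VP    when depth > 0 (the replacement step)"""
    words = ["the", next(nouns)]
    if depth > 0:
        # The noun phrase is extended by a clause, which itself ends in
        # another noun phrase: this is where the recursion comes from.
        words += ["that"] + verb_phrase(depth - 1, nouns, verbs)
    return words

def verb_phrase(depth, nouns, verbs):
    """VP -> V NP"""
    return [next(verbs)] + noun_phrase(depth, nouns, verbs)

def sentence(depth):
    """S -> NP VP, with a simple subject and an object embedded `depth` times."""
    nouns, verbs = cycle(NOUNS), cycle(VERBS)
    subject = noun_phrase(0, nouns, verbs)
    return " ".join(subject + verb_phrase(depth, nouns, verbs))

for d in range(3):
    print(d, sentence(d))
```

Each increase in `depth` swaps the last noun phrase for a longer one, and nothing else in the sentence has to change. That indifference to the surrounding words is what "context-free" refers to.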
This type of embedded recursive grammar is extremely successful at producing convenient written expressions of complex ideas in a short, but not unduly taxing, form. Due to its convenience, all of the old-world languages adopted the recursive grammar of Cicero et al., one by one, as they acquired bilingual speakers of European languages and translations of European recursive works. Once you have multiple embedding, it is very difficult to stop doing it, and you can easily invent a way to do it in any language.
This is the reason that in modern European languages, and in those of India, Asia, or Africa, recursive clausal embedding works in almost exactly the same way, a way described well by a context-free grammar, with potentially unlimited center, initial, and final embedding. It spread like a virus via bilingual speakers, and only languages which were isolated from Europe by oceans were immune.
Thankfully, a few languages maintained their non-embedded form, due to cultural isolation, most notably Piraha (which has no center embedding, as described in the revolutionary work of Everett) and Warlpiri (which has no full recursive grammar either). Native American languages as a rule did not have a full context-free structure, and neither do ancient Sanskrit, ancient Chinese, ancient Hebrew, or any ancient language other than (remarkably) ancient Greek and Latin.
This idea is explicitly described and argued by Fred Karlsson in "Constraints on multiple center-embedding of clauses".
Cicero's Remarkable Invention
Embedding with context-free recursive structure has become so ubiquitous that every literate person learns this structure before adolescence, and forgets that it did not come naturally. This structure was invented, not discovered, and it was invented by structure-conscious writers in Greek and Roman times. It spread by emulation to other languages, sometimes by conscious effort of literary folks to popularize this form of expression.
This means that scholars, who for obvious reasons tend to be the most highly literate members of society, all see that every language they learn has a roughly isomorphic recursive grammar that describes how to produce complex sentences, a grammar which is fundamentally based on a context-free replacement generative grammar. This comes as a shock: it is a jarring realization which begs for an explanation.
I am a native Hebrew speaker, and I remember learning English as a child. I remember that I was miserable for a while, because everything was new. When I finally learned enough vocabulary to make complex sentences, I was immediately struck by the fact that these sentences, unlike simple constructions, are word-for-word identical to Hebrew complex sentences. I didn't have to learn anything more! I knew immediately how to produce any complex sentence without effort.
It is the same when you learn a new computer language. After learning a few function words and idioms, the structure of complex expressions is immediately apparent if you already know another programming language. The reason is that computer languages are all based on the notion of context-free grammars, explicitly abstracted from natural language by Chomsky and Schutzenberger. In the 1950s, Noam Chomsky gave a definition of a language grammar which made the embedding structure the primary ingredient. A context-free grammar allows arbitrarily deep center-embedding, and Chomsky hypothesized that all the world's languages are described by context-free grammars because the original human language was described by a context-free grammar.
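For readers who think in code, center-embedding can be sketched with one recursive function. This is only an illustration under my own assumptions: `center_embed` is a hypothetical helper, and the English example is the textbook kind of sentence that is grammatical on paper but hard to parse by ear.

```python
def center_embed(nouns, verbs):
    """Build a center-embedded clause as a list of words.

    nouns[i] is the subject of verbs[i]. Each inner clause is placed
    between its outer subject and its outer verb, so the subjects pile
    up on the left and the verbs unwind on the right in reverse order,
    the same shape as nested parentheses in a programming language.
    """
    if len(nouns) != len(verbs):
        raise ValueError("each subject needs exactly one verb")
    if not nouns:
        return []
    # Rule being applied: S -> 'the' N S V   (the inner S may be empty)
    return ["the", nouns[0]] + center_embed(nouns[1:], verbs[1:]) + [verbs[0]]

words = center_embed(["rat", "cat", "dog"], ["died", "bit", "chased"])
print(" ".join(words))
```

The recursion depth is bounded only by the length of the input lists, which is the sense in which the embedding is "arbitrarily deep".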
This is true of all old-world languages, and without a historical appreciation, just by looking at the structure of the languages, one can mistakenly come to believe that this structure is very ancient, and that the common source lies in prehistoric times. This fallacy is so compelling that it went unchallenged as dogma until Everett's work of 2005.
A language evolution fallacy
If you see that all birds share a hole in the hip-bone, and all dinosaurs do, you are justified in concluding that birds and dinosaurs have a common ancestor which had a hole in the hip-bone. The reason is Darwin's evolution: this was the main prediction of the theory. The characteristics of the common ancestor are preserved by all descendants, and if you see two species with a common trait, you can be pretty sure that it is because they evolved in the same family.
This explains why life-forms organize in a hierarchical cladistics tree. Languages also come in a cladistics-like tree, and this is because the transmission of language is much like the transmission of genes: it preserves certain word-sounds and structures in a diverging, evolving form.
But unlike in evolution, bilingual speakers can transmit nontrivial structure horizontally between very distantly related languages. So in languages you find creoles, which in biology would be like an oak-tree/lizard hybrid. You find languages like English, whose vocabulary is split almost 50/50 between Germanic and Latin roots, and which are clearly Germanic with enormous Latin influence. You find completely alien loan-words in English like "kimono" and "feng shui", which come from some of the most distantly related languages in the world.
But most significantly, grammatical constructions are also shared. The fact that all languages recurse the same way suggests one of two things:
- The common ancestor of all languages recursed this way
- Recursion was invented at one spot, and spread horizontally.
Experience with Darwinian evolution suggests the first option, and this is Chomsky's hypothesis. It's dead wrong. The correct answer is the second option.
This means that every one of the world's languages (except for Greek and Latin and their descendants) has a grammatical discontinuity, the moment when it became recursive. This is usually something you can see: it is a sharp revolutionary advance on the past, and it leads to a golden age of literature in the coming centuries.
Morphological pressures in pre-recursive and post-recursive languages
In pre-recursive languages, there is no fundamental reason to put the preposition marker before the word, and a very good reason to put it after--- there is already a definite/indefinite marker before the word taking up space.
If you say, in Hebrew, "the mountain", you say "ha-har", which in syllable terms puts a syllable before the word. Now if you say "I walked to the mountain", "halachti la-har", you are putting two syllables before the word. In Hebrew, the two syllables are merged into one, "la", which is much like "to-the" becoming "t'a", as in "I walked t'a mountain". But ignore that.
A word has two ends, and it is much clearer to put the definite marker on one end, and the case-marker (the preposition) on the other end, so that they don't have to fight. This makes best use of the phoneme space, and this is the preferred solution.
"I walked the-mountain-ward" (Halachti ha-har-a)
But this solution is the casing solution, and it interferes with embedding in a way described in the body of this question: "Did case systems disappear to make embedding easier?". When you replace the mountain by an embedded phrase, it puts a syllable in the middle of the embedded phrase in such a way that it is difficult to shear off.
This puts pressure on languages to shed case systems and other morphological transformations in favor of stand-alone function words with a common syntax, beginning at the date that recursive embedding becomes common among all speakers of the language.
Ancient Embedding styles
Just so that I am clear: all languages embed conceptually, but not all of them embed grammatically. The concepts in a non-recursive language are not simpler than in a recursive language, they are just expressed more verbosely.
So there is no implication that speakers of Piraha are somehow less than human, or not fully capable of philosophizing, or anything like that. These ideas only come if you associate grammatical recursion with language, an association which is false.