“Who you two? I five …”

March 4, 2009 | Uncategorized | Anthropology, Archaeology, Indo European, language evolution, Language phylogeny, linguistics, Mark Pagel, oldest words | Greg Laden

And with this, a five-year-old catapulted back in time, say 10,000 years in West Asia or Southern Europe, encountering two people, would make a perfectly intelligible sentence that would be understood by all, assuming all the people who were listening were at least reasonably savvy about language and a little patient. This is because a handful of words, including Who, You, Two, Five, Three and I, exist across a range of languages as close cognates, and can be reconstructed as similar ancestral utterances in ancestral languages.

It’s like an elephant and a mammoth meeting up in the Twilight Zone. Close enough to know there is a similarity, yet different enough to be a bit freaky.

This is from the work of Mark Pagel, of Reading (England), and his team. And it isn’t quite as simple as I’ve characterized it above. As Pagel told me in a recent interview, “… when I say ‘I’ or ‘two’ are very old, I mean that they derive from cognate (homologous) sounds. Every speaker of every Indo European language uses a homologous form of ‘two’ such as ‘dos,’ ‘due,’ ‘dou,’ ‘do,’ etc. It is an amazing thought because there are billions of Indo European speakers and hundreds of thousands of ‘language-years’ of speaking across all the unique branches of the phylogeny of these languages. In all that time ‘two’ has remained cognate. Cognate does not mean identical … it is a bit like my hand being homologous but not identical to that of a gorilla.”

Pagel acknowledges that many linguists are ‘upset’ with the assertion that there are numerous cognates that share a common ancestor … which is also a cognate … that must be over 10,000 years old. But he indicates that this dislike for the proposed reconstruction is more a misunderstanding of the concept of homology than anything else.

Indeed, most linguists reject the idea of even being able to begin to think about maybe planning in the most preliminary way to even maybe consider doing something like what Pagel and his team have done. And these days, the main reason that linguists give for not being able to reconstruct either individual words or linkages between languages and language groups is something like “… You can’t do that because it is long discredited.” But in fact, this is alchemy. Most modern linguists, in my experience, cannot provide an actual coherent reason for why it was discredited.

Linguists long ago rejected the very methods that were used in the old days (back when linguists thought they could and should reconstruct language phylogenies). There are almost no living linguists trained in this area. The previous generation, which did engage in this activity, was using methods that at the time were cutting edge but today are outdated. So, Pagel is using updated methods for working with words in much the same way that we work with genes, and getting results that are statistically valid.

As with a genetic study, the reconstructed phylogeny is complex. There are meaning-sound links that go back to a certain time period, but not before, because of a change at that node. There are some that are perhaps 40,000 years old (based on an estimate of cultural divergence, which in turn becomes less certain as one goes farther back in time) and others that are only a few thousand years old. As has been demonstrated in other research projects, words that are used frequently are more likely to stay relatively unchanged than are rarely used words. Also, according to Pagel, nouns change more slowly than verbs, and verbs more slowly than adjectives.
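To make the frequency point concrete, here is a minimal sketch in Python. It assumes a very simple constant-rate model in which a word is replaced by an unrelated form at a fixed rate per millennium, so the chance that it survives t millennia is exp(-rate × t). The model, the function names and the two rates below are illustrative assumptions of mine, not Pagel’s actual method or his estimates.

```python
import math


def retention_probability(rate_per_millennium, millennia):
    """Chance a word has NOT been replaced after `millennia`, assuming a
    constant replacement rate (an illustrative model, not Pagel's)."""
    return math.exp(-rate_per_millennium * millennia)


def half_life(rate_per_millennium):
    """Time in millennia after which the word has a 50% chance of
    having been replaced, under the same constant-rate assumption."""
    return math.log(2) / rate_per_millennium


if __name__ == "__main__":
    # Hypothetical rates, chosen only to contrast a slow-changing,
    # frequently used word with a fast-changing, rarely used one.
    hypothetical_rates = {
        "frequently used word": 0.05,
        "rarely used word": 0.5,
    }
    for label, rate in hypothetical_rates.items():
        kept = retention_probability(rate, 10)
        print(f"{label}: half-life {half_life(rate):.1f} millennia, "
              f"P(still cognate after 10,000 years) = {kept:.3f}")
```

Under these made-up numbers the frequently used word is far more likely to still be cognate after 10,000 years, which is the shape of the claim, not its actual size.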

So, the phrase “colorless green ideas sleep furiously” uttered in the far distant future might be “blifnork orgonst idears sloop firooslnitch.” According to me, not Pagel. (Pagel refused to comment on that question.)

But seriously, I’m glad to see the linguistic phylogeny challenge taken up again, despite the naysayers, and I’m especially glad that Pagel is doing it because he’s got the methodologies necessary to make this work.

Background.

8 thoughts on “Who you two? I five …”

Romeo Vitelli says:

March 4, 2009 at 1:30 pm

So why doesn’t this ever seem to work when I visit a country where I don’t know the language?

prn says:

March 4, 2009 at 1:59 pm

Disclaimer: I am no longer a professional linguist and even when I was, this was not my specialty. However, just about anybody who doesn’t sleep through 100% of the standard courses ought to have some familiarity with the relevant questions.

I took a look at some of the top links from the “background” above, and I have to say that they’re pretty garbled. I will admit up front that it’s not entirely clear who did the garbling, but the incoherence appears to be sufficiently comparable across reports that I am forced to wonder how much of it comes directly from Pagel.

A brief summary: Greg is correct that cognates are like evolutionary homologies. One of the major problems in historical linguistics has always been distinguishing cognates from borrowings. Unlike gene transfers from ancestor to descendant, words frequently undergo horizontal transfer, e.g. from French to English, English to Japanese, Japanese to English, etc. Some of the greatest achievements in the field of historical linguistics, by the folks that Greg described as “using methods that at the time were cutting edge but today are outdated,” have been directed precisely at the question of how to distinguish cognates (homologies) from borrowings. As a (simple) example, in English, we have a number of “doublets”, e.g., ‘shirt’ vs. ‘skirt’ or ‘yard’ vs. ‘garden’ where both members of the pair come from the same root further back, but one (the first) is “native” and the other is borrowed (from Old Scandinavian in the case of ‘skirt’, and via Norman French in the case of ‘garden’). We can tell which is which because the former underwent well-established historical sound changes that the latter did not.

The examples of ‘I’, ‘two’, ‘three’, ‘five’ and ‘thou’ are sort of good and sort of not. The Stone age phrasebook link reports Pagel’s work as indicating that these words “have changed so little in tends [sic] of thousands of years that ancient hunter-gatherers would have been able to understand them.” This is one of the cases where it is hard to tell whether the problem is with Pagel or with the reporting, but ‘I’ is known to come from a PIE (Proto-Indo-European) root like *ego (where the * indicates a reconstructed form. In this case, attested Latin or Greek forms appear to have been very conservative. In other cases, not so much.) Personally, I find it implausible that someone living, say, 20,000 years ago (about the minimum to be considered “tens of thousands”), even speaking a directly ancestral language, would have been particularly likely to recognize Modern English ‘I’ (phonetically [ay]) as related to his/her word from which PIE *ego would later develop (where the period for which PIE is reconstructed represents a time somewhere near halfway between the postulated 20,000 and now). Modern English “five” is even less likely, corresponding as it does to PIE *penkʷe.

The whole issue to which Pagel’s work appears to be related has been controversial in linguistics for at least the last 50 years (search on “Greenberg mass comparison” or on “Amerind”; I’ve already put one link into this post and I’m not sure where the risk point begins) and the controversial part is most emphatically NOT whether English (and other languages too, of course) has roots that extend beyond PIE, but how much we can reliably conclude. AFAICT, nearly everybody believes that all human languages are probably descendants of a single “original language” used among some relatively small group of ancestors who first innovated a “real” language. Evidence of meaningful vocalization among non-human primates notwithstanding, there is a qualitative difference and most linguists appear to believe that the qualitative difference was probably a single “event” equivalent to the mutation that resulted in something like the amnion, though less obviously traceable in the genome.

Personally (and as an outsider to the details of the controversy) I tend to suspect that we are likely to be able to work back further than we have so far, but that the conclusions we can draw will become increasingly fuzzy. The further back you go, the more likely you are to run into chance resemblances as well as borrowings.

So far, we have very good “traditional” diachronic linguistic data on language families running back to most recent common ancestors with a time-depth on the order of several thousand years, perhaps 7-10 thousand. Greenberg’s Amerind hypothesis, first published in 1987, escalated the controversy enormously. “Merely” recognizing cognates is far from trivial, especially at time-depths in the range of 20,000 years as usually estimated for the controversial Amerind, let alone at the (possibly???) 60,000 to 70,000 years that looks like the best guess I have seen for “Proto-World”. Whether Pagel has “got the methodologies necessary to make this work” may or may not turn out to be the case, but the links I have seen so far really don’t say much of anything about his methodologies.

I don’t want to be a nay-sayer. I think that as a research topic a language phylogeny for the world is fascinating. Furthermore, in order to devote one’s professional life to a topic like that, one must have a great deal of optimism and enthusiasm. OTOH, the rest of us should not allow ourselves to get too carried away by the claims of enthusiasts. (However attractive they may be, and I admit to more than a little tendency to be attracted to grand hypotheses.)

I could go on all day, but I’m sure I have already overdone it for a blog posting.

Paul

Nathan Myers says:

March 4, 2009 at 2:25 pm

Language Log has this to offer:

Scrabble tips for time travelers
Imperial BS flows?
Tips for William the Conqueror fanboys

They are not kind, and in detail. The detail is much more interesting than anything Pagel offers.

Lilian Nattel says:

March 4, 2009 at 3:49 pm

Interesting post and comment. Thank you both. Linguistics and what it might tell us about history and anthropology is fascinating.

Greg Laden says:

March 4, 2009 at 4:55 pm

“… but the incoherence appears to be sufficiently comparable across reports that I am forced to wonder how much of it comes directly from Pagel.”

If you knew Pagel you would laugh at yourself for thinking of him as a scholar capable of garbling.

I’m off to see Dawkins just now, but I’ll be following up on this later, including some further suggestions as to why we should not reject this work.

Adrian Morgan says:

March 4, 2009 at 8:15 pm

Have you been following the discussion on Language Log?
http://languagelog.ldc.upenn.edu/nll/?p=1186
http://languagelog.ldc.upenn.edu/nll/?p=1191
http://languagelog.ldc.upenn.edu/nll/?p=1199

prn says:

March 4, 2009 at 8:50 pm

@Greg:

“If you knew Pagel you would laugh at yourself for thinking of him as a scholar capable of garbling.”

I don’t know Pagel. I just found the commentary in the first few links to be pretty garbled. I don’t know how much of the garbled, incoherent commentary arose independently, but the reports seemed remarkably similar. The bit about a “phrasebook” of ancient words strikes me as at best pretty silly if you were to expect someone to actually recognize the modern forms of the ancient words. For that matter, there are plenty of cognates across modern European languages that most English speakers would not even recognize without specific knowledge.

For example, this site quotes Pagel to the effect that:

“the model can’t guess what the ancestral words were, but can only estimate the likelihood that the sound from a modern English word might make some sense if called out during the Battle of Hastings.”

But in many cases, we do have a pretty good idea of what the words would probably have sounded like 1,000 years ago. Take a sentence like “We will die in the night.” It uses only words that we are absolutely certain have been in the language for at least that long, and three of its six words are in Pagel’s list of the top 20 oldest words. Yet it would not, in my never sufficiently humble opinion, be readily understood by anyone living at that time. The sounds of the words retain quite a lot of similarity, but are also different enough that I would be very surprised if they were “readily” understood.

Sure, it’s well-known that some words have been in the English language (and its precursors), with relatively small meaning changes, for a long time. There has also been a fair amount of research on a possible outgroup to the I-E language family (generally called “Nostratic”). There have certainly been claims that particular words or morphemes descend from Nostratic into I-E. I’d be interested in finding out more about Pagel’s work and I intend to look into it.

So far, the reports have not contained much that is clearly both new and true. The top few words on his list are well-known to have been inherited from long back. Another term you might look up is “glottochronology”. The word lists for that purpose are quite similar to his and I don’t think this is a coincidence.

It’s more than plausible that the popular reports of Pagel’s work have oversimplified and overspectacularized it. I don’t doubt that Pagel is a smart guy, but there is a lot known that I don’t see being taken into account. I do plan to look into it further.

Paul


Greg Laden's Blog
