<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	
	xmlns:georss="http://www.georss.org/georss"
	xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
	>

<channel>
	<title>lexicon &#8211; Greg Laden&#039;s Blog</title>
	<atom:link href="https://gregladen.com/blog/tag/lexicon/feed/" rel="self" type="application/rss+xml" />
	<link>https://gregladen.com/blog</link>
	<description></description>
	<lastBuildDate>Tue, 01 Jun 2010 13:54:19 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.4.8</generator>

<image>
	<url>https://i0.wp.com/gregladen.com/blog/wp-content/uploads/2017/10/Greg_Ladens_Blog_Favicon_black_GLb.png?fit=32%2C32&#038;ssl=1</url>
	<title>lexicon &#8211; Greg Laden&#039;s Blog</title>
	<link>https://gregladen.com/blog</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">77525483</site>	<item>
		<title>A run in my stocking is not a worn out salmon: Response to Mark Liberman</title>
		<link>https://gregladen.com/blog/2010/06/01/a-run-in-my-stocking-is-not-a/</link>
					<comments>https://gregladen.com/blog/2010/06/01/a-run-in-my-stocking-is-not-a/#respond</comments>
		
		<dc:creator><![CDATA[Greg Laden]]></dc:creator>
		<pubDate>Tue, 01 Jun 2010 13:54:19 +0000</pubDate>
				<category><![CDATA[Anthropology]]></category>
		<category><![CDATA[dictionaries]]></category>
		<category><![CDATA[Falsehoods II]]></category>
		<category><![CDATA[language]]></category>
		<category><![CDATA[Language]]></category>
		<category><![CDATA[lexicon]]></category>
		<category><![CDATA[linguistics]]></category>
		<category><![CDATA[vocabulary]]></category>
		<guid isPermaLink="false">http://scienceblogs.com/gregladen/2010/06/01/a-run-in-my-stocking-is-not-a/</guid>

					<description><![CDATA[I&#8217;m very pleased that my discussion of the &#8220;we can&#8217;t ever know what a word is&#8221; Internet meme has elicited a response from Mark Liberman at Language Log. (here) Mark was very systematic in his comments, so I will be very systematic in my responses. 1. Without a careful definition of what you mean by &#8230; <a href="https://gregladen.com/blog/2010/06/01/a-run-in-my-stocking-is-not-a/" class="more-link">Continue reading <span class="screen-reader-text">A run in my stocking is not a worn out salmon: Response to Mark Liberman</span> <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>I&#8217;m very pleased that my discussion of the &#8220;we can&#8217;t ever know what a word is&#8221; Internet meme has elicited a response from Mark Liberman at Language Log. (<a href="http://languagelog.ldc.upenn.edu/nll/?p=2363">here</a>) Mark was very systematic in his comments, so I will be very systematic in my responses.</p>
<p><span id="more-25529"></span></p>
<blockquote><p>1. Without a careful definition of what you mean by &#8220;word&#8221; and by &#8220;language X&#8221;, questions like &#8220;how many words are there in language X&#8221; are pretty much meaningless, because different definitions will yield very different numbers.</p></blockquote>
<p>This is very much off the mark.  I can measure the distance from the earth to the moon using a variety of techniques, and get different measurements for a variety of reasons.  The measurements may differ but they still tell me a great deal about the initial question especially when compared with other measurements (like how far away the sun is in comparison).</p>
<p>The way you have worded your paragraph tells me that if I wanted to examine different languages (say, grouped by language family or geography or whatever) to see if there were big differences in lexicon size, it would be impossible.  Are you certain you want to make that argument?</p>
<p>In fact, we are mostly in agreement about the difficulties (see below) but that is not the point of the original post. The original post is about an Internet meme that claims that it is all utterly impossible.</p>
<blockquote><p>2. The same thing applies, with the added issue of what you mean by &#8220;know&#8221;, to the question of &#8220;how many words of language X does a specific person know?&#8221; Another layer of variation is added by generalizing the question to &#8220;how many words of language X does an average four-year-old or 18-year-old know?&#8221; There&#8217;s an obvious answer, subject to the usual sampling-error problems, but the result is a bit like asking about average income &#8212; the mean value may not be very useful in telling you what you really want to know about the distribution.</p></blockquote>
<p>I agree that mean values are not especially interesting without understanding variance (though you&#8217;ve objected to my quest for variance in item one) but this is not really related to anything I&#8217;ve said in my post or comments thereon.</p>
<blockquote><p>3. Most sensible definitions for (1) and (2) above create serious practical difficulties for counting. That is, they define an answer, but the prescribed process for finding it is hard to carry out, and especially hard to automate in a way that produces an accurate result.</p></blockquote>
<p>Interesting point, and it fits with what a lot of linguists seem to think of language. I don&#8217;t happen to subscribe to the approach that it is all too big and mysterious to study systematically.</p>
<blockquote><p>4. Extrapolating accurately from samples raises its own special problems here&#8230;</p></blockquote>
<p>I can&#8217;t find the place where I scorned this.</p>
<blockquote><p>5. Despite all these difficulties, researchers over the years have gone through the steps of defining carefully what they mean by &#8220;word&#8221;, &#8220;language&#8221;, &#8220;know&#8221;, etc., and then carried out these steps&#8230;</p></blockquote>
<p>Please see my comment regarding a room full of beer loving linguists.  I don&#8217;t think I ever said that defining &#8220;word&#8221; or &#8220;meaning&#8221; is easy or something that can be done with precision.  What I did imply is that comments such as your number 1 (above) are very serious overstatements of the impossibility of it all, and more specifically, when we see an entry in a dictionary with dozens of meanings listed, we are not really faced with the question: &#8220;Is this one word or fifty?&#8221; while acknowledging that we may still be faced with the question &#8220;is this 32 words or 50?&#8221;</p>
<blockquote><p>6. Comparisons across languages are made more difficult by the fact that the most natural and sensible answers to questions like those in (1) tend to be different in different languages. Furthermore, a decision that may have only a small effect on the results in language X, may turn out to change things by an order of magnitude or more in language Y. Again, this doesn&#8217;t make it impossible to answer the questions, it just increases yet again the range of sensible values that answers might have.</p></blockquote>
<p>Yes, it does increase the range of possible values, and I would add these two points:  The degree to which two languages can be compared is very strongly affected by the data collection.  Comparing the English lexicon to those of Central Sudanic languages is impossible because the English dictionaries have hundreds or thousands of authors and centuries of development (if you count the whole written source), while the Central Sudanic language lexicons have between zero and three authors each, decades of study, and were carried out mainly for the purposes of bible translation.  (Mostly, zero written lexicon).</p>
<blockquote><p>Laden is radically impatient with all this talk about how it all depends and it&#8217;s hard to tell, but his impatience doesn&#8217;t change the facts. Nor does it change the fact that there are plenty of attempts to answer such questions&#8230;</p></blockquote>
<p>Actually, that was not the point of the post.  I was speaking specifically of a certain meme on the Internet, not linguistics in general.</p>
<blockquote><p>Laden seems to be aware of these issues &#8212; for example, he found the Nagy and Anderson reference &#8212; but his goal in the cited post seems to be to make fun of people rather than to clarify the questions and answers.  (He suggests, towards the start of his post, that he wants to evaluate claims about the rate of word learning by children &#8212; but I couldn&#8217;t see any connection between this issue and the rest of his hyper-kinetic complaining about the difficulty of getting a simple answer to the word-counting question.)</p></blockquote>
<p>Oh dear, I stepped on your field of study and you got all icky about it.  I didn&#8217;t &#8220;find&#8221; the reference.  It is part of the literature of which I became aware while studying for my PhD in anthropology.  And your statement about my goal is essentially correct.  It is not true that my goal is what you later imagined it could have been.  I&#8217;m not sure how I would have managed to write the post you were expecting!</p>
<p>Mark, I appreciate your comments, but you are mostly constructing and attacking a straw man.</p>
<p>Response to comment by Nick Lamb (<a href="http://languagelog.ldc.upenn.edu/nll/?p=2363#comment-69915">here</a>):</p>
<blockquote><p>I presumed from the fact that almost every other word in the &#8220;rant&#8221; is made up that he&#8217;s very conscious of what the problem is with counting words, and is actually using this opportunity to show the reader why this is all very tricky, but has chosen the form of a rant which pretends to assert the contrary. &#8230; Excuse me if that was so obvious that everyone already knows it and I missed some hint that Mark dropped.</p></blockquote>
<p>Nick: No excuses.  It was utterly obvious and some people certainly missed it. Glad you didn&#8217;t.</p>
<p>I&#8217;d like to add that it is probably helpful to understand the commentary in the broader context of the &#8220;falsehoods&#8221; writing of which it is a small part.  My blog is a bit dangerous that way:  My posts often do not stand alone but require context.  To get the context, you click on the tags near the top of the post and read everything.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://gregladen.com/blog/2010/06/01/a-run-in-my-stocking-is-not-a/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">25529</post-id>	</item>
		<item>
		<title>Minifalsehood: We can&#8217;t tell what a word is!?!?</title>
		<link>https://gregladen.com/blog/2010/05/31/minifalsehood-we-cant-tell-wha/</link>
					<comments>https://gregladen.com/blog/2010/05/31/minifalsehood-we-cant-tell-wha/#respond</comments>
		
		<dc:creator><![CDATA[Greg Laden]]></dc:creator>
		<pubDate>Mon, 31 May 2010 11:07:23 +0000</pubDate>
				<category><![CDATA[Falsehoods II]]></category>
		<category><![CDATA[langauge evolution]]></category>
		<category><![CDATA[Language]]></category>
		<category><![CDATA[lexicon]]></category>
		<category><![CDATA[lingusitics]]></category>
		<guid isPermaLink="false">http://scienceblogs.com/gregladen/2010/05/31/minifalsehood-we-cant-tell-wha/</guid>

					<description><![CDATA[I am looking at the question: How many words are there in a language? I&#8217;d like to know for languages in general, comparatively, and for pedagogical reasons, in some well known western language which may as well be English. What I found quite incidentally is a hornet&#8217;s nest of curmudgeonistic pedanticmaniacal jibberishosity. (There. Whatever the &#8230; <a href="https://gregladen.com/blog/2010/05/31/minifalsehood-we-cant-tell-wha/" class="more-link">Continue reading <span class="screen-reader-text">Minifalsehood: We can&#8217;t tell what a word is!?!?</span> <span class="meta-nav">&#8594;</span></a>]]></description>
										<content:encoded><![CDATA[<p>I am looking at the question: <em>How many words are there in a language?</em> I&#8217;d like to know for languages in general, comparatively, and for pedagogical reasons, in some well known western language which may as well be English.</p>
<p>What I found quite incidentally is a hornet&#8217;s nest of curmudgeonistic pedanticmaniacal jibberishosity.  (There. Whatever the count was, it is now N+3)</p>
<p>(For more Falsehoods, <a href="http://scienceblogs.com/gregladen/falsehoods_ii/">click here</a>. Also, listen to &#8220;<a href="http://scienceblogs.com/gregladen/2010/05/everything_you_know_is_sort_of.php">Everything You Know is Sort of Wrong</a>,&#8221; on Skeptically Speaking Talk Radio. )<br />
<span id="more-25516"></span><br />
First I want to explain why I was interested in this at all.  There has for some years been discussion of the vastosity of language, and how impressive this vastosity is in relation to the ability of a child to enlearn it all.  Various studies have shown that children of a certain age know (as in recognize) a waylot of words, a virtual spoorload of lexicon.  When you do the maths, it turns out that children are learning some horrific number of words per day from the time they are yajabbering infants in order to reach that number by said age.</p>
<p>Indeed, it has been guestimated that the number of words in English is far far greater than the number of words we tend to think there is in English, and rugrats know way more of them than anyone has ever ponderified.  The usual story goes like this: The English dictionary you can find with the mostest of words is probably Funk and Wagnall&#8217;s New Standard Unabridged, with just under a hemimillion (maybe 450,000) entries.  When this list is adjusted to account for the fact that words are not really what they seem when they are listed in the dictionary, a sublist can be generated.  If this list, about a quatromillion in size, is sampled one can make a test to see how many words a person, perhaps a rugrat, knows.  Call the result the Lexiknowitall Quotient, if you will.  Or, for simpleness sake, &#8220;L.&#8221; (I will not be using the variable &#8220;L&#8221; for the rest of this post, so there really was no reason to tell you that.)</p>
<p>Given this, a fully growd adult with a high school education knows about 45,000 words.  A six year old knows 13,000.  Do the maths. To get from zero to 13,000 a child has to learn one new word every two hours.  Watch them. You can see them doing it.</p>
<p>Well, really, you can&#8217;t see it.  Which is why this is all very interesting.  Is it really happening, or is this just some fantasy of Steven Pinker, who would really prefer to think that the words are practically encoded in our genomes somehow?  Perhaps, I imagine him thinking, we have a lexinome from which these words spring to be spoken in the context of our grammarome.</p>
<p>Anyway, if you go to Teh Google and ask it &#8220;how many words are there, huh?&#8221; you will get this one answer that is repeatedly plagiarized, and it is little more than curmudgeonistic pedantistery.  In fact, I have identified it as a Falsehood of sorts.  It goes something like this:</p>
<p><em>How can we tell how many words there are!!???? We can&#8217;t even tell what a word is!!!???11??  </em>(That&#8217;s  the falsehood part &#8230; that we can&#8217;t even tell what a word is.)<em> And these are the reasons given that we can&#8217;t tell what a word is:</em></p>
<p><em>1) What IS a word?  If &#8220;run&#8221; is a verb, is the noun &#8220;run&#8221; another word?</em></p>
<p>OMG. I can&#8217;t believe they start out with this one.  Run to run and the run of a mill are utterly different things.  r-u-n is a spelling, and ru-nh is a pronunciation. Run the verb and run the noun are two words, and there are many many things called &#8220;run&#8221; that are nouns. Each and every one of them is a different homophone, a different word.  Duh.</p>
<p><em>2) What about inflected forms, like ran, runs, and stuff???11??</em></p>
<p>Ah &#8230; no &#8230; those are tenses and such.  Not different words.  And, in the study I mentioned above where the toddlers are learning a new word every couple of hours, run, ran, runs etc. are NOT counted as different words.  Or at least, that is the story as I have gorfed it.</p>
<p><em>3) Are compounds, such as man-child or man-eater or man-bites-dog different words?</em></p>
<p>Well, ok, there is a tiny bit of ambiguity here.  Man-child, man-eater, and similar cases are clearly words. In English, this is easy to figure out.  Take out the dash (or space). Does it still work? Then it&#8217;s a distinct word.  One example given was &#8220;man-bites-dog.&#8221;  That is not a word.  It is a sentence where someone has put dashes in where the spaces are supposed to go.  &#8220;Manbitesdog&#8221; is not a word.  For the most part, the &#8220;compound&#8221; issue is as goosechasingness as one can get.</p>
<p><em>4) What is English, anyway?  What about &#8220;veal&#8221; which comes from The French.  Is that a word!!!??? Huh?!?!!??</em></p>
<p>More stupidosity.  Yes, veal is a word in English.  Jeesh.  So is spaghetti.  And pho.  Give me a break.</p>
<p><em>5) What about obsolete words?  Are they words!!!/????  Do you count them? </em></p>
<p>Well, no.  They are words but we are looking for a lexicon, not a word list.  If it&#8217;s on the line and still rarely used it&#8217;s in, otherwise it&#8217;s off the list.  Obviously.  Duh.</p>
<p><em>6) What about the names of chemicals and stuff?E?E?E? </em></p>
<p>Well, there you&#8217;ve got me. That is a little ambiguous.  Yes, they are words, but no, since chemists have a systematic way of creating the words in advance and there are a lot of combinations of chemicals (even those with a low existostiy index) then we can&#8217;t count this any more than we&#8217;d count the arbitrary assignation of morphemes to, say, items on a Mexican Restaurant menu so we could create a word for every combination of taco, burrito, enchilada, quesadilla, etc. where a given meal can have from one up to six per plate.  I totally ate in a place with a menu like that in San Diego once. The menu itself was dozens of pages long, and only a summary of the actual theoretical menu.   (&#8220;I&#8217;ll have the bitacoqadroburrito, please.&#8221;)</p>
<p>Either way it is an arbitrary non-lexicographic alinguistic expansion of a word list. Really, it is a verbose numbering system.  Numbering systems don&#8217;t count.</p>
<p>But yes, &#8220;busigator&#8221; (the Magic School bus transformed into an alligator) is a word. Busigator is &#8230; the word for the Magic School bus when it is an alligator.  This is not hard.</p>
<p>(The above statements about the hardosity of counting words are cribbed from <a href="http://www.askoxford.com/asktheexperts/faq/aboutenglish/numberwords">here</a> and <a href="http://www.slate.com/id/2139611">here</a>.  See also <a href="http://www.worldwidewords.org/articles/howmany.htm">this</a>, <a href="http://www.chacha.com/question/how-many-words-are-there-in-the-english-language">this</a>, <a href="http://wiki.answers.com/Q/How_many_words_are_there_in_English">this</a>, and <a href="http://wiki.answers.com/Q/How_many_words_are_in_the_English_language">this</a>.  The discussion of how many words there are is cribbed from <a href="http://www.amazon.com/gp/product/0061336467?ie=UTF8&#038;tag=wwwgregladenc-20&#038;linkCode=as2&#038;camp=1789&#038;creative=9325&#038;creativeASIN=0061336467">The Language Instinct: How the Mind Creates Language (P.S.)</a><img decoding="async" src="https://www.assoc-amazon.com/e/ir?t=wwwgregladenc-20&#038;l=as2&#038;o=1&#038;a=0061336467" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> and refers mainly to the research of Nagy and Richard Anderson.)</p>
<p>So how many words are there? Actually, it&#8217;s kind of irrelevant because words mean little more than what they mean, and meaning has only a vague association with the details of the lexicon, which gives the curmudgeons and pedants nightmares. Or would, if they noticed. I mean, really, did you have any trouble understanding the meanings of minifalsehood,  curmudgeonistic, pedanticmaniacal, vastosity, jibberishosity, spoorload, yajabbering, waylot, ponderified, mostest, hemimillion, quatromillion, Lexiknowitall, existostiy, alinguistic, or hardosity?</p>
<p>No, I mentated negatorially.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://gregladen.com/blog/2010/05/31/minifalsehood-we-cant-tell-wha/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">25516</post-id>	</item>
	</channel>
</rss>
