This seems inadequate to make the kinds of claims the researchers are quoted as asserting in the article.
Icelandic has a bunch of dictionary abbreviations: medic(al), temp(us), germ(anic), veg(etation). Tarifit is dominated by linguistic terminology. German has a few German words that look like English words meaning something completely different (mantel, tier, boot, stall), one loanword (angst) and what might be dictionary abbreviations again: humor(ous), miner(alogy), spa(nish)...
From my perspective this is the hoax. I come from the Alps and we have dozens of terms for snow. Only people without snow might have one word, because they have no need to describe different kinds of snow. I remember Sulz, Firn, Neu, Kunst, Matsch, Harsch, Papp, Pulver, ... (left 35 years ago).
> Geoffrey K. Pullum's explanation in Language Log: The list of snow-referring roots to stick [suffixes] on isn't that long [in the Eskimoan language group]: qani- for a snowflake, apu- for snow considered as stuff lying on the ground and covering things up, a root meaning "slush", a root meaning "blizzard", a root meaning "drift", and a few others -- very roughly the same number of roots as in English. Nonetheless, the number of distinct words you can derive from them is not 50, or 150, or 1500, or a million, but simply unbounded. Only stamina sets a limit.
https://en.wikipedia.org/wiki/Eskimo_words_for_snow#cite_not...
The Lexical Elaboration Explorer app does not let you see the actual words for snow in any language, so the tool is mostly a geographic and word-density plotter, and neither the article nor the website adds much nuance to this debate. The hypothesis is fairly obvious: languages have words for common things. It's not really falsifiable, and I find this type of analysis typical of modern research: sloppy, surface-level, coding-tutorial demonstrations of mostly useless data display.
I'm not going to say that language doesn't say anything about culture in general. But I do think that most specific analyses chasing after this idea are doomed to say more about the analyst than they do about the analyzed.
No, because Firn and Harsch are words on their own.
Yes, because of the way the German language works. It tends to create new words by combining old words, not by creating new short words (dialects like Bavarian work differently, though; they often tend to create new words).
Then after centuries people forget that and think it's one word. Take "Enttäuschung" (disappointment): people no longer recognize its two parts, or that it really means you had been deceived ("Täuschung") and no longer are, which is the deeper meaning of "Enttäuschung" in German. Same for "Werkzeug" (tool): the words get their own identity.
What I found most interesting was Rücksicht, Vorsicht, Nachsicht, Einsicht, Weitsicht (and more) where probably no German would think they are the same word, "Sicht" combined with another one. All of those words have their own, distinctive identity.
At the time the literature suggested that the cognitive processes are the same across populations with different mother tongues, but that language can influence the data those processes work upon, e.g. when people are exposed to the same events, which details get picked up, built into narratives and remembered.
I would move that language constitutes a very strong mnemonic anchor if nothing else.
I'm pretty sure that doing this a few times made my English permanently worse. I guess it's ok since I'm not a literary stylist or anything like that, but it's something to be aware of.
Oh, that's just your brain suffering from hash collisions during lookup for words. After a while it adapts and switches to a new data structure, speaking from experience.
Just for fun, I theorised that it’s a rewinding to the emotional states - and consequently behaviours - I had when getting proficient in the language.
This effect is at its strongest when I'm visiting my parents in the Netherlands, so there's an obvious location component to it as well. Linking places to emotional states is pretty well established IIUC, so the latter shouldn't be a surprise.
English-speaking skiers have more words for "snow" than Inuktitut-speakers. It's the culture that shapes the language, not the language that shapes the mind.
Now, this does affect how you think. We need different words (including multi-word combinations) to point to different things. But we also need to see similarities between things and this is how we choose the form for these words or word combinations.
To rewrite a problem in different and more generic terms is a known heuristic to get a better understanding of it and maybe gain an insight. (Note that more generic terms mean that you start using more multi-word combinations and will see both differences and similarities.)
A different metaphor may also open up different possibilities. E.g., say we are trying to model a permission system and phrase it in terms of users, groups, and resources. We may come up with a different solution if we switch to users, keys, and rooms. Entities in the model are neither groups nor keys; they have their own nature, which we try to capture imperfectly with words pointing to things that have a superficially similar relationship. The words are not quite right, but we need some pointers.
This is why trying to find the perfect name for a variable is a fallacy. There are no perfect names. What helps more are names that are different in one way and similar in another and form a consistent set.
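To make the permission-system example above concrete, here is a minimal sketch of the same kind of check phrased in both vocabularies. All class, field, and method names are made up for illustration; the point is only that the key metaphor makes an operation like "lend a key" feel natural, while the group vocabulary does not suggest it.

```python
from dataclasses import dataclass, field


@dataclass
class GroupModel:
    """Vocabulary 1: users belong to groups; groups are granted resources."""
    members: dict[str, set[str]] = field(default_factory=dict)  # group -> users
    grants: dict[str, set[str]] = field(default_factory=dict)   # group -> resources

    def can_access(self, user: str, resource: str) -> bool:
        # A user may access a resource if any group they are in is granted it.
        return any(
            user in users and resource in self.grants.get(group, set())
            for group, users in self.members.items()
        )


@dataclass
class KeyModel:
    """Vocabulary 2: users hold keys; each key opens some rooms."""
    keyring: dict[str, set[str]] = field(default_factory=dict)  # user -> keys
    opens: dict[str, set[str]] = field(default_factory=dict)    # key -> rooms

    def can_enter(self, user: str, room: str) -> bool:
        # A user may enter a room if any key they hold opens it.
        return any(
            room in self.opens.get(key, set())
            for key in self.keyring.get(user, set())
        )

    def lend_key(self, owner: str, borrower: str, key: str) -> None:
        # An operation the key metaphor suggests almost by itself.
        if key not in self.keyring.get(owner, set()):
            raise ValueError(f"{owner} does not hold {key}")
        self.keyring.setdefault(borrower, set()).add(key)
```

Neither class is "the" model; they are two sets of pointers at the same underlying thing, and each one nudges you toward different features.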
English has so many "standard" words because the speech community became unusually eager to borrow them from other languages after the Norman conquest (The Anglo-Saxons were actually quite reluctant to do so before). Many speakers of other languages are now just as eager to borrow words from English (or other languages). Look at the pervasiveness of Spanglish and Hinglish, even though they are broadly considered non-standard by the speech community. A Spanish speaker will find just as many different words to describe something as an English speaker if they bother to try.
The hoax is that a language systematically shapes the patterns of thought of the entire speech community in profound ways. There is actually consensus among linguists that it can shape thinking in very subtle ways, however. Noun-gendering is one. A bridge is masculine in French: "le pont". It's feminine in German: "die Bruecke". A French-speaker is more likely to think of a bridge as "strong" or "sturdy", and a German is more likely to describe it as "elegant". But this does not rise to the level of the Sapir–Whorf hypothesis that a language limits its speakers' view of the world.
i dont see the connection to the structure of thought at all. words are just arbitrary utterances that come after thoughts. how could they possibly affect thought?
Portuguese is an example of this - there was a deliberate narrowing of the lexicon in the 20th century, even extending to losing certain tenses like the future perfect, and this has resulted in a narrowing of the field of expression.
For example, in English you can say “by next week, I will have finished the work”, but in Portuguese you boil it down to the simple perfect, and it becomes unclear as to whether you have already done the work or not.
You also just literally can’t translate stuff like “she must have been going to go” or “she would have had to have gone” or “I will have been living here for five years in two weeks”.
This results in a loss of temporal thinking, hypothetical chains, and allocations of causality and responsibility, and while I don’t at all wish to besmirch the good Portuguese people, the results are a real and long-lasting impact on things like economic productivity and the ability to forward plan.
Or maybe it’s just a hangover from Salazar and I’m barking up the wrong tree, but often when I’ve attempted clarification on this stuff the distinction has just not been comprehensible to my interlocutor. I try to use stuff like the future perfect (as it still theoretically exists, but is almost entirely disused) and people just do not understand - either the structure or the concept.
Indo-European languages tend to have a subjunctive mood, and while it's nearly gone in English, we still have the robust distinction between real and unreal situations that mood reflects. This is much hazier in Chinese; it's hard (for an English speaker, and I assume any Indo-European speaker) not to notice that Chinese sentences often don't bother to make a distinction.
The thing about western countries like the USA that people often don't get is this: they are old. They built their sh*t decades and centuries ago, they were the first to build and use fancy new technologies, and now they have to live with it and can't switch as easily as people would wish. Countries like China, which are only now building modern stuff, have the benefit of coming late to the game: they have no technical debt and no old, expensive infrastructure to honor.
So their advantage is not better planning, or throwing more money at it, but mainly being late to the game and learning from the mistakes of others.
Planning for future capacity is a mystery to English speakers for some reason.
> So their advantage is not better planning, or throwing more money at it, but mainly being late to the game and learning from the mistakes of others.
Have you ever looked at the five-year plans the Chinese government publishes? It's quite interesting how much of their economy is planned in advance.
There was no planning involved; it was just a housing bubble which burst some years ago and devastated some companies along the way. The USA had this in 2008 too.
Also, do you seriously think China will reach 4+ billion citizens in the next 10 years? They have less than half of this now and aim to have even fewer.
> Have you ever looked at the five-year plans the Chinese government publishes? It's quite interesting how much of their economy is planned in advance.
Yes, it is, and other countries do the same. And all those plans also struggle and fail regularly, in all countries. China is overall as good or bad as every other country at what it is doing; it's mainly its situation which makes a difference in the outcome.
If you genuinely believe this, we must live in different realities or something.
Have a nice day, PurpleRamen.
As opposed to the US which has no high speed rail whatsoever. It's good to have infrastructure for the future!
Source: trust me bro.
History books are littered with authors trying to explain that their language/culture is superior and the source of their society's success. They _never_ establish a connection besides a few examples like you did, a lot of handwaving, and the fact that their society is currently thriving.
And no one in this thread said superior. They said different.
Portuguese does have a future perfect, and it's basically the same as in English.
Aput: Snow on the ground. Qana: Falling snow. Piqsirpoq: Drifting snow. Kaniq: Frost. Kanevvluk: Fine snow. Muruaneq: Soft deep snow. Nutaryuk: Fresh snow. Pirta: Blizzard. Qengaruk: Snow bank.
Common words off the top of my head:
Snow on the ground: Snowpack, hardpack, powder, crust, crud, piste
Falling snow: snowing, sleet, blizzard, snowstorm.
Drifting snow: snowdrift.
Frost: frost.
Modern scene lingo: pow, corduroy, granular, chunder, cornice, etc.
An extreme example is the "it's called a zygyzgy of ptarmigans"-type alternative words for flocks of specific birds in English, which are basically made up and unused.
(Well, maybe during the North American skiing season more than in late May.)
Arguably the existence of niche words could even mean less, as in the flock words example, words that have more or less been invented as pure word games.
A climbing partner and I counted over 60 words for snow in (our idiosyncratic) English.
So, I guess there are Inuit, English speakers and mountaineers as three different populations.
I heard the Eskimos have over 50 words for a bad example
^ my favorite t-shirt.
So many of these studies also abuse compound words and misunderstand agglutination to produce their shocking counts.
If you want the verb "love", you can cherish, adore, treasure, adulate, worship, dote, or delight in. For the noun, you can feel ardor, passion, eros, devotion, respect. You can feel lust, or infatuation. If you aren't feeling creative, a thesaurus will have plenty more.
Not all of these have meanings identical to "love", but rather suggest different shades of meaning, formality, and approval. This is the major purpose of synonyms.
A lot of that is because we use multi-word phrases instead of single words to express a lot of ideas too. Greek might use philia where we'd just say 'brotherly love'; not having a single word for the concept doesn't make our language any less. Every time I've heard someone say "you can't express x in English", I've been able to express it in 1-4 words. Often we have a word but the other person just isn't familiar with it and assumes it doesn't exist, or assumes it's not known because it was borrowed into English.
The Romans believed that philosophy had to be done in Greek because Latin wasn't suited to the field.
There is a speech (letter?) by Cicero railing against the belief, in which he demonstrates that it's possible and in fact easy to use Latin for all of the concepts that are supposed to be restricted to Greek.
Apparently nobody learned anything from this.
Is it because he wrote it in Latin?
And of course many prominent figures in philosophy wrote their major works in Latin.
Really? Did they ever see them? They had abaci.
Other languages, especially languages that people actually used and that interacted with many other languages, are every bit as prone to complication as English.
And then, yes, agglutination. What's way more interesting to me than how many words Inuktitut or Chinese have for snow is the way the very structure of these languages illustrates how ill-defined a concept "word" is in the first place. You might think you know what it means in English, and that might transfer reasonably well to other Indo-European languages, but as you go further afield you start to see more and more examples where the concept needs heavy modification to remain useful.
For example, "avalanche" is not a word for snow. It's a word for a specific event involving snow. Having a word that meant "snow that is likely to cause an avalanche" actually would be a useful concept that isn't present in English.
Ask anyone who skis what his favorite type of snow is, and his least favorite: champagne powder, fat wet flakes, cold fluff, icy crust. I could probably talk for an hour about the different types of snow and the conditions that lead to them. Some types of snow lead to avalanche conditions. Some are dangerous to drive in. Some are a dream to ski, some make you turn around and go home.
Maybe we don’t have singular words for it, but we certainly can describe the differences in language. It would be insane to think otherwise.
With respect to snow and snow-related things, I actually ran into this personally. That thick icy crust on snow that you've described in your comment - it has a dedicated word for it in Russian, наст (nast). It never occurred to me that there isn't an equivalent single word for that in English in 20 years of living in English-speaking countries because it simply doesn't occur in the areas where I live. Until, one day, it did, and I realized that I have to explain-translate it.
(Some other languages that have a dedicated word for that are Polish, Swedish, and Norwegian)
https://nck.pl/projekty-kulturalne/projekty/ojczysty-dodaj-d...
In Norwegian and Swedish the word is "skare". If I were to translate it to English, I'd just translate it to crust, but it has a similar etymology to English "shear".
Eskimos had over two hundred different words for snow, without which their conversation would probably have got very monotonous. So they would distinguish between thin snow and thick snow, light snow and heavy snow, sludgy snow, brittle snow, snow that came in flurries, snow that came in drifts, snow that came in on the bottom of your neighbor’s boots all over your nice clean igloo floor, the snows of winter, the snows of spring, the snows you remember from your childhood that were so much better than any of your modern snow, fine snow, feathery snow, hill snow, valley snow, snow that falls in the morning, snow that falls at night, snow that falls all of a sudden just when you were going out fishing, and snow that despite all your efforts to train them, the huskies have pissed on.
It's funny but makes a decent argument for the same thing you are. Seems perfectly natural to me.
(Also, any excuse to quote Douglas Adams is worth it...)
But creating that latent space and the corresponding embedding algorithm is hard in the first place. Today’s embedding models could be terrible for the fringe languages this research is about, and we wouldn’t know because we don’t know how to evaluate overall semantic accuracy.
Am I off piste here?
But there is another big issue in that different languages (and especially North American indigenous languages) tend to have radically different notions of "word". As such, deciding what to embed is not easy.
And the article asks the reasonable question "what is the difference between having a single word for a thing versus a commonly understood cluster of words?". It's not a hard boundary.
Every translation loses a little bit of information but potentially brings in different connotations. The things that translators and localizers argue about endlessly: do we look for the words that most closely match the other words, or do we look for feeling and meaning that most closely matches the original intent?
> Linguists Find Proof of Sweeping Language Pattern Once Deemed a ‘Hoax’
Abstract from the cited paper [0]:
> our work suggests that large-scale computational approaches to the topic can produce non-obvious and well-grounded insights about language and culture.
I think I'll continue to be sceptical of the Sapir–Whorf hypothesis.
This subject is something that has been discussed for over a century (to be honest I'm not sure how much it's been considered seriously by linguists in recent years, but hey, I remember it being brought up back in LING201).
The title of the article just seems a bit extreme to me, as if the debate around linguistic relativity is over now that someone ran a counter over some bilingual dictionaries. It's an interesting approach, and maybe it can give some direction into where to look, but I think we'd need a lot more than numerical analysis on dictionaries to prove something about language, and we need to account for other causes of correlations.
E.g., bilingual dictionaries (which this research analyses) are likely to be compiled by people who are aware of these claims about their language. If you're creating a dictionary for a language that is known for having "X words for snow", you'll put more effort into listing many words for snow than many words for taste. Note that bilingual dictionaries often exist for language learning purposes, so they intentionally won't paint a complete picture of the language.
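To show what "running a counter over a bilingual dictionary" could look like, and why the compiler's choices matter, here is a minimal sketch. It is not the paper's method, just the naive version of the idea: the function name `concept_share` and both toy dictionaries are invented for illustration, and the headwords are placeholders rather than real words.

```python
import re


def concept_share(entries: dict[str, str], concept: str) -> float:
    """Fraction of entries whose English gloss contains `concept` as a whole word."""
    if not entries:
        return 0.0
    pattern = re.compile(rf"\b{re.escape(concept)}\b", re.IGNORECASE)
    hits = sum(1 for gloss in entries.values() if pattern.search(gloss))
    return hits / len(entries)


# Hypothetical toy dictionaries (placeholder headword -> English gloss).
# Imagine the same language documented by two different compilers.
compiler_a = {
    "w1": "snow on the ground",
    "w2": "falling snow",
    "w3": "house",
    "w4": "to eat",
}
compiler_b = {
    "w1": "snow",
    "w3": "house",
    "w4": "to eat",
    "w5": "river",
}

share_a = concept_share(compiler_a, "snow")  # 2 of 4 glosses mention snow
share_b = concept_share(compiler_b, "snow")  # 1 of 4 glosses mention snow
```

The two scores differ only because one compiler bothered to list a second snow entry, which is exactly the kind of confound a dictionary-based count would have to rule out.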
That's just wrong. Ask anyone who does alpinism, snowboarding, or skiing. You'll hear at least 10 different words for this substance.
I'll just get my coat...
Translation is only possible because we are all humans and have experienced broadly similar concepts, but there's a limit to it, especially in social milieu and in how we conceptualize ourselves in society.
To truly understand another people and culture at a deep level, you need to learn their native tongue and their living environment -- This is what I've internalized as a long-time learner and teacher of languages.
https://cslc.nd.edu/assets/141348/pullum_eskimo_vocabhoax.pd...
> What I do here is very little more than an extended review and elaboration on Laura Martin's wonderful American Anthropologist report of 1986. Laura Martin is professor and chair of the Department of Anthropology at the Cleveland State University. She endures calmly the fact that virtually no one listened to her when she first published. It may be that few will listen to me as I explain in different words to another audience what she pointed out. But the truth is that the Eskimos do not have lots of different words for snow, and no one who knows anything about Eskimo (or more accurately, about the Inuit and Yupik families of related languages spoken by Eskimos from Siberia to Greenland) has ever said they do. Anyone who insists on simply checking their primary sources will find that they are quite unable to document the alleged facts about snow vocabulary (but nobody ever checks, because the truth might not be what the reading public wants to hear).
https://en.wikipedia.org/wiki/Geoffrey_K._Pullum
(and there is a Wikipedia page for this topic: https://en.wikipedia.org/wiki/Eskimo_words_for_snow)
I'll share some other revealing or at least interesting examples I liked; I'll paste below some cherry-picked excerpts from a conversation I had with an LLM:
Japanese: Honne vs. Tatemae
Words: 本音 (honne, true feelings) vs. 建前 (tatemae, public facade)
Cultural significance: Japanese society values harmony and social cohesion. The existence of specific terms for “what you really think” vs. “what you say to maintain face” reflects the high cultural importance of context-sensitive communication and emotional restraint.
Korean: Nuanced honorifics
System: Verbal endings, titles, and pronouns change based on age, status, and relationship
Cultural significance: The extreme granularity of politeness levels in Korean reflects a hierarchical, Confucian-influenced society where social status, age, and respect are central to daily interactions.
Russian: Degrees of truth and lies
Words: ложь (lozh, a lie), неправда (nepravda, untruth), and правда (pravda, truth)
Cultural significance: Russian distinguishes between lies and non-truths—which can imply omission, alternative interpretations, or state-controlled narratives. The prominence of pravda (also the name of a Soviet newspaper) shows how central truth and its manipulation are in Russian cultural-political life.
Spanish: Ser vs. Estar (to be)
Words: Ser (essential being) vs. Estar (temporary state)
Cultural significance: The fact that Spanish makes a grammatical distinction between inherent traits (ser feliz – being a happy person) and current states (estar feliz – feeling happy now) may reflect a worldview that embraces fluidity in personal and social identity.
Danish: Hygge
Word: Hygge — cozy, intimate, contented atmosphere
Cultural significance: This untranslatable term reflects a cultural emphasis on modest comfort, emotional safety, and communal well-being—especially during long, cold winters. It's not just a word but a cultural ideal.
Finnish’s lack of a future tense
Finnish uses present-tense forms to talk about future events, relying on context or adverbs instead of a separate future-tense verb. Some linguists argue this encourages a more present-focused worldview, though opinions vary.
Tsimané (Amazonian): No Fixed Future vs. Past Distinction
Grammar: Many Amazonian languages (like Tsimané) have clear past vs. “non-past” rather than past vs. future.
Cultural significance: Reflects a worldview where the future is not an ontological category—reinforcing an orientation toward present action and community relationships rather than distant plans.
French savoir vs. connaître
French distinguishes “knowing how” (savoir) from “knowing someone or being familiar with something” (connaître). English’s single “know” hides this nuance, whereas French speakers constantly signal whether they’re referring to factual/learned knowledge or personal acquaintance/experience.
Georgian: Evidentiality Markers
Grammar: Verb prefixes or particles indicate how the speaker knows what they’re saying (e.g., witnessed vs. heard vs. inferred).
Cultural significance: The need to signal source of knowledge underlines a communal emphasis on accuracy, trustworthiness, and relational nuance.
Quechua: Three-way Evidentiality
Markers: Distinguish whether information is firsthand (-mi), hearsay (-si), or inferred (-chá).
Cultural significance: Highlights a worldview where knowing how one learned something is as important as the information itself—rooted in oral tradition and communal storytelling.