Maltese, interestingly, is the only Afro-Asiatic derived language.
Hungarian, Finnish, and Estonian are the three Uralic languages.
All the others are Indo-European, Greek being the only Hellenic one, Irish the only Celtic, the rest are Baltic, Slavic, Italic, or Germanic.
(I originally used the term Balto-Slavic, though I was unaware of some of the connotations of that term until just now. Baltic and Slavic do share a common origin, but that was a very very long time ago)
It's Semitic, to be precise.
The family tree model seems to assume that every language has only 1 direct ancestor. It seems to have been inspired by phylogenetic trees in biology. In phylogenetics, single-parent trees work fine because distantly related species can't breed with one another. By contrast, different languages borrow features from one another all the time. It could perhaps be useful for some languages, but not for English, I reckon.
In the Yiddish original: "אַ שפּראַך איז אַ דיאַלעקט מיט אַן אַרמיי און פֿלאָט" ("a language is a dialect with an army and navy"), see: https://en.wikipedia.org/wiki/A_language_is_a_dialect_with_a...
So tough that "siege of Malta" needs a disambiguation page on Wikipedia.
Not in my experience. Not at all actually. My experience with Arabic speakers is that they think they're understanding when you speak Maltese, because it sounds kind of familiar, but in actual fact they're not understanding much at all.
Which is not surprising after a thousand years of divergence.
Linguistically, it does not matter -- there is no objective definition of the difference between a language, a dialect, or whatever -lect.
It's the opposite: "it's a different language" is usually just a nationalistic desire for differentiation of what are essentially dialects/variants of a language.
>Linguistically, it does not matter -- there is no objective definition of the difference between a language, a dialect, or whatever -lect.
That's more because academic linguistics, as developed in the latter half of the 20th century, had to pay lip service to several ideologies, rather than there not actually being good practical ways to discern e.g. Arabic as a single basic language with different variants.
> It's the opposite: "it's a different language" is usually just a nationalistic desire for differentiation of what are essentially dialects/variants of a language.
It's both. The idea that Ukrainian is an uneducated farmer's dialect of Russian is a common talking point in the "Greater Russia / Russkiy Mir" narrative. Conversely, asserting the status of the Ukrainian language is a big part of Ukrainian identity in the face of an imperial invasion.
As someone who once studied General Linguistics, I don't understand this remark. I've learned that calling something a language is a political act and often of great significance to the speakers, but is almost never well-defined from a purely linguistic perspective. That's a fact. Although you can sometimes find typological criteria to further argue that a variety is a language on its own, for example there are good grammatical reasons for not counting Swiss German as a variety of German, you will also find examples the other way around where two varieties have large lexical and grammatical differences and still count as the same language.
The strongest criteria for what counts as a language are based on language origins (as opposed to typology), and these do not generally suffice or make meaningful distinctions to varieties (~dialects). Mutual comprehensibility can be very low for speakers of the same language, which is why most research focuses on varieties or on speaker groups that are of particular sociolinguistic interest.
I don't get why you talk about "academic linguistics" as if there was a non-academic one and why you think linguistics "had to pay lip service to several ideologies." What are you talking about?
>As someone who once studied General Linguistics, I don't understand this remark. I've learned that calling something a language is a political act and often of great significance to the speakers, but is almost never well-defined from a purely linguistic perspective. That's a fact.
Yes, this ideologically motivated idea, after enough repetition, became "a fact" of the field, as if describing some objective physical law, and even non-political students will be taught and stick to the same (and anybody with a dissenting opinion will be getting an earful if not committing career suicide).
This wasn't always the case, it's more so with liberalism prevailing, especially in the latter half of the 20th century.
Best get to retraining those models.
They get certain recognition, but they are not official in Europe. For example, just from Spain there are 13 languages on that list.
Irish is certainly not a robust vigorous language but your 40,000-80,000 numbers downplay it I'd suggest. Here are some statistics from Deepseek
   Category             Region                  Number of Speakers              Source & Year
   Some Ability         Republic of Ireland     1,873,997 (40% of population)   2022 Census
   Some Ability         Northern Ireland        228,600 (12.4% of population)   2021 Census
   Daily Speakers       Republic of Ireland     71,968                          2022 Census
   Daily Speakers       Northern Ireland        43,557                          2021 Census
   Native/Fluent        Global Estimate         ~80,000-170,000                 Various Sources
   Speakers U.S.        United States           ~20,000+                        Estimate
Whereas Irish seems to be heavily promoted but for whatever reason precious few people learn it as their mother tongue, and those who do so are primarily in an area where it’s always been that way. For better or worse, people are preferring to use English at home and Irish is treated like a luxury good.
Sorry, but it is.
Fun fact: villages, towns, and cities in Frisia often have names which differ in Frisian and Dutch. In those cases the signs at the place limits will have both names listed; the official one on top (which in some cases is the Dutch name (e.g., Leeuwarden/Ljouwert) and in some cases the Frisian (e.g., Gytsjerk/Giekerk)).
And huh interesting, I didn't know that for some places with bilingual names, the Dutch name is official and for others the Frisian is? Who gets to decide that, the municipality?
In a number of cases originally Frisian names actually supplanted older Dutch names (e.g., Burgum, Grou, Eastermar, etc.), so those places have just one name in both languages (except on the Dutch language Wikipedia because of weird reasoning about allowable sources and apparently a hatred of Frisianised Dutch names).
But "official" means exactly what it means, and when I'm saying "Frisian is an official language of the Netherlands", it means that it's recognized as an official language of the Netherlands, by the Dutch government. And if it was up to the provinces I dunno, but it's not. Frisian is the one that's considered one of the official languages of the Netherlands.
I also don't think comparing to Italy makes sense at all because countries are different and decide what are their official languages for very different historical reasons. For instance you can look up which Dutch government body is responsible for deciding the Frisian language is an official one in the Netherlands and why, and you will very likely find no Italian equivalent of that.
Then there are other, regionally recognized languages that local governments use alongside the national one (West Frisian in Friesland, German in South Tyrol, etc.), and that may even enjoy a majority of speakers within those regions, but they are not "an official language OF" the wider country.
Which is exactly what it says for German in Italy, mutatis mutandis.
Deepseek answers with, “The Constitution of the Netherlands is written in Dutch.
    Dutch is the official language of the Netherlands and the language used for all primary government and legal documents, including the Constitution (Grondwet).
    Official Language: Dutch is the sole official language for national governance.
    The Kingdom of the Netherlands: It's worth noting that the Kingdom of the Netherlands also includes the Caribbean countries of Aruba, Curaçao, and Sint Maarten. While they have their own official languages (Papiamento and English), the Charter for the Kingdom of the Netherlands, which governs the relationship between these countries, is also originally written in Dutch.
    No Multilingual Version: Unlike some countries (e.g., Canada, Belgium, or Switzerland), the Netherlands does not have an official, legally equivalent version of its Constitution in any other language.
    Therefore, the authoritative and legally binding text of the Constitution exists only in Dutch.”
The Constitution of Ireland is written in Irish and English and to the best of my knowledge where differences arise the Irish one takes precedence.
“Brea, bûter en griene tsiis is goed Ingelsk en goed Frysk” (“Bread, butter and green cheese is good English and good Frisian”)
I’m sure everyone is aware that English comes from Anglish, i.e., the Angles as in the Germanic tribe.
Deutsch is derived from Proto-Germanic (as best we can tell) þiudiskaz, meaning “of the people”, i.e., the group of the different self-associating tribes. It gets far more interesting in that it seems many of the strong dialects of especially southern Germany, Austria, and England have in fact retained some very old words and pronunciations that were lost in more standardized, conformed, and perverted dialects.
> Some of these [Old Saxon] speakers took part in the Germanic conquest of England in the fifth century AD. While it is not true that English and Plattdeutsch derive completely from the same source, the Old Saxon input into Anglo-Saxon was of primary importance and this linguistic group contributed greatly to the Anglo-Saxon dialects which our English forefathers spoke.
https://web.archive.org/web/20170530232902/https://blogs.bl....
> Whoever preserved this story was also curious about Ohthere’s descriptions of where the Angles had lived ‘before they came into this land’ (England). Members of Alfred's court remembered that their ancestors came from mainland Europe, and they wanted to learn more about the lands which they identified as their own places of origin.
The scribe explicitly wrote things like "he said krán which we call crein", showing they were speaking in their own languages. It's even clearer if you consider our standard Old English is West Saxon from 850 and our standard Old Norse is from 1250 in Iceland (more different than the Danish variety of most Scandinavians in England). At the same point in time, they would have had more similarities (8th century Danish had wír before w turned to v).
Latin would have been spoken in towns and cities but as Roman rule collapsed it was replaced by Brittonic (ancestor of Welsh), unlike on the continent where it developed into various Latin-derived Romance languages.
The cadence and general way it sounds is much closer to English than any other language
https://www.politico.eu/article/catalan-basque-galician-boos...
It was the de facto language, but not the official language. Which was baffling.
EDIT: It's worth noting that this is mostly a spoken thing, AIUI - most formal/semi-formal writing would be in Hochdeutsch rather than a local dialect.
Historically, Germany used to be divided into countless small fiefdoms, and each of them used to speak its own unique, barely mutually intelligible language.
Hochdeutsch is in opposition to Niederdeutsch which Dutch and arguably English are a variety of.
The dialects are a whole other thing though.
Any literate German can read the NZZ easily, but they cannot have a colloquial conversation with an average person from Zürich, unless the latter switches to standard German (which is a foreign language for them, though one they have to learn from age 6).
I presume they also pick up a lot of standard German in the media: there's lots of German movies, and Germany has the biggest movie dubbing industry in the world, too. There's some Swiss German media, but not nearly as much as there's on offer in standard German.
As a native French speaker, no other language gives me that "why don't I understand what they say... oh, right, that's not my language!" feeling. Something with frequencies used, I suppose, but it always puzzles me.
E.g., it does a passable impression of Singapore's Singlish.
Also, English remains one of the main working languages of the EU bureaucracy, because for many EU states (especially in Eastern Europe) it is a more popular foreign language than the other two (French and German). When Czech diplomats need to talk to Spanish diplomats, English is the language they choose.
This idea people have here that “each country gets to nominate a language” isn’t how it actually works. The treaties just contain a list of languages, and which languages are in the list is down to diplomatic negotiations not any coherent principle.
https://www.irishstatutebook.ie/eli/2003/act/32/enacted/en/p...
It says that each country can only request ONE language. And Ireland requested Irish.
(In fact to strengthen that probability, if it had been say French, when and why would it have switched to English? Just because the UK joined?)
https://www.reddit.com/r/northernireland/comments/1fivtob/no...
People closer to the issue are better-placed to gather the necessary information, but again: strong feeling. Most people find it hard to get past that. The most informed person I know is so biased that I don't at all trust their conclusions.
Does modern English read like historical English?
> Native speakers have complained that official documents and signage in Ulster-Scots are incomprehensible to them.
Sure, there are tonnes of issues with the "officialisation" of any language but the fact that there are "native speakers" involved in the debate strongly suggests it wasn't all just made up for political reasons, which was the point I was responding to.
If you can read and understand text from the 18th century, then yes. We're not talking about Middle English or Old English.
>but the fact that there are "native speakers" involved in the debate
I should have put native speakers in quotes as well. What counts as a native Ulster Scots speaker is someone who speaks English with an NI accent with some localisms thrown in.
Nobody speaks the official Ulster Scots that was invented because the Irish language was getting support and political leaders on the other side of the community felt they deserved something as well. The Protestant community in NI see it as a bit of an embarrassment.
Yes, and I can read and understand historical Ulster Scots as well, but you were making a different point about codification/drift, no? The English I would find in those historical writings is quite different from what is being taught in schools today or recommended in style guides.
> What counts as a native Ulster Scots speaker is someone who speaks English with an NI accent with some localisms thrown in.
Then by your definition I am a native speaker. So how can we square it that you're telling me native speakers feel one way while I feel another way?
> Nobody speaks the official Ulster Scots
That's the nature of any newly codified minority language.
> The Protestant community in NI see it as a bit of an embarrassment.
There is no "protestant community" in Northern Ireland. A Dungannon farmer, an East Belfast loyalist and a BT9 lecturer will all give you very different views despite being of protestant background.
I'm not entertaining the notion that I have to pretend you're a native speaker when you've made clear you're only identifying as such for the purpose of making an argument.
>There is no "protestant community" in Northern Ireland.
Anyone who applies for a job in NI fills out a form where they are asked if they are a member of "the Protestant community", "the Roman Catholic community" or neither. You're denying the factual existence of the different communities in NI for the purpose of winning an argument on the internet.
Could you outline the key ways in which it differs? And say why that suggests the language was later "fabricated?"
> I'm not entertaining the notion that I have to pretend you're a native speaker when you've made clear you're only identifying as such for the purpose of making an argument.
If you won't entertain the notion that I'm a native speaker could you amend your definition of "native speaker" or explain what differentiates me from the native speakers whose complaints you referenced previously? And could you let us know where we can read about their complaints?
> Anyone who applies for a job in NI fills out a form where they are asked if they are a member of "the Protestant community", "the Roman Catholic community" or neither.
Of course you understand that the "protestant community" is not a homogeneous group with shared views and opinions on these things. The reason that question is on the forms is because of historical discrimination against Catholics and the need to quantify heritage issues in order to avoid such discrimination going forward.
One protestant might feel embarrassment, another might feel pride, and another might not care at all. Suggesting there's a unified view from "the protestant community" is disingenuous.
This will answer all your queries.
>Suggesting there's a unified view from "the protestant community' is disingenuous.
I've yet to meet a member of that community in person (now you've decided they exist) who has any interest in Ulster Scots as a language, (even people who are quite opinionated and argumentative on other NI topics). This is evident in the lack of Ulster Scots language classes. There are more Irish classes running in East Belfast than for Ulster Scots.
Outside of the political class (who are only interested in it as a means to stifle support for the Irish language) Ulster Scots advocates are exclusively found online.
It doesn't. It's just an opinion piece about the use of neologisms in certain publications. It makes the same claim about incomprehensibility for native speakers but also fails to reference the voices of any actual native speakers. Who are they? Do they really complain about this as you said?
> I've yet to meet a member of that community in person who has any interest in Ulster Scots as a language
Well? I have met them. I've met lecturers at Queen's such as Ivan Herbison studying the thing, I've met artists like Willie Drennan touring the country sharing contemporary poetry and song in Ulster Scots. I've met people in the countryside of Antrim not only with an interest in it, but speaking it day to day. Just because you haven't personally encountered these people doesn't mean they don't exist.
> now you've decided they exist
This is quite unfriendly. I made a clear distinction between what you were claiming--a single protestant community who are collectively embarrassed by Ulster Scots--and the collection of people with a shared background who identify as protestants for the sake of anti-discrimination laws, but who are otherwise diverse in their beliefs and opinions. To say that in so doing I somehow conceded your original claim is again disingenuous. It also seems absurd in relation to your broader point to now insist that just because some politician decided a form should say "protestant community" that that is necessarily reflective of an on-the-ground reality.
> There are more Irish classes running in East Belfast than for Ulster Scots.
By your definition of native speakers everyone in East Belfast is already brought up speaking Ulster Scots at home, so of course there's more interest in other languages. There are more people from East Belfast attending Irish classes than English classes too, it doesn't mean no one is interested in English.
>By your definition of native speakers everyone in East Belfast is already brought up speaking Ulster Scots at home
But reading and writing in it? And would they agree they're speaking Ulster Scots or would they say it's English?
>There are more people from East Belfast attending Irish classes than English classes too
Did you not learn English in school? I find it hard to believe English isn't taught in East Belfast schools. And that's not counting English as a second language classes for immigrant communities. What language is the signage in in East Belfast?
Simply put, Ulster Scots' prominence in legislation is merely a reflection of bad-faith political negotiations by Unionists to degrade the status of the Irish Language Act by proxy. Anyone on the ground knows it for the dog-whistle that it is, used simply to curry favour with a particularly sectarian unionist base as a counter to the Irish Language provisions outlined and agreed to in the Good Friday Agreement.
And that's 'curry favour' - not 'curry my yoghurt' by the way. https://www.bbc.com/news/uk-northern-ireland-29895593
This has more or less been the case ever since the forced Ulster plantations led to the development of Ulster Scots as a defined community with resilient Protestant and unionist ties. It'd be far more credible if Fingal tried to secede from Dublin and the Republic tomorrow morning using Yola as a justification.
https://en.wikipedia.org/wiki/Yola_dialect
In short, the ILA and promotion of Gaeilge in the north is about trying to make some small reparation at a state level for a cultural genocide perpetrated by our Colonists, and to help re-establish the oldest written vernacular language in western Europe, dating back over 2,500 years.
The promotion of Ulster Scots however... well the Commissioner is literally called 'Commissioner for Ulster Scots and Ulster British Tradition'. This is after DUP members removed themselves from the equality and good relations group after basically filibustering for 5 years of discussions on bi-lingual signs to force a stalemate.
https://www.belfastlive.co.uk/news/northern-ireland/dup-stor...
Ah right, I get you now! The point you're making is fair enough, apologies for drawing the labour from you to explain it so fully.
Specifically, the link says two things:
1. That 2 parties want to add *Limburgish* to the list, not Frisian. That's the bottom-right part of the Netherlands, about as far removed from Friesland as you can get (which is the top part of the Netherlands).
2. That one party wants to add Frisian, but that is a one-day-fly party that will cease to exist in a few hours, as they will get 0 seats in this election and will presumably call it a day right after. It was a party founded to support one person, and that person has quit due to work stress and is highly unlikely to return, as this _was_ his return. Their opinion used to be relevant as they had 13.3% of the seats this past session (and didn't exist before it). But it isn't anymore.
can arabic people understand maltese?
That’s a really interesting question — and the answer is: *partially, but not easily.*
Here’s why:
### Linguistic roots
Maltese is a *Semitic language*, and its *core grammar and basic vocabulary* come from *Arabic*, specifically from *Siculo-Arabic*, the dialect of Arabic spoken in Sicily and Malta about 1,000 years ago. Because of that, *many Maltese words sound familiar* to Arabic speakers — especially from the *Maghrebi (North African)* or *Levantine* dialects.
For example:
| Maltese | Meaning | Similar in Arabic |
| ------- | ------- | ----------------- |
| Dar | house | دار (dar) |
| Kelb | dog | كلب (kalb) |
| Seba | seven | سبعة (sabʿa) |
| Xemx | sun | شمس (shams) |
### Influence from Italian and English
However, over the centuries, Maltese absorbed *a lot of Italian (especially Sicilian)* and *English* vocabulary — so modern Maltese is *a hybrid*. Roughly:
* 30–40% of its vocabulary is Semitic (Arabic origin),
* 40–50% is Romance (mostly Italian/Sicilian),
* and the rest is English and other sources.
That means Arabic speakers might *recognize some words and structures*, but they’ll *struggle to understand full sentences*, especially because:
* Pronunciation has changed,
* Grammar evolved differently,
* Many everyday words are not Arabic anymore.
### Summary
So:
* *Yes*, Maltese and Arabic share a deep connection — like cousins.
* *No*, they’re *not mutually intelligible* today. An Arabic speaker might catch words here and there, but a real conversation would be hard without studying Maltese.
The above is exactly my experience with Arabic speakers by the way. Again, not surprising after 1k years of divergence.
I do wonder what natives think and feel about the longevity of their language? What is taught in schools at what ages (assuming English is in the mix somewhere)? Is there enough media in Maltese for the Maltese to go about modern life fully in Maltese? It’s shockingly hard to find any information on Maltese, and even harder to find content.
I’m not sure if it’s dying out, or in danger thereof; if there are preservation efforts, or if there is no need.
Businesses do work in Maltese and English. Both are official languages. It's quite rare to encounter a business that deals near exclusively in Maltese. Many prefer Maltese but will fall back to English where necessary.
Regarding monolingual speakers, I think there's a lot of stereotypes for Maltese-only, English-only and code switchers. I think it's all a bit silly... So as long as communication can happen I don't fuss.
On Maltese music... There's a lot of low-ish quality music, then there's a few absolute gems. Look up The Travellers, Lapes, Jon Mallia on YouTube/Spotify.
I was surprised to hear Maltese radio stations played in taxis, while visiting Malta just a few weeks back
There was a point about 7 years ago when the Overton window shifted to "speak English to strangers first" because of a large influx of foreigners who did not know the language. Since then I've met foreigners who have better Maltese than some natives.
Older folks & geriatrics will sometimes be surprised when they assume someone is foreign and they turn out to be Maltese. "int Malti??" ("you're Maltese??") is a statement I get often because I don't look Mediterranean despite being born here.
Source: I'm also Maltese.
Arabic (language): al-‘arabiyyah (الْعَرَبِيَّة).
How much do you consider Maltese its own language (as opposed to a dialect of Arabic)?
Maltese is definitely its own language. Arabic roots are there (there's a Semitic joke in there) but it isn't Arabic anymore. It's written left to right with a variant of the English alphabet.
Hindi and Urdu are 90% the exact same language, and are mutually intelligible (an Urdu speaker and a Hindi speaker can have a full conversation with each other), but each is written differently (one LTR, the other RTL) and with different alphabets.
In my books, the distinction between languages and dialects is so arbitrary that the best method is simply to ask the people that speak those languages/dialects. If they consider them to be different languages (which Maltese speakers seemingly do) I call them different languages.
I don't buy the argument of just asking the speakers. There are cultural, political, etc. reasons people may think things which don't conform with reality. Many Hindi-Urdu speakers get insulted by the reality that the languages are pretty much the same because they don't want to identify with people from another country their country is constantly at war with.
I don't think anyone would seriously consider it a dialect of Arabic though with its completely different alphabet and half the vocabulary and morphology coming from Italian languages/dialects, even if Malta hadn't spent the best part of a millennium trying very hard not to become part of the Arab world
I updated my original comment, and learned a good amount about that dispute as a result, so thanks for calling it out.
I think some people get touchy about them being lumped together if their last period of commonality (per the article) was 1400 BCE. For comparison, I believe all the Slavic languages were mutually intelligible around 1200 AD. But much more recently than this, in the last few centuries, there have been notable attempts by East Slavs to absorb the Baltic language cultures and deny them.
I doubt West and East Slavic were. But inside those geographic groups they probably were (Czech and Polish AFAIR were around that time).
It is an example I think of often, about how quickly languages can change. On the scale of 1000 years, a lot changes. Most of the diversity in Romance languages is from around that timescale too; they really started to diverge substantially around 900-1100 AD.
The fact that they are the closest surviving relatives doesn't on its own mean it makes sense to group them together (i.e. Italo-Celtic is also a theorized subgroup in a similar way but nobody is disputing that Celtic and Italic languages evolved into distinct groups).
Then there is a huge number of missing links and unknown unknowns, e.g. Thracian and Dacian probably were also pretty close to Baltic or Slavic (maybe even closer to Baltic than Slavic is, but we don't know enough about them to make any conclusive claims at all... but we at least know these languages existed).
Well, that and Romanian. And Hungarian. And outside the EU, Albanian. And Georgian, Azeri and Armenian if you consider those Eastern Europe.
In my mind, I was thinking of the belt of countries between Russia and Central Europe, starting from the Baltics down to the Balkans (excluding Greece).
Some of my fellow Romanians will also claim they're Central European, but in my mind, all the ones I listed are Eastern European countries. I'd even include Turkey and Kazakhstan in there, part of the latter is to the West of the Urals, which is what we normally consider the border between Europe and Asia.
https://www.researchgate.net/publication/382295560/figure/fi...
https://www.worldatlas.com/r/w960-q80/upload/03/90/9b/countr...
Albania is clearly south east Europe.
And, I don't care about your random Romanian friend's anecdote.
Yes, it is clearly south east Europe. East.
> And, I don't care about your random Romanian friend's anecdote.
Who's my friend?
There is a branch that contains both Baltic and Slavic languages, but there's also one that contains Albanian and Greek.
There have been some attempts to tie Albanian to Germanic, or Greek, or other branches, but they all have failed.
At some point they all are Indo-European, but they split a long way back.
and
> only Estonian is not a Slavic language.
So following this logic saying "in Eastern Europe, only Estonian is not a Baltic language" would make as much sense?
Are there really any other Hellenic languages besides Greek?
> as well as some additional relevant languages (Arabic, Catalan, Chinese, Galician, Hindi, Japanese, Korean, Norwegian, Russian, Turkish, and Ukrainian).
https://arxiv.org/pdf/2409.16235
The paper also goes into detail on training set sources; I feel like the curation thereof might be considered the main contribution of this publication?
What about Basque? Is that too controversial?
Now, being from Belgium, even within that small part of the country where everybody is supposed to speak Dutch, I genuinely don't understand people from near the coast, which was about 150 miles from where I used to live.
What I find interesting is that the differences in Flemish dialects make them much more distinct than what you would normally call dialects. There are significant grammatical differences beyond the usual vocabulary differences. For instance, coastal Flemish conjugates yes and no[1], and Limburgish is a tonal language.
So for instance, Basque is not an official language of any country (only French in France and Spanish/Castilian in Spain). Belgium's official languages are French, Dutch, and German; "Flemish" is only a local variant of Dutch (Belgian French is also only a local variant of French).
In the US, people will resort to fisticuffs, over variants of Spanish. I usually translate into Castilian Spanish, because that seems to be the equivalent of "Vanilla" Spanish. No one is really happy (except the Spaniards), but I'm not accused of favoritism.
Section 3
(1) Castilian is the official Spanish language of the State. All Spaniards have the duty to know it and the right to use it.
(2) The other Spanish languages shall also be official in the respective Autonomous Communities in accordance with their Statutes.
(3) The richness of the different linguistic modalities of Spain is a cultural heritage which shall be specially respected and protected. [1]
[1] https://www.senado.es/web/conocersenado/normas/constitucion/...
From https://european-union.europa.eu/principles-countries-histor... we can find an excerpt relating to the policy and its purpose:
>One of the EU’s founding principles is multilingualism.
>This policy aims to:
>communicating with its citizens in their own languages
>protecting Europe’s rich linguistic diversity
>promoting language learning in Europe
With this in mind, the first intention fails by an enormous margin, given that 95%+ of Spain doesn't speak an iota of Basque, the second is met handily, given the long history of the language, and I'm not sure what to think about the third, any language whatsoever would serve that purpose.
https://en.wikipedia.org/wiki/A_language_is_a_dialect_with_a...
It's mostly true, though, even if it's a somewhat simplified view.
It's all Greek, to me...
Old-fashioned one.
Get off my lawn...
I have often joked that Norwegian is just a dialect of Swedish, but I never expected to get official validation like this!
That being said, the Scandinavian languages all come from Old Norse, and modern national constructs aside, most of the people in those areas descend from the same mix of Germanic tribes. There's no denying that modern-day Danish, Norwegian and Swedish are very similar.
[0] https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:01...
PS: Gaelic is a more general term covering Irish and Scottish. Ireland brings specifically the Irish language (Gaeilge in Irish).
Arabic, Catalan, Chinese, Galician, Hindi, Japanese, Korean, Norwegian, Russian, Turkish, and Ukrainian.
>Europe is the only continent in the world to have a large public network of supercomputers that are managed by the EuroHPC Joint Undertaking (EuroHPC JU). As soon as we received the EuroHPC JU access to the supercomputer, we were ready to roll up our sleeves and get to work. We developed the small model right away and in less than 6 months the second model was ready.
[1] https://www.eurohpc-ju.europa.eu/eurohpc-success-story-speak...
Repurposing some of that physics sim compute
Who would have thought that Europe is the only continent to have a network of supercomputers managed by Europe⸮
But they were not trained on government-sanctioned homegrown EU data.
Karpathy repeated this in a recent interview [0], that if you'd look at random samples in the pretraining set you'd mostly see a lot of garbage text. And that it's very surprising it works at all.
The labs have focused a lot more on finetuning (posttraining) and RL lately, and from my understanding that's where all the desirable properties of an LLM are trained into it. Pretraining just teaches the LLM the semantic relations it needs as the foundation for finetuning to work.
Pretraining teaches LLMs everything. SFT and RL is about putting that "everything" into useful configurations and gluing it together so that it works better.
How would they curate it on that scale? Does page ranking (popularity) produce interesting pages for this purpose? I'm skeptical.
ok what are you implying on this
If none of the LLM makers used the very big corpus of EU multilingual data, I have an EU regulation bridge to sell you.
There is no necessary correlation between language and the correct set of laws to reference. The language of the question (or the answer, if for some reason they are not the same) is an orthogonal issue to the intended scope. There is no reason US laws couldn't be relevant to a question asked in German (and, conversely, no reason US laws couldn't be wrong for a question asked in English, even if it was specifically and distinguishably US English.)
For most questions it does this pretty well (e.g. asking for the legal age to drink). However, once the answer becomes more complex it starts to hallucinate very quickly. The fact that some of the hallucinations are just translated US laws makes me think that the knowledge transfer between languages is probably not helping in instances like this.
In recent LLMs, filtered internet text is at the low end of the quality spectrum. The higher end is curated scientific papers, synthetic and rephrased text, RLHF conversations, reasoning CoTs, etc. English/Chinese/Python/JavaScript dominate here.
The issue is that when there's a difference in training data quality between languages, LLMs likely associate that difference with the languages if not explicitly compensated for.
IMO it would be far more impactful to generate and publish high-quality data for minority languages for current model trainers, than to train new models that are simply enriched with a higher percentage of low-quality internet scrapings for the languages.
I have to admit I have not encountered significant mistokenization issues in Japanese, but I'm not using LLMs in Japanese on a daily basis. I'm somewhat doubtful this can be a major issue, since frontier LLMs are absolutely in love with emoji, and most emoji require at least 4 UTF-8 bytes, while most Japanese characters are happy with just 3 bytes.
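The byte counts are easy to check. A minimal Python sketch follows; the sample characters are my own picks for illustration, not anything from the thread, and note that byte length is only a rough proxy for how a given tokenizer actually splits text.

```python
# Sample characters are illustrative assumptions, not taken from the thread.
samples = {
    "hiragana": "あ",
    "kanji": "日",
    "emoji": "😀",
}

def utf8_len(ch: str) -> int:
    """Number of bytes needed to encode a single character in UTF-8."""
    return len(ch.encode("utf-8"))

byte_counts = {label: utf8_len(ch) for label, ch in samples.items()}
for label, n in byte_counts.items():
    print(f"{label}: {n} bytes")
```

Emoji built from several code points (skin tones, ZWJ sequences) take more than 4 bytes, so if anything the gap is wider than this shows.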
Plus all your T&S/AI Safety is not solved with translation, you need lexicons and data sets of examples.
Like, people use someone in Malaysia, to label the Arabic spoken by someone playing a video game in Doha - the cultural context is missing.
The best proxy to show the degree of lopsidedness was from this: https://cdt.org/insights/lost-in-translation-large-language-...
Which in turn had to base it on this: https://stats.aclrollingreview.org/submissions/linguistic-di...
From what I am aware of, LLM capability degrades once you move out of English, and many nation states are either building, or considering the option of building their own LLMs.
ChatGPT, for example, tends to start emails with "ich hoffe, es geht dir gut!", which means "I hope you are well!". In English (especially American) corporate emails this is a really common way to start an email. In German it is not, as "how are you" isn't a common phrase used here.
But also European culture could maybe make a difference? You can already see big differences between Grok and ChatGPT in terms of values.
European culture is already embedded in all the models, unless the people involved in this project have some hidden trove of private data that they're training on which diverges drastically from things Europeans have published publicly (I'm 99.9% positive they don't...especially given Europe's alarmist attitude around anything related to data).
I think people don't understand that a huge percentage of the employees at OpenAI, Anthropic, etc. are non-US born.
Comparison with similar EU models + 600 other highlights:
I think LLMs may be on the whole very positive for endangered languages such as Irish, but before it becomes positive I think there's an amount of danger to be navigated (see Scots Gaelic wikipedia drama for example)
In any case I think this is a great initiative.
Europe has about 1.3 times the population of the USA and about 75% of the GDP yet EU tech output is a very small percentage of US tech output. We are not talking about 70, 50, 30, or even 20%. It's a drop in the bucket.
>The seven largest U.S. tech companies, Alphabet (Google), Amazon, Apple, Meta, Microsoft, Nvidia, and Tesla, are 20 times bigger than Europe’s seven largest, and generate 10 times more revenue.
https://eqtgroup.com/thinq/technology/why-is-europes-tech-in...
"Why" is a good question, but I definitely wouldnt expect significant competition in LLMs from Europe based on the giant tech disparity. Having 1 non-cutting edge model that isn't really competitive is pretty much what I would expect.
I'm going to guess that this part is intentional. Europe tends to be more aggressive in enforcing antitrust laws. Economically, Europe's goal isn't to have the biggest companies but to have more smaller companies.
So you're not going to get companies like Google, but you will get companies like Proton, Spotify, Tuta, Hetzner, Mistral, Threema, Filen, Babbel, Nextcloud, CryptPad, DeepL, Vivaldi, and so on.
So is your hypothesis that the total market cap of EU tech companies is something like 50%, 60%, 70%, etc. of total US tech market cap? Something significantly different from the ~10% implied by that figure (largest US companies 10x largest EU companies)? And it's just more broadly distributed?
Hard to find data on this but this is showing EU tech market cap at 3.2T. https://www.stateofeuropeantech.com/chapters/outcomes
Whereas this is saying the US "megacaps" ($200B+) are at 21T. https://www.cnbc.com/2025/09/05/tech-megacaps-worth-market-c...
Which puts the entire EU tech market at 15% of the US megacaps. Not even the entire market.
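For anyone who wants to redo the arithmetic, here is a small Python sketch. The two inputs are just the figures quoted in the links above (in trillions of USD); I haven't verified them independently, and the variable names are mine.

```python
# Figures as quoted in the thread (trillions of USD); not independently verified.
eu_tech_market_cap = 3.2
us_megacap_market_cap = 21.0

ratio = eu_tech_market_cap / us_megacap_market_cap
print(f"EU tech / US megacaps = {ratio:.1%}")
```

The two sources may define "tech" differently, so treat the result as an order-of-magnitude comparison rather than a precise share.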
I don't see any sense in which the EU has fewer capabilities. It has, say, a smaller number of businesses with smaller market dominance.
It isn't clear to me what capability the EU would gain by having a monopolist social network, a monopolist search engine, a monopolist advertising trader.
I only use open source LLMs for writing (Qwen 32b from Groq) and open source editor of course, Emacs.
If some people can write better using commercial LLMs (and commercial editors), by all means, but they put themselves at a disadvantage.
Next step for me is to use something open source for translation (I use Claude for the moment) and for programming (I use GPT currently). In less than a year I will find a satisfying solution to both of these problems. I haven't looked deep enough.
llama-3.1-70b-versatile is pretty good at translating though
I predict at some point countries will get CIA'ed when they publish plans to build a large data center.
Similar to the time when they got CIA'ed when announcing plans for new nuclear plants.
This AI law is a clear example of that. Pencil pushers creating more obstacles for the sake of creating more obstacles rather than actually taking a pragmatic approach.
You would never hear it, though, as European IT press only promotes SV startups
https://www.youtube.com/watch?v=u8QFiLhnuFg&feature=youtu.be
Disclaimer: I work at Hopsworks.
Still, two months earlier a 19-European-language model with 30B parameters got almost no mention:
https://huggingface.co/TildeAI/TildeOpen-30b
Mind you, that is another open model that is begging for fine-tuning (it is not very good out of the box).
>You need to agree to share your contact information to access this model
Is this common? I've never seen it on the site before, and it isn't on the smaller model. What are they collecting this information for?
I used the 9B Instruct version, from the small models, it was the one with the best Latvian knowledge out there, bar none. GPT-OSS 20B and Qwen3 30B A3B and similar ones weren't even close.
That said, the model itself was a little bit dumb and not something you'd really use for programming/autocomplete or tool calling or anything like that, which also presented some problems - even for processing text, if you need RAG or tool server calls, you need to use something like Qwen3 for the actual logic and then pass the contents to EuroLLM for translation/formatting with the instructions, at which point your n8n workflow looks a bit messy and also you have to run those two models instead of only one.
Meanwhile, the best cloud model for Latvian that I've found so far was Google Gemini 2.5 Pro, but obviously can't use cloud models in certain on-prem use cases.
I have to specifically ask something like “do you know the Lithuanian language”, and then it starts replying in Lithuanian
While he is great at converting his influencer status to income in his micro-SaaS projects, I don't think running ad-fueled browser games on a state-sponsored supercomputer should really be the aim of these grant programs.
I’ve seen this in practice with innovation subsidies: a small startup has to spend a lot of time going through all the hoops, taking up significant resources, to get modest compensation for actual innovative work. A larger business hires an external agency to do everything for them, makes silly proposals like “convert website from jquery to react” as innovation, and gets thousands upon thousands of FTE-hours in subsidy compensation for it.
The EU is also great at creating a heavy regulatory environment. Which entrenches existing incumbents. So the EU creates barriers that favor big companies, then tries to fix it with grants that... also favor big companies.
And then everyone's surprised that there's no innovation in Europe.
Of all the world's companies worth over $100B, there's only one European company: SAP, founded 50 years ago. [1]
[1] https://www.economist.com/briefing/2021/06/05/once-a-corpora...
Would you rather have an economy of SAPs or an economy of Teslas?
In part probably because it's harder to become a monopoly in the EU.
This x2. A close friend of mine works at a major EU HW tech company and his job is wearing a suit, going to dinner parties and rubbing shoulders with high-level local and national bureaucrats to convince them to fund X, Y, Z projects, none of which result in any major commercial success or ROI for the governments, because the money they get is not enough to make new successful products. But hey, he's loving his job, which gives him amazing job security against the waves of layoffs the company went through due to falling sales, plus the networking he gets out of that is invaluable.
So at least some people are enjoying the gravy train while it lasts. But that's why a lot of EU tech companies immediately go to the US first before opening up to the EU market. US VCs are more generous with their cheque books than EU governments and investors, plus there's the single consumer market of 300 million people speaking English as the common language and all that.
All this while the EU is running out of funds and in a process of de-industrialization. There should be an independent corruption investigation on Brussels.
It's bureaucracy, often bordering on stupidity. You may need advisors to navigate all their forms & processes. But it certainly isn't a "pals-only" type of deal.
On the other hand - is it harder than getting VC funding? For a seasoned founder with a reputation - probably. For a fresh startup - probably not.
The probability of getting a Horizon Europe grant allegedly (not official stats) is about 8.5% according to some friends, which may seem low. You need to write 70 pages following a Word template and the key goal is to cover answers to a large number of questions. Each proposal gets various grades across a range of dimensions, which get added up, and if you obtain at least 13 out of a possible 15 points, you are eligible to get funded, read: "You will get funded if there is enough money." Often, there are several proposals that justly achieve 15/15, and because of that, many proposals that have 14 points and all proposals that have fewer may not get funded, simply because there just is not enough total funding available to fund all the technically eligible proposals. Having judged many proposals in AI / ML / search / "big data" / language technology etc., I recommend optimizing recall, i.e. aiming for completeness.
The application process is not easy, but you can get help: there are support agencies in each member country, free online webinars, hotline help desks, as well as an ecosystem of paid consultants that typically charge about 3k€ to vet a proposal for you if you need that kind of service (I never used it).
The process is neutral and conducted professionally and with external oversight (consultants are hired as "rapporteurs" that report on process/procedural integrity in addition to the actual reviewers). I value the research officers of the EC as people of high competency, integrity and motivation (research money is taxpayers' money so it should be spent carefully).
In comparison, VC (and even more so business angel) funding is achievable with much less formal apparatus; often a short business plan and a convincing slide deck and demo can get people to a partner meeting if the time is right. But the criteria and process are much different, and ideas ready for public research grants are typically too early for VCs (but the EC wants to foster the creation of VC-funded startups resulting from the disseminated research).
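To make the "eligible but unfunded" effect concrete, here is a rough Python sketch of the selection logic as I understand it from the description above. The proposals, costs and budget are made up, and the rule of walking down the ranked list until the money runs out is my assumption; the real tie-breaking and ranking rules aren't covered here.

```python
from dataclasses import dataclass

@dataclass
class Proposal:
    name: str
    score: float  # total score out of 15
    cost: float   # requested funding, arbitrary units

THRESHOLD = 13.0  # eligibility cut-off mentioned in the comment

def select_funded(proposals, budget):
    """Walk eligible proposals in descending score order until money runs out."""
    eligible = [p for p in proposals if p.score >= THRESHOLD]
    funded = []
    for p in sorted(eligible, key=lambda p: p.score, reverse=True):
        if p.cost > budget:
            break  # assumed rule: stop at the first proposal that no longer fits
        funded.append(p.name)
        budget -= p.cost
    return funded

# Hypothetical example data.
proposals = [
    Proposal("A", 15.0, 4.0),
    Proposal("B", 15.0, 4.0),
    Proposal("C", 14.0, 4.0),  # eligible, but the budget is gone by then
    Proposal("D", 12.5, 1.0),  # below the threshold
]
funded = select_funded(proposals, budget=8.0)
print(funded)
```

In this toy run, proposal C clears the threshold comfortably and still gets nothing, which is the situation described for many 14-point proposals.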
> The EIC welcomes applications from innovators in all EU Member States and countries associated to the Horizon Europe programme. It particularly welcomes applications from startups and SMEs with female CEOs.
So that's one that strongly pushes that angle. But the one I saw I clearly remember saying "female co-founder", not CEO.
Highly doubtful; the whole thing about the success of the US west coast is that they are and were willing to fund unproven upstarts.
The point being that, as soon as public dollars are on the table, people expect perfection. Anything less is waste, fraud, and abuse.
There's literally no winning. Want to make sure the money is allocated right? Bureaucracy. Want to not do that? Waste, fraud, and abuse.
Except that Apple, Intel, Tesla, etc have all received US government investment [1]. TSMC is a product of the Taiwanese state! Government investment can be done well, and seeds excellent companies.
[1]: https://www.sba.gov/blog/2024/2024-02/white-house-sba-announ...
The comment you're replying to is tainted by survivorship bias. We see successful companies that got government funding, but not the opposite. Maybe we'd have more innovation and competition without government picking these specific winners.
Ironically, one of the companies you mentioned (Apple) now operates in an environment with very little competition and regularly faces antitrust claims.
Government picking winners may actually reduce competition in the long run. The key difference: when private money picks wrong, it's their loss. When government picks wrong, it's taxpayer money.
https://www.politico.eu/article/ombudsman-slams-commission-f...
Yes, some of the questions are weird, but I'd really rather write a bit confirming that the AI system being developed isn't going to be racist or Skynet than jump through some other hoops that exist (and that absolutely includes VC due diligence). The actual biggest issue with European funds is they get way more competent applications than they can fund anyway.
When you talk to most EU business owners, even in tech, the limiting factor isn't regulations. This being the #1 reason is such a tired trope.
Ironically, China has in some ways a bigger regulatory burden when it comes to software: there, if the government doesn't approve, the business is dead in the water. I doubt that Klarna would've gotten off the ground there, for one; I could see them being shut down much earlier there. In the EU, only now are some governments very slowly starting to talk about some weak measures around their business model. But I've never, not once in my life, heard "Chinese software companies can't get off the ground due to the regulatory burden".
The same people who clamor about the EU regulations are the ones who hate on the EU for their protectionist measures against US tech. Yet another bout of irony here - China's software industry has flourished exactly thanks to 10 times stronger protectionist measures against US tech. So has Korea's, and their protectionism has never even been anywhere on the China level, more in between EU and China. No, if there's anything that would help, it's much more tech protectionism in the EU.
Pieter Levels is at the end of the day an influencer, not a serious founder.
I have a tech startup in Estonia and I agree. To me the biggest limiting factor is lack of funding.
In the US, you can get VC across state borders and VC can invest in any US company regardless of their location. Not in the EU.
Investing in an Estonian startup from, say, the Netherlands is near impossible. And even if managed, now that NL investor has to repeat the entire process for Poland, Spain, Malta and other countries.
EU Made Simple explains this very well https://www.youtube.com/watch?v=1RqYws1JAuI
If anything this would be due to Dutch regulations rather than Estonian ones. So that would be more of a problem for setting up VCs in the EU rather than attracting capital in general, as the same would apply to the rest of the world. Is it easier for a US or Singapore VC to invest in a Japanese startup vs an Estonian startup? As far as I know the answer is a "no", but I'm happy to be proven wrong.
A Texas VC can invest in a startup in Chicago and a startup in Florida can raise money from NY and so on.
> If anything this would be due to Dutch regulations rather than Estonian ones
Yes, because there's no unified capital market. You describe an effect not the cause. With 27 member states, if such regulations are bi-directional, you'd need 351 of such "regulations". If one way, you'd need 702.
In the US, with 50 states plus DC, there aren't a total of 1,275 "regulation contracts" between all states; there's one, and it's federal.
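The counts quoted above are just "n choose 2" for n jurisdictions, doubled if each direction needs its own agreement. A quick Python check (the function name is mine, purely for illustration):

```python
from math import comb

def pairwise_agreements(n: int, bidirectional: bool = True) -> int:
    """Number of agreements needed so every pair of n jurisdictions is covered.

    One agreement per pair if it works both ways, two per pair otherwise.
    """
    pairs = comb(n, 2)  # n * (n - 1) / 2
    return pairs if bidirectional else 2 * pairs

print(pairwise_agreements(27))         # EU member states, bi-directional
print(pairwise_agreements(27, False))  # EU member states, one-way
print(pairwise_agreements(51))         # the 51 US jurisdictions counted above
```

The quadratic growth is the point: a single federal (or EU-level) framework replaces hundreds of pairwise arrangements.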
> Is it easier for a US or Singapore VC to invest in an Japanese startup vs an Estonian startup?
This was my main question, and it looks like the answer is indeed no.
Still, the purpose of the EU is (to bring peace by) lowering the mostly economic barriers that traditionally exist between countries.
And the comparison in this thread was "between EU and US" (in Europe a thing is hard... in Europe you cannot... etc). When people use "The EU" or "Europe" in this sense, they consider it as a "single thing". Europe, or the EU, has a great disadvantage for startups compared to the US. You say that is because Europe is not a "thing" like a country is. I say that is because Europe seen as a "thing" lacks some critical infrastructure that "the thing called US" has, and that even "the thing Brazil" or "China" have internally.
And you register online.
Opening a company in Estonia is very cheap but in Spain the manager/CEO needs to be an "autónomo" (like a self-employed tax status). This costs thousands of Euros per year. Something like 2,400-30,000 Euros per year, every year, forever.
The article is unclear, but is probably referring to making it easier for startups to offer products in other EU countries.
It's in very early stages, so info is very scattered. More info, for example, here: https://www.loyensloeff.com/insights/news--events/news/the-2...
And that's just one of the problems (many of the problems have nothing to do with European bureaucracy)
Secondly, what's easier besides VC funding? If it's VC funding, the disparity there has nothing to do with regulations - guess how much VC funding the non-EU rest of the world gets.
It’s a distant memory to me now, I’m building a company and so much has happened that the details of this decision have faded away. But, between the AI act and GDPR, there’s a set of potential traps laid out for you to step into, along with reams of paperwork. All that requires lawyers and compliance consultants to help you figure it out, and that’s way too much for a fledgling startup.
I think it said it all that the AI regulations were written before there was really anyone to regulate. Why would I want to pour my heart and soul into a system that’s geared to find ways to stop me from building?
Anyway, it’s no longer relevant to me: I’m gone and I don’t have to worry about it anymore.
I don't want to exchange my freedoms for your shareholder value, thank you.
Yet startups here have managed to compete with US bigtech incomparably better than the EU. Shows that tough privacy laws have nothing to do with it.
> All that requires lawyers and compliance consultants to help you figure it out, and that’s way too much for a fledgling startup.
It also really just doesn't unless you're doing really shady stuff, in which case, good. The huge majority of startups don't need a lawyer to deal with GDPR.
Which part is easier? That you have 50 different states with slightly varying laws to consider (e.g. Californian Data protection)? That you have a byzantine system of "benefits" to choose and manage?
And compared to where? Germany or Estonia or Sweden or Spain? The complexities will vary wildly depending on the country (kind of like in the US, where lots of companies pick the state to base themselves in based on the combination of favourable laws and precedents and taxes).
there are certain sentences you can just tell would never be written by an American lol
California Consumer Privacy Act is a thing you need to take into account for Californian customers.
Illinois has a Biometric Privacy Act.
And who knows what Wyoming or South Dakota or Oregon have that you might take into account if your business falls under any of them.
most laws like CCPA also have some threshold where you already need to be pretty successful for it to apply to you.
for some select industries (biometrics & healthcare), yes you have a patchwork of laws.
Because there are plenty of companies just ignoring regulation and simply paying the fines, as the enforcement is often a joke.
TBH, in this respect it's much the same in the EU.
Okay, what is the limiting factor? Because when I talk to EU business owners (admittedly, very few) - they point to lack of big EU capital markets, which is directly downstream of the policy environment. And when I talk to top EU human capital, they all point to the lack of competitive wages. There's a real difficulty in allocating capital to talented humans.
And, at least in Southern Europe, the income tax schedule is so aggressive it's hard to justify continuing working in many of these countries if you are highly talented.
Like, if you can tell me what the induced operator norm from l_2 -> l_2 is - probably you should come to the US and work at a biglab and make bank. What can you do in Portugal, Italy, Spain, etc.??
> Pieter Levels is at the end of the day an influencer, not a serious founder.
Sure, agreed.
I think it is a complete misreading to point to protectionism as the reason for Chinese success, but having a big unified domestic market for consumers along with massive saving rates and capital controls probably does help.
A few.
A big part is that the EU is a collection of countries that (with very few exceptions) have different languages and laws. For a company to serve Spain and France, for instance, it would need to translate everything, hire local lawyers and customer support agents. Considering the much smaller size of the countries (the biggest one is roughly 84 million vs 330 million in the US), the opportunity for "unlimited" growth is limited.
This also rebounds in the fact that when an American company makes it big, they have the resources to flood other EU markets and be cheaper/better than the local competition due to economies of scale and money based on their big successful US market. A French company making it big is still small compared to a US equivalent.
Then, there's the capital markets, no denying that. The money being thrown around the US is like nowhere else on the planet. Some of it definitely a bubble / unrealistic, but that doesn't matter. But in part it's because of the size of the total potential market that this is justified.
Education / national mythology also plays a part, I think (this is pure conjecture now). In the US, the "American Dream", "everyone can make it" etc is heavily ingrained. It propagates through the world with the help of Hollywood and other American cultural exports. In most EU countries, there isn't such a heavy emphasis on independence and "pulling yourself up by your bootstraps". "Hustle culture" isn't a thing. So for most people, it isn't something that comes naturally to them to start a company and work 100 hour weeks to be big and rich and successful and famous.
That's not to say there aren't such people, I went to 42 and have been to Station F and know some people in that universe. A decent proportion of my classmates wanted to make their startup and make it big, and some did end up starting their own companies.
Ding ding ding! When China does it with solar and EVs we call it "dumping". When Uber, OpenAI and Anthropic do it, that term is never ever used. VC-funded US tech dumps harder than any Chinese industry ever has.
If you manage to get 10 million customers, your business is already successful on a gigantic scale, and you should have all the know-how in taking on the world. The success of other people is rarely the reason why you are failing in your own life. Start somewhere, do something.
> The money being thrown around the US is like nowhere else on the planet.
That's true and it's awesome. In Europe money is only thrown to real estate owners and any enterprising people with a dream are cordially invited to fucking forget about it, shut up, and fall back in line. Even if they already have a proven track record. They take their idea to the United States and are treated incredibly well in comparison. Even if their business will only be a niche business with limited reach, like 99% of businesses.
That's simply not true. Europe is a continent that includes startup powerhouse Sweden, UK that has a ton of them too, and France that is making massive strides (just check out Station F). It might be true in Slovakia or whatever, but you simply cannot say that broadly for the whole continent.
> That's true and it's awesome
Is it? It is to an extent, but it also encourages nonsense (Juicero, the AI spice mixer), pump and dump tech-isation of existing stuff with no business model (Wework) and outright scams (Theranos, Nikola).
If the barrier to entry to get money thrown at you is so low, a lot of it gets wasted. So the rest really really has to make it big, which gives some perverse incentives (like Uber dumping to drive old-school taxis out of business to then jack up prices, because if they're not a monopoly/biggest fish in town, they wouldn't have been worth it)
Go search the web for startup financing in Sweden. One of the big banks has a page dedicated to this - but they're not offering their own money, instead they're pointing to government grants and government lending programs.
> If the barrier to entry to get money thrown at you is so low, a lot of it gets wasted. So the rest really really has to make it big
You can't do your accounting like that. One entity does not make up for the losses of another entity. Money is cheap in the US and thrown around to bad startups and good startups. In Europe money is expensive and mostly available to governments or for real estate feudalism.
Why work in the "europoor" countries when you can go to america and earn megabucks.
All of these purported EU-specific reasons completely ignore that things are the same elsewhere. It's the US that is the outlier.
That's not the reason the EU has no unified capital market.
The reason is that member states are reluctant to hand over control and power to the EU. Just look at e.g. Eurobonds and the shitshow around that. The rise of populist anti-EU parties all over is especially causing this.
Ironically, the parties that "freedom lovers" like Musk are funding and pushing, are the ones causing the EU to not function as a single capital market and "not having a business friendly environment".
Capital controls are protectionist measures, but anyway, no.
> Okay, what is the limiting factor?
Let's look at which countries have a significant local software industry compared to population size.
- China
- US
- Korea
- You can argue for Japan and India but that's already starting to stretch.
- Yup, effectively nowhere else. Even in an "out of the way" place like Myanmar everyone uses Meta, with a nice little genocide to show for it. Sure, in Vietnam they use Zalo, and other places have a few other local players. But most of the famous US tech apps are dominant.
Is the EU the outlier here? No. Everywhere else US tech dominates. Meta, Netflix, Apple, Google, Uber, Spotify, Microsoft, Match Group, Paypal, Amazon, and on and on. They don't just dominate the EU, they dominate the world.
Except for the countries I named above, where at least some of the markets that US big tech competes in, instead have bigger local players. And even there, guess what?
Their market share is almost 1:1 linearly correlated to the degree of protectionism in those countries, all the way from China, then Korea, then India/Japan, and then everywhere else! Who woulda thought!
Why does Korea have much less US tech dominance than, say, Germany? Despite German companies theoretically having a big advantage: the German public is 100x more privacy conscious than the Korean one, and much less trusting of US companies.
I can tell you that it's not fewer regulations; Korea's GDPR is much more onerous than the EU's, and so are its investment regulations. On every single regulatory aspect, German software startups have it easier. But they were never protected. US tech was allowed to waltz in and dump their products. That's what they did; it's hilarious how China "dumping" EVs and solar is suddenly an issue now, when it's exactly the strategy that US tech continues to this day; the AI companies are doing it right now! And the Korean companies were protected, both by the regulatory burden, which local companies had to deal with too, and by intentional protectionism.
When it comes to solar and EVs, we all understand that a foreign country dumping their goods kills local industry. It's the exact same with software.
But then half of HN has millions on the bank exactly thanks to the above - this is where all those fat SV salaries have come from - so I do get the lack of desire to understand it.
Fundamentally, BYD cars are cheaper because China has localized the complete supply chain or has very good raw materials import and local refinement capability. BYD invested themselves and spent 20-25 years vertically integrating while European companies paid out huge dividends and missed major technology trends.
The EV market in China was hyper-competitive, with a huge number of competitors, and the subsidies were strategically phased out to make the industry competitive rather than relying on protection.
The idea that China has enough money to 'dump' products in literally every sector that people accuse it of 'dumping' in AND at the same time has enough money for massive infrastructure programs at home seems like coping to me.
> Uber is irrelevant in many places in the EU due to protectionism.
Thanks for another great example that proves the point :) Though I think there are markets where Uber Eats dominates even when Uber Taxi doesn't. Could be wrong.
Seems like you actually believe this. I think our starting points on reality are different enough that we are not going to have a productive conversation. I wish you and other Europeans the best of luck in your protectionism-led growth strategy. Make sure to not discuss it with any pesky macroeconomists who might lead you astray. Take care.
What it's terribly good at is adding burdens that the US giants don't face early on, slowing down the early growth between 28 fragmented markets. I don't know specifically about how China works, but the question is proving product-market fit, and for that, you need a lot of users fast.
In the EU, it's a different battle country to country as the media environment, the markets, the regulation etc. are all fractured.
Bingo, my friend. And good old American exceptionalism, taught since the first days of school.
In the US, some ex-Googler might found a startup. Europe doesn't have the equivalent of FAANG. (Europe-wide companies are not quite as easy as US-wide)
Even if the super computer itself "fails", is the goal actually the secondary impacts to the economy?
(And in the US, we do our own fair share of picking winners / losers, especially in the current regime)
Cluster: for public benefit, cutting edge research in biotech, medical, robotics.
Levels: I want to create AI photos of people for my AI Slop startup
That's not what the quoted paragraph says and you can read the whole release if you want: https://ec.europa.eu/commission/presscorner/detail/en/ip_25_...
--- start quote ---
Apply AI Strategy
The Apply AI Strategy aims to harness AI's transformative potential by driving adoption of AI across strategic and public sectors including healthcare, pharmaceuticals, energy, mobility, manufacturing, construction, agri-food, defence, communications and culture. It will also support small and medium-sized enterprises (SMEs) with their specific needs and help Industries integrate AI into their operations.
--- end quote ---
I also quoted a paragraph from a document I will find when I'm not on mobile.
Levels literally wants to train AI Slop: https://x.com/levelsio/status/1981499900266193028
--- start quote ---
Train a foundational model for AI photos of people
--- end quote ---
My quote: Cluster: for public benefit, cutting edge research in biotech, medical, robotics.
Literal quote from your link: The Apply AI Strategy aims to harness AI's transformative potential by driving adoption of AI across strategic and public sectors including healthcare, pharmaceuticals, energy, mobility, manufacturing, construction, agri-food, defence, communications and culture.
You: your quote was misleading.
I'm sorry, I don't have the time or the patience with willfully ignorant and blind people getting their interpretations from AI slop engagement farmers.
Adieu
> I'm sorry, I don't have the time or the patience with willfully ignorant and blind people getting their interpretations from AI slop engagement farmers.
Riiight.
If this is public money, you want to reduce the blind grasping.
> Slop has stumbled into accidental success on a number of occasions.
So, show me these occasions where AI slop led to "transformative potential by driving adoption of AI across strategic and public sectors including healthcare, pharmaceuticals, energy, mobility, manufacturing, construction, agri-food, defence, communications and culture."
Being honest with taxpayers about what research is is probably possible unless the population is low IQ.
And frankly, the dream scenario that Pieter describes where he somehow would qualify for these resources also wouldn't help kickstart the tech industry, and it's also not how it works in the states.
What does help, and what European governments (at least the one in The Netherlands that Pieter is from) actually do, is more funding for startups. If you're a startup founder in NL almost every angel you talk to has a matched funding deal with the government. That's such a smart way of keeping up with the US. Do you think US startups get free compute from the government? They don't even get subsidies most of the time. What they get is better funding because there's more capital available, and helping investors with that is exactly how you solve that.
Does government offering matched funding to investors actually help startups who are struggling to find (any) funding? If a startup can't find (any) funding, matching is irrelevant.
> Do you think US startups get free compute from the government? They don't even get subsidies most of the time. What they get is better funding because there's more capital available, and helping investors with that is exactly how you solve that.
Umm. I'm not really convinced that the political elites in Europe understand how to do any of this stuff well.
See also: https://www.eib.org/en/publications/online/all/the-scale-up-...
What’s worse, the parliament cannot originate law. Only the unelected Commission can do so. And they can do it behind closed doors. This is a setup that’s ripe for corruption.
However, they're appointed by the EU Council (the heads of state, most of them elected, some appointed by a national parliament), and approved by the (elected) European Parliament.
At the cost of some transparency, this does make it possible to select a bit more for management skills instead of just campaigning skills.
US presidential candidates are appointed, but they must then be elected by the public.
UK PMs must be elected MPs, or at least must face an immediate by-election (by constitutional convention) if not already an MP
If back-hand deals, internal political favouritism, nepotism or opportunism lead to the appointment of an EU Commissioner there’s nothing the public can do about it.
Is this really true? Isn't the outcome of the primary elections much, much more important in determining who will be the Republican and Democratic presidential nominees than elected officials and party officials are?
Specifically, an unreliable source that gives fast answers to questions says that a candidate who wins enough primaries to get over half the delegates is almost certain to be the presidential nominee for his or her party.
* Citizens vote in a nationwide election every four years.
* Those votes determine electors in each state — this is part of the Electoral College system.
* The Electoral College then formally elects the president based on those state results.
* The candidate who receives at least 270 of the 538 electoral votes becomes president.
So while there’s a layer of formality through the Electoral College, the president is ultimately chosen by voters, not appointed by any government body.
The same way that voters elect their national governments, which then appoint the Commission.
* It doesn’t have ongoing powers, authority, or political discretion like a parliament or the European Council.
* It’s a one-time, ceremonial body that meets once every four years to cast votes reflecting (in nearly all cases) the popular vote results in each state.
2. Electors don’t deliberate or choose freely.
* In almost every state, electors are legally bound (or at least politically bound) to vote according to their state’s popular vote.
* So the outcome is functionally determined by voters, not by an independent decision of the Electoral College.
3. The EU analogy misrepresents the role of the Electoral College.
* In the EU, citizens elect national governments, which then appoint members of the European Council, who negotiate and nominate the EU Commission — a genuine appointment process.
* In the U.S., the electors are not government officials making appointments; they are agents executing the people’s will as expressed in the election.
---
In the US, the people elect the President, because electors vote as their state’s voters directed.
The EU Commission analogy doesn’t fit, because that’s an appointment process made by governments, not a formalization of a direct vote.
Fact of the matter is, neither the US President nor the EU Commission is directly elected. Both are appointed with one layer of indirection between them and a direct vote.
Either we call both elected or we call both unelected. To do so for one, but not the other is anti-European propaganda.
It’s also enshrined in our respective governments and the foundational philosophies that underpin them. The US Declaration of Independence sets out to describe that the natural rights of the men who created the USA are preeminent, and the Constitution lists how those rights may not be infringed upon, i.e., it creates laws that bind and limit the actions of government, something that has never been emulated since. Whereas across Europe, you simply do not have anything even remotely similar, and the law inversely describes what you are permitted by government to do instead.
It is effectively descriptive vs prescriptive law and underlying philosophies. It is something I have on occasion had the hardest time getting my European friends to really internalize, seemingly because it’s so contrary to what they’ve been conditioned to all their lives, i.e., the government is essentially the matter that grants you what it grants you, not that you have rights that the government may not infringe upon.
But to be fair, this possibly European tendency to dominate and control what you may and may not do, and when and how, is and long has been creeping into the USA too, arguably since even the 12th amendment to the Constitution and getting worse with every amendment since: layers upon layers of contradicting and conflicting flaws and bugs that will be rearing their ugly head here in about two years, when Trump may run for President again. And if you don’t think he can, you simply don’t understand what a spaghetti code the Constitution is after the 11th amendment, hack after hack building up mountains of debt that is going to come due in our lives.
Ah the good old "Europe can't do Silicon Valley" trope.
The EU is such a bizarre place because they treat capital and entrepreneurs with such massive distrust, but never really bothered getting rid of the quasi-static entrenched hierarchies from feudalism? Like I'll go to the UK or France and there will just be massive swathes of land owned by the nobility or 'former' nobility? Maybe start there but let your high-value human capital earn a good wage?
Those 1960s regulations have a 1000x larger effect than any land owners owning a bunch of unproductive farmland or highlands.
Not to say there are no issues with large land owners, but it's nowhere near being the issue behind housing prices and living costs.
Yeah, no, this isn't even remotely true.
The EU Parliament has its issues, and it's atrocious how corrupt politicians like e.g. von der Leyen fail upwards into leadership roles there, but they're both massively exaggerating the issues to a degree that it's hard to call the stated opinions anything but unhinged.
It's kinda like calling a Chihuahua an "apex predator": completely out of touch with reality.
2. A credible scale effort for EU own silicon for AI Compute, wouldn't hurt either.
3. And this can only be achieved by vertical integration to combat fragmentation.
EuroAI: Europe’s Moonshot to AI Sovereignty
https://open.substack.com/pub/ifiwaspolitical/p/euroai-europ...
Would you prefer European AI sovereignty with 15% overhead costs from geographic distribution, or 100% dependence on Nvidia/OpenAI with zero European industrial base?
Yep, the US-government sponsored, open-weight LLM is miles ahead of EuroLLM
Just the base model and a template like "English: {text}\n{language}:" can also work with a bit of filter and retry logic
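To make the "template plus filter and retry" idea concrete, here is a rough sketch rather than anything EuroLLM ships. The `generate` callable is a hypothetical stand-in for whatever completion call you use against the base model, and the filter rules (`is_acceptable`) are my own assumptions about what "a bit of filter" could mean; only the template string comes from the comment above.

```python
from typing import Callable, Optional

# Template quoted in the comment above.
TEMPLATE = "English: {text}\n{language}:"


def build_prompt(text: str, language: str) -> str:
    """Fill the few-shot-style template for a base (non-instruct) model."""
    return TEMPLATE.format(text=text, language=language)


def clean_completion(completion: str) -> str:
    """Keep the first non-empty line; base models tend to keep generating."""
    for line in completion.splitlines():
        if line.strip():
            return line.strip()
    return ""


def is_acceptable(source: str, candidate: str) -> bool:
    """Assumed filter: reject empty output, echoes of the source, template leakage."""
    if not candidate:
        return False
    if candidate.casefold() == source.casefold():
        return False
    if candidate.startswith("English:"):
        return False
    return True


def translate(
    generate: Callable[[str], str],  # hypothetical: prompt in, raw completion out
    text: str,
    language: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Retry until the filter passes; return None if every attempt is rejected."""
    prompt = build_prompt(text, language)
    for _ in range(max_attempts):
        candidate = clean_completion(generate(prompt))
        if is_acceptable(text, candidate):
            return candidate
    return None
```

Retrying only helps if `generate` samples with some randomness (temperature above zero); with greedy decoding you would get the same rejected output every time.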
My experience with government funding is that they apply something and won't even try to sell it because selling is hard: you don't want to know that the thing you built is lacking, nor that the competition is better. Especially the academic types don't. Yet I'm paying for these guys. Also, by funding the academics, they won't even need to go to the job market. But as I paid for their education, I thought I was buying people who create value.
Perhaps the above is rather harsh and it's "not that bad", my subjective experience nevertheless.
Vaswani is an Indian-born computer scientist, Shazeer is from the US, Parmar was born in India, Uszkoreit was born in Germany, Jones was born in the UK, Gomez is British-Canadian, Kaiser is a Polish computer scientist, and Polosukhin is Ukrainian.
Almost all of these people have PhDs and master's degrees. The ROI on academia is vast for society, including European universities. The thing the US does well is capitalize on that education, and sadly also try to steal credit for it as "American exceptionalism." If Europe and other countries learn how to keep their academics and get them working in local industries, America's edge will evaporate overnight.
The wider availability of capital is a bigger deal though. "Attention is all you need" is available for people on other continents to read, but a computer scientist in Europe who understood exactly how big transformers were going to be, and why, had less chance of funding than a webdev in California with a pitch deck full of cliches and a me-too GPT wrapper for an industry they've barely touched has today.
To be clear, I don't oppose publicly funded education (nor immigrant academics, though I don't see how this relates?). What I do oppose is the EU trying to compete with tech giants as if it could: the incentives are not set up right, it won't succeed, and the funds will be wasted.
There are a few variables here but at this point in time, private-funded innovation isn't different by much and all things considered, the difference isn't in its favor.
This model was released in 2024, and I couldn't find any links to the training data - is it just an open weights model?
It seems like, in most ways, it would be bad to train on 24 separate languages. That's just 24 partitions of the data. It seems really inefficient, and better to simply train in the biggest one (English) and translate.
I do think this will introduce some biases that correlate with the English language. It would be interesting to see more specifically what this means. But regardless, I don't think you can produce a competitive model with such a large subdivision of training data.
I suppose that's a typo and I found a technical report here: https://arxiv.org/abs/2506.04079
What else, hmmm... Ohhh yes digitization, we "Germans" still can't let our FAXing machines go.
What we are good at: outsourcing and riding into deeper dependency on other continents.
As the aim of EuroLLM is to provide EU citizens with powerful and useful AI tools, it is critical that the model can also translate and answer questions in other European and non-European languages. With this in mind, we added support for 11 additional languages (Arabic, Catalan, Chinese, Galician, Hindi, Japanese, Korean, Norwegian, Russian, Turkish, and Ukrainian).
The US and China are running rings around Europe.
Mistral is an exception, as it was funded by US VCs, and it is a great example showing that, without VC funding, Mistral would have been begging the EU for a microscopic grant to train an LLM worse than Llama.
We support your work and offer backup and distribution. Here is a copy just in case: https://hugston.com/uploads/llm_models/EuroLLM-22B-Instruct-...
which sank to the bottom thanks to HN's invisible hand
Oh wait, one's not supposed to notice
They almost exclusively compare their model to prior models from 2024 or older and brag about "results comparable to Gemma-2-9B". I'm not sure what I expected. The eurollm.io homepage states "EuroLLM outperforms similar-sized models", which just seems like a lie for all practical purposes
An overly charitable interpretation is that EuroLLM isn't a reasoning model and has minimal post-training, so they sought out comparisons to such models (they're still ignoring reasoning models that have non-reasoning modes)
As another comment here noted, the title is missing (2024) - this model was released almost a year ago, last December, so it's not surprising that that's the models they compare to.
It seems the new version is called Horizon Europe
Edit: Thanks, @Bengalilol.
The 1.7B one looks meh.
But really solid numbers on the 9B! Props to the team!
"all official 24 EU languages” to "all 24 official EU languages"
This kind of comment is just trolling, no added value.