Un Ministral, Des Ministraux (mistral.ai)
211 points | 1 day ago | 16 comments
ed
1 day ago
[-]
3b is API-only, so you won’t be able to run it on-device, which is the killer app for these smaller edge models.

I’m not opposed to licensing but “email us for a license” is a bad sign for indie developers, in my experience.

8b weights are here https://huggingface.co/mistralai/Ministral-8B-Instruct-2410

Commercial entities aren’t permitted to use or distribute the 8b weights - from the agreement (which permits research purposes only):

"Research Purposes": means any use of a Mistral Model, Derivative, or Output that is solely for (a) personal, scientific or academic research, and (b) for non-profit and non-commercial purposes, and not directly or indirectly connected to any commercial activities or business operations. For illustration purposes, Research Purposes does not include (1) any usage of the Mistral Model, Derivative or Output by individuals or contractors employed in or engaged by companies in the context of (a) their daily tasks, or (b) any activity (including but not limited to any testing or proof-of-concept) that is intended to generate revenue, nor (2) any Distribution by a commercial entity of the Mistral Model, Derivative or Output whether in return for payment or free of charge, in any medium or form, including but not limited to through a hosted or managed service (e.g. SaaS, cloud instances, etc.), or behind a software layer.

reply
diggan
1 day ago
[-]
> I’m not opposed to licensing but “email us for a license” is a bad sign for indie developers, in my experience.

At least they're not claiming it's Open Source / Open Weights, kind of happy about that, as other companies didn't get the memo that lying/misleading about stuff like that is bad.

reply
talldayo
1 day ago
[-]
Yeah, a real silver lining on the API-only access for a model that is intentionally designed for edge devices. As a user I honestly only care about the weights being open - I'm not going to reimplement their training code, and I don't need or want redistributed training data, both of which already exist elsewhere. There is no benefit, for my uses, to having an "open source" model when I could have weights and finetunes instead.

There's nothing to be happy about when businesses try to wall off a feature to make you salivate over it more. You're within your rights to nitpick licensing differences, but unless everyone gets government-subsidized H100s in their garage, I don't think the code will be of use to anyone except moneyed competitors that want to undermine foundational work.

reply
tarruda
1 day ago
[-]
Isn't 3b the kind of size you'd expect to be able to run on the edge? What is the point of using 3b via API when you can use larger and more capable models?
reply
littlestymaar
1 day ago
[-]
GP misunderstood: 3b will be available for running on edge devices, but you must sign a deal with Mistral to get access to the weights.

I don't think that can work without a significant lobbying push towards models running on the edge but who knows (especially since they have a former French Minister in the founding team).

reply
ed
1 day ago
[-]
> GP misunderstood

I don’t think it’s fair to claim the weights are available if you need to hammer out a custom agreement with mistral’s sales team first.

If they had a self-serve process, or some sort of shrink-wrapped deal up to say 500k users, that would be great. But bespoke contracts are rarely cheap or easy to get. This comes from my experience building a bunch of custom infra for Flux1-dev, only to find I wasn’t big enough for a custom agreement, because, duh, the service didn’t exist yet. Mistral is not BFL, but sales teams don’t like speculating on usage numbers for a product that hasn’t been released yet. Which is a bummer, considering most innovation happens at a small scale initially.

reply
littlestymaar
1 day ago
[-]
I'm not defending Mistral here; I don't think it's a good idea. I just wanted to point out that there is no paradox, as there would be if the 3b model were API-only.
reply
mark_l_watson
1 day ago
[-]
You are correct, convenience for trying many new models is important. For me, this means being able to run with Ollama.
reply
DreamGen
1 day ago
[-]
From what I have heard, getting a license from them is also far from guaranteed. They are selective about who they want to do business with -- understandable, but something to keep in mind.
reply
wg0
1 day ago
[-]
Genuine question - if I release a model as weights only, with restrictions on commercial usage, and then someone deploys that model and operates it commercially - what are the ways to identify that it is my model doing the online per-token slavery over an HTTP endpoint?
reply
dest
1 day ago
[-]
There are ways to watermark the output, by slightly altering the choice of tokens in a recognizable pattern.
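A minimal sketch of what such a token-choice watermark can look like, in the spirit of published "green list" schemes: the key string, vocabulary size, and green fraction below are all made-up toy values, not anything Mistral actually does.

```python
import hashlib
import random

VOCAB_SIZE = 1000      # toy vocabulary; real tokenizers use ~32k-131k
GREEN_FRACTION = 0.5   # share of the vocabulary marked "green" at each step

def green_list(prev_token: int) -> set:
    # Seed a PRNG from the previous token plus a secret key, so the
    # green/red partition is reproducible only by the key holder.
    seed = int.from_bytes(
        hashlib.sha256(f"secret-key:{prev_token}".encode()).digest()[:8], "big"
    )
    rng = random.Random(seed)
    ids = list(range(VOCAB_SIZE))
    rng.shuffle(ids)
    return set(ids[: int(VOCAB_SIZE * GREEN_FRACTION)])

def watermarked_choice(prev_token: int, candidates: list) -> int:
    # Nudged sampling: pick a green candidate whenever one exists.
    greens = green_list(prev_token)
    preferred = [t for t in candidates if t in greens]
    return (preferred or candidates)[0]

def green_rate(tokens: list) -> float:
    # Detection: what fraction of transitions landed on a green token?
    # Honest text hovers near GREEN_FRACTION; watermarked text sits well above.
    hits = sum(cur in green_list(prev) for prev, cur in zip(tokens, tokens[1:]))
    return hits / (len(tokens) - 1)
```

Over a few hundred tokens, watermarked output pushes the green rate close to 1 while ordinary text stays near 0.5, which is the recognizable pattern a rights holder could test for.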
reply
wg0
1 day ago
[-]
Within the model? As in, some fine-tuning applied after training?
reply
csomar
1 day ago
[-]
Thanks, I was confused for a bit. The 3b comparison with Llama 3.2 is useless. If I can't run it on my laptop, it's no longer comparable to open models.
reply
moralestapia
1 day ago
[-]
Lol, the whole point of Edge models is to be able to run them locally.
reply
cjtrowbridge
1 day ago
[-]
They released it on huggingface.
reply
aabhay
1 day ago
[-]
This press release is a big change in branding and ethos for Mistral. What was originally a vibey, insurgent contender that put out magnet links is now a PR-crafting team that has to fight to pitch their utility to the public.
reply
csomar
1 day ago
[-]
This might suggest that they are plateauing. If you think your next model won't improve a lot, then you'll try to start earning from the current one. Luckily, we still have Meta. Llama 3.2 is really good, and it runs on my laptop with a regular Intel CPU.
reply
littlestymaar
1 day ago
[-]
I was going to say the same. Incredible to see how quickly Mistral went from “magnet links casually dropped on twitter by their CTO” to “PR blog post without the model weights” in just a year.

Not a good sign at all as it means their investors are already getting nervous.

reply
swyx
1 day ago
[-]
just want to point out that this isnt entirely true. pixtral was magnet link dropped recently. mistral simply has two model rollout channels depending on the level of openness they choose. dont extrapolate too much due to vc hate.
reply
Whiteshadow12
1 day ago
[-]
Nice voice of reason, swyx. People who are not hooked on X will have selective memory: "Mistral has changed, I miss the old Mistral".

Last year Mistral watched as every provider hosted their models with little to no value capture.

Nemo is Apache 2.0 licensed; they could have easily made that a Mistral Research License model.

It's hard to pitch VCs for more money to build more models when you don't capture anything by making it Apache 2.0.

Not everyone can be Meta.

Magnet links are cute, but honestly, most people would rather use HF to get their models.

reply
wg0
1 day ago
[-]
That's usually the evidence of VCs getting involved: the somber corporate tone, "proud of accomplishments users will find useful", "we continue to improve", "looking to the future", and such.
reply
xnx
1 day ago
[-]
Has anyone put together a good and regularly updated decision tree for what model to use in different circumstances (VRAM limitations, relative strengths, licensing, etc.)? Given the enormous zoo of models in circulation, there must be certain models that are totally obsolete.
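A toy sketch of what such a decision tree might look like as code; the model names, VRAM numbers, and license tags below are hypothetical placeholders for illustration, not real measurements (actual VRAM needs depend on quantization, context length, etc.).

```python
# Illustrative only: entries are made up, not a real model catalog.
MODELS = [
    # (name, min_vram_gb, license)
    ("llama-3.2-3b", 4, "llama-community"),
    ("ministral-8b", 8, "research-only"),
    ("qwen-2.5-14b", 12, "apache-2.0"),
]

COMMERCIAL_OK = {"apache-2.0", "llama-community"}

def pick(vram_gb: float, commercial: bool) -> str:
    fits = [
        (need, name)
        for name, need, lic in MODELS
        if need <= vram_gb and (not commercial or lic in COMMERCIAL_OK)
    ]
    # Prefer the largest model that fits the constraints.
    return max(fits)[1] if fits else "hosted API model"
```

The hard part isn't the lookup logic, it's keeping the table current, which is exactly the problem the replies below this comment describe.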
reply
leetharris
1 day ago
[-]
People keep making these, but they become outdated so fast and nobody keeps up with it. If your definition of "great" changes in 6 months because a new model shatters your perception of "great," it's hard to rescore legacy models.

I'd say keeping up with the reddit LocalLLama community is the "easiest" way and it's by no means easy.

reply
kergonath
1 day ago
[-]
> I'd say keeping up with the reddit LocalLLama community is the "easiest" way and it's by no means easy.

The subreddit is… not great. It’s a decent way of keeping up, but don’t read the posts too much (and even then, there is a heavy social aspect, and the models that are discussed there are a very specific subset of what’s available). There is a lot of groupthink, the discussions are never rigorous. Most of the posts are along the lines of “I tested a benchmark and it is 0.5 points ahead of Llama-whatever on that one benchmark I made up, therefore it’s the dog’s and everything else is shite”. The Zuckerberg worshiping is also disconcerting. Returns diminish quickly as you spend more time on that subreddit.

reply
potatoman22
1 day ago
[-]
Someone should use an LLM to continuously maintain this decision tree. The tree itself will decide which LLM is used for maintenance.
reply
mark_l_watson
1 day ago
[-]
I tend to choose a recent model available for Ollama, and usually stick with a general purpose local model for a month or so, then re-evaluate. Exceptions to sticking to one local model at a time might be needing a larger context size.
reply
iamjackg
1 day ago
[-]
This is definitely a problem. I mostly look at the various leaderboards, but there is a proliferation of fine-tuned models that makes exploring the model space incredibly daunting. Add to that the fact that they're often not immediately available on turn-key tools like Ollama, and the friction increases even more. All this without even considering things like licenses, what kind of data was used for fine-tuning, quantization, merges, and multimodal capabilities.

I would love a curated list.

reply
cmehdy
1 day ago
[-]
For anybody wondering about the title, that's a sort-of pun in French about how words get pluralized following French rules.

The quintessential example is "cheval" (horse) which becomes "chevaux" (horses), which is the rule they're following (or being cute about). Un mistral, des mistraux. Un ministral, des ministraux.

(Ironically, the plural of the Mistral wind in the Larousse dictionary would technically be "Mistrals"[1][2], however weird that sounds to my French ears, and perhaps to the people who wrote that article!)

[1] https://www.larousse.fr/dictionnaires/francais/mistral_mistr... [2] https://fr.wiktionary.org/wiki/mistral

reply
BafS
1 day ago
[-]
It's complex, because French is full of exceptions.

The classical way to pluralize "-al" words:

  un animal → des animaux [en: animal(s)]
  un journal → des journaux [en: journal(s)]
with some exceptions:

  un carnaval → des carnavals [en: carnival(s)]
  un festival → des festivals [en: festival(s)]
  un idéal → des idéals (OR des idéaux) [en: ideal(s)]
  un val → des vals (OR des vaux) [en: valley(s)]
There is no logic to it (as with many things in French); it's up to Mistral to choose what the plural should be.

EDIT: Format + better examples
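Since there's no underlying rule, any code handling these plurals ends up being the regular "-al" to "-aux" rewrite plus a lookup table of exceptions; a toy sketch (the exception table is illustrative, not exhaustive):

```python
# Toy "-al" pluralizer; the exception table is illustrative, not exhaustive.
EXCEPTIONS = {
    "carnaval": "carnavals",
    "festival": "festivals",
    "idéal": "idéals",   # "idéaux" is also accepted
    "val": "vals",       # "vaux" is also accepted
}

def pluralize_al(noun: str) -> str:
    if noun in EXCEPTIONS:
        return EXCEPTIONS[noun]
    if noun.endswith("al"):
        return noun[:-2] + "aux"   # the regular rule: -al -> -aux
    return noun + "s"

assert pluralize_al("animal") == "animaux"
assert pluralize_al("festival") == "festivals"
assert pluralize_al("ministral") == "ministraux"   # the choice Mistral made
```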

reply
maw
1 day ago
[-]
But are these truly exceptions? Or are they the result of subtler rules French learners are rarely taught explicitly?

I don't know what the precise rules or patterns actually might be. But one fact that jumped out at me is that -mal and -nal start with nasal consonants and three of the "exceptions" end in -val.

reply
cwizou
1 day ago
[-]
No, like the parent says, as with many things in French, grammar and what we call "orthographe" (spelling) is based on usage. And what's accepted tends to change over time. What's taught in school varies over the years too, with a strong tendency toward simplification. A good example is the French word for "key", which used to be written "clef" but over time moved to "clé" (closer to how it sounds phonetically). Every 20-30 years we get some "réformes" on the topic, which are more or less followed; there's some good information here (the 1990 one is interesting on its own): https://en.wikipedia.org/wiki/Reforms_of_French_orthography

Back to this precise one: there's no precise rule or pattern underneath, no rhyme or reason; it's just exceptions based on usage, and even those can have their own exceptions. Like "idéals/idéaux": I (French) personally never even heard that "idéals" was a thing. Yet it is, somehow: https://www.larousse.fr/dictionnaires/francais/idéal/41391

reply
speed_spread
1 day ago
[-]
Errr. French is _not_ based on usage but has an official rulebook that is maintained by the Académie française. Obviously nobody is coming after you if you don't respect the rules, but there absolutely is a defined standard. This makes French useful for international standards and treaties, because the wording can be very precise and leaves much less to interpretation.

To my knowledge there aren't that many languages that are managed as officially as French is.

reply
cwizou
23 hours ago
[-]
Sure, sorry I left that part out.

The Académie tried to codify what was used at the time (which varied a lot) to try to create a standard, but that's why there are so many exceptions to the rules everywhere: they went with "tradition" when creating the system instead of logical rules or a purer phonetic approach (which some proposed).

There's a bunch of info on the wikipedia link about it, and how each wave or "réforme" tries to make it simpler (while still keeping the old version around as correct).

Each one is always hotly debated/rejected by parents too when they see their kids learning the newly simplified rules.

Recently, the spelling of onion in French went from "oignon" (old spelling with a silent I) to "ognon" (simplifying it out), and even that one gave me a "hmm" moment ;)

reply
epolanski
1 day ago
[-]
If it is like Italian, my native language, it's just exceptions you learn by usage.
reply
makapuf
1 day ago
[-]
I've never heard of such a rule (I'm a native speaker), and your reasoning is fine, but there are many common counterexamples: cheval (horse), rival, estival (adjective, "of the summer"), travail (work; the same rule applies to -ail words)...
reply
Muromec
1 day ago
[-]
Declension patterns are kinda random in general.
reply
realo
1 day ago
[-]
Indeed not always rational...

cuissots de veau, cuisseaux de chevreuil

reply
kergonath
19 hours ago
[-]
I think you got it backwards ;)

In any case, this is (officially) obsolete now.

https://fr.m.wikipedia.org/wiki/Cuisseau

reply
clauderoux
1 day ago
[-]
The exceptions are usually due to words that were borrowed from other languages and hence do not follow French rules. Many of the words that were mentioned here are borrowed from the Occitan language.
reply
rich_sasha
1 day ago
[-]
That's news to me that the French for "valley" is the masculine "val" - isn't it the feminine "vallée"? Like, say, "Vallée Blanche" near Chamonix? And I suppose the English ripoff, "valley", sounds more like "vallée" than "val" (backwards argument, I know).
reply
idoubtit
1 day ago
[-]
The "Vallée blanche" you mentioned is not very far from "Val d'Arly" or "Val Thorens" in the Alps. Both words "val" and "vallée", and also "vallon", come from the Latin "vallis". See the Littré dictionary https://www.littre.org/definition/val for examples over the last millennium.

By the way "Le dormeur du val" (The sleeper of the small valley) is one of Rimbaud's most famous poems, often learned at school.

reply
bambax
1 day ago
[-]
Un val is a small vallée. Une vallée is typically several kilometers wide; un val is a couple of hundred meters wide, tops.

The "Trésor de la langue française informatisé" (which hasn't been updated since 1994) says val is deprecated, but it's common in classic literary novels, together with un vallon, a near synonym.

reply
dustypotato
22 hours ago
[-]
The term "vallée", used as a toponym, must be distinguished from the term "val", which is often used to designate and name a limited region in various European countries and in their languages.

-- https://fr.wikipedia.org/wiki/Vall%C3%A9e

I agree, it's weird. I'm sure there are other similar examples.

reply
mytailorisrich
1 day ago
[-]
Yes, la vallée (feminine) and le val (masculine). Valley is usually la vallée. Val is mostly only used in the names of places.

Apparently val gave vale in English.

reply
makapuf
1 day ago
[-]
Gender in French words is a fine example of a cryptography-grade random generator.
reply
GuB-42
1 day ago
[-]
It can be funny sometimes. A breast (un sein) and a vagina (un vagin) are both masculine, while a beard (une barbe) is feminine. Among the slang terms, a ball (une couille) and a dick (une bite) are also feminine.

Of course, it is not always the opposite, otherwise it wouldn't be random. A penis (un pénis) is masculine, for instance.

reply
Muromec
1 day ago
[-]
It's a keyed generator, they just lost that small bag that seeded it
reply
kergonath
19 hours ago
[-]
> Ironically the plural of the Mistral wind in the Larousse dictionnary would technically be Mistrals

This is getting off-topic, but anyway…

The Larousse definition is wrong, that’s for sure. The Tramontane comes from the West, between the Pyrenees and the Massif Central, it is not at all the same current as the Mistral.

I am not sure how prevalent “les Mistrals” is in the literature. I don’t doubt that some people wrote this, possibly for some poetic effect, but it sounds very wrong as well. Mistral is a proper noun, and it is not collective like “Alizés”. It means specifically the wind that blows along the Rhône valley, there cannot be more than one.

[edit] as others pointed out, there is the Mistral gagnant sweet, which can indeed be plural.

reply
Rygian
1 day ago
[-]
On the subject of French plurals, you also get some counterintuitive ones:

- Egg: un œuf (pronounced /œf/), des œufs (pronounced /œ/ !)

- Bone: un os (pronounced /os/), des os (pronounced /o/ !)

reply
mytailorisrich
1 day ago
[-]
Mistral is essentially never in plural form because it is the name of a specific wind.

The only plural form people will probably know is from the song Mistral Gagnant where the lyrics include les mistrals gagnants but that refers to sweets!

Not sure why anyone would think "les mistraux"... ;)

reply
ucarion
1 day ago
[-]
I'm not sure if being from the north of France changes things, but I think the Renaud song is much more familiar to folks I know than the wind.
reply
kergonath
19 hours ago
[-]
This is probably heavily population-dependent. I don’t think they named the Mistral-class ships after the song or the sweet.

https://en.m.wikipedia.org/wiki/Mistral-class_landing_helico...

reply
Spone
1 day ago
[-]
The song actually refers to a kind of candy named "Mistral gagnant"

https://fr.m.wikipedia.org/wiki/Mistral_gagnant_(confiserie)

reply
mytailorisrich
1 day ago
[-]
Well yes, it is a Mediterranean wind!
reply
tarruda
1 day ago
[-]
They didn't add a comparison to Qwen 2.5 3b, which seems to surpass Ministral 3b on MMLU, HumanEval, and GSM8K: https://qwen2.org/qwen2-5/#qwen25-05b15b3b-performance

These benchmarks don't really matter that much, but it is funny how this blog post conveniently forgot to compare with a model that already exists and performs better.

reply
DreamGen
1 day ago
[-]
Also, the 3B model, which is API-only (so the only things that matter are price, quality, and speed), should be compared to something like Gemini Flash 1.5 8B, which is cheaper than this 3B API and also has higher benchmark performance, super long context support, etc.
reply
butterfly42069
1 day ago
[-]
At this point the benchmarks barely matter at all. It's entirely possible to train for a high benchmark score and reduce the overall quality of the model in the process.

Imo, use the model that makes the most sense when you ask it stuff, and personally I'd go for the one with the least censorship (which imo isn't Alibaba's Qwen anything).

reply
lairv
1 day ago
[-]
Hard to see how Mistral can compete with Meta: they have an order of magnitude less compute, and their models are only slightly better (at least on the benchmarks), with less permissive licenses.
reply
leetharris
1 day ago
[-]
In general I feel like all model providers eventually become infrastructure providers. If the difference between models is very small, it will be about who can serve it reliably, with the most features, with the most security, at the lowest price.

I'm the head of R&D at Rev.ai and this is exactly what we've seen in ASR. We started at $1.20/hr, and our new models are $0.10/hr in < 2 years. We have done human transcription for ~15 years and the revenue from ASR is 3 orders of magnitude less ($90/hr vs $0.10/hr) and it will likely go lower. However, our volumes are many orders of magnitude higher now for serving ASR, so it's about even or growth in most cases still.

I think for Mistral to compete with Meta they need a better API. The on-prem/self-hosted people will always choose the best models for themselves and you won't be able to monetize them in a FOSS world anyways, so you just need the better platform. Right now, Meta isn't providing a top-tier platform, but that may eventually change.

reply
cosmosgenius
1 day ago
[-]
Their 12b Nemo model is very good in a homelab compared to Llama models. This is for story creation.
reply
dotnet00
1 day ago
[-]
For one, Mistral's models seem less censored and less rambly than the Llama models.
reply
blihp
1 day ago
[-]
They can't since Meta can spend billions on models that they give away and never need to get a direct ROI on it. But don't expect Meta's largess to persist much beyond wiping out the competition. Then their models will probably start to look about as open as Android does today. (either through licensing restrictions or more 'advanced' capabilities being paywalled and/or API-only)
reply
sangnoir
1 day ago
[-]
> But don't expect Meta's largess to persist much beyond wiping out the competition

I don't quite follow your argument - what exactly is Meta competing for? It doesn't sell access to hosted models and shows no interest in being involved in the cloud business. My guess is Meta is driven by enabling wider adoption of AI, and their bet is that more (AI-generated) content is good for its existing content-hosting-and-ad-selling business, and good for its aspirational Metaverse business too, should it pan out.

reply
blihp
1 day ago
[-]
I'm arguing that Meta isn't in this for altruistic reasons. In the short term, they're doing this so Apple/Google can't do to them with AI tech what they've done to them with mobile/browsers. (i.e. Meta doesn't want them owning the stack, and therefore controlling and dictating, who can do what with it) In the longer term: Meta doesn't sell access... yet. Meta shows no interest... yet. You could have said the same thing about Apple and Google 15+ years ago about a great many things. This has all happened before and this will all happen again.
reply
thrance
1 day ago
[-]
In Europe, they are basically the only LLM API provider that is GDPR compliant. This is a big factor here, when selecting a provider.
reply
TheFragenTaken
1 day ago
[-]
With the advent of OSS LLMs, it's "just" a matter of renting compute.
reply
isoprophlex
1 day ago
[-]
Azure openai is definitely compliant...
reply
vineyardmike
1 day ago
[-]
Are all the big clouds not GDPR compliant?

Hard to imagine anyone competing with AWS/GCP/Azure for slices of GPUs/TPUs. AFAIK, most major models are available a la carte via API on these providers (with a few exclusives). I can’t imagine how anyone can compete with the big clouds on serving an API, and I can’t imagine them staying “non-compliant” for long.

reply
thrance
1 day ago
[-]
Maybe, but when selling a SaaS here, big clients will always ask what cloud provider you use. Using a European one is always a plus, if it isn't simply required.
reply
simonw
1 day ago
[-]
Yeah, the license thing is definitely a problem. It's hard to get excited about an academic research license for a 3B or 8B model when the Llama 3.1 and 3.2 models are SO good, and are licensed for commercial usage.
reply
sigmar
1 day ago
[-]
To be clear: these Ministral models are also licensed for commercial use, just not freely licensed for commercial use. And Meta also has restrictions on commercial use (you have to display “Built with Meta Llama 3” and need to pay Meta if you exceed 700 million monthly users).
reply
sthatipamala
1 day ago
[-]
You need to pay Meta only if you already had 700 million monthly users as of the Llama 3 release date, not if you reach that number at any time going forward.
reply
simonw
1 day ago
[-]
... or presumably if you build a successful company and then try to sell that company to Apple, Microsoft, Google or a few other huge companies.
reply
tarruda
1 day ago
[-]
> need to pay meta if you exceed 700 million monthly users

Seems like a good problem to have

reply
harisec
1 day ago
[-]
Qwen 2.5 models are better than Llama and Mistral.
reply
speedgoose
1 day ago
[-]
I disagree. I tried the small ones but they too frequently output Chinese when the prompt is English.
reply
harisec
1 day ago
[-]
I never had this problem but i guess it depends on the prompt.
reply
espadrine
1 day ago
[-]
> Hard to see how can Mistral compete with Meta

One significant edge: Meta does not dare even distribute their latest models (the 3.2 series) to EU citizens. Mistral does.

reply
kergonath
19 hours ago
[-]
I am not sure why you are downvoted, because AFAICT this is still true. It was definitely true when they were released.
reply
gunalx
1 day ago
[-]
Not having open-ish weights is a total dealbreaker for me. The only really compelling reason for sub-6B models is that they are easy to run even on consumer hardware or on the edge.
reply
smcleod
1 day ago
[-]
It's pretty hard to claim it's the world's best and then not compare it to Qwen 2.5...
reply
daghamm
1 day ago
[-]
Yeah, Qwen does great in benchmarks, but is it really that good in real use?
reply
akvadrako
1 day ago
[-]
In my experience using it for storytelling, no. Even their largest model likes to produce garbage that isn't even words when you turn the temperature up a bit. And without that, it's really bland.
reply
smcleod
1 day ago
[-]
Yes it's fantastic, really great for both general use and coding.
reply
sharkjacobs
1 day ago
[-]
I know that Mistral is a French company, but I think it's really clever marketing the way they're using French language as branding.
reply
mergisi
1 day ago
[-]
Just started experimenting with Ministral 8B! It even passed the "strawberry test"! https://x.com/mustafaergisi/status/1846861559059902777
reply
fhdsgbbcaA
5 hours ago
[-]
That just means it was trained recently enough to have it in training data.
reply
barbegal
1 day ago
[-]
Does anyone know why Mistral uses a 17-bit (131k) vocabulary? I'm sure it's more efficient at encoding text, but each token doesn't fit into a 16-bit register, which must make it less efficient computationally?
reply
cpldcpu
1 day ago
[-]
The tokens are immediately transformed into embeddings (very large vectors), so the 17-bit values are not used for any computation.
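A minimal sketch of why the id width doesn't matter: each id is used exactly once, as an index into the embedding table, after which the model only sees float vectors. The sizes below are toy values except the vocabulary size.

```python
import math
import random

VOCAB_SIZE = 131_072   # 2**17 token ids, as in Mistral's tokenizer
D_MODEL = 8            # toy embedding width; real models use thousands

# 131k ids need 17 bits, but frameworks store token ids as int32/int64
# tensors anyway, so crossing the 16-bit boundary costs nothing.
assert math.log2(VOCAB_SIZE) == 17

rng = random.Random(0)
embedding_table = [
    [rng.gauss(0.0, 1.0) for _ in range(D_MODEL)] for _ in range(VOCAB_SIZE)
]

def embed(token_ids):
    # One table lookup per id; from here on the model works entirely
    # on these float vectors, never the raw integer ids.
    return [embedding_table[i] for i in token_ids]

vectors = embed([0, 42, 131_071])
assert len(vectors) == 3 and len(vectors[0]) == D_MODEL
```

The larger vocabulary trades a bigger embedding table (and output softmax) for shorter token sequences, which is usually a win since attention cost grows with sequence length.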
reply
daghamm
1 day ago
[-]
I don't get it.

These are impressive numbers. But while their use case is local execution to preserve privacy, the only way to use these models right now is via their API?

reply
druskacik
1 day ago
[-]
How many Mistral puns are there?

The benchmarks look promising, great job, Mistral.

reply
amelius
1 day ago
[-]
I have no idea how to compare the abilities of these models, so I have no idea how much of a deal this is.
reply
zurfer
1 day ago
[-]
poor title, Mistral released new open weight models that win across benchmarks in their weight class: Ministral 3B and Ministral 8B
reply
scjody
1 day ago
[-]
Are they really open weights? Ministral 3B is "Mistral Commercial License".
reply
leetharris
1 day ago
[-]
Yeah, the 3B weights are NOT open. The 8B weights are out, and they can be used commercially under a paid license.
reply
diggan
1 day ago
[-]
"commercial license != open", by most standards
reply
zurfer
1 day ago
[-]
too late to edit now. I was completely wrong about open-weights.

The meme at the bottom made me jump to that conclusion. Well, not that exciting of a release then. :(

reply
DreamGen
1 day ago
[-]
That would be misleading. They aren't open weight (the 3B is not available). They aren't compared to Qwen 2.5, which beats them on many of the benchmarks presented while having a more permissive license. And the closed 3B is not competitive with other API-only models, like Gemini Flash 8B, which costs less and performs better.
reply
WiSaGaN
1 day ago
[-]
"For self-deployed use, please reach out to us for commercial licenses. We will also assist you in lossless quantization of the models for your specific use-cases to derive maximum performance.

The model weights for Ministral 8B Instruct are available for research use. Both models will be available from our cloud partners shortly."

reply