
Mistral strikes three times at GTC: why it changes everything for European AI

Artificial Intelligence
Nicolas
12 min read

On March 17, 2026, the Nvidia GTC conference in San José became the stage for an offensive that few had seen coming.

In a single day, Mistral AI shipped three distinct products, signed a strategic partnership with Nvidia, and made its global ambitions unmistakably clear.

A new unified model, a sovereign training platform, a formal code verification agent: three products that, taken separately, look like routine technical releases.

Taken together, they sketch a major repositioning: Mistral is no longer selling a model, but a complete stack, capable of competing with OpenAI and Anthropic on their own turf.

Key takeaways:

  • Mistral Small 4 unifies reasoning, multimodal and agentic code in a single MoE model (119B total / 6B active), with an industry-first reasoning_effort parameter
  • Forge trains a model from scratch on your data: not fine-tuning, not RAG, but full ownership of a model you can deploy on-premise
  • Leanstral generates formal code proofs in Lean 4: scoring 26.3 on FLTEval for $36, vs. 23.7 for Claude Sonnet 4.6 at $549, a cost-performance ratio 15 times better
  • The Nvidia partnership and the Nemotron coalition position Mistral as co-architect of global open AI standards, at the cost of a real sovereignty paradox
  • Mistral is targeting one billion dollars in annual recurring revenue in 2026: these three announcements are a show of strength, not just product launches

Three announcements, one strategy

Choosing Nvidia GTC as the launch pad was no accident.

Jensen Huang's annual conference has become the Davos of AI: every announcement made there is amplified to a global audience of technical decision-makers.

Mistral didn’t just show up with a product: the startup used the stage to send a signal of global ambition.

Until now, Mistral was seen as the European champion of open-source AI, a national alternative to American giants.

These three simultaneous announcements shift the narrative: Small 4 targets the enterprise market with a versatile model, the Nemotron coalition positions Mistral as co-architect of global open AI standards, and Leanstral proves the depth of an R&D operation that goes far beyond conversational models.

Arthur Mensch, Mistral’s CEO, put it plainly:

Open frontier models are how AI becomes a real platform.

Together with Nvidia, we will play a central role in training and advancing large-scale frontier models.

This is no longer the language of a startup fighting to survive: it’s the language of a company that co-builds industry standards.

Small 4: one model, three capabilities

Before March 17, technical teams looking to get the most out of Mistral had to juggle multiple specialized tools: Magistral for reasoning, Pixtral for multimodal, Devstral for agentic code.

Mistral Small 4 ends that fragmentation with a single unified architecture.

The model is built on a Mixture of Experts architecture: 128 experts in total, of which only 6 billion parameters are active per request, out of 119 billion overall.

The practical result: 40% lower latency and three times more requests per second compared to Small 3, with a context window extended to 256,000 tokens for processing long documents without chunking.
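A quick back-of-the-envelope sketch of what those MoE figures imply, under the simplifying assumption that per-request compute scales roughly with active parameters (memory footprint, of course, still depends on all 119B weights):

```python
# Back-of-the-envelope look at the MoE figures quoted above.
# Simplifying assumption: per-request compute scales with ACTIVE parameters,
# while memory must still hold ALL parameters across the 128 experts.

total_params = 119e9   # total parameters across all experts
active_params = 6e9    # parameters activated per request

active_fraction = active_params / total_params
print(f"Active fraction per request: {active_fraction:.1%}")  # ~5.0%
```

That ~5% active fraction is what makes the latency and throughput gains over a dense model of comparable capacity plausible.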

The real breakthrough is the reasoning_effort parameter: a reasoning depth dial with no industry precedent.

Set to “none”, the model responds as fast as Small 3, ideal for a chatbot or lightweight agent orchestration.

Set to “high”, it switches to deep reasoning comparable to Magistral, suited for complex analytical tasks.

A single deployment replaces three distinct models: for DevOps teams, that means less infrastructure to manage and lower operational costs.
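To make the dial concrete, here is a hypothetical sketch of toggling reasoning depth per request against Mistral's chat completions endpoint. The model id "mistral-small-4" and the top-level "reasoning_effort" field are illustrative assumptions, not confirmed API names; check the official API reference before relying on them:

```python
# Hypothetical sketch: one deployment, per-request reasoning depth.
# ASSUMPTIONS: "mistral-small-4" and the "reasoning_effort" field are
# illustrative names, not confirmed parts of the Mistral API.
import json
import os
import urllib.request

def build_payload(prompt: str, effort: str = "none") -> dict:
    """Build a chat request with a per-call reasoning depth."""
    return {
        "model": "mistral-small-4",      # assumed model id
        "reasoning_effort": effort,      # "none" | "high", per the article
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(prompt: str, effort: str = "none") -> str:
    """Send the request (stdlib only; not executed in this sketch)."""
    req = urllib.request.Request(
        "https://api.mistral.ai/v1/chat/completions",
        data=json.dumps(build_payload(prompt, effort)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('MISTRAL_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# ask("Summarize this ticket", effort="none")        # chatbot-speed path
# ask("Audit this contract clause", effort="high")   # deep-reasoning path
```

The point of the pattern: the same deployed model serves both the fast chatbot path and the deep analytical path, which is what lets one deployment replace three.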

On benchmarks, Small 4 with reasoning enabled matches or outperforms GPT-OSS 120B on three key evaluations (AA LCR, LiveCodeBench, AIME 2025), with outputs that are 20% shorter on LiveCodeBench.

Shorter responses don't mean lower quality: for Small 4, they translate into lower latency, and production inference costs cut in half compared to comparable Qwen models at equal performance.

Priced at $0.15 per million input tokens and $0.60 per million output tokens, Small 4 goes head-to-head with the hyperscalers’ budget model tier.
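At those list prices, a rough monthly estimate is straightforward; the traffic figures below are invented for illustration:

```python
# Monthly cost estimate at the list prices quoted above.
# The request volume and token counts are made-up illustration values.
PRICE_IN = 0.15 / 1_000_000   # $ per input token
PRICE_OUT = 0.60 / 1_000_000  # $ per output token

def monthly_cost(requests_per_day: int, in_tokens: int,
                 out_tokens: int, days: int = 30) -> float:
    per_request = in_tokens * PRICE_IN + out_tokens * PRICE_OUT
    return requests_per_day * per_request * days

# e.g. a support bot: 10,000 requests/day, 800 tokens in, 300 tokens out
print(f"${monthly_cost(10_000, 800, 300):,.2f} / month")  # $90.00 / month
```

An order of magnitude of tens of dollars per month for a mid-traffic bot is exactly the budget tier the hyperscalers' cheapest models occupy.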

The whole thing is released under Apache 2.0 license, available on Hugging Face, the Mistral API, and as an optimized Nvidia NIM container for on-premise deployments.

To make the most of these numbers without getting tripped up by out-of-context figures, our guide to reading AI benchmarks without being misled gives you the tools you need.

Forge: the real shift for enterprise

Mistral Forge is arguably the week’s most structural announcement, the one most likely to fly under the radar for anyone looking only for a product roundup.

Forge is not fine-tuning: adapting a few thousand examples on top of an existing model.

Forge is not RAG: connecting an external document store to a generic model.

Forge is a platform that gives a company or government the ability to train a frontier-level model from scratch, on its own internal data, under its own compliance policies.

[Illustration: an abstract precision dial with a glowing orange needle, representing Mistral Small 4's reasoning_effort parameter]

Here’s what that means in practice: the European Space Agency can train a model on 30 years of telemetry data without that data ever leaving its own infrastructure, and get a model that natively understands its operational jargon and specific regulatory constraints.

The platform covers the full model lifecycle: pre-training on large volumes of internal data, post-training to refine on specific tasks, and reinforcement learning to align behavior with business objectives.

Forge can run on Mistral’s infrastructure, on dedicated clusters via Mistral Compute, or directly on-premise at the client’s site: in sectors like defense, finance or healthcare, data simply cannot leave the building.

The other differentiator is human: Mistral engineers are deployed directly at client sites, a model borrowed from IBM and Palantir, to identify the right data and adapt models to real-world needs.

Elisa Salamanca, Mistral’s Head of Product, elaborates:

We embed engineers and even scientists at client sites.

It’s a major differentiator for us: we’re the only ones doing it this way.

The first announced partners span very different profiles: ASML (semiconductors, Series C investor), Ericsson (telecoms), the European Space Agency, Reply (consulting, Italy), and Singapore’s government agencies DSO and HTX.

This approach speaks directly to the core question around LLMs in business decisions: generic models can’t capture the depth of an organization without being trained on its real data.

AWS launched its Nova Forge service in late 2025 with a similar positioning: Mistral responds with a more sovereign, more portable offer, and hands-on engineering support that hyperscalers don't provide.

Leanstral: formal proof as a secret weapon

The third announcement is the quietest of the day, and potentially the most important over the long term.

An AI agent generates code that “looks like it works,” and a human has to review, test and fix the errors: that’s the bottleneck of every coding tool today.

Lean 4 is a formal proof assistant that produces the mathematical proof that a piece of code does exactly what it was asked to do.

If the proof is valid, the code is guaranteed correct in the mathematical sense: not “probably correct,” not “correct in tested cases,” but correct for every possible valid input.

Leanstral acts as a notary for code: it refuses to sign off until every clause has been verified.
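To make the idea concrete, here is a toy Lean 4 snippet (unrelated to Leanstral itself): a function plus a theorem that the compiler machine-checks, so the property holds for every input, not just tested ones.

```lean
-- A toy example of what "proved correct" means in Lean 4.
-- `double` is ordinary code; the theorem is a machine-checked guarantee
-- that holds for EVERY natural number, not just the cases we tested.
def double (n : Nat) : Nat := n + n

theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```

If this file compiles, the property is proven; Leanstral's job is to generate proofs of this kind automatically for real pull requests.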

Mistral trained a 120-billion-parameter agent (6B active), specialized in automatically generating these proofs in Lean 4, with direct compiler integration via the MCP protocol.

On the FLTEval benchmark (based on real pull requests from the Fermat’s Last Theorem formalization project at Imperial College London):

  • Leanstral pass@2: score 26.3 for $36
  • Claude Sonnet 4.6: score 23.7 for $549
  • Claude Opus 4.6: score 39.6 for $1,650

The ratio is 1 to 15 in Leanstral’s favor compared to Sonnet, for superior performance on this specific task.
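The headline ratio follows directly from the published figures:

```python
# Reproducing the cost ratio from the FLTEval numbers above.
leanstral_cost, leanstral_score = 36, 26.3    # pass@2
sonnet_cost, sonnet_score = 549, 23.7         # Claude Sonnet 4.6

cost_ratio = sonnet_cost / leanstral_cost
print(f"Cost ratio: {cost_ratio:.2f}x")  # ~15.25x, at a higher score (26.3 vs 23.7)
```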

A critical reading is warranted: FLTEval is an in-house benchmark, not reproduced by third parties, and the comparison pits a specialized 6B active-parameter model against general-purpose Swiss Army knives.

The sectors targeted are those where “looks like it works” isn’t good enough: aerospace, capital markets, smart contracts, critical medical systems.

Formal verification has been technically possible for decades.

It was economically out of reach for most teams.

At $36 for a result that outperforms frontier models on this specialized task, the cost objection is starting to crumble.

Leanstral is available immediately via Mistral Vibe (command /leanstall), via a free API endpoint (labs-leanstral-2603) for a limited time, and as a direct download under Apache 2.0 license.

What this changes for European AI

Let’s put the numbers on the table: Arthur Mensch confirmed that Mistral is on track to exceed one billion dollars in annual recurring revenue in 2026.

For a three-year-old European startup facing OpenAI and Anthropic, that’s a signal of commercial maturity that goes well beyond the “national champion” label.

The Nemotron coalition is strategically the week’s most important decision: by joining Nvidia as a founding member, alongside Perplexity, Cursor, LangChain and Black Forest Labs, Mistral co-develops the foundation models on which the rest of the global industry will build its applications.

This is a fundamental shift in status: from model vendor to co-architect of global open AI standards.

The sovereignty paradox deserves to be named clearly: the DGX cloud on which Nemotron 4 will be trained belongs to an American company subject to US extraterritorial regulations.

Mistral is aware of this and is simultaneously investing in its own computing capacity in France and Sweden via Mistral Compute (born from the acquisition of Koyeb in February 2026), so that this partnership remains a strategic choice rather than an imposed constraint.

[Illustration: Mistral Small 4's Mixture-of-Experts architecture, a luminous hexagonal lattice radiating from a central orange node on a black background]

Our in-depth analysis of Mistral’s digital sovereignty strategy details how the startup navigates between global ambition and European roots.

For French SMBs, the reality is direct: analysts estimate that from-scratch training remains realistic only for large enterprises with strong AI teams, significant budgets and specific data advantages.

Fine-tuning Small 4 and integrating via the Mistral API will remain the most relevant and cost-effective path for the majority of organizations.

But here’s what these three announcements change for the architecture of AI in Europe: they prove that a credible alternative exists, that it can keep pace with American giants on innovation, and that it’s building a coherent offering that spans from a lightweight model (Small 4) all the way to a sovereign training platform (Forge).

That’s the difference between an actor that survives and one that shapes a global market.

Conclusion

In a single day at GTC 2026, Mistral demonstrated that it has understood something many AI startups miss: one product is not enough to build an industrial infrastructure.

You need a model for the mass market (Small 4), a platform for sovereign use cases (Forge), and R&D that proves technological depth (Leanstral).

The questions around the Nvidia paradox deserve to be tracked over time: open source doesn’t guarantee independence when the training infrastructure remains in the hands of an American player.

For now, the demonstration lands: Mistral is playing in the big leagues, on its own terms.

Try Mistral Small 4 directly on Hugging Face and tell us in the comments whether the reasoning_effort parameter concretely changes your workflow.

FAQ

What is Mistral Small 4 and how does it differ from Small 3?

Mistral Small 4 unifies reasoning, multimodal and agentic code in a single Mixture of Experts architecture (119B total, 6B active) with a 256,000-token context window, 40% lower latency than Small 3, and the reasoning_effort parameter to modulate reasoning depth on the fly.

Is Mistral Forge fine-tuning?

No: Forge is from-scratch training, meaning the construction of a complete model from the ground up on an organization’s internal data, as opposed to fine-tuning (adapting an existing model) and RAG (connecting an external document store).

Who are the first Mistral Forge partners?

Launch partners include ASML (semiconductors), Ericsson (telecoms), the European Space Agency, Reply (consulting) and Singapore’s government agencies DSO and HTX, all sectors where data cannot leave internal infrastructure.

What is Lean 4 and why does Leanstral matter?

Lean 4 is a formal proof assistant that generates a mathematical proof that code is correct for every valid input; Leanstral automates that proof generation with only 6B active parameters, making a practice once reserved for rare specialists accessible to ordinary engineering teams.

How should I interpret Leanstral’s FLTEval benchmarks?

FLTEval measures performance on real pull requests from the Fermat’s Last Theorem formalization project: Leanstral (pass@2) scores 26.3 points for $36 vs. 23.7 points for $549 with Claude Sonnet 4.6, but this is an in-house benchmark comparing a specialist to generalists, which calls for caution.

What is the Nemotron coalition and why did Mistral join it?

The Nemotron coalition brings together Mistral, Perplexity, Cursor, LangChain and Black Forest Labs to co-develop Nemotron 4, an open-source foundation model trained on Nvidia’s DGX cloud, giving Mistral a role as co-architect of global open AI standards.

Does the Nvidia partnership create a dependency for Mistral?

It’s a real tension: the DGX cloud on which Nemotron 4 will be trained belongs to Nvidia (an American company subject to US extraterritorial regulations), which is why Mistral is simultaneously investing in its own computing capacity in France and Sweden via Mistral Compute.

Can Mistral Small 4 run locally in my company?

Yes: Small 4 is available as an Nvidia NIM container for on-premise deployments and for direct download on Hugging Face under Apache 2.0 license, with a minimum of 4 HGX H100 GPUs required, making it an enterprise-infrastructure-only model.

What’s the difference between Mistral Forge and AWS Nova Forge?

AWS Nova Forge is cloud-centric (data transits through Amazon), while Mistral Forge offers full on-premise deployment with Mistral engineers embedded at client sites (the Palantir/IBM model) and full ownership of the resulting model.

Is Mistral Forge for SMBs?

Not right away: from-scratch training requires significant budgets and expert AI teams, making it a tool for large enterprises and regulated sectors; for SMBs, fine-tuning Small 4 via the Mistral API remains the most cost-effective path.
