On April 17, 2026, OpenAI launched GPT-Rosalind, its first reasoning model designed for biology, drug discovery, and translational medicine.
The name honors Rosalind Franklin, whose X-ray crystallography made the structure of DNA visible.
The English-speaking press immediately referred to it as a “direct competitor to AlphaFold 3,” which is technically incorrect.
This confusion could cost French-speaking R&D teams several days as they decide what to test this week.
This guide breaks down the announcement: what the model does, the published benchmarks, its relationship with AlphaFold 3 and Chai-1, and the decisions a French biotech CTO needs to make.
In brief
- GPT-Rosalind orchestrates, doesn’t fold proteins: it reads literature, formulates hypotheses, plans experiments, and delegates 3D structure to AlphaFold 3 or Chai-1
- Quantified benchmarks: BixBench 0.751 vs 0.732 for GPT-5.4 and 0.698 for Grok 4.2, 6 families beaten out of 11 in LABBench2, 95th human percentile at Dyno Therapeutics
- Free preview but closed to France: access reserved for qualified US companies via the Trusted Access Program, with mandatory biosafety verification
- Fine-tuning to skepticism: first time OpenAI delivers a model trained to reject weak targets rather than comply
- Two paths for French-speaking R&D: wait for EU access or switch to an open stack of ESM3, Chai-1, and self-hosted Boltz-1
What OpenAI announced on April 17, 2026
OpenAI positioned GPT-Rosalind as the first model in a vertical series dedicated to life sciences
The announcement was relayed the same day by FierceBiotech, Pharmaphorum, and The Next Web
The model is trained to reason about biology, not to generate 3D protein structures
Public partners include Amgen, Moderna, Thermo Fisher Scientific, Novo Nordisk, Broad Institute, Allen Institute, Los Alamos, and UCSF School of Pharmacy
NVIDIA provides the computing power, Benchling ensures connection with digital lab notebooks
Séan Bruich, SVP AI at Amgen, summarizes the logic: accelerating therapeutic timelines by applying advanced tools to new tasks
Joy Jiao, head of Life Sciences Research at OpenAI, tempers: the model is designed to speed up time-consuming phases, not to replace researchers
A reasoning model, not a structural predictor
The modern pharma R&D chain stacks several specialized models that speak different languages
At the top of the stack, a scientific reasoning engine reads articles and plans steps, while below, structural predictors like AlphaFold 3, Chai-1, or Boltz-1 transform a sequence into 3D atomic coordinates
GPT-Rosalind occupies the first tier, not the second
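The two-tier split can be sketched in a few lines of Python. This is an illustration of the architecture described above, not OpenAI's actual API: the function names and the mock predictor are invented for the example.

```python
# Minimal two-tier sketch: a reasoning layer that delegates 3D structure
# prediction to an external tool, the way GPT-Rosalind is described as
# calling AlphaFold 3 or Chai-1. All names here are illustrative.

def mock_structure_predictor(sequence: str) -> dict:
    """Stand-in for AlphaFold 3 / Chai-1: returns placeholder output."""
    return {"sequence": sequence, "atoms": len(sequence) * 8}

TOOLS = {"fold": mock_structure_predictor}

def reasoning_layer(task: str, payload: str) -> dict:
    """The orchestrator never folds anything itself: it routes."""
    if task == "fold":
        return TOOLS["fold"](payload)
    # Literature synthesis, hypothesis ranking, etc. stay in the LLM tier.
    return {"task": task, "handled_by": "reasoning_model"}

print(reasoning_layer("fold", "MKTAYIAKQR"))
```

The point of the sketch is the shape, not the code: the top tier decides, the bottom tier computes coordinates.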
What GPT-Rosalind does
The model is calibrated for five specific tasks listed by OpenAI:
- Scientific literature synthesis: compiling hundreds of PubMed articles and extracting converging evidence
- Mechanistic hypothesis generation: proposing biological pathways ranked by robustness
- Experimental planning: CRISPR protocols, cloning, and cell assays with reagent selection
- Multi-omics interpretation: reading transcriptomic, proteomic, and metabolomic datasets
- Target validation: ranking a therapeutic pipeline based on feasibility criteria
What it doesn’t do
GPT-Rosalind does not generate 3D atomic coordinates
It doesn’t fold proteins, design de novo molecules, or replace traditional docking
OpenAI points to AlphaFold 3 and Chai-1 for these tasks, which the model calls as external tools
The common mistake is to present the model as an alternative to AlphaFold 3: they coexist, they don’t replace each other
The Codex plugin for life sciences
OpenAI has delivered a free plugin for Codex
It connects mainstream models to over 50 scientific tools: PubMed, ClinicalTrials.gov, UniProt, ChEMBL, AlphaFold Atlas, and STRING-DB
For a lab without GPT-Rosalind access, this plugin is already a usable tool
It fits into the logic of reasoning inaugurated with o1, applied to a highly regulated field
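A lab can already wire up one of these sources by hand. The sketch below builds a query URL for PubMed using NCBI's public, documented E-utilities endpoint; the search term is an arbitrary example, and no OpenAI plugin is involved.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint (documented by NCBI).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_search_url(term: str, retmax: int = 20) -> str:
    """Build an esearch URL returning PubMed IDs as JSON."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS}?{urlencode(params)}"

url = pubmed_search_url("colorectal cancer AND antibody therapy")
print(url)
```

Fetching that URL returns a JSON list of PubMed IDs, which is the raw material any literature-synthesis layer starts from.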
Published benchmarks and the “default skeptic” bar
Three evaluations anchor the model’s promises: BixBench, LABBench2, and an industrial collaboration with Dyno Therapeutics
The figures come from OpenAI and have been confirmed by FierceBiotech, with the useful caveat of a training bias on public evaluations
BixBench and LABBench2: public measurement
BixBench is a bioinformatics benchmark maintained by Edison Scientific: 53 analysis scenarios, 296 questions, agent placed in front of an empty Jupyter notebook
On this bench, GPT-Rosalind scores 0.751 pass@1, ahead of GPT-5.4 at 0.732, GPT-5 at 0.728, Grok 4.2 at 0.698, and Gemini 3.1 Pro at 0.550
The gap over the best generalist model is about 2 points, not ten
LABBench2 spans 1,900 tasks across 11 families, and GPT-Rosalind beats GPT-5.4 in 6 of them, with the biggest gain on CloningQA
The Dyno Therapeutics test: the real signal
The most cited measure comes from an evaluation at Dyno Therapeutics, specializing in designing AAV capsids for gene therapy
Dyno provided unpublished, novel RNA sequences to avoid benchmark contamination
On sequence-function prediction, GPT-Rosalind’s top ten submissions reached above the 95th percentile of human experts
On generation, the score drops to 84th percentile, a result to be read with caution
This figure measures a specific sub-task, not the model’s overall performance on all biological R&D
Fine-tuning to skepticism
The most interesting differentiation point isn’t a number, it’s a training choice
OpenAI conditioned GPT-Rosalind to reject weak targets rather than validate by default
The model is trained to say "this hypothesis lacks evidence, I don't validate it" instead of crafting a pleasing response. For a team validating a poorly framed brief, that stance can save three months of useless deliverables
In a context where a poorly prioritized target can cost over $2 billion in the clinical cycle, shifting from a compliant model to a disagreeable one has direct economic value
This is the first time OpenAI publicly delivers this stance

Mapping bio-AI verticals: GPT-Rosalind orchestrates, others execute
Useful analogy: the 2026 bio-AI stack works like a Michelin-starred kitchen
Structural predictors are the station chefs, GPT-Rosalind is the head chef who sequences the service
Each has its specialty, none replaces the others
Proprietary and open source structural predictors
AlphaFold 3 remains the structural reference: 3D prediction of protein-ligand-DNA-RNA complexes, 50% gain on PoseBusters compared to AF2
Isomorphic Labs has signed deals totaling nearly $3 billion with Eli Lilly and Novartis
Evo 2 addresses long genomics: 40 billion parameters, 1 megabase context, useful for regulatory regions
ESM3 unifies sequence, structure, and function in a single multimodal model
Chai-1 matches AlphaFold 3 quality in open source, with or without multiple sequence alignments
Boltz-1 is even more hackable, training and architecture published, currently the most transparent model of the lot
Roles in an R&D workflow
Reading grid for an R&D manager:
- GPT-Rosalind: orchestrator, literature synthesis, hypotheses, planning
- AlphaFold 3: 3D structure of complexes, restricted pharma access
- Evo 2: long genomics open source up to 1 Mb
- ESM3: multimodal protein design with constraints
- Chai-1 and Boltz-1: open alternatives to AlphaFold 3, locally installable
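The reading grid above fits in a one-screen routing table. This is a hedged sketch: the task-to-tool mapping mirrors the list, everything else (names, error handling) is invented for illustration.

```python
# Route an R&D task family to the tier named in the reading grid.
STACK = {
    "literature_synthesis": "GPT-Rosalind",
    "hypothesis_ranking": "GPT-Rosalind",
    "experimental_planning": "GPT-Rosalind",
    "complex_structure_3d": "AlphaFold 3",
    "long_genomics": "Evo 2",
    "protein_design": "ESM3",
    "open_structure_3d": "Chai-1 / Boltz-1",
}

def route(task: str) -> str:
    """Return the tool a task should go to; fail loudly on unknowns."""
    try:
        return STACK[task]
    except KeyError:
        raise ValueError(f"no tier declared for task: {task!r}")

print(route("complex_structure_3d"))  # AlphaFold 3
```

Making the routing explicit is what distinguishes an orchestrated stack from the "ad hoc scripts" most teams run today.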
A biotech CTO who tells their board "we're switching from AlphaFold 3 to GPT-Rosalind" is making a category error: the real move is from "ad hoc scripts" to "an orchestrator that properly calls AlphaFold 3"
Restricted access, unclear pricing, sovereignty, and biosafety
The economic and legal aspect, little covered by the French press, determines what a French biotech SME can do this week
The preview: free, US Enterprise, safety review
During the research preview, using GPT-Rosalind consumes neither tokens nor credits
Access is conditioned on four cumulative criteria:
- An Enterprise contract: ChatGPT Enterprise, Codex, or API Enterprise
- US headquarters or legal entity: EU structures are not supported
- Legitimate biological research use oriented towards human health
- Compliance controls: SOC 2 Type 2, HIPAA with a BAA available, plus a biosafety audit
OpenAI has not published any post-preview pricing, and the reasonable assumption remains a custom Enterprise license
FR/EU sovereignty and biosafety
For a French-speaking lab, the equation is clear: no direct access before EU opening
Available routes are limited to detours via a qualified US partner, typically an Amgen subsidiary or a Broad Institute collaborator
Regarding GDPR, health data falls under Article 9, and contracting must provide for a transfer outside the EU
More than 100 researchers have signed an open letter calling for stricter control of sensitive biological data
OpenAI responded with a system of “high-precision flags” that triggers an alert as soon as dual-use thresholds are crossed

Three concrete decisions to make this week
The French-speaking reader has three likely profiles, each with a different decision to make
Biotech SME and academic lab
Concrete case: a 12-person French biotech working on an anti-colorectal cancer antibody
Before: three weeks of cross-referencing literature and hypothesis meetings
With GPT-Rosalind via a Broad Institute partnership: 2 hours for 40 papers synthesized, 6 hypotheses ranked, and a validation protocol
The ROI depends on the entry cost into a US partnership, often with shared intellectual property
For an INSERM or CNRS lab without a US subsidiary, hosting ESM3 on the Jean Zay HPC remains viable and preserves data sovereignty
At Owkin or InstaDeep, the decision has already been made: keep investing in their own open source stack, and monitor GPT-Rosalind for literature synthesis, not for patient data
French-speaking AI dev: what return on investment?
For a freelancer or an IT service company building biotech consulting offers, adding a GPT-Rosalind workflow involves two investments
The first: mastering the free Codex plugin, accessible without the Trusted Access Program
The second: setting up a Chai-1 or Boltz-1 demo in parallel, to offer the open source version to clients blocked by US access
The hybrid stack of Codex plugin plus local open source covers 80% of use cases for a biotech SME without dependence on trusted access
Training represents about 5 days of ramp-up for a data engineer already comfortable with LangChain
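The fallback logic of that hybrid stack fits in a few lines. A hypothetical sketch: both callables are stand-ins, assuming a hosted tool that is unavailable without Trusted Access Program approval.

```python
# Hybrid-stack fallback: prefer the hosted tool when access exists,
# otherwise drop to the locally hosted open source predictor.
# Both functions are illustrative stand-ins, not real APIs.

def hosted_predict(seq: str) -> str:
    raise PermissionError("Trusted Access Program approval required")

def local_predict(seq: str) -> str:
    # e.g. a self-hosted Chai-1 or Boltz-1 instance
    return f"local-structure({seq})"

def predict_structure(seq: str) -> str:
    try:
        return hosted_predict(seq)
    except PermissionError:
        return local_predict(seq)

print(predict_structure("MKT"))  # falls back to the local tier
```

The same pattern lets a consulting offer degrade gracefully for clients blocked by US-only access.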
Forty-eight hours after launch, three signals deserve attention: the publication of the first independent reviews, the announcement of an EU opening, and a public post-preview pricing
The strategic question remains: when is a vertical model better than a generalist plus in-house RAG?
For further reading, see the 2026 technical guide to RAG applied to research
Frequently asked questions about GPT-Rosalind
Is GPT-Rosalind a direct competitor to AlphaFold 3?
No, GPT-Rosalind reasons and AlphaFold 3 predicts 3D structures, they coexist in an R&D stack where the former calls the latter as a tool
Can I access GPT-Rosalind from France today?
Not directly, access is reserved for qualified US companies via the Trusted Access Program, a French lab must go through a US partnership or wait for EU opening
How much does GPT-Rosalind cost?
During the research preview, usage does not consume tokens or credits for approved companies, no public post-preview pricing has been announced
Who are the announced partners?
Amgen, Moderna, Thermo Fisher, Novo Nordisk, Allen Institute, Broad Institute, Los Alamos, NVIDIA, Oracle Health, Benchling, UCSF School of Pharmacy, and Retro Biosciences
What does “fine-tuning to skepticism” mean?
OpenAI trained the model to reject weak targets or under-documented hypotheses rather than crafting a pleasing response
On which tasks does GPT-Rosalind outperform GPT-5.4?
On BixBench at 0.751 versus 0.732, and on 6 of the 11 LABBench2 families with the biggest gain on CloningQA
Can the model generate a new protein?
Not directly, de novo generation remains the domain of ESM3 or Chai-1, GPT-Rosalind can frame the request and call these tools
What open source alternatives exist today?
For 3D structure Chai-1 and Boltz-1 installable locally, for protein design ESM3, for long genomics Evo 2
What is the dual-use risk mentioned by researchers?
More than 100 scientists are calling for stricter control of training data due to fears of misuse on pathogens, OpenAI responded with high-precision flags
Which published figures are independently verifiable?
BixBench and LABBench2 scores are based on consultable Edison Scientific benchmarks, the Dyno test uses unpublished sequences and remains to be corroborated