The Chan Zuckerberg Initiative and NVIDIA are expanding their collaboration to accelerate development of AI-powered virtual cell models, committing to GPU-accelerated processing of petabyte-scale biology datasets and adding open imaging and RNA models plus shared benchmarks to CZI’s Virtual Cells Platform.
The effort pursues three strategies to relieve persistent bottlenecks in biological research: scale and harmonize biology datasets toward billions of cell-level observations; train larger, multimodal models across molecular, cellular and tissue scales; and make models, datasets and evaluations broadly usable in one place. That unified access point is CZI’s Virtual Cells Platform, which now hosts NVIDIA’s MONAI imaging models, the CodonFM RNA foundation model, and the new cz-benchmarks package for consistent comparisons. A fourth priority tackles the sector’s biggest constraint, selective or siloed data sharing, by opening data where possible, including negative results.
For CZI, the effort is foundational to its mission of curing, preventing or managing all disease by the end of the century. Patricia Brennan, VP of science technology and general manager of science, explained the vision: “We believe that AI, at the intersection of really understanding fundamental biology, is key to that. The more we can simulate those different aspects of the cell, we believe, the more we’ll have a better understanding of accelerating our understanding of the mechanisms and the pathways of disease.”
What’s shipping today
What’s actually holding back virtual cell models
In an interview with R&D World, CZI’s Patricia Brennan and NVIDIA’s Rory Kelleher discussed the data bottleneck slowing AI in biology, the shift from papers to platforms, and the role of openness across research and industry. Here are a few highlights:
On the data bottleneck
“Generation and harmonization of large biological datasets have proven to be a bottleneck for AI applications for science.”
(Patricia Brennan, CZI)
“The community is very eager to share, but getting the data into consistent, usable forms is not easy.”
(Patricia Brennan, CZI)
Platform as ecosystem
“The shortest time possible from when a researcher is ready to share to when another can use it is better, rather than having to contact them and ask for data.”
(Patricia Brennan, CZI)
“It is one thing to publish libraries and models and hope people find them. It is another to put them on the shelves where researchers already go for data, tools and other models.”
(Rory Kelleher, NVIDIA)
Why openness matters
“At the intersection of AI and biology, we see open as really important. If you are a startup or pharma, it is not as easy to be open.”
(Patricia Brennan, CZI)
“Ecosystems, research and startups thrive on a rich set of open AI tools and open models.”
(Rory Kelleher, NVIDIA)
NVIDIA’s infrastructure posture
“NVIDIA is not a healthcare solutions company; we are an accelerated computing platform company and an AI infrastructure company. We translate challenges into computational problems.”
(Rory Kelleher, NVIDIA)
CZI and NVIDIA are releasing scaled data-processing infrastructure to handle petabytes of biological data using GPU-accelerated tooling, including RAPIDS-singlecell. The collaboration accelerates development of CZI’s existing multimodal virtual-cell models, which include rBio, GREmLN and TranscriptFormer, while making NVIDIA Clara open models available on the Virtual Cells Platform for the first time: MONAI for biomedical imaging tasks such as cryo-electron tomography, and CodonFM, an RNA foundation model designed to help optimize mRNA therapeutics. The partnership also delivers cz-benchmarks, an open Python package co-developed by the two organizations that standardizes model evaluation so researchers can compare performance on common tasks without building custom test frameworks.
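To make the idea of standardized evaluation concrete, here is a minimal sketch of a shared benchmark harness. It is not the cz-benchmarks API: the `Task` class, the `run_benchmark` function, the toy data and both toy models are hypothetical names invented for illustration. The point it shows is that every model is scored on the same fixed inputs, labels and metric, so the results are directly comparable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical sketch: none of these names come from cz-benchmarks.
Model = Callable[[Sequence[float]], int]  # feature vector -> predicted label


@dataclass(frozen=True)
class Task:
    """A fixed evaluation task: shared inputs, shared labels, shared metric."""
    name: str
    inputs: List[Sequence[float]]
    labels: List[int]

    def score(self, model: Model) -> float:
        """Return the accuracy of `model` on this task's fixed data."""
        correct = sum(model(x) == y for x, y in zip(self.inputs, self.labels))
        return correct / len(self.labels)


def run_benchmark(
    models: Dict[str, Model], tasks: List[Task]
) -> Dict[str, Dict[str, float]]:
    """Score every model on every task with identical data and metric."""
    return {
        model_name: {task.name: task.score(model) for task in tasks}
        for model_name, model in models.items()
    }


# Made-up data standing in for a small cell-type classification task.
toy_task = Task(
    name="toy_cell_type",
    inputs=[(0.1, 0.9), (0.8, 0.2), (0.7, 0.6), (0.2, 0.3)],
    labels=[1, 0, 0, 1],
)


def threshold_model(x: Sequence[float]) -> int:
    """Toy model: predicts label 1 when the first feature is below 0.5."""
    return int(x[0] < 0.5)


def constant_model(x: Sequence[float]) -> int:
    """Toy baseline: always predicts label 0."""
    return 0


results = run_benchmark(
    {"threshold": threshold_model, "constant": constant_model},
    [toy_task],
)
print(results)
```

Because the task owns both the data and the metric, adding a new model means adding one entry to the `models` dictionary rather than writing a new test framework.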
For NVIDIA, the collaboration represents an infrastructure play rather than a healthcare product. “We are an accelerated computing and AI infrastructure company,” said Rory Kelleher, senior director of business development for healthcare and life sciences. “We work with leaders in their domains to learn the applications of AI and the bottlenecks around it. We translate those challenges into computational problems.”
Kelleher highlighted data-processing libraries that aim to organize biological data more cost-effectively, along with pathways to scale model training. NVIDIA’s framing throughout the announcement is that it supports CZI’s open community resources so that these tools can be used widely by scientists.
The scaling thesis. Kelleher argues that biology is beginning to benefit from three familiar scaling levers from large language models. Pre-training helps models learn basic relationships across datasets. Post-training is a reinforcement step that aligns models with expert intent. Test-time compute allows models to think longer at inference and arrive at better answers. “We believe these three scaling laws are starting to apply in biology,” Kelleher said. “We are building the tooling and infrastructure so researchers can take advantage of them.”
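The third lever, test-time compute, can be illustrated with best-of-N sampling, one simple way of spending more computation at inference. This is a toy sketch under stated assumptions, not a description of any CZI or NVIDIA model: `propose`, `score` and the hidden target value are hypothetical stand-ins for a generative model and a quality scorer.

```python
import random
from typing import Callable, Tuple


def best_of_n(
    propose: Callable[[random.Random], float],
    score: Callable[[float], float],
    n: int,
    seed: int = 0,
) -> Tuple[float, float]:
    """Draw n candidates and return the best one with its score.

    A larger n means more compute spent at inference time.
    """
    rng = random.Random(seed)
    candidates = [propose(rng) for _ in range(n)]
    best = max(candidates, key=score)
    return best, score(best)


# Hypothetical stand-ins: a "model" that guesses a number in [0, 1)
# and a scorer that rewards guesses close to a hidden target of 0.7.
def propose(rng: random.Random) -> float:
    return rng.random()


def score(x: float) -> float:
    return -abs(x - 0.7)


small_budget = best_of_n(propose, score, n=1)
large_budget = best_of_n(propose, score, n=64)
print("n=1: ", small_budget)
print("n=64:", large_budget)
```

With the same seed, the 64-candidate run includes the single candidate from the 1-candidate run, so its best score can only match or improve on it. That is the basic trade the lever describes: more inference compute for a better answer.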
Why now?

Patricia Brennan
The vision for virtual cells is laid out in a December 2024 Cell perspective, “How to build the virtual cell with artificial intelligence: Priorities and opportunities,” which outlines a roadmap for simulating cellular behavior at molecular and structural levels to understand disease mechanisms. Momentum has accelerated in recent months. “It has been super exciting to see the momentum in the last eight or nine months,” Brennan said. “I have heard a lot of people use the ImageNet moment analogy.”
That analogy points to the role of high-quality data at scale. “There is not really an internet of biology, so organizations are taking bold steps to generate it,” Kelleher said. “It is not just observational data. It is perturbational data so you can understand what happens when you knock out a gene or introduce a chemical.”
On the timing of the expanded alliance

Rory Kelleher
Leaders across the field describe a shift from one-off models and papers to integrated, reproducible platforms. CZI positions the Virtual Cells Platform as infrastructure that lowers the barrier for biologists and gives machine learning researchers a single destination for data, models and evaluation tools. “We are trying to move away from each development being a one-off so that science can build on science,” Brennan said. “The Virtual Cells Platform is a dissemination vehicle for the dataset, the tooling, the evaluation and the tutorial.” Additional background on the platform is available on CZI’s Virtual Cells page and the benchmarks section.
The announcement comes amid increasingly bold predictions from NVIDIA CEO Jensen Huang about AI’s expanding reach into biology. At a talk at the Stanford Graduate School of Business, Huang drew parallels between the current moment in biology and the chip design revolution of 40 years ago: “I would love for the world of biology to be at a point where it’s kind of like the world of chip design 40 years ago, computer-aided and designed, EDA—that entire industry really made possible for us today. And I believe we’re going to make possible for them tomorrow.” Huang explained that NVIDIA is now able to represent genes, proteins and even cells, getting “very, very close to be able to represent and understand the meaning of a cell, combination of a whole bunch of genes. What does a cell mean? It’s kind of like, what does a paragraph mean? If we could understand a cell like we understand a paragraph, imagine what we could do.”
In other recent public remarks, Huang has argued that the same AI architectures that understand language can learn the “meaning” of proteins and, eventually, cells. Describing an emerging future where researchers interact with cellular models through natural language, Huang said: “You can now talk to it like a chatbot. You can say, ‘Generate a cell with these properties,’ or you can ask a cell, ‘What are your properties? What can you bind to? What is your metabolism?’ Essentially, you can talk to a cell just like you can talk to a chatbot.”
Early applications of the technology show promise but await clinical validation. Industry collaborations, including Turbine’s ADC-focused Virtual Lab partnership with AstraZeneca and Ono, show that cell simulations can reproduce known biomarkers, such as SLFN11-linked resistance to the topoisomerase-I payload SN-38, and help triage drug development hypotheses. As of October 2025, prospective clinical validation of improved success rates remains limited, with most evidence coming from preclinical partnerships and case studies. Emerging open benchmarks, such as CZI’s cz-benchmarks package, aim to establish standardized evaluation frameworks that could help verify which virtual-cell approaches are the most promising.



