Uncovering nature’s secrets is no easy task. The daily life of a scientist is often grueling, frustrating, and—perhaps surprisingly—boring as they repeat experiments over and over.
Here’s where AI could lend a hand. This week, two studies offer a glimpse into a future where AI and scientists bounce ideas off each other and collaborate on projects to benefit humanity.
Both systems use large language models to support scientific discovery from end to end. They read through existing literature, generate hypotheses, suggest relevant experiments, and analyze and interpret the data for scientists to evaluate. The researchers then give the AI feedback, and the cycle begins again.
One of the systems, called Robin, was instructed to find drugs for a common eye condition. Developed by FutureHouse, a non-profit that builds AI systems to automate research in biology and other scientific fields, Robin quickly homed in on candidates. According to the team, the AI slashed research time 200-fold compared to scientists working alone.
The other system is Google DeepMind’s Co-Scientist. With human guidance, Co-Scientist took just hours to find already-approved drugs that could be repurposed for a type of leukemia. It also surfaced promising targets for liver scarring. The system wasn’t tested in-house; it was distributed to other teams to integrate into their particular fields and workflows.
AI companies are racing to design agents that automate scientific discovery. But both teams stress their systems are collaborators, not replacements. Scientists crafted each project’s vision, checked the agent’s output, and guided its work, like a professor tutoring a bright student.
“These projects represent a significant step forwards,” wrote the editorial team at Nature, where both studies were published. “But for all the ‘wow’ factor, it is crucial to bear in mind that the AI systems were not working alone.”
Nobelist Pursuit
Scientists have a complex relationship with AI.
Nobel Prize-winning protein-prediction models have helped researchers make progress on previously undruggable targets, especially in complex diseases like cancer. Scientists are increasingly asking chatbots for help with coding, writing articles, and even generating new ideas.
But the problem of AI slop in science is worsening: The bots are polluting scientific literature. Tens of thousands of articles in 2025 contained faulty references hallucinated by AI. Some scientists are uncomfortable with AI’s notoriously hefty energy consumption and worry that over-reliance could erode cognition, judgment, and creativity. In a phenomenon dubbed “illusions of understanding,” AI solutions make us overestimate what we know.
Love or hate it, AI’s impact on research is growing. In the past few years, multi-agent systems, some with sophisticated reasoning abilities, have begun to break complex problems into solvable chunks and “self-reflect” on their output.
Robin and Co-Scientist showcase this power in a cornerstone of scientific discovery: Suggesting novel, rigorous, and testable ideas when faced with real-world problems such as drug discovery.
Flurry of Ideas
Both systems use large language models to create AI agents that work semi-independently on different parts of a problem.
FutureHouse’s Robin, for example, was tasked with finding a treatment for dry age-related macular degeneration, an eye disorder that’s a common cause of blindness. The agents scoured troves of scientific literature, including hundreds of thousands of open-access papers, patents, and clinical trial data.
Rather than inventing a drug from scratch, the team asked Robin to repurpose existing drugs, a common strategy for speeding treatments to patients, and one particularly well suited to AI.
Robin can “consider tens of thousands of biological mechanisms…that could address the underlying cause of that disease,” study author Sam Rodriques, founder and CEO of FutureHouse, told Nature.
Armed with that knowledge, Robin took the role of research lead and recruited other AI agents to design lab experiments around potential drug candidates. In what the team called a “tournament of ideas,” the agents debated hypotheses, weighed evidence from previous studies, and selected the best for testing. The system then suggested experiments for validation.
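For readers curious what a “tournament of ideas” might look like in practice, here is a minimal sketch in Python. It assumes the ranking is done by pairwise matchups scored with Elo ratings, which is one common approach and not necessarily what either team built. The function names, the toy judge, and the evidence counts are all invented for illustration; in a real system the judge would be a language model or a human reviewer.

```python
from itertools import combinations


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def run_tournament(hypotheses, judge, k=32.0, start=1000.0):
    """Rank hypotheses by round-robin pairwise comparison.

    `judge(a, b)` returns True when hypothesis `a` wins the matchup.
    It is a hypothetical stand-in for an LLM or human reviewer.
    """
    ratings = {h: start for h in hypotheses}
    for a, b in combinations(hypotheses, 2):
        exp_a = expected_score(ratings[a], ratings[b])
        score_a = 1.0 if judge(a, b) else 0.0
        # The winner gains exactly what the loser gives up.
        ratings[a] += k * (score_a - exp_a)
        ratings[b] += k * ((1.0 - score_a) - (1.0 - exp_a))
    return sorted(ratings, key=ratings.get, reverse=True)


# Toy evidence counts, invented purely for illustration.
EVIDENCE = {"drug_a": 3, "drug_b": 7, "drug_c": 5}


def toy_judge(a, b):
    """Placeholder judge: the candidate with more supporting evidence wins."""
    return EVIDENCE[a] > EVIDENCE[b]


ranking = run_tournament(list(EVIDENCE), toy_judge)
print(ranking)
```

With this toy judge, the candidate with the most supporting evidence ends up at the top of the list. Swapping in a different judge changes who wins each matchup while the ranking logic stays the same.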
Human scientists took over from there. They ran the suggested experiments and fed the results into another AI agent specializing in data analysis. After several iterations, Robin flagged ripasudil—a drug approved for glaucoma—as a promising candidate. The drug acts on immune cells, instead of eye cells, and hadn’t been explored for the condition. Early cell experiments were promising.
Co-Scientist works similarly but also incorporates DeepMind’s earlier experience building game-playing AI models. Faced with a scientific challenge, its agents have time to evolve hypotheses, test their reasoning, and rank ideas by plausibility and novelty.
DeepMind first released the AI in early 2025 to a small group of researchers. It’s been used by independent teams studying liver scarring, neurodegenerative diseases, and aging.
At Stanford University, for example, Gary Peltz used the system to find three promising drugs for chronic liver disease. Two worked well in the lab. One, to his surprise, was already FDA-approved for another disease. “When I saw that it was really quite striking. I kind of fell off my chair,” he said.
Beyond drug discovery, Co-Scientist has also worked on decades-old biological mysteries, like why many bacterial species share the same cluster of genes to resist antibacterial drugs. Scientists wrestled with the problem for years before landing on an answer; the AI system reached the same conclusion in days.
Inspiration Galore
To be clear, none of the AI-suggested drug candidates have been fully vetted. Even therapies that look promising in early cell experiments often fail once tested in the body.
Still, there’s little doubt that AI is already inspiring eureka moments.
One early Co-Scientist user, Clare Bryant, who studies infectious disease at the University of Cambridge, was surprised when the system flagged a protein she’d missed. The protein intersected with biological processes she was already investigating to fight pathogens. “I spent the rest of the week itching to get back to the lab” to test the theory, she said.
Both teams took care to limit AI hallucination, where systems confidently present false or misleading information. Co-Scientist, for example, includes an internal “review board” that tests hypotheses against existing evidence to keep them grounded in reality. Meanwhile, Robin uses a built-in brake that restricts it to established knowledge and limits irrational leaps in logic.
The AI systems are already over a year old, and the field moves fast. Newer systems, such as Edison’s Kosmos, target the entire drug development pipeline. Yet even as the tools grow more sophisticated, researchers continue to stress that human oversight is essential.
“Human messiness, curiosity, and playfulness have fueled countless discoveries, and helped to inform society’s ethical frameworks,” wrote Nature’s editorial team. “AI systems might offer greater efficiency in some instances, but we don’t yet know whether greater efficiency equates to greater insight.”

