A Harvard-led study published in the journal Science found that OpenAI’s o1-preview model surpassed the diagnostic accuracy of emergency room doctors. Researchers tested the model on clinical reasoning tasks using real-world cases from a Boston hospital.
The AI achieved 67% accuracy in initial triage diagnoses across 76 actual ER cases. Human doctors recorded an accuracy rate between 50% and 55% in the same scenarios.
The model’s advantage was most pronounced in the early diagnostic stages, when limited information was available.
The study relied exclusively on text records, without the visual or auditory cues available to physicians. The authors emphasized that the technology is meant to augment, not replace, medical professionals.