Simulation is the Lab: Building Safe, Measurable AI for Mental Health Care

Oct 31

Christina Farr’s recent Second Opinion piece compared patient-facing AI to self-driving cars and said healthcare should “follow Waymo’s lead.” It’s a good analogy, and in mental health care we can make it even more concrete by looking at how self-driving development worked.

Self-driving software advanced by logging vast numbers of hours in simulation, where it could be refined and tested before cars ever touched real roads. It also had a shared definition of quality: traffic laws. Those two pieces made safety measurable and progress possible.

If AI is going to earn trust in mental health, it needs the same foundations: simulation to test safely at scale, and open, public, agreed-upon benchmarks for what “good care” means, just as traffic laws define good driving. Together, those give us the tools to measure progress and make safety real.

We’re building that layer now. But this can’t be done in isolation. Self-driving wouldn’t have worked if every company made up its own traffic laws, and the same is true here. At Wave, we want to help build shared benchmarks and realistic simulated patient environments so teams across mental health AI can test, learn, and iterate quickly and safely.

That’s the foundation we need before real-world deployment. In self-driving, teams started with millions of simulated miles to refine software with zero risk. Then they ran closed-course tests under human supervision. Next came very limited public routes, expanding the radius as confidence grew. The same layers apply here. Full autonomy, where AI runs a clinical conversation end-to-end, still belongs in the simulator.

We’re not in the self-driving era yet. We’re in adaptive cruise control. Models can help with steering inputs like note writing, data search, or summarizing risk factors, but the human has to stay in the loop. Each step forward should teach us something new about context, safety, and failure modes.

There are plenty of open questions here, and no analogy maps perfectly. We’re working through them in public because this feels too important to build entirely behind closed doors.