Auditable Outcomes: How We Measure Whether Wave Actually Works
Wave is a mental health platform built around coaching, measurement-based care, and integrated care navigation. This page explains how we measure clinical outcomes, what those outcomes show, and why we think auditable measurement is the right standard for digital mental health.
What auditable outcomes means at Wave
Most digital mental health platforms report engagement metrics: app opens, session completions, content views, retention curves. Engagement matters — without it, nothing else can happen. But engagement is not the same as clinical improvement, and high engagement is not evidence that a platform is helping people get better.
Auditable outcomes means something more specific: clinical change, measured with validated instruments, at regular intervals, on every member who uses the platform, with the methodology disclosed and the data reviewable.
That standard is harder to meet than reporting engagement. It requires measurement to be built into the model from the start, not bolted on later. It requires validated clinical instruments rather than proprietary "wellness scores." It requires the discipline to publish what the data actually shows, including limitations.
This is the standard Wave is built around.
What we measure
Clinical symptoms
Every Wave member is encouraged to complete the DASS-21 (Depression Anxiety Stress Scale–21) at intake and monthly throughout their engagement. The DASS-21 is a validated, peer-reviewed self-report instrument that measures depression, anxiety, and stress symptoms across a clinical severity continuum.
Where clinically indicated or based on partner requirements, we administer additional validated assessments alongside the DASS-21:
PHQ-9 (Patient Health Questionnaire-9) for depression-specific monitoring
GAD-7 (Generalized Anxiety Disorder-7) for anxiety-specific monitoring
PCL-5 (PTSD Checklist for DSM-5) for trauma-specific monitoring
Other population-appropriate validated measures as needed
These supplemental instruments do not replace the DASS-21; they support targeted clinical monitoring for members whose presentation or care pathway calls for them.
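For readers curious about the mechanics, DASS-21 scoring is simple: each of the three subscales sums seven 0–3 item responses, and the sum is doubled by convention so scores are comparable with the full-length DASS-42. The sketch below is illustrative only; the item-to-subscale key follows the published instrument, but the function names and data shape are ours, not Wave's internal implementation.

```python
# DASS-21 subscale scoring sketch (illustrative, not Wave's code).
# Item-to-subscale assignments follow the published DASS-21 key;
# each item is rated 0-3, and subscale sums are doubled so scores
# are comparable with the full DASS-42.
DEPRESSION_ITEMS = [3, 5, 10, 13, 16, 17, 21]
ANXIETY_ITEMS = [2, 4, 7, 9, 15, 19, 20]
STRESS_ITEMS = [1, 6, 8, 11, 12, 14, 18]

def subscale_score(responses, items):
    """responses: dict mapping item number (1-21) to a 0-3 rating."""
    return 2 * sum(responses[i] for i in items)

# Example: a member who rates every item 1 scores 14 on each subscale.
responses = {i: 1 for i in range(1, 22)}
depression = subscale_score(responses, DEPRESSION_ITEMS)
```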
Coaching relationship quality
Every member also completes the WAI-SR (Working Alliance Inventory–Short Revised) monthly. The WAI-SR is a validated measure of the working relationship between coach and member — a strong predictor of outcomes across decades of psychotherapy and coaching research. We measure alliance because we treat the coaching relationship itself as a measurable variable, not an assumption.
Functional outcomes
We collect self-reported data on functional impact — including reductions in days unproductive at work, days absent, and impact on daily functioning. These outcomes are directly relevant to plan partners and employers evaluating return on investment, and they complement the clinical picture of improvement with a view of real-world function.
Equity-relevant outcomes
We stratify outcomes by demographic and socioeconomic characteristics (race/ethnicity, income band, gender identity, age, geography) so we can examine whether the platform produces equitable improvement across populations. Unequal outcomes across subgroups are a real risk in digital mental health, and not measuring for them is not the same as not having them.
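Stratification of this kind is conceptually simple: compute the same improvement rate within each subgroup rather than only in aggregate. A minimal sketch, with hypothetical member records (the subgroup labels and data shape are illustrative, not Wave's actual schema):

```python
from collections import defaultdict

# Hypothetical member records as (subgroup label, improved?) pairs.
# Labels and values are illustrative, not Wave's actual schema.
records = [
    ("income_under_30k", True), ("income_under_30k", False),
    ("income_under_30k", True),
    ("income_30k_plus", True), ("income_30k_plus", True),
]

def improvement_rate_by_subgroup(records):
    """Share of members showing meaningful improvement, per subgroup."""
    counts = defaultdict(lambda: [0, 0])  # subgroup -> [improved, total]
    for subgroup, improved in records:
        counts[subgroup][1] += 1
        if improved:
            counts[subgroup][0] += 1
    return {g: improved / total for g, (improved, total) in counts.items()}

rates = improvement_rate_by_subgroup(records)
```

An aggregate rate over these records would be 80%, while the stratified view shows the subgroups differ — which is exactly the gap aggregate reporting can hide.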
What the data shows
Wave's outcomes data comes from two sources: peer-reviewed published research, and internal book-of-business analysis.
Peer-reviewed published research
Our outcomes research is published in JMIR Formative Research (Pickover & Adler, 2025). The study was a controlled engagement and outcomes analysis of Wave users between April 2023 and May 2024, comparing coaching users to app-only controls.
Key findings from the published study:
Statistically significant group-by-time interactions for depression (P=.04, medium effect size), anxiety (P=.003, large effect size), and stress (P=.03, medium effect size)
Coaching users with elevated baseline symptoms showed significantly greater symptom reduction than controls
Sample included clinically diverse users, with more than half presenting with severe or extremely severe depression, anxiety, or stress at baseline — a population typically excluded from coaching research
The study's stated limitations include sample size, non-random assignment, and the possibility of self-selection effects. The authors explicitly note that larger randomized clinical trials are essential to further establish effectiveness. We agree, and we think publishing limitations alongside findings is part of what auditable measurement requires.
The full citation: Pickover A, Adler S. Digital Mental Health Coaching in Clinically Diverse Populations: Controlled Engagement and Outcomes Study. JMIR Form Res 2025;9:e71346. https://doi.org/10.2196/71346
Internal book-of-business outcomes
In addition to the published research, we track outcomes continuously across our member population using the same validated instruments. From recent internal data:
Symptom improvement. 72% of engaged members experience clinically meaningful symptom improvement within eight weeks. The average time to meaningful improvement is 45 days — and that clock starts from day one of engagement, not from a first therapy appointment weeks away. Among members who do improve, 93% do so within six sessions or fewer.
Outcomes across demographic subgroups. We track improvement rates across race/ethnicity, income, gender identity, age, and geography to identify whether the platform produces equitable outcomes. Recent internal data shows clinically significant improvement among 65% of members identifying as racial or ethnic minorities and 54% of low-income members (under $30,000 annual household income). Underserved populations historically fare worse in mental health care. Building a platform that reaches them is part of what auditable measurement is for.
Functional outcomes. 71% of members report reductions in days unproductive or absent from work during their Wave engagement.
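A time-to-improvement metric like the one above can be derived directly from monthly assessment scores. The sketch below uses a 30% reduction from intake as the "clinically meaningful" threshold — that threshold, and the data shape, are illustrative assumptions, not Wave's published definition:

```python
# Illustrative threshold: a >= 30% drop from the intake score counts
# as clinically meaningful improvement. This is an assumption for the
# sketch, not Wave's published definition.
MEANINGFUL_REDUCTION = 0.30

def days_to_improvement(scores):
    """scores: list of (day, score) pairs sorted by day, where
    scores[0] is the intake assessment and day 0 is engagement start.
    Returns days until the first meaningful drop, or None."""
    intake_day, intake_score = scores[0]
    for day, score in scores[1:]:
        if intake_score and (intake_score - score) / intake_score >= MEANINGFUL_REDUCTION:
            return day - intake_day
    return None

# Hypothetical member: monthly DASS-21 depression subscale scores.
member = [(0, 20), (30, 16), (60, 12)]
elapsed = days_to_improvement(member)
```

Averaging this value across improved members, and counting the share who improve within 56 days, yields exactly the kind of book-of-business figures reported above.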
We're transparent about what these internal numbers are: real-world data from members who engage with the platform, analyzed against validated clinical instruments. They are not the same as a randomized clinical trial, and we describe them as what they are — internal book-of-business outcomes from operational measurement, not findings from the published study above. Together with the peer-reviewed research, they form the evidence base we share with plan partners and members.
Why engagement metrics aren't enough
Plan partners increasingly recognize that engagement metrics alone can mislead.
A platform reporting high engagement may be serving a population skewed toward people who needed intervention least. A platform reporting strong retention may be retaining members who feel attached to the product without showing clinical improvement. A platform reporting high content consumption may be measuring a wellness pastime rather than a structured behavior-change program.
None of these patterns are inherently problematic — they may reflect real value to members. But they don't answer the question that matters most for plan partners and employers: is this platform actually helping members get better?
That question requires clinical measurement. And clinical measurement, done well, is harder to sell than engagement numbers because it constrains the story you can tell. You can't claim improvement you don't have. You can't paper over uneven results with overall engagement averages. You have to publish what the data shows, including the parts that are still in development.
We think this is the right tradeoff. We also think the field is moving in this direction, and that platforms unwilling to measure clinically will increasingly be left out of contracts that require auditable outcomes.
How our measurement approach differs
When evaluating digital mental health platforms on outcomes measurement, the differences that matter are usually these:
Validated instruments vs. proprietary scores. Wave uses DASS-21, PHQ-9, GAD-7, PCL-5, WAI-SR — instruments with peer-reviewed validation. Many platforms use proprietary "wellness scores" that have not been validated against clinical benchmarks.
Continuous vs. episodic measurement. Wave measures monthly throughout engagement, not just at intake and graduation. Between formal assessments, behavioral and engagement signals from the in-app experience — content interaction, skill practice, asynchronous coach messaging, completion patterns — feed into the same data infrastructure, producing longitudinal data that reveals trajectories, not just before-and-after snapshots.
Published vs. unpublished research. Wave's outcomes are published in a peer-reviewed journal. Many platforms have not subjected their outcomes claims to scientific review.
Disclosed methodology vs. summary claims. Wave shares the instruments, the timing, the sample characteristics, and the limitations. Many platforms report headline percentages without methodology that allows independent evaluation.
Equity stratification vs. aggregate reporting. Wave can report outcomes by demographic and socioeconomic subgroups. Many platforms report only aggregate numbers that can mask uneven results across populations.
Why this matters for plan partners
Plan and employer partners contracting with digital mental health vendors are increasingly required — by their own quality standards, by regulators, or by their members — to demonstrate clinical value, not just engagement. Auditable outcomes infrastructure is what makes that demonstration possible.
When a plan partner contracts with Wave, the outcomes data that comes back is anchored in validated instruments, collected at known intervals, and reportable at the population level. We can stratify by acuity at intake, demographic group, presenting concern, and care pathway. We can support analyses of parity across subgroups. We can produce month-over-month or quarter-over-quarter reporting tailored to the partnership model.
For partners pursuing value-based contracts or building outcomes-linked programs, this measurement infrastructure is the foundation that makes those contracts possible.
Why this matters for members
For members, measurement-based care isn't a reporting burden. It's part of how the work happens. Your DASS-21 score every month gives you and your coach a shared view of how you're doing — beyond what either of you could perceive from the inside of a session. It surfaces patterns earlier than narrative alone would. It supports decisions about when something is working and when to try something different.
It also gives you something most healthcare doesn't: a way to see your own progress. Members who use Wave can see their own measurement trajectories and discuss them with their coaches as part of the care itself.
Wave's broader clinical model
This page focuses on outcomes measurement specifically. The rest of Wave's clinical model — the coaching practice itself, our approach to AI in mental health, and the transdiagnostic clinical orientation that shapes what coaches actually work on — is described in our other cornerstone posts:
Mental Health Coaching at Wave: Our Clinical Model, Explained
Transdiagnostic Mental Health: Why Wave Focuses on Mechanisms, Not Categories
For health plan and employer partners
Plan and employer partners evaluating Wave's outcomes data and measurement infrastructure can reach our partnerships team at partners@wavelife.io. We share full methodology on our measurement-based care framework, our published research, our internal outcomes analyses, and our population-level reporting capabilities.
For prospective members
Members and prospective members can learn more about what working with a Wave coach looks like through our Pathways library.
Wave is a mental health platform serving members through health plan and employer partnerships. Our outcomes research is published in JMIR Formative Research (Pickover & Adler, 2025).