Journal Club | 5 min read

What Actually Makes a Patient Trust a Medical AI

When patients decide whether to use an AI-enabled device, the most persuasive fact is not the accuracy figure or the privacy policy. A survey experiment measured what does move them — and the answer is humbling for anyone who builds the technology.

Dr. Sven Jungmann

CEO

Tell a patient that an AI-enabled device has been independently validated on an external dataset, and you have moved their trust by 2.76 percentage points — a rounding error. Tell the same patient that a regulator approved it, and trust jumps by 19.3 points. Those two figures come from the same experiment, asked of the same people, about the same fictional device. The thing a methodologist would put first barely registers; the thing a layperson can grasp does almost all the work. That inversion is the finding worth sitting with.

The study, published in January 2026 by teams at Mayo Clinic and the Yale School of Medicine, asked 340 US adults to weigh simulated one-page labels for a hypothetical AI-enabled cardiovascular device. It pairs a discrete choice experiment with a single-profile factorial experiment — participants compared labels against one another and rated them in isolation. No real product was involved, no real diagnosis was at stake, and that boundary defines everything the paper can and cannot claim. It measures which printed facts patients say move them, which is a question about communication, not about whether any device actually works.

The four signals that move people

Within that frame the results are tidy. Four information elements dominated trust: regulatory approval (+19.3 percentage points), high device performance (+16.6), provider oversight (+15.5), and the device's added value over usual care (+14.1). Acceptance — the willingness to actually use it — followed the same quartet, except provider oversight led there (+17.9). A second analysis scored by odds ratios reached the same conclusion. Against this, the technical assurances came off poorly: external validation barely moved trust at 2.76 points, an opt-in privacy default added a modest 9, and information about whether safety monitoring was proactive or reactive moved nothing measurable at all. About 72 percent of participants expressed at least some intention to use such a device, and the great majority found the labels easy to read and to understand.
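To make the size of that gap concrete, here is a minimal sketch that ranks the trust effects quoted above and compares the strongest with the weakest. The figures are copied from this article, not from the paper's tables; the opt-in privacy value is entered as 9.0 because the text gives only "9", and the names `trust_effects`, `ranked`, and `ratio` are illustrative. The ratio is a rough reading aid, not a statistic the study reports.

```python
# Trust effects in percentage points, as quoted in the article text above.
# Assumption: "a modest 9" is entered as 9.0; the article gives no decimals.
trust_effects = {
    "regulatory approval": 19.3,
    "high device performance": 16.6,
    "provider oversight": 15.5,
    "added value over usual care": 14.1,
    "opt-in privacy default": 9.0,
    "external validation": 2.76,
}


def ranked(effects):
    """Return (label, points) pairs sorted from largest to smallest effect."""
    return sorted(effects.items(), key=lambda item: item[1], reverse=True)


def ratio(effects, stronger, weaker):
    """How many times larger one reported effect is than another."""
    return effects[stronger] / effects[weaker]


for label, points in ranked(trust_effects):
    print(f"{label:<30} {points:+6.2f} pp")

approval_vs_validation = ratio(
    trust_effects, "regulatory approval", "external validation"
)
print(f"approval vs. external validation: {approval_vs_validation:.1f}x")
```

On these quoted figures, regulatory approval moves stated trust roughly seven times as far as external validation does.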

The logic is obvious once the numbers are on the table. Patients anchor trust to signals legible from where they stand: a human stays responsible, an authority has vouched for it, the thing demonstrably helps. The guarantees prized by the people who build and regulate these systems (how a model was validated, how data are protected) fade into the background. This is not patients getting it wrong. It is a clean reminder that trust and quality are separate quantities, and the label that earns the most trust is not necessarily attached to the device of the highest quality.

Persuasion is not calibration

Here is the line a careful reader must hold. Because every label described the identical fictional device, the study can only tell you which words raise confidence, never whether that confidence is deserved. Read carelessly, the result that approval is the strongest lever becomes an instruction to open every patient conversation with the approval stamp, including for products whose approval certifies little about real-world accuracy. The honest reading runs the other way: precisely because approval carries outsized weight, one is obliged to know what a given approval actually attests to before invoking it. A persuasive signal invoked without that check is just a more effective error.

The subgroup signals point the same way and need the same restraint. Patients aged 55 and older responded far more strongly to provider-oversight information than younger ones — a 23.4-point effect against 10.3 (P<.001) — which is striking, given that this is the cohort most likely to meet AI-enabled cardiac monitoring and most in need of hearing that a human stays in the loop. Women weighted performance and regulatory information more heavily; participants identifying as people of colour weighted a device's added value more. These are genuine hints that one script does not fit everyone. But they come from a single convenience sample recruited online, nearly all insured, none facing a real decision — hypotheses for tailored communication, not established facts about populations.

The funder worth naming

One detail belongs in any fair account. The work was funded entirely by the US Food and Drug Administration, an award of $712,431. A study that crowns regulatory approval the single strongest driver of patient trust, paid for by the regulator whose approval is at issue, presents a coincidence that should be stated plainly and left for the reader to weigh. The authors disclose it openly; it does not invalidate a stated-preference result, but a reader should see it before drawing conclusions.

What a clinician takes from it

For German and European clinicians the practical reading is direct and unglamorous. When you discuss an AI-enabled device with a patient, the signals that land are the human ones: that you remain responsible, that a competent authority has assessed it under the Medical Devices Regulation, that it adds something over the usual path. The accuracy metrics and data-protection clauses that fill procurement files matter less to the person across the desk — which simply moves the burden onto you to have checked them yourself, so that the trust those plainer signals generate is earned rather than merely produced. Knowing what reassures a patient is worth something only after you have made sure the reassurance is true.

Source: Zhu X, Stroud AM, Minteer SA, et al. Key Information Influencing Patient Decision-Making About AI in Health Care: Survey Experiment Study. J Med Internet Res 2026;28:e75615. Funded by the US Food and Drug Administration ($712,431). A single-sample, US-only stated-preference survey experiment using simulated labels for a hypothetical device: it measures what patients report as persuasive, not whether any real device performs as claimed.
