Journal Club · 5 min read

AI in the Waiting Gap: What One Caregiver's Case Study Does and Doesn't Show

A family used a chatbot to make sense of an MRI report in the weeks before a surgeon could see them. The account is honest and human — and it is one self-reported case, with no comparator and no follow-up. Worth reading for the gap it names, not the method it proves.

Dr. Sven Jungmann

CEO

Sixty-five days. That is how long this family was told they would wait, in their metropolitan area, for a first appointment with a neurosurgeon. By then the MRI was already done, its report already sitting in the patient portal in the dense register radiologists write for one another. A man who weeks earlier had played a three-day golf tournament was now crawling up his own stairs. Report in hand, no clinician available to read it with them, the patient's wife stripped the identifiers out, pasted it into a chatbot, and asked four plain questions. The answer came back in seconds. That interval, and what one family did inside it, is the whole of a case study in the Journal of Participatory Medicine, and the smallness is the point.

The author is Mary Beth Schoening of RampUp Health, writing as the caregiver herself; her co-author, the physician Dustin Cotliar, supplies clinical commentary but had no part in the care. The paper is explicitly an opinion piece — not grant-supported, no conflicts declared, and no primary data of its own. It is one family's first-person account paired with a list of lessons the authors draw from it. Keeping those two registers apart is the first thing a careful reader owes the piece.

The gap nobody is designing for

Set aside the US wait times — 31 days on average across the largest metropolitan areas, 65 in this one, neither of which transfers cleanly to Germany — and a structural observation survives. Between the moment a result becomes visible to a patient and the moment a clinician can sit down and interpret it with them, there is an interval the system rarely owns. Portals have made results visible faster; they have done nothing about that interval, which now opens earlier and stretches longer at the patient's end. The authors note, fairly, that patients overwhelmingly want their results the moment they exist. The waiting starts there, in a language they cannot read, and no one has designed for what happens next.

Into that vacuum the family put four moves: translate the MRI report into lay terms, research the two surgical options on the table — a laminectomy or a spinal fusion — assemble a one-page summary from portal, radiology and personal notes, and generate a prioritised list of questions for the consultation. The de-identified one-pager went to six recipients: the primary care physician, two physiotherapists, the insurer, a hospital access nurse handling the scheduling, and the surgeon. Schoening reports it saved time and, in her judgement, brought the operation forward, in part because the surgeon received the family's unfiltered account of how fast things had deteriorated.

The one detail that teaches the most

Schoening also writes that she cried on and off for two days after the chatbot told her, before any clinician had, that her husband would probably need spinal surgery. This is the most instructive line in the paper, and it cuts both ways. The blow landed at the kitchen table rather than in the consulting room. Underneath sits a real clinical fact: people absorb very little of what is said in an appointment when the news lands as a shock. The authors cite a figure (neurosurgery patients recalling, on average, 24.8 percent of the information a day later), but it comes from a single study of forty-four patients, not from this case, and should be read as the secondary citation it is. The plausible reading is that processing bad news beforehand frees the appointment for the questions that genuinely need a surgeon. Plausible is as far as one family can take it; the case cannot rule out that the outcome reflects a motivated, digitally fluent household and ordinary chance.

Where the evidence stops

Everything that makes the account compelling also bounds it. One family, self-selected and self-reporting, with no comparator and no follow-up beyond their own telling. No one checked the chatbot's translation against the radiologist's intent in any structured way, so we cannot say it was accurate. We cannot attribute the earlier surgery to the summaries rather than to any of a dozen other factors. A single favourable anecdote, recounted by the person it favoured, is the weakest tier of clinical evidence — no criticism of the authors, who claim nothing more, but a brake on the inference a busy reader is tempted to make.

To their credit the authors are blunt about the failure modes. Hallucination first: fluent, confident output that is simply wrong, the more dangerous for sounding right. Their remedy is a set of validation habits — compare more than one tool, check the cited sources, cross-reference trusted health sites, ask someone who knows. Privacy second: consumer language models are not GDPR- or HIPAA-grade stores, so every identifier — including the treating clinicians' names — has to come out before anything goes in. The third lesson is the one that should unsettle anyone the story uplifts. The patients who would gain most from this kind of navigation — older, less digitally confident, not native speakers, unable to pay for a subscription — are the least equipped to do what this family did. A tool that most helps the already-capable can widen the very gap it appears to close.

So the case is a proof of concept for one resourceful family bridging the waiting gap by hand. It is not evidence that the gap should be bridged this way for everyone, and it is silent on the people for whom it would not work at all. The serious question it leaves open is not whether patients will reach for these tools in the interval — they already do — but who is accountable for the quality and the equity of what they find when no clinician is in the room.

Source: Schoening MB, Cotliar D. Patients and Caregivers Leveraging AI to Improve Their Health Care Journey: Case Study and Lessons Learned. J Particip Med 2026;18:e69790. A single self-reported caregiver case study with clinician commentary — an opinion paper with no comparator, no follow-up and no primary data, the lowest tier of clinical evidence — and the appraisal above weighs it as such.
