Journal Club · 5 min read

A Few Hundred Bad Records: What the Data-Poisoning Paper Actually Claims

An analytical synthesis argues that the success of a poisoning attack on a medical AI depends on the absolute number of tampered records, not their share of the dataset. The reasoning is sound and worth knowing. But it is a threat model, not a measured event, and that distinction is the point.

Dr. Sven Jungmann

CEO


Most of us carry a quiet assumption about large training sets: that size is a form of safety. A model fed ten million records, the intuition goes, must be harder to corrupt than one fed ten thousand — there is simply more good data to drown out the bad. A new analytical paper in the Journal of Medical Internet Research argues that this intuition is, for an attacker, almost irrelevant. What matters is not the share of the data that has been tampered with. It is the raw count. And the count that can do real damage is startlingly small.

The figure the authors keep returning to is 100 to 500 manipulated records. Synthesising prior security research, they report that an attacker who can slip that many poisoned samples into the data behind a medical AI can compromise it, with success rates above 60 percent in general and, for medical imaging models, between 70 and 95 percent. Crucially, that range barely shifts whether the surrounding dataset holds ten thousand records or ten million. A larger corpus is not a thicker wall.
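
To make the count-versus-share distinction concrete, the arithmetic can be sketched in a few lines of Python. The function name `poisoned_share_percent` is our own illustration, not something from the paper; the record counts and dataset sizes are the ones quoted above. The sketch only computes proportions and says nothing about whether an attack would succeed.

```python
def poisoned_share_percent(poisoned_records: int, dataset_size: int) -> float:
    """Return the poisoned records as a percentage of the whole dataset."""
    if dataset_size <= 0:
        raise ValueError("dataset_size must be positive")
    return 100.0 * poisoned_records / dataset_size


# Record counts and dataset sizes are the figures quoted in the article;
# the helper itself is a hypothetical illustration.
for dataset_size in (10_000, 10_000_000):
    for poisoned in (100, 500):
        share = poisoned_share_percent(poisoned, dataset_size)
        print(f"{poisoned} of {dataset_size:,} records = {share:.3f}%")
```

The same 100 to 500 records make up 1 to 5 percent of the smaller dataset and 0.001 to 0.005 percent of the larger one. The share falls a thousandfold while, on the paper's account, the reported success range barely moves.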

First, what kind of paper this is

The genre matters here, because it sets the ceiling on how much any single sentence can bear. This is not an experiment. Farhad Abtahi and colleagues at the Karolinska Institutet, with partners at the Universidad Politécnica de Madrid, attacked no clinical system. They reviewed 41 security studies published between 2019 and 2025 and assembled an analytical threat-modelling framework, illustrated with eight worked scenarios spanning imaging models, clinical language models, scheduling agents, federated learning, and organ-allocation systems. The empirical numbers are inherited from that prior literature — much of it on non-medical models — and the medical scenarios are constructed projections. The authors are unambiguous about this, in the text and in their tables.

The claim that travels

The absolute-number argument is the part worth carrying away, and it is mechanistically plausible. Across the architectures reviewed (convolutional networks for imaging, large language models, reinforcement-learning agents), the cited studies report that attack success tracks the number of poisoned examples, not their proportion. The cited figures are 100 to 500 samples for imaging models and as few as 100 examples to undo a language model's safety alignment. The reasoning is not exotic: a learning system can acquire a narrow, reliable association from a small but internally consistent set of examples, and a sea of benign data does not dilute a signal that is coherent in itself.

The paper's more original move is about visibility rather than feasibility. Privacy law (the General Data Protection Regulation in Europe and its US analogues) restricts the cross-institutional pooling of patient records. But that same pooling is what would let an auditor spot a subtle, distributed manipulation in the first place. The rules that protect patients can, as a side effect, blind the people who might otherwise catch a slow campaign. The authors put the resulting detection delay at six to twelve months, longer in federated or privacy-constrained settings. That number is a reasoned projection, not a measurement, but the tension it names, between auditability and confidentiality, is genuine and carefully argued.

Where the evidence stops

This is where a careful reader has to honour the line the authors themselves draw. The vivid cases are the ones that stay with you: a radiology model quietly missing cancers in one demographic after roughly 250 tampered images (0.025 percent of a million-image set), or an organ-allocation model drifting into systematic bias over years before the harm becomes statistically legible. Both are labelled, in the paper's own tables, as threat-modelling projections, not documented incidents. They exist to make a mechanism visible, and they do that job well. They are not evidence that such an attack has occurred, nor an estimate of how likely one is. A threat that is feasible in principle and a threat that is occurring are different objects, and this paper is scrupulous about not confusing the two.

The authors' own limitations are the honest ones. They ran no original attacks on production systems. Their literature was English-language only, with the selection bias that implies. The empirical studies they lean on examined models of up to 13 billion parameters, while clinical foundation models now reach 100 billion and beyond — so the extrapolation to the largest models, they concede, still needs empirical confirmation. And the defences they sketch — monitoring for disagreement across model ensembles, adversarial testing, logging that is auditable yet privacy-preserving — have not been validated in any prospective clinical setting. Cite this as a well-reasoned argument about a plausible risk, not as a measured rate of harm.

Why it matters here

For European systems the relevant detail is concrete. The European Health Data Space is built to connect health data across 27 member states — on the order of 450 million people — with federated learning among its intended mechanisms. A federated model trained across that space inherits the trust assumptions of every contributing node. In the authors' projection, control over the data feeds of three to five member states — eleven to nineteen percent of participants — could in principle shape a shared model while leaving each national dataset looking unremarkable. That is a design question worth raising while the architecture is still open, not a cause for alarm. The usable takeaway is narrow: when a clinical AI is evaluated, robustness against deliberately corrupted training data belongs on the list of questions — and the size of the dataset is not an answer to it.
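
The percentage range in that projection follows from simple division, sketched below in Python. The sketch assumes one federated node per member state, which is our simplification and not a detail confirmed by the paper; the names `EHDS_MEMBER_STATES` and `share_of_participants` are likewise hypothetical.

```python
EHDS_MEMBER_STATES = 27  # figure quoted in the article


def share_of_participants(controlled: int, total: int = EHDS_MEMBER_STATES) -> float:
    """Percentage of participating nodes an attacker would control.

    Assumes one node per member state (an illustrative simplification).
    """
    if not 0 <= controlled <= total:
        raise ValueError("controlled must be between 0 and total")
    return 100.0 * controlled / total


for controlled in (3, 5):
    share = share_of_participants(controlled)
    print(f"{controlled} of {EHDS_MEMBER_STATES} member states = {share:.1f}%")
```

Three of 27 is about 11.1 percent and five of 27 about 18.5 percent, which the article rounds to eleven to nineteen percent. In either case it is a minority of participants.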

Source: Abtahi F, Seoane F, Pau I, Vega-Barbas M. Data Poisoning Vulnerabilities Across Health Care Artificial Intelligence Architectures: Analytical Security Framework and Defense Strategies. Journal of Medical Internet Research 2026;28:e87969. An analytical threat-modelling synthesis of prior security research, funded by the SMAILE core facility at the Karolinska Institutet (no competing interests declared); it presents no original experiments, and its medical attack scenarios are explicitly hypothetical rather than documented incidents.
