The Psychological Trap of AI in Healthcare

Written by Abigail Hodder (Editor)

There’s no doubt about it: the advent of AI brings transformative potential for human society. And nowhere is this promise more profound than in healthcare — a globally exhausted sector suffocating under the weight of ageing populations and rising demands. 

From taking over administrative tasks, to speeding up diagnostics and helping treatment decisions, AI is expected to significantly lighten the load on crumbling clinical workforces. Indeed, from a UK perspective, AI technology is integral to the government’s 10-year plan to rebuild the NHS.  

Senior leaders around the world excitedly describe this era as the “fourth industrial revolution,” paralleling the infamous 18th-century period that kick-started modern life as we know it today.

But the industrial revolution was not without consequences. As with any change, a new way of living meant a lack of regulation as emerging cities struggled to keep pace. Alongside new-found prosperity, it brought chaos and disease to those in its midst — startled like deer in the bright headlights of industrialization.

Two hundred years on, a new wave is rising — this time digital — connecting us with knowledge and tools that once lay beyond our reach. And although AI is unlikely to bring the same challenges endured in the 18th century, this “wave” of potential drags with it a “sea” of open-ended questions and uncertainty about the future.

We Don’t Yet Know How AI Will Change the Brain

One such uncertainty lies in the impacts on the human brain. 

So far, the effects are largely speculative. Scientists are inferring long-term change from short-term studies, and from ones where AI is used in limited contexts. For example, a recent study from MIT, “Your Brain on ChatGPT,” found that when participants used AI to write essays, their neural activity decreased, brain connectivity weakened, and they struggled to recall what they had written just minutes earlier. The authors described this as a “cognitive debt.”

From a healthcare perspective, it’s not often that doctors find themselves having to write an undergraduate-level essay. Yet the study raises a broader concern about the far-reaching effects of AI. How, then, might these concerns translate to clinicians, the so-called “indispensable pillars of health and wellbeing” in society?

The Psychology of AI Is a Two-Way Street

Equally important are the psychological effects of AI — harder to measure than cognition, but no less consequential. And the relationship is reciprocal: our mindset shapes how we use AI, even as AI gradually reshapes that mindset in return.

Consider two hypothetical clinicians: one highly confident in their judgment, and another less so.

The confident doctor is less likely to see AI as a rival. They treat it as an aid, willing to challenge errors, override incorrect outputs, and accept corrections without their professional identity being undermined.

The less confident doctor, by contrast, may begin to defer excessively. Each correction chips away at their self-belief, reinforcing the sense that the machine is the stronger decision-maker. Even when the AI errs, they may hesitate to question it — allowing mistakes to slip through the cracks.

We already see evidence of this happening. In the literature, it’s referred to as “automation bias” — the tendency of humans to favour automated systems’ decisions over their own. And though the second, less confident doctor might be more vulnerable to it, doctors of all calibers are at risk. 

“There’s more and more evidence that automation bias is a problem,” says Professor Susan Shelmerdine, NHS Clinical Entrepreneur, Roentgen Professor and AI Advisor at the Royal College of Radiologists. “But how much of a problem it is, and how we mitigate it is still kind of an unknown.”

Overreliance Affects Doctors of All Levels 

A 2024 study illustrates the issue. Radiologists with different levels of experience (treated here as a proxy for confidence, which usually comes with experience) were given mammograms that had been pre-sorted into “suspicious” or “harmless” categories by an AI system.

When the radiologists were given mammograms that had been correctly labeled, they achieved ~80% accuracy, agreeing with the AI’s decisions in four out of five cases. 

But when presented with a mixture of correctly and incorrectly labeled mammograms, their scores dropped. Inexperienced radiologists’ accuracy plummeted from 79.7% to 19.8%. And while experienced radiologists were more resilient to the AI’s mistakes, they still fell into the automation bias “trap.” Even though they had been reading mammograms for over a decade, their accuracy fell from 82.3% to 45.5%, simply because they assumed the AI’s decision was correct.

According to Professor Shelmerdine, less-experienced doctors benefit the most from using AI. “In an unfamiliar context, the AI is better than them in the majority of cases,” she tells us. “But they’re not experienced enough to override it.” 

Shelmerdine argues that the “sweet spot” lies somewhere in the middle between lower-experienced and more-experienced clinicians: “Someone who is not so junior that they don’t know how to override it, but not so experienced that they think they know everything.” 

Why Does it Happen?

The Quantification Gap

Unsurprisingly, the reasons driving this behavior are complicated. 

One factor to consider is the “quantification gap” between man and machine. AI’s outputs are far easier to measure than those of a human doctor. Before deployment, systems are subjected to rounds of testing, where each decision is logged and benchmarked. Developers can then say, “our model is accurate X% of the time!” Here, “X” often reaches high into the 90s.

By contrast, while doctors’ performance is less reducible to neat metrics, they are tested more rigorously. They have to be. Medical school, foundation years, and specialty training can span more than a decade. Along the way, they face a broader scope of clinical, ethical, and interpersonal challenges that AI cannot yet handle. 

But despite the extensive training, no doctor wears a performance score on their sleeve. Your GP doesn’t start an appointment by saying, “I get diagnoses right 95% of the time.” And rightly so: aside from the absurdity, to do so could completely dehumanize medicine.

Still, the asymmetry remains. When an AI can point to a number and a doctor cannot, this “quantification gap” may tip the balance of trust. A doctor who disagrees with an AI’s decision might treat its shiny performance stats as a reason to sweep their gut feeling aside, even when the system is wrong.

Blindly Trusting Machines Is “Human Nature”

Beyond the clinic, this behavior seems more embedded in human nature than we might realise. In 2024, one research group gave participants a trust game, with pairs deciding to either cooperate or go against each other for financial gain. 

If they choose the same option, they both benefit. If one chooses a different option, only one player is rewarded. Over several rounds, players unconsciously pick out patterns in their partner’s decisions, helping them to predict their next move. It’s these learned patterns that feed into the widely recognizable “gut feeling,” something that, in medicine, can be an incredibly useful tool.

Then they were given advice. 

This advice, labeled as either “human-generated” or “AI-generated,” significantly interfered with the game. Participants were much more likely to follow the alleged “AI advice,” even when it directly contradicted how the game had been played so far. 

In essence, they were overriding their gut feeling in favor of a machine, with no evidence that the machine was giving good advice at all.

A different study from Harvard Business School also found that people “readily rely on algorithmic advice,” but not just when playing games; they heavily prefer machine over human judgment whether predicting geopolitical events, the popularity of songs, or romantic matches. The authors even noted a “willingness to choose the algorithmic advice over their own judgment,” just as those playing the trust game had done.

The phenomenon is described as “algorithmic appreciation”: an implicit preference for machine over human advice. It partly explains why automation bias happens.

When Knowledge Helps… And Sometimes Fails

It should be noted that this inherent “appreciation” can be altered. Giuseppe Romeo, co-author of a Springer Nature review on automation bias published earlier this year, says two main things drive the degree to which doctors rely on AI: AI literacy and domain-level expertise. More simply, how deeply they understand AI and how experienced they are in their field.

“These two aspects might be protective, when you have a high domain expertise, and also when you have enough of an AI background knowledge,” he says. “Knowledgeable doctors are more sensitive to first impressions: if a senior colleague spots an error the first time they’re using an AI, they are more likely to distrust it.” 

The study from Harvard Business School concurs: experts were much more resistant to algorithmic appreciation than “non-experts.” That is, those who make forecasts frequently as part of their job relied more on their own intuition than on the algorithm when predicting geopolitical events.

Doctors, or “medical experts” in this case, should therefore be more resistant to automation bias, and that resistance should harden with experience.

And, as the radiologist study found, doctors with more experience are indeed less prone to the issue. But expertise is not shatterproof: even senior doctors suffer significant drops in accuracy under the “AI influence.”

The Obvious Solutions Might Not Work

If this bias is seemingly in our DNA, how could it possibly be addressed? 

Explainability is positioned as a potential solution. 

Most artificially intelligent systems are built from neural networks: intricate structures loosely modeled on the brain. A single input can pass through many layers, potentially thousands, before a system reaches a decision. It’s these complex layers that form a “black box” around the AI’s inner workings, where the path to the answer is hidden from view. For doctors, this opacity is a recipe for overreliance, making it all too easy to trust the machine without question.

However, when an AI model is made to explain its “thinking,” doctors can understand how it has reached a decision. They can more easily spot if the system has made a mistake and course-correct. One Forbes article published earlier this year names explainability as “the key to maturing AI,” by helping to demystify computerized decision-making. 

But Giuseppe is not convinced.

“A doctor may use that explanation as a mental shortcut. It might give the system a perceived reliability and end up actually increasing reliance,” he says.   

The design of the explanation itself could further push doctors into this trap. 

“When the explanation is overly technical, the doctor needs to strain to understand. They might end up overtrusting it,” Giuseppe says. “On the other hand, when the explanation is too simplistic, that also doesn’t help with reasoning. The doctor is not informed enough to interrogate its decisions.”

Meanwhile, Professor Shelmerdine believes that “practice makes perfect” when it comes to appropriate trust in AI. It’s not enough to make artificial systems as transparent as possible; doctors also need to feel comfortable disagreeing with AI.

“One way to get confidence without years of experience,” she says, “is doing hours in a simulation lab, practising overriding it, so that it’s not this scary thing that you’re doing for the first time when you’re in front of a patient.” 

She also points to the role of accuracy ratings, the “shiny performance stats” discussed earlier. If users know a model is 95% accurate, then they also know they should expect to challenge it in roughly 5% of cases. Any more or less frequently than that, and red flags start to pop up.

“If you’re agreeing with a system 100% of the time, and it’s only supposed to be 96% accurate, then you know that you’re likely overrelying on it,” says Shelmerdine.

“And if we know you’re a doctor who’s disagreeing with it 10% of the time, we know that you’re sabotaging the AI in a way.”
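
Shelmerdine’s rule of thumb can be sketched as a simple check. The snippet below is a minimal illustration, not a validated audit tool: the function name `review_override_rate` and the 3-percentage-point tolerance are assumptions made for this example, and it takes the expected disagreement rate to be one minus the model’s stated accuracy, as the quote implies.

```python
def review_override_rate(cases_seen: int, overrides: int,
                         model_accuracy: float, tolerance: float = 0.03) -> str:
    """Compare a clinician's override rate with the model's stated error rate.

    `tolerance` is an illustrative margin, not a clinically validated threshold.
    """
    if cases_seen <= 0:
        raise ValueError("cases_seen must be positive")
    if not 0 <= overrides <= cases_seen:
        raise ValueError("overrides must be between 0 and cases_seen")

    expected_rate = 1.0 - model_accuracy    # share of cases the model should get wrong
    override_rate = overrides / cases_seen  # share of cases the clinician disagreed on

    if override_rate < expected_rate - tolerance:
        return "possible overreliance"
    if override_rate > expected_rate + tolerance:
        return "possible excess overriding"
    return "within expected range"


# The two scenarios from the quote, for a model stated to be 96% accurate.
print(review_override_rate(200, 0, 0.96))   # clinician agrees 100% of the time
print(review_override_rate(200, 20, 0.96))  # clinician disagrees 10% of the time
```

A real audit would also need enough cases for the rate to be meaningful, and would have to account for disagreements in which the clinician, not the model, is the one who is wrong.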

She also suggests planting “fake bombs,” as she puts it, to test doctors, much as secret shoppers check that things are done correctly without announcing that they are there.

“It depends on how serious the potential mistakes would be, and what the actual consequences are, as to how much effort we need to put in to address automation bias,” says Shelmerdine.

“Patient safety is always the priority, but it takes energy, time, and training to monitor and maintain accuracy. So the question is whether the risk from automation bias is significant enough to justify that extra investment.

“If the errors are minor or the risks minimal, and if we gain skills elsewhere, then the consequences of that bias may be less concerning. But if the impact could be serious, then of course that’s where we should focus our resources.”

Long-Term Risks: Will Doctors Lose Their Clinical Skills?

What could this overreliance mean, then, for doctors in the long run? Researchers voice concerns about deskilling as doctors increasingly rely on automated machines for the answers.

One paper explicitly states that this dependency could “impair physicians’ abilities to make independent, critical decisions.” 

It could even hinder the training of more junior physicians, preventing them from acquiring essential skills, like critical thinking and problem-solving abilities. 

If the AI is unavailable, doctors might feel increasing anxiety about having to revert to a more “primitive” form of medicine, even if that form was the norm for centuries.

You might argue, in fact, that AI could actually teach these essential skills. After all, if artificial systems are more accurate than doctors, at least on average, they might help trainees recognize patterns they frequently miss. In theory, AI could reinforce critical thinking rather than undermine it. 

But research suggests otherwise. One study found that, while radiologists became more accurate under the guidance of an AI system, their performance returned to normal when they stopped using it. Essentially, although the AI helped them to get more things right, they weren’t actually learning from it.

The authors of the paper note, instead, that AI “should not replace dedicated teaching.” 

Clearly, there’s no shortcut for teaching doctors to be, well, doctors, and any risk to the long-term quality of healthcare delivery has to be addressed properly.

Concluding Remarks

Any solution to the automation bias problem, for the time being at least, is merely hypothetical. A lasting fix will likely demand more than technical improvements; it may require rethinking medical training itself.

What is clear, however, is that AI is poised to reshape healthcare in ways that go far beyond diagnostic accuracy. This “fourth industrial revolution” is changing how doctors learn, how they practice, and ultimately how they define their own role in medicine. In time, these systems may not just transform patient care — they may reimagine the profession itself.