Not All Patients Are Equal in the Eyes of AI

As more physicians utilize chatbots for daily tasks like patient communication and insurance appeals, experts caution that these systems could perpetuate and worsen existing medical racism

A recent study published in Nature Medicine found that while large language models (LLMs) show potential in healthcare, they can generate medically unjustified clinical care recommendations influenced by a patient’s sociodemographic characteristics. Researchers evaluated nine LLMs across 1,000 emergency department cases, half real and half synthetic, each presented in 32 variations that changed only the patient’s sociodemographic identity (race, income, housing status, gender identity, and so on) while keeping the clinical details exactly the same.
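The design is essentially a counterfactual audit. The minimal sketch below (not the study’s actual code) shows how such variants can be generated: the clinical vignette stays fixed and only the sociodemographic labels change, so any spread in the model’s answers is attributable to the label alone. `query_model` is a hypothetical stand-in for whichever LLM is being audited, and the attribute values are illustrative.

```python
# Sketch of a counterfactual audit: one clinical vignette, many sociodemographic framings.
from itertools import product

CLINICAL_VIGNETTE = (
    "58-year-old presenting to the ED with 2 hours of substernal chest pain, "
    "BP 150/90, HR 102, troponin pending."
)

# Attributes varied across the 32 counterfactual versions of each case (illustrative values).
RACES = ["Black", "white", "Hispanic", "Asian"]
HOUSING = ["stably housed", "unhoused"]
INCOMES = ["high-income", "low-income"]
GENDER_IDENTITY = ["cisgender", "transgender"]

def build_variants(vignette):
    """Yield the same clinical case under every sociodemographic combination (4*2*2*2 = 32)."""
    for race, housing, income, gender in product(RACES, HOUSING, INCOMES, GENDER_IDENTITY):
        label = f"{income}, {housing}, {race}, {gender} patient"
        yield label, f"The patient is a {label}. {vignette} What is the recommended disposition?"

def query_model(prompt):
    # Placeholder: in a real audit this call would go to the LLM under test.
    return "admit for observation"

if __name__ == "__main__":
    recommendations = {label: query_model(prompt) for label, prompt in build_variants(CLINICAL_VIGNETTE)}
    # Because the clinical facts are identical, any difference in recommendations
    # reflects the sociodemographic framing rather than the medicine.
    print(f"{len(recommendations)} variants, {len(set(recommendations.values()))} distinct recommendations")
```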

The findings showed that these AI systems often provided different recommendations based on the patient’s perceived identity. For instance, cases labelled as Black, unhoused, or LGBTQIA+ were significantly more likely to be directed toward urgent care, invasive procedures, or mental health evaluations, even when those were not clinically warranted.

LGBTQIA+ individuals, in particular, were recommended mental health assessments six to seven times more often than clinically necessary. Meanwhile, high-income individuals tended to receive more favourable or less aggressive recommendations.

High-income patients were more likely to be recommended advanced diagnostic tests, such as CT scans or MRIs, while low-income patients were more frequently advised to undergo no further testing. This pattern mirrors healthcare disparities that exist in the real world.

Data-Driven Discrimination

These kinds of biases appeared in both proprietary and open-source AI models. AI-powered tools can summarise patient records in seconds, draft clinical notes, suggest potential diagnoses, and even help plan treatments. Platforms like Google’s Med-PaLM, OpenAI’s GPT, and Microsoft’s Nuance DAX are already being used to support doctors, reduce burnout, and speed up care delivery. But generative AI learns from data, and if that data contains human bias, the AI reflects and amplifies it. In clinical settings, this means marginalized groups might receive different recommendations simply because of their perceived identity.

As more physicians utilize chatbots for daily tasks like patient communication and insurance appeals, experts caution that these systems could perpetuate and worsen existing medical racism, compounding harms that have accumulated over generations.

In a report in Down to Earth, Girish N Nadkarni, chair of the Windreich Department of Artificial Intelligence and Human Health and director of the Hasso Plattner Institute for Digital Health at the Icahn School of Medicine at Mount Sinai, said, “Our team had observed that LLMs sometimes suggest different medical treatments based solely on race, gender or income level — not on clinical details.”

Good Intentions, Bad Prescriptions

The study also indicated that LLMs may reproduce biases already embedded in the medical system, such as the racially adjusted equations once used to estimate kidney function and lung capacity, which rested on erroneous and racist assumptions. A 2016 study found that medical students and residents held false beliefs about biological differences between races, which affected how they treated patients. With LLMs now entering medical settings, researchers are examining whether these models perpetuate debunked race-based medicine and racist tropes.
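For readers unfamiliar with the kidney-function example: the widely used 2009 CKD-EPI creatinine equation included a race coefficient that was removed only in the 2021 revision. The sketch below is an illustration of that published formula, not a clinical tool, and shows how the coefficient alone shifts the estimate for otherwise identical patients.

```python
# Simplified sketch of the 2009 CKD-EPI eGFR equation, which included a race coefficient
# (removed in the 2021 revision). At fixed eGFR thresholds, the multiplier mechanically
# raises the estimated kidney function of patients recorded as Black, which could delay
# specialist referral or transplant listing.

def egfr_ckd_epi_2009(scr_mg_dl, age, female, black):
    """Estimated GFR (mL/min/1.73 m^2) per the 2009 CKD-EPI creatinine equation."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    return (141
            * min(scr_mg_dl / kappa, 1) ** alpha
            * max(scr_mg_dl / kappa, 1) ** -1.209
            * 0.993 ** age
            * (1.018 if female else 1.0)
            * (1.159 if black else 1.0))  # the race coefficient at issue

# Same creatinine, age, and sex: the race term alone changes the estimate by roughly 16%.
print(round(egfr_ckd_epi_2009(1.4, 60, female=False, black=False)))  # ~54
print(round(egfr_ckd_epi_2009(1.4, 60, female=False, black=True)))   # ~63
```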

In an interview with AP, Stanford University’s Dr. Roxana Daneshjou expressed deep concern about the perpetuation of harmful tropes in medicine, warning that if they are not addressed, they could worsen real-world health disparities. She stressed the importance of removing these tropes from the medical field so that AI models do not regurgitate them.

When Bias Becomes a Diagnosis

A 2019 study published in Science found that a widely used algorithm developed by Optum underestimated the health needs of Black patients. The algorithm used healthcare costs as a proxy for health needs, incorrectly assuming that patients who spent less on healthcare were healthier. Because systemic inequalities mean Black patients typically incur lower healthcare spending than white patients with the same conditions, the algorithm unfairly recommended less care for Black patients.
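A minimal simulation, with entirely made-up numbers and not Optum’s actual model, illustrates the mechanism: when two groups have the same illness burden but one incurs lower costs because of access barriers, a system that ranks patients by spending flags far fewer members of that group for extra care.

```python
# Toy illustration of cost-as-proxy bias: equal illness, unequal spending, unequal "risk".
import random

random.seed(0)

def simulate_patient(group):
    illness = random.gauss(50, 10)                  # true health need, identical across groups
    access_penalty = 0.7 if group == "B" else 1.0   # group B spends less for the same illness
    cost = illness * 100 * access_penalty + random.gauss(0, 300)
    return illness, cost

patients = [("A", *simulate_patient("A")) for _ in range(5000)] + \
           [("B", *simulate_patient("B")) for _ in range(5000)]

# "Risk score" here is simply healthcare cost (the proxy); flag the top 10% for extra care.
threshold = sorted((cost for _, _, cost in patients), reverse=True)[len(patients) // 10]
flagged = [group for group, _, cost in patients if cost >= threshold]

# Prints a share far below 0.5, even though both groups were simulated with equal illness.
print("Share of flagged patients from group B:", round(flagged.count("B") / len(flagged), 2))
```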

IBM Watson Health’s AI held the promise of transforming cancer treatment through personalized treatment plans. However, Watson was found to frequently generate unsafe and incorrect treatment recommendations, especially when the system was trained on hypothetical cases rather than real patient data. It also struggled to generalize and exhibited bias when handling less common demographics or rare cancer types.

Epic, a prominent electronic health records (EHR) vendor, faced scrutiny after deploying a predictive model for pregnancy-related risks. The tool, trained predominantly on data from white populations, proved less accurate for Black and Hispanic women, potentially resulting in delayed interventions and adverse outcomes for these marginalized groups.

AI is shaking things up in healthcare, turning what used to be long, stressful processes into faster, smarter, and more personalized experiences. Getting an early warning about a health issue before even feeling sick? That’s what Google’s DeepMind is doing, spotting eye diseases in their earliest stages. Tools like IBM’s Watson help doctors make better treatment decisions, especially for complex illnesses like cancer. Startups like PathAI are making lab tests more accurate, and Aidoc is helping ER teams catch serious problems like brain bleeds in seconds, not hours. Even apps like Babylon are stepping in as friendly health buddies, offering virtual checkups right from a user’s phone.

AI isn’t replacing doctors, but it is working behind the scenes to make healthcare more human, helpful, and just plain better.

But it is not a magic fix, and it won’t solve healthcare’s deep-rooted problems unless models are built with greater accountability and stronger safeguards against bias.
