‘Hype and Magical Thinking’: The AI Healthcare Boom Is Here
Artificial intelligence tech has infiltrated every industry, and health care is no exception. We now have Together by Renee, an app that tracks your medical history, aims to gauge your blood pressure with a selfie, and claims to detect depression or anxiety symptoms from the sound of your voice. DrugGPT, developed at Oxford University, is a tool designed to help doctors prescribe medications and keep patients informed about what they’re taking. You can download Humanity, a generative AI “health coach” that promises to “reduce biological age,” and Google is working on a machine-learning model that could potentially diagnose a patient based on the sound of their cough.
But the potential consequences of these applications are somewhat different than what may happen when you use AI to create a song. To put it in the starkest terms: lives are at risk. And experts in the fields of health and technology tell Rolling Stone they have real doubts about whether these innovations can serve the public good.
For Bernard Robertson-Dunn, an experienced systems engineer who serves as chair of the health committee at the Australian Privacy Foundation, one major issue is that developers themselves have handled patient information all wrong from the very start. Decades ago, he says, there was a “big push” for digitizing medical records, but the promise of this revolution fell through because technologists think these data “are like financial transaction data.”
“They aren’t,” Robertson-Dunn says. “Financial transaction data are facts, and the meaning of an existing transaction does not change over time. If you look at your bank account, it will not have changed for no apparent reason.” Health data, meanwhile, “can change from day to day without you knowing it and why. You might catch Covid, HIV, a cold, or have a heart attack today, which invalidates a lot of your health record data as recorded yesterday,” says Robertson-Dunn. In his view, the old frenzy for electronic health records has carried over to the AI boom, “which is a far bigger problem.”
“I’m never going to say that technology is harmful or we shouldn’t use it,” says Julia Stoyanovich, the computer scientist who leads New York University’s Center for Responsible AI. “But in this particular case, I have to say that I’m skeptical, because what we’re seeing is that people are just rushing to use generative AI for all kinds of applications, simply because it’s out there, and it looks cool, and competitors are using it.” She sees the AI rush emerging from “hype and magical thinking, that people really want to believe there is something out there that is going to do the impossible.”
Stoyanovich and Robertson-Dunn both point out that AI health tools are currently evading the kinds of clinical trials and regulation that are necessary to bring a medical device to market. Stoyanovich describes a “loophole” that makes this possible. “It’s not really the tool that’s going to prescribe a medicine to you. It’s always a tool that a doctor uses. And ultimately the doctor is going to say, ‘Yes, I agree’ or ‘I disagree.’ And this is why these tools are escaping scrutiny that one would expect a tool to have in the medical domain.”
“But it’s problematic still, right?” Stoyanovich adds. “Because we know that humans — doctors are no exception — would rely on these tools too much. Because if a tool gives you an answer that seems precise, then a human is going to say, ‘Well, who am I to question it?’” Worse, she says, a bot could cite an article in a journal like Science or the Lancet to support its conclusion even if the research directly contradicts it.
Elaine O. Nsoesie, a data scientist at the Boston University School of Public Health who researches how tech can advance health equity, explains what a diagnostic AI model could be missing when it assesses a patient’s symptoms. These tools “basically learn all this information, and then they give it back to you, and it lacks context and it lacks nuance,” she says. “If a patient comes in, they might have specific symptoms, and maybe they have a history of different conditions, and the doctor might be able to provide medical advice that might not be standard, or what the data that has been used to train our algorithm would produce.”
According to Nsoesie, artificial intelligence can also replicate or exacerbate the systemic health inequities that adversely affect women, people of color, LGBTQ patients, and other disadvantaged groups. “When you see algorithms not doing what they’re supposed to do, the problem usually starts with the data,” she says. “When you look at the data, you start to see that either certain groups are not being represented, or not represented in a way that is equitable. So there are biases, maybe stereotypes attached to [the models], there’s racism or sexism.” She has co-authored a paper on the topic: “In medicine, how do we machine learn anything real?” It outlines how a “long history of discrimination” in health care spaces has produced biased data, which, if used in “naive applications,” can create malfunctioning systems.
Still, Nsoesie and others are cautiously optimistic that AI can benefit public health — just maybe not in the ways that companies are pursuing at the moment. “When it comes to using various forms of AI for direct patient care, the details of implementation will matter a lot,” says Nate Sharadin, a fellow at the Center for AI Safety. “It’s easy to imagine doctors using various AI tools in a way that frees up their time to spend it with their patients face-to-face. Transcription comes to mind, but so do medical records summarization and initial intake. Doctors have been indicating that their inability to spend meaningful time with their patients is a problem for decades, and it’s leading to burnout across the profession, exacerbated, of course, by Covid-19.”
Sharadin sees the potential risks as well, however, including “private for-profit long-term care facilities cutting corners on staff by attempting to automate things with AI,” or “charlatans selling the AI-equivalent of useless supplements.” He identifies the Together app as one example. “There’s absolutely no way they are accurately detecting SpO2 [blood oxygen levels] with a selfie,” he says. “I’m sure that they and other businesses will be careful to indicate that their products are not intended to diagnose or treat any disease. This is the typical FDA-compliant label for selling something that people don’t actually need that doesn’t actually work.”
Stoyanovich agrees with Sharadin that we need to think hard about what exactly we want from this technology, or “what gap we’re hoping these tools will fill” in the field of medicine. “These are not games. This is people’s health and people’s trust in the medical system.” A major vulnerability on that end is the privacy of your health data. Whether or not AI models like Google’s cough-analyzing tool can work reliably, Stoyanovich says, they are “sucking in a lot of information from us,” and medical data is especially sensitive. She imagines a future in which health insurance companies systematically raise premiums for customers based on information captured by such apps. “They’re going to be using this data to make decisions that will then impact people’s access to medical care,” Stoyanovich predicts, comparing the situation to the “irresponsible” and “arbitrary” use of AI in hiring and employment. “It ends up disadvantaging people who have been disadvantaged historically.”
Stoyanovich worries, too, that exaggerating the effectiveness of AI models in a clinical setting because of a few promising results will lead us down a dangerous path. “We have seen a lot of excitement from specific cases being reported that, let’s say, ChatGPT was able to diagnose a condition that several doctors missed and were unable to diagnose right,” Stoyanovich says. “And that makes it so that we now believe that ChatGPT is a doctor. But when we judge whether somebody is a good doctor, we don’t look at how many cases they’ve had right. We look at how many cases they got wrong. We should at the very least be holding these machines to a similar standard, but being impressed with a doctor who diagnosed one specific difficult case, that’s silly. We actually need to have robust evaluation that works in every case.”
The tech and health experts who spoke to Rolling Stone largely concur that having medical professionals double-check the output of AI models adds a layer of tedium and inefficiency to health care. Robertson-Dunn says that in the case of diagnostic tests — like the reading of X-rays or MRI scans — “a qualified medic can assess the diagnosis of each one, but that turns the job of a highly skilled practitioner into a very boring, soul-destroying, mechanical routine.”
And, as Nsoesie observes, perhaps we can reframe the opportunity AI poses in health care entirely. Instead of trying to measure the biological qualities of individuals with machines, we might deploy those models to learn something about entire regions and communities. Nsoesie says that the AI movement in Africa has come up with promising solutions, including models that monitor the air pollution affecting people’s health. “Being able to collect that data and process it and make use of it for policymaking is quite important,” she says.
When it comes to public health, Nsoesie says, the focus should be on “addressing the root causes of illnesses and health inequities, rather than just fixing the symptoms of it.” It would be better, in her view, to leverage AI tech to answer questions of why we have “particular populations with higher rates of diabetes or cancer” instead of designing an app that targets people with those conditions. The ideal solutions, she adds, require talking to patients and the clinicians serving them to find out what they really need, and letting their input guide the development process. App developers, Nsoesie says, are typically not doing that research or soliciting that feedback.
“That’s just more effective,” she concludes. “But it requires that you actually prioritize people rather than money.”