Source: The New England Journal of Medicine
In this episode of “Intention to Treat,” Maia Hightower and Isaac Kohane join host Rachel Gotbaum to explore the promise and hazards of artificial-intelligence and machine-learning tools for both clinical and administrative uses in medicine.
Rachel Gotbaum: Welcome to “Intention to Treat,” from the New England Journal of Medicine. I’m Rachel Gotbaum.
Timothy Poterucha: My name is Tim Poterucha. I’m a cardiologist and clinical researcher at Columbia University. I codirect a research laboratory here at Columbia focused on artificial-intelligence applications in cardiology, where we study how we can use AI technologies to do a better job of taking care of patients with heart disease. The goal of this project is to decide who most needs an echocardiogram. The problem is that echocardiography, these ultrasounds of the heart, can be expensive, and that expense reduces their availability. So our question was: can we use one of these less expensive technologies to predict the results of an echocardiogram?
So we chose to focus on the 12-lead ECG, which is very inexpensive, about $15 per test, and can be performed in any doctor’s office or any emergency department; it’s available everywhere. The idea is that you could have a technology that analyzes this really inexpensive test, and then you can quickly know who is best targeted by the more expensive, more accurate test. So we took all the patients at Columbia University who had had both of these tests, the ECG and the echocardiogram, and we trained an artificial-intelligence model, based on the ECG, to diagnose whether someone had a heart-valve problem.
This isn’t something that I myself, as a cardiologist, can do. I can’t look at an ECG and decide whether someone has a heart-valve problem; I need the more expensive test for that. So we wanted to see whether this artificial-intelligence model could do it, and it actually can do it very well. It can pick out the patients who are most likely to have one of three heart-valve diseases, and then we go out, find those patients, and have them undergo the echocardiogram. The initial results are quite promising: we really do seem to be finding patients who have significant heart disease and deserve to be identified. So it works. This is a chance for us to go head-on against the disparities that we know are widespread in medicine, and to develop new programs and new systems that target the patients most likely not to get care and connect them to the best possible health care that we can get them.
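The pipeline Dr. Poterucha describes, training on patients who had both tests, with the echocardiogram supplying the label, then ranking everyone else by predicted risk to decide who gets referred for an echo, can be sketched in miniature. This is a toy illustration with invented synthetic data and a simple logistic-regression stand-in; the actual Columbia model is a deep network trained on raw 12-lead waveforms, and none of the names here come from that work.

```python
# Toy sketch of the ECG-based screening idea: train on patients with BOTH
# tests (echo result = label), then rank patients by predicted risk to
# decide who to refer for the expensive test. All data here are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

n_patients, n_leads, n_samples = 200, 12, 50
ecgs = rng.normal(size=(n_patients, n_leads, n_samples))
labels = rng.integers(0, 2, size=n_patients)  # 1 = valve disease on echo

# Nudge the "diseased" ECGs so the toy model has a signal to learn.
ecgs[labels == 1] += 0.3

X = ecgs.reshape(n_patients, -1)              # flatten leads x time into features
model = LogisticRegression(max_iter=1000).fit(X, labels)

risk = model.predict_proba(X)[:, 1]           # predicted probability of valve disease
refer_for_echo = np.argsort(risk)[::-1][:20]  # refer the 20 highest-risk patients
```

The key design point is the last line: the model never replaces the echocardiogram; it only prioritizes who should receive one.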
Rachel Gotbaum: This is “Intention to Treat,” from the New England Journal of Medicine. I’m Rachel Gotbaum. Artificial intelligence holds the promise of revolutionizing how we deliver health care. Today we’re going to explore how artificial intelligence, machine learning, is being used in medicine. How is AI helping our doctors, and how might it be hurting our patients? I’m joined by Dr. Isaac Kohane. He’s chair of the Department of Biomedical Informatics at Harvard Medical School, and he’s also coeditor of the “AI in Medicine” series in NEJM. And also Dr. Maia Hightower. She’s chief digital and technology officer at University of Chicago Medicine and the CEO of Equality AI. So Dr. Hightower, I’d like to start with you. Where is AI being used in medicine?
Maia Hightower: Right now, AI is being used in medicine in two main categories. First is operational/administrative, and the second is in clinical use. I would say there’s far higher adoption in operational and administrative tasks than currently in clinical AI.
Rachel Gotbaum: So can you give us some examples of how AI is being used on the administrative side of medicine?
Maia Hightower: Yeah. So in order to communicate with payers, with our insurance companies, we’ll often have bots or automation that transfers information from the health system to the insurance company and back. In the case of insurance companies, we know that they often will use AI algorithms for prior authorization of procedures, whether or not to cover a particular medication or test. And in those cases, there isn’t much transparency on our side as a provider organization.
Rachel Gotbaum: OK, so that’s some examples of AI on the administrative side. Dr. Kohane, tell us about AI on the clinical side.
Isaac Kohane: There are two kinds of applications in which AI seems to me a particularly exciting way to help our patients. The first comes from understanding that we as humans do some things very well, but we’re not very good at being alert around the clock and being fastidious about knowing everything about our patients: all the details, all the adverse events, all the allergies, all their history. There are so many such details that if we can have AI be part of the decision-making loop, so that things we forgot about are in fact not forgotten, then for the routine care of patients, having AI as an assistant is going to play to the strength of the doctor as an intellectual, compassionate provider of care, and this will be a significant boost in quality of care. So the application of large language models to clinical care is something that I fully anticipate is going to happen in the next year, year and a half.
There are so many companies that are moving forward on capturing interactions between patients and doctors, verbal interactions, and turning that into grist for decision making. On the other extreme, there are areas of medicine where human beings are just not well-equipped to serve independently. We’ve seen, for example, in undiagnosed patients, an important interaction between the ability of AI algorithms to sort through large patterns linking genetic variants and clinical findings and the expertise of doctors, which together arrive at diagnoses for long-suffering patients we previously could not diagnose. So it’s the combination of that superhuman capacity to know a lot about what is, for most doctors, a relatively arcane area, genomics, with what doctors do know well, the clinical manifestations of disease, that lets us serve patients we otherwise could not serve well.
Rachel Gotbaum: And what does that look like?
Isaac Kohane: So in the undiagnosed disease network that I’m part of, we do in fact routinely use machine-learning models, AI models, to comb through all the associations that have been found between genetic variants, mutations in the genome, and various syndromes, and try to match them to patient histories. We do that working with doctors who are also expert in particular areas of medicine, and we’ve found that combination has allowed us to diagnose on the order of 30% of the patients referred to us across 12 academic health centers throughout the United States, patients who had previously been undiagnosed for years despite seeing many doctors. So that’s where it works.
Rachel Gotbaum: So Dr. Hightower, I want to go back to a concern you mentioned when discussing the administrative uses of AI. You said there isn’t really any transparency in the way insurance companies use AI to decide whether to cover a procedure or a test or a medication. So talk to me a little bit about the consequences of that lack of transparency.
Maia Hightower: Our payer partners, such as Optum and Blue Cross Blue Shield, have huge resources dedicated to AI and to developing algorithms, but we don’t have transparency into how they’re using them. Patients are the recipients of unknown algorithms that may be affecting their ability to pay for their health care.
Rachel Gotbaum: And yet there are some really positive aspects of using AI in the administrative process, right?
Maia Hightower: Think about some of the algorithms we’ve deployed. Say, for example, there is an algorithm that predicts no-shows: what is the likelihood that a patient won’t show up for their appointment? It’s a very neutral algorithm, in that the health system can decide what to do with it. Are they going to take that information and double-book, in other words, slot in another patient when there’s a high risk of a no-show? Or is the health system going to see it as an opportunity to reach out to that patient with added resources, identify what’s preventing them from coming to their appointment (is it transportation? convenience? hours of operation?), and provide that additional support? I think when health systems do the latter and use that information to better support patients, that’s actually a great outcome for both the patient and the health care system.
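Dr. Hightower’s point is that the model’s output is neutral and the policy layered on top of it is the choice that matters. A minimal sketch of that separation, with invented thresholds and action names purely for illustration:

```python
# The predictive model emits a probability; the health system chooses the
# policy. Thresholds and action strings here are hypothetical.
def no_show_policy(p_no_show: float, outreach_first: bool = True) -> str:
    """Map a predicted no-show probability to an operational action."""
    if p_no_show < 0.3:
        return "no action"
    if outreach_first:
        # Supportive policy: transportation help, reminders, flexible hours.
        return "contact patient with added support"
    # Alternative policy: treat the likely-empty slot as schedulable capacity.
    return "double-book the slot"
```

The same score drives opposite patient experiences depending on a single flag, which is exactly why the deployment decision, not the algorithm, carries the ethical weight.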
Rachel Gotbaum: So what are some other ways that you are seeing AI being used or that you would like to see it being used that excites you, Dr. Hightower?
Maia Hightower: I like the democratization of diabetic retinopathy screening. There is an AI tool you can deploy in the primary care office that helps screen diabetic patients for diabetic retinopathy. If they’re negative, they get the all-clear; if they’re positive, they’re referred to an ophthalmologist. It helps extend diabetic retinopathy screening from high-cost ophthalmology care to the primary care setting.
Through voice recognition, physicians are able to get through their clinical documentation much more quickly, using a tool that essentially turns our speech into a complete note. Where before it was just a transcription, now it’s this beautiful note based on a casual conversation with a patient, allowing doctors to return to the art of medicine and connect with their patients instead of typing on a keyboard.
Isaac Kohane: I’d like to add one other area that is happening already today and in which I personally am very excited about the application of AI to medicine. It’s an area that strikes fear and concern, appropriately, among some, but I think it’s going to be a net benefit: the use of these tools directly by patients. It’s no secret that many patients have used search in the past, to the point that some people call it Dr. Google, because they find things. But it’s in fact very hard to find enough detail about a disease; you have to read papers and abstracts to really understand what’s going on. Now, with large language models like ChatGPT, you see patients going to them and asking the questions they’ve wanted to ask their doctors, because, as is commonly the case, you forget things when you go to the doctor, perhaps because you’re stressed, and because, unfortunately, doctors don’t have that much time.
There’s also a very worrisome trend in all societies, but certainly in the United States, which is the disappearance of primary care. In the absence of access to primary care, having some ability to ask basic questions about your symptoms, to see if perhaps it’s something you should be worried about, will bring some patients to a doctor. The negative side is that we don’t know the advice they’re getting from ChatGPT is always the right advice; we don’t know that everything it reports is necessarily a fact. It still has a tendency to make up facts or, as some computer scientists call it, hallucinate. And yet, in the absence of any other authoritative source of information, patients are going to keep going to these kinds of very informative sources. The open question is: Do they know enough to know the difference between good advice and bad advice?
Rachel Gotbaum: So Dr. Hightower, I’d like to turn back to you. Patients are obviously vulnerable to AI in other ways — for example, to bias. You founded a company called Equality AI. How did you get involved with this area of work?
Maia Hightower: Since the day I entered medical school, when you get your white coat at the white coat ceremony and look up and see all the leaders of the health system welcoming you to this wonderful profession, it didn’t take rocket science for me to reflect and say, “Hm. That population doesn’t look like the community I come from.” So how did I come to Equality AI? I was the chief population health officer at the time Dr. Obermeyer’s paper on dissecting racial bias in an algorithm used to manage the health of populations was published. That study showed that a widely used algorithm had a flaw that cut the rate of referral to case management roughly in half for Black patients compared with White patients who were equally sick. And they were able to identify the bias, mitigate it, and fix the algorithm.
And at the time, I remember, I was the chief population health officer, and I wish I could say I had an epiphany and was ready to combat bias in AI. No, instead it was a moment where I, like many, probably thought it was a one-off: the algorithm was fixed, and we could all move on our way. A few years later, of course, Covid happened, and there was this real awareness of social injustice and the dual pandemic of health care inequity. It was showing in the numbers: Black patients, Latino patients, Pacific Islander patients, Native American patients were dying far more frequently from Covid. And so I developed what’s called the health care IT equity maturity model, looking for all the different areas within health care where systemic bias is embedded in our health care IT systems, and recognizing that my workforce was not very diverse.
And that was really what Obermeyer was highlighting in his paper: that lack of diversity. If there had been a more diverse team creating the algorithm, most likely some of the errors that were made would’ve been avoided. The New England Journal of Medicine article “Hidden in Plain Sight” is another great example. It outlines 13 rules-based historical algorithms that use race and are embedded in our IT systems.
Rachel Gotbaum: Can you give some examples?
Maia Hightower: Yes. A great example is the American Heart Association’s Get With The Guidelines–Heart Failure risk calculator, which helps you calculate the risk of heart failure and whether or not a patient should be referred to a cardiologist. In that particular algorithm, if you push the button and add Black race, the patient is less likely to be referred to a cardiologist. The score changes: it goes down, below the threshold for referral to a cardiologist.
Rachel Gotbaum: So tell us how these biases happen.
Maia Hightower: Throughout history, we have developed algorithms to help predict an outcome, and we’ve always used data to inform those algorithms. Historically, that data have not reflected the population we serve. For example, radiologists diagnose osteoarthritis using something called the KLG score, which was developed from Welsh miners in the 1950s. I can tell you that our population today does not reflect Welsh miners. That score, though, continues to be used as a primary diagnostic for osteoarthritis. And it’s been found that if you redevelop the score today, using a population that reflects today’s patient demographics, you come up with a very different algorithm and a better predictor of osteoarthritis pain, as well as potentially a better threshold for knee or joint replacement. The result has been undertreated pain for Black patients in the United States, and most likely for women and other minority groups as well.
We don’t know the full extent of the inequity in the treatment of joint pain, but the studies to date suggest that, at least for Black patients, pain has been underdiagnosed and undertreated. And this applies to any population: if you’ve ever been the underdog, if you’ve ever been the outlier, if you’ve ever been the small sample size, then you’re at risk of bias from AI. A good example is sepsis: many of our sepsis models are developed on middle-aged and older populations. If you’re a 20-year-old and you find yourself in an adult hospital, those sepsis algorithms are less likely to perform as well for you as for a 55-to-70-year-old.
Rachel Gotbaum: So what’s going on here, Dr. Hightower?
Maia Hightower: We haven’t owned up to our history of lack of inclusion, whether it’s from our clinical trials to the way that we practice evidence-based medicine. And that same dirty little secret of bias in medicine is being replicated and potentially scaled in AI.
Rachel Gotbaum: But is it really possible to remove this kind of bias from AI algorithms?
Maia Hightower: There are many types of bias, and for each type, there are many methods of mitigation. Some measures are simply about increasing the diversity of our teams, attracting a diversity of talent and viewpoints to data science, to health care in general, and to technology. Some approaches, though, are very statistical in nature: a correction algorithm can be applied to the original algorithm, or a small-sample-size data set can be augmented with synthetic data. So there are some pretty cool methods out there to mitigate bias.
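One of the statistical mitigations Dr. Hightower mentions, augmenting a small-sample-size subgroup with synthetic data, can be sketched very simply. Real synthetic-data methods (SMOTE-style interpolation, generative models) are far more sophisticated; this toy version just resamples the underrepresented group with small Gaussian jitter, and every name and number in it is invented for illustration.

```python
# Illustrative sketch: upsample an underrepresented subgroup so the
# training set is less skewed. Real methods are more sophisticated.
import numpy as np

rng = np.random.default_rng(1)

def augment_subgroup(X: np.ndarray, n_target: int, noise: float = 0.05) -> np.ndarray:
    """Resample rows of a small subgroup, with Gaussian jitter, up to n_target rows."""
    idx = rng.integers(0, len(X), size=n_target - len(X))
    synthetic = X[idx] + rng.normal(scale=noise, size=(len(idx), X.shape[1]))
    return np.vstack([X, synthetic])

minority = rng.normal(size=(30, 5))            # 30 patients, 5 features
balanced = augment_subgroup(minority, n_target=100)  # now 100 rows
```

The jitter keeps the synthetic rows from being exact duplicates; the point is simply that the model now sees the subgroup often enough to learn from it, rather than treating it as an outlier.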
Rachel Gotbaum: So let me turn back to you, Dr. Kohane. What are some of the safeguards you’d like to see put in place as we move forward, so there’s not perhaps a false sense of security and reliance on these new AI tools?
Isaac Kohane: Well, I certainly agree that having a human in the loop, having the doctor as part of the decision-making process, is a necessary component of making sure the decisions that are made are appropriate. I do think the concern you brought up, that we’ll essentially fall asleep at the wheel because we become used to automation and just assume that whatever is going on is OK, is a danger. So there is an open question about how to ensure that doctors make themselves responsible, ultimately, for all decisions, and that they not assume some automated process is always going to do the right thing.
But I think, just as importantly, we have to start measuring the system. Just as we have trials for drugs to see if they’re effective, we need trials for these AI artifacts to see whether they are effective. And we can actually measure whether they are biased, whether they create different outcomes for different populations. If we don’t measure it, we’re not going to be able to do anything about it. We have to understand that the interventions resulting from applying artificial intelligence to medicine are at least as powerful as drugs, and they have to be evaluated with the same rigor.
Rachel Gotbaum: And let’s talk a little bit more about the other benefits we’re seeing with AI. Dr. Kohane, I know you believe AI can be used to address physician burnout, for example.
Isaac Kohane: The pressures to provide sufficient revenue push us to see a lot of patients, and therefore to see patients at shorter and shorter intervals. At the same time, the technology we’ve used to date, electronic health record systems, has been optimized for billing but not for clinical care. That combination has really sucked the marrow, the excitement, the joy of practice, out of many clinicians. If, under the right governance, we had, for example, the conversations occurring between doctors and their patients automatically turned into the documentation that needs to go into the health record, that needs to go in for billing, that needs to be sent back to a referring doctor or to the patient, all of which is technically possible, we could take a huge administrative load off physicians.
Similarly, if requests for referrals or prior authorizations really become just pressing a button and saying, “use the patient’s history as the basis of an argument for prior authorization,” that’s another 15 minutes of drudgery eliminated. If AI could essentially serve as the automated equivalent of a very smart medical scribe who then follows through on all the administrative tasks, we could bring doctors back to what they went into medicine for in the first place: contact with patients, the desire to help them where they live emotionally, and talking about the thing they’re there for most of all, which is their health care. So that’s the first and most important way it could help doctors.
The second way is, because we have a historic and ongoing deficit in primary care physicians, we all are beginning to count on physician assistants and nurse practitioners. If we can provide these nurse practitioners and physician assistants with AI so that they will be fully conversant with genomics, with expert diagnoses, I think that we may be able to bridge a lot of that missing workforce.
Rachel Gotbaum: So it sounds like it’s still a bit of a mixed bag of pros and cons right now.
Isaac Kohane: Well, the cons are many of the cons that Dr. Hightower brought up already: if we’re not careful, it could pour concrete over existing biases. But I want to be optimistic. For example, these AI programs can now actually read a patient note, look at clinical trials, and say, “Are there trials this patient could qualify for?” That alone, just getting more people involved in clinical trials, will ultimately be an important contribution to leveling the playing field. Also, just reading the literature: I don’t know about you, but I try to stay up to speed, and to do so I skim titles or, on a good day, abstracts of the papers I want to read. If I can, as I have recently, get these chatbots to actually summarize a paper for me, I know much better which papers I want to invest the time to actually understand.
Rachel Gotbaum: So at this point in its development, where do you think AI should be used and where should it not be used in medicine?
Isaac Kohane: I think that at present, AI should be used where human beings are weakest, namely, in knowing everything about all their patients and in being as alert at 6:00 in the evening as they are at 8:00 in the morning. I don’t think at this time that AI should be used instead of the human intuition, the human contact, the human common sense that doctors bring to bear in their patient interactions. By using AI for its strengths and playing to our human strengths as doctors, we’ll probably have the best outcome. The worst outcomes will be doctors who don’t use AI and AI that doesn’t use doctors.
Maia Hightower: So AI is decision support, not decision maker.
Rachel Gotbaum: Thank you both very much.
Isaac Kohane: Thank you.
Maia Hightower: Thank you so much, Rachel. This has been wonderful.
Rachel Gotbaum: That’s Dr. Maia Hightower. She’s chief digital and technology officer at University of Chicago Medicine, and the CEO of Equality AI. And Dr. Isaac Kohane. He’s chair of the Department of Biomedical Informatics at Harvard Medical School and coeditor of the “AI in Medicine” series in NEJM.