New Algorithms Could Reduce Racial Disparities in Health Care

Researchers trying to improve health care with artificial intelligence usually subject their algorithms to a form of machine med school. Software learns from doctors by digesting thousands or millions of x-rays or other data labeled by expert humans until it can accurately flag suspect moles or lungs showing signs of Covid-19 by itself.

A study published this month took a different approach—training algorithms to read knee x-rays for arthritis by using patients as the AI arbiters of truth instead of doctors. The results revealed radiologists may have literal blind spots when it comes to reading Black patients’ x-rays.

The algorithms trained on patients’ reports did a better job than doctors at accounting for the pain experienced by Black patients, apparently by discovering patterns of disease in the images that humans usually overlook.

“This sends a signal to radiologists and other doctors that we may need to reevaluate our current strategies,” says Said Ibrahim, a professor at Weill Cornell Medicine, in New York City, who researches health inequalities, and was not involved in the study.

Algorithms designed to reveal what doctors don’t see, instead of mimicking their knowledge, could make health care more equitable. In a commentary on the new study, Ibrahim suggested it could help reduce disparities in who gets surgery for arthritis. African American patients are about 40 percent less likely than others to receive a knee replacement, he says, even though they are at least as likely to suffer osteoarthritis. Differences in income and insurance likely play a part, but so could differences in diagnosis.

Ziad Obermeyer, an author of the study and a professor at the University of California, Berkeley’s School of Public Health, was inspired by a medical puzzle to use AI to probe what radiologists weren’t seeing. Data from a long-running National Institutes of Health study on knee osteoarthritis showed that Black patients, and people with lower incomes, reported more pain than other patients whose x-rays radiologists scored as similar. The differences might stem from physical factors unknown to keepers of knee knowledge, or from psychological and social differences. But how to tease those apart?

Obermeyer and researchers from Stanford, Harvard, and the University of Chicago created computer vision software using the NIH data to investigate what human doctors might be missing. They programmed algorithms to predict a patient’s pain level from an x-ray. Over tens of thousands of images, the software discovered patterns of pixels that correlate with pain.

When given an x-ray it hasn’t seen before, the software uses those patterns to predict the pain a patient would report experiencing. Those predictions correlated more closely with patients’ pain than the scores radiologists assigned to knee x-rays, particularly for Black patients. That suggests the algorithms had learned to detect evidence of disease that radiologists didn’t. “The algorithm was seeing things over and above what the radiologists were seeing—things that are more commonly causes of pain in Black patients,” Obermeyer says.
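The comparison described above, checking whether the software's predictions or the radiologists' scores track reported pain more closely, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say which statistic the researchers used, so the Pearson correlation here is a stand-in, and every number and variable name below is made up for the example rather than taken from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy values, NOT data from the study.
# One entry per knee x-ray, in the same order in all three lists.
reported_pain = [10, 25, 30, 45, 60, 70, 85, 90]         # patient-reported pain
radiologist_grade = [0, 1, 1, 2, 1, 2, 3, 2]             # coarse severity grade
algorithm_prediction = [15, 20, 35, 40, 55, 75, 80, 88]  # model's predicted pain

r_grade = pearson(radiologist_grade, reported_pain)
r_algo = pearson(algorithm_prediction, reported_pain)

print(f"radiologist grade vs. reported pain: r = {r_grade:.2f}")
print(f"algorithm prediction vs. reported pain: r = {r_algo:.2f}")
```

In this toy setup the predicted-pain values follow the reported pain more tightly than the coarse grades do, which is the kind of gap the researchers reported. Running the same comparison separately for each patient group would show where the gap is largest.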

History may explain why radiologists aren’t as proficient in assessing knee pain in Black patients. The standard grading used today originated in a small 1957 study in a northern England mill town with a less diverse population than the modern US. Doctors used what they saw to devise a way to grade the severity of osteoarthritis based on observations such as narrowed cartilage. X-ray equipment, lifestyles, and many other factors have changed a lot since. “It’s not surprising that fails to capture what doctors see in the clinic today,” Obermeyer says.

The study is notable not just for showing what happens when AI is trained by patient feedback instead of expert opinions, but because medical algorithms have more often been seen as a cause of bias, not a cure. In 2019, Obermeyer and collaborators showed that an algorithm guiding care for millions of US patients gave white people priority over Black people for assistance with complex conditions such as diabetes.

Obermeyer’s new study showing how algorithms can uncover bias comes with a catch: Neither he nor the algorithms can explain what the algorithms see in x-rays that doctors miss. The researchers used artificial neural networks, a technology that has made many AI applications more practical but is so tricky to reverse engineer that experts dub such systems “black boxes.”

Judy Gichoya, a radiologist and assistant professor at Emory University, aims to uncover what the knee algorithms know. Doing so will depend on human labor and ingenuity.

She’s assembling a larger, more diverse collection of x-rays and other data to test the algorithms’ performance. By asking radiologists to make detailed notes on x-rays, and comparing what they see with the pain-predicting algorithms’ output, Gichoya hopes to uncover clues about what the software is picking up on. She’s hopeful it won’t be anything too alien to human doctors. “It may be that it’s something we do see, but in the wrong way,” she says.