Revolutionizing Colorectal Health

An AI network-based approach to predicting colorectal cancer risk.


Harbert College of Business Analytics Assistant Professor Dr. Pankush Kalgotra and researchers from Oklahoma State University and the Swedish Medical Center have developed a novel data-based method of predicting the risk of colorectal cancer (CRC) in patients under 50 without a family history of the disease.

Their network-based model is timely, given that the incidence of CRC is rising in the United States among younger individuals while health care costs are rapidly increasing.
According to the American Cancer Society (ACS), rates of CRC have been rising in younger individuals for nearly 40 years. ACS estimates that nearly 20,000 individuals under 50 will be diagnosed with CRC this year and 3,750 of them will die from the disease.

As a result, ACS now recommends that adults begin screening for CRC at age 45 — either with a stool-based test or colonoscopy.

“Affordability of health care is a major concern of the U.S. population,” said Kalgotra. “Our method could help identify the [people] that really need the screening rather than screening everyone. Our model helps efficiently manage those resources to help reduce health care costs.”

Identifying young people with a high risk of CRC can help them change their lifestyle, as the majority of all CRC cases and deaths are attributable to modifiable risk factors, such as smoking, an unhealthy diet, high alcohol consumption, physical inactivity and excess body weight.

Kalgotra’s research collaborators are Dr. Ramesh Sharda, professor of management science and information systems from Oklahoma State University, and Dr. Sravanthi Parasa, a gastroenterologist from the Swedish Medical Center in Seattle.

The novelty of their method lies in building the model from patient medical records rather than from biological samples such as tissue biopsies or blood tests.

“Novel analytics/AI approaches such as this project have the potential to make a significant impact in health care as well as other areas,” said Sharda. “The ability to use a large corpus of data in an innovative manner to predict early onset of a disease is a big motivator for Pankush and myself.”

In addition, their approach quantifies how multiple underlying diseases interact with one another and contribute to a CRC diagnosis. Earlier attempts to develop CRC predictive models have not considered the interaction of diseases.

“We used network analysis to come up with the new variables and then used those variables [to train] our machine learning models,” Kalgotra said.

To build the model, they examined the electronic medical records of 7,500 CRC patients and nearly 38,000 non-CRC patients to quantify how patients’ other diseases (diabetes, anemia and hypertension, for example) interact. From this, they calculated CRC risk-score variables, which they compared between patients who received a CRC diagnosis and those who did not.
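The article does not give the team's formulas, so the following is only a rough sketch of the general idea of turning disease co-occurrence into a risk-score variable. The toy records, the function names and the scoring rule (the difference in how often a disease pair appears in the CRC group versus the non-CRC group) are all assumptions made for illustration, not the method from the paper.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(patients):
    """Count how many patients carry each unordered pair of diagnoses."""
    counts = Counter()
    for diagnoses in patients:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(diagnoses)), 2):
            counts[pair] += 1
    return counts


def interaction_score(diagnoses, crc_counts, n_crc, non_crc_counts, n_non_crc):
    """Hypothetical score: for every disease pair the patient has, add the
    pair's frequency in the CRC group minus its frequency in the non-CRC group."""
    score = 0.0
    for pair in combinations(sorted(set(diagnoses)), 2):
        score += crc_counts[pair] / n_crc - non_crc_counts[pair] / n_non_crc
    return score


# Invented toy records; real inputs would be coded diagnoses from medical records.
crc_patients = [
    {"diabetes", "anemia"},
    {"diabetes", "anemia", "hypertension"},
    {"anemia", "hypertension"},
]
non_crc_patients = [
    {"diabetes", "hypertension"},
    {"hypertension"},
    {"diabetes", "hypertension"},
    {"anemia"},
]

crc_counts = cooccurrence_counts(crc_patients)
non_crc_counts = cooccurrence_counts(non_crc_patients)

# Score a new (hypothetical) patient from the pairs in their history.
new_patient = {"diabetes", "anemia"}
score = interaction_score(
    new_patient, crc_counts, len(crc_patients), non_crc_counts, len(non_crc_patients)
)
print(round(score, 3))
```

In a sketch like this, a score of this kind would become one input column among others for a machine learning model, in line with Kalgotra's description of using network-derived variables for training.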

They then validated the model’s performance on an independent group of patients to check whether its predictions were correct.

According to Kalgotra, their model performed very well, accurately predicting a CRC diagnosis 73% of the time. Even more importantly, he said, the model achieved an area under the curve (AUC) of 0.81. AUC measures how well a machine learning model separates patients who have the outcome from those who do not: a score of 1 is perfect, while a score of 0.5 is no better than chance. “Our AUC of .81 is considered very good in the medical field,” he said.
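For readers unfamiliar with AUC, one way to read it is as the probability that the model gives a randomly chosen patient with the outcome a higher risk score than a randomly chosen patient without it. The short sketch below computes that quantity directly on invented labels and scores; the numbers are illustrative only and are not the study's data.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive case
    gets the higher score; tied pairs count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented example: 1 = diagnosed, 0 = not diagnosed.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]

print(round(auc(labels, scores), 3))
```

A model that ranks every diagnosed patient above every undiagnosed one scores 1.0 under this definition, and one that assigns the same score to everyone scores 0.5.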

They describe their model and results in the paper, “Quantifying disease-interactions through co-occurrence matrices to predict early-onset colorectal cancer,” published in the journal Decision Support Systems.

The team is continuing this research to create an interpretable machine-learning model.

“Amidst the rising incidence of early-onset colorectal cancer (CRC), interpretable machine learning models empower clinicians at the point of care to provide actionable guidance to patients based on complex algorithms,” said Parasa. “This becomes increasingly critical as we endeavor to translate these machine learning models and shed light on the potential risk reduction tied to addressing modifiable factors such as obesity, which empowers patients to make informed decisions about their health.”

They will release a new research paper specifically geared to the medical community, informing physicians about the promise of their model.

“Medicine is adopting machine learning and AI tools in medical practices, so our project has a lot of potential in the real world, helping physicians make decisions,” said Kalgotra. “The implementation of our model is simple and does not require any special training. The model generates the risk score for each patient based on their medical history, which is already there in the [patient’s] Electronic Medical Record (EMR).”

According to Kalgotra, their model could be used to predict other health outcomes, such as length of hospital stay or readmission rate.