Extreme gradient boosting (XGBoost) performed best among 7 machine learning models in predicting the risk of gastric cancer after eradication of Helicobacter pylori infection based on simple baseline patient information, according to the results of a study published in Alimentary Pharmacology & Therapeutics.

A team of investigators from the University of Hong Kong evaluated patients who underwent clarithromycin-based triple therapy for the eradication of H pylori from 2003 to 2014. Based on the period of eradication therapy, patients were separated into training (n=64,238) and validation (n=25,330) sets. A total of 7 machine learning models were constructed to predict the risk of gastric cancer development within 5 years after H pylori eradication. Model performance was measured by the area under the receiver operating characteristic curve (AUC).
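As a rough illustration of the AUC metric mentioned above, the following sketch computes it directly from its definition: the probability that a randomly chosen patient who developed cancer receives a higher risk score than a randomly chosen patient who did not. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only): 1 = developed gastric cancer, 0 = did not.
labels = [0, 0, 1, 0, 1, 0]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of patients, so for cohorts of the size reported here a rank-based implementation from a statistics library would be more practical.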

During a mean follow-up of approximately 4.7 years, 0.21% of patients who underwent H pylori eradication developed gastric cancer. XGBoost performed best among the 7 machine learning models in predicting cancer development (AUC 0.97; 95% CI, 0.96-0.98) and was superior to conventional logistic regression (AUC 0.90; 95% CI, 0.84-0.92). With the XGBoost model, 6.6% of patients were considered at high risk for gastric cancer, with a miss rate of 1.9%. The most heavily weighted factors in the XGBoost model were patient age, presence of intestinal metaplasia, and gastric ulcer.
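To make the high-risk proportion and miss rate concrete, here is a minimal sketch of how such figures could be derived from risk scores. It assumes that "miss rate" means the share of patients who developed cancer but were placed in the low-risk group; the article does not define the term, so this reading is an assumption. The function name `stratify`, the 0.5 threshold, and the toy data are likewise hypothetical.

```python
def stratify(labels, scores, threshold):
    """Return (high_risk_share, miss_rate) for a given score threshold.

    high_risk_share: fraction of all patients with score >= threshold.
    miss_rate: fraction of cancer cases scored below the threshold
               (assumed definition; not specified in the article).
    """
    if not labels or len(labels) != len(scores):
        raise ValueError("labels and scores must be non-empty and equal length")
    high = [s >= threshold for s in scores]
    high_risk_share = sum(high) / len(labels)
    cancers = sum(labels)
    missed = sum(1 for y, h in zip(labels, high) if y == 1 and not h)
    miss_rate = missed / cancers if cancers else 0.0
    return high_risk_share, miss_rate


# Toy data (illustrative only): 1 = developed gastric cancer, 0 = did not.
labels = [0, 0, 1, 0, 1, 0, 0, 0, 0, 1]
scores = [0.1, 0.2, 0.9, 0.05, 0.7, 0.6, 0.1, 0.3, 0.2, 0.4]
share, miss = stratify(labels, scores, threshold=0.5)
print(f"high-risk share = {share:.1%}, miss rate = {miss:.1%}")
```

Lowering the threshold flags more patients as high risk and reduces the miss rate, so the two figures trade off against each other.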

The study has several limitations. It was retrospective, the training and validation cohorts came from the same population, and it included all patients who underwent H pylori eradication. Endoscopic, histologic, or serologic parameters were not available for all patients enrolled in the study. Some patients with failed H pylori eradication may have been missed, because failure was identified only by the need for retreatment. In addition, the underlying algorithm of the machine learning model is complex and not transparent, which could lead clinicians to regard such models as impractical for routine clinical practice. Lastly, additional research and follow-up are warranted to ascertain the potential medicolegal implications of using this tool in clinical decision-making.

Nevertheless, the researchers determined that, based on simple clinical information and medication history, the XGBoost model can accurately predict the risk of gastric cancer development after H pylori eradication. These findings suggest that the XGBoost model has the potential to significantly reduce the number of patients who need endoscopic surveillance.

The study authors concluded, “The application of machine learning on risk stratification that is based on simple patient’s information looks promising and deserves further evaluation.”

Disclosure: One author declared affiliations with industry. Please refer to the original article for a full list of disclosures.


Leung WK, Cheung KS, Li B, Law SYK, Lui TKL. Applications of machine learning models in the prediction of gastric cancer risk in patients after Helicobacter pylori eradication. Aliment Pharmacol Ther. Published online January 24, 2021. doi:10.1111/apt.16272

This article originally appeared on Gastroenterology Advisor