Artificial intelligence can predict survival in patients with cancer using information from a patient’s initial consultation with an oncologist, according to a study published in JAMA Network Open.
Researchers found that natural language processing models could predict survival at 6 months, 36 months, and 60 months at least as well as prior models, without focusing on a specific cancer type.
“These findings suggest it is feasible to predict survival of patients with cancer using a common cancer document without additional data and without training separate models for specific types of cancer,” the researchers wrote.
The researchers trained and assessed both conventional and neural models to predict whether patients would survive 6 months, 36 months, or 60 months based on an initial consultation with an oncologist. The primary outcome was performance of the predictive models. The secondary outcome was which words the models relied on.
Data from 47,625 cancer patients who had an initial oncologist consultation within 180 days of diagnosis were included. Consultation documents in the data set were generated by oncologists practicing at 6 centers in various geographic locations.
Survival was calculated as the number of months between the selected document and either the patient’s death or the mortality cutoff date of April 6, 2022.
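The survival calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function names, the whole-month rounding rule, and the choice to leave a label undefined when a living patient has too little follow-up are all assumptions, since the article does not say how such cases were handled.

```python
from datetime import date
from typing import Dict, Optional

# Mortality cutoff date stated in the article.
CUTOFF = date(2022, 4, 6)


def months_between(start: date, end: date) -> int:
    """Whole months elapsed from start to end (assumed rounding rule)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return max(months, 0)


def survival_months(document_date: date, death_date: Optional[date]) -> int:
    """Months from the consultation document to death or to the cutoff."""
    died_before_cutoff = death_date is not None and death_date <= CUTOFF
    end = death_date if died_before_cutoff else CUTOFF
    return months_between(document_date, end)


def survival_labels(
    document_date: date,
    death_date: Optional[date],
    horizons=(6, 36, 60),
) -> Dict[int, Optional[bool]]:
    """Binary target per horizon; None when follow-up is too short (assumption)."""
    died = death_date is not None and death_date <= CUTOFF
    months = survival_months(document_date, death_date)
    labels: Dict[int, Optional[bool]] = {}
    for horizon in horizons:
        if months >= horizon:
            labels[horizon] = True   # known to have survived this long
        elif died:
            labels[horizon] = False  # died before reaching the horizon
        else:
            labels[horizon] = None   # alive at cutoff, outcome unknown
    return labels


# Hypothetical patient: document dated 2020-01-15, death on 2020-08-20.
print(survival_labels(date(2020, 1, 15), date(2020, 8, 20)))
```

For the hypothetical patient above, survival is 7 months, so only the 6-month target is positive.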
The researchers compared 4 models: the non-neural bag-of-words (BoW) algorithm and 3 neural models, namely convolutional neural networks (CNN), long short-term memory (LSTM), and bidirectional encoder representations from transformers (BERT).
Results showed numerically similar performance for BoW, CNN, and LSTM (balanced accuracy [BAC] > 0.800; area under curve [AUC] > 0.900). BERT performed worse across the board.
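For readers unfamiliar with the two metrics, the sketch below computes them from scratch on a small invented example. Balanced accuracy is the mean of sensitivity and specificity, and AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The function names and toy data are illustrative assumptions and are unrelated to the study's results.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity (recall on 1s) and specificity (recall on 0s)."""
    positives = sum(1 for t in y_true if t == 1)
    negatives = sum(1 for t in y_true if t == 0)
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    true_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (true_pos / positives + true_neg / negatives)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = survived to the horizon, 0 = did not.
y_true = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(balanced_accuracy(y_true, y_pred))  # 0.625
print(roc_auc(y_true, scores))            # 0.875
```

Balanced accuracy is useful here because the classes are uneven: a model that always predicts survival would score well on plain accuracy at short horizons but only 0.5 on balanced accuracy.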
BoW performed best for predicting 6-month survival (BAC, 0.856; AUC, 0.928). CNN had the best performance for predicting 36-month survival (BAC, 0.842; AUC, 0.918) and 60-month survival (BAC, 0.837; AUC, 0.918).
Neural models showed no clear performance gain over BoW, suggesting that survival prediction depends on the presence of certain words in initial oncology consultation documents.
Similar tokens (words with endings removed) ranked among the 10 most important features for the BoW models used to predict 6-month and 60-month survival. For example, “palliat” (short for “palliative” or “palliation”) is the most important feature in both.
In addition, words for different cancer types are important for 6-month and 60-month survival. The words “breast” and “prostate” are positive predictors for 6-month survival, whereas “lung,” “liver,” and “glioblastoma” are negative predictors for 60-month survival. “N0” (negative lymph node involvement) was a top 10 positive predictor for 60-month survival.
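To make the idea of token importance concrete, the sketch below builds a tiny bag-of-words representation and ranks tokens by how strongly they separate survivors from non-survivors. It is not the study's method: the crude suffix stripper stands in for whatever stemming the researchers used, a smoothed log-odds ratio stands in for the learned model weights, and the four documents are invented for illustration.

```python
import math
import re
from typing import Dict, List, Set

# Crude stand-in for a real stemmer (assumption): strip one common suffix.
SUFFIXES = ("ive", "ion", "ing", "ed", "s")


def stem(word: str) -> str:
    """Remove the first matching suffix, keeping at least 4 characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def bag_of_words(text: str) -> Set[str]:
    """Set of stemmed tokens present in a document (word order is ignored)."""
    return {stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())}


def token_log_odds(survived: List[str], died: List[str]) -> Dict[str, float]:
    """Smoothed log-odds of a token appearing in survivor vs non-survivor documents.

    Positive values point toward survival, negative values away from it.
    """
    surv_bags = [bag_of_words(d) for d in survived]
    died_bags = [bag_of_words(d) for d in died]
    vocabulary = set().union(*surv_bags, *died_bags)
    scores = {}
    for token in vocabulary:
        in_surv = sum(token in bag for bag in surv_bags)
        in_died = sum(token in bag for bag in died_bags)
        p_surv = (in_surv + 1) / (len(surv_bags) + 2)  # add-one smoothing
        p_died = (in_died + 1) / (len(died_bags) + 2)
        scores[token] = math.log(p_surv / p_died)
    return scores


# Invented toy documents, not taken from the study data.
survived_docs = ["breast cancer node negative", "prostate cancer localized"]
died_docs = [
    "lung cancer palliative radiation",
    "liver metastases palliation discussed",
]

scores = token_log_odds(survived_docs, died_docs)
for token, score in sorted(scores.items(), key=lambda item: item[1])[:3]:
    print(f"{token}: {score:.3f}")
```

In this toy data, “palliative” and “palliation” collapse to the single token “palliat”, which appears in both non-survivor documents and therefore receives the most negative score. Because the representation ignores word order, any predictive signal comes only from which tokens are present.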
“These findings suggest that models performed comparably with or better than previous models predicting cancer survival and that they may be able to predict survival using readily available data without focusing on 1 cancer type,” the researchers concluded.
“Our results suggest it is possible to predict the survival of patients with cancer without having to construct structured data sets or limiting the predictions to specific types or locations of cancer. Given the widespread availability of initial oncologist consultation documents, this opens up the possibility of more easily training and using such models across cancer types at different cancer centers.”
Disclosures: This research was partly supported by the Pfizer Innovation Fund. Some study authors declared affiliations with biotech, pharmaceutical, and/or device companies. Please see the original reference for a full list of disclosures.
Nunez J-J, Leung B, Ho C, Bates AT, Ng RT. Predicting the survival of patients with cancer from their initial oncology consultation document using natural language processing. JAMA Netw Open. Published online February 27, 2023. doi:10.1001/jamanetworkopen.2023.0813