Artificial intelligence (AI) chatbots can sometimes provide accurate information about cancers, but these tools have limitations, according to a pair of studies published in JAMA Oncology.1,2
In the first study, researchers assessed chatbots’ responses to the top Internet searches related to 5 cancers.1 The chatbots provided information that was generally of high quality but not always actionable, and it was written at a college reading level.
In the second study, researchers found that a chatbot’s responses to queries about cancer treatments did not always align with recommendations in National Comprehensive Cancer Network (NCCN) guidelines.2
Most-Searched Queries About Cancers
Alexander Pan, of SUNY Downstate Health Sciences University in Brooklyn, New York, and colleagues evaluated chatbots’ responses to the top 5 search queries for skin, colorectal, prostate, lung, and breast cancers.1 All queries contained the terms “cancer symptoms” and “what is [specific cancer].”
The researchers tested 4 chatbots: ChatGPT, Perplexity, Chatsonic, and Bing AI. The team used the DISCERN validation tool to assess the quality of information the chatbots provided and the Patient Education Materials Assessment Tool (PEMAT) to analyze the understandability and actionability of responses. DISCERN was scored on a scale of 1-5 and PEMAT on a scale of 0%-100%; on both tools, higher scores indicated higher-quality responses.
The quality of the cancer information provided by the chatbots was high, with a median DISCERN score of 5 (range, 2-5).
However, the information was only moderately understandable. The median PEMAT understandability score was 66.7% (range, 33.3%-90.1%), and the researchers characterized the responses as written at a “college reading level.”
Furthermore, the chatbots often failed to provide actionable responses, with a median PEMAT actionability score of 20.0% (range, 0%-40.0%).
“These limitations suggest that AI chatbots should be used supplementarily and not as a primary source for medical information,” the researchers concluded.
ChatGPT and Cancer Treatment Recommendations
Shan Chen, of Mass General Brigham in Boston, and colleagues evaluated whether ChatGPT responded to queries about cancer treatments with recommendations that were in line with NCCN guidelines.2
Because ChatGPT’s knowledge cutoff was September 2021, the researchers measured responses against the 2021 NCCN guidelines. Responses were assessed by board-certified oncologists.
The researchers used 104 queries for breast, prostate, and lung cancer. The chatbot provided at least 1 treatment recommendation for 102 of the queries (98%). All 102 of these responses included at least 1 NCCN-concordant recommendation, but 35 of the 102 (34.3%) also included at least 1 non-concordant recommendation.
Additionally, 13 of 104 chatbot responses (12.5%) were “hallucinated”; that is, they described treatments that were not part of any recommended treatment for the specified cancer.
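The percentages reported for this study can be rechecked with a short calculation. The Python sketch below is illustrative only: the variable and function names are hypothetical, the counts are the ones quoted above, and it assumes that the 35 responses with a non-concordant recommendation are measured against the 102 responses that contained any recommendation (the denominator that reproduces the reported 34.3%).

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to 1 decimal place."""
    return round(100 * part / whole, 1)

# Counts quoted in the article (names are hypothetical)
total_queries = 104          # queries for breast, prostate, and lung cancer
with_recommendation = 102    # responses with at least 1 treatment recommendation
non_concordant = 35          # responses that also had a non-concordant recommendation
hallucinated = 13            # responses classed as "hallucinated"

# Assumption: non-concordant responses are a share of the 102, not of all 104
share_with_recommendation = pct(with_recommendation, total_queries)
share_non_concordant = pct(non_concordant, with_recommendation)
share_hallucinated = pct(hallucinated, total_queries)

print(share_with_recommendation, share_non_concordant, share_hallucinated)
```

Under these assumptions, the three values come out to 98.1%, 34.3%, and 12.5%, which matches the reported figures once 98.1% is rounded to the nearest whole number.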
“The chatbot did not purport to be a medical device and need not be held to such standards,” the researchers noted. “However, patients will likely use such technologies in their self-education, which may affect shared decision-making and the patient-clinician relationship. Developers should have some responsibility to distribute technologies that do not cause harm, and patients and clinicians need to be aware of these technologies’ limitations.”
Disclosures: Some study authors declared affiliations with biotech, pharmaceutical, and/or device companies. Please see the original references for a full list of disclosures.
1. Pan A, Musheyev D, Bockelman D, Loeb S, Kabarriti AE. Assessment of artificial intelligence chatbot responses to top searched queries about cancer. JAMA Oncol. Published online August 24, 2023. doi:10.1001/jamaoncol.2023.2947
2. Chen S, Kann BH, Foote MB, et al. Use of artificial intelligence chatbots for cancer treatment information. JAMA Oncol. Published online August 24, 2023. doi:10.1001/jamaoncol.2023.2954