Artificial intelligence is thrusting oncology into a new realm, promising increasingly powerful diagnostic and therapeutic tools — so powerful, in fact, that some applications are already surpassing human ability in preclinical tests.

But much of that potential remains theoretical: untested in clinical settings and viewed warily by physicians weighing real-world use.

In a recent test, for example, “deep learning algorithms” outdid human pathologists in examining digitized whole-slide images of tissue sections and identifying lymph node metastases in women with breast cancer.1

“Basically, deep learning shows that, on specific tasks involving interpretation of images that require intelligence, we can, in principle, be able to perform at least as well as humans,” the study’s lead author, Babak Ehteshami Bejnordi, PhD, of the department of radiology and nuclear medicine at Radboud University Medical Center in the Netherlands, told Cancer Therapy Advisor. “If you need to look at particular structures, deep learning algorithms have reached a level of maturity that can rival human performance.”

The test aimed to replicate routine workflow for pathologists: about 129 slides in 2 hours. Participants in the Cancer Metastases in Lymph Nodes Challenge 2016 (CAMELYON16) competition developed algorithms to detect metastases in sentinel axillary lymph nodes. The performance of each algorithm was then compared with that of a panel of 11 pathologists.

“The best algorithm,” the designers of the challenge reported in December, “performed significantly better than the pathologists.”

The pathologists were allowed to continue their analyses beyond the 2-hour limit if they needed extra time. The top 5 algorithms outperformed each of the 11 pathologists in the time-limited group.

“In cross-sectional analyses that evaluated 32 algorithms submitted as part of a challenge competition,” the authors wrote, “7 deep learning algorithms showed greater discrimination than a panel of 11 pathologists in a simulated time-constrained diagnostic setting, with an area under the curve of 0.994 (best algorithm) vs 0.884 (best pathologist).”
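
For readers unfamiliar with the metric, the area under the curve (AUC) quoted here can be read as the probability that a reader, human or algorithm, assigns a higher suspicion score to a randomly chosen metastasis-positive case than to a randomly chosen negative one. A value of 1.0 means perfect ranking and 0.5 means chance. The following is a minimal sketch of that pairwise interpretation, not the study's evaluation code. The function name `auc` and the toy scores are assumptions made up for illustration and are not data from CAMELYON16.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed by pairwise comparison.

    scores: suspicion scores, higher meaning "more likely metastasis".
    labels: 1 for a positive case, 0 for a negative case.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie: half credit
    return wins / (len(pos) * len(neg))


# Hypothetical toy data for illustration only (NOT from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = auc(toy_scores, toy_labels)
print(f"toy AUC = {toy_auc:.3f}")
```

In this toy set, 8 of the 9 positive-negative pairs are ranked correctly, so the AUC is 8/9. The pairwise loop is chosen for readability; for larger datasets a sort-based routine such as scikit-learn's `roc_auc_score` would be the usual choice.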

And, they noted, among those working close to the 2-hour limit for average workflow, “even the best-performing pathologist on the panel missed more than 37% of the cases with only micrometastases.”