It may even compound the problem by perpetuating a study’s biases into its final report, giving it a patina of objectivity it lacks in reality.

Watson’s greatest and most impressive advantage is its ability to swiftly analyze the vast corpus of oncology literature. Physicians and patients alike dream of a computing system that can aggregate information about every cancer that has been sampled and, by accessing the world’s databases and analyzing outcome data, deliver new and life-saving insights into novel therapies for any given patient.

Watson is not, however, the realization of this dream: it does not crunch the raw data to arrive at new conclusions. Watson can only search what has already been studied and bring it to bear on a posed question, establishing statistical relationships between fragments of information based on available literature. Because it cannot evaluate the raw data behind the literature, it cannot arrive at novel conclusions about, say, a specific genetic mutation; it is a sophisticated but superficial search engine.

Suppose we send a blood sample of a patient with chronic lymphocytic leukemia (CLL) for analysis with next-generation sequencing, a parallel sequencing approach that is routinely employed. This provides a tremendous amount of molecular profiling data, creating millions of snippets of sequenced DNA that are then recomposed by computerized algorithms.

Of the many genetic alterations identified with this technology, there will be a handful associated with specific therapeutic considerations. Let’s say that 1 in particular, a mutation in the CARD11 gene, is associated with resistance to ibrutinib, a Bruton’s tyrosine kinase inhibitor that is the standard of care for relapsed or refractory CLL with 11q deletion. Because CARD11 mutations have been studied extensively only in large B-cell lymphoma, not in CLL, Watson’s algorithms may fail to identify that the patient’s leukemia would be resistant to ibrutinib, and that would be a mistake.

It is possible that Watson’s impressive array of algorithmic systems can be adjusted to analyze the unfiltered data in MSKCC’s and the Broad Institute’s databases. Its most reliable service, however, may simply be to identify promising references to be reviewed by the researcher and clinician.

Systems of artificial intelligence, like Watson, have the ability to learn from their past analyses. They can correct prior misalignments to optimize found matches and maximize statistical relevance. Training these machine learning models will, however, require a large amount of input and constant feedback from researchers and physicians. It remains to be seen how patient data will be used for this purpose.


  1. Ferrucci DA. Introduction to “This is Watson”. IBM J Res Dev. 2012;56:1-15. doi: 10.1147/JRD.2012.2184356
  2. Thompson C. What is I.B.M.’s Watson? The New York Times Magazine website. Updated June 16, 2010. Accessed November 2016.
  3. Wakeman N. IBM’s Watson heads to medical school. Washington Technology website. Updated February 17, 2011. Accessed November 2016.
  4. IBM Watson for Genomics. IBM website. Accessed November 2016.