Chinese scientists and clinicians have developed a learning artificial intelligence system which can diagnose and identify cancerous prostate samples as accurately as any pathologist. This holds out the possibility of streamlining and eliminating variation in the process of cancer diagnosis. It may also help overcome any local shortage of trained pathologists. In the longer term it may lead to automated or partially-automated prostate cancer diagnosis.
Prostate cancer is the most common male cancer, with around 1.1m diagnoses every year worldwide [1] (for comparison, that’s around four times the number of men who live in Copenhagen). Confirmation of the diagnosis normally requires a biopsy sample, which is then examined by a pathologist. Now an artificial intelligence learning system, presented at the European Association of Urology congress in Copenhagen, has shown similar levels of accuracy to a human pathologist. In addition, the software can accurately classify the level of malignancy of the cancer, so eliminating the variability which can creep into human diagnosis.
“This is not going to replace a human pathologist,” said research leader Hongqian Guo (Nanjing, China). “We still need an experienced pathologist to take responsibility for the final diagnosis. What it will do is help pathologists make better, faster diagnoses, as well as eliminating the day-to-day variation in judgement which can creep into human evaluations”.
Prof. Guo’s group took 918 prostate whole mount pathology section samples from 283 patients, and ran these through the analysis system, with the software gradually learning and improving diagnosis. These pathology images were subdivided into 40,000 smaller samples; 30,000 of these samples were used to ‘train’ the software, and the remaining 10,000 were used to test accuracy. The results showed an accurate diagnosis in 99.38% of cases (using a human pathologist as a ‘gold standard’), which is effectively as accurate as the human pathologist. They were also able to identify different Gleason Grades in the pathology sections using AI; ten whole mount prostate pathology sections have been tested so far, with similar Gleason Grades in the AI’s and the human pathologist’s diagnoses. The group has not started testing the system with human patients.
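To put the figures above in concrete terms, the short Python sketch below works through the arithmetic. It uses only the numbers quoted in this release; the variable names are illustrative, and the counts of agreeing and disagreeing samples are implied by the rounded 99.38% figure rather than reported by the group. It assumes the percentage applies to all 10,000 test samples.

```python
# Figures quoted in the release (nothing here is measured independently).
sections = 918             # whole mount pathology sections
patients = 283
total_samples = 40_000     # smaller image samples cut from the sections
train_samples = 30_000     # used to 'train' the software
test_samples = 10_000      # held back to test accuracy
reported_accuracy = 0.9938 # agreement with the human pathologist

# Share of samples used for training versus testing.
train_fraction = train_samples / total_samples
test_fraction = test_samples / total_samples

# Average number of smaller samples per section (an average only;
# the release does not say how samples were distributed).
samples_per_section = total_samples / sections

# Counts implied by the rounded percentage, assuming it covers
# every one of the test samples.
implied_agreements = round(test_samples * reported_accuracy)
implied_disagreements = test_samples - implied_agreements

print(f"Training share: {train_fraction:.0%}, test share: {test_fraction:.0%}")
print(f"Average samples per section: {samples_per_section:.1f}")
print(f"Implied agreements with pathologist: {implied_agreements}")
print(f"Implied disagreements: {implied_disagreements}")
```

On these assumptions, the software and the pathologist would have disagreed on roughly 62 of the 10,000 test samples.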
Prof. Guo continued: “The system was programmed to learn and gradually improve how it interpreted the samples. Our results show that the diagnosis the AI reported was at a level comparable to that of a pathologist. Furthermore, it could accurately classify the malignant levels of prostate cancer. Until now, automated systems have had limited clinical value, but we believe this is the first automated work to offer accurate reporting and diagnosis of prostate cancer. In the short term, this can offer a faster throughput, plus a greater consistency in cancer diagnosis from pathologist to pathologist, hospital to hospital, country to country.
Artificial intelligence is advancing at an amazing rate – you only need to look at facial recognition on smartphones, or driverless cars. It is important that cancer detection and diagnosis takes advantage of these changes”.
Commenting, Professor Rodolfo Montironi (Professor of Pathology, Polytechnic University of the Marche, Ancona, Italy) said:
“This is interesting work which shows how artificial intelligence will increasingly step into clinical practice. This may be very useful in some areas where there is a lack of trained pathologists. Like all automation, this will lead to a lesser reliance on human expertise, but we need to ensure that the final decisions on treatment stay with a trained pathologist. The really important thing, though, is that we ensure the highest standard of patient care. The future will be interesting”. Professor Montironi was not involved in this work; this is an independent comment.
The software was developed in conjunction with Nanjing Innovative Data Technologies, Inc (they were not involved in funding this work, see notes for funding details). The newness of the system means that there is no information yet on costs or on implementation.
The authors note some limitations to the work. There were more samples of Gleason Grades 3 and 4 than of other grades, which may have influenced the AI calculation to some extent. They are also looking for suitably objective standards to allow direct comparison of Gleason Grade with the AI.