Scientists say they have developed a machine learning method that can provide near-precise survival estimates for patients with prostate adenocarcinoma, by far the most common type of prostate cancer.
Their findings, published in the journal Computers in Biology and Medicine, are the outcome of applying and investigating eight ensemble methods for predicting overall survival in prostate adenocarcinoma patients.
The ensemble models the scientists deploy include Random Forest (RF), AdaBoost, Gradient Boosting (GB), Extreme Gradient Boosting (XGB), LightGBM (LGBM), CatBoost, Hard Voting Classifier (HVC), and Support Vector Classifier (SVC).
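To illustrate one of the listed models, a hard voting classifier simply takes the majority label among the predictions of several base models. The following is a minimal sketch in plain Python, not the study's code; the function name `hard_vote`, the tie-breaking rule, and the toy predictions are assumptions made for illustration.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine class labels from several models by majority vote.

    predictions_per_model: one list of labels per model, all the same length.
    Ties go to the smallest label (an arbitrary choice for this sketch).
    """
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("every model must predict the same number of samples")

    voted = []
    for sample_votes in zip(*predictions_per_model):
        counts = Counter(sample_votes)
        top = max(counts.values())
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


# Invented predictions from three hypothetical models (1 = survived, 0 = did not).
model_a = [1, 0, 1, 1]
model_b = [1, 1, 0, 1]
model_c = [0, 0, 1, 1]
print(hard_vote([model_a, model_b, model_c]))  # [1, 0, 1, 1]
```

Each model gets one vote per patient, so a single model's mistake is overruled whenever the other two agree.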
Their data set is derived from the Cancer Genome Atlas (TCGA) PanCancer Atlas.
The study, led by scientists from the University of Sharjah in the United Arab Emirates (UAE) as well as Turkey's Near East University, evaluates the models "using essential performance indicators, such as accuracy, precision, recall, F-1 score, and ROC AUC score."
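For readers unfamiliar with these indicators, the sketch below shows how each can be computed for a binary outcome in plain Python. It is an illustration only: the function names and the toy labels and scores are invented, and the study's own evaluation code is not described in the source.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """ROC AUC as the chance that a random positive outscores a random negative.

    Tied scores count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: true outcomes, hard predictions and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.95]

print(classification_metrics(y_true, y_pred))  # 6 of 8 correct, so accuracy is 0.75
print(roc_auc(y_true, scores))
```

Accuracy, precision, recall and F1 are computed from hard class predictions, while ROC AUC is computed from the model's scores, which is why a model can reach 1.0 on the first four and still fall short of 1.0 on the last.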
Out of the eight ensemble methods they employ to assess prostate cancer survival predictions, the scientists find Gradient Boosting (GB), a machine learning technique, to have "outperformed other models by obtaining a perfect score of 1.0 in accuracy, precision, recall, and F-1 score, and 0.99 as ROC AUC."
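Gradient boosting builds its prediction by adding many weak learners in sequence, each one fitted to the errors left by those before it. The following is a toy sketch in plain Python using a single feature and one-split "stumps"; the data, function names and settings are invented for illustration and it is not the implementation used in the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, residuals):
    """Find the one-split rule that best fits the residuals (least squares)."""
    values = sorted(set(xs))
    if len(values) < 2:
        raise ValueError("need at least two distinct feature values")
    best = None
    for a, b in zip(values, values[1:]):
        thr = (a + b) / 2
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, thr, left_mean, right_mean)
    return best[1], best[2], best[3]


def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.5):
    """Boost stumps on the log-loss gradient (y - p) for binary labels."""
    p0 = sum(ys) / len(ys)           # assumes both classes are present
    base = math.log(p0 / (1 - p0))   # start from the overall log-odds
    scores = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - sigmoid(s) for y, s in zip(ys, scores)]
        thr, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((thr, left_mean, right_mean))
        scores = [s + learning_rate * (left_mean if x <= thr else right_mean)
                  for x, s in zip(xs, scores)]
    return base, stumps, learning_rate


def predict_proba(model, x):
    base, stumps, learning_rate = model
    score = base + sum(learning_rate * (lm if x <= thr else rm)
                       for thr, lm, rm in stumps)
    return sigmoid(score)


# Invented data: the positive class sits in the middle of the feature range,
# so no single stump can separate it, but a sum of stumps can.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 0, 0, 1, 1, 1, 0, 0, 0]
model = fit_gradient_boosting(xs, ys)
print([1 if predict_proba(model, x) >= 0.5 else 0 for x in xs])
```

Production libraries fit deeper trees on many features and add regularisation, but the loop of "compute residuals, fit a weak learner, add a scaled correction" is the same idea.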
Other AI models the scientists find useful for predicting overall survival in prostate cancer patients include RF and AdaBoost, which are also among the machine learning techniques scientists rely on for predicting cancer. Both, the scientists claim, have shown robust efficiency, suggesting potential for predicting the overall survival rate of patients with prostate adenocarcinoma.
"The outstanding performances of GB are suggestive that it is an ensemble model, highly capable of predicting PAC (Prostate adenocarcinoma), because it identifies all true positive cases, and can minimize the negative cases as well as can be clinically integrated," the study authors write. "RF performances showed its ability to distinguish between positive and negative cases of PAC highlighting its high level of accuracy, especially in predicting the presence of PAC."
The scientists describe prostate adenocarcinoma as "a complex and common cancer in males and is one of the leading causes of cancer-related death globally."
Positioned beneath the urinary bladder, encircling the urethra and anterior to the rectum, the prostate is an integral component of the male anatomy. It is about the size of a walnut and weighs 11 grams on average.
Developing in the gland cells, prostate adenocarcinoma accounts for up to 99% of all prostate cancer diagnoses. It is reported to be the second most common cancer in men after skin cancer, with its risk increasing with age. In the United States alone, more than 3.3 million men have been diagnosed with prostate cancer, and about 1 in 44 men dies of the disease.
But if the disease is diagnosed early, there is a high chance of successful treatment. The study is significant for treatment because predicting the overall survival rate of individuals with prostate cancer, according to the scientists, has been "a substantial clinical barrier due to the diverse nature of the illness, coexisting medical conditions, and constraints associated with conventional diagnostic markers." This difficulty drives the scientists to turn to machine learning techniques.
The ensemble models, if incorporated into the clinical workflow, will be of great benefit to decision-makers and urologists, according to co-author Dr. Dilber Ozsahin, associate professor at the University of Sharjah's College of Health Sciences.
"The study presents an effective way of using ensemble models, particularly Gradient Boosting (GB), which may be effortlessly included into clinical processes. Therefore, when the result or ensemble models are integrated into the clinical workflow, the urologist, other physicians and decision makers [can perform] diagnosis with [confidence]."
Gradient Boosting or GB, according to the scientists, "demonstrated exceptional predictive accuracy, and precision, an F-1 Score of 1.0 across various measures, and an ROC-AUC value of 0.99 ... the ensemble model was able to predict 70.6 % OS of PAC patients and 29.4 % not survive.
"This will be of great benefit to the decision-makers and urologists. Consequently, in the future, studies should focus on the use of more extensive datasets and apply and implement the results in clinical settings to improve the authenticity of the study. Also, additional variables such as lifestyle and newer biomarkers could be integrated for future studies," the scientists write.
While the authors regard their findings as important for demonstrating how ensemble techniques can improve the precision of overall survival predictions for prostate cancer patients, they also underscore the need for further research, particularly in clinical settings.