2019.06
Prediction of hepatocellular carcinoma patient survival using machine learning classification rules

Authors: Wei Zhou, Huan Chen, Wenbo Han, Ji He, Henghui Zhang


Background: The prognosis of hepatocellular carcinoma (HCC) is conventionally assessed by evaluating tissue samples obtained during surgical removal of the primary tumor, focusing on their clinical and pathologic features. Accumulating evidence suggests that cancer development is extensively modulated by the host's immune system, underscoring the importance of immunological biomarkers for predicting HCC prognosis. However, an integrated predictive algorithm incorporating both clinical characteristics and immune features remains to be established.


Methods: We obtained resectable stage II HCC specimens, along with adjacent para-tumor tissues, from 221 patients who underwent surgical resection at Eastern Hepatobiliary Surgery Hospital (Shanghai, China) from 2015 through April 2018. Characteristics such as CD8+, CD163+, and tumor-infiltrating lymphocytes (TILs) were obtained to build models predicting the status of three survival indexes: overall survival (OS, ≤ 24 or > 24 months), progression-free survival (PFS, ≤ 6 or > 6 months), and recurrence/death (RD). After data cleaning and standardization, the mutual information and coefficient between each feature and the survival indexes were computed to remove low-scoring features. Recursive feature selection was then performed to obtain the optimal feature combination. Finally, supervised learning techniques using either a boosting or a bagging strategy were used to fit the models and make predictions, with a grid search optimizing the parameters. To evaluate the models, cross-validation with a randomly selected test cohort comprising 20% of the samples was repeated 10 times.
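To make the filtering and evaluation steps concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes discrete (already binned) feature values and binary outcome labels; the function names `binarize_os`, `mutual_information`, and `repeated_holdout` are hypothetical and chosen for illustration only. It shows how an OS label could be derived from the 24-month cutoff, how a mutual-information score between one feature and one outcome could be computed, and how 10 random 80/20 splits could be generated.

```python
import math
import random
from collections import Counter


def binarize_os(months, cutoff=24):
    """Label 1 if overall survival exceeds the cutoff (in months), else 0."""
    return [1 if m > cutoff else 0 for m in months]


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences.

    Assumes both sequences are categorical; continuous features would
    need binning first (an assumption of this sketch).
    """
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def repeated_holdout(n_samples, test_fraction=0.2, repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for `repeats` random splits."""
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield sorted(idx[n_test:]), sorted(idx[:n_test])
```

A feature whose score against every survival index is close to zero would be a candidate for removal; the threshold for "low scoring" is not stated in the abstract, so any cutoff used with this sketch is an assumption. With 221 samples, each split produced by `repeated_holdout` holds out 44 patients and trains on the remaining 177.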


Results: Using the 221-patient cohort, we selected 15 of the 46 candidate biomarkers as features for survival status prediction. The 10 most important biomarkers included both clinical and immune attributes. The AUC of our model for the survival indexes (OS, PFS, RD) ranged from 0.76 (RD) to 0.8 (PFS), and the accuracy was above 0.85.
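For readers less familiar with the two reported metrics, the sketch below shows one common way to compute them, using only the Python standard library. It is illustrative rather than the authors' code: the function names `roc_auc` and `accuracy`, the 0.5 decision threshold, and the toy labels and scores in the usage lines are all assumptions. AUC is computed here as the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as half.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count pairwise wins; a tie contributes 0.5
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def accuracy(y_true, scores, threshold=0.5):
    """Fraction of cases classified correctly at a fixed score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)


# Toy data, made up for illustration only (not from the study)
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc_value = roc_auc(labels, scores)
acc_value = accuracy(labels, scores)
```

AUC does not depend on a decision threshold, whereas accuracy depends on both the threshold and the class balance, which is why the two figures can differ noticeably on the same model.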


Conclusions: We describe an integrative analysis of the clinical and immune features that collectively contribute to the survival indexes of HCC. Machine learning techniques such as gradient boosting and random forest classifiers show great promise for HCC survival prediction.