Development and Validation of a Machine Learning Model for Predicting the Prognosis of Elderly Gastric Cancer Patients: A Multi-Center Study in China
Authors
Xing-Qi Zhang, MD1,2,#, Ze-Ning Huang, MD1,2,#, Ju Wu, MD1,2, Chang-Yue Zheng, MD1,2, Xiao-Dong Liu, MD3, Yan-Bing Zhou, MD, PhD3, Ying-Qi Huang, MD1,2, Jian-Xian Lin, MD, PhD1,2, Qi-Yue Chen, PhD1,2, Ping Li, MD, PhD1,2, Jian-Wei Xie, MD, PhD1,2, Chao-Hui Zheng, MD, PhD1,2, Chang-Ming Huang, MD, FACS1,2*
# Zhang XQ and Huang ZN contributed equally to this work and should be considered co-first authors.
Affiliations
- Department of Gastric Surgery, Fujian Medical University Union Hospital, Fuzhou, Fujian Province, China.
- Key Laboratory of Ministry of Education of Gastrointestinal Cancer, Fujian Medical University, 350108 Fuzhou, Fujian Province, China.
- Department of General Surgery, Affiliated Hospital of Qingdao University, Qingdao, Shandong Province, China.
Corresponding Author
Chang-Ming Huang (hcmlr2002@163.com)
Background
Despite numerous prognostic methods for gastric cancer, accurate tools for elderly patients aged 75 years and older remain scarce.
Methods
This multicenter retrospective study analyzed data from elderly patients who underwent radical gastrectomy at nine tertiary medical centers from 2009 to 2018. Patients were randomly assigned to training and testing groups. Core variables were selected using Random Survival Forest (RSF) methods, and both RSF and Cox Proportional Hazards (CPH) models were used to predict overall survival (OS) and disease-free survival (DFS). Model discrimination and calibration were assessed, and the models were validated using bootstrap resampling.
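As an illustration of how discrimination can be assessed with bootstrap resampling, the following is a minimal sketch in Python. It is not the study's analysis code: it computes Harrell's concordance index for right-censored survival data and a percentile bootstrap interval, and the function names, the toy follow-up data, and the choice of 200 resamples are all assumptions made for this example.

```python
import random


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. The pair is concordant
    when patient i also has the higher predicted risk; ties in predicted
    risk count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_c_index(times, events, risk_scores, n_boot=200, seed=0):
    """Percentile bootstrap 95% interval for the C-index (illustrative)."""
    rng = random.Random(seed)
    n = len(times)
    estimates = []
    for _ in range(n_boot):
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(
                concordance_index(
                    [times[k] for k in idx],
                    [events[k] for k in idx],
                    [risk_scores[k] for k in idx],
                )
            )
        except ValueError:
            continue  # skip resamples that contain no comparable pairs
    if not estimates:
        raise ValueError("no usable bootstrap resamples")
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[min(len(estimates) - 1, int(0.975 * len(estimates)))]
    return lower, upper


# Hypothetical toy data: follow-up in months, event flag (1 = death,
# 0 = censored), and a model's predicted risk score for each patient.
toy_times = [5, 12, 20, 33, 41, 60, 18, 27]
toy_events = [1, 1, 0, 1, 0, 0, 1, 1]
toy_risk = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1, 0.5, 0.8]

toy_c = concordance_index(toy_times, toy_events, toy_risk)
toy_interval = bootstrap_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk. In practice a survival-analysis library would typically be used for this; the pure-Python version is shown only to make the definition explicit.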
Results
Among 16,344 patients, 1,202 were included in the final analysis. The 5-year OS and DFS rates were 49.93% and 49.48%, respectively. The OS prediction model encompassed nine core variables, with the RSF model demonstrating higher concordance (CiD: 0.719) than the CPH model (CiD: 0.704). The DFS model had seven core variables, and the two modeling approaches showed similar discriminative ability. At 60 months, the tAUC for OS prediction was 0.722 for the RSF model and 0.775 for the CPH model. Both models showed good calibration. An online prediction tool based on the RSF model was also developed.
Conclusion
This study developed an RSF model with excellent discriminative and calibration abilities for predicting long-term survival after radical gastrectomy in elderly patients with gastric cancer, which may facilitate individualized treatment and follow-up strategies.
Keywords
Elderly; Gastric cancer; Machine learning; Overall survival
