Abstract
It is critical for healthcare providers to accurately determine lung cancer patients' prognoses and develop customized treatment plans. However, lung cancer has proven to be a complex disease, and every patient responds differently to treatment options, making survivability predictions highly challenging. This study proposes a holistic machine learning model that can assist healthcare providers in predicting the temporal effects of lung cancer-related factors on one-, five-, and ten-year survival rates. Variable selection algorithms such as the genetic algorithm (GA) and Boruta are employed along with data balancing methods to achieve parsimonious models for survival prediction. Classification results are obtained through logistic regression and extreme gradient boosting algorithms, followed by an information fusion technique that combines the classification results and identifies the temporal effects of lung cancer variables. Results demonstrate that the predictive power of the classification models improved as the survival period increased. The models trained using the GA and intersection variable sets generated better average prediction scores. The study contributes to the cancer literature by analyzing how the impacts of lung cancer variables change across survival periods. Medical professionals can use these findings to better understand the longitudinal characteristics of lung cancer patients’ survival indicators.
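The abstract mentions an "intersection" variable set and an information fusion step without giving details. The sketch below is one possible reading, not the study's implementation: it assumes the intersection set is the set of variables chosen by both selection methods, and it assumes fusion is a weighted average of per-model variable importance scores. The variable names, scores, weights, and the `fuse_importance` helper are all hypothetical.

```python
# Hypothetical variable sets; the names are illustrative, not taken from the study.
ga_selected = {"age", "stage", "tumor_size", "surgery"}
boruta_selected = {"age", "stage", "grade", "surgery"}

# Assumed meaning of the "intersection" set: variables chosen by both methods.
intersection = ga_selected & boruta_selected


def fuse_importance(importances, weights):
    """Weighted average of per-model variable importance scores.

    importances: dict mapping model name -> {variable: importance}
    weights: dict mapping model name -> non-negative weight
             (for example, a validation score; an assumption here)
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    variables = set().union(*(scores.keys() for scores in importances.values()))
    return {
        var: sum(weights[m] * importances[m].get(var, 0.0) for m in importances) / total
        for var in variables
    }


# Made-up importance scores and weights, for illustration only.
importances = {
    "logistic_regression": {"age": 0.2, "stage": 0.5, "surgery": 0.3},
    "xgboost": {"age": 0.4, "stage": 0.4, "surgery": 0.2},
}
weights = {"logistic_regression": 1.0, "xgboost": 3.0}

fused = fuse_importance(importances, weights)
```

Repeating such a fusion separately for the one-, five-, and ten-year models would give one fused score per variable per horizon, which is one way the change in a variable's influence over time could be compared.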
Original language | English |
---|---|
Article number | 100263 |
Journal | Healthcare Analytics |
Volume | 4 |
DOIs | |
State | Published - Dec 2023 |
Keywords
- Data balancing
- Feature selection
- Lung cancer
- Machine learning
- Predictive analytics
- Survival analysis