Lung Cancer Prediction and Classification Using Decision Tree and VGG16 Convolutional Neural Networks

The Open Biomedical Engineering Journal 22 Apr 2024 RESEARCH ARTICLE DOI: 10.2174/0118741207290271240322061032



Lung Cancer is a malignant abnormal growth that starts in the tissues of the lungs. It ranks among the most common and lethal cancers globally, and it is particularly dangerous because of its aggressive nature and how quickly it can spread to other parts of the body. We propose a two-step verification architecture to check for the presence of Lung Cancer. The proposed model first assesses the patient using a few questions about the patient's symptoms and medical background. A “Decision Tree” classifier then analyzes the responses to determine whether the patient has a low, medium, or high risk of developing Lung Cancer, with an accuracy of 99.67%. If the patient has a medium or high risk, we further validate the finding by examining the patient's CT scan image using the “VGG16” CNN model, which achieves an accuracy of 92.53%.


One key area of research on Lung Cancer prediction identifies at-risk patients from their symptoms and medical history. Because this information is subjective, the approach is challenging to apply in real-world scenarios. Another research area forecasts the presence of cancer cells from CT scan imagery, which provides high accuracy. However, it requires physician intervention and is not appropriate for early-stage prediction.


This research aims to forecast the severity of Lung Cancer by analyzing the patient's answers to a few questions about symptoms and past medical conditions. If the patient has a medium or high risk, we further examine their CT scan to validate the result and predict the type of Lung Cancer.


This paper uses the “Decision Tree” algorithm and a customized “VGG16” CNN model for the implementation. The “Decision Tree” algorithm analyzes the answers given by the patient to determine the severity of Lung Cancer. A Convolutional Neural Network based on the customized “VGG16” model then examines the patient's CT scan image to validate the result and categorize the type of Lung Cancer.
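
The control flow of the two-step check can be illustrated with a minimal Python sketch. It shows only the gating logic described above: the CT stage runs only for medium- or high-risk patients. The names `screen_patient`, `Screening`, `toy_risk_model`, and `toy_ct_model` are hypothetical, and both toy models are placeholders standing in for the trained “Decision Tree” and customized “VGG16” models, whose features, thresholds, and class labels are not given here.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence

RISK_LEVELS = ("low", "medium", "high")


@dataclass
class Screening:
    risk: str                   # "low", "medium", or "high"
    cancer_type: Optional[str]  # None when the CT stage is skipped


def screen_patient(
    answers: Sequence[int],
    ct_image: Any,
    risk_model: Callable[[Sequence[int]], str],
    ct_model: Callable[[Any], str],
) -> Screening:
    """Stage 1: questionnaire risk level. Stage 2: CT check if risk is not low."""
    risk = risk_model(answers)
    if risk not in RISK_LEVELS:
        raise ValueError(f"unexpected risk level: {risk!r}")
    if risk == "low":
        # Low-risk patients are not sent to the CT stage.
        return Screening(risk=risk, cancer_type=None)
    # Medium or high risk: validate with the CT image classifier.
    return Screening(risk=risk, cancer_type=ct_model(ct_image))


def toy_risk_model(answers: Sequence[int]) -> str:
    """Placeholder (assumption): thresholds a symptom score; NOT the trained tree."""
    total = sum(answers)
    if total < 3:
        return "low"
    if total < 6:
        return "medium"
    return "high"


def toy_ct_model(ct_image: Any) -> str:
    """Placeholder (assumption): a real model would classify the CT image."""
    return "placeholder-type"
```

For example, `screen_patient([0, 0, 0], None, toy_risk_model, toy_ct_model)` stops after the first stage, while a higher symptom score forwards the image to the second stage. Passing the models in as callables keeps the gating logic independent of how either classifier is trained.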


The “Decision Tree” approach for forecasting the severity of Lung Cancer yields an accuracy of 99.67%. The customized “VGG16” CNN model identifies the type of Lung Cancer suffered by the patient with an accuracy of 92.53%.
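
As a reading aid for these figures, the sketch below assumes the usual definition of classification accuracy: the fraction of test samples whose predicted label matches the true label. The function name `accuracy` and the labels in the usage note are illustrative only and are not taken from the study's data.

```python
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions equal to the true labels (assumed definition)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("accuracy is undefined for an empty sample")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

With made-up labels, `accuracy(["low", "high", "medium", "low"], ["low", "high", "low", "low"])` gives 0.75, since three of the four predictions match.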


This research indicates that our technique provides greater accuracy than prior approaches to this problem and can be widely applied in the prognosis of Lung Cancer.

Keywords: Lung cancer, Decision tree classification, Convolutional neural networks, VGG16, Machine learning, Deep learning, Image processing.