Hybrid Deep Learning Framework for Histopathology Image Classification of Lung and Colon Cancers Using ResNet18, ViT, GCN, and ViT+GAT

Tanvi Dhole; Suprabha Devane; Sanchi Jadhav; Trupti Jadhav; Prachi Pramod Waghmare

doi:10.14445/22312803/IJCTT-V74I5P101

Research Article | Open Access

Volume 74 | Issue 5 | Year 2026 | Article ID IJCTT-V74I5P101 | DOI: https://doi.org/10.14445/22312803/IJCTT-V74I5P101

Received: 15 Mar 2026 | Revised: 20 Apr 2026 | Accepted: 11 May 2026 | Published: 28 May 2026

Citation:

Tanvi Dhole, Suprabha Devane, Sanchi Jadhav, Trupti Jadhav, Prachi Pramod Waghmare, "Hybrid Deep Learning Framework for Histopathology Image Classification of Lung and Colon Cancers Using ResNet18, ViT, GCN, and ViT+GAT," International Journal of Computer Trends and Technology (IJCTT), vol. 74, no. 5, pp. 1-9, 2026. Crossref, https://doi.org/10.14445/22312803/IJCTT-V74I5P101

Abstract

Cancer causes a large number of deaths worldwide every year. To diagnose cancer correctly, doctors must examine tissue images carefully, but manual examination is time-consuming, and different doctors can reach different conclusions from the same image. This study presents a hybrid deep learning framework that combines ResNet18, Vision Transformer (ViT), Graph Convolutional Network (GCN), and Graph Attention Network (GAT) to classify lung and colon cancer histopathology images more accurately. ResNet18 captures detailed local features from the images, while ViT captures global context by modeling how different parts of an image relate to each other. GCN and GAT further model and refine the structural relationships between features. The novelty of this work lies in integrating convolutional, transformer-based, and graph-based learning within a single framework to jointly capture local, global, and relational information for histopathological image classification. Experimental evaluation on the LC25000 dataset shows that the proposed ViT+GAT architecture achieves higher classification accuracy and better generalization than standalone ResNet18, ViT, and GCN models. These results indicate that the proposed approach can support more reliable and efficient automated cancer diagnosis in computational pathology.
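
As a rough illustration of the relational step described above, the following NumPy sketch applies one single-head graph attention layer to a set of patch embeddings. It is not the authors' implementation: the abstract does not state how the graph is built, so the 4-neighbour patch grid, the random stand-in tokens, the layer sizes, the mean pooling, and the function names `grid_adjacency` and `gat_layer` are all assumptions made for this example. The five output classes match the five tissue classes in LC25000.

```python
import numpy as np

def grid_adjacency(rows, cols):
    """4-neighbour adjacency with self-loops for a rows x cols patch grid.
    (Assumed graph construction; the abstract does not specify one.)"""
    n = rows * cols
    adj = np.eye(n, dtype=bool)
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:               # right neighbour
                adj[i, i + 1] = adj[i + 1, i] = True
            if r + 1 < rows:               # bottom neighbour
                adj[i, i + cols] = adj[i + cols, i] = True
    return adj

def gat_layer(h, adj, W, a, slope=0.2):
    """Single-head graph attention layer.
    h: (n, d_in) node features, adj: (n, n) boolean mask,
    W: (d_in, d_out) projection, a: (2 * d_out,) attention vector."""
    z = h @ W
    d = z.shape[1]
    # a^T [z_i || z_j] splits into a source term plus a destination term.
    e = (z @ a[:d])[:, None] + (z @ a[d:])[None, :]
    e = np.where(e > 0, e, slope * e)      # LeakyReLU
    e = np.where(adj, e, -np.inf)          # attend only to graph neighbours
    e = e - e.max(axis=1, keepdims=True)   # numerically stable softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    out = alpha @ z                        # weighted neighbour aggregation
    out = np.where(out > 0, out, np.expm1(np.minimum(out, 0.0)))  # ELU
    return out, alpha

rng = np.random.default_rng(0)
rows, cols, d_in, d_out, n_classes = 4, 4, 8, 4, 5

# Random stand-ins for ViT patch embeddings (hypothetical, not real features).
tokens = rng.normal(size=(rows * cols, d_in))
adj = grid_adjacency(rows, cols)
W = rng.normal(size=(d_in, d_out)) * 0.1
a = rng.normal(size=(2 * d_out,)) * 0.1
W_cls = rng.normal(size=(d_out, n_classes)) * 0.1

node_out, alpha = gat_layer(tokens, adj, W, a)
graph_embedding = node_out.mean(axis=0)    # mean-pool nodes into one vector
logits = graph_embedding @ W_cls           # one score per tissue class
```

Each row of `alpha` sums to one and is zero outside the node's grid neighbourhood, so every patch aggregates information only from itself and its adjacent patches. A trained model would learn `W`, `a`, and `W_cls` rather than sample them at random.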

Keywords

Lung Cancer, Colon Cancer, Vision Transformer (ViT), Graph Convolutional Network (GCN), Graph Attention Network (GAT), Histopathology Classification.
