QCovSML: A reliable COVID-19 detection system using CBC biomarkers by a stacking machine learning model

Tawsifur Rahman, Amith Khandakar, Farhan Fuad Abir, Md Ahasan Atick Faisal, Md Shafayet Hossain, Kanchon Kanti Podder, Tariq O. Abbas, Mohammed Fasihul Alam, Saad Bin Kashem, Mohammad Tariqul Islam, Susu M. Zughaier, Muhammad E.H. Chowdhury* (Corresponding Author)

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

The reverse transcription-polymerase chain reaction (RT-PCR) test is considered the current gold standard for the detection of coronavirus disease (COVID-19), although it suffers from several shortcomings, namely a comparatively long turnaround time, false-negative rates of around 20–25%, and costly equipment. Finding an efficient, robust, accurate, and widely accessible alternative to RT-PCR for COVID-19 diagnosis is therefore a matter of utmost importance. This study proposes a COVID-19 detection system based on complete blood count (CBC) biomarkers and a stacking machine learning (SML) model, which could be a faster and less expensive alternative. The study used seven publicly available datasets. The largest, consisting of fifteen CBC biomarkers collected from 1624 patients (52% COVID-19 positive) admitted to San Raphael Hospital, Italy, from February to May 2020, was used to train and validate the proposed model. White blood cell count, monocytes (%), lymphocytes (%), and age, all recorded at hospital admission, were identified as important predictors of COVID-19 by five different feature selection techniques. The stacking model produced the best performance, with weighted precision, sensitivity, specificity, overall accuracy, and F1-score of 91.44%, 91.44%, 91.44%, 91.45%, and 91.45%, respectively, improving on the other state-of-the-art machine learning classifiers evaluated. Finally, a nomogram-based scoring system (QCovSML) was constructed using this stacking approach to identify COVID-19 patients. The cut-off value of the QCovSML system for classifying COVID-19 and non-COVID patients was 4.8. Six datasets from three different countries were used to externally validate the proposed model and evaluate its generalizability and robustness. The nomogram demonstrated good calibration and discrimination, with an area under the curve (AUC) of 0.961 for the internal cohort and an average AUC of 0.967 across the external validation cohorts. The external validation shows an average weighted precision, sensitivity, F1-score, specificity, and overall accuracy of 92.02%, 95.59%, 93.73%, 90.54%, and 93.34%, respectively.
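
As a rough illustration of the stacking idea and the weighted metrics reported above, the following minimal sketch uses scikit-learn on synthetic data. It is not the authors' pipeline: the base learners (random forest, gradient boosting), the logistic-regression meta-learner, the synthetic 1624 × 15 feature matrix, and the helper name `weighted_metrics` are all assumptions made for illustration, and the nomogram and its 4.8 cut-off are not reproduced.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, precision_recall_fscore_support
from sklearn.model_selection import train_test_split


def weighted_metrics(y_true, y_pred, labels=(0, 1)):
    """Support-weighted precision, sensitivity, specificity, F1 and accuracy."""
    labels = list(labels)
    precision, sensitivity, f1, support = precision_recall_fscore_support(
        y_true, y_pred, labels=labels, average=None, zero_division=0)
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    total = cm.sum()
    specificity = []
    for k in range(len(labels)):
        # Treat class k as "positive" and every other class as "negative".
        tp = cm[k, k]
        fp = cm[:, k].sum() - tp
        fn = cm[k, :].sum() - tp
        tn = total - tp - fp - fn
        specificity.append(tn / (tn + fp) if (tn + fp) else 0.0)
    weights = support / support.sum()
    return {
        "precision": float(np.dot(weights, precision)),
        "sensitivity": float(np.dot(weights, sensitivity)),
        "specificity": float(np.dot(weights, specificity)),
        "f1": float(np.dot(weights, f1)),
        "accuracy": float(np.trace(cm) / total),
    }


# Synthetic stand-in for a CBC table: 1624 rows, 15 features, ~52% positive.
# These are NOT real patient data.
X, y = make_classification(n_samples=1624, n_features=15, n_informative=6,
                           weights=[0.48, 0.52], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Hypothetical base learners; a meta-learner is fitted on their
# out-of-fold predictions (5-fold cross-validation).
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

results = weighted_metrics(y_test, stack.predict(X_test))
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

Because the meta-learner only sees out-of-fold predictions from the base learners, it can learn how to combine them without being fitted on predictions the base models made for their own training rows. The numbers printed here depend entirely on the synthetic data and say nothing about the published results.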
Original language: English
Article number: 105284
Number of pages: 12
Journal: Computers in Biology and Medicine
Volume: 143
Early online date: 15 Feb 2022
Publication status: Published - 1 Apr 2022

Bibliographical note

Funding
This work was supported by the Qatar National Research Fund (QNRF) Grant: UREP28-144-3-046. The statements made herein are solely the responsibility of the authors.

Data Availability Statement

Dataset and code availability
The dataset used for the development and validation of this study is available at [27] and the code and models are available at https://github.com/tawsifur/QCovSML-COVID-19-detection system-using-CBC-biomarkers.

Keywords

  • COVID-19
  • Detection
  • Complete blood count (CBC)
  • Stacking machine learning
  • RT-PCR
