A rapid screening classifier for diagnosing COVID-19

Rationale: Coronavirus disease 2019 (COVID-19) has caused a global pandemic. A classifier combining chest X-ray (CXR) with clinical features may serve as a rapid screening approach. Methods: The study included 512 patients with COVID-19 and 106 with influenza A/B pneumonia. A deep neural network (DNN) was applied, and deep features derived from CXR and clinical findings were fused for diagnosis prediction. Results: The clinical features of COVID-19 and influenza showed different patterns. Patients with COVID-19 experienced less fever, more diarrhea, and more salient hypercoagulability. Classifiers constructed using the clinical features or CXR had an area under the receiver operating characteristic curve (AUC) of 0.909 and 0.919, respectively. Combining the clinical features and CXR improved the diagnostic efficacy, giving an AUC of 0.952 with 91.5% sensitivity and 81.2% specificity. Moreover, the combined classifier was functional in both severe and non-severe COVID-19, with an AUC of 0.971 and 96.9% sensitivity in non-severe cases, which was on par with the computed tomography (CT)-based classifier, but it had relatively inferior efficacy in severe cases compared to CT. In addition, in a reader study involving three experienced pulmonary physicians, the artificial intelligence (AI) system demonstrated superiority in turn-around time and diagnostic accuracy. Conclusions: The classifier constructed using clinical and CXR features is efficient, economical, and radiation safe for distinguishing COVID-19 from influenza A/B pneumonia, serving as an ideal rapid screening tool during the COVID-19 pandemic.


Introduction
Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was first reported in December 2019 and has caused a global pandemic. To date, more than 61 million infected patients have been reported globally, with 1.4 million deaths.
The diagnosis of COVID-19 is challenging in many countries due to its nonspecific symptoms and variable incubation period. Clinically, patients with COVID-19 can have presentations ranging from asymptomatic to severely ill. The most common symptoms are fever, fatigue, dry cough, myalgia, and dyspnea [1][2][3][4][5], while less common symptoms include diarrhea, hemoptysis, and headaches [3][4][5]. These clinical manifestations overlap to some degree with those of other known pneumonias, especially influenza.
It is important to devise a rapid, economical, and accurate approach to identify and diagnose the population with suspected COVID-19. The reference standard procedure for confirming the diagnosis is reverse transcriptase-polymerase chain reaction (RT-PCR) [5]. However, the reported sensitivity of RT-PCR varies, and its reliability remains debated [3][4][5][6].
Recent studies have suggested that chest computed tomography (CT) can be used as a screening and diagnostic tool in epidemic areas, with a sensitivity of up to 97% [5][6][7]. However, several factors raise concerns: delayed use because CT is a limited medical resource during the COVID-19 pandemic, the financial burden, the labor involved, and the difficulty of disinfecting the device [5].
Deep learning has been widely used in medical image analysis, and the application of artificial intelligence (AI) to the diagnosis of COVID-19 has been proposed [8]. Trained on large amounts of labeled data, several deep learning systems were shown to be more accurate than human radiologists at identifying COVID-19 [9,10]. Most of these findings were based on CT information, and the performances were consistently satisfactory [11][12][13][14][15][16]. Given the restricted availability, higher radiation exposure, and complicated disinfection procedures of chest CT, alternative approaches are needed, and chest X-ray (CXR) diagnosis in COVID-19 should be examined. Compared with chest CT, CXR is a simple procedure with lower cost and radiation exposure. Several studies proposed that CXR imaging can be used for the early diagnosis of COVID-19 [17,18], but the reported accuracy was inferior to that of chest CT [19,20]. Therefore, the current study used deep learning methods to construct a classifier combining clinical features with CXR information as a simple, efficient, economical, and accurate approach to differentiate COVID-19 from influenza A/B.

Patients and Materials
This retrospective study analyzed 525 patients with nucleic acid-confirmed COVID-19 and 107 patients with nucleic acid-confirmed influenza A/B pneumonia who were admitted to Wuhan Tongji Hospital and Second Affiliated Hospital Zhejiang University School of Medicine from January 2017 to June 2020. Since not all the patients underwent CXR, we used the chest CT localizer scan as a surrogate for a standard CXR. We excluded patients with incomplete clinical data. The final cohort included 106 patients with influenza A/B epidemic viral pneumonia and 512 COVID-19 cases. We divided the cohort randomly into a training set of 290 cases (44 influenza A/B and 246 COVID-19) and a test cohort of 328 cases (62 influenza A/B and 266 COVID-19) (Figure 1A). The split was made by simple random sampling. The detailed demographic characteristics of the patients are listed in Table 1. The baseline clinical features, multi-stage chest CT and localizers, course of the disease, severe events, and interventions (drugs and supportive therapies) were collected from all patients. The study was approved by the ethics committees of the participating hospitals.
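The random division into training and test cohorts can be illustrated with a minimal sketch. It assumes simple random sampling without replacement over hypothetical case identifiers; the seed and the identifier format are illustrative and not taken from the study.

```python
import random


def random_split(case_ids, n_train, seed=0):
    """Simple random sampling without replacement into train and test lists."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers for the final cohort: 512 COVID-19 and 106 influenza.
cases = [("covid", i) for i in range(512)] + [("flu", i) for i in range(106)]
train, test = random_split(cases, n_train=290)
print(len(train), len(test))
```

With this kind of sampling the class proportions in each subset vary from draw to draw, so the per-class counts reported in the text (44/246 and 62/266) are one realization rather than something the sketch reproduces.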

Data Augmentation
Since a deep network should be trained on sufficient data and we enrolled fewer influenza cases than COVID-19 cases, data augmentation was used to produce more training cases. This process is widely used in deep learning and has proved useful for improving accuracy, especially when the number of cases is small or unbalanced. We first manually segmented the lung areas because the localizer scans can be totally unaligned. After segmentation, the image patches were resized to 256×256, and then random rotation (-15 to 15 degrees), scaling (to 0.8~1.2 of the raw size), and translation (1.0~1.1 of the raw size) were applied to augment our cases. We then cropped a 224×224 patch from each augmented patch, and Gaussian noise was randomly added to the training samples. Using the torchvision toolbox of PyTorch, the augmentation was done automatically during training, and a sampler was introduced to ensure that every batch contained equal numbers of influenza and COVID-19 cases.
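The sampling logic described here can be sketched without any imaging library. The example below is only an illustration under stated assumptions: it draws rotation, scale, and crop parameters in the quoted ranges and builds class-balanced batches using the standard library, whereas the study used torchvision and a PyTorch sampler. The function names, batch size, and seed are hypothetical, and the translation and Gaussian noise steps are omitted.

```python
import random


def sample_augmentation(rng):
    """Draw one set of augmentation parameters in the ranges quoted in the text."""
    return {
        "rotation_deg": rng.uniform(-15.0, 15.0),  # random rotation
        "scale": rng.uniform(0.8, 1.2),            # relative to the raw size
        # top-left corner of a 224x224 crop inside the 256x256 patch
        "crop_x": rng.randint(0, 256 - 224),
        "crop_y": rng.randint(0, 256 - 224),
    }


def balanced_batches(flu_ids, covid_ids, batch_size, n_batches, rng):
    """Yield batches holding equal numbers of influenza and COVID-19 cases.

    Cases are drawn with replacement, so cases from the smaller class repeat
    more often, each time with freshly drawn augmentation parameters.
    """
    half = batch_size // 2
    for _ in range(n_batches):
        batch = [("flu", rng.choice(flu_ids)) for _ in range(half)]
        batch += [("covid", rng.choice(covid_ids)) for _ in range(half)]
        rng.shuffle(batch)
        yield [(label, case, sample_augmentation(rng)) for label, case in batch]


rng = random.Random(0)
# Training-set sizes from the text: 44 influenza and 246 COVID-19 cases.
batches = list(balanced_batches(list(range(44)), list(range(246)), 16, 3, rng))
```

Drawing the parameters anew for every sampled case is what lets the smaller influenza class be reused many times without presenting the network with identical images.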

Deep Neural Network
A deep neural network (DNN) is a powerful machine-learning tool in which feature extraction, selection, and classification can all be formulated as neural layers. To merge clinical patterns with CXR images, the proposed DNN takes the clinical vectors and the CXR images as separate inputs (Figure 1B). The CXR images were input as matrices of gray values, and convolutional layers extracted representative features from the images. The shallow layers focused on structural features, such as the shape, edge, and texture of lesions. The deep layers mined targeted semantic information, such as the presence or absence of a lesion and non-severe or severe grades of disease. We used AlexNet [21], a widely used deep network, for the CXR processing. For the clinical information, each element of the vector has its own unit and scale and could be associated with every other element. We therefore used a fully connected layer and a batch normalization layer to extract deep clinical features. The deep features derived from the CXR were then concatenated with the clinical features, resulting in fused features. Another fully connected layer and batch normalization layer were used to compute the final output, the combined diagnosis. The proposed fused network can be conveniently reduced to a clinical network or a CXR network by removing the other branch, as shown in Figure 1B. The DNN parts were all implemented with PyTorch [22]. We also evaluated the diagnostic performance of chest CT, which has proven to be a valuable tool for diagnosing COVID-19. Here, we used a 3D DenseNet [23] as the CT diagnosis network. Since our cases are unbalanced across categories, we used number-balanced weights to keep the loss function sensitive to both categories. The cross entropy of each category was multiplied by its weight. Mathematically,

$$L = -\frac{1}{B}\sum_{j=1}^{B}\sum_{i} w_i \, I(y_j = i)\, \log p_{j,i},$$

in which $B$ is the batch size and $p_{j,i}$ is the softmax probability of case $j$ for category $i$. $I(y_j = i)$ is the indicator function, which is 1 when the equation $y_j = i$ is true and 0 otherwise. $y_j$ is the ground-truth category label of case $j$, and $w_i$ is the weight of category $i$, computed from the number of cases in each category.
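A number-balanced weighted cross entropy can be sketched in plain Python as follows. The weight formula used here (inverse class frequency, `total / (k * n_i)`) is an assumption made for illustration, since the exact expression for the weights is not given in the text; the helper names and the example logits are hypothetical.

```python
import math


def balanced_weights(class_counts):
    """Inverse-frequency class weights (an assumed form, not the paper's formula)."""
    total = sum(class_counts)
    k = len(class_counts)
    return [total / (k * n) for n in class_counts]


def softmax(logits):
    """Numerically stable softmax over one case's logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def weighted_cross_entropy(batch_logits, labels, weights):
    """Batch mean of -w[y_j] * log p[j][y_j], with y_j the true category of case j."""
    total = 0.0
    for logits, y in zip(batch_logits, labels):
        p = softmax(logits)
        total += -weights[y] * math.log(p[y])
    return total / len(labels)


# Training-set counts from the text: 44 influenza (class 0), 246 COVID-19 (class 1).
weights = balanced_weights([44, 246])
loss = weighted_cross_entropy([[2.0, 0.5], [0.1, 1.2]], [0, 1], weights)
```

Under this assumed weighting each category contributes the same total weight (weight times count), so a misclassified influenza case costs more than a misclassified COVID-19 case.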

Statistical Analysis
In addition to the deep learning method, basic statistical analyses were performed. For univariate analyses, t-tests were used to compare 56 clinical features and demographic information. For multivariate analysis, we used the trained parameters identified from the clinical part of our deep learning network ( Figure 1B) to compute the relative coefficients, which indicate the importance of each feature (Table S1). The area under the receiver operating curve (AUC), sensitivity, specificity, and negative and positive predictive values were also computed.
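The evaluation metrics named above follow directly from the confusion counts and the ranking of scores. The sketch below is a minimal illustration on made-up toy data, with COVID-19 coded as the positive class (an assumption); the AUC is computed as the probability that a positive case receives a higher score than a negative case.

```python
def confusion_metrics(labels, predictions):
    """Sensitivity, specificity, PPV, and NPV; label 1 = COVID-19, 0 = influenza.

    Assumes each denominator is non-zero (both classes present and predicted).
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, for illustration only.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]
metrics = confusion_metrics(labels, predictions)
area = auc(labels, scores)
```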

Univariate and Multivariate Analysis of Clinical Features
We identified different patterns of clinical features for COVID-19 and influenza (Table 1). Fever, nasal congestion, sore throat, pharyngeal congestion, and productive sputum were common in both, while septic shock was more common in influenza and diarrhea was a salient symptom in COVID-19. Underlying lung disease and an impaired immune system were significantly associated with influenza, but not COVID-19, suggesting that the entire population is susceptible to COVID-19, while influenza virus pneumonia tends to affect specific patient groups. In influenza, the white blood cell count, C-reactive protein (CRP), procalcitonin, and bilirubin were increased, while in COVID-19 the platelet count was significantly increased. Table 2 shows the coefficients of each element derived from the multivariate model. Procalcitonin, urea nitrogen, and CRP were negatively related to COVID-19, as were fever and underlying lung disease.

DNN Performances of Classifiers
Given the difference in clinical manifestations shown above, we first constructed a classifier using clinical features (Figure 2A). The AUC was 0.909 (95% CI 0.891-0.914) with a sensitivity of 90.5% and specificity of 59.4% in the test cohort. Next, we explored the diagnostic value of CXR and the validated AUC was 0.919 (95% CI 0.909-0.930) with a sensitivity of 86.9% and specificity of 74.2%.
Separately, the clinical features and CXR performed comparably, but less than satisfactorily. When we combined the two, the AUC increased to 0.952 (95% CI 0.944-0.960), with an improved sensitivity of 91.5% and, notably, a specificity of 81.2%. A heatmap of the 500 top-ranked deep features among 5120 features showed largely consistent differences between individuals with COVID-19 and influenza A/B (Figure 3A). Principal component analysis (PCA) further confirmed that the distinction between COVID-19 and influenza A/B is a predominant source of variation in the dataset (Figure 3B). Since our method outputs probability scores for cases, threshold points can be selected as cut-offs according to clinical needs. We also compared the diagnostic power of chest CT and our combined clinical-CXR modality. Chest CT had only a numerically higher AUC of 0.994 (95% CI 0.993-0.997), suggesting that our combined modality is sufficiently accurate for rapid screening.
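Because the classifier outputs a score per case, a cut-off can be chosen to meet a clinical requirement such as a minimum sensitivity. The sketch below shows one simple way to do this on made-up toy scores; the function names and the selection rule (the highest threshold that reaches the target sensitivity) are assumptions for illustration, not the procedure used in the study.

```python
def threshold_for_sensitivity(labels, scores, target_sensitivity):
    """Return the highest cut-off whose sensitivity reaches the target.

    A case is called positive (COVID-19, label 1) when its score >= cut-off.
    """
    n_pos = sum(labels)
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        if tp / n_pos >= target_sensitivity:
            return t
    return min(scores)  # only reached if the target exceeds 1.0


def specificity_at(labels, scores, threshold):
    """Fraction of negative cases (label 0) scored below the cut-off."""
    neg = [s for y, s in zip(labels, scores) if y == 0]
    return sum(1 for s in neg if s < threshold) / len(neg)


# Toy scores, for illustration only.
labels = [1, 1, 1, 1, 0, 0, 0]
scores = [0.95, 0.85, 0.70, 0.30, 0.60, 0.20, 0.10]

screening_cutoff = threshold_for_sensitivity(labels, scores, 1.0)
confirm_cutoff = threshold_for_sensitivity(labels, scores, 0.75)
```

In this toy example, demanding full sensitivity pushes the cut-off down to 0.30 and lets one influenza case through, while accepting 75% sensitivity allows a cut-off of 0.70 with no false positives, which is the trade-off a screening setting has to weigh.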

DNN Performance in Severe and Non-severe Subgroups
Furthermore, we tested our rapid screening classifier (combined modality) in severe and non-severe subgroups (Figure 2B, 2C). In non-severe cases, our rapid screening classifier had an AUC of 0.971 (95% CI 0.964-0.980) with a sensitivity of 96.9% and specificity of 73.2%. The CXR classifier had an AUC of 0.926 (95% CI 0.914-0.941) with a per-exam sensitivity of 92.7% and specificity of 63.2%. The CT-based classifier had a per-exam sensitivity of 99.5% and specificity of 85.5%, with an AUC of 0.992 (95% CI 0.989-0.995). Our rapid screening classifier was comparable to the CT-based classifier in the non-severe subgroup.

Figure 2. (A) Diagnostic performance for COVID-19 and influenza. While the accuracy for diagnosing influenza using clinical features is relatively low and that for COVID-19 using CXR is lower, combining the clinical features and CXR improves both. (B) Diagnostic performance in the non-severe subset. As shown in the ROC curves for the non-severe subset (B1), clinical data (blue) perform better than chest X-ray (CXR) (orange). B2-B5: confusion matrices for clinical only (B2), CXR only (B3), combined (B4), and CT (B5) in non-severe patients. Combining the CXR and clinical data improves the diagnostic accuracy of COVID-19; although the diagnostic accuracy for influenza is slightly lower than with the clinical features only, the overall area under the curve is improved in the combined method. (C) Diagnostic performance in the severe subset. As presented in the ROC curves for the severe subset (C1), the diagnostic accuracy of CT outperformed that of the clinical features or CXR. The area under the curve of the combined method is no better than for CXR only (p = 0.46). C2-C5: confusion matrices for clinical only (C2), CXR only (C3), combined (C4), and CT (C5). AUC: area under the receiver operating curve.

Figure 4. Comparison between pulmonary physicians and the artificial intelligence (AI) system. The blue line is the receiver operating characteristic (ROC) curve of the proposed AI system using fused clinical and chest X-ray (CXR) data, while the yellow line is the performance for CXR only. The round points are the readers' results using only CXR, and the star points are the performances of pulmonary physicians using clinical data together with images.

Superiority of the AI System over Pulmonary Physicians
We further conducted a validation study that compared the diagnostic accuracy of the AI system with that of 3 experienced pulmonary physicians (Figure 4). Fifty cases, consisting of 25 individuals with COVID-19 and 25 with influenza, were randomly selected. All readers were asked to read the CXR independently without any clinical information in the first round, and to read with combined CXR and clinical information in the second round. The results showed that the average diagnostic accuracy of CXR for pulmonary physicians with and without clinical information was 0.467 and 0.473, respectively. The average reading time was 25 minutes. By contrast, the diagnostic AUC for the AI system using CXR alone and CXR plus clinical information was 0.935 and 0.958, respectively, and the processing time was only 0.2 seconds.

Discussion
We developed a rapid screening classifier, constructed using clinical and CXR features, to distinguish COVID-19 from influenza A/B pneumonia. It not only had comparable efficacy to chest CT but was also efficient, economical, and radiation safe. Importantly, for non-severe cases, the classifier combining clinical and CXR features had satisfactory efficacy, with an AUC of 0.971. As most patients in the early stage of COVID-19 have mild illness, this is in line with our vision that the combined classifier is an ideal rapid screening tool. We also confirmed the value of chest CT in the diagnosis of COVID-19, especially its critical role in severe cases. However, the combined classifier based on clinical features and CXR remains a reliable alternative for screening severe COVID-19 when CT is not feasible.
Our study revealed different patterns of symptoms in influenza and COVID-19. First, patients with COVID-19 pneumonia experienced less fever and had lower body temperatures than the patients with influenza, which indicates that patients with COVID-19 pneumonia can be asymptomatic. It is important that any screening system identify asymptomatic carriers to prevent them from becoming super-spreaders or severe cases. Our rapid screening classifier is well suited to this role during the COVID-19 pandemic. Second, diarrhea was again found to be a typical symptom of COVID-19. Angiotensin-converting enzyme II (ACE2), which is highly expressed in both lung type II alveolar cells and gastrointestinal enterocytes, has been shown to be the cell receptor of the novel SARS-CoV-2 [24]. Therefore, diarrhea should be regarded as a warning sign for SARS-CoV-2 infection [25,26]. Third, a higher percentage of influenza patients than COVID-19 patients had an impaired immune system and underlying lung diseases. This could be explained by reduced T and NK cell numbers and function in these patients, resulting in greater susceptibility to influenza [27][28][29]. Conversely, the entire population is susceptible to SARS-CoV-2, but older patients with comorbidities need greater vigilance regarding worsening disease.
COVID-19 infection promotes the transformation of pathogenic T lymphocytes and induces inflammatory monocytes to express IL-6 and accelerate inflammation [38]. Hence, a coagulation cascade may be activated by a cytokine storm [32].
The lower incidence of septic shock in COVID-19 was in line with the finding that up to 76% of the COVID-19 cohort was culture-negative for bacteria and fungi [24]. The hypothesis of viral sepsis in severe COVID-19 is a topic of lively debate [32]. Immune response disorders characterized by cytokine storms may be involved in the pathogenic mechanism of viral sepsis [39]. The cytokine storm induced by invasion of the novel coronavirus causes diffuse lung damage and systemic inflammation, leading to multiple organ failure and viral sepsis [39]. Interleukin 6 (IL-6) and GM-CSF are two key triggers in cytokine storms [40]. Application of cytokine-modulatory therapy, especially anti-IL-6 agents, is expected to improve the prognosis of severe COVID-19.
We used CT localizer scans as surrogates for a standard CXR. Localizer scans are physically equivalent to an X-ray, although differences remain. Localizer scans can be presented as coronal and sagittal scans of patients in a supine position, but their parameter adjustment is not as precise as for X-rays, leading to lower-quality images containing less radiological information. Therefore, we may have underestimated the value of CXR. In addition, localizers usually cover a wider view than CXR, and the additional objects imaged, such as ventilators, may produce artifacts. To overcome these issues, we manually cropped out the lung areas of every scan to force the system to focus on the lung area when making diagnostic decisions. Although viral pneumonias usually have similar imaging characteristics, there are still different radiological patterns between COVID-19 and influenza, in either CT or X-rays. The predominant pattern in COVID-19 is characterized by ground-glass opacities and consolidation opacities with a peripheral distribution, while the typical radiological findings of influenza are diffuse ground-glass opacities and small nodules with more central locations [41][42][43][44]. In the bronchovascular area, a crazy-paving pattern is observed more often in COVID-19 and often indicates a poor prognosis [44,45]. Pleural effusions are more common in influenza, while pleural thickening may occur in COVID-19 [44,46,47].
There are several limitations to our study. First, the numbers of COVID-19 and influenza cases were not balanced, which may increase the risk of overfitting. Predicting every case as the larger category (COVID-19 in our experiments) can by itself yield good accuracy, with high sensitivity but low specificity. Our model addressed this drawback by adding number-balanced weights to the loss function and augmenting the smaller influenza category, which resulted in both high sensitivity and high specificity and reduced the influence of the unbalanced numbers. Second, because this study examined retrospective cohorts, larger prospective validation cohorts are warranted in the future.
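The effect of class imbalance on a trivial predictor can be checked with a short worked calculation. The sketch below uses the test cohort sizes reported in this paper (266 COVID-19 and 62 influenza A/B cases) and a hypothetical classifier that labels every case as COVID-19; the balanced accuracy at the end is an extra summary added for illustration, not a metric reported in the study.

```python
# Test cohort sizes reported in the text.
n_covid, n_flu = 266, 62

# Hypothetical classifier that labels every case as COVID-19 (the positive class).
tp, fn = n_covid, 0   # every COVID-19 case is called positive
tn, fp = 0, n_flu     # every influenza case is also called positive

accuracy = (tp + tn) / (n_covid + n_flu)      # about 0.811
sensitivity = tp / (tp + fn)                  # 1.0
specificity = tn / (tn + fp)                  # 0.0
balanced_accuracy = (sensitivity + specificity) / 2  # 0.5, i.e. chance level

print(round(accuracy, 3), sensitivity, specificity, balanced_accuracy)
```

An accuracy of roughly 81% with zero specificity shows why accuracy alone is misleading for this cohort and why sensitivity and specificity are reported together.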
In conclusion, we devised a rapid screening classifier constructed using clinical and CXR features to distinguish COVID-19 from influenza A/B pneumonia. The classifier was efficient, economical, and radiation safe. Our combined classifier may be an ideal rapid screening tool during the COVID-19 pandemic.