Identification of pneumonia based on chest x-ray images using wavelet scattering network

Pneumonia is one of the most dangerous diseases affecting the human lung. It is often caused by bacteria, viruses, or fungi, and it can affect one part or all parts of the lung. The X-ray image is one of the most widely used diagnostic tools in the medical field for detecting pneumonia. Therefore, in this paper, a wavelet scattering network, a deep learning method, is implemented as a classification model for lung pneumonia, and features are extracted from the X-ray images by the wavelet scattering transform. Since training and testing the wavelet scattering model requires a sufficiently large and reliable dataset, 500 pneumonia X-ray images and 500 normal X-ray images were used for automatic image identification. The network is trained on 650 of these images, and the remaining 350 images are used for testing. The results show that the wavelet scattering network classifies chest X-ray images with an accuracy of 98%.


Introduction
Many experts have recently shown interest in various topics such as cloud computing [1,2], the internet of things [3,7], cryptography [8,14], machine learning, and medical image recognition and detection [15,21]. One well-known subject of medical imaging is pneumonia, which results from an infection by bacteria, viruses, or fungi that travel through the air into the lung [22]. The air sacs can fill with fluid or pus (purulent material), causing cough with phlegm or pus, fever, chills, and difficulty breathing. Pneumonia can range in seriousness from mild to life-threatening. The disease is most dangerous for infants, children, people older than 65, and people with health complications such as heart disease and chronic diseases [23]. The most commonly sought diagnostic technique for all forms of pneumonia involves studying the increased opacity in regions of the lungs as shown by the chest radiograph, or chest X-ray (CXR). The increased opacity is caused by inflammation of the lungs, with large amounts of fluid in the affected areas [24]. Diagnosing pneumonia through CXR can be complicated by the possible presence of pulmonary edema [25], which is mostly caused by cardiac problems, or of internal lung bleeding, lung cancer, or, in some patients, atelectasis [26], which results in unilateral collapse or shutdown of part of a lung or of the whole lung. In this condition, alveoli are deflated to very low volumes, which is visible as increased opacity of the affected part in the CXR. Because of these complications, it is vital that trained physicians and specialists, equipped with the patient's clinical record, study CXRs taken at different times for comparison and proper diagnosis. A chest X-ray is a projection radiograph of the chest used to detect conditions affecting the chest.
The chest X-ray has been shown to reduce the number of erroneous diagnoses and to be especially useful in teaching hospitals. It is essential to the control of tuberculosis at its source and offers hope of earlier detection of cancer of the lung [27,28]. However, chest X-rays are genuinely difficult to analyze when they are taken of patients lying on a stretcher or in bed [29,30]. Therefore, a wavelet scattering network was used to obtain a powerful representation of the image that preserves the true values of the patient data. The wavelet scattering network is the scattering architecture used in this work [31]. It has been widely used in the medical field because of its powerful feature extraction. With its accurate recognition and classification, the wavelet scattering network is among the most successful techniques in medical imaging applications [32,34]. A correct medical diagnosis of lung diseases reduces the suffering of patients, which is why X-ray images are used.
Most research in this area focuses on feature detection using modern CNN architectures that are more complex than conventional architectures with few layers. Kermany et al. [35], for example, conduct a thorough investigation into the development of diagnostic tools for patients with treatable diseases. Using a state-of-the-art deep CNN, the researchers are able to detect blinding retinal diseases and pneumonia. One issue with these deep, modern CNN architectures is the difficulty of training all of the corresponding layers, which takes a long time and requires a lot of computing power. ChexNet, suggested by Rajpurkar et al. [36], is another example of the use of very deep convolutional architectures. Their design detects pneumonia and uses a heat map to pinpoint the areas with the most severe lung inflammation. Zech et al. [37] investigated and tested the performance of ChexNet training on an internal dataset of pneumonia and normal medical CXRs. The work in [38] showed that, to generalize well, a pneumonia detection model must be trained on pooled data from different sources (for example, different hospitals or different departments within a hospital). Pankratz et al. [39] used logistic regression, a machine learning algorithm, to differentiate between cases of usual interstitial pneumonia (UIP) and non-UIP cases, with an area under the receiver operating characteristic curve (AUC) of 0.92. In general, there is a trade-off in medicine between the intelligibility of machine learning systems and their accuracy. The models that achieve high accuracy are generally not very understandable. In other words, one cannot fully follow every step of a less intelligible model's process, which makes it difficult to understand, edit, or validate the parameters of such models despite their high accuracy. Such trade-offs arise when deciding between simple, understandable machine learning algorithms such as logistic regression or random forest and more complex, less understandable deep learning models such as artificial neural networks, which offer higher accuracy. Because these AI systems augment rather than replace human judgment, with the final say exercised by a qualified person (say, a doctor), high accuracy may not always be the primary objective in a field like medicine. Caruana et al. worked on resolving this difficult trade-off by developing an intelligible model based on generalized additive models (GAMs) [40], and achieved state-of-the-art accuracy on CXR data using generalized additive models with pairwise interactions (GA2Ms) [41]. Wang et al. developed a hospital-scale chest X-ray dataset with over 100,000 frontal-view CXRs from over 30,000 different patients for eight different thoracic diseases [42]. 
Data mining was used to obtain the ground-truth labels, which were based on the judgment of a single radiologist. This ground truth may not always be correct because of the ambiguous appearance of pathologies on CXRs. In this paper, we use the wavelet scattering network method, which is well suited to chest X-ray images, in order to overcome the disadvantages of other methods used for diagnosing medical images, which require more extensive training and testing on the dataset. The aim is therefore to increase accuracy by using the wavelet scattering network.

Material and methods
The mathematical foundations of wavelet scattering are given in [43,44]. The transform is well understood and, in contrast to deep networks, relies on few parameters. The wavelet scattering transform processes data in stages. The three-stage wavelet scattering transform is illustrated in Figure 1, where the output of one stage becomes the input to the next. The zeroth-order scattering coefficients are computed by simple averaging of the data [45]. The wavelet scattering structure uses a tree algorithm, as shown in Figure 2 [25].
In this tree, the {ψj,k} are the wavelets, ϕJ is the scaling function, and f is the input data. For image data, a number of user-specified rotations of the wavelet exist for each ψj,k. A sequence of edges from the root to a node is a path [46]. The scattering coefficients are the scalogram coefficients convolved with the scaling function ϕJ. The set of scattering coefficients forms the low-variance features derived from the data. Convolution with the scaling function is lowpass filtering, so information is lost. However, this information is recovered when the coefficients of the next stage are computed [47,48].
To extract features from the data, first create and configure the framework using waveletScattering (for time series) or waveletScattering2 (for image data). The parameters that can be set include the size of the invariance scale, the number of filter banks, and the number of wavelets per octave in each filter bank. In waveletScattering2, the number of rotations per wavelet can also be set. To derive features from time series, use the waveletScattering object functions scatteringTransform or featureMatrix. To derive features from image data, use the waveletScattering2 object functions scatteringTransform or featureMatrix.
The transform generates features iteratively. First, convolve the data with the scaling function, f ∗ ϕJ, to obtain S[0], the zeroth-order scattering coefficients. Next, proceed as follows [49,50]:
1. Take the wavelet transform of the input data with each wavelet filter in the first filter bank.
2. Take the modulus of each of the filtered outputs. The nodes are the scalogram, U[1].
3. Average each of the moduli with the scaling filter. The results are the first-order coefficients, S[1].
Repeat this process at every node.
The scatteringTransform function returns the scattering and scalogram coefficients. The featureMatrix function returns the scattering features. Both outputs can be easily consumed by learning algorithms.
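The staged computation described above can be sketched for a one-dimensional signal as follows. This is a minimal illustration, not the implementation used in this work: the Gaussian lowpass filter standing in for ϕJ, the Gaussian band-pass filters standing in for the wavelets, the centre frequencies, and the function names are all assumptions chosen for brevity.

```python
import numpy as np

def lowpass_response(n, sigma):
    """Frequency response of a Gaussian averaging filter (assumed stand-in for phi_J)."""
    omega = 2 * np.pi * np.fft.fftfreq(n)
    return np.exp(-0.5 * (omega * sigma) ** 2)

def wavelet_response(n, center):
    """Gaussian band-pass filter centred at `center` cycles/sample (assumed wavelet)."""
    freqs = np.fft.fftfreq(n)
    response = np.exp(-0.5 * ((freqs - center) / (center / 3)) ** 2)
    response[0] = 0.0  # force zero mean, as required of a wavelet
    return response

def convolve(x, response):
    """Circular convolution carried out in the Fourier domain."""
    return np.fft.ifft(np.fft.fft(x) * response)

def scattering_1d(f, num_scales=3, sigma=16.0):
    """Return zeroth-, first- and second-order scattering coefficients of f."""
    n = len(f)
    phi = lowpass_response(n, sigma)
    psis = [wavelet_response(n, 0.25 / 2 ** j) for j in range(num_scales)]

    s0 = convolve(f, phi).real                       # S[0] = f * phi_J
    u1 = [np.abs(convolve(f, psi)) for psi in psis]  # U[1] = |f * psi_j|
    s1 = [convolve(u, phi).real for u in u1]         # S[1] = U[1] * phi_J
    # Second stage: repeat the same steps on every first-stage node,
    # keeping only paths that move towards coarser scales.
    s2 = [convolve(np.abs(convolve(u1[j], psis[k])), phi).real
          for j in range(num_scales) for k in range(j + 1, num_scales)]
    return s0, np.array(s1), np.array(s2)

signal = np.cos(2 * np.pi * 0.25 * np.arange(64))
s0, s1, s2 = scattering_1d(signal)
print(s0.shape, s1.shape, s2.shape)
```

Each stage applies the same three operations (wavelet filtering, modulus, averaging), which is why the information removed by the lowpass filter at one stage reappears in the coefficients of the next.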

Proposed system
The main objective of the proposed system is to distinguish normal images from abnormal ones, based on a set of medical images. The proposed system consists of several stages, as shown in the diagram in Figure 3.

Dataset
The dataset used in this work was published by a team of researchers at Mdamlas/ Covid-19 X-Ray_ detection. It contains 1000 images of two types (500 normal and 500 pneumonia), each of size 1024*1024, as shown in Figure 4.
Figure 4. The chest X-ray images (left: Normal, right: Pneumonia)
The dataset is divided into training data and testing data according to the type of image, as shown in Table 2.
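A split of this kind can be sketched as follows. The per-class counts are an assumption: the text gives only the totals (650 training and 350 testing images), so the sketch assumes the same 65% fraction is taken from each class. The function name and the fixed seed are hypothetical.

```python
import random

def stratified_split(labels, train_fraction, seed=0):
    """Split sample indices into train/test so each class keeps the same fraction."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for label, indices in sorted(by_class.items()):
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test

# 500 normal and 500 pneumonia images, as in the dataset described above.
labels = ["normal"] * 500 + ["pneumonia"] * 500
train_idx, test_idx = stratified_split(labels, train_fraction=0.65)
print(len(train_idx), len(test_idx))
```

Splitting within each class keeps the training and testing sets balanced, so accuracy on the test set is not skewed towards either class.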

Preprocessing
This stage processes the dataset and is a basic and necessary step for improving the model. It consists of two steps, conversion and resizing, as shown in Figure 5.

Figure 5. Preprocessing step
The conversion step turns the color image into a grayscale image, which simplifies the work by reducing 3 bands to 1 band, that is, from 24 bits/pixel to 8 bits/pixel (1 byte/pixel). In the resize step, the images are reduced from 1024*1024 to 32*32 to prepare them for the wavelet scattering network, with pixel values in the range 0 to 255.
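The two preprocessing steps can be sketched with NumPy as below. The luminance weights and the block-averaging resize are assumptions, since the text does not state which conversion formula or interpolation was used; a random array stands in for a real X-ray image.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an H x W x 3 uint8 image to H x W uint8 (assumed luminance weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb.astype(np.float64) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

def resize_by_block_mean(gray, out_size):
    """Downsample a square image by averaging non-overlapping blocks (assumed method)."""
    height, width = gray.shape
    bh, bw = height // out_size, width // out_size
    blocks = gray[:bh * out_size, :bw * out_size].astype(np.float64)
    blocks = blocks.reshape(out_size, bh, out_size, bw)
    return np.clip(np.rint(blocks.mean(axis=(1, 3))), 0, 255).astype(np.uint8)

# Synthetic stand-in for a 1024*1024 color image.
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(1024, 1024, 3), dtype=np.uint8)

gray = to_grayscale(rgb)                # 3 bands -> 1 band
small = resize_by_block_mean(gray, 32)  # 1024*1024 -> 32*32
print(rgb.nbytes // gray.nbytes, small.shape, small.dtype)
```

The grayscale image occupies one third of the memory of the color image (1 byte instead of 3 bytes per pixel), and the resized image keeps values in the 0 to 255 range.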

Wavelet scattering networks
This stage is the heart of the proposed system. The images from the preprocessing stage are passed to a network of wavelet filters consisting of 3 levels with 8 nodes per level. In each layer, the image is fed into a group of filters, much like the convolution filters used in a convolutional neural network. The output of this stage is a set of features used for classification, as shown in Figure 6.
Figure 6. The wavelet scattering filters (L=3, N=8)
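As a rough illustration of such a filter stage, the sketch below builds a bank of oriented band-pass filters and computes one averaged modulus per filter. It assumes that the 3 levels correspond to 3 scales and the 8 nodes to 8 rotations, which is only one possible reading of the configuration; the Gaussian (Gabor-like) filter shape, the centre frequencies, the global averaging, and the function names are likewise assumptions.

```python
import numpy as np

def gabor_bank(size=32, num_scales=3, num_rotations=8):
    """Frequency responses of oriented band-pass filters, shape (scales, rotations, size, size)."""
    fy, fx = np.meshgrid(np.fft.fftfreq(size), np.fft.fftfreq(size), indexing="ij")
    bank = np.empty((num_scales, num_rotations, size, size))
    for j in range(num_scales):
        center = 0.25 / 2 ** j                 # centre frequency halves at each scale
        for r in range(num_rotations):
            theta = np.pi * r / num_rotations  # rotations spread over half a turn
            cx, cy = center * np.cos(theta), center * np.sin(theta)
            response = np.exp(-0.5 * ((fx - cx) ** 2 + (fy - cy) ** 2) / (center / 3) ** 2)
            response[0, 0] = 0.0               # zero mean, as required of a wavelet
            bank[j, r] = response
    return bank

def first_order_features(image, bank):
    """Modulus of each filtered image, averaged over all pixels (crude lowpass)."""
    spectrum = np.fft.fft2(image)
    moduli = np.abs(np.fft.ifft2(spectrum * bank))  # broadcasts over (scale, rotation)
    return moduli.mean(axis=(-2, -1))

bank = gabor_bank()
image = np.cos(2 * np.pi * 0.25 * np.arange(32))[None, :] * np.ones((32, 1))
features = first_order_features(image, bank)
print(features.shape)
```

Unlike the filters of a convolutional neural network, these filters are fixed in advance rather than learned, so only the classifier that follows needs training.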

SoftMax classifier
At this stage, a classifier is trained on the features produced by the wavelet scattering network, with an equal prior of 0.5 given to each class (normal or pneumonia). SoftMax is used to obtain class-specific probabilities with values in (0,1). It is a mathematical function that transforms a vector of numbers into a vector of probabilities, where the probability of each value is proportional to its relative scale in the vector. Each image is then assigned a class label: if there is pneumonia in the image, it is given the label 0; otherwise it is given the label 1, meaning that the image is normal.
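The SoftMax function itself is small enough to sketch directly. The example scores are made up for illustration, and the label order (index 0 for pneumonia, index 1 for normal) follows the convention stated above.

```python
import math

def softmax(scores):
    """Map a list of real-valued scores to probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

CLASS_NAMES = ["pneumonia", "normal"]  # label 0 and label 1

scores = [2.0, -1.0]  # hypothetical classifier outputs for one image
probabilities = softmax(scores)
predicted = max(range(len(probabilities)), key=probabilities.__getitem__)
print(CLASS_NAMES[predicted], probabilities)
```

When both scores are equal, each class receives a probability of 0.5, which matches the equal weighting of the two classes.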

Evaluation Results
The proposed work is implemented in Python 3.9 on a 2.6 GHz CPU with 8 GB of RAM under a 64-bit Windows 10 environment. The accuracy and loss of the wavelet scattering network during training and testing are shown in Figure 7. Training and testing accuracy are both high within the first 3 epochs and agree at 98% beyond 6 epochs. The testing loss is lower than the training loss. The performance of the proposed wavelet scattering framework is calculated with 650 training images and the other 350 images for testing. Figure 8 shows the confusion matrix for the 350 test images. The classification metrics accuracy (Acc) and F-score are computed. The proposed system architecture achieved its best performance in terms of classification accuracy and F-score.
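For reference, accuracy and F-score can be derived from a confusion matrix as sketched below. The labels in the example are toy values chosen for illustration only and are not the results reported in this work; treating pneumonia (label 0) as the positive class is an assumption.

```python
def confusion_counts(y_true, y_pred, positive=0):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn

def accuracy_and_f_score(tp, fp, fn, tn):
    """Accuracy and F-score (harmonic mean of precision and recall)."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)

# Toy labels: 0 = pneumonia, 1 = normal.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
acc, f_score = accuracy_and_f_score(*counts)
print(counts, acc, f_score)
```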

Conclusions
The experimental work showed that feature extraction from X-ray images is more effective when 4 wavelet scattering filter banks are selected. The wavelet scattering method identifies chest X-ray images with high accuracy, which helps in diagnosing pneumonia. In this method, features are extracted by means of a wavelet scattering network that contains a larger set of filters than a convolutional neural network.