REVIEW ARTICLE


An Exhaustive Study on Deep Neural Network-based Prediction of Heart Diseases and its Interpretations



Jothiaruna Nagaraj1, Anny Leema A.1, *
1 Department of School of Information Technology Science and Engineering, Vellore Institute of Technology University, Vellore, Tamil Nadu 632014, India





© 2022 Nagaraj and Leema A

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: https://creativecommons.org/licenses/by/4.0/legalcode. This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at the Department of School of Information Technology Science and Engineering, Vellore Institute of Technology University, Vellore, Tamil Nadu 632014, India; E-mail: annyleema.a@vit.ac.in


Abstract

Cardiovascular disease prediction is important in everyday life. A key tool for diagnosing cardiovascular disease is the Electrocardiogram (ECG), which records the electrical activity of the heart as a waveform. A diagnosis is made by examining changes in the ECG waves, and predicting these wave changes requires domain expertise from cardiologists or physicians. Deep Neural Network techniques extract the relevant features accurately and automatically predict the type of disease. This article lists different types of cardiac disorders together with the ECG interpretations used to identify each disease type manually; it also discusses lead segmentation, pre-trained models, and different detection techniques for predicting the type of disease from an ECG image. Finally, the article discusses the main challenges in predicting heart diseases and offers solutions to some of them.

Keywords: Cardiovascular Disease, 12 Lead ECG, Deep Neural Network (DNN), Challenges, ECG image, Heart diseases.



1. INTRODUCTION

Cardiovascular disease affects the heart and blood vessels [1]. Predicting cardiovascular disease is both important and challenging. Electrocardiogram (ECG) data are widely used to predict heart diseases [2], but interpreting an ECG requires physicians or other domain experts, and expert diagnosis over large volumes of data does not scale [3]. Machine learning and deep learning techniques are used to overcome this scalability issue.

Researchers have approached the problem in two ways. The first is diagnosing a single disease, such as myocardial infarction [4] or arrhythmias [5], from 12-lead ECGs. The second is extracting particular waves, such as the QRS complex [6] or the ST interval [7], from the ECG and diagnosing the disease from them. Combining these findings is challenging; instead of targeting individual diseases or individual waves, all possible diseases can be diagnosed from 12-lead ECGs with the help of machine learning and deep learning. Further work on diagnosing all cardiovascular diseases relies on pattern matching, in which a given or extracted pattern is matched against patterns already learned during training [8]. If the pattern matches exactly, it is classified; otherwise, the search continues over other patterns. Likewise, in an ECG signal [9, 10], each wave is continuously matched against the trained waves. A particular wave change occurs for a particular disease, and by matching these changes, the disease can be classified. Similarly, in ECG images, each lead is segmented and the segmented images are matched against the trained images [11, 12]. Matching is done not with raw pixel values but with scalar features and image shape [13].

An intelligent decision-making system helps to monitor patients, for example the heart rates of cardiac patients, and also helps to automatically diagnose heart disease, manage patients, record their heartbeats, and store the data [14]. This is useful when patients need treatment for a particular condition: heart disease cannot be diagnosed properly within an hour, so the heart rate must be monitored continuously and treatment started accordingly [15]. Cardiologists can predict heart diseases, but one main challenge is scalability, because a cardiologist cannot monitor all patients. Time complexity and cost are also important factors [16]. Artificial intelligence techniques can meet these requirements and automatically detect diseases with good time complexity and at lower cost [17].

Artificial Intelligence (AI) mimics tasks performed by humans: machines are trained, like humans, by being given many input examples from which to learn and predict. AI encompasses Machine Learning (ML) and Deep Learning (DL) [18]. DL is generally good at analyzing images and learns features from a large number of images, whereas ML is good at analyzing structured data but does not give good results in image-based prediction. This study analyzes ECG images for predicting cardiovascular diseases. Deep Learning includes the Deep Neural Network (DNN), the Convolutional Neural Network (CNN), and the Recurrent Neural Network (RNN). In ECG images, lead information is important for predicting the type of heart disease, so an object-detection DNN is used to extract accurate information from each lead. The remainder of this article discusses related work on ECG signals and images, the interpretation of ECG images, challenges in the publicly available dataset, and possible solutions.

1.1. Related Work

1.1.1. Research Carried Out

Cardiovascular disease is the leading cause of death worldwide [19]. Cardiac diseases mainly occur due to a lack of blood supply and blockage of the arteries. Chest pain may indicate either angina or a heart attack; angina causes no permanent damage to the heart, whereas a heart attack does [20]. Fig. (1) shows the total number of research studies on different diseases [21], taken from Google Scholar.

1.2. About 12 Lead ECG

The 12-lead ECG test is used to diagnose heart disease. In this test, 10 electrodes are placed on the body, and 12 leads are derived from them. Each electrode records the electrical activity of the heart at the skin surface. The 12 leads are divided into three groups: a) limb leads, b) augmented limb leads, and c) chest or precordial leads. The limb leads are Lead I, II, and III. The augmented limb leads are the augmented vector right (aVR), augmented vector left (aVL), and augmented vector foot (aVF) [22]. The chest leads are V1, V2, V3, V4, V5, and V6. Each lead is calculated from the electrodes placed on the body, as shown in Fig. (2). The recorded waves are printed on a graph sheet; each small box on the sheet represents 0.04 s and each large box represents 0.2 s, as shown in Fig. (3) [23].

Fig. (1). Research history of cardiovascular diseases.

Fig. (2). Bipolar Leads (a) I = LA - RA (b) II = LL - RA (c) III = LL – LA (d) aVF = 3/2(LL-Vw) (e) aVL = 3/2 (LA-Vw) (f) aVR = 3/2 (RA - Vw) (g) Chest Leads.

Fig. (3). 12 Lead ECG representation.

1.3. ECG Interpretation

1.3.1. Rate

Heart rate is the speed of the heartbeat. On an electrocardiogram, the rate is calculated by counting the boxes between successive R waves (the RR interval) [24, 25]. A rate between 60 and 99 bpm indicates the patient is normal; a lower or higher rate indicates an abnormality, as shown in Table 1. The rate is computed with Eqs. (1) and (2) when the rhythm is regular and with Eq. (3) when it is irregular.

Table 1. Rate Interpretation (Best seen in Lead II).
Interpretation Beats Per Minute
Normal 60 - 99
Bradycardia Less than 60
Tachycardia Greater than 100

If the rhythm is regular,

Heart rate (bpm) = 300 / number of large boxes between two consecutive R waves (1)
Heart rate (bpm) = 1500 / number of small boxes between two consecutive R waves (2)

If the rhythm is irregular,

Heart rate (bpm) = number of R waves in a 6-second strip × 10 (3)
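
As a concrete illustration, Eqs. (1)-(3) translate directly into a few lines of code; the following Python sketch uses illustrative function names and example box counts, not values taken from the article.

```python
# A minimal sketch of the rate calculations in Eqs. (1)-(3); the function and
# variable names are illustrative, not from the original article.

def heart_rate_regular_large(large_boxes_between_r_waves: float) -> float:
    """Eq. (1): 300 / number of large boxes (each large box = 0.2 s)."""
    return 300.0 / large_boxes_between_r_waves

def heart_rate_regular_small(small_boxes_between_r_waves: float) -> float:
    """Eq. (2): 1500 / number of small boxes (each small box = 0.04 s)."""
    return 1500.0 / small_boxes_between_r_waves

def heart_rate_irregular(r_waves_in_6_seconds: int) -> int:
    """Eq. (3): R waves counted over a 6-second strip, multiplied by 10."""
    return r_waves_in_6_seconds * 10

# Example: 4 large boxes (20 small boxes) between R waves -> 75 bpm (normal);
# 7 R waves in a 6-second strip of an irregular rhythm -> 70 bpm.
print(heart_rate_regular_large(4), heart_rate_regular_small(20), heart_rate_irregular(7))
```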

1.3.2. Rhythm

Rhythm abnormalities are checked by measuring the RR interval and the PP interval. If successive intervals are equal, the rhythm is normal; otherwise, it is abnormal, as shown in Table 2. The RR interval is used to measure the ventricular rhythm, and the PP interval the atrial rhythm [26].

1.3.3. Waves

The P wave indicates atrial depolarization, the QRS complex indicates ventricular depolarization, and the T wave indicates ventricular repolarization. The normal and abnormal values of each wave are shown in Table 3, along with the interpretation of the abnormalities [33].

2. DATASET AVAILABILITY

Dataset collection is a challenging task in medical research. The discussion here is based on ECG images, and research on ECG images is scarce [45]. Most of the data available on the internet are time series, which is inappropriate for this work. A benchmark dataset is available on Mendeley, published by the University of Management and Technology in 2021 [46, 47]. Table 4 describes the number of images available in each class.

Table 2. Rhythm (Best seen in Lead II).
Interpretation Rate Rhythm P-wave QRS Interval PR Interval T Wave
Normal Sinus 60 to 100 bpm No Upright in the lead I and II QRS preceded by P Constant No
Sinus Bradycardia [27] <60 bpm Regular No No No No
Sinus Pause [28] Varies from slow to normal Irregular Absent No Normal or long PR No
Atrial Fibrillation [29] ~150 bpm No Absent < 0.12s No No
Atrial Flutter [29] ~300 bpm No No 3 flutters to 1 QRS wave No No
Ventricular Fibrillation [30] Cannot be discerned Unorganized Absent Absent Absent No
Ventricular Tachycardia [30] 100 to 250 bpm Irregular Absent No No Absent
Supraventricular Tachycardia [30] 150 to 250 bpm Regular Cannot be discerned No No No
Atrial Escape Rhythm [31] 60 to 80 bpm No No No No No
Asystole [30, 32] Absent Absent Absent Absent No No
Table 3. ECG interpretations.
Normal Abnormal Interpretation
P-Wave [33] Best seen - Positive in Lead I and II, Negative in aVR, Biphasic in V1; Duration - Less than 0.12s Tall peaked P wave, Amplitude - Greater than 2.5 mm - Congenital heart diseases - Pulmonary hypertension
Biphasic, Terminal portion - Greater than 0.04s (40 ms) wide, Greater than 1mm deep - Left atrial enlargement
P mitrale, Duration - Greater than 0.12s (wide P wave) - Mitral stenosis - Left atrial enlargement
PR interval [34] Duration - 0.12s to 0.20s Long PR, Duration - Greater than 0.22s - First-degree atrioventricular block [34]
PR interval lengthens progressively until a P wave occurs with no following QRS complex - Second-degree atrioventricular block [35]
No relationship between the P wave and the QRS complex -Third-degree atrioventricular block
Short PR, Duration - Less than 0.12s - Tachycardia
Q Wave Duration - Less than 0.04s; Greater than 2mm in III and aVR; No Q wave in V1 to V3 Abnormal Q wave - 1mm wide, 2mm deep, Best seen - V1 to V3 - Past or current infarction [36]
No Q wave, Best seen - V5, V6 - NSTEMI, LBBB [36]
R Wave - Size is not absolute, small in V1 and progressively larger towards V6 - R > S in V4 - For children and young adults, a tall R wave in lead V1 is normal Taller R wave in V1 - Right ventricular hypertrophy - Posterior myocardial infarction - Right bundle branch block [37]
Taller R wave in aVR - Dextrocardia- Ventricular tachycardia
Absent - Dextrocardia
- <= 3mm in V3 - Prior anteroseptal MI [37]- LVH- Inaccurate lead placement
QRS Complex [38] - Upright in Lead I and II, Duration - 0.07s to 0.10s Sokolow-Lyon criteria - S(V1) + R(V5 or V6) > 35mm; Cornell criteria - S(V3) + R(aVL) > 28mm (men), > 20mm (women); Others - R(aVL) > 11mm - Left ventricular hypertrophy [38]
V1(R>S), V6(S>R), Strain T wave inversion Right ventricular hypertrophy
Duration - 0.10s to 0.11s Incomplete Bundle Branch
Duration - Greater than or equal to 0.12s Complete Bundle Branch [39]
QT interval Duration - 0.35s to 0.45s [33] Long QT interval, Duration - Greater than 0.45s - Ventricular arrhythmia
Short QT interval, Duration - Less than 0.35s - Paroxysmal atrial fibrillation, Ventricular fibrillation [40]
ST-Segment Duration - 0.08s to 0.12s - Concave ST elevation with PR depression (I, II, III, aVF, V5, V6), Reciprocal ST depression - Pericarditis [41]
- Concave ST elevation (Precordial, Inferior leads) with J point Benign Early Repolarization
- ST-elevation with deep S wave (V1 to V3), ST depression with tall R wave (I and aVL) Left Bundle Branch Block
- ST-elevation with deep S wave (V1 to V3), ST depression, and T-wave inversion (V5 to V6) Left Ventricular Hypertrophy [42]
- ST-elevation with deep Q wave and inverted T wave (V1 to V3) Ventricular Aneurysm
- ST-elevation and partial RBBB (V1 to V2) Brugada Syndrome
- ST depression with dominant R wave (V1 to V3), Upright T Wave, ST-elevation (V7 to V9) Posterior Myocardial Infarction [43]
- Downsloping ST depression Digoxin Effect
- Downsloping ST depression with T wave flattening/ inversion, Prolonged QU interval Hypokalaemia
- ST depression and T wave inversion (V1 to V3) Right Ventricular hypertrophy
- abnormalities in RVH with ST depression, T wave inversion (V1 to V3) Right Bundle Branch Block [44]
-widespread horizontal ST depression (V4 to V6) Supraventricular Tachycardia
T Wave [45] Duration - 0.15s to 0.20s Inverted T wave (aVR, aVL, aVF, V1, III) - Myocardial Infarction
Biphasic T Wave (V2 to V3) - Myocardial Infarction (rise and fall below), Hypokalemia (fall and rise above)
Flattened T wave, Duration - Varies from -1 mm to +1 mm Hypokalemia
U Wave [46] Amplitude - Less than 2 mm Inverted U wave - Myocardial ischemia or left ventricular volume overload
Present Hypokalemia
Normal waves - the QRS complex and T wave tend to have the same direction; all waves are negative in lead aVR
Table 4. Dataset description.
Class Number of Images
Abnormal Heart Beat 779
Covid 250
History of MI 375
Normal 1,243
Myocardial Infarction (MI) 387

3. DEEP NEURAL NETWORK (DNN)

A 12-lead ECG image is fed into a pre-trained model to extract features from the image [48]; any pre-trained backbone can be used, such as ResNet [49], VGG [50], or FPN [51]. After the feature map is obtained from the pre-trained model, object detection techniques such as Fast R-CNN, Faster R-CNN, and the RPN are applied to detect the objects in the image. Fig. (4) shows the workflow diagram of the proposed approach.
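
As a rough illustration of this workflow, the following sketch (assuming PyTorch/torchvision, with a hypothetical image path and COCO-pretrained weights rather than the authors' trained model) shows how a pre-trained ResNet-50 + FPN backbone yields feature maps that a detection head turns into candidate boxes.

```python
# A minimal sketch of the workflow in Fig. (4), assuming PyTorch/torchvision:
# a pre-trained backbone produces feature maps and a detection head proposes
# and classifies regions (here, candidate lead regions). The image path is
# hypothetical and the COCO-pretrained classes are placeholders.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()   # ResNet-50 + FPN backbone

ecg = to_tensor(Image.open("ecg_12lead.jpg").convert("RGB"))

with torch.no_grad():
    # Backbone feature maps can be inspected directly (for illustration only;
    # the full model applies its own resizing/normalization internally) ...
    features = model.backbone(ecg.unsqueeze(0))   # OrderedDict of FPN feature maps
    # ... and the full model returns boxes, labels, and scores per detection.
    detections = model([ecg])[0]

print(list(features.keys()), detections["boxes"].shape)
```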

3.1. Pretrained Models

3.1.1. VGG

The Visual Geometry Group (VGG) network recognizes objects in an image [50]. An RGB image (224 x 224 in the original design) is given as input and passed through convolutional layers with 3 x 3 filters and stride 1. In VGG, the number of layers can be chosen, but the practical limit is 19 weight layers, because backpropagation must update all the weights, and as the depth increases the number of feature maps, and with it the parameter count (already in the millions), grows further [52].
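
A minimal sketch of using VGG-16 as a pre-trained feature extractor, assuming PyTorch/torchvision; the file name is illustrative.

```python
# A minimal sketch, assuming PyTorch/torchvision, of using VGG-16 as a
# pre-trained feature extractor; the input path is illustrative only.
import torch
from torchvision import models, transforms
from PIL import Image

vgg = models.vgg16(weights="DEFAULT")        # 13 convolutional + 3 fully connected layers
backbone = vgg.features.eval()               # keep only the convolutional part

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),           # fixed input size used by VGG
    transforms.ToTensor(),
])

image = preprocess(Image.open("ecg_12lead.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    feature_map = backbone(image)            # shape: (1, 512, 7, 7)
print(feature_map.shape)
```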

3.1.2. ResNet

Residual Network (ResNet) is an object recognition method [53]. Many approaches try to detect objects more accurately by increasing the number of layers: the first layers learn a few features of the image, and after more layers the network learns many more features. The disadvantage is that computation time is high and the error percentage also rises during training and testing once the network becomes too deep [54]. The ResNet technique solves these issues using the skip connection (x) shown in Fig. (5); the skip connection solves the vanishing gradient problem [55].

Fig. (4). Workflow diagram.

Fig. (5). Skip connection [55].

Fig. (6). (a) Identity block (b) convolutional block.

In the residual network there are two blocks: the identity block and the convolutional block. In the identity block [56], the input (x) and output (F(x)) have the same size, so they can be added directly, as shown in Fig. (6a). In the convolutional block [56], the input (x) and output (F(x)) sizes are not equal, so a shortcut path with a 1 x 1 convolution layer is applied to the input and the two outputs are then added, as shown in Fig. (6b), to solve the size mismatch.

ResNet comes in several depths, such as ResNet-18, ResNet-34, and ResNet-50 [57, 58]. Taking ResNet-18 as an example, the input image is fed into the first layer, which applies 64 filters of size 7 x 7 with stride 2; max pooling is then performed on its output with a 3 x 3 filter and stride 2. In each subsequent stage, the marking "x2" indicates that a pair of 3 x 3 convolutions is repeated two times, giving 4 layers per stage, and the same holds for the following stages. The layers of ResNet-18 are therefore counted as 2 + 4 + 4 + 4 + 4 = 18 layers, as shown in Fig. (6b).
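
The skip connection of Fig. (5) and the identity block of Fig. (6a) can be sketched in a few lines; this is an illustrative PyTorch implementation, not the authors' code.

```python
# A minimal sketch, assuming PyTorch, of a ResNet identity block: the input x
# skips the two convolutions and is added to F(x) before the final activation.
import torch
import torch.nn as nn

class IdentityBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))   # F(x): first 3 x 3 convolution
        out = self.bn2(self.conv2(out))            # F(x): second 3 x 3 convolution
        return self.relu(out + x)                  # skip connection: F(x) + x

block = IdentityBlock(64)
feature_map = torch.randn(1, 64, 56, 56)           # e.g. a ResNet-18 stage-1 feature map
print(block(feature_map).shape)                    # torch.Size([1, 64, 56, 56])
```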

3.1.3. Feature Pyramid Network (FPN)

FPN is mainly used to extract features from an image [51]. FPN merges low-resolution and high-resolution features to obtain an accurate result. It takes a single image as input and produces proportionally sized feature maps at several levels. FPN does not depend on a particular backbone convolutional network; any convolutional network can be used. FPN is constructed with two pathways: bottom-up and top-down [59].

In Fig. (7), an input image of any size is taken. First, the bottom-up pathway is computed by passing the input to the first convolutional stage (C1), which uses stride 2, so its output has half the resolution of the input image. After this 50% reduction, the next convolutional stage downsamples the image further (overall stride 4). The output of each convolutional stage is used both by the next bottom-up stage and by the corresponding top-down level [60].

Before being fed into the top-down pathway, the output of each convolutional stage is passed through a 1 x 1 convolution filter to reduce the channel depth to a fixed size of 256 channels. In the top-down pathway, the previous top-down level (e.g., M3) is upsampled by 2 and added to the corresponding 1 x 1-reduced bottom-up output (Fig. 8). The result of the addition is then passed through a 3 x 3 filter; this smoothing step reduces the aliasing introduced by the 2x upsampling. Predictions (P2, P3, P4, and P5) are then made from each level, and these outputs form the feature pyramid [61].
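
The lateral connection of Fig. (8) (1 x 1 channel reduction, 2x upsampling, element-wise addition, and 3 x 3 smoothing) can be sketched as follows; this is an illustrative PyTorch implementation with assumed tensor shapes.

```python
# A minimal sketch, assuming PyTorch, of one FPN lateral connection:
# a 1 x 1 convolution reduces the bottom-up feature map to 256 channels,
# the coarser top-down map is upsampled by 2 and added, and a 3 x 3
# convolution smooths the merged map before prediction.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LateralConnection(nn.Module):
    def __init__(self, in_channels: int, out_channels: int = 256):
        super().__init__()
        self.reduce = nn.Conv2d(in_channels, out_channels, kernel_size=1)          # 1 x 1 conv
        self.smooth = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, bottom_up: torch.Tensor, top_down: torch.Tensor) -> torch.Tensor:
        lateral = self.reduce(bottom_up)                        # fix channel depth at 256
        upsampled = F.interpolate(top_down, scale_factor=2, mode="nearest")
        return self.smooth(lateral + upsampled)                 # merged pyramid level

# Example with illustrative shapes: C4 (512 channels, 28 x 28) and M5 (256 channels, 14 x 14).
c4 = torch.randn(1, 512, 28, 28)
m5 = torch.randn(1, 256, 14, 14)
p4 = LateralConnection(512)(c4, m5)
print(p4.shape)   # torch.Size([1, 256, 28, 28])
```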

3.2. Detection Methods

3.2.1. Fast R-CNN

R-CNN is a Region-based Convolutional Neural Network that detects objects in an image [62] and achieves good detection results. However, R-CNN [63] has limitations such as high time complexity and slow detection: image warping and a separate region-proposal algorithm make training slower and more expensive. Fast R-CNN was proposed to increase the speed and to obtain accurate results; it also removes the need for a separate storage disk for the extracted features.

Fast R-CNN (Fig. 9) takes an input image, processes it through convolutional layers, and applies Region of Interest (ROI) pooling to obtain a fixed-size feature map for each proposal [62]. From this feature map, a feature vector is extracted and passed to fully connected layers that feed two output heads: a SoftMax layer and a bounding-box regressor. The SoftMax layer identifies the class (including background), and the regressor plots the bounding box on the object using the SoftMax output [64]. However, Fast R-CNN still uses selective search to find the regions, which increases the time and slows detection; Faster R-CNN was introduced to overcome this [65].
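
The ROI pooling step can be illustrated with torchvision's roi_pool operator; the feature-map size, proposals, and stride below are assumptions for the sketch, not values from the article.

```python
# A minimal sketch, assuming PyTorch/torchvision, of the ROI pooling step in
# Fast R-CNN: region proposals are cropped from a shared feature map and
# pooled to a fixed size before the SoftMax and bounding-box heads.
import torch
from torchvision.ops import roi_pool

feature_map = torch.randn(1, 512, 50, 50)        # backbone output for one image
# Proposals in (x1, y1, x2, y2) image coordinates; the values are illustrative.
proposals = [torch.tensor([[ 40.,  60., 200., 180.],
                           [300., 100., 420., 260.]])]

# spatial_scale maps image coordinates onto feature-map coordinates
# (e.g. 1/16 for a backbone with a total stride of 16).
pooled = roi_pool(feature_map, proposals, output_size=(7, 7), spatial_scale=1.0 / 16)
print(pooled.shape)   # torch.Size([2, 512, 7, 7]); each ROI is flattened into a feature vector
```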

Fig. (7). Feature pyramid network architecture [51].

Fig. (8). Lateral connection.

Fig. (9). Fast R-CNN.

Fig. (10). (a) Faster R-CNN architecture [69], (b) region proposal network.

3.2.2. Faster R-CNN

Faster R-CNN is also an object detection method. The input image is first given to a Region Proposal Network (RPN) [66], which produces region proposals; it does not say what object is present, only whether an object is present or not (Fig. 10a). One advantage of Faster R-CNN is that it does not use a separate selective-search algorithm; it produces the region proposals automatically [67]. The proposals are fed into an ROI pooling (max pooling) layer, which feeds two tasks: a classification task that tells whether an object is present in the image, and a regression task that plots or adjusts the bounding box on the image [68].
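
A minimal sketch, assuming PyTorch/torchvision, of adapting an off-the-shelf Faster R-CNN (ResNet-50 + FPN backbone) to a custom class set such as the 12 ECG leads; the class count and target boxes are illustrative assumptions, not the article's configuration.

```python
# A minimal sketch, assuming PyTorch/torchvision, of fine-tuning Faster R-CNN
# for a custom number of classes, e.g. the 12 ECG leads plus background.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

num_classes = 13                                    # 12 leads + background (illustrative)
model = fasterrcnn_resnet50_fpn(weights="DEFAULT")  # backbone + RPN + ROI heads

# Replace the COCO classification/regression head with one sized for our classes.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# In training mode the model takes images and targets (boxes + labels) and
# returns a dictionary of RPN and ROI losses.
images = [torch.rand(3, 600, 800)]
targets = [{"boxes": torch.tensor([[100., 50., 300., 150.]]),
            "labels": torch.tensor([1])}]
model.train()
losses = model(images, targets)
print({k: float(v) for k, v in losses.items()})
```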

3.2.2.1. Region Proposal Network (RPN)

The RPN detects whether an object is present in an image; it does not say which object is present, only whether one is present or not [70]. It acts as a binary classifier: 1 if an object is found, 0 otherwise. First, features are extracted from the image with a pre-trained backbone (VGG, ResNet, etc.), and an intermediate feature map is fed into the first RPN layer, a 3 x 3 convolution with 512 channels, whose output is split into a classification branch and a regression branch. The classification branch uses a 1 x 1 convolution with 2 x 9 outputs (Fig. 10b): 2 objectness scores for each of the 9 anchor boxes. Anchor boxes of different sizes are used, and the Intersection over Union (IoU) between an anchor box and a ground-truth box [71] decides its label: if the IoU is 0.5 or greater (the boxes overlap by at least 50%), the anchor is labeled foreground; if it is less than 0.5, it is labeled background, as given in Eq. (4) [72]. The classification branch therefore says whether an object is present, while the regression branch takes the convolution output and predicts the bounding boxes. The regions produced by the RPN are fed into an ROI (maximum) pooling layer of size 7 x 7 x 512 and then into two fully connected layers, giving a regressor and a classifier: the SoftMax classifier gives the class, and the regressor refines the bounding box [73].

IoU = Area of overlap between the two bounding boxes / Area of their union (4)
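
Eq. (4) can be computed directly from two box coordinates; the sketch below uses hypothetical anchor and ground-truth boxes and cross-checks the result against torchvision's box_iou.

```python
# A minimal sketch of Eq. (4): Intersection over Union between two boxes
# given as (x1, y1, x2, y2); torchvision.ops.box_iou computes the same value.
import torch
from torchvision.ops import box_iou

def iou(box_a, box_b):
    xa1, ya1, xa2, ya2 = box_a
    xb1, yb1, xb2, yb2 = box_b
    inter_w = max(0.0, min(xa2, xb2) - max(xa1, xb1))
    inter_h = max(0.0, min(ya2, yb2) - max(ya1, yb1))
    intersection = inter_w * inter_h
    union = (xa2 - xa1) * (ya2 - ya1) + (xb2 - xb1) * (yb2 - yb1) - intersection
    return intersection / union

anchor = [50.0, 50.0, 150.0, 150.0]            # hypothetical anchor box
ground_truth = [100.0, 100.0, 200.0, 200.0]    # hypothetical ground-truth box
print(iou(anchor, ground_truth))               # ~0.143 -> background (< 0.5)
print(box_iou(torch.tensor([anchor]), torch.tensor([ground_truth])))   # same value from torchvision
```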

4. DISCUSSION

Diagnosing cardiovascular Diseases is essential to give proper treatment. The primary technique for diagnosing heart disease is an electrocardiogram. The most popular ECG used is the 12 Lead ECG. Signals from the body will be recorded and printed on a graph sheet. Heart abnormalities are diagnosed using the printed wave. Deep learning reduces time complexity and increases diagnosing accuracy.

Different convolutional networks with different backbone layers are used to detect objects in an image. Faster R-CNN is a recent technique that takes less time to train, with backbones such as ResNet-50 and VGG-16. Many datasets are publicly available, and different convolutional techniques have been applied to them; the detection accuracies reported by different researchers are shown in Table 5. An important challenge is detecting small objects in an image; for smaller objects, FPN gives good recognition results because it predicts objects in a pyramidal fashion.

Plain precision is not a sufficient metric here; instead, Mean Average Precision (mAP) is used, because object detection must be evaluated on both localization and classification, as shown in Eq. (5).

mAP = (1 / N) × Σ APi, i = 1, …, N (5)
where APi is the average precision (the area under the precision-recall curve) of class i and N is the number of classes.
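
For reference, the per-class average precision and the resulting mAP of Eq. (5) can be computed as follows; the detection scores and match flags are made-up examples, not results from the article.

```python
# A minimal sketch of Eq. (5): average precision (AP) per class from detection
# scores and IoU-based match flags, then mAP as the mean over classes.
import numpy as np

def average_precision(scores, is_true_positive, num_ground_truth):
    order = np.argsort(-np.asarray(scores, dtype=float))      # rank detections by confidence
    tp = np.asarray(is_true_positive, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    precision = cum_tp / (np.arange(len(tp)) + 1)
    recall = cum_tp / num_ground_truth
    # Add sentinels and take the precision envelope (VOC-style all-point AP).
    recall = np.concatenate(([0.0], recall, [1.0]))
    precision = np.concatenate(([0.0], precision, [0.0]))
    for i in range(len(precision) - 2, -1, -1):
        precision[i] = max(precision[i], precision[i + 1])
    idx = np.where(recall[1:] != recall[:-1])[0]
    return float(np.sum((recall[idx + 1] - recall[idx]) * precision[idx + 1]))

# Hypothetical detections for two classes (e.g. two lead types).
ap_per_class = [
    average_precision([0.9, 0.8, 0.6, 0.3], [1, 1, 0, 1], num_ground_truth=3),
    average_precision([0.95, 0.7, 0.5], [1, 0, 1], num_ground_truth=2),
]
mean_average_precision = sum(ap_per_class) / len(ap_per_class)   # Eq. (5)
print(ap_per_class, mean_average_precision)
```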

Further work can be carried out using Faster R-CNN with backbone networks such as ResNet and FPN. In existing work, SSD is used to detect objects in an image, but it only detects large objects well and fails on smaller ones [77, 78]. Detecting small objects is important in the medical field; in an ECG, a small change in a wave may indicate an abnormality. So, Faster R-CNN techniques can be used.

4.1. Issues/Challenges

4.1.1. Data Imbalance

The dataset available in the repository [48] has an imbalanced number of images across the classes, which affects performance, so balancing the available dataset is important. There are several ways to balance the dataset; as an example, take two classes, Abnormal Heart Beat (AHB) and Covid, with 779 and 250 images respectively.

4.1.1.1. Increasing the Image in the Lesser Class

The lesser class is the class with the smaller number of images; in our example, the Covid class. Increasing its number of images from 250 to 779, for example with image augmentation techniques, balances the classes.

4.1.1.2. Decreasing the Image in Higher Class

The higher class is the class with the larger number of images; in our example, the AHB class. Decreasing its number of images from 779 to 250 balances the classes.

4.1.1.3. Moderating

Moderating means fixing an intermediate count, such as 500 images, and increasing or decreasing the number of images in each class towards it, combined with the cross-entropy technique to handle any remaining imbalance.
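
One common way to realize the cross-entropy idea mentioned above is to weight the loss inversely to the class counts of Table 4; the sketch below, assuming PyTorch, illustrates that assumption and is not the article's prescribed method.

```python
# A minimal sketch, assuming PyTorch, of class-weighted cross entropy for an
# imbalanced dataset: each class is weighted inversely to its image count.
# The class counts are those of Table 4; the weighting scheme is an assumption.
import torch
import torch.nn as nn

class_counts = torch.tensor([779., 250., 375., 1243., 387.])   # AHB, Covid, History of MI, Normal, MI
class_weights = class_counts.sum() / (len(class_counts) * class_counts)

criterion = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(8, 5)                  # model outputs for a batch of 8 images
labels = torch.randint(0, 5, (8,))          # ground-truth class indices
loss = criterion(logits, labels)
print(class_weights, loss.item())
```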

4.2. Image Augmentation Issues

In most cases, image augmentation is carried out by rotating the image left or right by some degrees, but such augmentations cannot be used for the ECG image dataset: an ECG image carries its information in the waveform, and rotating the image changes the meaning of that waveform, leading to information loss. The only safe augmentations are photometric ones, such as changing the contrast, hue, and saturation or converting to grayscale.
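
A sketch of such ECG-safe augmentations, assuming torchvision transforms; the parameter values and directory name are illustrative.

```python
# A minimal sketch, assuming PyTorch/torchvision, of ECG-safe augmentations:
# photometric changes only (contrast, hue, saturation, grayscale), with no
# rotation or flipping that would distort the waveform.
from torchvision import transforms

ecg_augment = transforms.Compose([
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.05),
    transforms.RandomGrayscale(p=0.2),      # occasional grayscale conversion
    transforms.ToTensor(),
])

# Usage: pass ecg_augment as the transform of an image-folder dataset, e.g.
# torchvision.datasets.ImageFolder("ecg_images/", transform=ecg_augment),
# where "ecg_images/" is a hypothetical directory of class subfolders.
```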

Table 5. Different convolutional networks for detecting the object.
Network Backbone Dataset mAP
Faster R-CNN [74] ResNet-50 PCB 85.2%
Faster R-CNN [75] VGG-16 VOC 2007 73.2%
Faster R-CNN [74] ResNet-101 PCB 87.1%
Faster R-CNN [75] VGG-16 VOC 2012 70.4%
Faster R-CNN [74] ResNet-101 with FPN PCB 90.8%
Faster R-CNN [75] ResNet-50 with FPN PCB 86.4%
Fast R-CNN with RPN [67] VGG-16 COCO + VOC 2007 + VOC 2012 78.8%
Fast R-CNN with RPN [67] VGG-16 VOC 2007 + VOC 2012 73.2%
YOLOv3 [74] MobileNet PCB 70.6%
Faster R-CNN NAS-FPN Flower 87.6%
SSD [76] MobileNet v2 Flower 80.6%
Fig. (11). Two different lead alignments in ECG images.

4.3. Mis-labeling

It is important to be careful while labeling the class of an image, as a wrong label raises issues.

4.3.1. Machine Problem

When an ECG is taken, the machine prints the class of the particular patient on the ECG graph sheet. Sometimes the waves on the graph sheet are normal, but the machine assigns the patient a disease such as MI. This is an important problem, and such labels need to be cross-checked.

4.3.2. Lack of Knowledge

Labeling of the dataset by a person without a medical background is another problem; such labels need to be cross-checked with experts who are knowledgeable in ECG.

4.3.3. Image Belongs to many Classes

This is a major problem that arises while classifying diseases. If a patient has more than one disease, that patient's image belongs to two or more different classes. This issue can be solved by keeping the same patient's image in all the respective classes.

4.4. Format Difference

Fig. (11) shows two different lead alignments in ECG images. Because the lead alignment differs, classification may lead to misclassification of the diseases. Using deep learning, we can still correctly predict the diseases.

CONCLUSION

Deep learning is widely used in the medical field to diagnose diseases automatically. It minimizes manual effort, improves detection accuracy, and avoids the wrong predictions that arise from manual and machine error. Before applying deep learning techniques to ECG images for diagnosing heart diseases, it is important to learn the manual procedure for identifying the type of disease from the ECG graph sheet; for this purpose, the ECG interpretations of various heart diseases were discussed. Deep learning techniques were then discussed, including segmentation of each lead, pre-trained models, and different detection techniques. This paper also compared the accuracy of different DNN techniques and their associated challenges.

LIST OF ABBREVIATIONS

ECG = Electrocardiogram
ML = Machine Learning
DL = Deep Learning
DNN = Deep Neural Network
CNN = Convolutional Neural Network
aVR = Augmented Vector Right
aVL = Augmented Vector Left
aVF = Augmented Vector Foot
RNN = Recurrent Neural Network
VGG = Visual Geometry Group
ROI = Region of Interest
RPN = Region Proposal Network
AHB = Abnormal Heart Beat

CONSENT FOR PUBLICATION

Not applicable.

AVAILABILITY OF DATA AND MATERIALS

The data supporting the findings of the article are available in Mendeley Data at http://dx.doi.org/10.17632/gwbz3fsgp8.1, reference number [47].

FUNDING

None.

CONFLICT OF INTEREST

The authors declare no conflicts of interest, financial or otherwise.

ACKNOWLEDGEMENTS

Declared none.

REFERENCES

[1] F. Metivier, S.J. Marchais, A.P. Guerin, B. Pannier, and G.M. London, "Pathophysiology of anaemia: Focus on the heart and blood vessels", Nephrol. Dial. Transplant., vol. 15, pp. 14-18, 2000.
[2] M.H. Vafaie, M. Ataei, and H.R. Koofigar, "Heart diseases prediction based on ECG signals’ classification using a genetic-fuzzy system and dynamical model of ECG signals", Biomed. Signal Process. Control, vol. 14, pp. 291-296, 2014.
[3] N. Shanmathi, and M. Jagannath, "Computerised decision support system for remote health monitoring: A systematic review", IRBM, vol. 39, no. 5, pp. 359-367, 2018.
[4] H.D. White, and D.P. Chew, "Acute myocardial infarction", Lancet, vol. 372, no. 9638, pp. 570-584, 2008.
[5] G. Peretto, S. Sala, S. Rizzo, G. De Luca, C. Campochiaro, S. Sartorelli, G. Benedetti, A. Palmisano, A. Esposito, M. Tresoldi, G. Thiene, C. Basso, and P.D Bella, "Arrhythmias in myocarditis: State of the art", Heart Rhythm, vol. 16, no. 5, pp. 793-801, 2019.
[6] A. Sharma, S. Patidar, A. Upadhyay, and U.R Acharya, "Accurate tunable-Q wavelet transform based method for QRS complex detection", Comput. Electr. Eng., vol. 75, pp. 101-111, 2019.
[7] M. Gimbel, K. Qaderdan, L. Willemsen, R. Hermanides, T. Bergmeijer, E. De Vrey, T. Heestermans, M.T.J. Gin, R. Waalewijn, S. Hofma, F. Den Hartog, W. Jukema, C. Von Birgelen, M. Voskuil, J. Kelder, V. Deneer, and J.T. Berg, "Clopidogrel versus ticagrelor or prasugrel in patients aged 70 years or older with non-ST-elevation acute coronary syndrome (POPular AGE): The randomised, open-label, non-inferiority trial", Lancet, vol. 395, no. 10233, pp. 1374-1381, 2020.
[8] I. Jekova, V. Krasteva, and R. Schmid, "Human identification by cross-correlation and pattern matching of personalized heartbeat: Influence of ECG leads and reference database size", Sensors, vol. 18, no. 2, p. 372, 2018.
[9] Ö. Yildirim, "A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification", Comput. Biol. Med., vol. 96, pp. 189-202, 2018.
[10] A.K. Sangaiah, M. Arumugam, and G.B. Bian, "An intelligent learning approach for improving ECG signal classification and arrhythmia analysis", Artif. Intell. Med., vol. 103, p. 101788, 2020.
[11] A. Ullah, S.M. Anwar, M. Bilal, and R.M. Mehmood, "Classification of arrhythmia by using deep learning with 2-D ECG spectral image representation", Remote Sens., vol. 12, no. 10, p. 1685, 2020.
[12] S. Modi, Y. Lin, L. Cheng, G. Yang, L. Liu, and W.J. Zhang, "A socially inspired framework for human state inference using expert opinion integration", IEEE/ASME Trans. Mechatron., vol. 16, no. 5, pp. 874-878, 2011.
[13] R. Bajcsy, and S. Kovačič, "Multiresolution elastic matching", Comput. Vis. Graph. Image Process., vol. 46, no. 1, pp. 1-21, 1989.
[14] R.R. Bond, D.D. Finlay, C.D. Nugent, and G. Moore, "A review of ECG storage formats", Int. J. Med. Inform., vol. 80, no. 10, pp. 681-697, 2011.
[15] D.E. Lake, K.D. Fairchild, and J.R. Moorman, "Complex signals bioinformatics: Evaluation of heart rate characteristics monitoring as a novel risk marker for neonatal sepsis", J. Clin. Monit. Comput., vol. 28, no. 4, pp. 329-339, 2014.
[16] A.Y. Hannun, P. Rajpurkar, M. Haghpanahi, G.H. Tison, C. Bourn, M.P. Turakhia, and A.Y. Ng, "Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network", Nat. Med., vol. 25, no. 1, pp. 65-69, 2019.
[17] F. Jabeen, M. Maqsood, M.A. Ghazanfar, F. Aadil, S. Khan, M.F. Khan, and I. Mehmood, "An IoT based efficient hybrid recommender system for cardiovascular disease", Peer-to-Peer Netw. Appl., vol. 12, no. 5, pp. 1263-1276, 2019.
[18] W. Zhang, G. Yang, Y. Lin, C. Ji, and M.M. Gupta, On Definition Of Deep Learning, 2018 World Automation Congress (WAC), June 03-06, 2018, Stevenson, 2018
[19] I.S. Okwuosa, S.C. Lewsey, T. Adesiyun, R.S. Blumenthal, and C.W. Yancy, "Worldwide disparities in cardiovascular disease: Challenges and solutions", Int. J. Cardiol., vol. 202, pp. 433-440, 2016.
[20] N. Campbell-McBride, Put Your Heart in Your Mouth: Natural Treatment for Atherosclerosis, Angina, Heart Attack, High Blood Pressure, Stroke, Arrhythmia, Peripheral Vascular Disease.. Chelsea Green Publishing: Unites States, 2018.
[21] G. Mourad, T. Jaarsma, A. Strömberg, E. Svensson, and P. Johansson, "The associations between psychological distress and healthcare use in patients with non-cardiac chest pain: Does a history of cardiac disease matter?", BMC Psychiatry, vol. 18, no. 1, p. 172, 2018.
[22] "Electrocardiography Wikipedia Page", Available from: http://en.wikipedia.org/wiki/Electrocardiography
[23] B.G. Petty, Basic electrocardiography.. Springer Nature: Germany, 2020.
[24] J. Wannenburg, R. Malekian, and G.P. Hancke, "Wireless capacitive-based ECG sensing for feature extraction and mobile health monitoring", IEEE Sens. J., vol. 18, no. 14, pp. 6023-6032, 2018.
[25] T. Sawyer, R. Umoren, and M.M. Gray, "Neonatal resuscitation: Advances in training and practice", Adv. Med. Educ. Pract., vol. 8, pp. 11-19, 2016.
[26] D. Atwood, and D.L. Wadlund, "Ecg interpretation using the crisp method: A guide for nurses", AORN J., vol. 102, no. 4, pp. 396-408, 2015.
[27] C.V. Serrano Jr, L.A. Bortolotto, L.A.M. César, M.C. Solimene, A.P. Mansur, J.C. Nicolau, and J.A.F. Ramires, "Sinus bradycardia as a predictor of right coronary artery occlusion in patients with inferior myocardial infarction", Int. J. Cardiol., vol. 68, no. 1, pp. 75-82, 1999.
[28] T.D. Vermeulen, B.M. Shafer, A.V. Incognito, M. Nardone, A.L. Teixeira, P.J. Millar, J.K. Shoemaker, and G.E. Foster, "Case studies in physiology: Sympathetic neural discharge patterns in a healthy young male during end-expiratory breath hold-induced sinus pause", J. Appl. Physiol., vol. 129, no. 2, pp. 230-237, 2020.
[29] A. Shiyovich, A. Wolak, L. Yacobovich, A. Grosbard, and A. Katz, "Accuracy of diagnosing atrial flutter and atrial fibrillation from a surface electrocardiogram by hospital physicians: Analysis of data from internal medicine departments", Am. J. Med. Sci., vol. 340, no. 4, pp. 271-275, 2010.
[30] J. Hulting, "Detection of asystole, ventricular fibrillation and ventricular tachycardia with automated ECG monitoring", Acta Med. Scand., vol. 205, no. 1-6, pp. 17-23, 1979.
[31] A. Başoğlu, and U. Aydoğdu, "Terminal atrial standstill with ventricular escape rhythm in a neonatal calf with acute diarrhea", Turk. J. Vet. Anim. Sci., vol. 37, no. 3, pp. 362-365, 2013.
[32] J. Burd, and P. Kettl, "Incidence of asystole in electroconvulsive therapy in elderly patients", Am. J. Geriatr. Psychiatry, vol. 6, no. 3, pp. 203-211, 1998.
[33] F. Şap, Z. Karataş, H. Altin, H. Alp, B. Oran, T. Baysal, and S. Karaarslan, "Dispersion durations of P-wave and QT interval in children with congenital heart disease and pulmonary arterial hypertension", Pediatr. Cardiol., vol. 34, no. 3, pp. 591-596, 2013.
[34] S. Cheng, M.J. Keyes, M.G. Larson, E.L. McCabe, C.N Cheh, D. Levy, E.J. Benjamin, R.S. Vasan, and T.J. Wang, "Long-term outcomes in individuals with prolonged PR interval or first-degree atrioventricular block", JAMA, vol. 301, no. 24, pp. 2571-2577, 2009.
[35] A.G. Coumbe, N. Naksuk, M.C. Newell, P.E. Somasundaram, D.G. Benditt, and S. Adabag, "Long-term follow-up of older patients with Mobitz type I second degree atrioventricular block", Heart, vol. 99, no. 5, pp. 334-338, 2013.
[36] F.M. Fesmire, R.F. Percy, R.L. Wears, and T.L. MacMath, "Initial ECG in Q wave and non-Q wave myocardial infarction", Ann. Emerg. Med., vol. 18, no. 7, pp. 741-746, 1989.
[37] A.B. De Luna, D. Rovai, G. Pons Llado, A. Gorgels, F. Carreras, D. Goldwasser, and R.J. Kim, "The end of an electrocardiographic dogma: A prominent R wave in V1 is caused by a lateral not posterior myocardial infarction-new evidence based on contrast-enhanced cardiac magnetic resonance-electrocardiogram correlations", Eur. Heart J., vol. 36, no. 16, pp. 959-964, 2015.
[38] L. Bacharova, "Missing link between molecular aspects of ventricular arrhythmias and QRS complex morphology in left ventricular hypertrophy", Int. J. Mol. Sci., vol. 21, no. 1, p. 48, 2019.
[39] P. Varriale, and B.E. Chryssos, "The RSR′ complex not related to right bundle branch block: Diagnostic value as a sign of myocardial infarction scar", Am. Heart J., vol. 123, no. 2, pp. 369-376, 1992.
[40] M. Yılmaz, C. Altın, A. Tekin, T. Erol, İ. Arer, T.Z. Nursal, N. Törer, V. Erol, and H. Müderrisoğlu, "Assessment of atrial fibrillation and ventricular arrhythmia risk after bariatric surgery by P wave/QT interval dispersion", Obes. Surg., vol. 28, no. 4, pp. 932-938, 2018.
[41] M.D. Witting, K.M. Hu, A.A. Westreich, S. Tewelde, A. Farzad, and A. Mattu, "Evaluation of Spodick’s sign and other electrocardiographic findings as indicators of stemi and pericarditis", J. Emerg. Med., vol. 58, no. 4, pp. 562-569, 2020.
[42] J.J. Goy, J.C. Stauffer, J. Schlaepfer, and P. Christeler, Electrocardiography (ECG)., vol. 1. Bentham Science Publishers: Netherlands, 2013.
[43] Y. Zhao, J. Xiong, Y. Hou, M. Zhu, Y. Lu, Y. Xu, J. Teliewubai, W. Liu, X. Xu, X. Li, Z. Liu, W. Peng, X. Zhao, Y. Zhang, and Y. Xu, "Early detection of ST-segment elevated myocardial infarction by artificial intelligence with 12-lead electrocardiogram", Int. J. Cardiol., vol. 317, pp. 223-230, 2020.
[44] J. Brugada, P. Brugada, and R. Brugada, "The syndrome of right bundle branch block ST segment elevation in V1 to V3 and sudden death—the Brugada syndrome", Europace, vol. 1, no. 3, pp. 156-166, 1999.
[45] A. Malhotra, H. Dhutia, S. Gati, T.J. Yeo, H. Dores, R. Bastiaenen, R. Narain, A. Merghani, G. Finocchiaro, N. Sheikh, A. Steriotis, A. Zaidi, L. Millar, E. Behr, M. Tome, M. Papadakis, and S. Sharma, "Anterior T-wave inversion in young white athletes and nonathletes: Prevalence and significance", J. Am. Coll. Cardiol., vol. 69, no. 1, pp. 1-9, 2017.
[46] J.K. Fitzpatrick, and N. Goldschlager, "The clue is in the U wave: Torsades de pointes ventricular tachycardia in a hypokalemic woman on methadone", Ann. Emerg. Med., vol. 71, no. 4, pp. 473-476, 2018.
[47] A.H. Khan, M. Hussain, and M.K. Malik, "ECG images dataset of cardiac and COVID-19 patients", Data Brief, vol. 34, p. 106762, 2021.
[48] P. Hao, X. Gao, Z. Li, J. Zhang, F. Wu, and C. Bai, "Multi-branch fusion network for myocardial infarction screening from 12-lead ECG images", Comput. Methods Programs Biomed., vol. 184, p. 105286, 2020.
[49] Z. Wu, C. Shen, and A. Van Den Hengel, "Wider or deeper: Revisiting the resnet model for visual recognition", Pattern Recognit., vol. 90, pp. 119-133, 2019.
[50] K. Simonyan, and A. Zisserman, "Very deep convolutional networks for large-scale image recognition", arXiv, 2014. Available from: https://arxiv.org/abs/1409.1556
[51] T.Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, "Feature pyramid networks for object detection", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2117-2125.
[52] I. Ha, H. Kim, S. Park, and H. Kim, "Image retrieval using BIM and features from pretrained VGG network for indoor localization", Build. Environ., vol. 140, pp. 23-31, 2018.
[53] Y. Cai, T. Tang, L. Xia, B. Li, Y. Wang, and H. Yang, "Low bit-width convolutional neural network on RRAM", IEEE Trans. Comput. Aided Des. Integrated Circ. Syst., vol. 39, no. 7, pp. 1414-1427, 2020.
[54] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770-778.
[55] J. Yamanaka, S. Kuwashima, and T. Kurita, "Fast and accurate image super resolution by deep CNN with skip connection and network in network", International Conference on Neural Information Processing, 2017, pp. 217-225.
[56] L.D. Quach, N.P. Quoc, N.H. Thi, D.C. Tran, and M.F. Hassan, "Using SURF to improve ResNet-50 model for poultry disease recognition algorithm", 2020 International Conference on Computational Intelligence (ICCI), 2020, pp. 317-321.
[57] S. Targ, D. Almeida, and K Lyman, "Resnet in resnet: Generalizing residual architectures", arXiv, 2016. Available from: https://arxiv.org/abs/1603.08029
[58] K. Deeba, and B. Amutha, "WITHDRAWN: ResNet-deep neural network architecture for leaf disease classification", Microprocess. Microsyst., p. 103364, 2020.
[59] S.W. Kim, H.K. Kook, J.Y. Sun, M.C. Kang, and S.J. Ko, "Parallel feature pyramid network for object detection", Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 234-250.
[60] C. Deng, M. Wang, L. Liu, Y. Liu, and Y. Jiang, "Extended feature pyramid network for small object detection", IEEE Trans. Multimed., vol. 14, no. 2, 2021.
[61] J. Liu, L. Cao, O. Akin, and Y. Tian, "3DFPN-HS2: 3D feature pyramid network based high sensitivity and specificity pulmonary nodule detection", International Conference on Medical Image Computing and Computer-Assisted Intervention, 2019, pp. 513-521.
[62] R. Girshick, "Fast R-CNN", Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1440-1448.
[63] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 580-587.
[64] J. Li, X. Liang, S. Shen, T. Xu, J. Feng, and S. Yan, "Scale-aware fast R-CNN for pedestrian detection", IEEE Trans. Multimed., vol. 20, no. 4, pp. 985-996, 2017.
[65] M.C. Roh, and J.Y. Lee, "Refining faster-RCNN for accurate object detection", 2017 Fifteenth IAPR International Conference on Machine Vision Applications (MVA), 2017.
[66] S. Ren, K. He, R. Girshick, and J. Sun, "Faster r-cnn: Towards real-time object detection with region proposal networks", Adv. Neural Inf. Process. Syst., vol. 28, pp. 91-99, 2015.
[67] Z. Qian, Y. Lv, D. Lv, H. Gu, K. Wang, W. Zhang, and M.M. Gupta, "A new approach to polyp detection by pre-processing of images and enhanced faster R-CNN", IEEE Sens. J., vol. 21, no. 10, pp. 11374-11381, 2021.
[68] Y. Kawazoe, K. Shimamoto, R. Yamaguchi, Y.S Domoto, H. Uozaki, M. Fukayama, and K. Ohe, "Faster R-CNN-based glomerular detection in multistained human whole slide images", J. Imaging, vol. 4, no. 7, p. 91, 2018.
[69] A. Salvador, X. Giró-i-Nieto, F. Marqués, and S.I. Satoh, "Faster R-CNN features for instance search", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016, pp. 9-16.
[70] Z. Zhong, L. Sun, and Q. Huo, "An anchor-free region proposal network for Faster R-CNN-based text detection approaches", Int. J. Doc. Anal. Recognit., vol. 22, no. 3, pp. 315-327, 2019.
[71] F. van Beers, A. Lindström, E. Okafor, and M.A. Wiering, "Deep Neural Networks with Intersection over Union Loss for Binary Image Segmentation", In: Proceedings of the 8th International Conference on Pattern Recognition Applications and Methods ICPRAM. Feb. 19-21, 2019. Prague, Czech Republic, 2019.
[72] F. Manigrasso, F.D. Miro, L. Morra, and F. Lamberti, "Faster-LTN: A neuro-symbolic, end-to-end object detection architecture", arXiv, vol. 12892, pp. 40-52.
[73] R. Gavrilescu, C. Zet, C. Foșalău, M. Skoczylas, and D. Cotovanu, "Faster R-CNN: An approach to real-time object detection", 2018 International Conference and Exposition on Electrical and Power Engineering (EPE), 2018, pp. 0165-0168.
[74] B. Hu, and J. Wang, "Detection of PCB surface defects with improved faster-RCNN and feature pyramid network", IEEE Access, vol. 8, pp. 108335-108345, 2020.
[75] G. Han, X. Zhang, and C. Li, "Revisiting faster R-CNN: A deeper look at region proposal network", International Conference on Neural Information Processing, vol. 10636, 2017, pp. 14-24.
[76] I. Patel, and S. Patel, "An optimized deep learning model for flower classification using NAS-FPN and faster R-CNN", Int. J. Sci. Tech. Res., vol. 9, no. 03, pp. 5308-5318, 2020.
[77] A.H. Khan, M. Hussain, and M.K. Malik, "Cardiac disorder classification by electrocardiogram sensing using deep neural network", Complexity, vol. 2021, pp. 1-8, 2021.
[78] N. Jothiaruna, and A.A. Leema, "SSDMNV2-FPN: A cardiac disorder classification from 12 lead ECG images using deep neural network", Microprocess. Microsyst., vol. 2022, p. 104627, 2022.