Design of Cascaded CNN for Medical Image Super Resolution

Abstract: Images captured by a camera are often over- or under-exposed, typically because of inappropriate lighting conditions or limited camera resolution. Enhancement techniques are therefore essential for correcting such artefacts. The primary objective of adjustment and enhancement techniques is to improve the characteristics of an image; however, since enhancement perturbs the original numeric values of the image, these techniques must be designed so that image quality is not compromised. This research work proposes a network design for deep convolutional neural networks applied to super-resolution. To reduce the complexity of existing techniques, the work concentrates on network design, different filter sizes, and CNN architecture. The CNN is among the most effective models for detection and segmentation in images, and the proposed model improves the efficiency of medical image reconstruction from low resolution (LR) to high resolution (HR). The proposed model demonstrated its efficiency not only on PET medical images but also on a retinal database, and achieved improved results compared with existing works.


INTRODUCTION
Over the past couple of years, multimedia content has been generated at a rapid rate across the globe. In the past, only a very small section of society had the privilege of owning cameras or camcorders; in the modern world, digital equipment is owned by almost everyone.
It is therefore very easy to create content today. As prices fell and cameras were integrated into mobile phones, these devices became extremely popular. Videos and images can now be captured at the click of a button, and the data can be stored very cheaply.
A large amount of content is created every single day, and because of this volume, classifying, labelling, and searching content is a substantial task. Almost everyone in the modern world uses the internet for one purpose or another, and relevant content can be downloaded from any corner of the world in an instant. Classification makes these tasks simpler.
The rapid increase in the amount of content has also made it much harder to identify people relevant to a particular stream of surveillance.

A. Image
An image is a matrix of pixels arranged in rows and columns. It can be viewed as a two-dimensional function of two spatial coordinates x and y, where the amplitude of the function at any pair of coordinates (x, y) represents the intensity or gray level of the image at that point.

B. Processing of Digital Image
Let x represent the abscissa and y the ordinate of a function f(x, y). Any image in image processing can be represented by this two-dimensional function.
It is therefore necessary, as the very first step, to convert every image into the form of an f(x, y) function so that calculations can be performed; the calculations are then carried out on these values.
Hence the process is called digital image processing: every image is converted into a format that can easily be read by a digital computer.
The pixels of a digital image carry both its intensity and its location along the abscissa and ordinate. The location of each pixel is unique, but the intensity values may coincide; to distinguish pixels properly, both parameters must therefore be considered.

III. OBJECTIVES
• Improving image analysis and visualization.
• Optimization of deep network architecture.
• Improving the efficiency of the system.
This research work therefore proposes a network design for deep convolutional neural networks applied to super-resolution. Many researchers have focused on the design of network learning techniques and loss functions, but existing techniques are quite complex, which increases the overall computational cost. To reduce this complexity, the present work concentrates on network design, different filter sizes, and CNN architecture.

A. Proposed methodology
The stated objectives of the thesis are met by designing a supervised CNN model. The proposed methodology consists of the following steps:
• Step 1: Low-resolution images are taken as input from the datasets.
• Step 2: High-resolution images are taken as reference images.
The steps are discussed below.
a. Input LR Image
Images are collected from different data sources or datasets, described in detail in the methodology. To create the low-resolution image dataset, a degradation level is set and the degraded images are used for training.
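The LR-input step above can be sketched in a few lines. This is a minimal stand-in, not the paper's exact pipeline: it uses block-averaging followed by nearest-neighbour upsampling in place of the bicubic degradation the text mentions, and the function name `make_lr` and the scale factor are my own choices.

```python
import numpy as np

def make_lr(hr, scale=2):
    """Create a low-resolution training input from an HR image.
    Block-averaging stands in for the bicubic degradation in the text."""
    h = hr.shape[0] // scale * scale
    w = hr.shape[1] // scale * scale
    hr = hr[:h, :w]
    # Average each scale x scale block to lose high-frequency detail.
    lr = hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))
    # Upsample back to HR size (nearest neighbour) so LR/HR pairs align.
    return np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)

hr = np.arange(16, dtype=float).reshape(4, 4)
lr = make_lr(hr, scale=2)
```

Keeping the LR image at the HR grid size, as above, matches the usual setup for residual-learning super-resolution, where the network only has to predict the missing detail.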

b. Reference Image Generation
For reference image generation, high resolution images are used.

➢ Network Description
The robust system is developed using a convolutional neural network (CNN). In this work, residual learning is adopted as the proposed approach, in which the bicubically down-sampled degradation of the image is mapped with residual learning. The residual learning loss function is

l(Θ) = (1 / 2N) Σ_{i=1}^{N} || R(y_i; Θ) − (y_i − x_i) ||²

Where,
N = number of input training images
y_i = LR image
x_i = input image
R = residual learning function
Θ = input parameters of the CNN
Proposed CNN Architecture: In this work, the CNN layers are designed as follows (Fig. 1):
• The first layer of the proposed CNN architecture is a combination of Conv and LReLU. In this layer 64 feature maps are produced using convolution filters of size 3 × 3 × c, where c is the number of image channels (c = 1 for grayscale, c = 3 for color).
• The LReLU activation function is used as the non-linearity.
• Layers 2 to 15 generate 64 feature maps each, whereas layers 16 to 29 generate a gradually increasing number of feature maps.
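A single Conv + LReLU stage of the kind the layer list describes can be illustrated with a numpy sketch. This is a toy single-channel version, assuming zero padding and a fixed (not learned) leaky slope; the helper names `conv2d_same` and `lrelu` are mine, not the paper's.

```python
import numpy as np

def conv2d_same(img, kernel):
    """Naive 3x3 'same' convolution (zero padding), single channel."""
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(img, pad)
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return out

def lrelu(x, alpha=0.01):
    """Leaky ReLU: identity for x >= 0, small slope alpha for x < 0."""
    return np.where(x >= 0, x, alpha * x)

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
feat = lrelu(conv2d_same(img, rng.standard_normal((3, 3))))
```

A real layer would hold 64 such kernels (one per feature map) and learn the slope; the sketch shows only the spatial arithmetic of one map.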
In the proposed architecture, the feature maps are constructed using principal component analysis (PCA), which supports multivariate analysis. All input images are fed to the PCA, which generates feature maps from them; the degradation vector Dv is also fed into the PCA. From each feature map an eigenvector is calculated, which is used to remove redundant features and to extract the useful information needed to generate the reconstructed output image. Using PCA in the intermediate layers of the CNN does not increase the complexity of the overall network structure; moreover, integrating principal components with the feature maps enhances performance.
➢ Network Depth
The proposed network model consists of 40 layers and takes LR images as input. The LR image is created from a normal image by adding a degradation level. Repeated units of convolution and leaky ReLU activation form the hidden layers, and the last layer is a regression output layer. The output layer produces the HR image, whose loss function is computed and minimized. The size of the output is set to 3 × 3 × Cw, where Cw is the width of the channel or feature maps. The pooling layers are removed from this network. The receptive field of the network for depth d is (2d + 1) × (2d + 1). The choice of depth determines the trade-off between performance and efficiency.
Kernel size is another important factor that determines the efficiency and performance of the network. The kernel size is generally set to n × n, with degradation level σ, and the image size is taken to be w × h × c, where w is the width of the image, h the height, and c the number of channels. Initially, kernel vectors of size n × n are created and fed into the PCA, where the n-dimensional vectors are reduced to m-dimensional feature vectors. The low-dimensional degraded image is then converted into a feature vector of size w × h × m.
Each ith feature map is associated with a vector v_i.
Where x is the input data. The gradient of the sigmoid activation function is defined as

S'(x) = S(x)(1 − S(x))    (3)

where S(x) is the sigmoid function. When the input value is large, the sigmoid gradient approaches zero. The ReLU activation function is therefore written as in equation (4):

f(x) = max(0, x)    (4)

When the input value is less than zero, the ReLU gradient vanishes to zero, so ReLU does not learn on the negative slope. In LReLU the slope for negative inputs is learned rather than set to zero. With random weight initialization, the CNN is trained with LReLU, which increases efficiency, reduces the over-fitting risk that occurs with ReLU, and lowers computational cost. The LReLU is defined as in equation (5):

f(x_l) = x_l if x_l > 0, otherwise a · x_l    (5)

Where,
a = the learnable parameter that controls the slope of the negative part
x_l = the input data of the non-linear activation function on the l-th layer.
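The three activation functions compared above can be written directly from equations (3)–(5). This is a minimal numpy sketch; the slope value 0.1 is a placeholder, since in the paper the slope a is a learned parameter.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Equation (3): S'(x) = S(x) * (1 - S(x)); saturates to 0 for large |x|.
    s = sigmoid(x)
    return s * (1.0 - s)

def relu(x):
    # Equation (4): zero gradient for all negative inputs.
    return np.maximum(0.0, x)

def lrelu(x, a=0.1):
    # Equation (5): negative inputs keep a small slope a instead of dying.
    return np.where(x >= 0, x, a * x)
```

Evaluating `sigmoid_grad` at a large input (e.g. x = 20) returns a value near zero, which is exactly the vanishing-gradient behaviour the text says LReLU avoids.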
The LReLU activation function is used to train the deep convolutional neural network and solves the vanishing-gradient problem that arises with the ReLU activation function. For image classification, efficiency is improved by replacing the parameter-free ReLU; the network converges faster and its complexity is reduced. The slope is parameterized on both the negative and the positive side, and this dual parameterization controls the effectiveness of the gradient of the function. The LReLU can be written as

f(x) = max(0, x) + a · min(0, x)

where a is the learning coefficient or learnable parameter. If a is set very small, this activation reduces to Leaky ReLU (LReLU), which avoids the zero-gradient issue and improves accuracy. The entire network trains itself, improving the handling of low-resolution images and the efficiency of the whole network; compared with ReLU, only one additional small parameter must be learned. The proposed resolution network effectively adapts and learns the LReLU parameters and jointly improves the efficiency of the entire network. Here m_n denotes the mean weight value related to I_n(x, y) and σ_n the standard deviation of the weight value related to I_n(x, y). The global gradient is given by

W_g^n(x, y) = Grad_n(I_n(x, y))^{-1} / ( Σ_{n=1}^{N} Grad_n(I_n(x, y))^{-1} + ε )    (10)

where ε is a small positive constant that keeps the denominator non-zero, and Grad_n(I_n(x, y)) is the cumulative histogram pixel gradient at intensity I_n(x, y). Because this gradient is computed not locally over neighbouring pixels but over all pixels within an image, it is called the global gradient.
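Equation (10) can be checked numerically with a short sketch. The gradient maps here are random placeholders (5 images of 4×4 pixels), and the function name `global_gradient_weight` is an assumption of mine, not from the paper.

```python
import numpy as np

def global_gradient_weight(grads, eps=1e-8):
    """Equation (10): weight each pixel of image n by the inverse of its
    gradient, normalised over all N images; eps keeps denominators non-zero."""
    inv = 1.0 / (grads + eps)              # Grad_n(I_n(x, y))^-1
    return inv / (inv.sum(axis=0) + eps)   # normalise over the N axis

rng = np.random.default_rng(2)
grads = rng.uniform(0.1, 1.0, size=(5, 4, 4))  # N=5 gradient maps, 4x4 pixels
weights = global_gradient_weight(grads)
```

A useful sanity check on the formula: at every pixel the weights across the N images sum to (almost exactly) one, so they behave like a per-pixel weighting scheme.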
The LR image and its degradation are improved by updating the weights of the CNN features. The pixels of the LR image are reconstructed by enhancing the image resolution: the degraded pixels are enhanced and their values improved.
The degradation values are removed by updating both the local and the global parametric values of the LR input image. The proposed CNN model therefore effectively reduces the degradation parameters and improves the effectiveness of the entire methodology.

B. Algorithm
Return → X high_resolution exit
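Only the final return statement of the algorithm listing survives in the text, so the following is a hypothetical reconstruction of the overall loop it implies: a stack of layers predicts a residual that is added back to the LR input (the residual-learning scheme described earlier). The function name `super_resolve` and the toy stand-in layers are assumptions, not the paper's listing.

```python
import numpy as np

def super_resolve(x_low, layers):
    """Hypothetical sketch of the reconstruction loop: pass the LR image
    through the layer stack, then add the predicted residual back.
    `layers` is a list of callables standing in for Conv + LReLU stages."""
    residual = x_low
    for layer in layers:
        residual = layer(residual)
    return x_low + residual  # residual learning: HR = LR + R(LR)

toy_stack = [lambda t: 0.5 * t, lambda t: 0.5 * t]  # toy stand-in layers
x_high_resolution = super_resolve(np.ones((4, 4)), toy_stack)
```

With real trained layers, the loop body is the only part that changes; the surrounding structure (input, stack, residual addition, return) matches the residual formulation in the network description.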

A. Description of Dataset
In this section we compare the state-of-the-art algorithms on the publicly available PET dataset [11] and retina database [12]. Emblematic images from both datasets are shown in Figure 5. The PET images are brain tumour images taken from a Kaggle repository containing approximately 250 brain tumour MRI images; this repository is used for the simulation analysis. The retinal database [12] was established by a collaborative research group to support comparative studies of automatic segmentation algorithms on retinal fundus images. The public database currently contains 15 images of healthy patients, 15 images of patients with diabetic retinopathy, and 15 images of glaucomatous patients. The gold-standard data were generated by a group of experts in retinal image analysis together with clinicians from the cooperating ophthalmology clinics. Samples of both datasets are illustrated in Figure 5.

B. Performance Metric Evaluation
To improve image quality, several operations such as denoising, deblurring, and color correction are applied, whose perceived effect varies with the human visual system (HVS). Common image quality assessment (IQA) metrics include PSNR, CNR, and MSE. HVS-based metrics are of two types: a. SSIM and b. MSSIM. These support both quantitative and qualitative measurement. In the proposed methodology, the following performance parameters are used:

a. Peak Signal to Noise Ratio (PSNR)
PSNR is a very important tool for assessing image quality: the greater the PSNR, the better the technique. It is computed from the mean square error (MSE) as PSNR = 10 log10(MAX² / MSE), where MAX is the maximum possible pixel value. Since the MSE depends on the scaling, a good scaling algorithm yields an image closer to the original; the best image quality is obtained when the mean square error decreases.

b. Structural Similarity Index (SSIM)
Structural similarity (SSIM) is a quality assessment methodology in digital image processing used to measure the similarity between images as well as videos. The index is fully reference-based: it evaluates noisy, compressed, or distorted images against a reference image. Improving the peak signal-to-noise ratio by reducing the mean square error (MSE) ultimately improves the qualitative measures. The structural similarity between two windows x and y of the same size N × N is calculated as

SSIM(x, y) = ((2 μ_x μ_y + c1)(2 σ_xy + c2)) / ((μ_x² + μ_y² + c1)(σ_x² + σ_y² + c2))

Where,
μ_x = mean of x, μ_y = mean of y,
σ_x² = variance of x, σ_y² = variance of y,
σ_xy = covariance of x and y,
c1 and c2 are variables that stabilize the division when the denominator is weak.
Table 2 shows the performance evaluation of the PET dataset on the proposed model, with an observed average PSNR of 40 and an average SSIM of approximately 0.9819. Table 3 shows the corresponding evaluation, with an average PSNR of 40 and an average SSIM of approximately 0.9869.
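The SSIM formula above can be implemented over a single global window as a sketch (real SSIM averages the index over many local windows). The constants c1 = (0.01·255)² and c2 = (0.03·255)² are the commonly assumed defaults, not values stated in the paper.

```python
import numpy as np

def ssim(x, y, c1=6.5025, c2=58.5225):
    """Single-window SSIM from the formula above.
    c1, c2 assume 8-bit images: (0.01*255)^2 and (0.03*255)^2."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

img = np.arange(64, dtype=float).reshape(8, 8)
print(ssim(img, img))        # identical images give exactly 1.0
```

Unlike MSE, SSIM penalises a uniform brightness shift only mildly through the luminance term, which is why it tracks perceived quality more closely.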

D. Comparative Performance Evaluation
In this section the performance of the proposed CNN model is evaluated and compared with existing work. The work presented by Zhang et al. [9] is used for comparative analysis. PET images of the brain are taken as the data for the performance evaluation of the proposed model. The authors used image patches to extract spatial location information and fed it into a CNN model to identify the blur kernels in PET images. The model was designed with 40 CNN layers and various combinations of network parameters. Table 4 presents some examples of HR reconstructions from LR images. In recent years, image super-resolution (SR) has gained considerable attention from researchers. Low-resolution images with coarse details can be converted into corresponding high-resolution images, which provides better visual effects and more refined image details so that downstream applications can be performed more efficiently. Image super-resolution can be defined in terms of scaling, up-sampling, enhancement, enlargement, and so on. High-resolution images are applied in many domains such as object detection, contrast enhancement, medical imaging, remote sensing, and geo-satellite imagery. SR remains a challenging task because of ill-exposed images arising from several causes. Several methods and techniques address low-resolution issues, but complexities remain: up-scaling increases the training complexity, missing-value recovery at high scaling factors is very challenging and makes the entire system more complex, and quality assessment remains difficult. SR is a process that examines LR images and constructs high-resolution (HR) images; during the LR-to-HR conversion, the frequency components are increased and degradations are removed.
Many applications in the digital world use high-resolution images for analysis, which opens scope for future developments. Nevertheless, image super-resolution remains a challenging research issue with significant real-world applications. The success of learning approaches has led to rapid progress in deep-convolutional-network-based systems for image super-resolution, and a diverse set of approaches has been proposed, with exciting developments in network structures and learning strategies. In future work, this model will be explored for unsupervised super-resolution.