Classification and Detection of Various Geographical Features from Satellite Imagery

Classifying and detecting various geographical features from satellite imagery of the Earth, as well as of other celestial bodies, is a challenging task. This paper puts forward several pixel-based classification algorithms to classify geographical features from satellite images of the Earth. Experimental results recorded on a total of 606 satellite images demonstrate that the best algorithmic performance approaches 87%. This paper also presents a simple algorithm based on edge approximation and the circular Hough transformation to detect craters in satellite imagery of celestial bodies; its performance is evaluated on a publicly available crater dataset. In general, all the proposed algorithms are straightforward but in many ways effective.


Introduction
Classifying and detecting miscellaneous geographical features from satellite imagery of the Earth, as well as of other celestial bodies, is a difficult task. For example, Figure 1 shows sample satellite images of the Bosnian city of Banja-Luka [1]. We need to classify geographical features (e.g., cemeteries, fields, houses, industries, rivers, streets, forests, and mountains) in those images. Cemeteries contain many small white patches or rectangles, while the industrial class is quite varied and much more difficult to detect. Houses in the housing class may have red roofs, which makes classification somewhat easier once the necessary features are extracted properly. Forests and rivers, however, are not easy to classify because of their similar appearance. Many researchers have tried to solve the problem of classifying geographical features from satellite images of the Earth with great accuracy [2][3][4][5]. For example, Haralick et al. [2] showed how texture plays a vital role in the classification of satellite imagery, photographic imagery, and photomicrographs at 1:20000 scale, with a maximum accuracy of 83%. Cleve et al. [3] compared pixel-based and object-based classifications using high-resolution aerial photographs, with an accuracy score of approximately 40%. Risojevic et al. [4] analyzed the importance of texture in remote-sensing image classification by employing Gabor wavelet coefficients [6] and support vector machines [7], with an accuracy score of 85%. Risojevic [5] showed how such images can be classified by means of convolutional neural networks, with a maximum accuracy of 99%. Weih et al. 
[8] claimed that object-based classification is better than pixel-based classification. In this paper, however, we are interested in pixel-based classification algorithms, motivated by the proposition that pixel-based methods analyze the spectral properties of every pixel within the area of interest [9]. In this vein, we propose several pixel-based classification algorithms to classify geographical features from satellite images of the Earth. As the experimental dataset, we utilize the satellite imagery of the Bosnian city of Banja-Luka to classify geographical features.

DOI: 10.21533/pen.v6i1.144 | PEN Vol. 6, No. 1, April 2018, pp. 84-94

Figure 1. Sample satellite images from the Banja-Luka dataset [1]. How can the geographical features (e.g., cemetery, fields, houses, industry, river, and trees) be classified from these images?

Figure 2. A sample satellite image of Mars. How can the number of craters in this image be detected?

The Banja-Luka dataset [1] consists of 28 cemetery images, 178 field images, 143 house images, 75 industry images, 77 river images, and 105 tree images. We also apply WEKA to train neural networks for classifying geographical features. A crater can be defined as a cavity on the surface of the Earth or any celestial body marking the orifice of a volcano. Providing ground truth for the craters in satellite imagery is a challenging task. If a dataset of satellite images from a celestial body is available, one would like to count the number of craters in every image. For example, Figure 2 shows a sample satellite image of Mars, and we wish to detect the number of craters on it. It is very hard to verify algorithmic results when no ground truths are available. Besides, craters come in all shapes and sizes, so detecting them is a genuinely difficult task. Size is one of the main problems, since some craters can be a few pixels wide while others can be hundreds of pixels wide. Craters can lie inside other craters. They can be cut in 
half due to the size of the image. The color of craters can also play a significant role, since distinct celestial bodies usually have different surface colors. For instance, Mars is more orange and red in color, while its moon Phobos [10] is grayer. Shading plays an important role as well, i.e., craters can contain shadows cast by neighboring hills or mountains, and the detection of such craters is extremely hard. Another problem is that no ground truths exist for the datasets of celestial bodies; explicitly, no dataset mentions the exact number of craters. An approximate ground truth has therefore been produced for each dataset by manual detection and counting. Like event detection (e.g., [11][12][13]), crater detection has become a key interest of many computer vision researchers [14,15]. For instance, Honda et al. [14] applied a variant of the genetic algorithm to form crater candidates before using a self-organizing map to separate craters from non-craters. Meng et al. [15] constructed candidate areas that might contain craters using Hough transforms; the performance of their algorithm was reasonable, but reliability became a problem (e.g., false positives) when the craters did not have a clear edge. Mu et al. [16] used cell algorithms to detect craters on celestial bodies with 86% accuracy. Upadhyay et al. [17] tried to detect crater boundaries using MATLAB functions. We propose a simple algorithm based on edge approximation and the circular Hough transform to detect craters from satellite imagery of celestial bodies. It minimizes the color- and shape-related problems of the craters, and its performance on a publicly available dataset [18] is noteworthy. The rest of the paper is organized as follows. Section 2 illustrates the necessary implementation steps. Section 3 reports experimental results, followed by some hints for future work. Finally, Section 4 concludes the paper.

Implementation Steps
In this section, five key algorithms to classify satellite imagery and one algorithm to detect craters are discussed.

P RGB : RGB (Red-Green-Blue) pixels sum

The sum of all RGB pixel values of an image is a straightforward feature for classifying various geographical features in satellite images, since the RGB pixel values can distinguish certain features quite easily. For example, cemetery images usually have large amounts of white pixels due to the gravestones, which appear as a high count in all RGB channels. River images do not contain many red pixels, whereas housing images possess many red pixels because of their red roofs. Images of fields and forests have many green pixels but far fewer red and blue pixels. Taking these common cases into account, a simple but somewhat effective classifier can be proposed.
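The P RGB feature described above can be sketched as follows. This is a minimal, illustrative Python sketch; the paper gives no implementation, so the function name and the nested-list image representation are our own assumptions.

```python
def rgb_sum_features(image):
    """P_RGB sketch: sum each colour channel over all pixels and use the
    three sums as a feature vector for a downstream classifier."""
    r_sum = g_sum = b_sum = 0
    for row in image:
        for r, g, b in row:
            r_sum += r
            g_sum += g
            b_sum += b
    return (r_sum, g_sum, b_sum)

# Demo: a 2x2 image with two reddish and two greenish pixels.
tiny = [[(200, 30, 20), (190, 40, 25)],
        [(10, 180, 30), (20, 170, 40)]]
print(rgb_sum_features(tiny))  # (420, 420, 115)
```

A cemetery image would then show high sums in all three channels, while a field image would show a green sum dominating the red and blue sums.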
P W GR : White-Green-Red (WGR) pixels count

The sum of all RGB pixel values is not sufficient for a good classifier. Consequently, it is a good idea to count exactly how many WGR pixels exist in a given image. A white pixel should not be considered red merely because it contains a high level of red. Hence, the WGR pixels are counted according to the following threshold conditions:
1. A pixel is considered white if all three of its RGB values are greater than 200; the average of the three RGB values is then taken.
2. A pixel is considered as green if it contains: a red pixel value smaller than 100, a blue pixel value smaller than 150, and a green pixel value greater than 100.
3. A pixel is considered as red if it contains: a red pixel value greater than 190, a green pixel value smaller than 200, and a blue pixel value not greater than 200.
The aforementioned constant values (i.e., 200, 100, 150, 100, 190, and 200) were defined experimentally to obtain good algorithmic performance, so no theoretical reasoning is given here.
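The three threshold conditions above can be sketched as a single counting pass over the image. This is an illustrative Python sketch; the order in which the checks are applied (white, then green, then red) is our own assumption, and the paper's additional averaging of white pixels' channels is omitted.

```python
def count_wgr(image):
    """Count white, green, and red pixels using the paper's thresholds."""
    white = green = red = 0
    for row in image:
        for r, g, b in row:
            if r > 200 and g > 200 and b > 200:     # white: every channel above 200
                white += 1
            elif r < 100 and b < 150 and g > 100:   # green: low red and blue, high green
                green += 1
            elif r > 190 and g < 200 and b <= 200:  # red: high red, moderate green and blue
                red += 1
    return white, green, red

# Demo: one white, one green, one red, and one unclassified gray pixel.
print(count_wgr([[(255, 255, 255), (50, 150, 60),
                  (200, 100, 100), (120, 120, 120)]]))  # (1, 1, 1)
```

Note that checking white before red matters: a (255, 255, 255) pixel also satisfies the red condition, so an unordered test would double-count it.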

D T : Texture (T) difference in certain colors and images
Normally, there is no formal mathematical definition of texture. One of the first quantitative, perceptually motivated texture descriptions can be found in [19], where six basic textural features, namely coarseness, contrast, directionality, line-likeness, regularity, and roughness, were approximated computationally. Bharati et al. [20] discussed various methods for image texture analysis. In general, an image texture is a set of metrics computed in image processing that is intended to quantify the perceived texture of an image; it provides knowledge about the spatial arrangement of colors or intensities in an image or a selected region of it. The texture differences in certain colors and images can thus be implemented as features.

C W R : White-Rectangle (WR) count
By counting how many white shapes exist in a given image, cemeteries become much easier to classify; the WR count exploits this. A flood-fill algorithm is implemented to go from one white pixel to its neighbors and count the number of connected white pixels. The following steps are used for C W R : 1. Convert the image to a gray image using a simple averaging method.
2. Convert the gray image into a binary image using a simple thresholding technique that distinguishes between extremely white pixels, e.g., those with a gray value higher than 230, and non-white pixels.
3. Seed a flood fill at every fifth pixel with respect to width and height. The flooding is recursive, stopping at the edges of the image, at pixels already claimed by another indexed flood fill, and at non-white pixels.
4. A map is built that records how many pixels belong to each index, i.e., how many pixels each flood fill covered.
5. Indexes with a moderate pixel count (e.g., between 5 and 30) are regarded as white rectangles.
The aforementioned constant values (i.e., 230, 5, and 30) were defined experimentally to obtain good algorithmic performance, so no theoretical reasoning is given here.
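The five steps above can be sketched as follows. This is an illustrative Python sketch, not the paper's implementation; the flood fill is written iteratively (with an explicit stack) rather than recursively to avoid deep recursion, and the function and parameter names are ours.

```python
def count_white_rectangles(gray, lo=5, hi=30, thresh=230, step=5):
    """Sketch of C_WR: flood-fill white regions of a grayscale image and
    count those whose pixel count lies in [lo, hi]."""
    h, w = len(gray), len(gray[0])
    # Step 2: binarize -- extremely white (> thresh) vs. everything else.
    binary = [[1 if px > thresh else 0 for px in row] for row in gray]
    seen = [[False] * w for _ in range(h)]
    count = 0
    # Step 3: seed a flood fill at every fifth pixel in width and height.
    for y in range(0, h, step):
        for x in range(0, w, step):
            if binary[y][x] and not seen[y][x]:
                size, stack = 0, [(y, x)]
                while stack:  # iterative flood fill over 4-connected neighbours
                    cy, cx = stack.pop()
                    if 0 <= cy < h and 0 <= cx < w and binary[cy][cx] and not seen[cy][cx]:
                        seen[cy][cx] = True
                        size += 1
                        stack += [(cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)]
                # Step 5: moderate-sized white blobs count as white rectangles.
                if lo <= size <= hi:
                    count += 1
    return count

# Demo: a 20x20 image with one 3x3 bright patch (size 9 -> counted)
# and one isolated bright pixel (size 1 -> ignored).
demo = [[0] * 20 for _ in range(20)]
for yy in range(3):
    for xx in range(3):
        demo[yy][xx] = 255
demo[5][5] = 255
print(count_white_rectangles(demo))  # 1
```

As in the paper's step 3, seeding only every fifth pixel means a small blob that touches no seed is never visited; this trades a little recall for speed.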

C RP : River-Pixel (RP) count
One of the simplest ways to classify river images is to look at the individual pixels of the image, since the pixels of river images have a distinctive character. Whether a pixel is finally considered a river pixel depends on the following analysis: 1. The absolute difference between the green and blue values must be negligible.
2. The absolute difference between the green and red values is taken into account.
3. The green value itself is taken into account.
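The three conditions above can be sketched as a per-pixel test. This is an illustrative Python sketch: the paper does not state the numeric thresholds, so the values below (and all names) are our own assumptions, not values from the paper.

```python
def is_river_pixel(r, g, b, gb_tol=25, gr_min=20, g_min=60):
    """Sketch of the C_RP pixel test; thresholds are illustrative assumptions."""
    return (abs(g - b) <= gb_tol   # 1. green-blue difference is negligible
            and g - r >= gr_min    # 2. green clearly exceeds red
            and g >= g_min)        # 3. the green value itself is high enough

def river_pixel_ratio(image):
    """Fraction of pixels classified as river pixels."""
    total = matched = 0
    for row in image:
        for r, g, b in row:
            total += 1
            matched += is_river_pixel(r, g, b)
    return matched / total if total else 0.0

# Demo: a murky green-blue pixel passes; a reddish pixel does not.
print(is_river_pixel(40, 110, 100), is_river_pixel(200, 60, 50))  # True False
```

An image whose river-pixel ratio exceeds some learned cutoff would then be classified as a river image.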

Training of neural networks
WEKA is used to train the classifiers; the following of its classification methods are employed.
1. Multilayer Perceptron (MulPer): a feed-forward neural network trained with back-propagation, which results in a better learning outcome.

2. Logistic Regression (LogReg): a regression model in which the dependent variable is categorical rather than continuous.
3. Sequential Minimal Optimization (SeqOpt): an algorithm for solving the quadratic programming problem that arises in the training of support vector machines.
4. Random Forest (RanFor): an ensemble algorithm in which a series of random decision trees is constructed and their outputs are aggregated statistically, e.g., by mode or mean.

Algorithm to detect craters in celestial bodies
We propose a crater detection algorithm based on the circular Hough transformation to detect craters in images of celestial bodies. The algorithm is simple but in many ways effective. The Hough transform is a technique that can be used to determine the parameters of a curve which best fits a set of given edge points. Under normal conditions, this edge description is obtained with a Roberts cross operator [21], a Sobel filter [22], or a Canny edge detector [23]. Nevertheless, the output of these edge detectors may be noisy and fragmented, as multiple edge fragments can correspond to a single whole feature. The output of an edge detector only ascertains the whereabouts of features in an image; the Hough transform then determines both the kind of features and how many of them exist in the image. These properties of the Hough transform make it a rational choice for recognizing craters without restricting the possibilities to one astronomical object: it models candidate objects that resemble the sought objects, whether they are perfectly or imperfectly shaped, and the candidates are ranked as local maxima in the parameter space. Both the standard Hough transform and the circular Hough transform are used in many computer vision applications. The standard Hough transform utilizes the parametric representation of a line to detect straight lines. Under normal conditions, its input is a binary image containing the edge pixels. During the transformation, it iterates through the edge points and calculates all (angle, distance) pairs for them; each point yields a sinusoidal curve in the parameter space. The crossing of two sinusoids suggests that their corresponding points belong to the same line, and the more sinusoids cross a given point, the more edge points lie on the same line. The circular Hough transform, in contrast, can be employed to detect astronomical objects such as craters. It relies on the equation of a circle and consequently has three parameters, namely the radius of the circle and the x and y coordinates of its center. Computing these parameters requires a larger computation time and more memory, which increases the complexity of extracting information from an image; for this reason, the radius of the circle can be fixed at a constant value. For each edge point, a circle is drawn with that point as origin and the desired radius. The circular Hough transform uses a three-dimensional accumulator array, where the first two dimensions give the coordinates of the circle center and the third gives the radius. The values in the array are incremented every time such a circle is drawn over an edge point. The array, which thus counts how many circles pass through each coordinate, is then searched for the highest counts: the coordinates with the highest counts are the centers of the circles in the image. Listing 1 illustrates our proposed crater detection algorithm.
Listing 1. Our proposed crater detection algorithm.

rgbI = imread('Image0001.png');      % Get a camera view RGB image.
figure
imshow(rgbI);                        % rgbI is a 3D image, but we need a 2D image.
[height, width, dimension] = size(rgbI);
x = sprintf('An original camera view image of size %d x %d x %d', height, width, dimension);
set(gca, 'fontsize', 21);            % Set font size.
disp(title(x));                      % Display the title.
grayI = rgb2gray(rgbI);              % Convert from 3D to 2D using a weighted mean of RGB channels.
figure
imshow(grayI);
[height, width, dimension] = size(grayI);
x = sprintf('Converted grayscale intensity image of size %d x %d x %d', height, width, dimension);
set(gca, 'fontsize', 21);
disp(title(x));
bi = edge(grayI, 'canny');           % Create a binary image of grayI using Canny approximation.
figure
imshow(bi);
set(gca, 'fontsize', 21);
title('The found edges in the grayscale intensity image using Canny approximation.');
radii = 10:1:30;                     % Search radii from 10 to 30 pixels in steps of 1 pixel.
[h, margin] = circle_hough(bi, radii, 'same', 'normalise');  % Hough transform for circles.
figure
set(gca, 'fontsize', 21, 'LineWidth', 3, 'Color', 'green');
% ... (lines drawing each accumulated circle as a rectangle with curvature [1 1];
%      garbled beyond full recovery in the source) ...
xlabel('x-coordinate');
ylabel('y-coordinate');
title('The circles on Hough transform image.');
peaks = circle_houghpeaks(h, radii, 'npeaks', 13);  % Select any number (say 13) of prominent peaks.
figure
set(gca, 'fontsize', 21);
xy = [peaks(1, :)', peaks(2, :)'];
hist3(xy, [max(peaks(3, :)) max(peaks(3, :))]);
set(get(gca, 'child'), 'FaceColor', 'interp', 'CDataMode', 'auto');  % Color the bars based on height.
xlabel('x-pixel values');
ylabel('y-pixel values');
zlabel('Radii values');
title('13 prominent peaks on the Hough transform image.');
peaks = circle_houghpeaks(h, radii, 'npeaks', 2);   % Get any number (say 2) of most prominent peaks.
figure
set(gca, 'fontsize', 21);
imshow(rgbI);
hold on
for peak = peaks
    [x, y] = circlepoints(peak(3));
    plot(x + peak(1), y + peak(2), 'r-', 'LineWidth', 5, 'Color', 'red');
end
title('The most prominent 2 peaks have been located on the original image.');
hold off
The function circle_hough(bi, radii, 'same', 'normalise') comes from a freely available MATLAB (MATrix LABoratory) Hough transform toolbox. It takes a binary two-dimensional image bi and a vector radii giving the radii of the circles to detect. It returns the three-dimensional accumulator array h and an integer margin such that h(i,j,k) contains the number of votes for the circle centred at bi(i-margin, j-margin) with radius radii(k). Circles which pass through bi but whose centres lie outside bi also receive votes; each feature in bi is allowed one vote per circle. The option 'same' returns only the part of h corresponding to centre positions within the image; in this case h(:,:,k) has the same dimensions as bi and margin = 0. This option should not be used if circles whose centres are outside the image are to be detected. The option 'normalise' multiplies each slice h(:,:,k) by 1/radii(k), which can be useful because larger circles get more votes, roughly in proportion to their radius. The companion function circle_houghpeaks(h, radii, 'npeaks', 15) locates the positions of 15 peaks in the output of circle_hough. The result peaks is a 3 x N array, where each column gives the position and radius of a possible circle in the original array: the first row holds the x-coordinates, the second row the y-coordinates, and the third row the radii. Assume that Figure 3 (a) shows an image of the Moon's surface and that we are interested in specific patterns on it, in particular a visible crater. How can we detect it? Figures 3 (b)-(f) depict the output of the algorithm in Listing 1; Figure 3 (f) shows that our algorithm correctly detected the two craters present in Figure 3 (a).
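The fixed-radius voting scheme described above can be sketched in a few lines. This is an illustrative Python reimplementation of the idea, not the paper's MATLAB code; the function name, the one-vote-per-degree discretization, and the demo geometry are all our own assumptions.

```python
import math

def circular_hough_votes(edge_points, radius, height, width):
    """Fixed-radius circular Hough transform sketch: every edge point votes
    for all candidate centres lying at distance `radius` from it, and the
    accumulator cell with the most votes is the best circle centre."""
    acc = [[0] * width for _ in range(height)]
    for ey, ex in edge_points:
        for t in range(360):  # one vote per degree along the voting circle
            cy = ey + round(radius * math.sin(math.radians(t)))
            cx = ex + round(radius * math.cos(math.radians(t)))
            if 0 <= cy < height and 0 <= cx < width:
                acc[cy][cx] += 1
    return max((acc[y][x], (y, x)) for y in range(height) for x in range(width))

# Demo: synthesize edge points on a circle of radius 8 centred at (20, 20)
# in a 40x40 image and recover the centre from the accumulator.
pts = [(20 + round(8 * math.sin(math.radians(t))),
        20 + round(8 * math.cos(math.radians(t)))) for t in range(0, 360, 10)]
votes, centre = circular_hough_votes(pts, 8, 40, 40)
print(centre)  # at or adjacent to (20, 20)
```

The accumulator here is two-dimensional because the radius is fixed; sweeping radii, as circle_hough does, adds the third dimension discussed above.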

Experimental setup
Primarily, we performed the experiments on a computer with an 8-core CPU at 3.50 GHz and 16 GB of RAM. WEKA was used to train the classifiers. Satellite images from both the Banja-Luka dataset [1] and one of the publicly available crater datasets [18] were used to verify the performance of our algorithms. The number of hidden layers in MulPer was set to 3. The default WEKA values were used for the other parameters, e.g., the learning rate and the configuration of the multilayer perceptron (number of neurons per layer).

Results of geographical features classification
Table 1 presents the results of our algorithms for classifying various geographical features from the Banja-Luka dataset without using the classification algorithms of WEKA. Algorithm C RP performed best, with an average of 86.66%, among the individual algorithms P RGB , P W GR , D T , and C W R . This is largely because the images of fields, houses, industries, rivers, and trees matched the following three conditions well: (i) the absolute difference between the green and blue values is negligible; (ii) the absolute difference between the green and red values is taken into account; (iii) the green value is taken into account. But those conditions did not suit the cemetery images very well: cemeteries have huge amounts of white pixels because of gravestones, and the number of white pixels is very high compared to the number of other RGB pixels. Thus algorithm C RP gave its worst performance (78%) on cemetery classification. The combined algorithm P RGB +P W GR +D T +C W R +C RP demonstrated the best performance, 86.83%, on the Banja-Luka dataset without using the classification algorithms of WEKA; as Table 1 shows, its average performance is only slightly greater than that of C RP . Table 2 presents the performance of our algorithms when the classification algorithms of WEKA are used. The gain with respect to the P RGB algorithmic performance in Table 2 indicates the performance increment or decrement when any new feature is added to or removed from the existing features of an algorithm. An average success rate of approximately 70% was obtained for each of the individual algorithms P RGB , P W GR , D T , C W R , and C RP . But the combined algorithm P RGB +P W GR +D T +C W R +C RP achieved the best success rate, 87.34%, on the Banja-Luka dataset using the 
classification algorithms of WEKA. With the sequential minimal optimization of WEKA, the gains of the algorithms P W GR , D T , and C W R decreased by 1.45%, 3.37%, and 2.28%, respectively, but the gain of C RP increased by 13.16%. The gains of P W GR , D T , C W R , and C RP were not reported for the multilayer perceptron, logistic regression, and random forest. For the algorithm P RGB +P W GR +D T +C W R +C RP , the gains increased up to 30.18%, 21.44%, 21.48%, and 26.67% for sequential minimal optimization, multilayer perceptron, logistic regression, and random forest, respectively. The performance difference for P RGB +P W GR +D T +C W R +C RP with and without WEKA is not significant. This is largely because P RGB +P W GR +D T +C W R +C RP without WEKA had already reached its near-optimal performance; consequently, no optimization options had a significant effect on further performance increments. Normally, performance increases when WEKA classification algorithms are used, but this proposition may not hold for already optimal cases; the performance may even decrease. For example, the average performance of the algorithm C RP without WEKA classification algorithms was 86.66%, but with WEKA its performance dropped to about 70%, i.e., approximately a 19.22% performance degradation. Similarly, about 8% mean performance degradation can be observed for the algorithm P RGB . As a concluding remark, the overall performance of our proposed algorithms demonstrates that they can play an important role in machine learning and computer vision applications, including the classification of miscellaneous geographical features from satellite imagery.

Comparison with state-of-the-art method
The method of Risojevic et al. [4] produced 85% correct classification results on the Banja-Luka dataset. The average performances of our proposed algorithms C RP and P RGB +P W GR +D T +C W R +C RP , together with that of Risojevic et al. [4], are shown in Table 3. Our simple algorithms, with up to 87% correct classification results, perform more or less on a par with the state-of-the-art method (e.g., Risojevic et al. [4]) without any serious concern about effectiveness. In general, many craters were detected correctly by our proposed crater detection algorithm, but small craters, craters on the image border, and many non-crater-like objects posed a serious challenge to it.

Future works
Our straightforward classification algorithms need little processing time to extract simple features. On average, their accuracy on the Banja-Luka dataset was above 70%, a performance that may attract attention for many applications related to computer vision and pattern recognition. But more advanced features (e.g., features regarding texture and object recognition) could be extracted. If more features are extracted and utilized, dimensionality reduction techniques (e.g., principal component analysis) should be employed to find exactly which features do not affect the solution. Only 606 satellite images were used to test our classification algorithms; it is important to test them on a large series of other datasets, since more data means more possible observations for enhancing the algorithms. To improve the results further, the algorithms would need to be analyzed and their behavior studied to see which model fits the datasets best. Areas under the receiver operating characteristic curves [13] and multiple comparisons with statistical tests [24] would be performed on the proposed algorithms and state-of-the-art algorithms. The time complexity [25] of each algorithm, including cache memory effects [26,27], would also be computed and compared.

Conclusion
Our first aim was to classify the geographical features of a large dataset with high-resolution images in a short amount of time. To this end, we proposed several pixel-based classification algorithms to classify geographical features from satellite images of the Earth. The recorded experimental results on the Banja-Luka dataset showed a highest effectiveness of approximately 87%. Our second aim was to detect craters in images of celestial bodies other than the Earth by proposing a simple but effective algorithm. Its output on a publicly available crater dataset showed partial fulfilment of this aim: craters come in different colors, shapes, and sizes, and while our algorithm minimizes the problems related to the surface color and shape of the craters, the size problem remains for future investigation. Future study would also include further improvement of the algorithmic performances along with error analysis and time complexity.


Figure 3. A sample output of the algorithm in Listing 1.

Figure 4 depicts a sample input image (a), edge detection (b), circular Hough transformation (c), and the best crater detection performance (d) using the algorithm in Listing 1. Figure 5 displays a sample input image (a), edge detection (b), circular Hough transformation (c), and the worst crater detection performance (d) using the same algorithm. Our crater detection algorithm worked well in many cases, although with modest average results.

Figure 5. A sample of one of the worst crater detection results using our proposed algorithm.

Table 1. Performance of our algorithms to classify various geographical features without using WEKA.

Table 2. Performance of our algorithms to classify various geographical features using WEKA.

Table 3. Performance comparison for classifying various geographical features from the Banja-Luka dataset.

Risojevic et al. [4]: 85%
P RGB +P W GR +D T +C W R +C RP without WEKA [Ours]: 86.83%
P RGB +P W GR +D T +C W R +C RP using WEKA [Ours]: 87.34%