An Efficient feature selection algorithm for the spam email classification

Authors

  • Hadeel M. Saleh

DOI:

https://doi.org/10.21533/pen.v9.i3.867

Abstract

Existing spam email classification systems suffer from low accuracy because of the high dimensionality involved in the associated feature selection (FS) process. As a global optimization problem in machine learning, FS aims mainly to reduce redundancy in the dataset so that the results remain acceptable and accurate. This study combines the Chaotic Particle Swarm Optimization (PSO) algorithm with the Artificial Bee Colony (ABC) algorithm to reduce feature dimensionality and thereby improve the accuracy of spam email classification. The features for each particle were represented in binary form, obtained by applying a sigmoid function. Feature selection was driven by a fitness function based on the classification accuracy obtained with a support vector machine (SVM). The proposed system was evaluated on the Spambase dataset using two criteria: the performance of the classifier and the dimension of the selected feature vectors that served as its input. The results show that the PSO-ABC approach performed well in terms of FS, even with a small set of selected features.
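The abstract mentions two mechanisms that a short sketch can make concrete: turning a particle into a binary feature mask with a sigmoid function, and scoring that mask by classifier accuracy. The Python below is only an illustration under stated assumptions, not the paper's implementation. The stochastic thresholding rule is the one commonly used in binary PSO and is assumed here, and `accuracy_of` is a hypothetical callable standing in for an SVM evaluation.

```python
import math
import random


def sigmoid(v):
    """Logistic function mapping a real value to the interval (0, 1)."""
    v = max(-60.0, min(60.0, v))  # clamp to avoid overflow in math.exp
    return 1.0 / (1.0 + math.exp(-v))


def binarize(velocities, rng):
    """Turn real-valued particle components into a 0/1 feature mask.

    Feature j is selected (1) when a uniform random draw falls below
    sigmoid(velocities[j]). This is the usual binary-PSO rule and is an
    assumption here, not a detail confirmed by the abstract.
    """
    return [1 if rng.random() < sigmoid(v) else 0 for v in velocities]


def fitness(mask, accuracy_of):
    """Score a mask by the accuracy obtained on the selected features.

    `accuracy_of` is a hypothetical callable (for example, a wrapper that
    trains and validates an SVM) taking the list of selected column indices.
    """
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:  # an empty subset gives the classifier nothing to use
        return 0.0
    return accuracy_of(selected)


rng = random.Random(0)
mask = binarize([-2.0, 0.0, 3.5, -0.7, 1.2], rng)
# Stand-in scorer for illustration only: it favours subsets with feature 2.
score = fitness(mask, lambda cols: 0.9 if 2 in cols else 0.6)
```

In a full search loop, each candidate mask produced by the PSO and ABC updates would be passed to `fitness`, and the mask with the highest score would be kept as the selected feature subset.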

Published

2021-08-31

Issue

Section

Articles

How to Cite

An Efficient feature selection algorithm for the spam email classification. (2021). Periodicals of Engineering and Natural Sciences, 9(3), 520-531. https://doi.org/10.21533/pen.v9.i3.867