A Comparison of Support Vector Machines, Memory-based and Naïve Bayes Techniques on Spam Recognition

G. Eryigit and A.C. Tantug (Turkey)

Keywords

Spam Recognition, Support Vector Machines, Memory Based Learning, Naïve Bayes Learning

Abstract

This paper compares support vector machines (SVM), memory-based learning (MBL) and Naïve Bayes (NB) techniques for classifying mail as legitimate or spam. Although a number of studies compare methods for spam filtering, most of them test the methods on separate data sets. To evaluate the effectiveness of the SVM, MBL and NB methods on equal terms, we used a common, publicly available corpus (LINGSPAM). Because the MBL and NB methods have previously been tested on this corpus, the best parameters obtained in that earlier work are used in our experiments with few changes. For SVMs, in contrast, extensive experiments were carried out to find the best attribute dimensions. The results show that SVM performs significantly better in the no-cost and high-cost cases, whereas NB performs best when the cost is extremely high.
