The BBC is running a strangely named article (“How to make spam unstoppable”) today that describes how hard it is for spammers to defeat Bayesian spam filters. The article covers anti-spam researcher John Graham-Cumming’s attempts to fool his own Bayesian filter. The filter had been trained for him and was reporting a 99% success rate in filtering out spam.
Graham-Cumming found a group of words that would always get spam through his filter. But to find that group, he had to put web bugs in his email to see which copies got through, and he had to send himself 10,000 different copies of a message. No spammer alive would do that much work just to spam one person.
Though the article doesn’t come out and say it, Graham-Cumming’s experiment shows that Bayesian filtering is by far the best way to filter out spam. With proper training, a Bayesian filter can be 99% effective, and spammers will not be able to fool it without an impractical amount of work aimed at a single recipient. (Compare SpamAssassin in non-Bayesian mode, which has to come out with periodic updates as spammers find new ways around SA’s filters.)
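For readers who have not seen one, here is a rough sketch of the idea behind a Bayesian filter: count how often each word shows up in mail you have marked as spam and in mail you have marked as good, then score a new message by which pile its words look more like. This is a toy written for illustration only. The class name `TinyBayesFilter`, its methods, and the sample messages are all made up here, and it is not how POPFile, SpamAssassin, or Graham-Cumming's filter is actually implemented.

```python
import math
from collections import Counter


class TinyBayesFilter:
    """Toy naive Bayes spam filter (hypothetical, for illustration only)."""

    def __init__(self):
        # Word counts and message counts for each category.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record the words of one message the user marked as 'spam' or 'ham'."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Return an estimate between 0 and 1 that the message is spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Prior: share of training mail in this category (add-one smoothed).
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            for word in text.lower().split():
                count = self.word_counts[label][word]
                # Add-one smoothing so an unseen word never gives log(0).
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


# Example training data (invented for this sketch).
spam_filter = TinyBayesFilter()
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("buy cheap watches now", "spam")
spam_filter.train("meeting notes attached", "ham")
spam_filter.train("lunch tomorrow at noon", "ham")

print(spam_filter.spam_probability("buy cheap pills"))
print(spam_filter.spam_probability("meeting tomorrow"))
```

Because the word counts come from one person's own mail, the words that look "good" differ from mailbox to mailbox. That is why a spammer would have to probe each recipient separately, as Graham-Cumming did with his 10,000 messages.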
I strongly recommend using a Bayesian filter to weed out spam. Some email programs (Eudora, Mozilla Thunderbird) come with Bayesian filters built in. For those that don’t (such as Microsoft Outlook and Outlook Express), I recommend downloading and installing POPFile. (POPFile has the advantage of being able to sort mail into more categories than just spam. It gets my highest recommendation.)