October 01, 2003

Training Bayesian Filters, 10:14 PM

I've been collecting spam for a year now, and I have close to 1,500 spam messages that were sent to me. I went through my spam archives, added all the messages that SpamAssassin missed to that collection, and now I'm running

sa-learn --spam --mbox junkspool

...on all of them, getting SpamAssassin to learn my spam. After that, I'll run

sa-learn --ham --mbox archive

...on all my archives of good mail. I just love the Bayesian algorithm. The more spam people send, the better we will be able to block it. I can't wait for the day when spammers realize that no one is getting their mail and they just stop.

I plan to keep all my archives so I can train a new bayes database at the drop of a hat. Just thought you'd like to know. So there.
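
For the curious, here is roughly what "at the drop of a hat" could look like as a script. This is only a sketch, not something I've set up as a cron job or anything: the function name retrain_bayes and the SA_LEARN variable are made up for illustration, and the default file names are just the ones from the commands above. It assumes sa-learn is on your PATH and that wiping the existing database with --clear is really what you want.

```shell
#!/bin/sh
# Sketch: rebuild a Bayes database from saved mbox archives.
# SA_LEARN and retrain_bayes are illustrative names, not part of SpamAssassin.
SA_LEARN="${SA_LEARN:-sa-learn}"

retrain_bayes() {
    spam_mbox="${1:-junkspool}"   # mbox file of known spam
    ham_mbox="${2:-archive}"      # mbox file of known good mail

    # Refuse to touch the database unless both archives are present.
    for f in "$spam_mbox" "$ham_mbox"; do
        if [ ! -f "$f" ]; then
            echo "missing mbox file: $f" >&2
            return 1
        fi
    done

    # Start from an empty database, then learn spam, then ham.
    $SA_LEARN --clear || return 1
    $SA_LEARN --spam --mbox "$spam_mbox" || return 1
    $SA_LEARN --ham --mbox "$ham_mbox" || return 1

    # Print the database summary counters so the result can be eyeballed.
    $SA_LEARN --dump magic
}
```

Calling retrain_bayes with no arguments uses junkspool and archive in the current directory; pass two paths to point it somewhere else. Checking for both files before clearing means a typo in a path won't leave you with an empty database.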

