[ILUG] Running spamassassin on a large number of email messages

Timothy Murphy gayleard at eircom.net
Thu Sep 11 13:50:10 IST 2008


On Tuesday 09 September 2008, Timothy Murphy wrote:

> I have a directory containing about 50,000 email messages,
> of which I am sure over 95% is spam.
>
> What is the best (quickest) way of passing these messages through
> spamassassin and deleting those that fail the test?

I've been following jm's suggestion:
----------------------------------------------------
You could run

    spamassassin -t /path/to/maildir/new > mbox1
    spamassassin -t /path/to/maildir/cur > mbox2

It'll output mboxes with the filtered versions of each message.  You then
need to replace the maildir with the mboxes and remove the ones that
match "X-Spam-Flag: YES".
----------------------------------------------------
(One can hardly ignore the horse when he opens his mouth ...)

But the remainder of the process is rather long-winded in my case,
and I am wondering what exactly I was expected to do.

I could write a Perl script to run through each mbox and delete the messages
that have the "X-Spam-Flag: YES" header set.
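
As a rough illustration of that idea, here is a minimal sketch in Python
rather than Perl, using only the standard mailbox module. It assumes the
mbox was produced as in jm's suggestion, so that spam carries the header
"X-Spam-Flag: YES". The function names is_flagged_spam and strip_spam are
my own inventions, and the file is modified in place, so it would be wise
to try it on a copy first.

```python
import mailbox


def is_flagged_spam(msg):
    """Return True if the message carries the header X-Spam-Flag: YES."""
    return str(msg.get("X-Spam-Flag", "")).strip().upper() == "YES"


def strip_spam(mbox_path):
    """Delete flagged messages from the mbox in place.

    Returns a (kept, removed) pair of counts.
    """
    box = mailbox.mbox(mbox_path)
    box.lock()
    try:
        kept = removed = 0
        # keys() returns a list copy, so removing while looping is safe.
        for key in box.keys():
            if is_flagged_spam(box[key]):
                box.remove(key)
                removed += 1
            else:
                kept += 1
        box.flush()  # rewrite the file without the removed messages
    finally:
        box.unlock()
        box.close()
    return kept, removed
```

Calling strip_spam("mbox1") and then strip_spam("mbox2") should leave only
the unflagged messages in each file.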

I use kmail as my email reader, normally with maildir folders.
What I have actually done is to create a Test account using mbox format,
import the mbox files, and transfer the nice messages to a maildir folder.

But surely there is an easier way of completing the task?
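
One possibility, sketched here under assumptions rather than tested against
kmail, is to skip the import step and copy the unflagged messages from the
mbox straight into a maildir with Python's standard mailbox module. The
function name copy_ham_to_maildir and both paths are hypothetical, and I
am assuming kmail will pick up messages that appear in the maildir's new/
subdirectory.

```python
import mailbox


def copy_ham_to_maildir(mbox_path, maildir_path):
    """Copy every message not flagged as spam from an mbox into a maildir.

    Returns the number of messages copied. The mbox itself is left untouched.
    """
    src = mailbox.mbox(mbox_path)
    # create=True makes the maildir (with cur/, new/, tmp/) if it is missing.
    dest = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    try:
        for msg in src:
            flag = str(msg.get("X-Spam-Flag", "")).strip().upper()
            if flag != "YES":
                # Messages that are not MaildirMessage objects land in new/.
                dest.add(msg)
                copied += 1
        dest.flush()
    finally:
        src.close()
        dest.close()
    return copied
```

For example, copy_ham_to_maildir("mbox1", "/path/to/clean-maildir") would
leave the original mbox as it was and put only the unflagged messages in
the target folder.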

As always, I await the valued advice of the ILUG guruship.



-- 
Timothy Murphy  
e-mail: gayleard /at/ eircom.net
tel: +353-86-2336090, +353-1-2842366
s-mail: School of Mathematics, Trinity College, Dublin 2, Ireland


