[ILUG] tool for removing duplicate mails from mbox files?

John Gaughan jgaughan-ilug at irish-times.com
Thu Nov 1 11:42:33 GMT 2001


On Thu, 01 Nov 2001, Ken Guest wrote:
> I've tried google and freshmeat, searching for a tool to remove
> duplicate mails from mbox files to no avail.
> Does anybody know of such a utility, or do I need to cobble one
> together?

formail (part of the procmail package) should do it.  With the -D
option, formail keeps a cache of the Message-IDs it has seen and uses
it to detect duplicate messages.  Combined with the -s option (which
splits the mbox into individual messages), formail won't output the
duplicates.  -D takes two arguments: the maximum size of the
Message-ID cache and the filename to store it in.

For example (with an ID cache called msgid.cache of size 8192):

    formail -D 8192 msgid.cache -s < oldmboxfile > newmboxfile
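
If procmail isn't available and you do end up cobbling something
together, the same idea (remember each Message-ID, skip any message
whose ID has been seen before) can be sketched in Python with the
standard mailbox module.  This is only a rough illustration, not how
formail is implemented: the function name dedupe_mbox is made up for
the example, and keeping every message that has no Message-ID header
is my own assumption about what you would want.

```python
import mailbox


def dedupe_mbox(src_path, dst_path):
    """Copy src mbox to dst, skipping repeated Message-IDs.

    Hypothetical helper for illustration. Returns (kept, dropped).
    """
    seen = set()          # Message-IDs encountered so far
    kept = dropped = 0
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)  # created if missing, appended to if present
    try:
        for msg in src:
            msg_id = str(msg.get("Message-ID", "")).strip()
            if msg_id and msg_id in seen:
                dropped += 1      # duplicate: do not copy it
                continue
            if msg_id:
                seen.add(msg_id)
            # Assumption: messages without a Message-ID are always kept.
            dst.add(msg)
            kept += 1
        dst.flush()
    finally:
        dst.close()
        src.close()
    return kept, dropped
```

Calling dedupe_mbox("oldmboxfile", "newmboxfile") would then play the
same role as the formail command above.  Note that it appends to
newmboxfile if that file already exists, so start with a fresh output
file.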

John.

-- 
John Gaughan, Systems Administrator
Irish Times New Media - http://www.ireland.com/



