felix at compsoc.nuigalway.ie
Wed Nov 3 12:47:12 GMT 2004
Quoting Timothy Murphy <tim at birdsnest.maths.tcd.ie>:
> On Wednesday 03 November 2004 11:37, Darragh Bailey wrote:
> > I'm using mbox format as well and the time taken to perform
> > sa-learn --spam --no-rebuild --mbox ~/mail/spam/spam && sa-learn --ham
> > --no-rebuild --mbox ~/mail/spam/ham && sa-learn --rebuild
> > is about 1 minute.
> What does --no-rebuild do?
> It does not seem to be listed by "man sa-learn" on my system (Fedora-2).
> Incidentally, timing info seems pretty useless to me
> unless you give some indication of the machine you're using.
> My Sony Picturebook (660MHz, 256MB RAM) would certainly take
> at least 9 minutes with the specified load,
> and would certainly not process 2500 messages in 1 minute, as stated.
> By the way, what is a "false positive"
> and how did the OP collect 2500 of them?
I'm surprised that your man sa-learn doesn't list that option:
--no-rebuild = Skip building databases after scan
When scanning lots of spam/ham mails it speeds up the process, since the
database is only resynced once, when everything has been scanned.
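
As a rough sketch of that pattern, the three steps can be wrapped in a small
shell function. The function name learn_batch and the SA_LEARN variable are
made up for illustration (SA_LEARN only exists so the function can be
dry-run with echo); the sa-learn options are the ones quoted in this thread,
so check "man sa-learn" on your own system before relying on them.

```shell
#!/bin/sh
# Sketch only: learn a spam mbox and a ham mbox, deferring the database
# rebuild until both have been scanned.
# SA_LEARN is a hypothetical override, e.g. SA_LEARN=echo for a dry run.
SA_LEARN=${SA_LEARN:-sa-learn}

learn_batch() {
    # $1 = spam mbox, $2 = ham mbox
    # Each step only runs if the previous one succeeded.
    "$SA_LEARN" --spam --no-rebuild --mbox "$1" &&
    "$SA_LEARN" --ham --no-rebuild --mbox "$2" &&
    "$SA_LEARN" --rebuild
}
```

With the paths from this thread it would be called as
learn_batch ~/mail/spam/spam ~/mail/spam/ham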
Dual PIII 850MHz with 512MB RAM. While SpamAssassin doesn't benefit from the 2
CPUs, it does allow some other work to be done on the other CPU. But then
there are also 20 other users using the machine (right now), so I would imagine
it should balance itself out.
I actually deleted my current database and ran
time sa-learn --spam --mbox ~/mail/spam/spam && time sa-learn --ham --no-rebuild
--mbox ~/mail/spam/ham && time sa-learn --rebuild
to get a better estimate, since scanning each mail from scratch would obviously
take longer than normal.
Learned from 1156 message(s) (1232 message(s) examined).
Learned from 1033 message(s) (1033 message(s) examined).
synced Bayes databases from journal in 75 seconds: 218531 unique entries (218531
even adding up those times, the actual CPU time is 20 seconds. To me it looked as
though it took just over 4m 30s. The 1 minute real time stands up though for
I still think the attachments are what's most likely causing the long
processing time in this case.
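
Since CPU time and wall-clock time can differ a lot on a shared machine, it
may help to measure the wall-clock time of a whole command in one go rather
than adding up separate "time" outputs. The helper below is only a sketch:
the name timed is made up, it has one-second resolution, and it reports
elapsed real time only, not CPU time.

```shell
#!/bin/sh
# Sketch only: run a command, print elapsed wall-clock seconds to stderr,
# and pass the command's exit status back to the caller.
timed() {
    start=$(date +%s)
    "$@" && status=0 || status=$?
    end=$(date +%s)
    echo "elapsed: $((end - start))s" >&2
    return "$status"
}
```

For example: timed sa-learn --rebuild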
"Nothing's foolproof to a sufficiently talented fool"