[ILUG] some fun bayes tokens
Justin Mason
jm at jmason.org
Mon Nov 4 17:59:09 GMT 2002
Padraig Brady said:
> > This is nice -- mail sent from a Red Hat Linux box is only 0.1% likely to
> > be spam, in my corpus ;)
>
> How many messages?
129, as far as I can see... (see below)
> > N:H:X-Mailer:iNNN-redhat-linux 129 0.00173812278080732
>
> What resolution do you require? Since it's multiplicative
> wouldn't 0.01 be enough or at most 0.001?
Yes -- it's just an artifact of the float representation. BTW, that token
actually has 0 spam signs against it, but its probability is capped at 0.01
so that 1 strong non-spam sign can't outweigh many not-quite-as-strong spam
signs. Statistically, it works better that way.
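
To illustrate why the cap matters when the combination is multiplicative, here is a rough sketch in Python. It is not the actual SpamAssassin code: the function names `clamp` and `combine`, the 0.99 upper bound, and the simple product-based combining formula are all assumptions made for the example. Only the 0.01 floor comes from the message above.

```python
from math import prod

def clamp(p, floor=0.01, ceil=0.99):
    """Keep a per-token spam probability inside [floor, ceil].
    The 0.01 floor is from the thread; the 0.99 ceiling is an assumption."""
    return max(floor, min(ceil, p))

def combine(probs, floor=0.01, ceil=0.99):
    """Combine per-token probabilities multiplicatively (hypothetical
    naive-Bayes-style combiner, not necessarily what SpamAssassin uses)."""
    clamped = [clamp(p, floor, ceil) for p in probs]
    spam = prod(clamped)                   # evidence for spam
    ham = prod(1.0 - p for p in clamped)   # evidence for non-spam
    if spam + ham == 0.0:
        return 0.5                         # contradictory certainties: no opinion
    return spam / (spam + ham)

# One token never seen in spam, plus five moderately spammy tokens.
tokens = [0.0] + [0.9] * 5

unclamped = combine(tokens, floor=0.0, ceil=1.0)
capped = combine(tokens)
print(unclamped, capped)
```

Without the cap, the single 0.0 token zeroes the whole product, so the message scores 0.0 no matter how many spam signs it carries. With the 0.01 floor, the five 0.9 tokens win and the score comes out above 0.99.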
John -- dunno about release just yet, there's a good bit of QA we
have to do first. But CVS works quite nicely right now ;)
--j.