[ILUG] Re: cohosh covert

Paul Jakma paul at clubi.ie
Wed Apr 14 19:45:36 IST 2004


On Wed, 14 Apr 2004, Ronan Cunniffe wrote:

> The whole point of their trick is to provide a statistically
> useless message body.  

Doesn't matter really... the point is to detect statistically
_meaningful_ words that indicate spamminess or non-spamminess of a
mail. The fluff doesn't (shouldn't, at least) matter.

If the spammers 'stuff' their spam with random text, then all that
happens is that a Bayesian filter will tend to score random text as
neutral, i.e. 0.5 probability. A decent Bayesian filter will only use
phrases with indicative probabilities (i.e. high or low probabilities)
to construct the Bayesian probability for the mail, and discard the
neutral ones.
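
To make that concrete, here is a rough sketch of the idea, not the code
of any particular filter. The names (score, combine, token_probs) and
the thresholds (min_dev, max_tokens) are made up for illustration, and
the per-token probabilities are assumed to come from earlier training
and to lie strictly between 0 and 1.

```python
from math import prod


def combine(probs):
    """Naive-Bayes style combination of per-token spam probabilities."""
    p_spam = prod(probs)
    p_ham = prod(1.0 - p for p in probs)
    return p_spam / (p_spam + p_ham)


def score(tokens, token_probs, unknown=0.5, min_dev=0.1, max_tokens=15):
    """Score a mail using only its most indicative tokens.

    token_probs maps token -> P(spam | token), assumed strictly
    between 0 and 1. All parameter defaults are illustrative guesses.
    """
    # Tokens never seen in training (e.g. random stuffing) are neutral.
    probs = [token_probs.get(t, unknown) for t in set(tokens)]
    # Discard tokens whose probability is close to neutral.
    indicative = [p for p in probs if abs(p - 0.5) >= min_dev]
    # Keep the tokens farthest from 0.5, up to max_tokens of them.
    indicative.sort(key=lambda p: abs(p - 0.5), reverse=True)
    indicative = indicative[:max_tokens]
    if not indicative:
        return 0.5  # nothing to go on: the mail scores as neutral
    return combine(indicative)


if __name__ == "__main__":
    trained = {"viagra": 0.99, "click": 0.9, "meeting": 0.05}
    plain = ["viagra", "click"]
    stuffed = plain + ["cohosh", "covert", "zebra"]
    # The stuffed words are unknown, hence neutral, hence ignored.
    print(score(plain, trained), score(stuffed, trained))
```

With this toy data the stuffed mail scores the same as the unstuffed
one, because the random words never make it into the combination.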

So text-stuffing won't really affect things much, well, not when every
spammer does it. What _will_ hurt Bayesian filtering is if the
spammers include the most minimal of spam payloads, e.g. just one URL,
especially if they do not reuse URLs (and spammers register lots of
throwaway domains).

regards,
-- 
Paul Jakma	paul at clubi.ie	paul at jakma.org	Key ID: 64A2FF6A
	warning: do not ever send email to spam at dishone.st
Fortune:
No wonder Clairol makes so much money selling shampoo.
Lather, Rinse, Repeat is an infinite loop!


