[ILUG] [Q] `sa-learn --no-sync': what does it (not?) do, and is `--sync' then needed?

Brian Foster blf at blf.utvinternet.ie
Tue Apr 17 06:32:54 IST 2007


 I have just revamped my spam-filtering techniques
 to include the usage of SpamAssassin (v3.1.8).
 the Bayesian filter was trained with c.40,000 of
 the spams I've received, and with c.20,000 hams;
 both the hams and spams cover the last c.5 years.
 the training was done in `--local' mode (i.e., no
 internet access).

 to date, and using essentially the shipped defaults,
 I've had no false-spams, but only c.50% of the spams
 are being caught.  that is notably lower than I was
 hoping for, but we'll see what happens with time
 and tweaks.  (possibly relevant here, spamd(1) is
 currently being run `--local' (this could, perhaps,
 be changed?).)

 one of the key changes I made was, when refiling a
 false-ham as spam, to run `sa-learn --spam' on the
 misclassified-spam.  and this is where my current
 issue is:  it's a bit slow, and on occasion, takes
 a really really long time (multiple minutes whilst
 consuming a great deal of CPU).  (this training is
 also done `--local', so internet access is not the
 problem here.)

 re-reading the sa-learn(1) man page, I note there
 is a `--no-sync' option which _sounds_ like it may
 deal with one or both slowness issues.  however,
 the manual page is mostly opaque about the
 consequences of using this option, and seems to
 suggest that after a series of `sa-learn --no-sync's,
 an `sa-learn --sync' ought to be done.  IF that is
 true, it's an issue:  it doesn't fit into my nominal
 routine; and IF it's required, then I cannot ensure
 it will "always" be done.  (besides, I've no clear
 idea when or why it's required?)
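
 for concreteness, the pattern the man page seems to be
 hinting at could be sketched roughly as below.  this is
 only my reading of it, not a confirmed recipe:  the
 function name `learn_spam_batch' and the `SA_LEARN'
 variable are made up for illustration, and only the
 `--local', `--no-sync', `--spam' and `--sync' options
 come from sa-learn(1) itself.

```shell
#!/bin/sh
# Sketch only: SA_LEARN and learn_spam_batch are hypothetical names,
# not part of SpamAssassin.  SA_LEARN allows substituting another
# command (e.g. for a dry run).
SA_LEARN="${SA_LEARN:-sa-learn}"

learn_spam_batch() {
    # Learn each message as spam, deferring the sync step each time.
    for msg in "$@"; do
        "$SA_LEARN" --local --no-sync --spam "$msg" || return 1
    done
    # One explicit sync once the whole batch has been learned.
    "$SA_LEARN" --sync
}
```

 e.g., `learn_spam_batch msg1 msg2' would run two
 `--no-sync' learns followed by a single `--sync'.
 what happens if that final step is skipped is exactly
 what I'm unsure about.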

  • so just what is `--no-sync' about?
  • is a `--sync' subsequently required?  (if so, why?)
  • what are the consequences of not (always) doing a
     `--sync' afterwards (whether required or not)?

 and b.t.w., how safe is it to interrupt (^C) or
 suspend (^Z) an überlong `sa-learn --spam'?

cheers!
	-blf-
-- 
Experienced (>25 yrs) kernel/software Eng: | Brian Foster   Montpellier,
 • Unix, embedded, &tc;  • Linux;  • doc;  | blf at utvinternet.ie   FRANCE
 • IDL, automated testing, process, &tc.   |  Stop E$$o (ExxonMobile)!
Résumé (CV) http://www.blf.utvinternet.ie  |     http://www.stopesso.com
