Spam Filtering

Spam filtering matters: according to our mail statistics, most of the incoming mail we get is spam. Here is a brief discussion of the methods we use to stop it.

Before it hits the mail queue
There are a couple of things we can do to stop spam before it gets to the mail server. The only one we have enabled is reverse_dns checks. If the IP address of the client doesn't have a reverse DNS (PTR) record, the connection is refused. This stops a surprising amount of spam.
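As a rough illustration of what the reverse DNS check decides, here is a minimal Python sketch. It is not how the mail server implements the check. The function names, the sample addresses, and the fake resolver are all made up for this example; a real lookup would go through socket.gethostbyaddr.

```python
import socket


def has_reverse_dns(ip, lookup=socket.gethostbyaddr):
    """Return True if a reverse (PTR) lookup for `ip` yields a hostname."""
    try:
        hostname, _aliases, _addresses = lookup(ip)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return False
    return bool(hostname)


def connection_verdict(ip, lookup=socket.gethostbyaddr):
    """Mimic the policy described above: refuse clients with no reverse DNS."""
    return "accept" if has_reverse_dns(ip, lookup) else "refuse"


# Hypothetical resolver so the example runs without touching real DNS.
FAKE_PTR_RECORDS = {"192.0.2.10": "mail.example.org"}


def fake_lookup(ip):
    if ip not in FAKE_PTR_RECORDS:
        raise socket.herror(1, "Unknown host")
    return FAKE_PTR_RECORDS[ip], [], [ip]


print(connection_verdict("192.0.2.10", fake_lookup))  # client with a PTR record
print(connection_verdict("192.0.2.99", fake_lookup))  # client without one
```

Passing the resolver in as a parameter keeps the policy testable offline. Dropping the second argument makes the same functions query real DNS.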

We do not, and in my opinion should not, use greylisting. It causes large mail delays, which no one likes. The only reason old ugcs sometimes used it was that its mail server was too weak.

After it gets to the server
We use amavis with spamassassin to filter spam. Amavis is a meta-filter: it runs messages through other filters. Each message first goes through clam-av to filter out viruses. Messages containing viruses are dropped without notice and are not saved.

The message then goes to spamassassin. Spamassassin's configuration is mostly in amavis, so site-wide settings in /etc/spamassassin won't do much good. Its "local preferences" directory is hermes:/var/lib/amavis/.spamassassin.

There is also a cron script that automatically fetches new spamassassin rules. See sa-update and /etc/cron.daily/spamassassin. We get the default rules as well as a bunch from SARE.

Per-user configuration
Users can set custom spam thresholds in ldap with the amavisSpamTag2Level and amavisSpamKillLevel attributes. These can also be set for mailing lists; see the mailman command. The mailing list command runs a remctl on hermes, which has permission to update the mailing list's ldap entries. See hermes:/usr/local/lib/mailman/ugcs_mailman.py, particularly ensure_ldap_for_list and set_spam_kill.

See the ldap schema hera:/etc/ldap/schema/amavis.schema for all of the options.
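To make the two thresholds concrete, here is a small Python sketch of how a spam score is commonly compared against them: at or above the tag2 level a message is marked as spam but still delivered, and at or above the kill level it is blocked. This is a simplified model and not amavis's own code. The default values, the function names, and the fallback behaviour for missing attributes are assumptions made for the example.

```python
# Hypothetical site-wide defaults; the real values live in the amavis config.
SITE_DEFAULTS = {"amavisSpamTag2Level": 5.0, "amavisSpamKillLevel": 10.0}


def effective_levels(ldap_attrs, defaults=SITE_DEFAULTS):
    """Per-user ldap values win; missing attributes fall back to defaults."""
    tag2 = float(ldap_attrs.get("amavisSpamTag2Level",
                                defaults["amavisSpamTag2Level"]))
    kill = float(ldap_attrs.get("amavisSpamKillLevel",
                                defaults["amavisSpamKillLevel"]))
    return tag2, kill


def classify(score, tag2_level, kill_level):
    """Map a spamassassin score onto the three outcomes in this model."""
    if score >= kill_level:
        return "blocked"   # not delivered to the user
    if score >= tag2_level:
        return "tagged"    # delivered, but marked as spam
    return "clean"


# A user who only overrides the kill level, lowering it to 8.0.
tag2, kill = effective_levels({"amavisSpamKillLevel": "8.0"})
for score in (2.0, 6.5, 8.0):
    print(score, classify(score, tag2, kill))
```

Lowering the kill level makes filtering stricter for that user or list, while raising it lets more borderline mail through as tagged spam.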

Manually training the filter
Run on hermes: sudo sa-learn --dbpath /var/lib/amavis/.spamassassin --prefs-file=/var/lib/amavis/.spamassassin/user_prefs --showdots --(spam|ham)
Use --spam to train on spam messages and --ham to train on legitimate mail.

You may have to wait a while for the program to acquire a lock, since amavis uses this database to process mail. Just be patient; it will start eventually.
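If you train the filter often, a small wrapper can save retyping the command above. The Python sketch below only builds and prints the argument list and does not run anything. The function name and the sample Maildir path are hypothetical, and it assumes you pass sa-learn the files or directories to learn from as trailing arguments.

```python
import shlex

# Path taken from the command above.
DB_PATH = "/var/lib/amavis/.spamassassin"


def sa_learn_command(kind, messages):
    """Build the sa-learn argument list for training on spam or ham."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return [
        "sudo", "sa-learn",
        "--dbpath", DB_PATH,
        f"--prefs-file={DB_PATH}/user_prefs",
        "--showdots",
        f"--{kind}",
        *messages,
    ]


# Hypothetical folder of misclassified spam.
cmd = sa_learn_command("spam", ["/home/someuser/Maildir/.Junk/cur"])
print(shlex.join(cmd))
```

To actually run it on hermes, the list could be handed to subprocess.run(cmd, check=True). Expect the same wait for the database lock as when running the command by hand.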