Re: [Savannah-hackers] How to fight spam on GNU mailing lists


From: Martin Hamilton
Subject: Re: [Savannah-hackers] How to fight spam on GNU mailing lists
Date: Sun, 28 Apr 2002 16:42:16 +0100

"Mathieu Roy" <address@hidden> writes:

| On Fri 26 Apr 2002 at 11:55, Georg C. F. Greve wrote:
| > I'm not entirely sure why this mail reached me, but it seems like
| > you've been discussing how to free the GNU lists of Spam. In case you
| > haven't seen it yet, I recommend taking a look at SpamAssassin
| > (also featured in the Brave GNU World).
| 
| http://spamassassin.org/
| This seems interesting.

I've been using SpamAssassin on my personal mail for a few months now, with an 
eye to how it might be deployed on the GNU mailhub.  There are some notes, 
FWIW, on SpamAssassin/Exim integration at:

  http://bogmog.sourceforge.net/document_show.php3?doc_id=28

In addition to the previous comments, one thing you need to be aware of is that 
this is not a complete solution in itself - whilst the vast majority of spam 
destined for my account is now successfully detected, a small proportion of 
genuine messages to me managed to push enough of the buttons that they were 
misidentified as spam.

This can be corrected (to cater for subsequent instances), e.g. by hacking the 
SpamAssassin config - "whitelist_from ..." in the per user 
$HOME/.spamassassin/user_prefs, for instance.

My $0.02 :-)

Here's a possible solution I've been thinking about.  Note that it does not 
cater for the case that the spammer is able to read replies sent to them 
directly.  In my experience most spam these days seems to originate from 
made-up or stolen email addresses, but caveat emptor!

If mail from a particular person fails the "spam test", whatever that ends up 
being, I think we should send them a bounce (e.g. from <>) message containing a 
URL which they can visit to register to bypass the spam checking.

The bounce message is extra traffic, unfortunately, but unavoidable if we're to 
contact the original poster.  Sending it from <> should mean that our reply 
generates no further replies, which is important due to the widespread use of 
non-existent email addresses by spammers.

[Obviously we wouldn't want to reply to a newsgroup (header "Newsgroups:" 
exists?), to a mailing list ("Precedence: bulk"?), or a bounce message.  
Probably a few more special cases here.  Potential problem here when people 
read mail gatewayed into local News servers.]
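As a rough illustration of the special cases above, here is a minimal Python sketch of the "should we reply at all?" decision. The function name should_send_bounce and the exact set of Precedence values are assumptions made for the example, not part of any existing spam blocker.

```python
from email import message_from_string
from email.message import Message


def should_send_bounce(msg: Message, envelope_sender: str) -> bool:
    """Return True if it looks safe to send a rejection notice.

    Hypothetical helper: the checks mirror the special cases listed
    in the message, and more would likely be needed in practice.
    """
    # An empty envelope sender (<>) marks a bounce; never reply to those.
    if envelope_sender.strip() in ("", "<>"):
        return False
    # Articles gatewayed from Usenet carry a Newsgroups: header.
    if msg.get("Newsgroups") is not None:
        return False
    # Mailing list and other automated traffic is commonly marked this way
    # (the set of values checked here is an assumption).
    precedence = (msg.get("Precedence") or "").strip().lower()
    if precedence in ("bulk", "list", "junk"):
        return False
    return True


if __name__ == "__main__":
    sample = message_from_string("From: someone@example.org\n\nhello\n")
    print(should_send_bounce(sample, "someone@example.org"))
```
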

I'm thinking that the registration would be keyed on the sender's email address 
(envelope and/or From:) rather than their IP address, to cater for folk who 
move around a lot or get a different address every session from their ISP.  IP 
address is not useful in the general case.

Obviously this leaves open the possibility of someone devious registering the 
email address(es) which they will use later to spam.  It also doesn't cater for 
the now common case of spammers originating messages from harvested email 
addresses, but see below.

To guard against people bulk registering addresses which will later be used to 
spam, I suggest using a time-limited magic cookie, e.g. in the registration URL 
itself...

  Your mail to gnu.org has been rejected because it resembled
  unsolicited bulk email (spam).

  If you would like to be able to send mail to addresses at
  gnu.org and associated domains, fire up your Web browser and
  visit this URL:

    http://mail.gnu.org/letmein/85ab4b962a20dfea5efa6f7f2fa47c6d

  This URL will self-destruct in 5 seconds, but (if you choose not to
  register now) you will be prompted to register again if/when you next
  attempt to send mail to us.

  Your message has not been delivered, and you will need to resend
  it after registering.  For your convenience it has been appended to
  this rejection message...

  (etc)

At the GNU end, verifying the cookie could be as simple as comparing with a 
filename in a scratch directory created at rejection time by the spam blocker, 
and adding the email address (stored in the file) to a DB database.
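The scratch-directory check described above might look roughly like this in Python. The function names, the directory layout, the use of Python's dbm module for the "DB database", and deleting the cookie file after one use are all assumptions made for the sake of the sketch.

```python
import dbm
import os
import re

# Accept only hex strings, so a crafted URL cannot name a file
# outside the scratch directory.
COOKIE_RE = re.compile(r"[0-9a-f]{32,64}")


def record_rejection(scratch_dir: str, cookie: str, address: str) -> None:
    """Called at rejection time: store the blocked address under the cookie."""
    os.makedirs(scratch_dir, exist_ok=True)
    with open(os.path.join(scratch_dir, cookie), "w", encoding="utf-8") as fh:
        fh.write(address)


def redeem_cookie(scratch_dir: str, db_path: str, cookie: str):
    """Register the address stored under `cookie`; return it, or None."""
    if not COOKIE_RE.fullmatch(cookie):
        return None
    path = os.path.join(scratch_dir, cookie)
    try:
        with open(path, encoding="utf-8") as fh:
            address = fh.read().strip()
    except FileNotFoundError:
        return None  # unknown cookie, or one that has already expired
    with dbm.open(db_path, "c") as db:
        db[address.lower().encode("utf-8")] = b"1"
    os.remove(path)  # assumption: each cookie is good for one registration
    return address
```

The web handler behind the registration URL would only need to pass the last path component to redeem_cookie and report success or failure.
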

A trivial Perl script (say) run out of cron every 5 seconds would take care of 
the cookie expiry.  I'm joking about the 5 seconds part, of course :-)  An hour 
might be more reasonable?
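The message suggests Perl for the expiry job; purely as an illustration, the same idea in Python could be a few lines run from cron. The function name expire_cookies and the use of file modification time as the cookie's age are assumptions.

```python
import os
import time


def expire_cookies(scratch_dir: str, max_age_seconds: float = 3600, now=None) -> list:
    """Delete cookie files older than max_age_seconds; return their names."""
    now = time.time() if now is None else now
    # Take a snapshot of the directory first, then delete from the snapshot.
    with os.scandir(scratch_dir) as it:
        entries = list(it)
    removed = []
    for entry in entries:
        if not entry.is_file():
            continue
        # The file's mtime stands in for the moment of rejection.
        if now - entry.stat().st_mtime > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

With the default of 3600 seconds this matches the "an hour" suggestion; how often cron runs it only affects how promptly stale cookies disappear.
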

I'm picturing that the cookie/filename would be based on a hash (MD5 or SHA1) 
of the blocked email address and a private "key" which a Bad Person would have 
to learn or figure out in order to guess a cookie.
  
Using a one-way hash like MD5 should make this process non-trivial for the Bad 
Person to subvert *unless they are able to read replies sent to their spam*.
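A minimal sketch of deriving such a cookie follows. Note that it swaps the plain MD5/SHA1 hash of address plus key for HMAC-SHA256 from Python's standard library, which is the usual way to combine a secret with a message; the function names and the lowercasing of the address are assumptions.

```python
import hashlib
import hmac


def make_cookie(address: str, secret: bytes) -> str:
    """Derive a hard-to-guess cookie from an email address and a private key."""
    # Normalise the address so trivial variations map to the same cookie
    # (an assumption; the message does not say how addresses are compared).
    normalized = address.strip().lower().encode("utf-8")
    return hmac.new(secret, normalized, hashlib.sha256).hexdigest()


def cookie_matches(cookie: str, address: str, secret: bytes) -> bool:
    """Constant-time check that `cookie` belongs to `address`."""
    return hmac.compare_digest(cookie, make_cookie(address, secret))
```

Without the secret, someone who knows only the address cannot compute the cookie; as noted above, this offers no protection against a sender who can simply read the rejection message.
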

The private key doesn't have to be a single dictionary word, either :-)
We could use the Bible, the Koran, a dictionary...  Brings a whole new meaning 
to "dictionary attack" ;-)

Note that registration does not require any manual intervention by gnu.org or 
list admin type folk!

Now, this is all very well for registering a person's email address as "OK to 
send", but as we know it's now commonplace for spammers to send their junk mail 
using a randomly chosen real person's email address out of their list of people 
to spam.

To protect against this there needs to be some additional information.  It 
could be mailhub/domain-wide, sender-specific or recipient-specific.  It could 
be cryptographic, e.g. "your messages must be GPG signed", or as simple as an 
extra email header similar to "Approved:" on Usenet.

I quite like the idea of asking each person registering to specify something 
which will appear in the email messages they send - perhaps in either header or 
body, though limiting this to headers would be better for performance purposes.
For instance, I might choose to send this header with all my messages...

  X-This-Is-Not-Spam: Honest ;-)

It's not necessary to provide people with a way of editing this info, since if 
they stop sending the (say) header and a message they send is flagged as spam, 
they will be prompted to reregister.
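Checking for the registered header could be sketched as below. The function name has_registered_marker and the shape of the registration table (address mapped to a header name and value) are assumptions for the example; a real deployment would read them from wherever registrations are stored.

```python
from email import message_from_string
from email.message import Message
from email.utils import parseaddr


def has_registered_marker(msg: Message, registered: dict) -> bool:
    """True if the message carries the header its sender registered.

    `registered` maps a lowercase address to a (header name, value) pair.
    """
    _, address = parseaddr(msg.get("From", ""))
    expected = registered.get(address.lower())
    if expected is None:
        return False  # sender has never registered
    header_name, header_value = expected
    # A header may appear more than once; any exact match is enough.
    values = msg.get_all(header_name) or []
    return any(value.strip() == header_value for value in values)


if __name__ == "__main__":
    table = {"martin@example.org": ("X-This-Is-Not-Spam", "Honest ;-)")}
    mail = message_from_string(
        "From: Martin <martin@example.org>\n"
        "X-This-Is-Not-Spam: Honest ;-)\n\nhello\n"
    )
    print(has_registered_marker(mail, table))
```
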

I'm picturing that the process outlined above would actually be hacked into the 
spam blocker - SpamAssassin is written in Perl, so should be particularly easy 
to do.  This avoids any additional forking beyond that required for the spam 
blocker.

Any thoughts?

Cheers,

Martin

PS I ran it by RMS, who thinks that there should be a way of registering which 
doesn't require WWW access.



