From: Karl Vogel
Subject: Re: [Ifile-discuss] Re: Large spam-only .idata file available
Date: 6 May 2003 13:46:37 -0400

>> On 6 May 2003 14:09:45 +0200,
>> "clemens fischer" <address@hidden> said:
C> "Jonadab the Unsightly One" <address@hidden>:
>> Is 45 thousand enough to give solid results, or would it be helpful
>> to have an additional twenty-six-thousand-message spam collection?
C> it might even be too much! note that with ten times as many spams than
C> hams ifile will think many legit messages to be spam, just because some
C> of the words both categories have in common have high counts in `spam'.
It depends on how you categorize the spam. As my collection grew, I
started getting false positives, and the Nigerian scam messages were
getting through as valid mail. I fixed this by using the following
categories in my .idata file:

  spambg      -- Bruce Guenter's collection
  spamcredit  -- credit checks, no one denied, etc.
  spamdiploma -- get your diploma here
  spamfraud   -- Nigerian scams
  spamgt      -- Grant Taylor's collection
  spamlicense -- International driver's license
  spamlocal   -- Junk sent to me
  spamsex     -- Guess
  spamuk      -- UK collection
  good        -- generated from my non-spam mail

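With several spam categories, the final spam/not-spam decision comes down to checking which category wins. Here is a rough sketch of that step in Python. It assumes you already have a score per category, with higher meaning a better match. The is_spam helper and the numbers are made up for illustration; they are not ifile's actual output or interface.

```python
def is_spam(scores):
    """Return (spam_flag, best_category) for a dict of category -> score.

    Assumes a higher score means a better match, and that every spam
    category name starts with "spam", as in the list above.
    """
    best = max(scores, key=scores.get)
    return best.startswith("spam"), best


# Made-up scores for one message (not real classifier output).
scores = {"good": -1520.3, "spamfraud": -1498.7, "spambg": -1510.2}
flag, best = is_spam(scores)
print(flag, best)
```

The message counts as spam if any spam* category beats "good", so "good" competes against each smaller collection separately rather than against one large pooled one.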
I just checked my incoming-spam folder and found one non-spam message
out of 650.
--
Karl Vogel I don't speak for the USAF or my company
address@hidden http://www.pobox.com/~vogelke
If God dropped acid, would he see people? --George Carlin