ifile-discuss

Re: [Ifile-discuss] Re: Large spam-only .idata file available


From: Karl Vogel
Subject: Re: [Ifile-discuss] Re: Large spam-only .idata file available
Date: 6 May 2003 13:46:37 -0400

>> On 6 May 2003 14:09:45 +0200, 
>> "clemens fischer" <address@hidden> said:

C> "Jonadab the Unsightly One" <address@hidden>:
   >> Is 45 thousand enough to give solid results, or would it be helpful
   >> to have an additional twenty-six-thousand-message spam collection?

C> it might even be too much!  note that with ten times as many spams as
C> hams, ifile will take many legit messages for spam, just because some
C> of the words both categories have in common have high counts in `spam'.

   It depends on how you categorize the spam.  When my collection got
   big, I started getting false positives, and the Nigerian scam
   messages were getting through as valid mail.  I fixed this by
   splitting the spam into several categories in my .idata file:

     spambg      -- Bruce Guenter's collection
     spamcredit  -- credit checks, no one denied, etc.
     spamdiploma -- get your diploma here
     spamfraud   -- Nigerian scams
     spamgt      -- Grant Taylor's collection
     spamlicense -- International driver's license
     spamlocal   -- Junk sent to me
     spamsex     -- Guess
     spamuk      -- UK collection
     good        -- generated from my non-spam mail
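
   A rough sketch of the idea, not ifile's actual code: a toy naive
   Bayes classifier in Python with one word-count table per category,
   where anything filed under a name starting with "spam" is treated
   as junk.  The TinyBayes class, the is_spam helper, the add-one
   smoothing, the equal category priors, and the training sentences
   are all assumptions made up for illustration; only the category
   names come from the list above.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class TinyBayes:
    """Toy multinomial naive Bayes (hypothetical; not ifile's implementation)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.vocab = set()

    def train(self, category, text):
        words = tokenize(text)
        self.word_counts[category].update(words)
        self.vocab.update(words)

    def score(self, category, words):
        # Log-likelihood with add-one smoothing; priors are assumed equal,
        # so a large spam corpus does not outweigh "good" by size alone.
        counts = self.word_counts[category]
        total = sum(counts.values())
        v = len(self.vocab)
        return sum(math.log((counts[w] + 1) / (total + v)) for w in words)

    def classify(self, text):
        words = tokenize(text)
        return max(self.word_counts, key=lambda c: self.score(c, words))


def is_spam(category):
    """Every category except 'good' is named spam*, so a prefix test suffices."""
    return category.startswith("spam")


clf = TinyBayes()
# Made-up training text, one line per category, standing in for real folders.
clf.train("spamfraud", "urgent business proposal transfer funds from bank in nigeria")
clf.train("spamdiploma", "get your university diploma here no classes required")
clf.train("spamcredit", "credit approved no one denied apply today")
clf.train("good", "meeting notes attached please review before the staff meeting")

label = clf.classify("confidential proposal to transfer funds from a nigeria bank")
print(label, is_spam(label))
```

   The point of the split is that each spam category keeps its own
   vocabulary, so the scam wording is not diluted by thousands of
   unrelated ads, and "good" competes against several narrow
   categories instead of one huge one.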

   I just checked my incoming-spam folder and found one non-spam message
   out of 650.

-- 
Karl Vogel                      I don't speak for the USAF or my company
address@hidden                          http://www.pobox.com/~vogelke

If God dropped acid, would he see people?  --George Carlin




