Re: [Ifile-discuss] Large spam-only .idata file available
From: Jonadab the Unsightly One
Subject: Re: [Ifile-discuss] Large spam-only .idata file available
Date: 04 May 2003 23:03:48 -0400
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2.93
"Karl Vogel" <address@hidden> writes:
> The only real change is the number of spam messages used to generate
> the spam-only .idata file mentioned in the page. I included Bruce
> Guenter's collection (45,000 messages) and saw an immediate improvement
> in my mailbox.
Is 45 thousand enough to give solid results, or would it be helpful to
have an additional twenty-six-thousand-message spam collection?
I currently have 15302 messages in spam.general, 4298 in
spam.filtered.charset.chinese.gb2312, 1793 in
spam.filtered.charset.euc_kr, and 5084 in spam.filtered.charset.ks_c_
This is pure spam, about a year's worth (since I switched mail
clients), all sent to one address. Bulk mail from places I've
actually done business with or personally given my address to is not
included. (Nigerian money scams _are_ included.) I have no way to
determine the content of the stuff in non-Latin character sets, but I
sure as shootin' didn't solicit it.
I'm using the nnml storage backend, which means each message is a file
and each folder/group is a directory, so I could tar or zip the whole
spam hierarchy up pretty easily.
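
As a rough illustration of the file-per-message layout, here is a minimal Python sketch that counts messages per group directory and then packs the tree into a gzipped tarball. It rests on some assumptions: that each message is stored as a file with a purely numeric name (so index files such as .overview are skipped), and that the groups live under a single root directory. The function names count_messages and archive_tree are hypothetical, chosen only for this example.

```python
import os
import tarfile


def count_messages(root):
    """Return {relative_dir: message_count} for every directory under root.

    Assumption: a message is any file whose name consists only of digits;
    other files (for example .overview) are ignored.
    """
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        n = sum(1 for name in filenames if name.isdigit())
        if n:
            counts[os.path.relpath(dirpath, root)] = n
    return counts


def archive_tree(root, out_path):
    """Write a gzipped tar of root to out_path and return out_path.

    out_path should live outside root, otherwise the archive would
    try to include itself.
    """
    top = os.path.basename(os.path.normpath(root))
    with tarfile.open(out_path, "w:gz") as tar:
        # Recursively adds every directory and file below root.
        tar.add(root, arcname=top)
    return out_path
```

Calling count_messages on the top-level spam directory would give a per-group tally to check against the figures quoted above, and archive_tree would produce a single file suitable for uploading. The exact on-disk directory names depend on how Gnus is configured, so the root path has to be supplied by hand.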