[ale] article feedback...

David S. Jackson dsj at sylvester.dsj.net
Tue May 27 14:36:53 EDT 2003


Hi,

If you have time, would you take a look at this article for a
sanity check please?  I'd appreciate it.  In particular, I've
transformed it from latex and dvi, and some of the tools munge
metacharacters pretty badly.  I've tried to catch everything, but
in case I haven't, please beware.  :-)

TIA!

PS.  I was just taking a last quick look and found that ~ and \
characters got munged.  Pipes also got munged, so there might be
a few of those left that I missed...

-- 
David S. Jackson                        dsj at dsj.net
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Life is divided into the horrible and the miserable.
		-- Woody Allen, "Annie Hall"






                              Mail Help



   Sometimes reading and managing mail can take a lot of time. If you're
using a good mail user agent that can take advantage of external programs
on a UNIX/Linux host, there's quite a bit of power you can employ to help
you manage your mail.

   You probably already use some sort of procmail-based spam filter, such
as SpamAssassin (www.spamassassin.org) or junkfilter (junkfilter.zer0.org).
If so, you probably collect your spam into a spam folder, which you then
have to sift through for any friends who accidentally got filtered into
your spam bucket. I recommend adding a "whitelist" and a "blacklist" to
your procmail filters, in addition to your other spam filters. You can
incorporate this idea into your .procmailrc like so:


### snip of sample .procmailrc ###

## Other important variables can be set above here...
## (See man procmailex for examples...)
PMDIR=$HOME/procmail
LOGFILE=$PMDIR/log
## Check senders against your "whitelist"
:0
* ? formail -x"From" -x"From:" | egrep -is -f $PMDIR/friends.txt
$HOME/inbox/personal
## Check senders against your "blacklist" of known spammers
:0
* ? formail -x"From" -x"From:" | egrep -is -f $PMDIR/spammers.txt
/dev/null



   Usually you'll have to experiment with where in your .procmailrc to
put your recipes. I normally put all the important recipes up front and
the spam rules at the end of the file. The recipes above should probably
go near the beginning of your .procmailrc file, because they can dispatch
quite a lot of your mail early on, especially the important mail from
your friends.

   Your "friends" and "spammers" files can simply list addresses in
"user@domain.com" format, or you can use regular expressions, such as
".*@.*\.kr", which will dump anything from Korea. (Has anyone ever got
legitimate mail from a .kr domain?) There are lots of domains you can put
in this list, such as ".*@bonusoffers.com", ".*valuerewards.com" and so
on.

Handy Utilities. Now that you're deleting a fair amount of mail before
you ever see it, you need some way of seeing what mail has been deleted,
in case you delete legitimate mail by accident. Assuming you have a
$HOME/bin directory in your $PATH, you can put little helper scripts
there and call them from your mail client. To see what mail has been
deleted, you can make a macro that calls this little one-liner:

   tac ~/procmail/log | grep -A1 dev/null | less

   I simply put this in my .muttrc, since I use Mutt (a terrific MUA, by
the way; see www.mutt.org). If you're using Mutt, you can bind macros to
certain keystrokes. For example, all my macros start with the plus sign,
"+", followed by two letters that have some significance to what the
macro does. "+vm" tells Mutt to run Vi (my favorite editor) on my .muttrc
file; "+so" tells Mutt to "source" my .muttrc file so that any new
changes become active. The Mutt syntax for the one-liner would be:

   macro generic +lp "!tac ~/procmail/log | grep -A1 dev/null | less\n"

You could instead bind a macro to a more elaborate script by putting this
in your .muttrc:

   macro generic +ld "!~/bin/showdeleted.sh\n"   # Look at deleted mail

The script "showdeleted.sh" could contain additional lines to further
inform you of what procmail has deleted for you.
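
The article does not show showdeleted.sh itself, so what follows is only a
rough sketch of what such a script might contain. It assumes procmail's
usual log abstract (a "From " line, a " Subject:" line and a "  Folder:"
line per message) and a log at ~/procmail/log unless a path is given. The
function name show_deleted and the summary line are made up for
illustration.

```shell
#!/bin/sh
# showdeleted.sh -- hypothetical sketch; not the author's actual script.
# Assumes procmail's standard log abstract: a "From " line, a " Subject:"
# line, and a "  Folder:" line for every delivered message.

show_deleted() {
    log=${1:-$HOME/procmail/log}

    if [ ! -r "$log" ]; then
        echo "Cannot read log file: $log" >&2
        return 1
    fi

    # Count how many messages were delivered to /dev/null.
    count=$(grep -c 'Folder: /dev/null' "$log")
    echo "Messages sent to /dev/null: $count"
    echo

    # Newest first: after tac, each Folder line is followed by the Subject
    # and From lines of the same message, so -A2 picks up both.
    tac "$log" | grep -A2 'Folder: /dev/null'
}

# Page the output only when invoked under the name showdeleted.sh, so the
# function can also be sourced into another script without side effects.
case ${0##*/} in
    showdeleted.sh) show_deleted "$@" | ${PAGER:-less} ;;
esac
```

Compared with the one-liner, this also shows the sender of each deleted
message, which is usually what you need in order to spot a friend who was
blacklisted by mistake.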

   Sometimes you will find that a bunch of new spam has been added to
your spam folder, and you will want to add those addresses to your
blacklist. I wrote a small script called "getspamaddr.sh" which collects
the addresses of all mail in $HOME/inbox/spam and writes them to a
temporary file, $HOME/tmp/spammers.txt. The script compares each address
in the "From" field against the addresses already in
$HOME/procmail/spammers.txt and $HOME/procmail/friends.txt to ensure
that I don't add duplicates.


### snip of getspamaddr.sh ###
#!/bin/sh

#   Static values: adjust as needed
newcount=0
regex="^From:"
spamfile=$HOME/inbox/spam
# $TMP falls back to $HOME/tmp if it is not set in the environment
tmp_file=${TMP:-$HOME/tmp}/tmp_addresses.txt
friends=$HOME/procmail/friends.txt
spammers=$HOME/procmail/spammers.txt
testfile=${TMP:-$HOME/tmp}/spammers.txt           # for testing only...

# Find new addresses in spam folder...
tail -n 50000 "${spamfile}" | grep "${regex}" | \
sed 's/\(From: \).* [<]*\([^ ].*\@.*\.*\)[>]*$/\2/g' | \
sed 's/[<>]//g' | \
sed 's/\[mailto://g' | \
sed 's/\]//g' | \
sed 's/From: //g' | \
sed 's/^root root$//g' | \
sed 's/^.*=\([a-zA-Z0-9]*\@.*\..*\).*$/\1/g' | \
sort | uniq > "$tmp_file"

# See if address already exists in my database...
# (Read the file by redirection rather than through a pipe, so the loop
# runs in this shell and $newcount survives for the summary below.)
while read -r address; do
if grep -qi "$address" "$friends" || grep -qi "$address" "$spammers" ; then
   echo "$address already exists in database..."
else
   echo "$address" >> "$testfile"     # testing only
   newcount=$((newcount+1))
   echo "$newcount"
fi
done < "$tmp_file"

# Output a summary of results...

echo "====================================================="
echo "                      Summary"
echo "====================================================="
echo

# Reading from stdin keeps the file name out of wc's output
total=$(wc -l < "$spammers" | tr -d ' ')
echo "$newcount entries out of $total total entries"
echo

echo = = = = = = = = = = = = = = = = = = = = = = = = =
echo Any items appearing below represent duplicates
echo in your database:
echo = = = = = = = = = = = = = = = = = = = = = = = = =
sort < "$testfile" | uniq -c | grep -v "^ *1 " | more
echo
echo End of duplicate listing...
echo
echo


   Note that the check for an address already existing in your spammers
or friends file is probably too simplistic. For example, the check
doesn't take any regular expression symbols into account. So you could
have a ".*@emailfraud.com" entry, but you'll still get the address
"abuser@emailfraud.com" listed in your summary. Output like this can
still be useful, though, because it tells you that something was wrong
with your procmail recipe in the first place: abuser@emailfraud.com
should have been deleted and should never have made it to your spam
folder. So duplicates in your address list can also point to a problem
in your procmail recipes, which is helpful to know.
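
   One way to make the check aware of regular expressions, sketched here
under some assumptions, is to feed the address to grep on standard input
and let grep read the list file as a set of patterns with -f, much as the
procmail recipes do with egrep -f. The function name is_listed is
hypothetical, and the sketch assumes the list files contain no blank lines
(a blank pattern would match every address).

```shell
#!/bin/sh
# is_listed ADDRESS FILE...
# Hypothetical helper: succeeds if ADDRESS matches any pattern in any FILE.
# Each line of FILE is treated as an extended regular expression, so an
# entry such as ".*@emailfraud\.com" covers every address in that domain.
is_listed() {
    address=$1
    shift
    for list in "$@"; do
        [ -s "$list" ] || continue     # skip missing or empty lists
        if printf '%s\n' "$address" | grep -Eqi -f "$list"; then
            return 0
        fi
    done
    return 1
}
```

In getspamaddr.sh the test inside the loop could then read
if is_listed "$address" "$friends" "$spammers"; then ... so that an address
already covered by a domain-wide entry is no longer reported as new.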

   Still, you can try sorting your spammer addresses by the part to the
right of the '@' symbol, using this command to see which domains are
sending you the most spam:

   sort -t @ -k 2.1 ~/tmp/spammers.txt | less

   Sorting by domain will also tell you whether you have several
blacklist entries from the same domain, which might better be replaced
by a single blacklist entry for the entire domain. In other words, if
you have both bob@abuser.com and jill@abuser.com in your temporary
spammer file, it may be worthwhile to simply enter .*@abuser.com in your
blacklist instead of the individual entries. Your shell script will then
have a little less work to do.
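
   To put a number on each domain rather than eyeballing the sorted
list, a short pipeline can count the addresses per domain. This is only a
sketch, assuming one address per line in the file; the function name
count_domains is made up for illustration.

```shell
#!/bin/sh
# count_domains FILE
# Hypothetical helper: prints "count domain" pairs, busiest domain first.
# Assumes FILE holds one address per line.
count_domains() {
    grep '@' "$1" |                     # ignore lines without an address
        sed 's/^.*@//' |                # keep only what follows the last '@'
        tr '[:upper:]' '[:lower:]' |    # treat Abuser.com and abuser.com alike
        sort | uniq -c | sort -rn
}
```

Running count_domains ~/tmp/spammers.txt | less puts the busiest domains
at the top; any domain with a count above 1 is a candidate for a single
domain-wide blacklist entry.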

Closing notes. These tips are just the beginning. Once you get in the
habit of calling external scripts or macros, you'll think of lots of
other ways they can help streamline your email reading. For example, I
set up a dedicated mail server in my office that does almost nothing
except fetch mail from my various accounts and sift through it for junk
mail. I get a lot of mail, so its little CPU is normally quite pegged.


