    Resonate Brendel Racing Team: karting, racing software and books, online store, software, links, photos, technology, glossary, team results

    How to get rid of Spam


    A word about Spam

    1. Introduction
    2. The difficulty of filtering Spam
    3. Some first and very logical steps to take...
    4. Anti-Spam technology that works
    5. In closing



    Introduction

      OK, so this has nothing to do with karting at all, I admit that. Nevertheless, there is probably not a single Internet user left who has not been bothered by Spam: unsolicited commercial e-mail, often with dubious content. From get-rich-quick schemes to ads for sexually explicit activities that defy description, Spam seems to be everywhere. About half of all the e-mail messages on the Internet today are Spam, threatening to render e-mail, this wonderful communication medium, almost useless. The problem is so big that it has even prompted special legislation from lawmakers around the world. Yet there are technical solutions which can almost entirely solve the Spam problem for you. By the end of this article, you should know enough to start ridding yourself of Spam today! This is not a commercial solicitation, since most of what I will be suggesting is available for free on the Internet. If everyone implemented these measures, Spam would simply not be a problem anymore.

      Many karters participate in online discussion forums, some of which may expose their e-mail addresses to the 'research' activities of the spammers, who are constantly on the prowl for new addresses to which to send their ads. Many of us also maintain personal web-sites, where we list our e-mail address for contact. The spammers routinely scan the World Wide Web for web-pages containing e-mail addresses to add to their databases. Case in point: I have my e-mail address available on this site. Ever since the Brendel Racing site came online, my address has been in the spammers' databases. Today, I am receiving more than 100 Spam messages per day, which has made it increasingly difficult to even see the 'good' messages among this flood of garbage. In effect, my e-mail became useless. I decided that something needed to be done about it, which led me to research ways to prevent Spam from filling up my mailbox. This article presents some of the results, which are probably applicable to the vast majority of you out there.

      The first part of the article describes some of the approaches which may come to mind when fighting Spam, but which do NOT work. This will also explain a bit about how Spam is constructed and what the spammers do to make sure they get into your mailbox. The second part describes some very basic, common-sense practices when dealing with Spam. The third part then describes some methods which actually DO work, and which can almost entirely eliminate Spam from your mailbox. I would again like to point out that I am not endorsing any products. This is not an advertisement, and in fact, most of the ways to fight Spam are entirely free. How is that for cost-effective? :-)

    The difficulty of filtering Spam

      False sender address

        Every Spam message contains a 'from' address, which supposedly identifies the mail account from which the message was sent. Many e-mail clients, such as Netscape/Mozilla, Eudora and Outlook, and even the popular web-based e-mail systems such as Hotmail and Yahoo, let you set filters on received messages, so that you can specify certain addresses from which you do not wish to receive mail. Well, in case you did not know yet: the 'from' addresses are easily faked. There is no check in place at all on the Internet which verifies the validity of those addresses. Therefore, any manual filters you may add to your e-mail program to bounce or drop messages from certain 'from' addresses will ultimately fail. The spammers make up new addresses as they go, and you cannot defend yourself against that with manual filters.

      Keywords are changed

        All right, but what about the stuff they are advertising? If I don't want to receive any more Spam about cheap life insurance, I may be able to add a filter on keywords in the subject line or the message body. As it turns out, the spammers are already a step ahead of you. The spelling of the various keywords is changed in a more or less random manner, making it impossible to keep up. Example: "life insurance" may become "l1fe 1nsurance". Depending on the font, you may not even see the change! Another variation is to simply add spelling mistakes, such as "llife inshurance", or to modify the words by adding symbols, such as "\/iagra". Let me write that again in a different font, so you are sure to see what I just wrote: "\/iagra". See? You cannot possibly predict all the keywords that may be used and all their possible variations. Therefore, keyword filtering is not an option for you. And as the user of a web-based e-mail system, you usually don't have this choice anyway.
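
        To see why this fails, here is a minimal Python sketch of such a keyword filter. The blocklist and the function name are made up for illustration and are not taken from any real mail client. It catches only the exact spelling and misses every variation mentioned above:

```python
# Hypothetical blocklist, as a user might type it into a filter dialog.
BLOCKED_KEYWORDS = ["life insurance", "viagra"]


def naive_keyword_filter(subject: str) -> bool:
    """Return True if the subject contains a blocked keyword verbatim."""
    lowered = subject.lower()
    return any(keyword in lowered for keyword in BLOCKED_KEYWORDS)


subjects = [
    "Cheap life insurance today",    # exact spelling
    "Cheap l1fe 1nsurance today",    # digits instead of letters
    "Cheap llife inshurance today",  # deliberate spelling mistakes
    r"Buy \/iagra now",              # symbols instead of letters
]

# Map each subject line to the filter's verdict (True means "blocked").
results = {subject: naive_keyword_filter(subject) for subject in subjects}
```

        Only the first subject line is flagged. Each new variation would need its own blocklist entry, which is exactly the race you cannot win.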

      Randomized content

        Random letters, such as "gghwwrt", are a very common sight in Spam messages these days, either in the message body or even in the subject line. They have been introduced to confuse anti-spam systems which centrally detect whether a certain message is sent many times to different recipients. Spam is sent to many recipients, so some people thought that simply checking whether the same message arrives multiple times within a short time would be enough to identify it as Spam. By introducing random elements into the message, the spammers are trying to foil that system. The random characters effectively render some of the simpler anti-spam systems useless.
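
        As an illustration, here is a small Python sketch of such a "same message seen many times" check. It assumes the system compares messages by an exact hash of the body; real systems may well be smarter than this, and the function names are invented for the example:

```python
import hashlib


def fingerprint(body: str) -> str:
    """Exact-match fingerprint: the SHA-256 digest of the message body."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def count_duplicates(bodies: list[str]) -> dict[str, int]:
    """Count how many times each exact fingerprint was seen."""
    counts: dict[str, int] = {}
    for body in bodies:
        key = fingerprint(body)
        counts[key] = counts.get(key, 0) + 1
    return counts


# Three identical copies of one message...
identical = ["Get rich quick!"] * 3
# ...and the same message with a few random letters appended to each copy.
randomized = [
    "Get rich quick! gghwwrt",
    "Get rich quick! xkqpzm",
    "Get rich quick! bbtrwq",
]

max_identical = max(count_duplicates(identical).values())
max_randomized = max(count_duplicates(randomized).values())
```

        The identical copies pile up under one fingerprint, while every randomized copy gets a fingerprint of its own, so a simple "seen N times" threshold never triggers.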

      One person's Spam is another person's newsletter

        This is probably the biggest problem with any anti-spam solution. Some people LIKE to receive Spam, believe it or not. Others have subscribed to various newsletters, which are in effect mass e-mails, but which are not unsolicited and therefore are not Spam. If an ISP or e-mail provider implements a Spam-blocking mechanism which drops even a single e-mail that looks like Spam to everyone else, but which a particular person really wanted to receive, this person is rightfully very upset. This is called a 'false positive': a message was identified as Spam, but in this particular context, and for this particular recipient, this was the wrong classification. I have spoken to ISPs which had to endure the wrath of upset users, who were happy that 99% of Spam messages were dropped by the ISP, but were incredibly infuriated about the 1% of Spam that went through, and were even MORE upset about the one message that was dropped as Spam, but which they really wanted to receive. Centrally implemented anti-spam solutions will always suffer from this problem.

    Some first and very logical steps to take...

      Before diving into technical solutions in more detail, I would first like to point out a few things about Spam which everyone should be aware of.

      1. Don't reply!

        Never, ever reply to a Spam message. No matter how annoyed you are about the messages you receive, no matter how much you want to tell them to get lost, simply ignore the message and delete it. If you actually replied, you would do the spammer a favor! You would confirm that there really is a person receiving messages at your address. You would confirm that this address is 'live', and therefore your address becomes even more valuable to them, it gets traded to even more spammers... you get the picture. Don't reply!

      2. 'Unsubscribe' does not work, it does the opposite!

        Many Spam messages contain a link which supposedly allows you to unsubscribe from their mailing lists. Don't click on it! The effect is the same as if you had replied to the message, as described in the previous paragraph.

      3. Looking at the Spam message is bad!

        Looking at it is bad not only because of the 'quality' of the message's content, but also for a much more sinister reason. Many Spam messages these days are written in HTML, the same format that is used for web-pages. Most e-mail clients can display those messages, and render (draw) them just like a web-browser would display a web-page. As such, the message may contain links back to some server in order to download images, which are sometimes so small that you cannot even see them. This request back to the server, however, allows the spammer to track whether his message was opened. The request will contain a certain ID, which clearly identifies to the spammer that you just opened it, and therefore confirms your address as 'live'. This is bad, as explained in the previous two paragraphs! Therefore, do not view Spam messages. In programs such as Microsoft Outlook, the message is automatically opened, even if you just want to select it for deletion. Therefore, switch off the 'preview' pane, which can be done from the 'View' pull-down menu. Other client programs such as Eudora offer the same setting. Web-based systems such as Hotmail or Yahoo usually provide a check-box next to the message subject line, allowing you to delete a message before even looking at it.
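
        To show what such a tracking image looks like from the mail program's side, here is a short Python sketch that lists the remote images an HTML message would fetch when rendered. The sample message, the tracker address and the class name are invented for illustration; it uses only the standard `html.parser` module:

```python
from html.parser import HTMLParser


class RemoteImageFinder(HTMLParser):
    """Collect the src of every <img> tag that points to a remote server."""

    def __init__(self) -> None:
        super().__init__()
        self.remote_sources: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        src = dict(attrs).get("src") or ""
        if src.lower().startswith(("http://", "https://")):
            self.remote_sources.append(src)


def find_remote_images(html_body: str) -> list[str]:
    """Return the URLs that would be requested if the message were rendered."""
    finder = RemoteImageFinder()
    finder.feed(html_body)
    finder.close()
    return finder.remote_sources


# Invented example: a 1x1 image whose URL carries an ID unique to the recipient.
sample_message = (
    "<html><body><p>Great offer!</p>"
    '<img src="http://tracker.example/pixel.gif?id=abc123" width="1" height="1">'
    "</body></html>"
)
tracked_urls = find_remote_images(sample_message)
```

        Rendering the message means requesting every URL in `tracked_urls`, and the `id` in that request is what tells the sender that your address is read by a real person.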

      4. Using multiple mail-accounts can help!

        A commonly used technique to manage Spam is to use multiple e-mail addresses. Use one of the free and more or less anonymous addresses you can get at Yahoo or Hotmail or some such service if you need to post something to a discussion forum where your e-mail will get disclosed. You then expect this address to be spammed, but you don't really care. Give your 'real' address only to those people you really know. That way, by keeping your real address a secret, you have a chance of keeping it Spam free. Of course, this is not an alternative for people who need to make their e-mail available as an important contact and business tool on a web-site, for example.

      5. On web-sites, use forms, rather than the 'mailto:' tag!

        If you operate a web-page and need to offer visitors a means by which to contact you, try to use a contact form rather than a 'mailto:' tag, which is typically followed by your e-mail address. The 'robot' or 'spider' programs of spammers scan through all the pages on your site in an effort to find such e-mail links. If you use a contact form, however, you do not need to disclose your address. Visit this site's guestbook to see an example of such a form. A background application on the server stores the information submitted via the form. This can be set up in any way you wish, including forwarding the information to you via e-mail. Either way, your address is not available to anyone inspecting the HTML code of your web-pages. Which form capabilities are available to you will depend on your web host. If you run your own server, you have all the options, of course.
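
        The difference is easy to demonstrate with a few lines of Python. The pattern below is a deliberately simple stand-in for what a harvesting robot might use (real ones are presumably more thorough), and both page snippets and the address are made up:

```python
import re

# Simplified address pattern, assumed here purely for illustration.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest_addresses(html: str) -> list[str]:
    """Return every e-mail address found in the page source."""
    return sorted(set(EMAIL_PATTERN.findall(html)))


# A page that exposes the address through a 'mailto:' link...
page_with_mailto = '<a href="mailto:driver@example.com">Contact me</a>'

# ...and a page that offers a contact form instead.
page_with_form = (
    '<form action="/contact" method="post">'
    '<textarea name="message"></textarea>'
    '<input type="submit" value="Send">'
    "</form>"
)

found_in_mailto_page = harvest_addresses(page_with_mailto)
found_in_form_page = harvest_addresses(page_with_form)
```

        The first page gives up the address immediately. The second contains nothing to harvest, because the address lives only in the server-side application that processes the form.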


    Anti-Spam technology that works

      Now it is time to look at the various technologies out there which allow you to effectively manage a Spam problem. Many of those work if you use an e-mail client on your own computer. Users of a web-based system have somewhat fewer choices.

      I would like to point out that there are now MANY different commercial anti-spam products available. Just go to Google and search for "anti spam" or "spam blocking", and you will see the sponsored (commercial) links on the side, which lead to those products. I have not tested any of them, and I therefore cannot comment on their value. I would imagine that many of them use pretty much the same algorithms as many of the public-domain (free) solutions that you can download. So, I would think they should work well. But then, if a solution is available for free, why not use it? If you are into karting or any form of motor sport, you will BY DEFINITION be short of funds, so something free seems to have exactly the right price, as far as I am concerned. :-)

      Bayesian filtering

        This is one of the most promising and effective ways to fight Spam. I recently downloaded a Bayesian filtering plugin for Outlook, and it now routinely moves about 90% of my Spam to a special Spam folder the moment I receive it, without me ever seeing it, while all my legitimate mail remains in my Inbox. And this detection rate is going to improve as the system is trained...

        ... that's right, the system is trained. The advantage of this is quite simple: instead of relying on someone else's definition of what is Spam and what is not, you can train the system to recognize what YOU consider to be Spam. The system works so amazingly well, and is so easy to use, that it is surprising more people are not using it.

        As a first step, you need to collect a good-sized sample of your good messages and of the Spam you have received. You should have AT LEAST a few dozen messages of each category, and ideally a few hundred. But any number is a start. The more examples you have, the more accurate the system is.

        After you install the system, you typically tell it which mail folders contain Spam and which contain good messages. It then performs a statistical analysis of those messages, and trains itself to recognize which is which. With every new message you receive, it learns more. If a message is classified incorrectly, you can easily tell the system about it, so that after a while it won't make the mistake again. A nice side effect of this is that the system will always adapt to new trends in Spam, gently aided by some occasional human training.
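
        For the curious, the statistical analysis can be sketched in a few dozen lines of Python. This is a textbook naive Bayes classifier with add-one smoothing, written for illustration only. It is not the algorithm of any particular product listed below, and the class name and training messages are invented:

```python
import math
import re
from collections import Counter

TOKEN_PATTERN = re.compile(r"[a-z0-9$'!-]+")


def tokenize(text: str) -> list[str]:
    """Split a message into lowercase word-like tokens."""
    return TOKEN_PATTERN.findall(text.lower())


class TinyBayesFilter:
    """Toy two-class naive Bayes filter ('spam' versus 'good')."""

    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "good": Counter()}
        self.message_counts = {"spam": 0, "good": 0}

    def train(self, text: str, label: str) -> None:
        """Learn from one message that the user has labeled."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text: str) -> float:
        """Estimate the probability that a message is Spam."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["good"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "good"):
            # Prior: smoothed share of training messages with this label.
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                # Add-one smoothing; the extra +1 reserves room for unseen words.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_scores[label] = score
        # Turn the two log scores back into a probability for 'spam'.
        highest = max(log_scores.values())
        weights = {k: math.exp(v - highest) for k, v in log_scores.items()}
        return weights["spam"] / sum(weights.values())


spam_filter = TinyBayesFilter()
spam_filter.train("cheap life insurance, act now", "spam")
spam_filter.train("get rich quick, act now", "spam")
spam_filter.train("kart setup notes for the weekend race", "good")
spam_filter.train("race results and photos from the weekend", "good")

looks_spammy = spam_filter.spam_probability("cheap insurance, act now")
looks_fine = spam_filter.spam_probability("photos from the race weekend")
```

        With only four training messages the numbers mean little, but the principle holds: correcting a mistake is just one more call to `train` with the right label, which is why the filter adapts to YOUR mail.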

        Right now I am still checking my Spam folder once a day to see if something was misclassified. However, that usually only happens to certain newsletters that I have subscribed to. After a single 'teaching' about that particular message, the next newsletters already are recognized correctly. I am still receiving a few Spam messages each day in my Inbox, which the system was not 100% sure about yet. If that happens, I simply move them into the Spam folder, and the system automatically learns that this kind of message was Spam, and will likely not make this mistake again.

        You can download several free implementations of Bayesian Spam filters. Many of those systems perform additional analysis to increase accuracy. There is also a host of commercial products available. Some of the free ones are listed here:

        • SpamBayes.org
          This is a completely free and very effective solution, which works on Windows, Linux/Unix and MacOS. If you are a Windows user and use Outlook, it even provides a very convenient plugin, which is fully integrated into the Outlook application and allows you to perform all the training without ever having to leave Outlook. I am using this to handle my e-mail messages with great effect.

        • SpamAssassin
          Similar to SpamBayes, but using even more ways to detect Spam. SpamAssassin for Unix is free, and is somewhat of an industry standard. A free 'proxy' version is also available (along with a commercial version with support), which you can install on your Windows machine, but which is not integrated into your e-mail client program. On the other hand, it allows you to work with pretty much any e-mail client program you wish. A plugin for the Eudora mail client is available as well, but that is a commercial product with a 30-day free trial.

        • Spammunition
          This is another completely free anti-Spam filter plugin for Outlook. I have used it as well, and it works very nicely.

        • Mozilla
          Mozilla is of course the successor of the Netscape web-browser, and is not only a web-browser, but also a news reader, mail client and more. Without having to install anything additionally, Mozilla's mail client has a Bayesian Spam filter already fully integrated, which you can simply train from within Mozilla. Very convenient. Mozilla is a good browser, too.

        These are just some examples of the Bayesian filtering solutions that are available. Many more can be found on the Internet, but these should give you a good start. If you are interested in learning more about the background of Bayesian filtering, and how it was introduced as a Spam-fighting concept, I recommend the article "A Plan for Spam" by Paul Graham. It explains what Bayesian Spam filtering is all about.

      Using people to recognize Spam

        This is an interesting concept, which also relies on plugins in mail clients. Anyone who installs the plugin can report a Spam message to a central server with a single click of the mouse. Once a sufficient number of people have reported a particular message as Spam, the server updates all the client installations with an additional 'fingerprint' for this message, so that all the clients can recognize it when it arrives and filter it out. I have not used such a system myself yet, but I am assuming that it is intelligent enough not to be confused by the random characters which may be added to Spam.

        An example of a system like this is Cloudmark. This should probably work quite well. The advantage is that it is useful even for occasional e-mail users, who do not have a sufficiently large body of 'good' e-mail to train a Bayesian system with (although my Bayesian filtering worked well even though I did not have many messages to train with). The problem with many of those 'user driven' systems is that they are not free. Even though the monthly fee is quite low, it is still a cost. You can download a free trial version, though, if you are an Outlook user.

        The same idea is also used by iHateSpam, another commercial product, to supplement its built-in filtering capabilities. It costs about $19.95 and is available as a plugin for Outlook and Outlook Express.

      Challenge - Response

        This is a system which, at first glance, should be an absolutely accurate way to eliminate Spam: instead of making a message visible to you in your Inbox, the system holds the message for a while and sends an e-mail back to the sender. The challenge reads something like this:

          "Hi! You have just sent an e-mail to racing_at_domain-name-of-this-site.com. To confirm that you are really human, would you mind filling out this form, by specifying your name? Once you have done this, your original message will be sent on to the intended recipient. Thank you!"

        An example of a major provider who offers this system is Earthlink.

        The idea is that the sender of legitimate e-mail will perform the challenge that is described in the e-mail, while a spammer will either not bother, or will not receive this challenge anyway, since they are normally using fake 'from' addresses.
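
        The hold-and-release logic can be sketched in Python as follows. This is a simplified model under my own assumptions, not how Earthlink or any other provider implements it: the class and method names are invented, the "challenge" is reduced to a random token, and senders are remembered by their (spoofable) 'from' address:

```python
import secrets
from typing import Optional


class ChallengeResponseGate:
    """Hold mail from unknown senders until they answer a challenge."""

    def __init__(self) -> None:
        self.approved_senders: set[str] = set()
        self.held: dict[str, tuple[str, str]] = {}  # token -> (sender, message)
        self.inbox: list[tuple[str, str]] = []

    def receive(self, sender: str, message: str) -> Optional[str]:
        """Deliver mail from known senders; otherwise hold it and return the
        token that would be sent back to the sender in the challenge e-mail."""
        if sender in self.approved_senders:
            self.inbox.append((sender, message))
            return None
        token = secrets.token_hex(8)
        self.held[token] = (sender, message)
        return token

    def confirm(self, token: str) -> bool:
        """Release the held message once the sender answers the challenge."""
        entry = self.held.pop(token, None)
        if entry is None:
            return False  # unknown or already used token
        sender, message = entry
        self.approved_senders.add(sender)
        self.inbox.append((sender, message))
        return True


gate = ChallengeResponseGate()
token = gate.receive("friend@example.com", "See you at the track on Sunday?")
held_before_confirm = list(gate.inbox)  # still empty: the message is on hold
gate.confirm(token)                     # the friend answers the challenge
gate.receive("friend@example.com", "Bring the spare tires.")  # delivered directly
```

        A spammer using a fake 'from' address never sees the token, so the held message is simply never released.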

        A challenge-response system is very effective against Spam. However, the big problem is that it puts a burden on the legitimate sender of an e-mail. If you are an online business, you want to make initial customer contact as easy and painless as possible. Sending a challenge back may just frustrate some potential customers, so that you will never hear from them again.

        Also, what if the person that tried to contact you does not speak English?

        However, if you use your e-mail for personal communication only, and you know that people who contact you will be able to respond properly to a challenge, then a challenge-response system may be right for you.

      Web-based e-mail

        Web-based e-mail systems have become popular, since you can read your mail from anywhere, and the mail account is completely free. Many of those systems, for example Yahoo or Hotmail, now also offer Spam filtering. While these Spam filters can be effective, the problem remains that the filter is centrally administered. It cannot be trained by YOU, and it will therefore always remain more or less inaccurate. Nevertheless, you should consider switching to a web-mail provider which does offer Spam filters, since something is better than nothing. You may have to check the messages classified as 'Spam' a bit more carefully, though, to make sure you catch false positives.

        Many of these web-based mail services allow you to access your mail via POP (the Post Office Protocol, which is the standard protocol on the Internet for receiving e-mail). In that case, you can use client software such as Outlook, Eudora or Mozilla, and all the filtering capabilities that come with it. In order to receive your e-mail via POP, the web-based services usually require you to sign up for their 'premium' services, which cost a certain amount per month. That makes sense, since you will no longer be viewing the advertisements which you would normally see while checking your messages. However, if you have to deal with a lot of Spam, and you need the flexibility of a system which can operate as web-based e-mail when you need it (when you travel, for example), it may be worth it.

    In closing

      As you can see from this article, there are solutions to fight spam out there for everyone. You have the most choices if you use an e-mail client running on your computer, but even as a user of web-based e-mail, you are not helpless. And best of all: Most of these solutions are completely free to you.


    If you have any suggestions or feedback about this article, or if you know about other significant anti-spam methods I should include here, please send me e-mail at: racing_at_domain-name-of-this-site.com.






    (C) Copyright 1998-2000 Resonate Brendel Racing. Do not duplicate or redistribute. racing_at_domain-name-of-this-site.com