Overview

Traveler is an E-mail system in use by me, Andy Valencia, handling E-mail for my personal domains as well as my operating system projects, vsta.org and forthos.org. Prior to its deployment, I was receiving around 60 spams per day. Since its deployment in October of 2003, I have received two. Both of these were handled such that I will never receive spam from those channels again, and yet I achieved this without requiring the cooperation of any ISP or mail administrator, nor any changes to protocols or message formats.

This document describes my system, and why I have found it structurally resistant to spam, rather than adaptively resistant to spam. I use the word structural to denote a system where the would-be spammer finds it difficult to initiate spam; adaptive approaches are ones where the spammer is free to act as before, but techniques are used to identify, manage, and respond to these behaviors. Spam complaints and filters are examples of the latter, adaptive style.

Background

The design of the global TCP/IP based, SMTP E-mail system goes back to a time when successfully and reliably exchanging E-mail was a considerable challenge. That such a system has managed to scale up to support communication across a large percentage of the world's population is a tribute to the skill and insight of its designers. However, it was not designed to handle willful and sustained abuse (initially, the thought was that social pressure would prevent inappropriate uses).

Several background aspects of the world's IP Internet system provide an enabling environment for spammers. For instance, almost anyone can get an Internet connection for a modest amount of money--even if they have previously exhibited bad, even destructive behavior in their use of such feeds. This fact appears to be unchangeable.

Similarly, once connected to the Internet, it is easy to scan web and FTP sites for E-mail addresses. The Internet is much less useful if people can't establish contact with each other, so inevitably E-mail addresses show up in public locations. While some people try to minimize the exposure of their E-mail address, it is very hard to avoid it in the long haul given newsletters, E-commerce, various address books, chat logs, and the myriad archives which build up of online activities. Attempts at convolution ("joe at domain.com") have been noticed by spam address harvesters, who can automatically deconvolute them.

Once the spammer has a connection and a list of addresses, it is very easy to build E-mail messages with headers and contents which are entirely unreliable. Most of the indications of where it came from can be falsified, as can, of course, the wild claims often contained in the message itself.

The Traveler Approach

While looking at this problem (and digging my way through a daily dose of irritating and even repulsive offers), I came to realize that my E-mail address had become a burden. While it was a boon to have it known by friends and acquaintances, and even to have it discovered by people with whom I had lost touch, it was also an unchanging name which, once known by the spammers, would forevermore be the target of offers and scams.

So how could I change my E-mail system so that I could still be in touch with current and future correspondents, and yet not have to "sit still" for the endless bludgeoning of spam? I wanted to implement something locally, and not require software or protocol changes elsewhere in the Internet. I read through the E-mail headers of my spam, and tried to figure out how to stop it.

One approach would be to attempt to filter spam. At any given time, there tend to be features of a spam E-mail which make it stand out from any other. But these features are not reliable; even at a time when, say, many spams had multiple exclamation points in the Subject line, there were still plenty more which didn't. And it seemed clear that the spammers were becoming creative in order to circumvent just this kind of pattern recognition.

Many Names

I finally settled on the approach which also suggested the name of my system. Instead of having a single E-mail address, I would have a unique E-mail address for each different correspondent with whom I exchanged E-mail. On the incoming side, my mail server would recognize all these myriad names, and funnel it all into a single inbox behind the scenes. On the outgoing side, I would customize my mail program, modifying my message on its way out to take on the E-mail address which that correspondent would use to send back to me. The name "Traveler" comes from a series of stories "The Traveler in Black", whose protagonist is a person with "many names, but only one nature."

Incoming

Specifically, I used a random number generator to drive a "pronounceable password" generator. It spat out things like "jonscana" and "poostens", which I pasted onto my "account prefix", resulting in addresses like "ajv-jonscana@vsta.org". I generated 1,000 of these, and pasted them all into my sendmail aliases file. They all map to a single account, hidden behind the scenes.

I used a pronounceable password generator because I thought it would make it easier to live with these names. In fact, it turns out people handle the reading off of digits much more easily--so if I was doing it again, I'd just have accounts like "ajv-72957481@vsta.org". The only important part is that the random part be long and random enough that spammers can't guess a working E-mail account name.
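
As an illustration only (this is not the actual Traveler code), a pool of digit-based addresses and the matching sendmail aliases entries could be produced along these lines. The account name "realinbox" is a hypothetical stand-in for the hidden account, and Python's secrets module is an assumption standing in for whatever random number generator was really used.

```python
import secrets

PREFIX = "ajv"               # account prefix, as in the examples above
LOCAL_ACCOUNT = "realinbox"  # hypothetical name of the hidden account


def random_tag(digits=8):
    """Return a string of `digits` unpredictable decimal digits."""
    return "".join(secrets.choice("0123456789") for _ in range(digits))


def make_pool(count, digits=8):
    """Generate `count` distinct local parts such as 'ajv-72957481'."""
    pool = set()
    while len(pool) < count:
        pool.add(f"{PREFIX}-{random_tag(digits)}")
    return sorted(pool)


def alias_lines(pool, target=LOCAL_ACCOUNT):
    """Format each local part as an aliases entry of the form 'name: target'."""
    return [f"{name}: {target}" for name in pool]


if __name__ == "__main__":
    for line in alias_lines(make_pool(5)):
        print(line)
```

With eight digits there are 100 million possible names, of which only 1,000 are live, so blind guessing is unlikely to pay off for a spammer.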

Outgoing

Sending was a little more complicated. I use the nmh(1) mail handling system, which is very nicely modular. I ended up writing a custom "sendproc" program for NMH, and specifying it in my NMH ".mh_profile" file. This program gets invoked when it's time for a composed E-mail to be sent out to the network. My program, written in Python, then scans through the To:, Cc:, and Bcc: fields. It maps each recipient against its database, and comes up with the E-mail address which they would use if they were to contact me. It then iteratively rewrites the From: field to look like that address, and feeds it on to sendmail, one copy for each recipient.
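
The per-recipient rewriting described above might look roughly like the following sketch. It is not the real sendproc; the function name and the `lookup` callable (which maps a recipient to the alias that recipient should use) are assumptions made for illustration.

```python
from email import message_from_string
from email.utils import getaddresses


def per_recipient_copies(raw_message, lookup):
    """Yield (recipient, message_text) pairs, one copy per recipient.

    `lookup` is a hypothetical callable: recipient address -> my alias.
    """
    msg = message_from_string(raw_message)

    # Collect every address named in To:, Cc:, and Bcc:.
    fields = []
    for name in ("To", "Cc", "Bcc"):
        fields.extend(msg.get_all(name, []))
    recipients = [addr for _, addr in getaddresses(fields) if addr]

    del msg["Bcc"]  # blind copies must not appear in what goes out

    # dict.fromkeys drops duplicates while keeping the original order.
    for rcpt in dict.fromkeys(recipients):
        del msg["From"]            # no error if the header is absent
        msg["From"] = lookup(rcpt)
        yield rcpt, msg.as_string()
```

Each copy would then be handed to sendmail for that one recipient only, so every correspondent sees just the address assigned to them in From:.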

My "sendproc" program uses a database file which records all 1,000 addresses by which I might be known. Initially, this file recorded all 1,000 addresses as unused. When I first send to somebody, it notices that this person is not yet in the database, takes the next free address from the pool, records that this address will be used for this recipient from now on, and then processes the message. Note that all 1,000 aliases were already up and usable on my mail server, but since nobody knows (or can guess) what those addresses are, each one is basically unusable until my mailer first uses it to send E-mail to a correspondent.

I can also allocate an address interactively. When I'm on a WWW order form and some merchant wants my E-mail address, I pop a free one out of my database, recording that I've used it for this merchant. Their order confirmation comes through just fine. But if they sell my address to somebody else, and I get spammed, I can go back to my database and see who was using that address to reach me.
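
The allocation and reverse lookup described in the last two paragraphs can be sketched as a small class. This is an assumed design, not the actual database: the JSON file format, the class name AddressBook, and its method names are all hypothetical.

```python
import json
from pathlib import Path


class AddressBook:
    """Tracks which alias each correspondent (or merchant) was given."""

    def __init__(self, path, pool=()):
        self.path = Path(path)
        if self.path.exists():
            state = json.loads(self.path.read_text())
        else:
            # Fresh database: every address in the pool starts out unused.
            state = {"free": list(pool), "used": {}}
        self.free = state["free"]
        self.used = state["used"]  # correspondent -> alias

    def alias_for(self, correspondent):
        """Return the alias for this correspondent, allocating on first use."""
        alias = self.used.get(correspondent)
        if alias is None:
            if not self.free:
                raise RuntimeError("address pool exhausted")
            alias = self.free.pop(0)
            self.used[correspondent] = alias
            self.save()
        return alias

    def who_uses(self, alias):
        """Reverse lookup: who was handed this alias?"""
        return sorted(c for c, a in self.used.items() if a == alias)

    def save(self):
        state = {"free": self.free, "used": self.used}
        self.path.write_text(json.dumps(state, indent=2))
```

The same alias_for() call serves both cases: the mailer calls it with a recipient's address, and an interactive tool could call it with a note such as a merchant's name. When spam shows up on an alias, who_uses() names the party that leaked it.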

Contact

This all works fine so long as I initiate the E-mail. But my old address "vandys@vsta.org" is widely known--I didn't want to just shut it off. Or if I'm far from my laptop and I want to write my E-mail address down for somebody, I can't possibly remember the next unused address from my pool of 1,000 addresses.

So instead what I did was to connect my old E-mail address to the vacation(1) auto-responder program which ships with most any UNIX-ish system. The bounce message it sends points the sender to a form on my WWW site. Right off the bat, most spammers don't see this, since they generally don't have a workable return address in the stuff they send. But a legitimate sender will, and when they follow the URL to my WWW site, they find a little contact form which lets them write their name, E-mail address, and a short comment. Once they submit this, it gets turned into a message which gets queued to my inbox, and I can then send my would-be contact an E-mail to get things going.
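
For illustration, turning a submitted contact form into a message for the inbox could be done roughly as follows. The function name, the field limits, and the inbox address "me@localhost" are hypothetical; the real form handler is not described in this document.

```python
from email.message import EmailMessage

MAX_COMMENT = 500  # hypothetical cap on the comment field


def contact_request(name, address, comment, inbox="me@localhost"):
    """Build a short message from the three contact-form fields."""

    def one_line(text, limit):
        # Collapse all whitespace (including newlines) and truncate, so a
        # form field cannot smuggle extra header lines into the message.
        return " ".join(text.split())[:limit]

    msg = EmailMessage()
    msg["To"] = inbox
    msg["Subject"] = "Contact request from " + one_line(name, 80)
    body = (
        f"Name: {one_line(name, 80)}\n"
        f"Address: {one_line(address, 120)}\n\n"
        f"{comment[:MAX_COMMENT]}\n"
    )
    msg.set_content(body)
    return msg
```

The submitted address is kept in the body rather than in a header, since it is unverified text typed into a web form.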

Gotchas

I'm obviously using E-mail in a way which doesn't reflect how most of the world uses it. Mostly I "fit in" and have no problems. But there are a couple of cases which have given me pause.

First is the contact form. It's a nuisance, and there are some people who grumble. But spam has become such an issue for the E-mail-using population that it rarely causes more than a raised eyebrow.

When participating in an E-mail discussion with a long To: list, your individual E-mail addresses for each correspondent will be seen by all the others, because the sender puts their idea of your To: address in the copy which they send out to all others on the distribution. Because all the addresses are of the form "ajv-*@vsta.org", it rarely causes confusion. However, when I join a particular group which will be operating in this fashion, I try to get all of them lumped under a single personal address for my side (just like I did for my mother and father) to keep things as seamless as possible.

Without thinking about it, I allocated a unique E-mail address each for eBay and PayPal. But the two cross-link to each other, and I ended up having to run two PayPal accounts. I try to be careful around E-mail based systems which cross-link with each other.

A common occurrence is that I send a message to somebody at one address, but they answer back from a different address. Unlike many other systems, their answer is received just fine--they know a working address for me, and my system doesn't care what From: address uses that address. My mail program scans received E-mail, and adds these new addresses into its record of who uses what address:

Step 1: I send a message to joe@isp1.com
Step 2: My mailer allocates the next "free" address
Step 3: My address database now has:
joe@isp1.com -> ajv-19562839@vsta.org
Step 4: My message is sent off to sendmail.
Step 5: Joe reads my message and sends back to me from joe_jones@isp2.com
Step 6: On receipt, my mailer notices that the From: isn't joe@isp1.com, but is instead joe_jones@isp2.com
Step 7: My mailer updates its database:
{joe@isp1.com, joe_jones@isp2.com} -> ajv-19562839@vsta.org
Step 8: I can now send to either address; I am ajv-19562839@vsta.org to either address.
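
Steps 6 and 7 above can be sketched as follows. This is an assumed implementation: it finds the alias by looking in the To: and Cc: headers, whereas a real mailer might use the envelope recipient instead, and the function name learn_sender is hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr


def learn_sender(used, raw_message):
    """Record a new From: address under the alias the message arrived on.

    `used` maps correspondent address -> my alias, and is updated in place.
    Returns the alias if a new sender was learned, otherwise None.
    """
    msg = message_from_string(raw_message)
    _, sender = parseaddr(msg.get("From", ""))

    # Which of my known aliases does this message name as a recipient?
    targets = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    known = set(used.values())
    mine = [addr for _, addr in targets if addr in known]

    # Only learn when the sender is new and the alias is unambiguous.
    if sender and sender not in used and len(mine) == 1:
        used[sender] = mine[0]
        return mine[0]
    return None
```

Run against Joe's reply from Step 5, this adds joe_jones@isp2.com to the database under ajv-19562839@vsta.org, which is the state shown in Step 7.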

When I Get Spammed

So with all of this system in place, what have I found? That I really enjoy E-mail now that each message in my inbox is something I actually want to read. My old address, when I check my mail server logs, still gets hit with those 60 or so messages a day. But they all get bounced, without me having to see any of it. The one legitimate sender in 1,000 to my old address will get back the mail bounce, follow the URL, and fill in the contact form.

As I said at the beginning, I have received spam twice since cutting over to this new system. The most recent was a copy of the "Mydoom" virus (I use FreeBSD, so infection wasn't the issue; I was still annoyed at having my time and bandwidth wasted). Although the From: address and related headers completely hid where the message had come from, it had to come in to a particular alias for me, and all I had to do was look that up in my database, and send off a warning to the person using that alias, advising them that their PC was infected.

In the other case, I received spam via an address which I had provided to a vendor during a web purchase. I didn't even bother complaining; I just went up on my mail server and deleted that alias from the sendmail aliases file. From then on, any spammer trying to use the address I had given to that vendor would get a bounce--the address no longer existed. Deleting your address when you get spammed doesn't work when you only have a single E-mail address (you'd lose touch with all your friends!). But when you have a unique address for each and every correspondent, you can delete one and only affect that one single correspondent.
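
Retiring a compromised address amounts to dropping one line from the aliases file. A minimal sketch, assuming one "name: target" entry per line with no continuation lines (the function name drop_alias is hypothetical):

```python
def drop_alias(aliases_text, name):
    """Return the aliases file contents without the entry for `name`."""
    kept = []
    for line in aliases_text.splitlines(keepends=True):
        is_comment = line.lstrip().startswith("#")
        entry = line.split(":", 1)[0].strip()
        if not is_comment and ":" in line and entry == name:
            continue  # this is the alias being retired
        kept.append(line)
    return "".join(kept)


if __name__ == "__main__":
    sample = "# aliases\najv-jonscana: hidden\najv-poostens: hidden\n"
    print(drop_alias(sample, "ajv-jonscana"), end="")
```

On a sendmail system the alias database normally has to be rebuilt after the file is edited (the newaliases command) before the deletion takes effect.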

Conclusions

It was a week or so of work to code up my mail system; the results have been every bit as good as I had hoped. Spam is a parasitic and antisocial behavior with global ramifications. But when I hear about the various Draconian prescriptions which are proposed (do you really want the world's governments to get together and control E-mail content?), I feel that I should write up my own approach to spam handling. I'm sure there are ways in which it falls short of the requirements of the mail using public, but the small amount of inconvenience I've experienced while living with it these last 14 months seems like a minor cost for having taken control of my inbox. I'm a public, long-time member of the Internet, with a public, long-time E-mail address, and I don't see spam.

The most obvious opening in my system for spam-like abuse is the contact form on my web server. When a worm starts driving spam directly into that input form, I will probably put up one of those graphic displays with an "Enter the number you see above" form input. I suppose the spammers will then have to use sweatshops of people to read these forms and fill in the answer, but by then they're already paying a LOT more per message than I think a spammer can afford. I cap the form size at something quite small, so it's not like they can push a whole advertisement into it anyway.

You are welcome to contact me via my home page. Be prepared to fill in a form. :->