Time to Replace SMTP – Part 2

In my last article, we talked about spam as the problem which SMTP can never solve.  So what is the solution? 

First, let’s be clear on the goals:

  1. Reliable transport of email
  2. Efficient transport of email
  3. No spam
  4. Ability to send a message just by knowing a user’s address, nothing more.  If we make it more complicated, users will not switch.

Sender Challenge

My first choice for reducing spam is to use a CAPTCHA or another challenge-response system the first time a sender sends email to a particular recipient.  This would mean that the first time you tried to send email to Joe, you would be challenged with a CAPTCHA-like system to verify that you are indeed human.  You don’t need to know anything about Joe other than his email address, but you do need to be able to decipher the challenge.  This simple challenge would kill almost all spam, as spammers would no longer be able to send millions of emails per day.  Once you’ve sent email to a recipient and passed the challenge, that recipient could add you to a “no challenge required” list, and communicating further with that user would not require additional challenges.  There is a drawback to this, however: spammers will start trying to figure out who is on your whitelist and forging emails from your approved contacts.  Still, this will be much more difficult than the current system, where spammers can simply flood the net.
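To make the flow concrete, here is a minimal sketch of the recipient side.  Everything in it is an assumption for illustration: the `Mailbox` class, its method names, and especially the arithmetic question, which merely stands in for a real CAPTCHA.

```python
import secrets


class Mailbox:
    """Hypothetical recipient-side mailbox that challenges unknown senders."""

    def __init__(self):
        self.no_challenge_required = set()  # senders who already passed
        self.pending = {}                   # sender -> (expected answer, held message)
        self.inbox = []

    def receive(self, sender, message):
        """Deliver at once for known senders; otherwise hold the mail and return a challenge."""
        if sender in self.no_challenge_required:
            self.inbox.append((sender, message))
            return None
        # Stand-in for a CAPTCHA: a trivial sum (assumption, not a real human test).
        a, b = secrets.randbelow(10), secrets.randbelow(10)
        self.pending[sender] = (str(a + b), message)
        return f"What is {a} + {b}?"

    def answer(self, sender, response):
        """On a correct response, deliver the held message and whitelist the sender."""
        if sender not in self.pending:
            return False
        expected, message = self.pending[sender]
        if response.strip() != expected:
            return False
        del self.pending[sender]
        self.no_challenge_required.add(sender)
        self.inbox.append((sender, message))
        return True
```

Note that the sketch keys the whitelist on the sender’s address alone, which is exactly what makes the forging attack described above possible.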

There are some downsides to using a Sender Challenge, but I think they are surmountable.  First, mail servers would need to be able to look up challenges in real time, rather than relying on the store-and-forward technology used today.  With SMTP, “email relaying” is a core concept for delivery of mail.  After all, it was designed in the ’70s/’80s, when the Internet was an interconnect of many different networks, many of which were disconnected for most of the day.  Most mail servers today intentionally disable mail relaying, because it is a spammer’s dream.  Relaying could still exist, but only if the challenge could still be done in real time.  Switching to a real-time protocol seems pretty practical at this point.

Another problem is how to receive automated email from legitimate sources.  For example, you really do want Amazon to send you an email when your purchase has shipped.  But Amazon’s computer won’t be able to pass your email-client’s challenge.  To overcome this, I think you’d need to specifically give Amazon a pre-approved token for sending email.  Your mail server could be configured to determine the lifetime of that token.  You could either allow Amazon to send you email via that token forever, or for a short period of time, or for just 3 emails.  The receiver would be in control. 
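A rough sketch of how such tokens might be tracked on the receiver’s mail server follows.  The `TokenRegistry` class, its method names, and the `orders@shop.example` address are all hypothetical; the point is only that the receiver decides the lifetime and the number of uses.

```python
import secrets
import time


class TokenRegistry:
    """Hypothetical receiver-side store of pre-approved sender tokens."""

    def __init__(self):
        self._tokens = {}  # token -> {"sender", "expires_at", "uses_left"}

    def issue(self, sender, lifetime_seconds=None, max_uses=None):
        """Create a token; None means no time limit or no use limit."""
        token = secrets.token_urlsafe(16)
        expires_at = None if lifetime_seconds is None else time.time() + lifetime_seconds
        self._tokens[token] = {
            "sender": sender,
            "expires_at": expires_at,
            "uses_left": max_uses,
        }
        return token

    def accept(self, sender, token, now=None):
        """Return True if this sender may skip the challenge with this token."""
        now = time.time() if now is None else now
        grant = self._tokens.get(token)
        if grant is None or grant["sender"] != sender:
            return False
        if grant["expires_at"] is not None and now > grant["expires_at"]:
            del self._tokens[token]  # expired: forget it
            return False
        if grant["uses_left"] is not None:
            if grant["uses_left"] <= 0:
                return False
            grant["uses_left"] -= 1
        return True


registry = TokenRegistry()
forever = registry.issue("orders@shop.example")                       # no limits
one_day = registry.issue("orders@shop.example", lifetime_seconds=86400)
three_mails = registry.issue("orders@shop.example", max_uses=3)
```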


DiffMail – A Pull-based email approach

In researching new SMTP mechanisms, I came across DiffMail.  DiffMail is also a proposal for altering SMTP to make it better able to deal with spam.  It does have one fantastic recommendation: pull-based email.

Today, SMTP pushes email from the sender to the recipient.  That is, when the sender sends email, the recipient’s server is required to accept it.  DiffMail proposes that we switch it around.  Sending an email would merely send an informational notice to the receiver saying “I have mail for you on my server if you want it.”  The sender would be required to keep the email on his server until the recipient picks it up.  (Keep in mind the user would never see this – this is just what the protocol is doing on behalf of the user) 

When SMTP was designed, a pull-based approach was not feasible, because you couldn’t guarantee the sender’s server would be up when the receiver wanted to read mail.  Today, however, this is not an obstacle.  This model again clips the wings of the spammer, because it requires the spammer to remain online in order to deliver the message.  And of course, if the spammer’s content is online, it is much easier to pinpoint the spammer and stop him altogether.  Further, if we add a little cryptography to the format of email, we can force the spammer to either run a sophisticated spam server application or keep a lot of email stored on his own disk.  Either one reduces his ability to send spam at large scale.  The end user, of course, would never see any of this “push” or “pull” of email.  It would all happen under the covers of the email client.

An interesting feature emerges from switching to the pull model.  Have you ever wished to “retract” an email after you sent it but before it was read?  This model would allow you to edit the email as many times as you wish before the receiver downloads it.  Of course, if the receiver pulls it right away, you may not be able to retract. 
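The pull model, including the edit and retract behavior, can be sketched in a few lines.  This is a toy in-memory model under my own assumptions: `SenderServer`, `RecipientServer`, and their methods are hypothetical names, and there is no networking or authentication here at all.

```python
import itertools


class SenderServer:
    """Hypothetical sender-side server: holds mail until the recipient pulls it."""

    def __init__(self):
        self._stored = {}               # message_id -> body
        self._ids = itertools.count(1)

    def send(self, recipient_server, recipient, body):
        """Store the body locally and send only a notice to the recipient's server."""
        message_id = next(self._ids)
        self._stored[message_id] = body
        recipient_server.notify(recipient, self, message_id)
        return message_id

    def edit(self, message_id, new_body):
        """Edit a message; only possible while it has not been pulled yet."""
        if message_id not in self._stored:
            return False
        self._stored[message_id] = new_body
        return True

    def retract(self, message_id):
        """Withdraw a message; returns False if it was already pulled."""
        return self._stored.pop(message_id, None) is not None

    def fetch(self, message_id):
        """Hand the body over once, then stop storing it."""
        return self._stored.pop(message_id, None)


class RecipientServer:
    """Hypothetical recipient-side server: collects notices and pulls on demand."""

    def __init__(self):
        self.notices = {}               # recipient -> [(sender_server, message_id)]

    def notify(self, recipient, sender_server, message_id):
        self.notices.setdefault(recipient, []).append((sender_server, message_id))

    def pull_all(self, recipient):
        """Fetch every announced message that the sender still holds."""
        bodies = []
        for sender_server, message_id in self.notices.pop(recipient, []):
            body = sender_server.fetch(message_id)
            if body is not None:        # None means it was retracted
                bodies.append(body)
        return bodies
```

A message that is retracted before `pull_all` runs simply never arrives, and `edit` fails once the body has been fetched.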

DiffMail proposes pull-based email, with the Sender Challenge as an optional component.  But I don’t think pull-based email alone is enough.  Imagine arriving at your machine and having to sort through 200 potential “new contact requests” to find the one person who is legitimate.  It is really the same problem as manually filtering out spam email, except that you would only have the sender’s email address to use for deciding whether to open the message.

There is a big risk with DiffMail, however.  With SMTP, the sender delivers the email content to the recipient immediately (disregard mail queues or relays for the moment).  In the new model, the sender must host a copy of that email on his server, waiting for the intended recipient to come back and fetch it.  How will the server know whether a bad guy is trying to steal the email by pretending to be the real recipient?  DiffMail proposes to solve this by basically making the email contain a really big random number to hide the content.  But it isn’t a very safe solution.  I do believe the security issues can be overcome, but they are definitely not trivial.
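To illustrate the general idea of a “really big random number”, here is a sketch in which the sender’s server releases a message only to whoever presents the random key that accompanied the notice.  This is my own simplification, not DiffMail’s actual scheme; `GuardedStore` and its methods are hypothetical names.

```python
import hmac
import secrets


class GuardedStore:
    """Hypothetical sender-side store that requires an unguessable key to fetch."""

    def __init__(self):
        self._messages = {}  # message_id -> (fetch_key, body)

    def store(self, message_id, body):
        """Keep the body and return the key to include in the notice."""
        fetch_key = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
        self._messages[message_id] = (fetch_key, body)
        return fetch_key

    def fetch(self, message_id, presented_key):
        """Release the body only if the presented key matches."""
        entry = self._messages.get(message_id)
        if entry is None:
            return None
        fetch_key, body = entry
        # Constant-time comparison, so timing does not leak how much matched.
        if not hmac.compare_digest(fetch_key.encode(), presented_key.encode()):
            return None
        del self._messages[message_id]
        return body
```

Guessing the key is hopeless, but the sketch also shows why this isn’t very safe: anyone who can read the notice in transit holds the key and can fetch the mail before the real recipient does.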


Sender Challenge or Pull-based Email or Both?

The Sender Challenge system seems pretty straightforward to implement.  It fits within the email model that we have today, but introduces an optional challenge which the sender must be able to respond to.  Receivers of email need to be able to identify which senders need to be challenged and which do not.  That is a slightly tricky concept, but manageable.  The drawback to using only a Sender Challenge is that spammers could continue to exist.  Their new job would be to find ways to beat the challenge, either by random guessing or by more sophisticated techniques.  If they can figure out ways to do this, we’ll have spam again.

The pull-based email system makes it costlier for the spammer to send mail at scale.  It also is a somewhat more logical way to communicate given universal access to the network.  However, it does represent a major architectural shift, and it will be a challenge for ISPs to deploy.

For the strongest spam protection, combining the two may be the best bet, but it will obviously force a lot of changes to our existing systems.


Compatibility with Existing SMTP

It does seem attractive to make this system compatible with SMTP.  DiffMail attempts to do this by proposing some modest additions to the SMTP protocol.  However, I think this is not the right approach, for a couple of reasons.

First off, even if we only change the protocol slightly, the mail servers all need to change radically.  With SMTP, we all deal with our outbound mail servers (SMTP) and our inbound mail storage (POP/IMAP/WebMail).  These are distinct servers.  Using any pull-based email mechanism will require the sender’s outbound mail server to now store data.  Further, routers and firewalls will need to be reconfigured.  Everything in the system is different from a deployment perspective, so why pretend it is the same via the protocol?   It does allow a recipient to receive both types of mail via a single server, which is helpful if your correspondents aren’t willing to move from SMTP.  But, in that model, you have no benefit over existing SMTP, as the spammers can still get to you.

Another problem with maintaining SMTP compatibility is that the protocol is not very efficient.  Sending each email via SMTP takes at least 4 round trips to the server.  A model that sends all headers in one half trip and receives the responses in the second half trip would allow for much lower cost server deployments.  If we’re breaking the world, we may as well do it right.
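A back-of-the-envelope calculation shows what the extra round trips cost.  The 100 ms round-trip time and the one million messages are assumed figures chosen only for illustration, and the function name is hypothetical; the result is total time spent waiting on the network, summed over all messages.

```python
def waiting_ms(round_trips, rtt_ms, messages):
    """Total time spent waiting on the network, summed over all messages."""
    return round_trips * rtt_ms * messages


RTT_MS = 100          # assumed round-trip time
MESSAGES = 1_000_000  # assumed message volume

smtp_ms = waiting_ms(4, RTT_MS, MESSAGES)      # at least 4 round trips per email
one_trip_ms = waiting_ms(1, RTT_MS, MESSAGES)  # headers out, responses back

print(f"SMTP (4 round trips): {smtp_ms / 3_600_000:.1f} hours of waiting")
print(f"Single round trip:    {one_trip_ms / 3_600_000:.1f} hours of waiting")
```

Under these assumptions the single-round-trip design spends a quarter of the time waiting, whatever the actual round-trip time turns out to be.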


