pymta Documentation

pymta is a library to build a custom SMTP server in Python. This is useful if you want to...

  • test mail-sending code against a real SMTP server even in your unit tests.
  • build a custom SMTP server with non-standard behavior without reimplementing the whole SMTP protocol.
  • have a low-volume SMTP server which can be easily extended using Python.

Installation and Setup

pymta is just a Python library which uses setuptools so it does not require a special setup. The only hard dependency is repoze.workflow (0.2dev). To serve multiple connections in parallel, pymta uses the multiprocessing module which was added to the standard library in Python 2.6 (there are backports for Python 2.4 and 2.5).

Currently pymta is only tested with Python 2.5 but probably 2.4 works too. The goal is to make pymta compatible with Python 2.3-2.6. With Python 2.3 you can only serve one connection at a time due to the lack of a multiprocessing module.

repoze.workflow

repoze.workflow is not available in pypi (yet?) so you have to install it directly from its svn repository:

easy_install http://svn.repoze.org/repoze.workflow/trunk/

repoze.workflow requires zope.interface which is available via pypi (and installable via the package manager for most Linux distributions).

multiprocessing

The multiprocessing module hides most of the operating system differences when it comes to multiple processes. The module is included in Python 2.6 and is also available standalone via pypi for older Python versions:

easy_install multiprocessing

If multiprocessing is not installed, pymta will fall back to single-threaded execution automatically (therefore multiprocessing is not a hard requirement in the egg file).
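The fallback described above follows the usual optional-import pattern. The following is a minimal sketch of that pattern, not pymta's actual code: the function name handle_connection and its mp parameter are made up for illustration.

```python
try:
    import multiprocessing
except ImportError:
    # Optional dependency is missing: remember that and degrade gracefully.
    multiprocessing = None


def handle_connection(handler, args, mp=multiprocessing):
    """Run handler(*args) in a worker process if possible.

    Hypothetical helper: 'mp' is the multiprocessing module or None.
    """
    if mp is None:
        # Single-process fallback: serve the connection inline, which means
        # only one connection can be handled at a time.
        handler(*args)
        return None
    worker = mp.Process(target=handler, args=args)
    worker.start()
    return worker
```

Passing mp=None forces the inline path, which is handy for exercising the fallback on a system where multiprocessing is available.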

Goals of pymta

The main goal of pymta is to provide a basic SMTP server for unit tests. It must be easy to inject custom behavior (policy checks) for every SMTP command. Furthermore the library should come with an extensive set of tests to ensure that it does the right thing(tm).

Eventually I plan to build a highly customizable SMTP server which can be easily hacked (just for the fun of it).

Development Status

Currently (02/2009, version 0.3) the library only implements basic SMTP with very few extensions (e.g. PLAIN authentication). However, as far as I know, it is the only MTA written in Python that implements a process-based strategy for connection handling. This is an advantage for two reasons: many libraries, including most Python DB API implementations, cannot be used in an asynchronous environment, and multiple processes let you use your CPUs to their fullest extent. Last but not least, pymta comes with many unit tests and good, comprehensive documentation.

‘Advanced’ features which are necessary for any decent MTA, like TLS and pipelining, are not yet implemented. Currently pymta is used only in the unit tests for TurboMail. Therefore it should be considered beta software.

Architectural Overview

pymta uses multiple processes to handle more than one connection at the same time. In order to do this in a platform-independent manner, it utilizes the multiprocessing module.

The basic SMTP program flow is determined by two state machines: One for the SMTP command parsing mode (single-line commands or data) in the SMTPCommandParser and another much bigger state machine in the SMTPSession to control the correct order of commands sent by the SMTP client.
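To illustrate what a state machine for command ordering looks like, here is a heavily simplified sketch. It is not the state machine used by SMTPSession: the state names, the transition table and the class CommandOrderChecker are assumptions made up for this example, and many commands (RSET, QUIT, AUTH, ...) are left out.

```python
# Hypothetical transition table: (current state, command) -> next state.
TRANSITIONS = {
    ('new', 'HELO'): 'greeted',
    ('new', 'EHLO'): 'greeted',
    ('greeted', 'MAIL FROM'): 'sender_known',
    ('sender_known', 'RCPT TO'): 'recipient_known',
    # Additional recipients keep the session in the same state.
    ('recipient_known', 'RCPT TO'): 'recipient_known',
    ('recipient_known', 'DATA'): 'receiving_message',
}


class CommandOrderChecker(object):
    """Tracks the session state and rejects out-of-order commands."""

    def __init__(self):
        self.state = 'new'

    def execute(self, command):
        """Return True and advance the state if the command is allowed now."""
        next_state = TRANSITIONS.get((self.state, command))
        if next_state is None:
            # A real server would answer with an SMTP error such as
            # '503 Bad sequence of commands' here.
            return False
        self.state = next_state
        return True
```

Keeping the allowed order in a table means a policy never has to check it: by the time a policy method is called, the command is already known to be valid in the current state.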

The main idea of pymta was to make it easy to add custom behavior which is considered configuration for ‘real’ SMTP servers like Exim. The ‘pymta.api’ module contains classes which define interfaces for customizations. These interfaces are part of the public API so I try to keep them stable in future releases. Use IMTAPolicy to add restrictions on certain SMTP commands (check recipient addresses, scan the message’s content for spam before accepting it) and IAuthenticator to authenticate SMTP clients (check username and password). With an IMessageDeliverer you can specify what to do with received messages.

Problems with asynchronous architectures

The two most important SMTP implementations in Python (smtpd and Twisted Mail) both use an asynchronous architecture so they can serve multiple connections at the same time without the need to start multiple processes or threads. Because of this they can avoid the increased overall complexity due to locking issues and can save some resources (creating a process may be costly).

However there are some drawbacks with the asynchronous approach:

  • SMTP servers are not necessarily I/O bound. Some operations like spam scanning or other message checks may eat quite a lot of CPU. With Python you need to use multiple processes if you really want to utilize multiple CPUs due to the Global Interpreter Lock.
  • All libraries must be able to deal with the asynchronous pattern, otherwise you risk blocking all connections at the same time. Many programmers are not familiar with this pattern so most libraries do not support it. This is especially true for most of Python’s DB API implementations, which is why Twisted implemented its own asynchronous DB layer. Unfortunately by using this layer you have to use plain SQL, because the most popular ORMs like SQLAlchemy do not support that layer.

Given these conditions IMHO it looks like a bad design choice to use an asynchronous architecture for an SMTP server library which should be easily hackable to handle even uncommon cases.

Components

pymta consists of several main components (classes) which may be important to know.

PythonMTA

The PythonMTA is the main server component which listens on a certain port for new connections. There should be only one instance of this object. When a new connection is received, the PythonMTA spawns a WorkerProcess (if you have the multiprocessing module installed) which triggers an SMTPCommandParser that handles all the SMTP communication. When a message has been submitted successfully, the new_message_accepted() method of your IMessageDeliverer will be called so it is in charge of actually doing something with the message.

You can instantiate a new server like this:

from pymta import PythonMTA, BlackholeDeliverer

if __name__ == '__main__':
    # SMTP server will listen on localhost/port 8025
    server = PythonMTA('localhost', 8025, BlackholeDeliverer())
    server.serve_forever()
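With such a server listening on localhost/port 8025, mail-sending code can be pointed at it with the standard library's smtplib. The sketch below is written for a current Python 3 interpreter (the client does not have to run on the same Python version as the server); the function names and the example addresses are made up for illustration.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender, recipient):
    """Create a small message suitable for a smoke test."""
    msg = EmailMessage()
    msg['From'] = sender
    msg['To'] = recipient
    msg['Subject'] = 'pymta smoke test'
    msg.set_content('Hello from the test suite.')
    return msg


def send_to_local_server(msg, host='localhost', port=8025):
    """Submit msg to the SMTP server assumed to listen on host:port."""
    with smtplib.SMTP(host, port) as client:
        client.send_message(msg)
```

Calling send_to_local_server(build_test_message('sender@example.com', 'rcpt@example.com')) only works while the server is running; otherwise smtplib raises a connection error.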

Interface

class pymta.PythonMTA(local_address, bind_port, deliverer_class, policy_class=None, authenticator_class=None)

Create a new MTA which listens for new connections afterwards. local_address is a string containing either the IP or the DNS host name of the interface on which PythonMTA should listen. deliverer_class, policy_class and authenticator_class are callables which can be used to add custom behavior. Please note that they must be picklable if you use forked worker processes (default). Every new connection gets its own instance of policy_class and authenticator_class so these classes don’t have to be thread-safe. If you omit the policy, all syntactically valid SMTP commands are accepted. If there is no authenticator specified, authentication will not be available.

shutdown_server(timeout_seconds=None)
This method notifies the server that it should stop listening for new messages and shut itself down. If timeout_seconds was given, the method will block for this many seconds at most.

Policies

class pymta.api.IMTAPolicy

Policies can change the behavior of an MTA dynamically (e.g. don’t allow relaying unless the client is located within the trusted company network, enable authentication only for some connections). In established MTAs like Exim and Postfix it’s a very important task for every system administrator to configure the message acceptance policies which are normally part of the configuration file.

A policy does not change the SMTP implementation itself (the state machine) but can send out custom replies to the client. A policy doesn’t have to care if the commands were given in the correct order (the state machines will take care of that). The only caveat is that the message object passed into many policy methods does not contain all data at certain stages (e.g. accept_from cannot access the recipients list because that was not submitted yet).

‘IMTAPolicy’ provides a very permissive policy (all commands are accepted) from which you can derive custom policies. Its methods are usually named ‘accept_<SMTP command name>’.

Every method in the ‘IMTAPolicy’ interface can return either a single boolean value (True/False) or a tuple. A boolean value specifies if the command should be accepted. The caller is responsible for sending the actual default replies.

Alternatively a policy can choose to return a tuple to have more control over the reply sent to the client: (decision, (reply code, response)). The decision is the boolean known from the last paragraph. The reply code is an integer which should be a valid SMTP code. response is either a basestring with a custom message or an iterable of basestrings (in case you need to return a multi-line reply).
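To make the two return styles concrete, here is a sketch of how a caller could interpret a policy result and render a (possibly multi-line) SMTP reply. This is not pymta's internal code: both function names are hypothetical, and str stands in for basestring because the sketch targets Python 3.

```python
def split_policy_result(result):
    """Return (decision, reply) for a policy return value.

    reply is None for plain booleans (the caller sends its default reply),
    otherwise a (code, [lines]) pair.
    """
    if isinstance(result, tuple):
        decision, (code, response) = result
        if isinstance(response, str):
            lines = [response]
        else:
            lines = list(response)
        return bool(decision), (code, lines)
    return bool(result), None


def format_reply(code, lines):
    """Render reply lines in SMTP wire format.

    All lines but the last use 'code-text', the last one uses 'code text'.
    """
    rendered = []
    for index, line in enumerate(lines):
        separator = ' ' if index == len(lines) - 1 else '-'
        rendered.append('%d%s%s' % (code, separator, line))
    return '\r\n'.join(rendered) + '\r\n'
```

For example, the result (False, (553, ('Bad helo string', 'Evil IP'))) is split into the decision False and the reply (553, ['Bad helo string', 'Evil IP']), which format_reply turns into two lines: '553-Bad helo string' followed by '553 Evil IP'.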

accept_auth_plain(username, password, message)

Decides if AUTH PLAIN should be allowed for this client. Please note that username and password have not been verified at this point; the authenticator will check them after the policy has allowed this command.

The method must not return a response by itself in case it accepts the AUTH PLAIN command!

accept_data(message)
Decides if we allow the client to start a message transfer (the actual message contents will be transferred after this method allowed it).
accept_ehlo(ehlo_string, message)
Decides if the EHLO command with the given ehlo_string should be accepted.
accept_from(sender, message)
Decides if the sender of this message (MAIL FROM) should be accepted.
accept_helo(helo_string, message)
Decides if the HELO command with the given helo_string should be accepted.
accept_msgdata(msgdata, message)
This method actually matches no real SMTP command. It is called after a message was transferred completely and this is the last check before the SMTP server takes the responsibility of transferring it to the recipients.
accept_new_connection(peer)
This method is called directly after a new connection is received. The policy can decide if the given peer is allowed to connect to the SMTP server. If it declines, the connection will be closed immediately.
accept_rcpt_to(new_recipient, message)
Decides if the recipient of this message (RCPT TO) should be accepted. If a message should be delivered to multiple recipients, this method is called for every recipient.
ehlo_lines(peer)
Return an iterable for SMTP extensions to advertise after EHLO. By default support for SMTP SIZE extension will be announced if you set a max message size.
max_message_size(peer)
Return the maximum size (in bytes) for messages from this peer. When this method returns an integer, pymta will check the actual message size after the message was received (before the accept_msgdata method is called) and will respond with the appropriate error message if necessary. If you return None, no size limit will be enforced by pymta (however you can always reject a message using accept_msgdata()).

Here is a short example of how you can implement custom behavior that checks the HELO command given by the client:

def accept_helo(self, helo_string, message):
    # The three return statements below are alternatives. A real policy
    # would use exactly one of them.

    # Alternative 1: pymta will send the default error message for the
    # given command if you just return False.
    return False

    # Alternative 2: this will send out a '553 Bad helo string' and the
    # command is rejected. pymta won't send any additional reply because
    # you did that already.
    return (False, (553, 'Bad helo string'))

    # Alternative 3: this is basically the same as above but now it will
    # trigger a multi-line SMTP response:
    # 553-Bad helo string
    # 553 Evil IP
    return (False, (553, ('Bad helo string', 'Evil IP')))
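A complete policy class could look like the following sketch, which limits the message size and only accepts recipients in one domain. Several things here are assumptions: the class name and the domain are made up, new_recipient is assumed to behave like a plain address string, and the stand-in base class exists only so that the sketch also runs where pymta is not installed.

```python
try:
    from pymta.api import IMTAPolicy
except ImportError:
    class IMTAPolicy(object):
        """Stand-in so the sketch runs without pymta installed."""


class ExampleDomainPolicy(IMTAPolicy):
    # Hypothetical settings for this example.
    accepted_domain = 'example.com'
    size_limit = 1024 * 1024  # 1 MiB

    def max_message_size(self, peer):
        # Same limit for every peer; return None to disable the check.
        return self.size_limit

    def accept_rcpt_to(self, new_recipient, message):
        # Assumption: new_recipient can be treated as an address string.
        address = str(new_recipient).lower()
        if address.endswith('@' + self.accepted_domain):
            return True
        # Tuple form: reject with a custom reply instead of the default.
        return (False, (550, 'Relaying denied'))
```

All methods that are not overridden keep the permissive behavior of IMTAPolicy, so the policy only needs to spell out its restrictions.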

Authenticators

class pymta.api.IAuthenticator

Authenticators check if the user’s credentials are actually correct. This may involve some checking against external subsystems (e.g. a database or an LDAP directory).

authenticate(username, password, peer)
This method is called after the client issued an AUTH PLAIN command and must return a boolean value (True/False).
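As an illustration, a very small authenticator for tests might check against a fixed table of credentials. This is a sketch under assumptions: the class name and the credentials are invented, username and password are assumed to arrive as text strings, and the stand-in base class is only there so the example runs without pymta.

```python
import hmac

try:
    from pymta.api import IAuthenticator
except ImportError:
    class IAuthenticator(object):
        """Stand-in so the sketch runs without pymta installed."""


class StaticAuthenticator(IAuthenticator):
    # Hypothetical test credentials; a real setup would query a database
    # or directory instead of keeping passwords in the source.
    credentials = {'alice': 'secret'}

    def authenticate(self, username, password, peer):
        expected = self.credentials.get(username)
        if expected is None:
            return False
        # compare_digest avoids leaking information through timing.
        return hmac.compare_digest(expected.encode('utf-8'),
                                   password.encode('utf-8'))
```

The credentials live in a class attribute so that the class can be instantiated without arguments, which keeps it usable as a plain callable.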

Deliverers

class pymta.api.IMessageDeliverer

Deliverers take care of the message routing/delivery after a message was accepted (e.g. put it in a mailbox file, forward it to another server, ...).

new_message_accepted(msg)

This method is called when a new message was accepted by the server. The MTA is then in charge of delivering the message to the specified recipients. Please note that you cannot reject the message anymore at this stage (if there are problems you must generate a non-delivery report aka bounce).

There will be one deliverer instance per client connection so this method does not have to be thread-safe. However this method may get called multiple times when the client transmits more than one message over the same connection.
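A deliverer can be as simple as writing every accepted message to its own file. The sketch below assumes that the message object exposes the RFC822 contents as text in its msg_data attribute; the class name, the spool_dir setting and the stand-in base class are made up for this example.

```python
import os
import tempfile

try:
    from pymta.api import IMessageDeliverer
except ImportError:
    class IMessageDeliverer(object):
        """Stand-in so the sketch runs without pymta installed."""


class SpoolDirectoryDeliverer(IMessageDeliverer):
    # Hypothetical setting: directory that receives one file per message.
    spool_dir = tempfile.gettempdir()

    def new_message_accepted(self, msg):
        # Assumption: msg.msg_data holds the RFC822 message as text.
        # mkstemp creates a uniquely named file, so several worker
        # processes can write into the same directory safely.
        fd, path = tempfile.mkstemp(suffix='.eml', dir=self.spool_dir)
        with os.fdopen(fd, 'w', newline='') as spool_file:
            spool_file.write(msg.msg_data)
        return path
```

Because the message cannot be rejected anymore at this point, a production deliverer would also need to handle write errors itself, for example by generating a bounce.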

Message

The Message is a data object that contains all information about a message sent by a client. This includes not only the actual RFC822 message contents but also information about the SMTP envelope, the peer and the helo string used. The information is filled in as the client sends commands, so not all information may be available at any given time (e.g. msg_data is not available before the client has actually sent the RFC822 message).

Peer

The Peer is another data object which contains the remote host IP address and the remote port.

SMTPSession

This class actually implements the most complicated part of the SMTP state machine and is responsible for calling the policy. If you want to extend the functionality or need to implement some custom behavior which is beyond what you can do using Policies, check this class.

The SMTP state machine is quite strict currently but I consider this a feature and not something I’ll try to improve in the near future.

Unit Test Utility Classes

pymta was created to ease testing SMTP communication without the need to set up an external SMTP server. While writing tests for other applications I created some utility classes which are probably helpful in your tests as well...

class pymta.test_util.BlackholeDeliverer
BlackholeDeliverer just stores all received messages in memory in the class attribute ‘received_messages’ (which implements a Queue-like interface) so that you can examine the received messages later.
class pymta.test_util.DebuggingMTA(*args, **kwargs)
DebuggingMTA is a very simple implementation of PythonMTA which just collects all incoming messages so that you can examine them afterwards.
class pymta.test_util.MTAThread(server)

This class runs a PythonMTA in a separate thread which is helpful for unit testing.

Attention: Do not use this class together with multiprocessing! http://www.viraj.org/b2evolution/blogs/index.php/2007/02/10/threads_and_fork_a_bad_idea

run()
Create a new thread which runs the server until stop() is called.
stop(timeout_seconds=5.0)
Stop the mail sink and shut down this thread. timeout_seconds specifies how long the caller should wait for the mailsink server to close down (default: 5 seconds). If the server did not stop in time, a warning message is printed.
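Since received_messages is described as Queue-like, a test can drain it into a list before making assertions. The helper below is a sketch, not part of pymta: it assumes that "Queue-like" includes a get_nowait() method which raises queue.Empty when nothing is left, as the standard library queues do.

```python
import queue


def drain(queue_like):
    """Return all items currently stored in a Queue-like object."""
    items = []
    while True:
        try:
            items.append(queue_like.get_nowait())
        except queue.Empty:
            # Nothing left to read: hand back what was collected.
            return items
```

A test could then check something like len(drain(deliverer.received_messages)) == 1 after sending a single message, where deliverer is whatever object holds the received messages in your setup.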

Example SMTP server application

In the examples directory you will find a pymta-based implementation of a debugging server that behaves like Python’s DebuggingServer: All received messages will be printed to STDOUT. Hopefully it can also serve as a short reference for how to write very simple pymta-based servers.

License Overview

pymta itself is licensed under the very liberal MIT license (see COPYING.txt in the source archive) so there are virtually no restrictions where you can integrate the code.

However, pymta depends on some other packages which come with different licenses. In order to ease license auditing, I’ll list the other licenses here (no guarantees though, check yourself before you trust):

I believe that all licenses are GPL compatible and do not require you to publish your code if you don’t like to.