Receiving email

This is a follow-up (part 2) to Minimum viable email service.

I concluded the previous webpage with a list of known limitations/issues with my SMTP server (RFC 5321) implementation. Since then, I have found more issues and re-assessed my perspective on one of the previously listed ones. The issues with the previous code (including the additional ones I discovered) are each addressed under "Implementation fixes" below.

Note that I had previously listed the following as a known limitation: VRFY always returns "252 Cannot VRFY user, but will accept message and attempt delivery". I no longer consider this to be an issue. In fact, this approach is sometimes recommended as a means to remain protocol-compliant while preventing user enumeration.

Now, let's improve the SMTP implementation to alleviate these issues before moving on to implement Internet Message Format (RFC 5322).

Implementation fixes

Issue: Single client blocking

Currently, the server can only have a single connected client at a time. The most natural fix to this limitation is multi-threading with pthread.h.

By moving the main conversation loop into a new void *handle_client(void *arg) function, then updating the outer loop of the main() function, we can create a new thread for each accepted connection:

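// Accept clients
while (1) {
  client_fd = accept(server_fd, (struct sockaddr *)&address, (socklen_t *)&addrlen);
  if (client_fd < 0) { perror("accept failed"); continue; }

  // Give each thread its own heap copy of the descriptor;
  // handle_client is expected to free it
  int *fd_ptr = malloc(sizeof(int));
  if (!fd_ptr) { perror("malloc failed"); close(client_fd); continue; }
  *fd_ptr = client_fd;

  // pthread_create returns the error number instead of setting errno
  pthread_t thread;
  int err = pthread_create(&thread, NULL, handle_client, fd_ptr);
  if (err != 0) {
    fprintf(stderr, "pthread_create failed: %s\n", strerror(err));
    free(fd_ptr);
    close(client_fd);
    continue;
  }
  pthread_detach(thread); // no join needed; resources are released on thread exit
}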
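For the other side of this hand-off, here is a minimal sketch of what the thread entry point might look like. The body is an assumption on my part: only the signature comes from the text above, the GREETING macro is a placeholder, and the real conversation loop is omitted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

// Placeholder greeting (assumption); the real server would build this from NAME.
#define GREETING "220 foo.com Simple Mail Transfer Service Ready\r\n"

// Thread entry point with the signature pthread_create expects.
void *handle_client(void *arg) {
  assert(arg != NULL);

  // Copy the descriptor out and free the heap slot right away,
  // so that no later return path can leak it.
  int client_fd = *(int *)arg;
  free(arg);

  // The conversation loop moved out of main() would go here;
  // only the greeting is shown in this sketch.
  ssize_t sent = send(client_fd, GREETING, strlen(GREETING), 0);
  (void)sent; // a failed greeting just means the client is already gone

  close(client_fd); // each thread owns and closes its own socket
  return NULL;
}
```

Freeing the pointer at the very top keeps ownership simple: main() allocates, the thread frees, and nothing else touches the allocation.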

Issue: read() assumes one line per syscall

Currently, we read messages from the client using int bytes_read = read(client_fd, buffer, BUFFER_SIZE - 1); and then immediately proceed, which implicitly assumes that each call returns exactly one complete line. TCP is a byte stream, so a single read() may return a partial line or several lines at once. Instead, we should replace this single call with a buffer that tracks and accumulates bytes until a CRLF arrives.

Before implementing this function, I will update the Session struct to include a received buffer and received length variable (along with updates to the reset function):

struct Session {
  bool greeted;
  bool set_mail_from;
  bool in_data_mode;
  int recipient_count;
  int recv_len;                               // new
  char mail_from[ADDRESS_SIZE];
  char rcpt_to[MAX_ADDRESSES][ADDRESS_SIZE];
  char data[DATA_SIZE];
  char recv_buf[RECV_BUFFER_SIZE];            // new (along with #define RECV_BUFFER_SIZE)
};

Then, we can create the read_line function (to replace our use of read(...)). In addition, we can strip the CRLF here rather than ad-hoc later.

int read_line(int client_fd, struct Session *state, char *out, int out_size) {
  while (1) {
    for (int i = 0; i < state->recv_len - 1; i++) {
      if (state->recv_buf[i] == '\r' && state->recv_buf[i+1] == '\n') {
        int line_len = i;
        if (line_len >= out_size) line_len = out_size - 1;
        memcpy(out, state->recv_buf, line_len);
        out[line_len] = '\0';

        int consumed = i + 2; // +2 for \r\n
        state->recv_len -= consumed;
        memmove(state->recv_buf, state->recv_buf + consumed, state->recv_len);

        return line_len;
      }
    }

    int space = RECV_BUFFER_SIZE - state->recv_len;
    if (space <= 0) { // Buffer full
      return -1;
    }

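    int n = read(client_fd, state->recv_buf + state->recv_len, space);
    // Report disconnect (0) and error (<0) as -1 so that a return of 0
    // always means "empty line" (common in DATA mode), never "disconnected"
    if (n <= 0) return -1;
    state->recv_len += n;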
  }
}

Now, in the conversation loop, we can simply use:

int stripped_len = read_line(client_fd, &session_state, buffer, BUFFER_SIZE);

As a result of these changes, telnet localhost 2525 works but netcat (with nc -C localhost 2525) no longer returns the expected command terminator.

Issue: Highly limiting data array sizes

Firstly, I will increase ADDRESS_SIZE from 128 to 256 (per 4.5.3.1.3).

The more problematic limit is DATA_SIZE restricted to 2048. A modern email, especially with attachments and HTML (which we may support in the future), can easily exceed 2048 bytes. In fact, just plain text could overflow this limit. However, I am keeping this limit for now, since we should stream the data directly to storage once we actually support deliverability.

Issue: Extensions in EHLO

We can kill two birds with one stone here. I will generalize the EHLO response to include dashes (the expectation for multi-line responses) and advertise an extension: SIZE.

SIZE is the "SMTP Service Extension for Message Size Declaration" (that is, RFC 1870). This simple extension allows the server to advertise the maximum message size it will accept from the client.

In the EHLO handling block, we can simply use:

char line[BUFFER_SIZE];

// greeting
snprintf(line, sizeof(line), "250-%s greets %s\r\n", NAME, arg);
send_msg(client_fd, line);

// Extensions (dash)
snprintf(line, sizeof(line), "250-SIZE %d\r\n", DATA_SIZE);
send_msg(client_fd, line);

// Final extension (no dash)
send_msg(client_fd, "250 VRFY\r\n");
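Advertising SIZE also means a client may declare a size up front with a SIZE=<n> parameter on MAIL FROM. The following is a rough sketch of how that declaration could be checked; parse_size_param and mail_size_reply are hypothetical helpers (not part of the existing code), and the scan is deliberately simplified.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <strings.h>

#define DATA_SIZE 2048 // assumed to match the limit advertised in EHLO

// Looks for a "SIZE=<digits>" parameter in the text that follows "MAIL ".
// Returns true and stores the declared size if one is found.
bool parse_size_param(const char *params, unsigned long *size_out) {
  assert(params != NULL && size_out != NULL);
  for (const char *p = params; *p; p++) {
    // Only match at the start of a whitespace-separated token
    bool token_start = (p == params) || isspace((unsigned char)p[-1]);
    if (token_start && strncasecmp(p, "SIZE=", 5) == 0 &&
        isdigit((unsigned char)p[5])) {
      *size_out = strtoul(p + 5, NULL, 10);
      return true;
    }
  }
  return false;
}

// Chooses the reply for a MAIL command based on the declared size (if any).
const char *mail_size_reply(const char *params) {
  unsigned long declared = 0;
  if (parse_size_param(params, &declared) && declared > DATA_SIZE) {
    return "552 Message size exceeds fixed maximum message size\r\n";
  }
  return "250 OK\r\n";
}
```

Rejecting at MAIL time spares the client from transmitting a message that would be refused anyway. A client can still send more than it declared, so the limit also has to be enforced while reading DATA.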

Issue: MAIL FROM: should reset recipients and mail data

Per 4.1.1.2, the MAIL command is supposed to do the following:

This command clears the reverse-path buffer, the forward-path buffer,
   and the mail data buffer, and it inserts the reverse-path information
   from its argument clause into the reverse-path buffer.

Since we already have a handy reset_transaction(...) function, we can simply call that after receiving a MAIL command to fulfill this requirement.
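The order matters here: the buffers are cleared first and the new reverse-path is stored afterwards. Below is a small sketch of that sequence. The struct is trimmed, MAX_ADDRESSES is an assumed value, and both function bodies are my guess at the shape (the real reset_transaction signature is not shown in this post).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define ADDRESS_SIZE 256
#define MAX_ADDRESSES 8 // assumed value
#define DATA_SIZE 2048

struct Session {
  bool greeted;
  bool set_mail_from;
  bool in_data_mode;
  int recipient_count;
  char mail_from[ADDRESS_SIZE];
  char rcpt_to[MAX_ADDRESSES][ADDRESS_SIZE];
  char data[DATA_SIZE];
};

// Clears the envelope and mail data but keeps session-level state (greeted).
void reset_transaction(struct Session *s) {
  assert(s != NULL);
  s->set_mail_from = false;
  s->in_data_mode = false;
  s->recipient_count = 0;
  s->mail_from[0] = '\0';
  memset(s->rcpt_to, 0, sizeof(s->rcpt_to));
  s->data[0] = '\0';
}

// MAIL: clear all three buffers first, then record the new reverse-path.
void handle_mail(struct Session *s, const char *reverse_path) {
  reset_transaction(s);
  snprintf(s->mail_from, sizeof(s->mail_from), "%s", reverse_path);
  s->set_mail_from = true;
}
```

With this ordering, a second MAIL command in the same session starts from a clean envelope instead of inheriting recipients from an abandoned transaction.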

Issue: No explicit Postmaster handling

Under 4.5.1. Minimum Implementation, RFC 5321 is very specific about the requirement to support a reserved, case-insensitive Postmaster mailbox:

Any system that includes an SMTP server supporting mail relaying or
delivery MUST support the reserved mailbox "postmaster" as a case-
insensitive local name.  This postmaster address is not strictly
necessary if the server always returns 554 on connection opening (as
described in Section 3.1).  The requirement to accept mail for
postmaster implies that RCPT commands that specify a mailbox for
postmaster at any of the domains for which the SMTP server provides
mail service, as well as the special case of "RCPT TO:<Postmaster>"
(with no domain specification), MUST be supported.

SMTP systems are expected to make every reasonable effort to accept
mail directed to Postmaster from any other system on the Internet.
In extreme cases -- such as to contain a denial of service attack or
other breach of security -- an SMTP server may block mail directed to
Postmaster.  However, such arrangements SHOULD be narrowly tailored
so as to avoid blocking messages that are not part of such attacks.

Currently, the "postmaster" local name is treated the same as any other.

Given an address, we first need to be able to determine if it is a (qualified or unqualified) Postmaster address:

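bool is_postmaster(const char *addr) {
  // Unqualified form: RCPT TO:<Postmaster>
  if (strcasecmp(addr, "Postmaster") == 0) return true;

  // Qualified form: the local part before '@' must be exactly "Postmaster" (any case)
  const char *at = strchr(addr, '@');
  if (!at) return false;
  size_t local_len = (size_t)(at - addr);
  return local_len == strlen("Postmaster") && strncasecmp(addr, "Postmaster", local_len) == 0;
}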

Then, we can update RCPT handling to always accept this address.
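As a rough illustration of that RCPT change, the sketch below checks for Postmaster before consulting the local user lookup. rcpt_reply is a hypothetical helper, and the is_local flag stands in for whatever user lookup the server performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

// Same check as above: bare "Postmaster" or "Postmaster@<any domain>", any case.
static bool is_postmaster(const char *addr) {
  if (strcasecmp(addr, "Postmaster") == 0) return true;
  const char *at = strchr(addr, '@');
  if (!at) return false;
  size_t local_len = (size_t)(at - addr);
  return local_len == strlen("Postmaster") &&
         strncasecmp(addr, "Postmaster", local_len) == 0;
}

// Picks the RCPT reply. Postmaster is accepted before any user lookup,
// so it keeps working even if the lookup fails or the user list is empty.
const char *rcpt_reply(const char *addr, bool is_local) {
  assert(addr != NULL);
  if (is_postmaster(addr)) return "250 OK\r\n";
  return is_local ? "250 OK\r\n" : "550 No such user\r\n";
}
```

Note that this sketch accepts Postmaster at any domain, whereas the RFC only requires it for the domains the server actually serves.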

Issue: No enforcement for RFC 5321 maximum line length

Per 4.5.3.1.4. Command Line:

The maximum total length of a command line including the command word
and the <CRLF> is 512 octets.  SMTP extensions may be used to
increase this limit.

Currently, the implementation actually allocates more bytes than this, and the new read_line function ensures we do not overflow the buffer. Although accepting longer lines is technically outside of the spec, I will keep this as-is.

Issue: Mail is never actually sent or written

Currently, without support for Internet Message Format (RFC 5322) and Maildir, we cannot actually write mail to storage; however, the server is still returning "250 OK" despite erasing the only copy from its memory. To prevent the server from lying, I am updating the response to "450 Requested mail action not taken: mailbox unavailable".

The actual issue will be fixed once we implement Maildir and local user accounts.


Local users

I am under no illusion that the SMTP (RFC 5321) implementation is now perfect. It definitely will exhibit unexpected behavior in some edge cases and assumes well-behaved clients. Security is lacking, timeout is not implemented, etc. However, I think enough of the fundamental design and structure is in place to allow me to continue.

At this point, the SMTP server can (theoretically) handle connections from an SMTP client. Now, we need to determine what to do with the data we receive. Here is the main idea/flowchart:

         RCPT TO:<user@domain>
                     |
         Do I own this domain?
          /                  \
        Yes (receiving)       No (sending)
         |                     |
  Does user exist?       Is sender authenticated?
   /            \         /                    \
  No            Yes     Yes                     No
  |                \   /                         |
550 No such user  250 OK                  550 Relay denied

This will allow us to accept mail for our domain and to send mail to other domains only on behalf of authenticated senders, so the server never acts as an open relay. The handling of local users is outside of the scope of RFC 5321, so it is completely up to the implementation.
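The flowchart translates fairly directly into a small decision function. Here is one possible sketch; the enum, the function names, and the three boolean inputs are all hypothetical, and sender authentication does not exist in the server yet.

```c
#include <assert.h>
#include <stdbool.h>

// Possible outcomes of the flowchart (names are assumptions for this sketch).
enum rcpt_action {
  RCPT_DELIVER_LOCAL,  // our domain, user exists
  RCPT_RELAY,          // foreign domain, sender authenticated
  RCPT_REJECT_NO_USER, // our domain, unknown user
  RCPT_REJECT_RELAY    // foreign domain, sender not authenticated
};

// Walks the flowchart: first "do I own this domain?", then the branch question.
enum rcpt_action classify_rcpt(bool own_domain, bool user_exists,
                               bool sender_authenticated) {
  if (own_domain) {
    return user_exists ? RCPT_DELIVER_LOCAL : RCPT_REJECT_NO_USER;
  }
  return sender_authenticated ? RCPT_RELAY : RCPT_REJECT_RELAY;
}

// Maps each outcome to the reply shown at the bottom of the flowchart.
const char *rcpt_reply_text(enum rcpt_action action) {
  switch (action) {
    case RCPT_DELIVER_LOCAL:
    case RCPT_RELAY:
      return "250 OK";
    case RCPT_REJECT_NO_USER:
      return "550 No such user";
    case RCPT_REJECT_RELAY:
      return "550 Relay denied";
  }
  assert(!"unknown rcpt_action");
  return "451 Local error in processing";
}
```

Keeping local delivery and relaying as separate outcomes, even though both answer 250 OK, lets the end of DATA mode decide where the message should go.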

In order to check for local users (while also allowing for sub-domains in the future), I will use the simplest possible approach: a single txt file containing a list of users in the user@domain format. For user in RCPT TO:<user@domain>, if user matches a line in this txt file, then we should accept the email and store it for a local user. This also keeps the project entirely in the C standard library and avoids a database dependency.

Here is my implementation of the is_local_user(...) stub:

bool is_local_user(const char *addr) {
  // If no @ return false
  const char *at = strchr(addr, '@');
  if (!at) return false;

  // If domain does not match NAME return false (preventing open relay)
  // Domain names are case-insensitive, so compare without regard to case
  const char *domain = at + 1;
  if (strcasecmp(domain, NAME) != 0) return false;

  // Extract username in username@domain
  size_t username_len = at - addr;
  char username[ADDRESS_SIZE];
  if (username_len >= ADDRESS_SIZE || username_len == 0) return false;
  strncpy(username, addr, username_len);
  username[username_len] = '\0';

  // Check username against local users file
  FILE *fp = fopen(LOCAL_USERS_FILE, "r");
  if (!fp) {
    fprintf(stderr, "Server error: Cannot open local users file at %s\n", LOCAL_USERS_FILE);
    return false;
  }
  char line[ADDRESS_SIZE];
  bool found = false;
  while (fgets(line, sizeof(line), fp)) {
    line[strcspn(line,"\r\n")] = '\0';
    if (strcmp(line, username) == 0) {
      found = true;
      break;
    }
  }
  fclose(fp);

  return found;
}

Now, at the termination of DATA mode, if the RCPT address is a local user, we want to write the email to our storage. There are various formats and recommendations (not full RFCs) for this, but I plan to use (more-or-less) the Maildir format. Each user has their own mailbox directory with sub-directories that categorize the status of mail. Because a message is first written to tmp/ and then renamed into new/, delivery needs no locks and readers never see a partially written file.

mail/
├── alice/
│   └── Maildir/
│       ├── tmp/
│       ├── new/
│       └── cur/
└── bob/
    └── Maildir/
        ├── tmp/
        ├── new/
        └── cur/

Here is the overview of my Maildir implementation. Note that I do not yet support info version/tags for new/ files:

static bool deliver_to_user(const char *username, const char *data, size_t data_len) {
  char tmp_path[MAX_PATH];
  char new_path[MAX_PATH];
  char filename[256];

  // Generate and check Maildir-style filename
  if (!maildir_filename(filename, sizeof(filename))) return false;

  int tmp_written = snprintf(tmp_path, sizeof(tmp_path), "%s/%s/Maildir/tmp/%s", MAIL_DIR, username, filename);
  int new_written = snprintf(new_path, sizeof(new_path), "%s/%s/Maildir/new/%s", MAIL_DIR, username, filename);

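  // snprintf returns a negative value on error, or the length it wanted to
  // write; reject both errors and truncated paths
  if (tmp_written < 0 || new_written < 0 ||
      tmp_written >= MAX_PATH || new_written >= MAX_PATH) {
    return false;
  }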

  // Write to tmp/; that is, (MAIL_DIR)/(username)/Maildir/tmp/(filename)
  FILE *fd = fopen(tmp_path, "wbx");
  if (!fd) { 
    fprintf(stderr, "fopen failed: %s (path: %s)\n", strerror(errno), tmp_path);
    return false;
  }

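  // Write the message, flush stdio buffers, then fsync so the data
  // reaches disk before the rename makes it visible
  bool ok = (fwrite(data, 1, data_len, fd) == data_len);
  if (ok) ok = (fflush(fd) == 0);
  if (ok) ok = (fsync(fileno(fd)) == 0);
  if (fclose(fd) != 0) ok = false;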
  if (!ok) {
    unlink(tmp_path);
    return false;
  }

  // Atomic rename into new/
  // TODO: include info version and flag support
  if (rename(tmp_path, new_path) != 0) {
    unlink(tmp_path);
    return false;
  }

  return true;
}
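The maildir_filename(...) helper called above is not shown. A minimal sketch of one way to generate such names follows, assuming a time.unique.hostname layout with an atomic counter so that concurrent threads never collide. This is an illustration of the convention, not my exact implementation, and replacing '/' and ':' with '_' is a simplification.

```c
#define _POSIX_C_SOURCE 200809L // for gethostname under strict -std=c11
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

// Builds a name of the form <seconds>.M<usec>P<pid>Q<seq>.<hostname>.
// Returns false if the hostname or clock is unavailable or out is too small.
bool maildir_filename(char *out, size_t out_size) {
  static atomic_ulong counter; // zero-initialized; shared by all threads
  assert(out != NULL);

  char host[256];
  if (gethostname(host, sizeof(host)) != 0) return false;
  host[sizeof(host) - 1] = '\0'; // gethostname may not terminate on truncation

  // '/' would break the path and ':' is reserved for the Maildir info suffix
  for (char *p = host; (p = strpbrk(p, "/:")) != NULL; p++) *p = '_';

  struct timeval tv;
  if (gettimeofday(&tv, NULL) != 0) return false;

  // Time and pid alone can repeat within one process, so add a sequence number
  unsigned long seq = atomic_fetch_add(&counter, 1);

  int n = snprintf(out, out_size, "%lld.M%ldP%ldQ%lu.%s",
                   (long long)tv.tv_sec, (long)tv.tv_usec,
                   (long)getpid(), seq, host);
  return n > 0 && (size_t)n < out_size;
}
```

Opening the tmp/ file with "wbx" still acts as a last line of defence, since it fails if a file with the same name already exists.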

Now, a connection can handle local recipients and store the message in the appropriate recipient's mailbox. For example,

TCP: Listening on :2525
TCP: Client connected
S: 220 foo.com Simple Mail Transfer Service Ready
C: EHLO bar.com
S: 250-foo.com greets bar.com
S: 250-SIZE 2048
S: 250 VRFY
C: MAIL FROM:<alice@bar.com>
S: 250 OK
C: RCPT TO:<bob@foo.com>
S: 250 OK
C: DATA
S: 354 Start mail input; end with <CRLF>.<CRLF>
C: Hello Bob, it's Alice.
C: This is a test message for the Maildir system.
C: .
S: 250 OK
C: QUIT
S: 221 foo.com Service closing transmission channel
TCP: Connection closed

I ran into a rather silly de-synced state issue. My use of fopen(...) to write emails would fail if the directory to write to did not exist; however, at the SMTP level, the server was accepting usernames based on their inclusion in the txt file rather than on whether their mailbox was actually set up. This allowed for an opaque failure state. Eventually, I will want a better management interface/script to ensure that local user information remains in sync.
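One way to narrow that gap is to create the mailbox directories on demand before delivering. Here is a sketch of such a helper; ensure_maildir is hypothetical (it is not in the current code), MAX_PATH is an assumed value, and the directory permissions are my choice.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

#define MAX_PATH 1024 // assumed value

// Creates one directory; an already existing entry counts as success.
// Note: EEXIST does not prove the existing entry is a directory.
static bool make_dir(const char *path) {
  return mkdir(path, 0700) == 0 || errno == EEXIST;
}

// Ensures <mail_dir>/<username>/Maildir/{tmp,new,cur} all exist.
bool ensure_maildir(const char *mail_dir, const char *username) {
  static const char *const suffixes[] = {
    "", "/Maildir", "/Maildir/tmp", "/Maildir/new", "/Maildir/cur"
  };
  assert(mail_dir != NULL && username != NULL);

  if (!make_dir(mail_dir)) return false;

  char path[MAX_PATH];
  for (size_t i = 0; i < sizeof(suffixes) / sizeof(suffixes[0]); i++) {
    int n = snprintf(path, sizeof(path), "%s/%s%s", mail_dir, username, suffixes[i]);
    if (n < 0 || (size_t)n >= sizeof(path)) return false; // error or truncated
    if (!make_dir(path)) return false; // parents are created before children
  }
  return true;
}
```

This should only ever be called with a username that has already passed the local users check, since the name becomes part of a filesystem path.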

I also added support for tracking the data buffer length as a part of the session state, cleaned up some local user search patterns, and fixed some of the rough edges of the data mode.

Now, we should be able to receive and store emails for our local users. There are several key limitations to the current project:

  • No sending to non-local users
  • No auth/verification
  • No MIME support
  • No knowledge of Internet Message Format (RFC 5322)
  • No server hosting and DNS records

Last updated March 27, 2026