Minimum viable email service
Email is one of the killer features of the internet and has remained one of its core use cases for decades. It was originally designed as a federated messaging system, but the unprecedented scale of user adoption and the unforeseen requirements of modern communication have collapsed the network into a relatively small number of centralized email services, most notably Google's Gmail. The core protocols responsible for email are still fundamentally federated, even if in practice factors such as domain reputation greatly reduce the viability of small-scale email providers.
Throughout its history, email has evolved significantly in order to adopt new standards and security features. As a result, the "Simple Mail Transfer Protocol" has become significantly less simple since its inception in RFC 821 in 1982. Although I use email regularly, its actual protocol and workings have been largely mysterious to me. In order to learn more about the protocol, I am going to attempt to implement my own minimum viable email service.
My long-term goal is to read the relevant Requests for Comments (RFCs), synthesize the protocol into a readable format for the interested reader on this and future webpages, and implement a theoretically interoperable email service. The goal of this webpage is to lay out an overview of this project and implement the SMTP server for part (1) as outlined below.
Specifications
At minimum, a working email server requires four fundamental capabilities:
- (1) a way to receive mail from other servers
- (2) a way to deliver or store mail for local users
- (3) a way for users to read their mail
- (4) a way for users to send mail
For (1), I will implement a Simple Mail Transfer Protocol (SMTP) server to receive mail from other servers (per RFC 5321). This mail will follow the Internet Message Format (RFC 5322). Ideally, I will later support Multipurpose Internet Mail Extensions (MIME) per RFCs 2045-2049.
For (2), received mail will be stored locally using the Maildir format, which stores each message as its own file inside a mailbox directory containing tmp, new, and cur subdirectories.
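As a rough illustration of the Maildir idea, here is a minimal sketch of a delivery step: the message is first written under tmp/ and then moved into new/ with rename(), so a reader never sees a half-written file. The helper names (maildir_init, maildir_deliver), the path limit, and the simplified unique-name scheme (time, PID, counter, and a fixed "localhost" suffix) are assumptions made for this example, not part of the project code.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>

#define MAILDIR_PATH_MAX 512 /* hypothetical limit for this sketch */

/* Create root and its tmp/, new/, and cur/ subdirectories if missing. */
int maildir_init(const char *root) {
    const char *subdirs[] = {"", "/tmp", "/new", "/cur"};
    char path[MAILDIR_PATH_MAX];
    for (size_t i = 0; i < sizeof(subdirs) / sizeof(subdirs[0]); i++) {
        int n = snprintf(path, sizeof(path), "%s%s", root, subdirs[i]);
        if (n < 0 || (size_t)n >= sizeof(path)) return -1;
        if (mkdir(path, 0700) < 0 && errno != EEXIST) return -1;
    }
    return 0;
}

/* Write msg to tmp/ under a unique name, then rename() it into new/.
 * The chosen file name is copied into out_name. Returns 0 on success. */
int maildir_deliver(const char *root, const char *msg, char *out_name, size_t out_size) {
    static unsigned long counter = 0; /* distinguishes deliveries within one second */
    char tmp_path[MAILDIR_PATH_MAX], new_path[MAILDIR_PATH_MAX];
    int n = snprintf(out_name, out_size, "%ld.%ld_%lu.localhost",
                     (long)time(NULL), (long)getpid(), counter++);
    if (n < 0 || (size_t)n >= out_size) return -1;
    n = snprintf(tmp_path, sizeof(tmp_path), "%s/tmp/%s", root, out_name);
    if (n < 0 || (size_t)n >= sizeof(tmp_path)) return -1;
    n = snprintf(new_path, sizeof(new_path), "%s/new/%s", root, out_name);
    if (n < 0 || (size_t)n >= sizeof(new_path)) return -1;

    FILE *f = fopen(tmp_path, "w");
    if (!f) return -1;
    size_t len = strlen(msg);
    int ok = fwrite(msg, 1, len, f) == len;
    if (fclose(f) != 0) ok = 0;
    if (!ok || rename(tmp_path, new_path) < 0) {
        unlink(tmp_path); /* do not leave a partial file behind */
        return -1;
    }
    return 0;
}
```

A caller would run maildir_init once per mailbox and then call maildir_deliver for each accepted message. A mail client later moves files from new/ to cur/ once they have been seen.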
For (3), users will read their mail using the Internet Message Access Protocol (IMAP) specified in RFC 3501, allowing the server to act as the canonical source of truth rather than a temporary container for received emails.
Sending mail (4) is the most aspirational goal on this list. It requires SMTP Submission (RFC 6409) along with security measures to potentially avoid filtering by other SMTP servers: STARTTLS (RFC 3207) and ideally SPF/DKIM/DMARC support. In addition, DNS MX lookup and an outbound SMTP client (again, via RFC 5321) would be required to actually perform mail delivery over the internet.
Software design
The main challenge of this project is a combination of networking and interoperable protocol implementation.
Typically, a mail service is considered to be four separate internal services:
- Mail Transfer Agent (MTA): interacts with external SMTP servers and manages the queue
- Mail Submission Agent (MSA): handles authenticated user-submitted mail
- Mail Delivery Agent (MDA): writes mail into local mailboxes
- Mail Access: IMAP service for the client to access locally stored mail
I intend to consolidate the MTA, MSA, and MDA into a single service, with a separate IMAP service for reading mail.
TCP server
Rather than fully implementing each RFC/specification sequentially, I am instead going to build usable services to make it easier to test (and more rewarding to build) as I go. To this end, I will start with receiving mail.
Before we can receive commands, we need to set up a basic TCP server. In C, we can follow a consistent pattern. First, we need to create a socket:
/*
 * socket is defined as socket(int domain, int type, int protocol);
 * AF_INET indicates IPv4, SOCK_STREAM indicates TCP, and 0 selects the default protocol
 */
server_fd = socket(AF_INET, SOCK_STREAM, 0);
if (server_fd < 0) {
    perror("socket failed");
    exit(EXIT_FAILURE);
}
I recommend enabling port reuse, otherwise a recently terminated socket can lock the port for ~1 minute:
int opt = 1;
setsockopt(server_fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
Now, we can bind to our chosen PORT:
address.sin_family = AF_INET;
address.sin_addr.s_addr = INADDR_ANY;
address.sin_port = htons(PORT);
if (bind(server_fd, (struct sockaddr *)&address, sizeof(address)) < 0) {
    perror("bind failed");
    exit(EXIT_FAILURE);
}
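With the socket bound, a call to listen(server_fd, 5) (shown in the full listing at the end of this page) marks it as accepting connections, with a backlog of 5 pending clients. With the server listening at the given port, we can now accept connections from clients:
client_fd = accept(server_fd, (struct sockaddr *)&address, (socklen_t *)&addrlen);
if (client_fd < 0) {
    perror("accept failed");
}
printf("TCP: Client connected: %s\n", inet_ntoa(address.sin_addr));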
Finally, with a connection, we can send to and receive from the client:
// Read data from client
ssize_t bytes_read = read(client_fd, buffer, BUFFER_SIZE - 1);
if (bytes_read > 0) { // 0 means the client disconnected; a negative value is an error
    buffer[bytes_read] = '\0';
    printf("C: %s\n", buffer);
}
// Send response to client
const char *response = "Hello from server\n";
send(client_fd, response, strlen(response), 0);
printf("S: %s\n", response);
RFC 5321 prefixes server messages with "S: " and client messages with "C: " in its example scenarios. I like this format, so I have implemented it for my logging.
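One caveat about send(): on a stream socket it may write fewer bytes than requested, so a robust server loops until the whole reply has gone out. The following is a minimal sketch of such a loop. The helper name send_all is an assumption for this example and does not appear in the project code.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: call send() repeatedly until all len bytes are written.
 * Returns 0 on success and -1 on error. */
int send_all(int fd, const char *buf, size_t len) {
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(fd, buf + sent, len - sent, 0);
        if (n < 0) {
            if (errno == EINTR) continue; /* interrupted by a signal: retry */
            return -1;
        }
        sent += (size_t)n; /* advance past the bytes that were accepted */
    }
    return 0;
}
```

For the short replies used in SMTP a single send() will almost always suffice, so this mostly matters once larger payloads or slow clients are involved.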
When a connection needs to be terminated, we can close the server's side of it with:
close(client_fd);
With this, we can now run a basic TCP server, connect to it with netcat (nc), and start some communication with a simple conversation loop:
while (1) {
    memset(buffer, 0, BUFFER_SIZE);
    int bytes_read = read(client_fd, buffer, BUFFER_SIZE - 1);
    if (bytes_read <= 0) {
        printf("TCP: Client disconnected.\n");
        break;
    }
    buffer[bytes_read] = '\0';
    // Strip trailing \r\n
    char *end = buffer + strcspn(buffer, "\r\n");
    *end = '\0';
    printf("C: %s\n", buffer);
    if (strncmp(buffer, QUIT_CMD, strlen(QUIT_CMD)) == 0) { // QUIT
        char bye[BUFFER_SIZE];
        snprintf(bye, sizeof(bye), "221 %s Service closing transmission channel\r\n", NAME);
        send(client_fd, bye, strlen(bye), 0);
        printf("S: %s", bye);
        break;
    } else { // Catchall: RFC 5321 uses 500 for unrecognized commands (502 is "Command not implemented")
        const char *response = "500 Syntax error, command unrecognized\r\n";
        send(client_fd, response, strlen(response), 0);
        printf("S: %s", response);
    }
}
SMTP (RFC 5321)
Now that we have a TCP server, we can begin thinking about the actual SMTP protocol per RFC 5321 in order to receive mail from other servers.
Per RFC 5321 4.5.1. Minimum Implementation,
4.5.1. Minimum Implementation
In order to make SMTP workable, the following minimum implementation
MUST be provided by all receivers. The following commands MUST be
supported to conform to this specification:
EHLO
HELO
MAIL
RCPT
DATA
RSET
NOOP
QUIT
VRFY
Now, we review each of these commands and discuss how to implement it in our TCP server main loop. For the sake of pedagogy, I will cover these commands in order of their usage rather than their presentation in the above list.
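Before looking at the individual commands, it may help to see how a received line could be mapped to one of the required verbs. RFC 5321 treats command verbs as case-insensitive, so "ehlo" and "EHLO" are equivalent. The sketch below assumes the trailing CRLF has already been stripped. The enum, the table, and the function name parse_command are hypothetical names chosen for this example.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h> /* strncasecmp (POSIX) */

enum SmtpCommand {
    CMD_EHLO, CMD_HELO, CMD_MAIL, CMD_RCPT, CMD_DATA,
    CMD_RSET, CMD_NOOP, CMD_QUIT, CMD_VRFY, CMD_UNKNOWN
};

static const struct { const char *verb; enum SmtpCommand cmd; } COMMANDS[] = {
    {"EHLO", CMD_EHLO}, {"HELO", CMD_HELO}, {"MAIL", CMD_MAIL},
    {"RCPT", CMD_RCPT}, {"DATA", CMD_DATA}, {"RSET", CMD_RSET},
    {"NOOP", CMD_NOOP}, {"QUIT", CMD_QUIT}, {"VRFY", CMD_VRFY},
};

/* Map a line (without CRLF) to a command by its leading verb, ignoring case.
 * The verb must be followed by a space or the end of the line, so that a
 * line such as "DATABASE" is not mistaken for DATA. */
enum SmtpCommand parse_command(const char *line) {
    for (size_t i = 0; i < sizeof(COMMANDS) / sizeof(COMMANDS[0]); i++) {
        size_t n = strlen(COMMANDS[i].verb);
        if (strncasecmp(line, COMMANDS[i].verb, n) == 0 &&
            (line[n] == '\0' || line[n] == ' ')) {
            return COMMANDS[i].cmd;
        }
    }
    return CMD_UNKNOWN;
}
```

The result can drive a switch statement in the main loop, with CMD_UNKNOWN mapped to a 500 reply.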
Hello (HELO)
The HELO command is sent by the client to the server along with the client's Fully Qualified Domain Name (FQDN) in order to introduce itself to the server. It should be the first command sent following the server's greeting and must occur prior to the mail transport commands. The server is expected to respond with "250 OK" to acknowledge receipt.
The syntax is defined to be the following:
helo = "HELO" SP Domain CRLF
The notation above, used throughout RFC 5321, is Augmented Backus-Naur Form (ABNF, specified in RFC 5234). The RFC rarely explains it, so I will take a second to illustrate how it should be interpreted. On the left-hand side, helo is the name of the rule; the right-hand side gives the form of the plain-text command sent by the client. The double quotation marks around "HELO" indicate a string literal, SP means a single space, Domain refers to the FQDN, and CRLF stands for Carriage Return and Line Feed (that is, "\r\n"). So an example HELO command in this syntax is the following:
HELO mail.bar.com\r\n
The HELO command was originally introduced in RFC 821; however, modern SMTP (RFC 5321) prefers EHLO.
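To make the HELO grammar concrete, here is a minimal sketch of a parser that follows the rule piece by piece: the literal "HELO", one space, a domain, and a terminating CRLF. It is deliberately loose about what counts as a valid Domain (it only rejects empty values and embedded spaces), and the function name parse_helo is an assumption for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h> /* strncasecmp (POSIX) */

/* Parse a raw line of the form: "HELO" SP Domain CRLF
 * On success, copies the domain into out and returns true. */
bool parse_helo(const char *line, char *out, size_t out_size) {
    /* "HELO" SP (the verb is matched case-insensitively) */
    if (strncasecmp(line, "HELO", 4) != 0 || line[4] != ' ') return false;
    const char *start = line + 5;
    /* CRLF must terminate the line, with nothing after it */
    const char *crlf = strstr(start, "\r\n");
    if (!crlf || crlf[2] != '\0') return false;
    /* Domain: non-empty, fits in out, and contains no spaces (simplified check) */
    size_t len = (size_t)(crlf - start);
    if (len == 0 || len >= out_size) return false;
    if (memchr(start, ' ', len) != NULL) return false;
    memcpy(out, start, len);
    out[len] = '\0';
    return true;
}
```

A stricter implementation would also validate the domain against the Domain rule in RFC 5321, which this sketch leaves out.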
Extended Hello (EHLO)
The EHLO (Extended Hello) command is the modern replacement for HELO; it allows the server to respond with a list of supported features. The "Extended" part of "Extended Hello" thus refers to the client learning about extra server features so that it can behave accordingly. Like HELO, it is sent as the client's first message to the server following the server's greeting.
Consider this example snippet from Scenario D.1.:
S: 220 foo.com Simple Mail Transfer Service Ready
C: EHLO bar.com
S: 250-foo.com greets bar.com
S: 250-8BITMIME
S: 250-SIZE
S: 250-DSN
S: 250 HELP
In this example, the client identifies itself as bar.com following the server's 220 greeting message. The server sends a 250 to acknowledge bar.com, then lists its extensions to SMTP: 8BITMIME (8-bit character support), SIZE (allows the server to declare a maximum message size), DSN (delivery status notifications), and HELP (can be used to request more information). The dash (-) after the reply code tells the client to expect more lines from the server. The last line ("250 HELP") has a space instead of a dash because it is the final line of the reply; the server then waits for the next command from the client.
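The dash convention is simple to generate: every line of a reply except the last uses "code-", and the last uses "code" followed by a space. Below is a minimal sketch of a formatter for such multi-line replies. The function name format_multiline_reply and its signature are assumptions for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Build a multi-line SMTP reply in out: every line but the last is written as
 * "<code>-<text>\r\n" and the last as "<code> <text>\r\n".
 * Returns the number of characters written, or -1 if out is too small. */
int format_multiline_reply(char *out, size_t out_size, int code,
                           const char *lines[], size_t count) {
    size_t used = 0;
    if (out_size > 0) out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        char sep = (i + 1 < count) ? '-' : ' '; /* a dash means more lines follow */
        int n = snprintf(out + used, out_size - used, "%d%c%s\r\n", code, sep, lines[i]);
        if (n < 0 || (size_t)n >= out_size - used) return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Called with code 250 and the lines "foo.com greets bar.com", "8BITMIME", "SIZE", "DSN", and "HELP", this produces the server side of the EHLO exchange shown in the example.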
Mail (MAIL)
The MAIL command is used to provide the sender's mailbox. The syntax is the following:
mail = "MAIL FROM:" Reverse-path [SP Mail-parameters] CRLF
Here is an example of the MAIL command used to provide the sender's mailbox, "Smith@bar.com":
MAIL FROM:<Smith@bar.com>\r\n
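The MAIL grammar has two parts worth separating in code: the reverse-path in angle brackets (which may be empty, written <>, for bounce messages) and the optional parameters after a space. The sketch below splits the two. It assumes the CRLF has already been stripped and that the path contains no '>' character, which is a simplification. The name parse_mail_from is hypothetical, and "SIZE=1000" in the usage note is just an example parameter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h> /* strncasecmp (POSIX) */

/* Parse: "MAIL FROM:" Reverse-path [SP Mail-parameters]  (line without CRLF)
 * Copies the address between '<' and '>' into path (empty for "<>") and
 * points *params at the parameter text, or sets it to NULL if there is none. */
bool parse_mail_from(const char *line, char *path, size_t path_size, const char **params) {
    static const char prefix[] = "MAIL FROM:";
    size_t prefix_len = sizeof(prefix) - 1;
    if (strncasecmp(line, prefix, prefix_len) != 0) return false;
    const char *start = line + prefix_len;
    if (*start != '<') return false;
    const char *end = strchr(start, '>');
    if (!end) return false;
    size_t len = (size_t)(end - start - 1);
    if (len >= path_size) return false;
    memcpy(path, start + 1, len);
    path[len] = '\0';
    if (end[1] == '\0') { /* no parameters */
        *params = NULL;
        return true;
    }
    if (end[1] != ' ' || end[2] == '\0') return false; /* SP must be followed by text */
    *params = end + 2;
    return true;
}
```

For the line "MAIL FROM:<Smith@bar.com> SIZE=1000", path becomes "Smith@bar.com" and params points at "SIZE=1000".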
Recipient (RCPT)
The RCPT command is used to identify the recipient for the email. Only one recipient is identified with the command; however, the command can be repeated to add additional recipients.
The syntax also contains additional information for the required Postmaster address:
rcpt = "RCPT TO:" ( "<Postmaster@" Domain ">" / "<Postmaster>" / Forward-path ) [SP Rcpt-parameters] CRLF
For example,
RCPT TO:<Jones@foo.com>\r\n
and then
RCPT TO:<Brown@foo.com>\r\n
If no user exists with the stated address, the server responds with "550 No such user here" instead of the successful "250 OK".
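The choice between 250 and 550 can be sketched as a lookup against a list of known mailboxes. The user table and the function name rcpt_reply below are hypothetical. Comparing addresses case-insensitively is also a simplifying assumption, since RFC 5321 allows the local part of an address to be case-sensitive.

```c
#include <stddef.h>
#include <strings.h> /* strcasecmp (POSIX) */

/* Hypothetical table of local mailboxes for this sketch. */
static const char *LOCAL_USERS[] = {"jones@foo.com", "brown@foo.com"};

/* Return the reply for a RCPT address that has already been extracted
 * from its angle brackets. */
const char *rcpt_reply(const char *address) {
    for (size_t i = 0; i < sizeof(LOCAL_USERS) / sizeof(LOCAL_USERS[0]); i++) {
        if (strcasecmp(address, LOCAL_USERS[i]) == 0) {
            return "250 OK\r\n";
        }
    }
    return "550 No such user here\r\n";
}
```

A real server would consult its mailbox store instead of a fixed array.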
DATA
The DATA command is used to transmit the mail data from the sender. Unlike other commands, which are completed in a single message, DATA tells the server to enter a different mode in which subsequent lines from the client are appended to the mail data buffer. This mode is terminated by sending a line containing only a period ("."), that is, the sequence <CRLF>.<CRLF>.
The syntax is rather simple for this command:
data = "DATA" CRLF
Here is a snippet of this command from scenario D.1.:
C: DATA
S: 354 Start mail input; end with <CRLF>.<CRLF>
C: Blah blah blah...
C: ...etc. etc. etc.
C: .
At this point, it is worth mentioning that we are only describing the SMTP RFC 5321 specification here. The data format itself, known as the Internet Message Format, is specified in RFC 5322 and will be discussed later.
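Because a lone "." ends the message, RFC 5321 has the client prepend an extra period to any body line that starts with one, and the server removes it again (often called dot-stuffing). A minimal sketch of the server-side handling of one data line follows. It assumes the line has already had its CRLF stripped, and the names handle_data_line and enum DataLine are hypothetical.

```c
#include <stddef.h>
#include <string.h>

enum DataLine { DATA_END, DATA_APPENDED, DATA_OVERFLOW };

/* Process one line received in DATA mode (CRLF already stripped).
 * data is a NUL-terminated accumulation buffer of capacity data_size. */
enum DataLine handle_data_line(const char *line, char *data, size_t data_size) {
    if (strcmp(line, ".") == 0) {
        return DATA_END; /* <CRLF>.<CRLF> terminates the mail data */
    }
    if (line[0] == '.') {
        line++; /* drop the extra leading period added by the client */
    }
    size_t used = strlen(data);
    size_t len = strlen(line);
    if (used + len + 2 >= data_size) {
        return DATA_OVERFLOW; /* no room for the line, CRLF, and terminator */
    }
    memcpy(data + used, line, len);
    memcpy(data + used + len, "\r\n", 3); /* copies the CRLF and the NUL terminator */
    return DATA_APPENDED;
}
```

On DATA_OVERFLOW a server would typically keep reading until the terminating line and then reject the message, rather than silently truncating it.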
Reset (RSET)
The RSET command instructs the server to discard all sender, recipient, and mail data, clearing all buffers and state tables.
Syntax:
rset = "RSET" CRLF
For example:
RSET\r\n
NOOP
The NOOP command only asks the receiver to respond with "250 OK". It has no effect on buffers and may be issued at any time. Note that a parameter string may be specified; however, it should be ignored by the server.
Syntax:
noop = "NOOP" [ SP String ] CRLF
For example:
NOOP\r\n
QUIT
The QUIT command instructs the server to send a "221 OK" reply and then close the transmission channel. It may be sent by the client at any time, in which case any uncompleted mail transaction is aborted.
Syntax:
quit = "QUIT" CRLF
For example:
QUIT\r\n
Verify (VRFY)
The VRFY command is used to ask the server to confirm that the argument identifies a user or mailbox.
Syntax:
vrfy = "VRFY" SP String CRLF
For example:
VRFY Crispin\r\n
which may produce a response such as
250 Mark Crispin <Admin.MRC@foo.com>\r\n
Current progress
Below is my complete, single-afternoon attempt at implementing SMTP (RFC 5321) in C. Keep in mind that this is still very much in a toy implementation stage.
Known limitations:
- VRFY always returns "252 Cannot VRFY user, but will accept message and attempt delivery"
- Single client blocking
- read() loops assume one line per syscall (as with netcat), but real TCP traffic can violate this
- Highly limiting data array sizes
- No input validation
- Assumes no extension in EHLO (and does not support dashes (-) in server responses)
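The one-line-per-read() limitation could be addressed with a small buffer that accumulates raw bytes and hands back complete CRLF-terminated lines, regardless of how TCP happened to split or merge them. The following is a minimal sketch of that idea. The struct, the function names, and the buffer size are assumptions made for this example and are not part of the listing below.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LINE_BUF_SIZE 2048 /* hypothetical capacity for this sketch */

struct LineBuffer {
    char data[LINE_BUF_SIZE];
    size_t len; /* number of buffered bytes not yet consumed */
};

/* Append the bytes returned by one read() call. Returns false if they do not fit. */
bool linebuf_append(struct LineBuffer *lb, const char *bytes, size_t n) {
    if (n > LINE_BUF_SIZE - lb->len) return false;
    memcpy(lb->data + lb->len, bytes, n);
    lb->len += n;
    return true;
}

/* Remove the first complete line from the buffer and copy it into out without
 * its CRLF. Returns false if no complete line is buffered yet, or if the line
 * does not fit in out (in which case it stays in the buffer). */
bool linebuf_pop_line(struct LineBuffer *lb, char *out, size_t out_size) {
    for (size_t i = 0; i + 1 < lb->len; i++) {
        if (lb->data[i] == '\r' && lb->data[i + 1] == '\n') {
            if (i >= out_size) return false;
            memcpy(out, lb->data, i);
            out[i] = '\0';
            size_t consumed = i + 2; /* the line plus its CRLF */
            memmove(lb->data, lb->data + consumed, lb->len - consumed);
            lb->len -= consumed;
            return true;
        }
    }
    return false;
}
```

In the conversation loop, each read() result would go through linebuf_append, followed by a loop that calls linebuf_pop_line until it returns false and handles each line as a command or data line.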
Full code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define PORT 2525
#define NAME "foo.com"
#define BUFFER_SIZE 1024
#define ADDRESS_SIZE 128
#define MAX_ADDRESSES 128
#define DATA_SIZE 2048
#define EHLO_CMD "EHLO"
#define HELO_CMD "HELO"
#define MAIL_CMD "MAIL FROM:"
#define RCPT_CMD "RCPT TO:"
#define DATA_CMD "DATA"
#define RSET_CMD "RSET"
#define NOOP_CMD "NOOP"
#define QUIT_CMD "QUIT"
#define VRFY_CMD "VRFY"
struct Session {
    bool greeted;
    bool set_mail_from;
    bool in_data_mode;
    int recipient_count;
    char mail_from[ADDRESS_SIZE];
    char rcpt_to[MAX_ADDRESSES][ADDRESS_SIZE];
    char data[DATA_SIZE];
};
void reset_transaction(struct Session *state) {
    state->set_mail_from = false;
    state->in_data_mode = false;
    state->recipient_count = 0;
    memset(state->mail_from, 0, ADDRESS_SIZE);
    memset(state->rcpt_to, 0, sizeof(state->rcpt_to));
    memset(state->data, 0, DATA_SIZE);
}
void reset_session(struct Session *state) {
    state->greeted = false;
    reset_transaction(state);
}
void send_msg(const int client_fd, const char msg[]) {
    send(client_fd, msg, strlen(msg), 0);
    printf("S: %s", msg);
}
bool extract_address(const char *arg, char *out_addr, size_t out_size) {
    if (arg[0] != '<') {
        return false;
    }
    const char *close = strchr(arg, '>');
    if (!close || close == arg + 1) { // missing or empty
        return false;
    }
    size_t len = (size_t)(close - arg - 1);
    if (len >= out_size) {
        return false;
    }
    strncpy(out_addr, arg + 1, len);
    out_addr[len] = '\0';
    return true;
}
int main() {
    int server_fd, client_fd;
    struct sockaddr_in address;
    socklen_t addrlen = sizeof(address); // accept() expects a socklen_t length
    char buffer[BUFFER_SIZE];
    // Create socket
    server_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (server_fd < 0) { perror("socket failed"); exit(EXIT_FAILURE); }
    int opt = 1;
    setsockopt(server_fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
    address.sin_family = AF_INET;
    address.sin_addr.s_addr = INADDR_ANY;
    address.sin_port = htons(PORT);
    if (bind(server_fd, (struct sockaddr *)&address, sizeof(address)) < 0) {
        perror("bind failed"); exit(EXIT_FAILURE);
    }
    if (listen(server_fd, 5) < 0) {
        perror("listen failed"); exit(EXIT_FAILURE);
    }
    // Responses
    char greeting[BUFFER_SIZE];
    snprintf(greeting, sizeof(greeting), "220 %s Simple Mail Transfer Service Ready\r\n", NAME);
    char bye[BUFFER_SIZE];
    snprintf(bye, sizeof(bye), "221 %s Service closing transmission channel\r\n", NAME);
    const char *okay = "250 OK\r\n";
    const char *cannot_verify = "252 Cannot VRFY user, but will accept message and attempt delivery\r\n";
    const char *start_mail_input = "354 Start mail input; end with <CRLF>.<CRLF>\r\n";
    // RFC 5321 uses 500 for unrecognized commands (502 is "Command not implemented")
    const char *not_implemented = "500 Syntax error, command unrecognized\r\n";
    const char *syntax_error_param = "501 Syntax error in parameters or arguments\r\n";
    const char *bad_sequence = "503 Bad sequence of commands\r\n";
    const char *too_many_recipients = "452 Too many recipients\r\n";
    printf("TCP: Listening on :%d\n", PORT);
    // Accept client
    while (1) {
        addrlen = sizeof(address); // accept() overwrites the length, so reset it each time
        client_fd = accept(server_fd, (struct sockaddr *)&address, (socklen_t *)&addrlen);
        if (client_fd < 0) { perror("accept failed"); continue; }
        printf("TCP: Client connected: %s\n", inet_ntoa(address.sin_addr));
        // Initialize session state
        struct Session session_state;
        reset_session(&session_state);
        // Send greeting
        send_msg(client_fd, greeting);
        // Conversation with client
        while (1) {
            memset(buffer, 0, BUFFER_SIZE);
            int bytes_read = read(client_fd, buffer, BUFFER_SIZE - 1);
            if (bytes_read <= 0) {
                printf("TCP: Client disconnected.\n");
                break;
            }
            buffer[bytes_read] = '\0';
            // Strip trailing \r\n
            char *end = buffer + strcspn(buffer, "\r\n");
            *end = '\0';
            int stripped_len = (int)(end - buffer);
            printf("C: %s\n", buffer);
            // Data mode
            if (session_state.in_data_mode) {
                if (strcmp(buffer, ".") == 0) { // Terminate
                    // A real server would hand session_state.data to the MDA here.
                    // Reset the transaction so the next message starts with empty
                    // sender, recipient, and data buffers (also clears in_data_mode).
                    reset_transaction(&session_state);
                    send_msg(client_fd, okay);
                } else {
                    // Dot-stuffing
                    const char *line = (buffer[0] == '.' && buffer[1] != '\0') ? buffer + 1 : buffer;
                    strncat(session_state.data, line, DATA_SIZE - strlen(session_state.data) - 1);
                    strncat(session_state.data, "\r\n", DATA_SIZE - strlen(session_state.data) - 1);
                }
                continue;
            }
            // Command mode
            char arg[BUFFER_SIZE];
            char extracted_addr[ADDRESS_SIZE];
            if (strncmp(buffer, EHLO_CMD, strlen(EHLO_CMD)) == 0) { // EHLO
                session_state.greeted = true;
                if (stripped_len > (int)strlen(EHLO_CMD) + 1) { // Get client FQDN if was provided (that is, content after "EHLO ")
                    // Construct arg
                    strncpy(arg, buffer + strlen(EHLO_CMD) + 1, sizeof(arg) - 1);
                    arg[sizeof(arg) - 1] = '\0';
                    // Personalized greeting
                    char greets_you[BUFFER_SIZE];
                    snprintf(greets_you, sizeof(greets_you), "250 %s greets %s\r\n", NAME, arg); // ASSUMING NO EXTENSIONS
                    send_msg(client_fd, greets_you);
                } else { // No FQDN provided by client
                    send_msg(client_fd, okay);
                }
            } else if (strncmp(buffer, HELO_CMD, strlen(HELO_CMD)) == 0) { // HELO
                session_state.greeted = true;
                send_msg(client_fd, okay);
            } else if (strncmp(buffer, MAIL_CMD, strlen(MAIL_CMD)) == 0) { // MAIL
                if (!session_state.greeted) {
                    send_msg(client_fd, bad_sequence);
                    continue;
                }
                strncpy(arg, buffer + strlen(MAIL_CMD), sizeof(arg) - 1);
                arg[sizeof(arg) - 1] = '\0';
                if (strcmp(arg, "<>") == 0) { // Allow empty reverse-path <> (used by bounce messages)
                    session_state.mail_from[0] = '\0';
                } else if (extract_address(arg, extracted_addr, ADDRESS_SIZE)) { // Extracted address
                    strncpy(session_state.mail_from, extracted_addr, ADDRESS_SIZE - 1);
                    session_state.mail_from[ADDRESS_SIZE - 1] = '\0';
                } else { // Invalid
                    send_msg(client_fd, syntax_error_param);
                    continue;
                }
                session_state.set_mail_from = true;
                send_msg(client_fd, okay);
            } else if (strncmp(buffer, RCPT_CMD, strlen(RCPT_CMD)) == 0) { // RCPT
                if (!session_state.greeted || !session_state.set_mail_from) {
                    send_msg(client_fd, bad_sequence);
                    continue;
                }
                if (session_state.recipient_count >= MAX_ADDRESSES) {
                    send_msg(client_fd, too_many_recipients);
                    continue;
                }
                strncpy(arg, buffer + strlen(RCPT_CMD), sizeof(arg) - 1);
                arg[sizeof(arg) - 1] = '\0';
                // Attempt extract address
                if (!extract_address(arg, extracted_addr, ADDRESS_SIZE)) {
                    send_msg(client_fd, syntax_error_param);
                    continue;
                }
                // Update session for new address
                strncpy(session_state.rcpt_to[session_state.recipient_count], extracted_addr, ADDRESS_SIZE - 1);
                session_state.rcpt_to[session_state.recipient_count][ADDRESS_SIZE - 1] = '\0';
                session_state.recipient_count++;
                send_msg(client_fd, okay);
            } else if (strncmp(buffer, DATA_CMD, strlen(DATA_CMD)) == 0) { // DATA
                if (!session_state.greeted || !session_state.set_mail_from || session_state.recipient_count == 0) {
                    send_msg(client_fd, bad_sequence);
                    continue;
                } else {
                    session_state.in_data_mode = true;
                    send_msg(client_fd, start_mail_input);
                }
            } else if (strncmp(buffer, RSET_CMD, strlen(RSET_CMD)) == 0) { // RSET
                reset_transaction(&session_state); // Reset transaction (not session)
                send_msg(client_fd, okay);
            } else if (strncmp(buffer, NOOP_CMD, strlen(NOOP_CMD)) == 0) { // NOOP
                send_msg(client_fd, okay);
            } else if (strncmp(buffer, QUIT_CMD, strlen(QUIT_CMD)) == 0) { // QUIT
                reset_session(&session_state);
                send_msg(client_fd, bye);
                break;
            } else if (strncmp(buffer, VRFY_CMD, strlen(VRFY_CMD)) == 0) { // VRFY
                send_msg(client_fd, cannot_verify);
            } else { // Catchall
                send_msg(client_fd, not_implemented);
            }
        }
        close(client_fd);
        printf("TCP: Connection closed\n\n");
    }
    close(server_fd);
    return 0;
}
I will follow up with more webpages to discuss updates to this project, along with changes that I make.
Last updated March 26, 2026