Overview of the New York City Marathon and its Email Alerts
It was a sunny mid-September day when my head of sales informed me that the systems integrator for the 2009 New York City Marathon was looking at our SMTP service for the delivery of alert emails on runners during the marathon. Given that our SMTP relay service was relatively new at the time, we saw this as an opportunity to demonstrate the powerful tracking features and performance capabilities of our service.
Having tried other email vendors and its own internal systems in the past, the marathon wanted to avoid earlier problems, such as long delays in the delivery of the email alerts and blocking by consumer email providers like Gmail and Hotmail.
The marathon had about 42,000 total runners. Friends and family members of each runner could "subscribe" to a runner, such that whenever the runner reached a checkpoint during the marathon, an email alert would be sent to the subscribers of that runner. Checkpoint crossings would be detected electronically via a ChronoTrack D-Tag. For more information on the electronic tracking of runners, see http://www.nycmarathon.org/race_scoring.htm.
My team was told to expect anywhere from 400,000 to 2,000,000 email alerts to be sent out during the race. That represents an average of 10 to 50 email alerts per runner, depending on how many individuals subscribe to a particular runner.
The Speed Problem
The JangoSMTP service had been a single-node service since its launch earlier in 2009. It was served by a single fault-tolerant, RAID-based, multiple-CPU, high-memory Windows server located at relay.jangosmtp.net. The SMTP service works by receiving email at relay.jangosmtp.net, and then passing emails along to one of JangoMail's 40 outbound SMTP senders for delivery to the final recipient. The relay.jangosmtp.net server could receive email as fast as our upstream Internet provider's bandwidth allowed, but it could only process, add tracking, and transmit to the outbound SMTP senders over the Internet at a rate of 250 emails/minute. I therefore calculated:
250 emails/minute x 60 = 15,000 emails/hour
Based on past marathon results, I estimated that the fastest runners would complete the race in about 2 hours, and the slowest runners would complete the race in about 6 hours. However, the marathon would be initiated in 3 waves of 14,000 runners each, distributed over an hour the morning of the marathon. So 6 hours of running time, plus an added hour for the start of the last wave, meant 7 hours of sending email.
7 hours x 15,000 emails/hour = 105,000 emails over 7 hours
Uh oh. We weren't nearly fast enough. Even at the minimum expected volume, we weren't fast enough by a factor of 4. And at the maximum expected volume, we only had 5% of the needed capacity.
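The arithmetic above can be reproduced with a short Python sketch. The constant and function names are my own; the figures are the ones quoted in this post.

```python
# Back-of-the-envelope capacity check using the figures quoted above.
RATE_PER_MINUTE = 250      # emails/minute the single relay could process
SENDING_HOURS = 7          # 6 hours of running + 1 hour of staggered starts
EXPECTED_MIN = 400_000     # low end of the expected alert volume
EXPECTED_MAX = 2_000_000   # high end of the expected alert volume


def capacity(rate_per_minute: int, hours: int, servers: int = 1) -> int:
    """Total emails that can be processed during the sending window."""
    return rate_per_minute * 60 * hours * servers


total = capacity(RATE_PER_MINUTE, SENDING_HOURS)
shortfall_factor = EXPECTED_MIN / total    # how many times too slow at the low end
coverage_at_max = total / EXPECTED_MAX     # fraction of the high end we could handle

print(f"capacity over {SENDING_HOURS} hours: {total:,} emails")
print(f"shortfall at minimum volume: {shortfall_factor:.1f}x")
print(f"share of maximum volume covered: {coverage_at_max:.1%}")
```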
The two bottlenecks were 1) the processing of an email message, meaning the disassembly and re-assembly needed to determine which user it belonged to and to add tracking mechanisms, and 2) transmitting the email to an SMTP sender. Point #2 warrants more explanation. While there is no bottleneck for relay.jangosmtp.net to receive emails from the outside world, there can be a bottleneck for relay.jangosmtp.net to transmit emails, since relay.jangosmtp.net must transmit each message to separate SMTP senders located on separate networks in separate data centers. This transmission happens over the Internet, and speed can fluctuate depending on where the SMTP sender is located and how traffic is routed to it.
Time to Scale Up
We needed to scale up, and scale up fast. The initial plan was to order four more servers, and have them each serve as an additional SMTP receiver and processor for relay.jangosmtp.net. The JangoMail architecture does not use appliance-based load balancers, so I decided to handle the load balancing via our Domain Name System (DNS).
We created multiple DNS "A" records for relay.jangosmtp.net, each with a Time to Live (TTL) of 60 seconds. Five "A" records were created in total, each with a different IP address. The first of the five was the original IP for relay.jangosmtp.net, and the other four were for the four additional servers we commissioned.
By keeping the TTL at a short 60 seconds, I could be reasonably confident that each of the five servers would receive a roughly equal share of the email every minute. And since not all DNS servers on the net respect TTLs and some do their own caching, I confirmed with our tech contact at the marathon that their systems would NOT cache the IPs for relay.jangosmtp.net beyond the designated 60-second TTL.
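To illustrate why several "A" records spread the load, here is a small Python simulation. It is a sketch, not a description of how any particular resolver behaves. It assumes each new connection ends up on one of the records uniformly at random, and the IP addresses are placeholders from the RFC 5737 documentation range, not our real relay addresses.

```python
import random
from collections import Counter

# Hypothetical addresses from the RFC 5737 documentation range; the real
# relay IPs are not given in this post.
A_RECORDS = [f"192.0.2.{n}" for n in range(1, 6)]


def simulate_round_robin(connections: int, records: list, seed: int = 0) -> Counter:
    """Count how many connections land on each A record.

    Assumes each new connection picks one record uniformly at random, which
    approximates a resolver that rotates or shuffles the record set each
    time the 60-second TTL forces a fresh lookup.
    """
    rng = random.Random(seed)
    return Counter(rng.choice(records) for _ in range(connections))


load = simulate_round_robin(100_000, A_RECORDS)
for ip, count in sorted(load.items()):
    print(f"{ip}: {count / 100_000:.1%} of connections")
```

Under that assumption each server ends up with close to, but not exactly, one fifth of the connections. DNS-based balancing evens out load statistically rather than guaranteeing it.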
SQL Query Caching
Every email that arrives at relay.jangosmtp.net is disassembled, tracked, re-assembled, and then passed to an outbound SMTP server for actual sending. In order to determine what email belongs to what user, and what tracking options each user has selected, the originating IP address of each email message is looked up against an IP Address table, and then once the user is determined from the IP address, the UserAccounts table in the core database is queried to determine what tracking/DomainKeys options the user has selected. These two queries combined took anywhere from 0.05 seconds to 0.2 seconds, depending on the load on the database at the time.
We shaved this time down to 0.001 seconds by caching the results of these queries and refreshing every five minutes.
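A minimal Python sketch of this kind of cache is shown below. It is not our production code: the class, the loader function, and the settings fields are hypothetical, and it refreshes each entry lazily once it is older than five minutes, which is only one of several ways to implement a five-minute refresh.

```python
import time
from typing import Callable, Dict, Tuple

REFRESH_SECONDS = 5 * 60  # refresh interval quoted above


class TTLCache:
    """Cache lookup results and re-run the query once an entry expires."""

    def __init__(self, loader: Callable[[str], dict],
                 ttl: float = REFRESH_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader   # function that runs the real database queries
        self._ttl = ttl
        self._clock = clock     # injectable so the expiry logic can be tested
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, source_ip: str) -> dict:
        now = self._clock()
        hit = self._entries.get(source_ip)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]                    # fresh entry: no database round trip
        value = self._loader(source_ip)      # missing or stale: query again
        self._entries[source_ip] = (now, value)
        return value


# Hypothetical stand-in for the IP Address and UserAccounts queries.
def load_user_settings(source_ip: str) -> dict:
    return {"user": "marathon", "open_tracking": True, "domainkeys": True}


cache = TTLCache(load_user_settings)
settings = cache.get("192.0.2.10")   # first call runs the loader
settings = cache.get("192.0.2.10")   # second call is served from memory
```

The trade-off is staleness: a user who changes a tracking option may wait up to five minutes for the change to take effect.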
Our custom SMTP architecture is a multi-threaded model, allowing for the simultaneous processing and delivery of emails across user accounts. We had been informed ahead of time that the marathon would trigger email alerts from two originating IP addresses. We therefore configured two dedicated processing threads on each of the 8 servers. This gave us 16 total processing threads, and also isolated the processing/delivery of the marathon's emails from our other clients' emails during the race.
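The idea of one dedicated processing thread per originating IP can be sketched in Python as follows. This is an illustration only: our SMTP service is not written in Python, the IP addresses are placeholders from a documentation range, and the "processing" step is a stand-in for the real tracking and hand-off work.

```python
import queue
import threading

# Hypothetical originating IPs (documentation range); the real ones are not
# given in this post.
MARATHON_IPS = ["198.51.100.1", "198.51.100.2"]


def worker(inbox: queue.Queue, processed: list, lock: threading.Lock) -> None:
    """Drain one queue until a None sentinel arrives."""
    while True:
        message = inbox.get()
        if message is None:
            break
        with lock:
            processed.append(message)   # stand-in for tracking + hand-off to a sender


def run_dedicated_workers(messages: list) -> list:
    """Route (source_ip, body) pairs to one dedicated thread per source IP."""
    queues = {ip: queue.Queue() for ip in MARATHON_IPS}
    processed, lock = [], threading.Lock()
    threads = [threading.Thread(target=worker, args=(q, processed, lock))
               for q in queues.values()]
    for t in threads:
        t.start()
    for source_ip, body in messages:
        queues[source_ip].put((source_ip, body))   # each IP only feeds its own queue
    for q in queues.values():
        q.put(None)                                # tell each worker to stop
    for t in threads:
        t.join()
    return processed


done = run_dedicated_workers([("198.51.100.1", "alert A"), ("198.51.100.2", "alert B")])
print(len(done), "messages processed")
```

Because each originating IP has its own queue and thread, a burst from one source cannot starve the other, and neither competes with threads serving other accounts.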
Distributed Transactional Senders
The SMTP service is part of JangoMail's transactional email platform, which also includes the SendTransactionalEmail API method. All transactional emails are sent through the email sender that is assigned to a particular user. For fault-tolerance purposes, every user has a list of transactional email senders assigned to it, such that if the first email sender is unavailable or offline, the email is passed to the second, and to the third, and so on, until the email is successfully transmitted. This approach was great for fault tolerance and redundancy, but not for scalability. If the first server in the list was online and available, then it would receive all the transactional emails for that account.
I therefore decided to add an internally controlled, user-level setting to randomize the list of senders. Now, if a user's list of transactional senders included:
Sender1, Sender2, Sender3, Sender4
then the relay.jangosmtp.net servers would farm out the emails for delivery to any sender in the list at random, so that every assigned transactional sender would receive a roughly equal share of the emails to deliver, rather than having the first available sender do all the delivery. This also helped resolve the bandwidth bottleneck with the SMTP senders mentioned earlier.
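The difference between the fixed failover order and the randomized order can be sketched in Python. The function names and the fake transport below are hypothetical. The point is only that shuffling the list per message spreads the load while still falling through to the next sender when one is unavailable.

```python
import random
from typing import Callable, List, Optional


def deliver(message: str, senders: List[str],
            try_send: Callable[[str, str], bool],
            randomize: bool = True,
            rng: Optional[random.Random] = None) -> Optional[str]:
    """Hand a message to the first sender that accepts it.

    With randomize=False the list is walked in its fixed order (the original
    failover behaviour). With randomize=True the order is shuffled for each
    message, so load spreads across senders while failover still works.
    """
    order = list(senders)
    if randomize:
        (rng or random).shuffle(order)
    for sender in order:
        if try_send(sender, message):
            return sender
    return None   # every sender was unavailable


# Hypothetical transport: Sender2 is offline, the others accept mail.
def fake_send(sender: str, message: str) -> bool:
    return sender != "Sender2"


senders = ["Sender1", "Sender2", "Sender3", "Sender4"]
rng = random.Random(0)
used = [deliver(f"alert {i}", senders, fake_send, rng=rng) for i in range(1000)]
print({s: used.count(s) for s in senders})
```

In this run the offline sender handles nothing and the remaining three split the messages roughly evenly, whereas with `randomize=False` every message would go to Sender1.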
Still not fast enough - need 3 more servers
Given our SQL query optimizations, multi-threading, and additional servers, the timing now looked like:
400 emails/minute/server x 5 servers = 2,000 emails/minute x 60 minutes = 120,000 emails/hour.
120,000 emails/hour x 7 hours = 840,000 emails over 7 hours
However, if the load was over 1 million emails, there would still be a delay. I decided to annex 3 additional servers that were already part of the JangoMail network but weren't active on port 25, and turn them into 3 additional SMTP receivers. There were now 8 total DNS "A" records for relay.jangosmtp.net.
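The same calculation can be extended to 8 servers with a short Python sketch. It assumes the three annexed servers would also sustain 400 emails/minute, which is an assumption for illustration rather than a measured figure.

```python
PER_SERVER_RATE = 400   # emails/minute/server after the optimizations above
SENDING_HOURS = 7       # sending window estimated earlier


def window_capacity(servers: int) -> int:
    """Emails that `servers` relays can process during the sending window."""
    return PER_SERVER_RATE * servers * 60 * SENDING_HOURS


for servers in (5, 8):
    print(f"{servers} servers: {window_capacity(servers):,} emails over {SENDING_HOURS} hours")
```

Under that assumption, 8 servers work out to 1,344,000 emails over the 7-hour window, compared with 840,000 for 5 servers.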
With just 48 hours before the marathon, we were informed that the expected outbound volume, based on the number of subscribers so far, would be about 750,000 emails.
And just in case all else failed...
While we've always believed in the performance and reliability of our code and architecture, we decided that for a project of this caliber, a backup plan was necessary in case our custom SMTP architecture was unable to perform. The JangoSMTP custom architecture is what allows emails that pass through the SMTP relay to be open- and click-tracked and stored in a database, so that SMTP logs can be viewed and reports can be generated based on open and click timing, domains, and IP-based geo-tracking. However, the primary concern for the marathon was to ensure the emails were delivered, and delivered on time. Therefore, a backup plan was put into place that could guarantee the emails would be delivered, even if we had to give up the tracking and logging provided by our own custom architecture. If we found that JangoSMTP could not handle the load, we would replace the instances of the JangoSMTP receiver with Microsoft's built-in SMTP service (part of Internet Information Services), such that emails would be received and delivered to the final recipient without any processing in between.
In order to isolate the marathon from our other clients in case this backup plan had to be put in place, we set up the domain marathon.jangosmtp.net, which mimicked the 8 A records for relay.jangosmtp.net, and we asked the systems integrator for the marathon to connect to marathon.jangosmtp.net instead of relay.jangosmtp.net. This would allow us to redirect just their email on the day of the race if needed.
The Day of the Race - November 1, 2009
I awoke at 7:30 AM EST on that Sunday, after having been out the night prior for Halloween. The emails would begin trickling in at 8:30 AM EST for runner check-ins, and the first wave of the race was set to begin at 9:40 AM EST.
I was watching the race live on TV, while at the same time monitoring the traffic flows across our 8 instances of relay.jangosmtp.net. Emails were being received, processed, and sent quickly, and there was no backlog...until about 12:25 PM EST.
All three waves had been released, and runners from all three waves were triggering a massive volume of email alerts. From 12:25 PM EST to about 12:35 PM EST, there was a 4-5 minute delay with final delivery. Thankfully, the backlog period only lasted 10 minutes. It makes sense that the highest volume of email was passing through approximately 3 hours after the release of the first wave, since the greatest number of runners would still be running around this time.
Additionally, at about 1:00 PM EST, we discovered that the comcast.net and att.net/bellsouth.net domains were blocking one of the IPs from which marathon email was being sent. While we did have a mechanism by which domain-specific routes could be used, such that we could send all comcast.net/att.net/bellsouth.net email through one specific non-blocked IP address, the complexity of the randomization system we had added to accommodate the bandwidth bottleneck rendered this mechanism non-functional. I called our lead developer, who was on call, asked him to make a change to the sender-determination algorithm, and re-deployed our code across all 8 SMTP server instances. We were now able to route all comcast.net/att.net/bellsouth.net email through a separate non-blocked SMTP sender. In the end, the marathon alert emails had less than a 0.3% blocking rate.
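One way to let a domain-specific route take precedence over random sender selection is sketched below in Python. This is not the actual change our developer made. The sender names and the routing table are hypothetical, and the sketch only shows the ordering of the two checks.

```python
import random
from typing import Dict, List, Optional

# Hypothetical names: this post does not give real sender identifiers.
DOMAIN_ROUTES: Dict[str, str] = {
    "comcast.net": "SenderUnblocked",
    "att.net": "SenderUnblocked",
    "bellsouth.net": "SenderUnblocked",
}


def choose_sender(recipient: str, senders: List[str],
                  routes: Dict[str, str] = DOMAIN_ROUTES,
                  rng: Optional[random.Random] = None) -> str:
    """Pick a sender: a domain-specific route wins, otherwise choose at random."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    if domain in routes:
        return routes[domain]                 # pinned route is checked first
    return (rng or random).choice(senders)    # everyone else is spread randomly


pool = ["Sender1", "Sender2", "Sender3", "Sender4"]
print(choose_sender("fan@comcast.net", pool))   # pinned to the non-blocked sender
print(choose_sender("fan@example.com", pool))   # any sender from the pool
```

Checking the route table before randomizing keeps the load spreading for most recipients while guaranteeing that mail to the affected domains leaves from an IP that is not blocked.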
After 1:00 PM EST, no further email deliverability or backlog issues ensued.
We were thrilled that we were able to pull this off for the 2009 ING New York City Marathon. Even post-marathon, we continue to make performance and feature enhancements to our transactional email platform. If you have an important project for which sending email is critical, please get in touch with us, and we'll work as hard for you as we did for the marathon.