How to Improve Amazon S3 Download Performance with a CDN


 


Among the thousands of origin servers we support, Amazon’s Simple Storage Service (S3) is the most popular (especially among our Enterprise CDN customers). And although S3 is great at warehousing content, it isn’t designed to match a CDN on price or delivery performance.

Suppose you’re pushing 1TB a month. On S3 that costs you $122.76/month, served from a single location. With a MaxCDN Business Package, you could handle that same 1TB for $79, distributed across 10+ data centers worldwide. The result is much faster load times for users, as these measurements show:
Location/ISP        | S3      | CDN   | Improvement
Amsterdam, NL       | 1,751ms | 171ms | -1,580ms (90.23%)
Chicago – GLC       | 1,208ms | 120ms | -1,088ms (90.62%)
New York – VZN FIOS | 1,242ms | 591ms | -651ms (52.41%)
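
The Improvement column is the S3 time minus the CDN time, expressed as a percentage of the S3 time, and the same arithmetic applies to the monthly cost. A minimal sketch of that calculation (the `saved` helper is a hypothetical name, not part of any tool mentioned here):

```shell
# saved BEFORE AFTER -> prints "<absolute saving> <percent saving>"
saved() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.2f %.2f\n", before - after, (before - after) / before * 100 }'
}

saved 1751 171      # Amsterdam load time in ms: prints "1580.00 90.23"
saved 122.76 79     # monthly cost in USD:       prints "43.76 35.65"
```
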
Today let’s walk through creating a new S3 bucket on AWS, creating a Pull Zone on MaxCDN, and measuring the real-world performance of our CDN.

Step 1: Create Your S3 Bucket

Log in to your AWS account (https://console.aws.amazon.com/s3/home) and select “Create a Bucket”:

AWS S3 Management Console

Once created, choose Actions > Upload:

AWS S3 Management Console

Add a file and click Start Upload:

AWS S3 Management Console

Select the file, click Actions, then Make Public:

AWS S3 Management Console

With the file selected, click Properties and copy the URL:

AWS S3 Management Console

With the file URL, we can check out the S3 response headers using cURL:
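
A minimal sketch of that request, assuming a placeholder bucket and file name (substitute the URL you copied in the previous step). The `-I` flag asks curl for the response headers only; `s3_headers` is a hypothetical helper name:

```shell
# Placeholder URL: replace with the object URL copied from the Properties panel.
S3_URL="https://s3.amazonaws.com/your-bucket/your-file.jpg"

# Print only the response headers for URL $1.
# -I sends a HEAD request; -s hides the progress meter.
s3_headers() {
  curl -sI "$1"
}

# Usage: s3_headers "$S3_URL"
```
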

 

Step 2: Create Your (CDN) Pull Zone

Our next step is to create a Pull Zone on MaxCDN. In the Control Panel, click “Create Pull Zone”:

MaxCDN Control Panel

Give it a name, and use the root S3 path as the origin server URL:

MaxCDN Control Panel

Click “Manage Pull Zone” for the direct CDN URL (like zone.company.netdna-cdn.com):

MaxCDN Control Panel
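
Because the Pull Zone fetches from the origin root, a file's path under that root stays the same on the CDN hostname. A small sketch of the mapping, assuming placeholder bucket, file, and zone names (`to_cdn_url` is a hypothetical helper, not a MaxCDN tool):

```shell
# to_cdn_url OBJECT_URL ORIGIN_ROOT CDN_HOST
# Strip the origin root from the object URL and prepend the CDN hostname.
to_cdn_url() {
  printf 'http://%s/%s\n' "$3" "${1#"$2"/}"
}

to_cdn_url "http://your-bucket.s3.amazonaws.com/images/photo.jpg" \
           "http://your-bucket.s3.amazonaws.com" \
           "zone.company.netdna-cdn.com"
# prints "http://zone.company.netdna-cdn.com/images/photo.jpg"
```
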

Now that the Pull Zone is provisioned, check to see if the image is cached on the CDN:
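
A minimal sketch of that check, assuming the placeholder CDN URL below (swap in your own zone and file). The `cache_headers` helper is a hypothetical name; it filters the response down to the two headers of interest:

```shell
# Placeholder CDN URL, following the zone.company.netdna-cdn.com pattern above.
CDN_URL="http://zone.company.netdna-cdn.com/your-file.jpg"

# Keep only the Server and X-Cache lines from response headers read on stdin.
# tr strips the carriage returns that HTTP header lines end with.
cache_headers() {
  tr -d '\r' | grep -iE '^(server|x-cache):'
}

# Usage: curl -sI "$CDN_URL" | cache_headers
```
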

Notice the two new headers in the response:

  • Server: NetDNA-cache/2.2
  • X-Cache: HIT

Server identifies the NetDNA caching server and its version. X-Cache shows whether the Edge Server had the file cached. The first two requests will have a value of MISS, because an Edge Server only caches a file once it has been requested twice (this prevents rarely accessed long-tail content from filling caches).
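
To watch the cache warm up, the same request can be repeated a few times. A rough sketch, assuming a placeholder CDN URL; `x_cache_status` and `warm_and_report` are hypothetical helper names:

```shell
# Print the X-Cache value in upper case (e.g. HIT or MISS)
# from response headers read on stdin.
x_cache_status() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "x-cache" { print toupper($2) }'
}

# Request the same URL several times and print the cache status of each response.
# $1 = URL, $2 = number of requests.
warm_and_report() {
  i=1
  while [ "$i" -le "$2" ]; do
    printf 'request %s: %s\n' "$i" "$(curl -sI "$1" | x_cache_status)"
    i=$((i + 1))
  done
}

# Usage (placeholder URL):
# warm_and_report "http://zone.company.netdna-cdn.com/your-file.jpg" 3
```

Given the behavior described above, the first two lines would be expected to show MISS and later ones HIT, provided every request reaches the same Edge Server.
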

Step 3: Measure Performance

Using Chrome’s Developer Tools, we can peek at the file load time from the browser’s perspective (well under a second):

MaxCDN Control Panel

But that’s a single data point. For a broader picture, we ran a detailed 4-hour performance test with Catchpoint. The CDN responded far faster than S3:

  • Average response: CDN (300ms) vs. S3 (1.8 – 2.7s)
  • Median response: CDN (200 – 400ms, worldwide) vs. S3 (1.2 – 4.2s, worldwide)
Catchpoint Web Performance Monitoring
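
To collect rough numbers of your own, curl can report the total time of each request. A minimal sketch, assuming a placeholder URL; it measures from a single machine, so it is no substitute for a multi-location test like the one above. `time_request` and `summarize` are hypothetical helper names:

```shell
# Print the total time in seconds for one request to URL $1.
time_request() {
  curl -s -o /dev/null -w '%{time_total}\n' "$1"
}

# Read one number per line on stdin; print "avg=<mean> median=<median>".
summarize() {
  sort -n | awk '
    { v[NR] = $1; sum += $1 }
    END {
      if (NR == 0) exit 1
      mid = int((NR + 1) / 2)
      # Odd count: middle value. Even count: mean of the two middle values.
      median = (NR % 2) ? v[mid] : (v[mid] + v[mid + 1]) / 2
      printf "avg=%.3f median=%.3f\n", sum / NR, median
    }'
}

# Usage (placeholder URL), five samples:
# for n in 1 2 3 4 5; do
#   time_request "http://zone.company.netdna-cdn.com/your-file.jpg"
# done | summarize
```

Running the same loop against the S3 URL gives a second set of numbers to compare with the CDN's.
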

A dedicated CDN offers consistent performance, and global load times up to 10x faster than S3. But there’s no need to choose: use S3 as your origin server, and MaxCDN to speed up your site for every visitor.

  • http://twitter.com/davidhenzel David Henzel

    Great post Justin!

  • jelyman

    How is this cheaper? Wouldn’t you still have to pay for the 1TB of data heading out of S3?

    • http://blog.justindorfman.com jdorfman

      @jelyman Good question. The CDN acts as a proxy and serves the files until the cache is purged or becomes stale. Each request is not going back to the S3 bucket. The only time the CDN makes a request to the bucket is when a file/object is not cached on the edge.

      This video could help you visualize it better: http://youtu.be/leJ8WzG2Uuk

      • jelyman

        Makes sense. One would initially have to pay for the CDN’s pulling of content to its edges initially and when it becomes stale, but outside of those times, the only S3 cost is the pennies it takes to hold the data.

        • http://113tidbits.com/ tony greene

          yuuup!!

  • Dave Bishop

    Just what I was wondering, thanks!

  • http://twitter.com/EatNatto James Yeung

    Was there a reason to not just use Amazon’s CloudFront w/ S3 instead?

    • http://blog.justindorfman.com jdorfman

      @James Yeung because CloudFront is a competitor. ;)

    • Peter

      Cloudfront doesn’t compress content.