Amazon’s Elastic File System: Kicking the Tires
Sebastian Good

If you want to store lots of data, the Amazon cloud has buckets and glaciers for you, but not a shared file system. Until now. The recently released AWS Elastic File System finally plugs that gap. Now you can mount an arbitrarily large file system on any of your Linux EC2 instances. Like other “cloud-native” solutions before it, you don’t have to provision or manage your own server capacity to run this file system; you just start using it and AWS scales it up or down as necessary. The service is currently in preview and we were lucky enough to get our hands on an account, so we decided to kick the tires a bit. (As a side note, Azure has a similar program under preview as well.)

Why use it?

There are a lot of great reasons a file system is preferable to more traditional cloud buckets, like S3 or Azure Blob Storage.

The downside: cost

But it’s not all sunshine and puppies. Let’s compare storage costs (at least today, in the Oregon (US-West-2) region):

Service    ¢/GB/month
Glacier    1¢
S3         3¢
EFS        30¢
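To make the gap concrete, here is a minimal Python sketch that turns the prices above into a monthly bill. The prices are the ones quoted in this post; the function name and the 10,000GB dataset size are made up purely for illustration.

```python
# Prices in cents per GB per month, as quoted above
# (Oregon, us-west-2, at the time of writing).
PRICE_CENTS_PER_GB_MONTH = {"Glacier": 1, "S3": 3, "EFS": 30}


def monthly_cost_dollars(size_gb, service):
    """Return the monthly storage cost in dollars for size_gb gigabytes."""
    return size_gb * PRICE_CENTS_PER_GB_MONTH[service] / 100.0


# Hypothetical dataset size, chosen only for illustration.
EXAMPLE_SIZE_GB = 10_000

for service in PRICE_CENTS_PER_GB_MONTH:
    cost = monthly_cost_dollars(EXAMPLE_SIZE_GB, service)
    print(f"{service}: ${cost:,.2f}/month")
```

At these prices the ratio between EFS and S3 is the same at any size, which is where the factor of ten comes from.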

Concrete Use Cases

What’s worth doing for a factor of ten in cost? Well, it’s probably not your first choice if you’re handling petabytes of user data in a typical consumer-facing application. But there are a couple of scenarios where I’d consider it.

Some Read Benchmarks

I spent a few minutes doing some simple benchmarks on large files I had lying around (6GB and 9GB seismic datasets). Without lots of measurements under different usage patterns, especially heavy ones, these need to be taken with a grain of salt. But without any particular effort, it appears that throughput of 100MB/s and over is easily achievable with EFS. We did a few quick and dirty measurements on read performance.

S3 -> EFS

Copying 9GB from S3 to EFS on a single machine with 10GigE connections sustained 47MB/s. My guess is this was about 50MB/s coming in from S3, and 50MB/s going out, i.e. neither S3 nor EFS was connecting to my machine with more than 1GigE connections. This was using the AWS SDK, so it should automatically be doing multi-threaded fetching and the usual tricks to get maximum speed from S3.

cat EFS

Reading 9GB from EFS using that age-old file system benchmark, cat file > /dev/null, clocked in at 105MB/s from one machine with “high” network performance (an m4.xlarge).
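If you would rather get the number directly than time cat by hand, the same measurement can be sketched in a few lines of Python. This is not the script used for the figures above, just an equivalent under some assumptions: the function names and the 1MiB chunk size are my own choices, and 1MB is counted as 10^6 bytes.

```python
import time


def timed_sequential_read(path, chunk_size=1 << 20):
    """Read the file front to back, discarding the data.

    Returns (bytes_read, elapsed_seconds).
    """
    bytes_read = 0
    start = time.perf_counter()
    # buffering=0 avoids Python's own buffer on top of the OS page cache.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            bytes_read += len(chunk)
    elapsed = time.perf_counter() - start
    return bytes_read, elapsed


def throughput_mb_s(bytes_read, seconds):
    """Convert a byte count and a duration into MB/s (1MB = 10**6 bytes)."""
    return bytes_read / seconds / 1e6
```

Point timed_sequential_read at a file under your EFS mount and pass the result to throughput_mb_s. As with cat, a second run on the same machine will mostly measure the local page cache rather than EFS.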

multi-machine cat EFS

With three of those instances all reading the same file from EFS (all in the same availability zone), they each averaged 42MB/s, for a total bandwidth of over 125MB/s. In cases where I read from multiple machines, though, the first machine that started the read seemed to have a 10-20% throughput advantage over the others. This certainly suggests that multiple machines could stream data at high speeds if data were spread out properly on EFS, but there are no configuration switches for affecting this yourself. Some further testing comparing one file versus many files would be revealing.

Strided Scanning

A piece of code which mmaps a 6.1GB file and reads values spaced every 8.4k took 61 seconds to complete, which again hits the magic number of 100MB/s, or basic sequential reading throughput. (The identical test on a 2013 MacBook Pro with an SSD took only 37 seconds, or more like 165MB/s.) This is expected for a typical SSD-based system whose block size is sure to be at least 4k, if not 8k or more. Each read of just 4 bytes pulls in a 4k page, so you don’t do any better than just reading the whole file.
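For the curious, a strided scan of this kind looks roughly like the Python sketch below. It is not the original test code. The stride of 8,400 bytes is my reading of “8.4k”, and treating each 4-byte value as a little-endian float32 is an assumption, since the post does not say how the seismic samples are encoded.

```python
import mmap
import struct


def strided_scan(path, stride=8400, fmt="<f"):
    """Memory-map a file and read one value every `stride` bytes.

    fmt is a struct format string; "<f" (little-endian float32) is an
    assumption about the data, not something the file itself tells us.
    Returns (values_read, sum_of_values).
    """
    value_size = struct.calcsize(fmt)
    count = 0
    total = 0.0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Each unpack touches only a few bytes, but the OS still has to
            # fault in the whole page that contains them.
            for offset in range(0, len(m) - value_size + 1, stride):
                (value,) = struct.unpack_from(fmt, m, offset)
                total += value
                count += 1
    return count, total
```

Because the stride is larger than a 4k page, almost every read lands on a page that has not been touched yet, which is why the scan costs about as much as reading the file end to end.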

Random Access

Reading 10,000 values at random locations in the 6.1GB file took only 455ms after flushing local file caches. (That is a seek time of just 45µs, which is at or better than the optimal RTT for a GigE connection. Opportunistic reading or lucky random numbers may have necessitated slightly fewer reads.) Clearly the file was still cached on the EFS server, which is a powerful benefit. (The same test run a second time of course produces a 204ns seek time, which is on the order of RAM seek times, indicating it’s cached on the local server.)
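A random-access test along these lines can be sketched as follows. Again this is an illustration rather than the code behind the numbers above: the function name, the use of os.pread, and the optional seed are my own choices, and the sketch does not flush the local page cache for you.

```python
import os
import random
import time


def random_read_latency_us(path, reads=10_000, value_size=4, seed=None):
    """Read `reads` values of `value_size` bytes at random offsets.

    Returns the mean time per read in microseconds.
    """
    rng = random.Random(seed)
    file_size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for _ in range(reads):
            offset = rng.randrange(0, file_size - value_size + 1)
            # pread reads at an absolute offset without moving the file position.
            os.pread(fd, value_size, offset)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed / reads * 1e6
```

With the figures above, 455ms over 10,000 reads works out to roughly 45µs per read. Run it twice in a row without dropping caches and you should see the second number collapse, as described.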

Testing heavy random access from multiple nodes will be an interesting test for another day.

Setup

Setting up an elastic file system is extremely easy. To start playing, just use the point-and-click web interface in the AWS console. There was really only one point that was poorly documented. When setting up an elastic file system, you are told to choose a security group for the file system. Recall that this security group basically sets up a firewall for the system. It’s unlikely your default security group allows inbound traffic on the NFS port (2049), so you’ll want to set up a security policy allowing this from whatever machines you intend to use it from. Likewise, the machines on which you’d like to mount the EFS need to enable outbound traffic on port 2049.

Finally, it’s no fun having the EFS disappear when a machine reboots, so consider adding the EFS mount to the /etc/fstab of your machine image. Unfortunately, you have different mount points per availability zone, so it’s a little harder to bake into a per-region AMI as one usually does. It may be wise to configure this in a provisioning script to be run on boot. The magic line for your /etc/fstab is:

us-west-2a.fs-12345678.efs.us-west-2.amazonaws.com:/ /efs nfs4 defaults,nofail,nobootwait 0 2

(obviously replacing with whatever your actual availability zone and filesystem id are!)
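Since the hostname depends on the availability zone, a provisioning script could build the line rather than hard-code it. Here is a minimal sketch, assuming the hostname always follows the pattern shown above and that the region is the availability zone name minus its trailing letter; the function name and the default /efs mount point are illustrative.

```python
def efs_fstab_line(availability_zone, filesystem_id, mount_point="/efs"):
    """Build an /etc/fstab entry in the format shown above.

    Assumes the region is the availability zone without its final letter,
    e.g. "us-west-2a" -> "us-west-2".
    """
    region = availability_zone[:-1]
    host = f"{availability_zone}.{filesystem_id}.efs.{region}.amazonaws.com"
    return f"{host}:/ {mount_point} nfs4 defaults,nofail,nobootwait 0 2"


print(efs_fstab_line("us-west-2a", "fs-12345678"))
```

On a running instance, the availability zone can be read from the EC2 instance metadata service, so a boot script could look it up and append the resulting line to /etc/fstab if it is not already there.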

I’m looking forward to letting some more applications rip on this file system to see just which ones are worth the price tag. Let us know if you have an interesting use case!
