Trading Fish The blog of Hector Castro

Preparing EC2 Instance Store with cloud-init

Most Amazon Machine Images (AMIs) are backed by an Elastic Block Store (EBS) volume. This volume houses the operating system and any additional software added to the machine image. When you launch an instance of an EBS-backed AMI, the resulting EC2 instance usually includes some amount of instance store as well. Instance store is fast (relative to EBS), but it is also temporary and physically attached to the virtual machine host.

Unprepared Instance Store

Instance store is associated with an EC2 instance via a block device mapping. Usually, instance store mappings carry virtual device names of ephemeral0 through ephemeralN and are pre-formatted as ext3. Unfortunately, SSD-based instance store with TRIM support (only r3.* and i2.* instances right now) does not come pre-formatted with ext3.
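
To make the idea of a block device mapping concrete, here is a rough sketch of how two instance store volumes might be mapped in a CloudFormation template written in YAML. The use of CloudFormation, the resource name, the AMI ID placeholder, the instance type, and the /dev/sdb and /dev/sdc device names are all assumptions for illustration, not part of the setup described in this post.

```yaml
# Hypothetical CloudFormation fragment; every name and value is a placeholder.
Resources:
  WorkerInstance:                # hypothetical resource name
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-xxxxxxxx      # placeholder, substitute a real AMI ID
      InstanceType: i2.2xlarge   # assumed: a type with two SSD instance store volumes
      BlockDeviceMappings:
        - DeviceName: /dev/sdb   # assumed device name
          VirtualName: ephemeral0
        - DeviceName: /dev/sdc   # assumed device name
          VirtualName: ephemeral1
```

The VirtualName values are the same ephemeral0 and ephemeral1 names that cloud-init refers to later on.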

If you’re dealing with instance store that’s not pre-formatted, or you want to use a filesystem other than ext3, how do you remedy that elegantly inside of EC2? One possible answer is a set of cloud-init directives via EC2 user data.

User Data and cloud-init

Before launching an EC2 instance, you can provide it with a bit of user data. User data can either be a shell script or a set of cloud-init directives.

Using the fs_setup cloud-init module, formatting a pair of SSD volumes looks something like:

fs_setup:
   - label: ephemeral0
     filesystem: ext3
     extra_opts: [ "-E", "nodiscard" ]  # Skip discarding blocks at mkfs time
     device: ephemeral0
     partition: auto
   - label: ephemeral1
     filesystem: ext3
     extra_opts: [ "-E", "nodiscard" ]
     device: ephemeral1
     partition: auto

After the volumes are formatted, you probably also want to mount them somewhere. The mounts module can handle that:

mounts:
 - [ ephemeral0, null ]  # Override any default EC2 mounting behavior
 - [ ephemeral1, null ]  # Override any default EC2 mounting behavior
 - [ ephemeral0, "/media/ephemeral0", "ext3", "defaults,nobootwait,discard", "0", "2" ]
 - [ ephemeral1, "/media/ephemeral1", "ext3", "defaults,nobootwait,discard", "0", "2" ]
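
One caveat: nobootwait is an option understood by mountall on Upstart-era Ubuntu releases. On images that boot with systemd, the closest equivalent is nofail, so the same entries might look like the sketch below. This assumes a systemd-based image; everything else is carried over from the entries above.

```yaml
mounts:
  - [ ephemeral0, null ]  # Override any default EC2 mounting behavior
  - [ ephemeral1, null ]  # Override any default EC2 mounting behavior
  # nofail lets boot continue if the volume is missing (assumes systemd)
  - [ ephemeral0, "/media/ephemeral0", "ext3", "defaults,nofail,discard", "0", "2" ]
  - [ ephemeral1, "/media/ephemeral1", "ext3", "defaults,nofail,discard", "0", "2" ]
```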

Lastly, we can change the user and group for these mounts with runcmd so that users other than root (here I’m using hdfs) can read and write to them:

runcmd:
 - [ chown, hdfs, "/media/ephemeral0" ]
 - [ chgrp, hdfs, "/media/ephemeral0" ]
 - [ chown, hdfs, "/media/ephemeral1" ]
 - [ chgrp, hdfs, "/media/ephemeral1" ]
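
As a more compact alternative, chown accepts an owner:group pair and more than one path, so the four commands above could probably be collapsed into one. This is only a sketch, and like the original it assumes the hdfs user and group already exist by the time runcmd runs.

```yaml
runcmd:
  # Set both owner and group on both mount points in a single call
  - [ chown, "hdfs:hdfs", "/media/ephemeral0", "/media/ephemeral1" ]
```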

Put all of these snippets together in a .yml file with #cloud-config at the top, and it’s ready to be passed as user data when launching new EC2 instances. With any luck, the result is a few nicely formatted and mounted instance store volumes.
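
For reference, the assembled file might look roughly like this. It is just the three snippets from this post stacked under the #cloud-config header; the hdfs user is still only an example, and the device names assume two instance store volumes are mapped as ephemeral0 and ephemeral1.

```yaml
#cloud-config

# Format both instance store volumes as ext3
fs_setup:
  - label: ephemeral0
    filesystem: ext3
    extra_opts: [ "-E", "nodiscard" ]
    device: ephemeral0
    partition: auto
  - label: ephemeral1
    filesystem: ext3
    extra_opts: [ "-E", "nodiscard" ]
    device: ephemeral1
    partition: auto

# Mount the freshly formatted volumes
mounts:
  - [ ephemeral0, null ]  # Override any default EC2 mounting behavior
  - [ ephemeral1, null ]  # Override any default EC2 mounting behavior
  - [ ephemeral0, "/media/ephemeral0", "ext3", "defaults,nobootwait,discard", "0", "2" ]
  - [ ephemeral1, "/media/ephemeral1", "ext3", "defaults,nobootwait,discard", "0", "2" ]

# Hand the mount points over to a non-root user (hdfs is an example)
runcmd:
  - [ chown, hdfs, "/media/ephemeral0" ]
  - [ chgrp, hdfs, "/media/ephemeral0" ]
  - [ chown, hdfs, "/media/ephemeral1" ]
  - [ chgrp, hdfs, "/media/ephemeral1" ]
```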