Amazon Web Services Tips
Build a Custom Amazon EC2 Machine Image - (CentOS 6.2)
by Jeff Hunter, Sr. Database Administrator
There is no shortage of available Amazon Elastic Compute Cloud (EC2) Machine Images (AMIs). Oftentimes, however, finding an image from the community AMIs that meets your particular needs can be a challenge. In many cases the image is bloated, provides too much customization, performs poorly, or lacks any type of reasonable documentation, not to mention the inherent security concerns associated with some third-party AMIs.
In this article, I will demonstrate how to create your own instance store-backed (a.k.a. S3-backed) and EBS-backed Amazon EC2 image of CentOS 6.2 (64-bit) with its own kernel. Creating your own AMI allows you to make the most of Amazon EC2 and provides better control over performance, security, and reproducibility. Your AMIs become the basic unit of deployment, which allows you to rapidly boot new custom instances as you need them.
There are two methods to prepare your own custom Amazon EC2 AMIs for Linux/UNIX systems:
1. Launch an existing public AMI and modify it according to your requirements.
2. Build a fresh installation, either on a stand-alone machine or on an empty file system mounted by loopback.
Although preparing a new AMI from an existing one is often the easiest method, this guide will document the procedures to create a new AMI from scratch using a fresh OS install of CentOS 6.2 (64-bit) on an empty file system mounted by loopback.
Creating AMIs through a loopback involves performing a full operating system installation on a clean root file system. Using an empty file system mounted by loopback avoids having to create a new root disk partition and file system on a separate physical disk. After installing the operating system, the resulting image gets bundled as an AMI with the ec2-bundle-image command. Note that this command is part of the Amazon EC2 AMI Tools (and not the Amazon EC2 API Tools). Finally, the new AMI will be uploaded as an instance store-backed image and registered with Amazon EC2 using the command-line tools. Instructions will also be provided at the end of this guide to convert the instance store-backed AMI to an EBS-backed AMI.
Ensure the following prerequisites have been met before creating your new Amazon EC2 image. Although you may have already fulfilled some of the requirements discussed below, please do not skip over this section as it contains important configuration information needed to follow along with the examples presented in this guide. For example, it covers which Amazon AWS services to sign up for, how to prepare the build machine, and which environment variables to set before creating the new AMI.
Amazon AWS Account
Obviously the first requirement is to create an AWS Account from the Amazon Web Services site if you don't already have one. Creating an AWS account is free; however, you will need to provide a credit card when signing up for the Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), and Elastic Block Store (EBS) services (discussed below).
AWS Account Number
The account number (sometimes called the account ID) shows up when you go to the Account Activity area of the AWS web site. The account number is a 12-digit number that appears in the top-right of the Account Activity page, displayed as groups of digits separated by hyphens.
When you use the account number in the context of the APIs, you should leave out the hyphens and just enter the 12 digits.
In this guide, your AWS account number will be assigned to the environment variable AWS_ACCOUNT_NUMBER on the build machine.
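As a minimal sketch of the hyphen rule above, the helper below strips everything except digits before the value is exported. The function name normalize_account_number and the sample number are hypothetical and are not part of the original guide.

```shell
# Hypothetical helper: strip hyphens (and any other non-digits) from an
# AWS account number so it can be passed to the EC2 tools.
normalize_account_number() {
    printf '%s\n' "$1" | tr -cd '0-9\n'
}

# Placeholder value, not a real account number.
AWS_ACCOUNT_NUMBER=$(normalize_account_number "1234-5678-9012")
export AWS_ACCOUNT_NUMBER
```

The result can be checked with echo "$AWS_ACCOUNT_NUMBER", which should show the 12 digits with no separators.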
AWS EC2 Service
Sign up for the Amazon Elastic Compute Cloud (Amazon EC2) service if you haven't already done so.
AWS S3 Service
The AMI created in this guide will be instance store-backed (a.k.a. S3-backed) and requires you to be signed up for the Amazon Simple Storage Service (S3) service.
AWS EBS Service
If you intend to convert the instance store-backed AMI to an EBS-backed AMI then sign up for the Elastic Block Store (EBS) service.
AWS Access Key ID and Secret Access Key
The AWS Access Key ID and Secret Access Key serve the purpose of ID and Password to access Amazon S3. Navigate to Security Credentials, click on the Access Keys tab under Access Credentials to create or view your Access Key ID and Secret Access Key.
The Access Key and Secret Key will be assigned to the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY respectively on the build machine.
EC2 Private Key File and EC2 Certificate File
If you have not already created an X.509 Certificate, you need to create or upload one from the AWS Management Console. Navigate to Security Credentials, click on the X.509 Certificates tab under Access Credentials, and click "Create a new Certificate".
Important: After creating an X.509 Certificate, make sure to download the Private Key file before navigating away from the "X509 Certificate Created" window. AWS does not store your private key information and you will not be able to download the private key file at any other time. If you do not have access to your private key file, you will have to create a new certificate and private key.
For the purpose of this guide, I will be renaming my private key file from pk-2L7LZYRTNEAC4KGZMPPZWAOZ4KYCTCA4.pem to ec2-pk.pem.
The private key file name and path will be assigned to the environment variable EC2_PRIVATE_KEY on the build machine.
Once you have a registered X.509 Certificate, it can always be downloaded from the AWS Management Console by navigating to Security Credentials and clicking on the X.509 Certificates tab under Access Credentials.
As mentioned already, you will need to have the private key file associated with the X.509 certificate. Amazon does not store your private key information. If you do not have access to your private key file, you will have to create a new certificate and private key.
For the purpose of this guide, I will be renaming my certificate file from cert-2L7LZYRTNEAC4KGZMPPZWAOZ4KYCTCA4.pem to ec2-cert.pem.
The certificate file name and path will be assigned to the environment variable EC2_CERT on the build machine.
Build CentOS Machine
Build a CentOS 6.2 machine on which the new image will be created. The CentOS install only needs to include the base packages through a Minimal package installation. In this guide, I am using a physical machine (not imaged) as the build machine. The environment must have network access in order to download all necessary Linux packages using yum.
The remaining prerequisites in this section deal with preparing a basic environment on the CentOS 6.2 build machine that will be used to create the new image.
Set Environment Variables
After creating the CentOS Build Machine, add the following environment variables to your shell login script (i.e. /root/.bashrc).
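A sketch of what those login script entries might look like is shown below. The variable names come from this guide and from the EC2 tools (EC2_HOME and EC2_AMITOOL_HOME); every value and path is a placeholder assumption and needs to be replaced with your own.

```shell
# Sketch of /root/.bashrc additions. All values and paths are placeholders.
export AWS_ACCOUNT_NUMBER="123456789012"          # 12 digits, no hyphens
export AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY"

# Assumed locations for the X.509 files and the unpacked tools.
export EC2_PRIVATE_KEY="/root/.ec2/ec2-pk.pem"
export EC2_CERT="/root/.ec2/ec2-cert.pem"
export EC2_HOME="/opt/ec2/api-tools"
export EC2_AMITOOL_HOME="/opt/ec2/ami-tools"
export EC2_URL="https://ec2.us-east-1.amazonaws.com"

# Assumed Java location; adjust to match the installed Java package.
export JAVA_HOME="/usr/lib/jvm/jre"
export PATH="$PATH:$EC2_HOME/bin:$EC2_AMITOOL_HOME/bin"
```

Log out and back in (or source the file) so the variables are available to the commands used in the rest of the guide.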
Install Required Linux Packages
Install the following Linux packages required to build images.
Copy EC2 Private Key and Certificate
Copy your EC2 private key and X.509 certificate to the following directories on the build machine:
Download the Amazon EC2 Tools
Download the Amazon EC2 API Tools.
Download the Amazon EC2 AMI Tools.
The Amazon EC2 API Tools are Java based (the AMI Tools are written in Ruby), so verify the JAVA_HOME environment variable is set and confirm that Java is installed correctly.
Get to Know Your Region and Availability Zone
AWS infrastructure services like EC2, S3, EBS, etc. are hosted in multiple locations around the world which include the United States, South America, Europe, and Asia Pacific. These Regions are logically isolated from one another, so for example, you will not be able to access US East EC2 resources when communicating with a South America endpoint. When creating AWS services, users should choose a Region which optimizes latency, minimizes costs, or addresses particular regulatory requirements.
Figure 1: AWS Regions
Nearly every command or call you make to EC2 must target a specific Region. Amazon chooses a default Region based on the value of the environment variable EC2_URL (or the --url command-line flag) which you use to specify your Region's endpoint. For example:
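As an illustration, the sketch below builds the regional endpoint URL from a Region name and exports it as EC2_URL. The helper name ec2_endpoint is hypothetical; the URL pattern is the standard EC2 regional endpoint form.

```shell
# Hypothetical helper: print the EC2 endpoint URL for a Region name.
ec2_endpoint() {
    printf 'https://ec2.%s.amazonaws.com\n' "$1"
}

# Point the command-line tools at the us-east-1 Region.
EC2_URL=$(ec2_endpoint us-east-1)
export EC2_URL
```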
The default Region for the above endpoint is us-east-1 and will be the one I use throughout this guide based on my geographic location near the east coast.
Most of the commands in the Amazon EC2 API and AMI Tools accept a --region command-line parameter, which overrides the default Region derived from the endpoint specified by the EC2_URL environment variable or the --url flag. For example:
When running the Amazon EC2 API and AMI Tools in this guide, I will not be manually specifying a Region using the --region command-line parameter and will allow the default Region to be chosen based on the endpoint specified by the EC2_URL environment variable.
To obtain a list of regions you have access to, use:
When you launch an EC2 instance, you can optionally specify an Availability Zone within your Region. If you do not specify an Availability Zone, Amazon EC2 selects one for you in the Region that you are using. When launching your initial instances, Amazon recommends accepting the default Availability Zone, which allows Amazon EC2 to select the best Availability Zone for you based on system health and available capacity.
When launching EC2 instances in this guide, I will not manually specify the Availability Zone and will allow EC2 to select the recommended one. When converting the instance store-backed AMI to an EBS-backed AMI, it is important to know which Availability Zone your EC2 instance is running in when creating the EBS volume. When creating an EBS volume, you must specify an Availability Zone and that Availability Zone for the volume must be the same as the instance to which it attaches.
To obtain a list of availability zones for your region, use:
To learn more, see Amazon's documentation on Regions and Availability Zones.
With the prerequisites out of the way, it's time to start preparing the new instance store-backed AMI which involves building an operating system installation from scratch to a clean root file system. This will be performed on a stand-alone physical machine (known in this guide as the build machine) installed with CentOS 6.2 (64-bit). Later in this guide, we will convert the instance store-backed AMI to an EBS-backed AMI.
In this section we will perform the tasks necessary to prepare an Amazon EC2 instance store-backed AMI on an empty file system mounted by loopback. This involves doing a full operating system installation on a clean root file system while avoiding the need to create a new root disk partition and file system on a separate physical disk.
Start by creating a disk image with an empty ext4 file system mounted by loopback. The loopback module enables you to use a normal file as if it were a raw device. Think of it as a file system within a file. Mounting a file system image file through loopback presents it as part of the normal file system. You can then modify it using your favorite file management tools and utilities.
Make sure to create a disk image file large enough to host the operating system, tools, and applications that you will install. For example, a baseline Linux/UNIX installation requires about 700 MB, so your file should be at least 1 GB.
The following example creates an empty 10 GiB file system mounted by loopback.
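One way to sketch this step is shown below. The helper name create_image_file and the image path under /opt/ec2/images are assumptions; the helper relies on GNU dd and only sets the file length (a sparse file), so it runs quickly. Formatting and mounting require root and are listed as comments.

```shell
# Hypothetical helper: create a sparse image file of the given size in MiB.
# Assumes GNU dd (bs=1M); no data blocks are written, only the length is set.
create_image_file() {
    dd if=/dev/zero of="$1" bs=1M count=0 seek="$2" 2>/dev/null
}

# The remaining steps need root and are shown for orientation only:
#   create_image_file /opt/ec2/images/centos-6.2-x86_64.img 10240
#   mkfs.ext4 -F /opt/ec2/images/centos-6.2-x86_64.img
#   mkdir -p /mnt/ec2-image
#   mount -o loop /opt/ec2/images/centos-6.2-x86_64.img /mnt/ec2-image
```

A size of 10240 MiB corresponds to the 10 GiB image used in this guide.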
Before installing the operating system, prepare the image by creating directories in the root file system to hold system files and devices.
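A minimal sketch of that directory preparation follows. The helper name prepare_image_root and the exact directory list are assumptions based on what a yum installation into an empty root typically needs.

```shell
# Hypothetical helper: create the skeleton directories in the image root.
# $1 = image root (for example /mnt/ec2-image)
prepare_image_root() {
    for dir in dev etc proc sys var/cache var/log var/lock var/lib/rpm; do
        mkdir -p "$1/$dir" || return 1
    done
}
```

It would be called as prepare_image_root /mnt/ec2-image once the loopback file system is mounted.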
Populate the /dev directory with a minimal set of devices. Ignore any MAKEDEV: mkdir: File exists warnings.
Mount dev, pts, shm, proc, and sys in the new root file system.
With the system directories and device files in place, you are ready to install the CentOS operating system in the image file. Depending on the speed of the host and network link to the repository, this process might take a while.
Create a yum configuration file (e.g. yum-xen.conf) that you will use to install the base OS. The configuration file ensures that all the required basic packages and utilities are installed. You can locate this file anywhere on your main file system (not on your loopback file system); it is used only during installation.
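The sketch below writes one possible version of such a file. The helper name write_yum_conf and the repository definitions are assumptions rather than the original listing; in particular, the mirrorlist URLs may need to be replaced with an archive mirror for a release as old as CentOS 6.

```shell
# Hypothetical helper: write a stand-alone yum configuration file.
# $1 = path of the configuration file to create
write_yum_conf() {
    cat > "$1" <<'EOF'
[main]
cachedir=/var/cache/yum
debuglevel=2
logfile=/var/log/yum.log
exclude=*-debuginfo
gpgcheck=0
obsoletes=1
reposdir=/dev/null

[base]
name=CentOS-6 - Base
mirrorlist=http://mirrorlist.centos.org/?release=6&arch=x86_64&repo=os
enabled=1

[updates]
name=CentOS-6 - Updates
mirrorlist=http://mirrorlist.centos.org/?release=6&arch=x86_64&repo=updates
enabled=1
EOF
}
```

Setting reposdir=/dev/null keeps yum from picking up the build machine's own repository files, so only the repositories listed here are used. The file is then passed to yum with the -c option together with --installroot=/mnt/ec2-image.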
Install the Base package and supporting utilities to the image using the yum configuration file. We need e2fsprogs for the fsck.ext4 command. We install yum-plugin-fastestmirror.noarch so that yum tests for a faster repository mirror rather than connecting to one at random.
At this point you can add any other yum or rpm installs to the image.
After successfully installing the base operating system, the next step is to configure your networking, hard drives, and security settings to work in the Amazon EC2 environment.
Create a shell login script on the new image for the root account.
Configure networking options for the image.
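A minimal sketch of the network files is given below, assuming a single DHCP-configured eth0 interface, which is how EC2 instances of this era obtain their address. The helper name write_network_config and the exact option values are assumptions.

```shell
# Hypothetical helper: write basic network configuration into the image.
# $1 = image root (for example /mnt/ec2-image)
write_network_config() {
    mkdir -p "$1/etc/sysconfig/network-scripts" || return 1

    cat > "$1/etc/sysconfig/network" <<'EOF'
NETWORKING=yes
HOSTNAME=localhost.localdomain
EOF

    cat > "$1/etc/sysconfig/network-scripts/ifcfg-eth0" <<'EOF'
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Ethernet
USERCTL=yes
PEERDNS=yes
IPV6INIT=no
EOF
}
```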
Ensure that the network will be started on boot.
CentOS comes with SELinux set to enforcing by default; however, in some cases the file system doesn't get labelled correctly depending on the instance being created. It is best to assume that the instance is not properly labelled on its first start. Run the following to ensure labelling is executed on the first start of the instance.
If you decide not to enable SELinux, this can be set by modifying the file /mnt/ec2-image/etc/sysconfig/selinux as follows:
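Both choices can be sketched with one helper, shown below. The name configure_selinux is hypothetical. The enforcing branch relies on the standard /.autorelabel marker file; the disabled branch rewrites the SELINUX= line and writes through the file rather than replacing it, since /etc/sysconfig/selinux is normally a symbolic link.

```shell
# Hypothetical helper: prepare the image for SELinux enforcing or disabled.
# $1 = image root, $2 = "enforcing" or "disabled"
configure_selinux() {
    cfg="$1/etc/sysconfig/selinux"
    case "$2" in
        enforcing)
            # An empty /.autorelabel file triggers a full relabel on next boot.
            touch "$1/.autorelabel"
            ;;
        disabled)
            mkdir -p "$1/etc/sysconfig" || return 1
            if [ -f "$cfg" ]; then
                tmp=$(mktemp) || return 1
                sed 's/^SELINUX=.*/SELINUX=disabled/' "$cfg" > "$tmp" &&
                    cat "$tmp" > "$cfg"
                rm -f "$tmp"
            else
                printf 'SELINUX=disabled\nSELINUXTYPE=targeted\n' > "$cfg"
            fi
            ;;
        *)
            echo "usage: configure_selinux <root> enforcing|disabled" >&2
            return 1
            ;;
    esac
}
```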
Create an fstab file and any optional mount points for the image that supports the target Amazon EC2 instance type. Although the example fstab file demonstrated in this guide will allow an image to successfully boot from the volume attached for the root device for all Amazon EC2 instance types, it will only correctly mount the local instance storage (a.k.a. ephemeral storage) for the following instance types:
The following example fstab file will mount a single instance storage volume to /mnt/vol1 as well as swap (/dev/xvde3). For a Small instance type, the local instance storage is 160 GiB while for a Medium instance type, the local instance storage is 410 GiB. The amount of local instance storage included with all other Amazon EC2 instance types as well as several Amazon EC2 Instance Storage Usage Scenarios can be found at Amazon's AWS documentation website.
First, create a mount point for each local instance storage volume necessary for your configuration according to EC2 instance type. This example will only include a single instance storage volume which is suitable for Small and Medium size instance types.
Next, create the fstab file for the image. Since Amazon uses Xen drivers, the first drive starts at /dev/xvde.
Small and Medium Instance Types
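A sketch of such an fstab is shown below. The swap device (/dev/xvde3) and the /mnt/vol1 mount point come from the text above; the root device /dev/xvde1, the instance storage device /dev/xvde2, and its ext3 file system type are assumptions and should be checked against your instance type. The helper name write_fstab is hypothetical.

```shell
# Hypothetical helper: create the mount point and fstab in the image.
# $1 = image root (for example /mnt/ec2-image)
write_fstab() {
    mkdir -p "$1/etc" "$1/mnt/vol1" || return 1
    cat > "$1/etc/fstab" <<'EOF'
/dev/xvde1  /          ext4    defaults,noatime  1 1
/dev/xvde2  /mnt/vol1  ext3    defaults,noatime  0 0
/dev/xvde3  swap       swap    defaults          0 0
tmpfs       /dev/shm   tmpfs   defaults          0 0
devpts      /dev/pts   devpts  gid=5,mode=620    0 0
sysfs       /sys       sysfs   defaults          0 0
proc        /proc      proc    defaults          0 0
EOF
}
```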
You can learn more about block device mapping at Amazon's AWS documentation website.
Create a grub configuration file for the image and boot settings so the Amazon Kernel Image (AKI) can boot into the new kernel.
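The sketch below generates a menu.lst for whichever kernel was installed into the image. The helper name write_menu_lst is hypothetical, and the root device (/dev/xvde1), the (hd0) disk reference for an unpartitioned image, and the initramfs file naming are assumptions that should be verified against the contents of the image's /boot directory.

```shell
# Hypothetical helper: write /boot/grub/menu.lst for the newest kernel found.
# $1 = image root (for example /mnt/ec2-image)
write_menu_lst() {
    kernel_path=$(ls "$1"/boot/vmlinuz-* 2>/dev/null | sort | tail -n 1)
    if [ -z "$kernel_path" ]; then
        echo "no kernel found under $1/boot" >&2
        return 1
    fi
    # Strip everything up to and including "vmlinuz-" to get the version.
    kver=${kernel_path##*/vmlinuz-}

    mkdir -p "$1/boot/grub" || return 1
    cat > "$1/boot/grub/menu.lst" <<EOF
default=0
timeout=0
hiddenmenu
title CentOS ($kver)
    root (hd0)
    kernel /boot/vmlinuz-$kver ro root=/dev/xvde1
    initrd /boot/initramfs-$kver.img
EOF
}
```

Deriving the version from the installed kernel file avoids hard-coding a version string that changes whenever the kernel package is updated.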
Add the following entries to the image's sshd_config file in order to allow root login without password. This is helpful since I intend to use a private key to log in to the instance.
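As a sketch, the helper below appends the relevant settings to the image's sshd_config. PermitRootLogin without-password is the OpenSSH setting that permits root to log in with a key but not a password; UseDNS no is an optional extra assumed here to avoid reverse DNS delays at login. The helper name configure_sshd is hypothetical.

```shell
# Hypothetical helper: append SSH daemon settings to the image.
# $1 = image root (for example /mnt/ec2-image)
configure_sshd() {
    mkdir -p "$1/etc/ssh" || return 1
    cat >> "$1/etc/ssh/sshd_config" <<'EOF'
UseDNS no
PermitRootLogin without-password
EOF
}
```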
Create a script that captures the public key credentials for your root login from instance metadata. In this example, public key 0 (in the OpenSSH key format) is fetched from instance metadata using HTTP and written to /root/.ssh/authorized_keys in order to allow root to log in without a password using the matching private key.
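The core of such a script might look like the sketch below. The metadata URL is the standard EC2 location for public key 0; the function names, the use of curl, and the retry count are assumptions, and the init script wrapper (start/stop handling and chkconfig header) is left out.

```shell
# Instance metadata location of public key 0 in OpenSSH format.
KEY_URL="http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key"

# Hypothetical helper: add a public key to authorized_keys if not present.
# $1 = key text, $2 = .ssh directory
install_root_key() {
    [ -n "$1" ] || return 1
    mkdir -p "$2" && chmod 700 "$2" || return 1
    touch "$2/authorized_keys" && chmod 600 "$2/authorized_keys" || return 1
    # Append only when the exact key line is not already in the file.
    grep -qxF -e "$1" "$2/authorized_keys" ||
        printf '%s\n' "$1" >> "$2/authorized_keys"
}

# Hypothetical entry point: fetch the key over HTTP and install it for root.
fetch_root_key() {
    key=$(curl -sf --retry 3 "$KEY_URL") || return 1
    install_root_key "$key" /root/.ssh
}
```

Checking for an existing entry keeps the file from growing by one duplicate line on every boot.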
Update the runlevel information for the new system service on the image.
Clean up the image.
Now that the image is created, it must be bundled using ec2-bundle-image, which compresses, encrypts, and then splits it to prepare for upload to S3.
When we go to register the AMI with Amazon EC2 later in this guide, we must set the default kernel as one which supports the GRUB boot loader. To enable user-provided kernels, Amazon has published Amazon Kernel Images (AKIs) that use a system called PV-GRUB. PV-GRUB is a paravirtual "mini-OS" that runs a version of GNU GRUB, the standard Linux boot loader. PV-GRUB selects the kernel to boot by reading /boot/grub/menu.lst from your image which we configured earlier in this guide. It will load the kernel specified by your image (the CentOS 6.2 kernel) and then shut down the "mini-OS", so that it no longer consumes any resources. One of the advantages of this solution is that PV-GRUB understands standard grub.conf or menu.lst commands, which allows it to work with most existing Linux distributions. I'm going to bundle a default AKI with the image so that it is included in the manifest. I will then specify the AKI again when registering the AMI with Amazon EC2 later in this guide.
Several PV-GRUB AKIs are available depending on the type of the instance and the Region where it is located. There are AKIs for 32-bit and 64-bit architecture types, with each having one AKI for partitioned images and another AKI for partitionless images. You must choose an AKI with "hd0" in the name if you want a raw or unpartitioned disk image (most images). Choose an AKI with "hd00" in the name if you want an image that has a partition table.
Most vendors, such as Fedora, Red Hat, Ubuntu, and Novell, use unpartitioned disk images (a.k.a. partitionless images). This means that they use the hd0 variants of PV-GRUB; almost without exception, users will want to use the hd0 variants. The AMI created in this guide is an unpartitioned disk image and therefore will use an hd0 AKI.
Use the ec2-describe-images command (which is part of the EC2 API Tools) to check for the latest available published AKI for your instance type, disk layout, architecture, and Region.
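To narrow a long listing down to the relevant kernels, a small filter such as the sketch below can help. It assumes the tool prints tab-separated IMAGE lines with the AKI ID in the second field and the image name in the third; the function name select_hd0_x86_64_aki is hypothetical.

```shell
# Hypothetical filter: keep only 64-bit PV-GRUB AKIs for unpartitioned
# images (hd0, not hd00). Reads ec2-describe-images output on stdin.
select_hd0_x86_64_aki() {
    awk -F '\t' '$1 == "IMAGE" && $3 ~ /pv-grub-hd0_.*x86_64/ { print $2 "\t" $3 }'
}

# Example use (requires the EC2 API Tools and valid credentials):
#   ec2-describe-images -o amazon --filter "name=pv-grub-*.gz" | select_hd0_x86_64_aki
```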
From the above list, I am looking for a published AKI for an "unpartitioned disk image", "64-bit" architecture, in the "us-east-1" Region so I chose aki-88aa75e1.
We now have the information needed to bundle the image on the build machine to prepare for upload to S3.
Use ec2-bundle-image which is part of the EC2 AMI Tools.
The bundled AMI now needs to be uploaded to Amazon S3 before Amazon EC2 can access it. Use ec2-upload-bundle which is part of the EC2 AMI Tools.
The bundled AMI will be uploaded to Amazon S3 in a bucket specified using the --bucket parameter (i.e. idevelopment-amis/x86-64/Linux/CentOS/6.2). Amazon S3 stores data objects in buckets, which are similar to directories. Buckets must have globally unique names. The ec2-upload-bundle utility uploads the bundled AMI to the specified bucket. If the specified bucket does not exist, it is created. If the specified bucket exists and belongs to another AWS account, the ec2-upload-bundle command will fail.
Use the --manifest parameter to specify the full path to the manifest file created in the previous section. Note that the manifest file must be in a bucket in the same region where the AMI is to be created. The AMI manifest file and all image parts will be uploaded to Amazon S3. The manifest file is encrypted with the Amazon EC2 public key before being uploaded.
Include your AWS Access Key and your AWS Secret Key for S3 authentication using the --access-key and --secret-key parameters respectively.
The uploaded image now needs to be registered so that Amazon EC2 can locate it and run instances based on it.
Make certain to include the same AKI ID when using the ec2-register command that was used when bundling the image.
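The bundle, upload, and register steps can be sketched as three small wrappers, shown below. The bucket name and AKI ID are the ones used in this guide; the image path, bundle directory, and prefix are assumptions, and the run helper is a hypothetical convenience that prints each command instead of executing it when DRY_RUN is set.

```shell
# Print the command when DRY_RUN is set; otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

IMAGE_FILE="${IMAGE_FILE:-/opt/ec2/images/centos-6.2-x86_64.img}"  # assumed path
BUNDLE_DIR="${BUNDLE_DIR:-/opt/ec2/ami}"                            # assumed path
PREFIX="${PREFIX:-centos-6.2-x86_64}"                              # assumed name
BUCKET="idevelopment-amis/x86-64/Linux/CentOS/6.2"
AKI_ID="aki-88aa75e1"

# Compress, encrypt, and split the image (EC2 AMI Tools).
bundle_image() {
    run ec2-bundle-image --cert "$EC2_CERT" --privatekey "$EC2_PRIVATE_KEY" \
        --user "$AWS_ACCOUNT_NUMBER" --image "$IMAGE_FILE" --prefix "$PREFIX" \
        --destination "$BUNDLE_DIR" --arch x86_64 --kernel "$AKI_ID"
}

# Upload the manifest and image parts to S3 (EC2 AMI Tools).
upload_bundle() {
    run ec2-upload-bundle --bucket "$BUCKET" \
        --manifest "$BUNDLE_DIR/$PREFIX.manifest.xml" \
        --access-key "$AWS_ACCESS_KEY_ID" --secret-key "$AWS_SECRET_ACCESS_KEY"
}

# Register the uploaded manifest as an AMI (EC2 API Tools).
register_image() {
    run ec2-register "$BUCKET/$PREFIX.manifest.xml" --name "$PREFIX" \
        --architecture x86_64 --kernel "$AKI_ID"
}
```

Running DRY_RUN=1 bundle_image first is a cheap way to review the full command line before starting a bundle that can take several minutes.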
The ec2-register command returns an AMI Identifier (AMI ID), the value next to the IMAGE tag (ami-6418ba0d in the example). An AMI ID is a unique identifier for an individual image which is assigned by EC2 and used to run and manage instances.
You can now launch an instance of the new AMI using ec2-run-instances and specifying the image identifier (AMI ID) you received when you registered the image in the previous step.
Before launching an instance (and connecting to it), you will need an RSA Key Pair. Note that this is not the same as your X.509 private key and certificate used earlier when authenticating to AWS. If you haven't already created a Key Pair for Amazon EC2, use ec2-create-keypair from the build machine. The public key will get stored by Amazon EC2 and the private key will be displayed on the console. Remember that Amazon does not store your private key information. If you lose your private key, you will not be able to access any EC2 instance for which its public key partner was used.
The following creates a new Key Pair named idevelopment-ec2-key. Note that the private key displayed below is not my actual private key. It is nothing more than a set of random characters and is only being shown to display example output. It is not valid. However, if you have a lot of time on your hands, give it a try!
The private key is returned to the console as an unencrypted PEM-encoded RSA private key. Save the private key on any machine from which you will connect to the instance over SSH. Copy the contents of the private key from -----BEGIN RSA PRIVATE KEY----- to -----END RSA PRIVATE KEY-----, including those two lines, and save it to a key pair file. For example:
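Instead of copying and pasting by hand, the key block can be extracted from the command output with a small filter like the sketch below. The function name save_private_key and the output file name in the usage comment are hypothetical.

```shell
# Hypothetical helper: read text on stdin, keep only the RSA private key
# block (including the BEGIN/END lines), and save it readable by owner only.
# $1 = output file
save_private_key() {
    (
        umask 077
        sed -n '/^-----BEGIN RSA PRIVATE KEY-----$/,/^-----END RSA PRIVATE KEY-----$/p' > "$1"
    ) || return 1
    chmod 600 "$1" || return 1
    # Fail if no key block was found in the input.
    [ -s "$1" ]
}

# Example use (requires the EC2 API Tools and valid credentials):
#   ec2-create-keypair idevelopment-ec2-key | save_private_key idevelopment-ec2-key.pem
```

SSH clients refuse private key files that are readable by other users, which is why the file mode is restricted to 600.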
Create an AWS Security Group (if one doesn't exist) that will be used by the instance to authorize access via SSH (port 22) from the public internet 0.0.0.0/0 along with any optional services like ICMP (ping), HTTP (port 80), or HTTPS (port 443). You could further restrict access to these ports by specifying specific machines or networks (184.108.40.206/24 for example) that can access the instance instead of the public internet. At a minimum, authorize access for SSH.
Execute the ec2-run-instances command with the image identifier (AMI ID) that was returned by ec2-register in the previous section along with the instance type, key pair, and security group to launch the instance.
Select the Amazon EC2 instance type compatible with your configuration.
This will start a single instance (instance store-backed) based on your newly created AMI and provide you with an instance identifier, the value immediately to the right of the INSTANCE tag. The instance identifier can be used to monitor the status of the running instance and to confirm when the instance is available for access.
Check the status of the instance with ec2-describe-instances using the instance identifier.
It will take a while to start the instance, so don't panic when you can't ping or log in to the new instance immediately, even after the status shows running. This is especially true if you decided to use SELinux as shown in this guide. The re-labelling takes approximately 2-3 minutes to finish and happens long before networking and SSH are started.
Keep checking the instance description until the status returns running.
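Rather than re-running the command by hand, the check can be put in a loop, as in the sketch below. It assumes the lifecycle state appears as a plain word on the INSTANCE line of the ec2-describe-instances output; the function names and the DESCRIBE_CMD override (useful for trying the loop without AWS access) are hypothetical.

```shell
# Hypothetical filter: print the lifecycle state found on the INSTANCE line.
instance_state() {
    grep '^INSTANCE' |
        grep -Eo 'pending|running|shutting-down|terminated|stopping|stopped' |
        head -n 1
}

# Hypothetical helper: poll until the instance reports "running".
# $1 = instance ID, $2 = maximum attempts, $3 = seconds between attempts
wait_until_running() {
    attempt=0
    while [ "$attempt" -lt "$2" ]; do
        state=$(${DESCRIBE_CMD:-ec2-describe-instances} "$1" | instance_state)
        if [ "$state" = "running" ]; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$3"
    done
    return 1
}
```

For example, wait_until_running with an instance ID, 30 attempts, and a 10 second pause would check for up to five minutes before giving up.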
Finally, connect to the instance using an SSH client and the private key.
Verify local instance storage (ephemeral storage) and swap.
When you are done using the instance, it can be shut down. Given this is an instance store-backed AMI, we can only terminate the instance, which discards any changes made.
Terminated instances will remain visible in the AWS Console after termination for approximately one hour.
You have successfully built and deployed your very own custom AMI for CentOS 6.2, and launched an instance based on it. This custom AMI is private to your account. You can build as many custom AMIs as required and use them to launch as many instances as you need.
In the next section, I will provide instructions for converting the current instance store-backed AMI to an EBS-backed AMI.
As of yet, there is no simple API or button in the AWS Management Console that allows users to convert an existing Amazon EC2 instance store-backed AMI to an Amazon EBS-backed AMI. The process, however, is not terribly difficult and will be fully explained in this section.
The process for converting a Linux/UNIX Amazon EC2 instance store-backed AMI to an EBS-Backed AMI can be broken down into the following steps.
Now let's convert the instance store-backed AMI created in this guide to an EBS-backed AMI. Please note that the steps in this section can be performed using the AWS Management Console or the EC2 API Tools from the build machine. I will be demonstrating all of the steps using the EC2 API Tools.
Use the same methods described in the previous section to launch the instance store-backed AMI.
Verify that the instance is running and take note of the Availability Zone assigned to the instance. In this example, the instance was assigned to the us-east-1b Availability Zone. You'll need this when creating the EBS volume in the next section.
Create a new Amazon EBS volume. For the purpose of this example, I will create a 10 GiB volume that will be used as the root disk for any new EBS-backed instance. The volume size of 10 GiB is being used because that is the largest size for an instance store-backed AMI. At the end of this section I will demonstrate how to increase the volume size at run time.
The creation process may take a while depending on the size of the volume. The volume is ready when its status returns 'available'.
Attach the new EBS volume to the running instance that was launched from the instance store-backed AMI created earlier in this guide. The following will attach the volume and expose it as the specified device.
Verify the volume was successfully attached to the instance.
Log in to the running instance store-backed instance and create an ext4 file system on the partitionless EBS volume.
Create a mount point directory and mount the EBS volume.
Remove any local instance storage entries from /etc/fstab if they exist. Booting from an EBS volume does not use local instance storage by default. If you followed this guide to create the instance store-backed AMI, remove the local instance storage entry in /etc/fstab which mounts on /mnt/vol1.
Sync the root and dev file systems to the EBS volume.
Label the disk.
Flush all writes and unmount the volume.
From the build machine, detach the EBS volume from the instance.
Verify that the volume was successfully detached and has a status of 'available'.
Create a snapshot of the EBS volume and provide an optional description for the snapshot.
The snapshot may take a while to create. Check the progress of the snapshot creation until the status returns 'completed'.
Finally, delete the original EBS volume and register the new EBS-backed image with a block device mapping that maps the root device name to the previously created snapshot. Since this is a partitionless EBS volume with the same architecture and located in the same Region, I am able to use the same kernel AKI.
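The snapshot and registration steps can be sketched as shown below. The root device name /dev/sda1 is an assumption, as are the function names; the run helper is a hypothetical convenience that prints each command instead of executing it when DRY_RUN is set.

```shell
# Print the command when DRY_RUN is set; otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: snapshot an EBS volume.
# $1 = volume ID, $2 = description
snapshot_volume() {
    run ec2-create-snapshot "$1" --description "$2"
}

# Hypothetical wrapper: register an EBS-backed AMI whose root device is
# mapped to the snapshot.
# $1 = snapshot ID, $2 = AKI ID, $3 = AMI name
register_ebs_image() {
    run ec2-register --name "$3" --architecture x86_64 --kernel "$2" \
        --root-device-name /dev/sda1 \
        --block-device-mapping "/dev/sda1=$1"
}
```

The snapshot ID printed by ec2-create-snapshot is the value to pass as the first argument of register_ebs_image once the snapshot status is 'completed'.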
Use the image identifier (AMI ID) that was returned by ec2-register in the previous section along with the instance type, key pair, and security group to launch the instance.
Check the status of the new EBS-backed instance.
Just as before, connect to the instance using an SSH client and the private key.
Jeffrey Hunter is an Oracle Certified Professional, Java Development Certified Professional, Author, and an Oracle ACE. Jeff currently works as a Senior Database Administrator for The DBA Zone, Inc. located in Pittsburgh, Pennsylvania. His work includes advanced performance tuning, Java and PL/SQL programming, developing high availability solutions, capacity planning, database security, and physical / logical database design in a UNIX / Linux server environment. Jeff's other interests include mathematical encryption theory, tutoring advanced mathematics, programming language processors (compilers and interpreters) in Java and C, LDAP, writing web-based database administration tools, and of course Linux. He has been a Sr. Database Administrator and Software Engineer for over 20 years and maintains his own website at: http://www.iDevelopment.info. Jeff graduated from Stanislaus State University in Turlock, California, with a Bachelor's degree in Computer Science and Mathematics.
Copyright (c) 1998-2015 Jeffrey M. Hunter. All rights reserved.
All articles, scripts and material located at the Internet address of http://www.idevelopment.info is the copyright of Jeffrey M. Hunter and is protected under copyright laws of the United States. This document may not be hosted on any other site without my express, prior, written permission. Application to host any of the material elsewhere can be made by contacting me at email@example.com.
I have made every effort and taken great care in making sure that the material included on my web site is technically accurate, but I disclaim any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on it. I will in no case be liable for any monetary damages arising from such loss, damage or destruction.
Last modified on Monday, 28-Apr-2014 04:27:13 EDT