Thursday, August 28, 2008

Cloud Computing: Setup Postgresql to use Amazon Elastic Block Store (EBS)

Prerequisites: You should be familiar with creating and launching instances using Amazon EC2, and with editing config files in Linux. Also make sure that Postgresql is installed on your instance; see the installation steps at http://endurotracker.blogspot.com .

Overview

We will follow the same initial steps as outlined on Amazon’s developer website ( http://developer.amazonwebservices.com ). The latter steps will be specific to Postgresql.

1) Creating an Amazon EBS Volume

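In this example, the user calls the CreateVolume API, specifying a 10 GB volume. (The sample output below reports the size as 858993459200 bytes, which is 800 GiB; it appears to be carried over from Amazon's own example rather than from a 10 GB request.)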

$ ec2-create-volume --size 10 --availability-zone us-east-1a

VOLUME vol-4d826724 858993459200 creating 2008-02-14T00:00:00+0000

$ ec2-describe-volumes vol-4d826724

VOLUME vol-4d826724 858993459200 available 2008-02-14T00:00:00+0000
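A new volume cannot be attached until its status changes from creating to available. Here is a minimal sketch of how that wait could be scripted. The helper names volume_status and wait_for_volume, and the DESCRIBE_CMD override, are my own inventions for illustration (they are not part of the EC2 tools), and the field positions are assumed from the sample output above, so check them against what your version of the tools actually prints.

```shell
# Print the status of one volume, reading `ec2-describe-volumes` output on stdin.
# Assumption: on a VOLUME line the ID is field 2 and the status is field 4,
# as in the sample output in this post.
volume_status() {
    awk -v id="$1" '$1 == "VOLUME" && $2 == id { print $4 }'
}

# Poll until the volume reports "available" (hypothetical helper).
# Arguments: volume ID, number of tries (default 30), seconds between tries (default 5).
# DESCRIBE_CMD can be set to another command, e.g. a stub for testing.
wait_for_volume() {
    local vol_id="$1" tries="${2:-30}" delay="${3:-5}"
    local describe="${DESCRIBE_CMD:-ec2-describe-volumes}"
    local i=0 status
    while [ "$i" -lt "$tries" ]; do
        status=$("$describe" "$vol_id" | volume_status "$vol_id")
        [ "$status" = "available" ] && return 0
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example (not run here): wait_for_volume vol-4d826724 && echo "ready to attach"
```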

2) Attach the Amazon EBS Volume to an Instance

In this example, the user calls the AttachVolume API to attach the volume vol-4d826724 to the instance i-6058a509 and expose it as the device /dev/sdh.

$ ec2-attach-volume vol-4d826724 -i i-6058a509 -d /dev/sdh

ATTACHMENT vol-4d826724 i-6058a509 /dev/sdh attaching 2008-02-14T00:15:00+0000

3) Describing Volumes and Instances

After creating Amazon EBS volumes and attaching them to instances, you can list them using the DescribeVolumes and the DescribeInstances functions.

To list all volumes owned by the user, including their status, the user invokes the DescribeVolumes function.

$ ec2-describe-volumes

VOLUME vol-4d826724 858993459200 in-use 2008-02-14T00:00:00+0000

ATTACHMENT vol-4d826724 i-6058a509 /dev/sdh attached 2008-02-14T00:00:17+0000

VOLUME vol-50957039 13958643712 available 2008-02-09T00:00:00+0000

VOLUME vol-6682670f 1073741824 in-use 2008-02-11T12:00:00+0000

ATTACHMENT vol-6682670f i-69a54000 /dev/sdh attached 2008-02-11T13:56:00+0000

The function returns the volume ID, capacity, status (in-use or available) and creation time of each volume. If the volume is attached, an attachment line shows the volume ID, the instance ID to which the volume is attached, the device name exposed to the instance, its status (attaching, attached, detaching, detached) and when it was attached.
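Since the output is plain whitespace-separated text, it is easy to filter with awk. The sketch below pulls out the attached volumes and the unattached ones. The function names are hypothetical, and the field positions are assumed from the sample output above rather than taken from Amazon's documentation.

```shell
# Print "volume instance device" for every attachment in the "attached" state.
# Reads `ec2-describe-volumes` output on stdin.
# Assumption: ATTACHMENT lines are "ATTACHMENT <vol> <instance> <device> <status> <time>".
attached_volumes() {
    awk '$1 == "ATTACHMENT" && $5 == "attached" { print $2, $3, $4 }'
}

# Print the IDs of volumes whose status is "available" (not attached to any instance).
# Assumption: VOLUME lines are "VOLUME <vol> <size> <status> <time>".
available_volumes() {
    awk '$1 == "VOLUME" && $4 == "available" { print $2 }'
}

# Example (not run here): ec2-describe-volumes | attached_volumes
```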

The user can also view volumes that are attached to running instances by using the DescribeInstances function.

$ ec2-describe-instances

RESERVATION r-e112fc88 416161254515 default

INSTANCE i-3b887c52 ami-3fd13456 ec2-67-202-27-216.compute-1.amazonaws.com domU-12-31-38-00-35-94.compute-1.internal running gsg-keypair 0 m1.small 2007-11-26T13:20:35+0000 vol-4d826724

RESERVATION r-e612fc8f 416161254515 default

INSTANCE i-21b63c22 ami-3fd13456 ec2-67-202-18-227.compute-1.amazonaws.com domU-12-31-38-00-39-28.compute-1.internal running gsg-keypair 0 m1.small 2007-11-26T13:21:51+0000 vol-6682670f,vol-50957039

4) Create an ext3 filesystem on the persistent volume (Amazon EBS) and mount the volume

Command line: yes | mkfs -t ext3 /dev/sdh

Command line: mkdir /mnt/pgsql

Command line: mount /dev/sdh /mnt/pgsql
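Keep in mind that mkfs erases whatever is on the device, so it should only be run once, on a brand new volume. A minimal sketch of the same three commands wrapped in a couple of safety checks follows. The function names is_mounted and prepare_volume are hypothetical, and the mount check assumes a Linux /proc/mounts table.

```shell
# Return 0 if the device appears in the mounts table (default: /proc/mounts on Linux).
is_mounted() {
    local device="$1" mounts="${2:-/proc/mounts}"
    awk -v dev="$device" '$1 == dev { found = 1 } END { exit !found }' "$mounts"
}

# Format and mount a NEW, empty volume (run as root).
# mkfs destroys existing data, so refuse to continue if the device
# is missing or is already mounted somewhere.
prepare_volume() {
    local device="$1" mount_point="$2"
    [ -b "$device" ] || { echo "$device is not a block device" >&2; return 1; }
    if is_mounted "$device"; then
        echo "$device is already mounted; refusing to format" >&2
        return 1
    fi
    yes | mkfs -t ext3 "$device" && mkdir -p "$mount_point" && mount "$device" "$mount_point"
}

# Example (not run here): prepare_volume /dev/sdh /mnt/pgsql
```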

5) Stop Postgresql if it is already running on your instance

Command line: service postgresql stop

6) Copy postgresql folders to /mnt/pgsql

On Fedora 8, the folder is /var/lib/pgsql. Copy the contents of that folder to /mnt/pgsql.
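One way to do the copy is with cp -a, which keeps permissions and timestamps (and ownership, when run as root). This is only a sketch; the function name copy_pgsql_tree is made up for illustration.

```shell
# Copy the contents of src into dst, preserving permissions and timestamps.
# The trailing "/." makes cp copy hidden files as well.
copy_pgsql_tree() {
    local src="$1" dst="$2"
    [ -d "$src" ] || { echo "source $src not found" >&2; return 1; }
    mkdir -p "$dst" && cp -a "$src/." "$dst/"
}

# Example (not run here): copy_pgsql_tree /var/lib/pgsql /mnt/pgsql
```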

7) Change the owner of /mnt/pgsql to the postgres user, and initialize the db cluster

Command line: chown -R postgres /mnt/pgsql

su - postgres

initdb -D /mnt/pgsql/data

8) Edit postgresql startup script and change all entries from /var/lib/pgsql to /mnt/pgsql

command line: vi /etc/rc.d/init.d/postgresql
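If you would rather not edit the file by hand, the same change can be sketched with sed. The function name repoint_pgsql_paths is hypothetical. It keeps a backup copy with a .bak suffix so you can compare or roll back.

```shell
# Replace every /var/lib/pgsql with /mnt/pgsql in the given file,
# keeping the original as <file>.bak.
repoint_pgsql_paths() {
    local file="$1"
    sed -i.bak 's|/var/lib/pgsql|/mnt/pgsql|g' "$file"
}

# Example (not run here): repoint_pgsql_paths /etc/rc.d/init.d/postgresql
```

Afterwards, grep /var/lib/pgsql on the edited file should return nothing.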

9) Configure postgres for local access (see previous postgres setup post) (you will need to edit /mnt/pgsql/data/pg_hba.conf and /mnt/pgsql/data/postgresql.conf)

10) Start postgresql

command line: service postgresql start

11) Enjoy!

5 comments:

  1. This comment has been removed by the author.
  2. Thanks for this how-to!

    Question: Can this database stored on EBS be used simultaneously by multiple EC2 instances, each running its own postgresql server? Would this cause any issues?

    Regards,

    Sylvain
  3. EBS has a limitation: a volume can be attached to only one instance at a time.
    The benefit of EBS is persistent storage, but the storage cannot be shared simultaneously.
    If you create a new instance and want it to use your EBS volume, you will need to detach the volume from the old instance and then attach it to the new one. If your application starts getting more and more traffic, you will want to look into replication with Postgresql
    (set up a Master db and replicate to one or more Slave dbs). Each of these dbs can use EBS as well.
    Most large websites make extensive use of replication. A typical scenario, if you are based on the US East Coast, would be to put the Master db in an East Coast datacenter and set up Slave(s) in a West Coast datacenter (with geographic load balancing at the website level).
  4. @Dave - can you describe how the PUT/GET fees work when running Postgres? Obviously this depends on how much traffic you run, but I'm trying to get an idea for how much it costs to actually *run* postgres against an EBS-based drive. Any thoughts or advice?
  5. I got this info from Amazon (specifically from my EC2 billing statement):
    Costs:

    A) $0.10 per GB-month of Provisioned Storage

    B) $0.10 per 1 Million I/O Requests

    C) $0.15 per GB-month of Snapshot data stored

    D) $0.01 per 1,000 puts (when saving a snapshot)

    I have provisioned 20GB in EBS storage, so item A cost me $2 for 1 month.
    I did about 13 million I/O requests, which cost $1.32.
    If you take snapshots of your data, it costs $0.15 per GB-month; last month I used 1.585 GB of snapshots for $0.24.
    I did 5,021 puts (when saving a snapshot) at $0.01 per 1,000 puts for a cost of $0.05.
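The billing arithmetic in the last comment can be reproduced with a few lines of shell and awk. This is a rough sketch using the 2008 prices quoted there, which may well have changed. The function name ebs_monthly_cost is made up, and the 13,200,000 I/O request figure in the example is my assumption, back-calculated from the $1.32 charge.

```shell
# Estimate a monthly EBS bill from the per-unit prices quoted in the comment above.
# Arguments: provisioned GB, I/O requests, snapshot GB-months, snapshot PUTs.
ebs_monthly_cost() {
    awk -v gb="$1" -v io="$2" -v snap="$3" -v puts="$4" 'BEGIN {
        storage  = gb * 0.10              # $0.10 per GB-month provisioned
        requests = io / 1000000 * 0.10    # $0.10 per million I/O requests
        snaps    = snap * 0.15            # $0.15 per GB-month of snapshots
        put_fee  = puts / 1000 * 0.01     # $0.01 per 1,000 snapshot PUTs
        printf "%.2f\n", storage + requests + snaps + put_fee
    }'
}

# 20 GB provisioned, an assumed 13.2 million I/O requests,
# 1.585 GB of snapshots and 5,021 PUTs:
ebs_monthly_cost 20 13200000 1.585 5021   # prints 3.61
```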