Hosting on Amazon EC2

18.Dec.2009

BEING WRITTEN

I have just started looking at Amazon's EC2 offering to host my website.
What it offers seems quite good, but at the beginning it is not easy to understand: the terminology is new and the infrastructure is billed on demand, which gets in the way when you just want an idea of which services you need and how much they would cost.
By telling you my story I hope to help you grasp the concepts quickly and with little effort.


Introduction

First of all, what I understood about Amazon's EC2 is that it essentially offers the infrastructure on which you can store data and run your own server(s). Therefore, when I think about EC2, I think of it as more or less a hosting provider, just on a bigger scale. This description is rough, but it should give you a first idea of what it is.

Compared with the typical shared/dedicated/root server offers of hosting providers, Amazon's EC2 differs in cost allocation, availability, support and flexibility:

Costs
You pay for the resources you use. You don't pay for them if you don't use them.
E.g. if your server is up and running you pay the hourly rate for the instance; if it is shut down you have no such costs.
Same for the storage space - you pay for storage while it is allocated and for the I/O you generate, and stop paying once you drop the storage (for the space) or no longer access it (for the I/O).
Same for the bandwidth - you pay only for the data that you transfer. And so on.
It looks like Amazon is offering private individuals and companies a way to win the battle against fixed costs. OK, it might increase unemployment in that area and make you dependent on Amazon - I'll leave it to you to evaluate the impact of those factors :o)
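
To make the pay-per-use idea concrete, here is a minimal sketch of a cost estimate in Bash. The function name "estimate_cost" and all rates are made-up placeholders, not Amazon's actual prices, so check the current price list before relying on any number.

```shell
#!/bin/bash
# estimate_cost HOURS HOURLY_RATE GB_TRANSFERRED RATE_PER_GB
# All rates passed in are hypothetical placeholders, not real Amazon prices.
estimate_cost() {
    local hours="$1" hourly_rate="$2" gb="$3" gb_rate="$4"
    # Bash only does integer arithmetic, so awk handles the decimals.
    awk -v h="$hours" -v hr="$hourly_rate" -v g="$gb" -v gr="$gb_rate" \
        'BEGIN { printf "%.2f\n", h * hr + g * gr }'
}

# Example: instance up for 2 hours, 1 GB transferred (placeholder rates).
estimate_cost 2 0.10 1 0.15
```

An instance that is shut down simply contributes zero hours, which is the whole point of the on-demand model.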

Availability
Whenever you request an instance, an external IP address, some storage space etc., it is available immediately. No technician has to set things up in the background whenever you order them.

Support
You're on your own.
Amazon of course supports its own infrastructure, but as the software (instance) that is running belongs to you and only you know what you're doing with it, you have to rely on yourself.

Flexibility
On one side, the on-demand allocation of costs is very handy at the beginning, when you just want to have a look and experiment.
Yesterday it took me 2 hours to read the documentation (which I hope will take less time with the help of this wiki), 20 minutes to create the login (OK, confirming the account using my mobile phone just didn't want to work - it finally worked when I used the landline ;o) ) and 15 minutes to choose the virtual hardware configuration for my test OS and start the instance. Yesterday's cost for the 2-hour run was $0.24 - brilliant, isn't it?!
On the other side, once you know the basics you can quickly scale the system up if needed and e.g. eventually build your own server farm with a load balancer etc. - all without having to wait for somebody to allocate hardware or start internal processes. And don't forget that you can even upload your own homemade OS images, as will most probably be my case.




First contact

As it's so cheap, probably the best way to learn how it works and reacts is to create and start a test instance, as I did.
So, once you have registered (if you already have an Amazon account, the EC2 account will be the same one, but you still have to sign up for the service before you can use it) you could:

  • create an on-demand instance (that way you will have almost no fixed costs)
  • choose one of the available OS images (I chose the normal Fedora image)
  • download the ".pem" file (the private key of your key pair) that you'll need when connecting to the instance using SSH
  • go to the AWS management console, select your instance and boot it.

That's it.

Once the instance is up you can ssh to it with:

ssh -i <mycertname>.pem root@<my-public-DNS>

Both "<mycertname>" and "<my-public-DNS>" are visible in the AWS management console.
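
As a small convenience, the connection step could be wrapped like this. It is only a sketch: "connect_instance" is a hypothetical helper name, and the chmod is there because ssh refuses private key files that other users can read.

```shell
#!/bin/bash
# connect_instance KEY_FILE PUBLIC_DNS
# Hypothetical helper: checks the key file, tightens its permissions,
# then opens the SSH session as root (as in the command above).
connect_instance() {
    local key_file="$1" host="$2"
    if [ ! -f "$key_file" ]; then
        echo "key file not found: $key_file" >&2
        return 1
    fi
    # ssh rejects private keys that are readable by group or others.
    chmod 400 "$key_file"
    ssh -i "$key_file" "root@$host"
}
```

Usage would be "connect_instance <mycertname>.pem <my-public-DNS>", with both values taken from the management console.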

Tests
I couldn't resist immediately running some tests to see how weak or powerful the instance was. I admit that they aren't too intelligent, but they might at least give you a first impression:

  • Notebook:
    Intel Core 2 Duo T7500 running at 2.4 GHz, normal 2.5" HDD
  • Amazon instance:
    Small server with 1 EC2 compute unit, root filesystem (on which I ran the disk test) mounted on an EBS volume.

Nbr  Description                               Result notebook  Result EC2
1    Write 1 GB of data to filesystem (ext3)   ~25 MB/s         48 MB/s
2    Do computations in the Bash shell         12 s             37 s
3    Do C++ computations using long            12 s             26 s
4    Do C++ computations using double          18 s             967 s


Remarks:

  • I was astonished to see that the filesystem I/O is actually quite good.
  • I won't use EC2 for any floating-point double-precision computations ;o)
  • For all the computation tests I measured the elapsed time using the "time" command. Interestingly, while the "real" and the "user" times on the notebook were always almost the same, on Amazon's server they almost always differed by a factor of ~3. That's probably due to the underlying Xen hypervisor also distributing CPU resources to other instances (damn! How dare they?! ;o) ).
  • The integer computations weren't bad compared to the notebook.
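
The "real" versus "user" comparison from the remarks can be sketched as follows. Both function names are hypothetical, and the numbers in the example call are made up rather than taken from my measurements.

```shell
#!/bin/bash
# measure CMD [ARGS...]
# Prints "real user sys" in seconds for one run of CMD, using Bash's
# built-in "time" keyword and its TIMEFORMAT variable.
measure() {
    ( TIMEFORMAT='%R %U %S'; time "$@" >/dev/null 2>&1 ) 2>&1
}

# wall_to_cpu_ratio REAL USER
# Prints real/user with one decimal; a value well above 1 means the
# process spent much of its elapsed time not running on the CPU.
wall_to_cpu_ratio() {
    awk -v r="$1" -v u="$2" \
        'BEGIN { if (u > 0) printf "%.1f\n", r / u; else print "n/a" }'
}

# Example with made-up numbers: 30 s elapsed, 10 s of user CPU time.
wall_to_cpu_ratio 30 10
```

Running "measure ./runme" on both machines and feeding the first two numbers to "wall_to_cpu_ratio" gives the factor I mention above.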

I executed the above tests with the following hand-made scripts:

  1. time (dd if=/dev/zero of=testdisk bs=32k count=32768 && sync)
  2. #!/bin/bash
    for ((a=1; a <= 1000000 ; a++))
    do
    
            ((a = a + 5))
            ((a = a - 5))
    
    done
    exit 0
  3. #include <iostream>
    using namespace std;
    int main()
    {
            long a, b, c, d;
            a=2;
            b=3;
            c=5;
            d=0;
    
            for (int i = 0 ; i < 1000000000 ; i++)
            {
                    d += a;
                    d *= b;
                    d -= i;
                    d += i - c;
    
            }
            cout << "Result: " << d << endl;        
    
            return 0;
    }
    
    Compile with "g++ <yoursourcecodefile>.cc -o runme".
  4. #include <iostream>
    using namespace std;
    int main()
    {
            double a, b, c, d;
            a=2.454;
            b=3.23423;
            c=5.5342;
            d=0.0;
    
            for (int i = 0 ; i < 1000000000 ; i++)
            {
                    d += a;
                    d *= b;
                    d -= (double)i;
                    d += (double)i - c;
    
            }
            cout << "Result: " << d << endl;        
    
            return 0;
    }
    
    Compile with "g++ <yoursourcecodefile>.cc -o runme".
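
For the disk test, a rate can be derived from the dd parameters and the elapsed time. This is a minimal sketch assuming the rate is simply the amount of data written divided by the elapsed seconds; "throughput_mib_s" is a hypothetical name and the 20 seconds in the example is a made-up value. Note that it reports MiB/s, while dd itself prints decimal MB/s.

```shell
#!/bin/bash
# throughput_mib_s BLOCK_SIZE_KIB BLOCK_COUNT ELAPSED_SECONDS
# Prints the write rate in MiB/s with one decimal.
throughput_mib_s() {
    awk -v bs="$1" -v n="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", (bs * n / 1024) / s }'
}

# Test 1 writes 32 KiB x 32768 blocks = 1024 MiB.
# The elapsed time of 20 s is a placeholder, not a measured value.
throughput_mib_s 32 32768 20
```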



Fixed IP addresses

Now, if you stop the instance and start it again you'll probably see that the "Public DNS" changes each time. Therefore one of the first questions you might have is how to get a classical fixed IP address that you can map in your DNS configuration. Amazon's fancy answer is "Elastic IPs".

Elastic IPs are actually more powerful than normal fixed IP addresses, as they can be dynamically remapped to any running instance you have, but in my case I only use one as the classical entry point of my website.

WARNING: An Elastic IP is charged even if the instance is not running - this is because IPv4 addresses are scarce. If you don't want to pay for it while the instance is down you'll have to disassociate and release it from the management console (you then give up that address).
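
I only used the management console, but the same two steps could be scripted. The sketch below assumes the AWS command line interface ("aws"), a tool that is not part of my setup described here; "release_elastic_ip" is a hypothetical helper and both IDs in the example are placeholders. It only prints the commands so that nothing is released by accident.

```shell
#!/bin/bash
# release_elastic_ip ASSOCIATION_ID ALLOCATION_ID
# Prints the AWS CLI commands instead of running them; run them by hand
# only after checking the IDs.
release_elastic_ip() {
    local association_id="$1" allocation_id="$2"
    # Step 1: detach the address from the instance.
    echo "aws ec2 disassociate-address --association-id $association_id"
    # Step 2: give the address back, which ends the charge for it.
    echo "aws ec2 release-address --allocation-id $allocation_id"
}

# Placeholder IDs, for illustration only.
release_elastic_ip eipassoc-EXAMPLE eipalloc-EXAMPLE
```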


EMERGE SYSLOG-NG!!!
To be written:

hp ~ # #libselinux
hp ~ # #FATAL: Could not load /lib/modules/2.6.21.7-2.ec2.v1.2.fc8xen/modules.dep: No such file or directory
hp ~ # #Failed to load loop
hp ~ # 
hp ~ # #Failed to load fuse
hp ~ # #dhcpd?
hp ~ # #rm -R bin ec2-ami-tools.noarch.rpm etc home lib misc net opt root sbin srv tmp usr var
hp ~ # #what I left was: boot  dev  lost+found  mnt  proc  selinux  sys
hp ~ # #cp -xvaRP /bin /etc /home /lib /opt /root /sbin /tmp /usr /var /mnt/fedora/
hp ~ # #sync
hp ~ # #rebooted
hp ~ # #had to delete in my own known_hosts the key of the server
hp ~ # #created swap: dd if=/dev/zero of=swapfile.img bs=64k count=8192
hp ~ # #mkswap swapfile.img
hp ~ # #swapon swapfile.img
hp ~ # #REMINDER: modify fstab!!
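
The swap-file notes above end with a reminder to modify fstab. A minimal sketch of what that entry could look like follows; "swap_fstab_entry" is a hypothetical helper, and the absolute path /swapfile.img is an assumption (my notes only use the relative name swapfile.img, while fstab needs the full path).

```shell
#!/bin/bash
# swap_fstab_entry PATH
# Prints an /etc/fstab line that enables the given swap file at boot.
swap_fstab_entry() {
    printf '%s none swap sw 0 0\n' "$1"
}

# The steps from the notes, kept as comments because they need root:
#   dd if=/dev/zero of=/swapfile.img bs=64k count=8192   # 512 MiB
#   chmod 600 /swapfile.img    # keep other users out of the swap file
#   mkswap /swapfile.img
#   swapon /swapfile.img
swap_fstab_entry /swapfile.img
```

The printed line would be appended to /etc/fstab once the path has been checked.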