Hosting on Amazon EC2
18.Dec.2009
BEING WRITTEN
I just started looking at Amazon's EC2 solution to host my website.
What it offers seems quite good, but at the beginning it's not easy to understand (new terminology and an on-demand approach to using the infrastructure), especially when you just want an idea of which services you need and how much they would cost.
By telling you my story I hope to help you grasp the concepts quickly and effortlessly.
Introduction
First of all, what I understood about Amazon's EC2 is that it basically offers the infrastructure on which you can store data and run your own server(s). Therefore, when I think about EC2, I think of it as more or less a hosting provider, just on a bigger scale. This description is probably quite shaky, but it should give you a first idea of what it is.
Still thinking about hosting providers, Amazon's EC2 differs from many of the typical shared/dedicated/root server offers in the areas of cost allocation, availability, support and flexibility:
Costs
You pay for the resources you use. You don't pay for them if you don't use them.
E.g. if your server is up and running you pay the hourly rate for the instance; if it is shut down you have no such costs.
Same for the storage space - you pay as long as you have storage allocated or you're generating I/Os, and you stop paying once you drop the storage or it is not accessed anymore.
Same for the bandwidth - you pay only for the data that you transfer. And so on.
It looks like Amazon is offering private individuals and companies the option to win the battle against fixed costs. Ok, it might increase unemployment in that area and make you dependent on Amazon - I'll leave it to you to evaluate the impact that those factors have :o)
Availability
Whenever you request an instance, an external IP address, some storage space etc., it is available immediately. No technician has to run around in the background to set things up whenever you order them.
Support
You're on your own.
Amazon of course has support for their own infrastructure, but as the software (instance) that is running belongs to you and only you know what you're doing with it, you have to rely on yourself.
Flexibility
On one side, the allocation of costs on an on-demand basis is very handy at the beginning, when you just want to have a look at it and experiment with it.
Yesterday it took me 2 hours to read the documentation (which I hope will take you less time with the help of this wiki), 20 minutes to create the login (ok, confirming the account using my mobile phone just didn't want to work - it finally worked when I used the landline ;o) ) and 15 minutes to choose the virtual hardware configuration for my test OS and start the instance. Yesterday's cost for the 2-hour run was 0.24$ - brilliant, isn't it?!
On the other side, once you know the basics you have the potential to quickly scale the system up if needed and e.g. eventually build your own server farm with a load balancer etc. - all without having to wait for somebody to allocate hardware or start internal processes. And don't forget that you can even upload your own homemade OS images, as will most probably be the case for me.
First contact
As it's so cheap, probably the best way to find out how it works and reacts is to create and start a test instance as I did.
So, once you have registered (if you already have an Amazon account, the EC2 account will be the same one, but you'll still have to register in order to be able to use the service) you could:
- create an instance of type "on-demand" (that way you'll have almost no fixed costs)
- choose one of the available OS images (I chose the normal Fedora image)
- download the ".pem" file (the private key of your key pair) that you'll need when connecting to the instance using SSH
- go to the AWS management console, select your instance and boot it.
That's it.
Once the instance is up you can ssh to it with:
ssh -i <mycertname>.pem root@<my-public-DNS>
Both "<mycertname>" and "<my-public-DNS>" are visible in the AWS management console.
Tests
I couldn't resist immediately running some tests to see how weak or powerful the instance was. I admit that they aren't too intelligent, but they might give you at least a first impression:
* Notebook:
- Amazon instance:
| Nbr | Description | Result notebook | Result EC2 |
|---|---|---|---|
| 1 | Write 1GB of data to filesystem (ext3) | ~25 MB/s | 48 MB/s |
| 2 | Do computations in the Bash shell | 12 s. | 37 s. |
| 3 | Do C++ computations using long | 12 s. | 26 s. |
| 4 | Do C++ computations using double | 18 s. | 967 s. |
Remarks:
- I was astonished to see that the filesystem I/O is actually quite good.
- I won't use EC2 for any floating-point double-precision computations ;o)
- For all the computation tests I measured the elapsed time using the "time" command. Interestingly, while the "real" and the "user" time on the notebook were always almost the same, on Amazon's server they almost always differ by ~3x. That's probably due to the underlying Xen hypervisor also giving CPU resources to other instances (damn! How dare they?! ;o) ).
- The integer computations weren't bad compared to the notebook.
I executed the above tests with the following hand-made scripts:
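Test 1 (write 1GB of data to the filesystem):

    time (dd if=/dev/zero of=testdisk bs=32k count=32768 && sync)

Test 2 (computations in the Bash shell):

    #!/bin/bash
    # Count to 1000000, adding and subtracting 5 in every iteration.
    for ((a=1; a <= 1000000 ; a++))
    do
        ((a = a + 5))
        ((a = a - 5))
    done
    exit 0

Test 3 (C++ computations using long):

    #include <iostream>
    using namespace std;

    int main() {
        long a, b, c, d;
        a=2; b=3; c=5; d=0;
        // Note: d roughly triples per iteration, so it overflows the range
        // of long after a few dozen iterations (formally undefined
        // behaviour for signed integers; kept as in the original test).
        for (int i = 0 ; i < 1000000000 ; i++) {
            d += a;
            d *= b;
            d -= i;
            d += i - c;
        }
        cout << "Result: " << d << endl;
        return 0;
    }

Compile with "g++ <yoursourcecodefile>.cc -o runme".

Test 4 (C++ computations using double):

    #include <iostream>
    using namespace std;

    int main() {
        double a, b, c, d;
        a=2.454; b=3.23423; c=5.5342; d=0.0;
        for (int i = 0 ; i < 1000000000 ; i++) {
            d += a;
            d *= b;
            d -= (double)i;
            d += (double)i - c;
        }
        cout << "Result: " << d << endl;
        return 0;
    }

Compile with "g++ <yoursourcecodefile>.cc -o runme".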
Fixed IP addresses
Now, if you reboot/stop the instance you'll probably see that the "Public DNS" name changes each time. Therefore one of the first questions you might have is how to get the classical fixed IP address that you can map in your DNS configuration. Amazon's fancy answer is "Elastic IPs".
Elastic IPs are actually more powerful than normal fixed IP addresses as they can be dynamically mapped to any running instance you have, but in my case I only use one as the classical entry point of my website.
WARNING: Costs for an Elastic IP are charged even if the instance is not running - this is because IPv4 addresses are scarce. If you don't want to pay for it when the instance is down you'll have to disassociate and delete (release) it from the management console.
EMERGE SYSLOG-NG!!!
To be written:
- libselinux
- FATAL: Could not load /lib/modules/2.6.21.7-2.ec2.v1.2.fc8xen/modules.dep: No such file or directory
- Failed to load loop
- Failed to load fuse
- dhcpd?
- rm -R bin ec2-ami-tools.noarch.rpm etc home lib misc net opt root sbin srv tmp usr var
- what I left was: boot dev lost+found mnt proc selinux sys
- cp -xvaRP /bin /etc /home /lib /opt /root /sbin /tmp /usr /var /mnt/fedora/
- sync
- rebooted
- had to delete in my own known_hosts the key of the server
- created swap: dd if=/dev/zero of=swapfile.img bs=64k count=8192
- mkswap swapfile.img
- swapon swapfile.img
- REMINDER: modify fstab!!