Commodity Hardware

According to last week's Economist, the United States Air Force is buying 2,200 Sony PlayStation 3 units. The idea is to get cheap hardware (subsidized by Sony, as their actual revenue comes from selling the games), install Linux and have a super-powerful cluster at 10% of the normal cost. I had a similar idea last year - coming out of the economic boom left me with a number of spare notebooks, and a number of customers needed a server but, with the advancing recession, had less and less money for a new one. Out of necessity I set up temporary servers using my spare laptops. I did expect some problems - failing hard drives, performance issues and so on - but, surprisingly enough, one year later not one of those laptops/servers has failed. Not only that - I even found the performance on a par with my other regular servers.

That got me thinking... According to Moore's law, computing power doubles fast enough to make last month's average laptop faster than last year's high-end server. Over the years I've worked with a number of servers and a number of brands - from Dell, through HP and Compaq, to NEC. Strangely enough, no matter how expensive the server (or any computer for that matter), the only sure thing is that it will break. For example, I use a 17" MacBook Pro that cost me over $5,000. Yet everything (with the exception of the keyboard) has already failed and I have spent more on repairs than the initial price. Just the display with the motherboard was more than $4,000. The worst thing about it was that every time something happened the notebook spent at least a month (up to 2 months) in the workshop. My old Dell notebook broke down just as much and it used to be very frustrating to wait until the next business day for a technician to come to my office and fix it. Having to actually go to the workshop, spend 1 hour there + 1 hour on the way and have your bloody expensive notebook taken for a month... Anyway, the point is that even my high-end servers fail. I once had 4 out of 5 hot-swap SCSI drives fail at the same time. Another time just 3 of them did. In the end I found it much more important to prepare for hardware failure than to try to prevent it.

Coming back to the cheap notebooks - you can get a reasonably good notebook with 4GB RAM for around $800. With Dell's next business day service (or, even better, the 4 hour service) you pretty much don't have to worry about it for the next 2 or 3 years. Should anything happen you can simply swap it for a spare one. This actually addresses a whole set of issues I face with conventional servers. Installation and troubleshooting usually require a senior person going onsite and wasting at least half a day. The notebooks, on the other hand, can be pre-installed in the office and simply delivered to the customer by a junior staff member who just needs to plug it into the network and everything is done. I still remember moving 4U servers (54kg each) a few years back.

This obviously requires a proper backup strategy. The software part of it (i.e. what to back up, when and how, and the offsite strategy) is the same as on a regular server. For faster recovery on notebook servers I clone the drive to an external USB drive and also have the backup copied to this external drive (pgpool-II is really great for this). The cloning can be automated on a monthly or weekly basis to keep a reasonably fresh image on the external drive. If the drive fails (by far the most common failure on my servers as well as notebooks) all that needs to be done is to replace the drive. This takes a couple of minutes on my Dell and Compaq notebooks and even a non-technical person can do it. Once the system is back online, with everything installed and the backup conveniently on the drive, it takes less than 5 minutes to restore the database and deploy the latest revision of the system. And of course, this can be done from anywhere.

I am not saying that notebook servers are fit for every situation - I am only saying that they're fit for most of our customers - situations where there's a limited number of users (typically around 30 - 50), most of them in the same location. While the internet connection for businesses in SG is as slow as it is expensive, it is still usable even for several offices accessing the system. I still find it hard to understand why in such a high-tech place as SG it costs around $100 a month to get 100MB downstream/10MB upstream for a residential connection, yet over $200 to get 1.5MB down/374kB up for the office. There are so many initiatives for bringing internet to consumers - from free Wireless@SG, through iCell's lighting up East Coast Park and food courts and internet on the buses, to great quality and dirt cheap 3G internet (I pay $20 for 50GB a month) - yet there is very little for the content providers.

There's been a big push from the government and SingTel for cloud computing. I haven't really heard of any local cloud provider worth mentioning - there are several offering either ridiculously high prices or ridiculously low specs or both. We've been using Rackspace's cloud servers ever since they became available - besides the lowest price they offer really great support - yet I've been moving more and more customers away to the notebook servers - from the peak of 19 servers I now have only 4 left. The main reason is actually the price/performance ratio. My usual application requires (at least) a 1GB memory instance, which costs around S$70 (around US$50) a month, or around S$800 a year. The problem is that a 1GB instance is sufficient only for average load - it is not enough during peaks - and even small performance hiccups during peaks result in great frustration for the users. As S$1000 can buy you a 4GB Dell with next day on-site service for 2 years, it's quite compelling - especially when budgets are tight.
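To make the comparison concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above and assumes the S$70 is a monthly fee; it deliberately ignores electricity, the internet connection, spare units and staff time, so treat it as an illustration rather than a full cost model. The function names are made up for this example.

```python
import math

# Figures quoted in the post (all in S$).
CLOUD_MONTHLY = 70       # 1GB cloud instance, assumed to be a monthly fee
NOTEBOOK_PRICE = 1000    # 4GB Dell including 2 years of next day on-site service


def cloud_cost(months, monthly=CLOUD_MONTHLY):
    """Cumulative cloud fees after the given number of months."""
    return months * monthly


def break_even_months(purchase=NOTEBOOK_PRICE, monthly=CLOUD_MONTHLY):
    """First whole month in which cumulative cloud fees reach the notebook price."""
    return math.ceil(purchase / monthly)


if __name__ == "__main__":
    print("Cloud, 1 year: ", cloud_cost(12))
    print("Cloud, 2 years:", cloud_cost(24))
    print("Break-even month:", break_even_months())
```

On these numbers the notebook pays for itself within the service contract, and it has four times the memory of the cloud instance.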

In the Cloud

One of the biggest pains in the ass with web apps is not really the development, the changing technologies or the extremely demanding customers. It's actually hosting - it seems like the easiest thing, yet it's something that, most of the time, you have very little control over. Of course you can still put the server in the office, but if you have multiple locations you need a pretty good connection to support it. I've written countless times about the problems with hosting in Singapore - from crazy prices to the total lack of support and overselling of capacity. With pretty much the only "reasonable" hosting provider here (Frro) going down I found myself in the middle of yet another server migration. It seems easy but installing around 10 servers can be quite daunting :-).

Luckily (I so hope this luckily is going to last at least some time), there's a new kid in town - Rackspace Cloud. There's been so much buzz around clouds over the past year or so (Amazon, Google, Microsoft), but none of them was really usable for hosting Rails applications. Rackspace Cloud offers Cloud Servers - they're quite similar to Slice Host's slices but come with some "cloud" extras. I don't really understand the technicalities of how processor power is shared on the cloud and, as (I would assume) they don't have that many customers yet, I cannot say how it's going to affect performance.

What's great about the cloud servers, however, is the flexibility of other resources together with hourly billing. And the prices are currently more than reasonable. The cheapest setup - 256MB RAM and 10GB of disk - will cost you 1.5c per hour plus bandwidth at 8c per incoming GB and 22c per outgoing GB. Most of my apps rarely transfer more than around 5GB, so my total cost works out to around $12 - 13 per month. The same setup on Slice Host is $20. But it's really not just a price comparison. One of the main strengths of the cloud is the flexibility to resize the servers as you need - increase the resources during peak hours/days and shrink them back off peak. This could be nicely automated with the Cloud Servers API that is being developed. Personally, I have mostly used this flexibility to test and compare different configurations - mod_rails vs. Lighttpd, PostgreSQL per instance vs. shared, various PostgreSQL settings, and all this with different memory sizes. This experimentation actually led to some surprising (for me) results.
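The monthly figure can be checked with a few lines of Python. This is only a sketch using the rates quoted above; the number of hours in the month and the split of the 5GB between incoming and outgoing traffic are my assumptions, and `monthly_cost` is a name invented for the example.

```python
# Rates quoted in the post (US$).
HOURLY_RATE = 0.015   # cheapest Cloud Server, per hour
IN_PER_GB = 0.08      # incoming bandwidth, per GB
OUT_PER_GB = 0.22     # outgoing bandwidth, per GB


def monthly_cost(hours, gb_in, gb_out):
    """Server time plus metered bandwidth for one month."""
    return hours * HOURLY_RATE + gb_in * IN_PER_GB + gb_out * OUT_PER_GB


if __name__ == "__main__":
    # Assumed worst case: a 31-day month (744 hours), all 5GB outgoing.
    print(round(monthly_cost(744, 0, 5), 2))
    # Assumed lighter case: a 30-day month (720 hours), 1GB in and 4GB out.
    print(round(monthly_cost(720, 1, 4), 2))
```

Even the worst case lands at the low end of the $12 - 13 range, since almost all of the bill is server time rather than bandwidth.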

Another great feature is backups. They allow you to create new servers from the backup images. What it means is that I have images with my typical configurations (mod_rails, Lighttpd, Tomcat + Apache, database server, etc.) and when I need to add a new server I just choose the image I need and within a few minutes the server is ready with everything I need; all I have to do is add the new IP to hosts and configure the DNS. As we manage over 50 different applications, this comes in extremely handy - not just for adding new ones but also for keeping the existing ones in sync.

No matter how great the initial setup, things will always go wrong, so I was quite curious about the support. Rackspace has been known for their great customer service ("Fanatical Support"). As the control panel is very new (about 2 weeks) there are still some glitches; however, I found the live chat always available and ready to help. As for the "bugs" in the control panel - the support seems to be automatically notified and fixes everything within minutes. How I wish this was the case with Singaporean hosting, where my server can be down for hours without any response from support. Enough bitching. Over the years I've grown to be very skeptical about hosting (and internet) providers. They usually start great but as soon as I prepay for several months everything goes downhill. Let's hope this one's going to be different :-)

Ruby on Rails Hosting

This is a problem I've been (surprisingly) fighting with ever since I started working with Ruby on Rails. Even though the situation has improved tremendously over the past 2 years it's still far from ideal. I am not sure whether it's because of insufficient demand or the increased complexity compared to hosting HTML or PHP, but the big hosting companies don't show much interest in Ruby hosting. Sure, they list it as one of the options, but the offering and support are flaky at best, often allowing only 1 Rails application per account or lacking a way to restart FCGI processes. Not surprisingly, Rails developers had to take matters into their own hands and, as a result, almost any RoR hosting worth considering was started that way. I am not sure how much long-term market sense it makes but, looking at the waiting time at Slice Host, it really seems to be working for now.

When choosing RoR hosting the main decision you have to make is shared hosting vs. VPS. Shared hosting is usually cheaper and easier to set up, but you're sharing the server with many other users and it takes only one bad neighbor to make the whole server unusable. VPS, on the other hand, is slightly more expensive and you have to set up everything yourself. That gives you a lot of power and flexibility, but you have to be able to configure a Linux server (it's really no rocket science – there are plenty of how-tos on the net – including on this blog).

Shared Hosting

A few things to watch out for when choosing shared hosting:

  1. FCGI support – most hosts offer FCGI (though there are still many that offer only CGI); make sure it includes an easy way of restarting FCGI processes. You really don't want to open a support ticket every time you update your application, and especially not when you fix a very urgent bug.
  2. 24/7 Support – while 24/7 support is claimed to be standard in the hosting industry, I have yet to find a host that really provides it. That is not to say that support doesn't matter – it really does, and even if it's not 24/7 you should test it out first – open tickets outside of business hours, on public holidays, etc. and check the response time as well as the helpfulness and professionalism. Many big hosting companies outsource their support offshore and all you'll get outside working hours is “I will get a senior technician to look into the problem”. This is especially important if you're in a different timezone.
  3. Number of applications you can host under one account. This includes the number of domains, the number of subdomains as well as the number of databases that you can create. You don't have to worry that much about the actual limits of the server here, because leaving resources unused doesn't make them available to you during peak; it just means that somebody else will use them.
  4. DNS management tools / support. This is not so important if you have 1 application, but as the number increases you will have to take care of multiple domains and multiple subdomains under each domain. With standard shared hosting you will usually only get cPanel, which lets you maintain only domains/subdomains hosted on the same server. A reseller account should come with WHM, which has a proper DNS management tool. Of course, you always have the option to host your DNS elsewhere.
  5. Disk space / bandwidth limits. Usually not a problem with U.S. hosting but most of the hosting companies in Asia still offer 100MB accounts.
  6. SSH. It's very hard, if not impossible, to set up your Rails application without SSH access, and yet I've seen several hosts (mostly in Asia) offer Ruby hosting without SSH. When I asked how to install the application they asked me for step by step instructions :-). I really don't think you want to do that.
  7. SVN hosting – not crucial but a very nice bonus. It's beneficial even if you're a sole developer as it makes your repository available online. RailsPlayground.Com even bundles this with Trac.

Looking at the list, it seems like there are quite a few things to watch out for. From what I've used, the best seems to be RailsPlayground.Com. They have very reasonable support, very few limits and offer quite interesting packages. The downside is that the servers do get overloaded sometimes and then they kill off your processes. This is really bad, as the user will get a Rails Application Error but the Exception Notifier will not generate anything and there won't be anything in the logs. I've had several other issues there – a longer HTTP POST will generate an application error – again without any trace in the logs, and I had some intermittent problems with file upload / download. Other than that I would recommend them as most probably the best RoR shared hosting out there.

VPS Hosting

VPS hosting has experienced a stellar rise in popularity over the past year, solving most of the problems of shared hosting for only a slightly higher price. It had an easy job replacing dedicated hosting, which offers roughly the same for 10 times the price.

Here are some things to watch out for:

  1. 24/7 Support – even though it's all maintained by you and you're much less dependent on support, there still will be times when your VPS doesn't come back after restart or doesn't respond due to some runaway processes.
  2. Choice of Linux distributions – many hosts offer a selection of distributions like Ubuntu, Debian, RedHat, etc. You just choose your desired flavor from the menu and it's automatically installed for you. It's really helpful when you have experience with only one Linux flavor or to synchronize your installations when you have multiple servers at several hosting companies.
  3. Memory / Price ratio. The only important resource when choosing a VPS is memory. Most of them come with sufficient disk space and run on multiprocessor machines but provide only limited memory. Anything below 256MB is not worth considering. You shouldn't pay more than US$ 29 for 256MB and the usual price is around US$ 20. Some hosts provide burstable memory, which means you can go over your memory limit if nobody else is using it. This can be very helpful during random hit surges or when processing memory-intensive tasks. Be very careful, as some hosts will mercilessly cut off any process that goes above the memory limit, causing a Rails application error without any trace in the logs (the otherwise great VPSLink.com does that). Another thing to watch out for here is swap. While not ideal, it can save your ass during peak requests. Some hosts don't allow any swap, which will cause your application to crawl when the memory limit is reached.
  4. Scalability – check how easy it is to upgrade your account – either to increase memory or to add another server. When the number of users increases, adding another server is often the only way to scale your application.

One of the best VPS providers is Slice Host, providing all of the above for the lowest price on the market. So far, I've experienced only one short downtime. You can choose your Linux distribution and reinstall everything within a few minutes, they have no-nonsense policies, and upgrades in both directions are painless. The only downside is a long waiting list if you're a new user. Another great host is Rose Hosting. They offer burstable memory and used to have very competitive pricing. However, I couldn't find any way to add more servers to my account.

Some general things to watch out for:

  1. Cancellation policy. It's very important to read and understand it, as most hosts have ridiculous cancellation policies – like you have to cancel at least 3 months ahead, or only 10 days before the end of the month, or only on Monday, Wednesday and Sunday 3 – 4 am. Also, the money back guarantee is much more a dream than reality.
  2. Server location. Many people believe that the closer the server is to them the faster it is. I've heard many scientific explanations for this, but based on my experience there is usually no difference in access time unless it's on the same subnet (i.e. same provider), which is hardly ever the case. There are so many other factors (like aggregation, PC speed, last mile connection) that affect the actual speed that I see no difference between my servers located in the U.S. and those located in Singapore. The problem with local (Singaporean, Malaysian :-) hosting is that it's several times more expensive, provides several times fewer resources (like space, bandwidth, memory, etc.) and offers only 9 – 5 support. I believe this is due to a lack of local competition, a lack of market awareness and undue local patriotism. Anyways, ...
  3. Backup – some hosts provide automatic backup for a very reasonable price (e.g. Slice Host offers images for US$ 5). While most hosts claim to have automatic backup, it has happened to me several times that it took them 2 days to retrieve the backup. As such you should think of your own backup strategy – one way is to use Amazon's AWS. Most of our customers require direct access to backup files and recovery within 30 minutes (this means that no matter what happens we have to be able to get back online within 30 minutes). Another issue is the frequency of backups – most automatic backups are daily, which is far too infrequent for any production application. Our standard is hourly backup with the possibility to increase this during peak hours.
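The "hourly, more often during peak" rule from the backup point can be sketched in a few lines of Python. This only illustrates the scheduling decision, not the backup itself; the peak window (office hours) and the 15 minute peak interval are assumptions picked for the example, and the function names are made up.

```python
from datetime import datetime, timedelta

# Assumed values for illustration; tune them to the application.
PEAK_HOURS = range(9, 18)                # 09:00 - 17:59 counted as peak
OFF_PEAK_INTERVAL = timedelta(hours=1)   # the hourly standard from the post
PEAK_INTERVAL = timedelta(minutes=15)    # assumed tighter interval during peak


def backup_interval(now):
    """Return how far apart backups should be at the given time."""
    return PEAK_INTERVAL if now.hour in PEAK_HOURS else OFF_PEAK_INTERVAL


def backup_due(last_backup, now):
    """True when enough time has passed since the last backup."""
    return now - last_backup >= backup_interval(now)


if __name__ == "__main__":
    last = datetime(2009, 1, 5, 10, 0)
    print(backup_due(last, datetime(2009, 1, 5, 10, 15)))  # during peak
    print(backup_due(last, datetime(2009, 1, 5, 10, 5)))   # too soon
```

A check like this could be run every few minutes from cron, with the actual dump and offsite copy triggered only when it reports that a backup is due.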