Clustering, scalability and performance
I’ve used IPVS (http://www.linuxvirtualserver.org/software/ipvs.html) effectively on a few sites for clients – it’s more scalable than using reverse proxies.
It’s a handy, fast and efficient way to:
· load balance
· manage traffic to the cluster (allowing for transparently bringing servers online/offline, migrations)
· firewall the cluster and back-end services
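To make the "bringing servers online/offline" point concrete, here is a minimal sketch of weighted round-robin balancing. It is a toy model, not the real IPVS scheduler: the class name `WeightedPool` and the server names are made up for illustration. The one IPVS behaviour it mirrors is that a server given a weight of 0 receives no new connections, which is how a machine can be drained before it is taken offline.

```python
class WeightedPool:
    """Toy weighted round-robin pool (illustrative; not the IPVS algorithm)."""

    def __init__(self):
        self.weights = {}   # hypothetical server name -> integer weight
        self._cursor = 0    # position in the expanded schedule

    def set_weight(self, server, weight):
        """Add a server or change its weight; 0 means 'no new connections'."""
        if weight < 0:
            raise ValueError("weight must be >= 0")
        self.weights[server] = weight

    def pick(self):
        """Return the server that should take the next new connection."""
        # Expand the weights into a flat schedule: weight 2 appears twice.
        schedule = [s for s, w in self.weights.items() for _ in range(w)]
        if not schedule:
            raise RuntimeError("no servers are accepting new connections")
        server = schedule[self._cursor % len(schedule)]
        self._cursor += 1
        return server


pool = WeightedPool()
pool.set_weight("web1", 2)   # takes roughly two thirds of new connections
pool.set_weight("web2", 1)
before = [pool.pick() for _ in range(6)]

pool.set_weight("web1", 0)   # drain web1 ahead of a migration
after = [pool.pick() for _ in range(3)]
```

In this sketch `before` splits 4:2 between web1 and web2, and once web1 is set to weight 0 every pick in `after` goes to web2. Clients only ever see the pool, which is why the change is transparent to them.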
I’m not a fan of moving the ORM layer – in terms of bang/buck it’s just not efficient or cost effective.
Logical separation is more important than physical separation. It’s enough to use a dedicated db server and optimise the machine for the purpose.
One of the hardest problems is dealing with assets in dynamic sites – images, movies, etc – when you have multiple servers.
Shared file systems (e.g. NFS) just don’t cut it. For a couple of clients I’ve used libs built on fuse, but OS support can be patchy.
The handy thing about fuse is that you can use it fairly easily in conjunction with CDNs, but planning the financials is complex – and it’s something you need to consider in your architecture.
Fuse in local mode is easy to set up, scalable, fault tolerant and fast – most hosting providers have gigabit local network connections (and local network traffic isn’t billable). There are a couple of hosting providers that implement local CDNs which they make available to clients – but these are few and far between.
For most projects I tend to recommend the VPS route rather than dedicated machines. It’s cost effective, allows for growth, machines can be provisioned in minutes rather than days, and you can respond quickly if traffic increases. Allocating or de-allocating extra resources is usually just a few clicks away, and if you start exceeding your optimal utilisation you can provision another machine, hot tweak the IPVS table and you suddenly have another machine serving your users. Good hosting providers even have APIs that allow your application to adjust its resources up or down from within the app.
If it’s a short-term spike (due to a promotion, press, etc) then when things calm down a few days later you can hot tweak the IPVS table, un-provision the machine and hey presto – you’ve only incurred costs for the duration.
Implementing this type of approach means you need to understand when it’s best to scale up and when to scale out – and it’s hard to determine what strategy to take until you have optimised your environment and have accurate metrics about how your application performs. With those metrics you can set thresholds, and with monitoring in place the application can notify you when they are exceeded.
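As a rough illustration of the threshold idea, the sketch below turns per-server utilisation figures into a scale-out / scale-in / hold decision and builds the matching `ipvsadm` command line for the "hot tweak". Everything specific here is an assumption: the 75% and 30% thresholds, the function names and the IP addresses are invented, and the commands are only built as argument lists, never executed. The `-a`/`-d`, `-t`, `-r`, `-m` and `-w` flags are the standard `ipvsadm` ones for adding or deleting a real server behind a TCP virtual service, but check them against your own setup (for example, you may use direct routing rather than masquerading). It only covers scaling out; deciding when to scale up instead still needs your own metrics.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    # Hypothetical values; derive real ones from your own measurements.
    scale_out_above: float = 0.75
    scale_in_below: float = 0.30


def decide(utilisations, thresholds, min_servers=1):
    """Return 'scale_out', 'scale_in' or 'hold' from per-server utilisation (0..1)."""
    average = sum(utilisations) / len(utilisations)
    if average > thresholds.scale_out_above:
        return "scale_out"
    if average < thresholds.scale_in_below and len(utilisations) > min_servers:
        return "scale_in"
    return "hold"


def ipvs_command(action, virtual, real, weight=1):
    """Build (but do not run) the ipvsadm call to add or remove a real server."""
    if action == "add":
        # -a: add real server, -t: TCP virtual service, -m: masquerading (NAT)
        return ["ipvsadm", "-a", "-t", virtual, "-r", real, "-m", "-w", str(weight)]
    if action == "remove":
        # -d: delete the real server from the virtual service
        return ["ipvsadm", "-d", "-t", virtual, "-r", real]
    raise ValueError(f"unknown action: {action}")


limits = Thresholds()
spike = decide([0.90, 0.80], limits)       # average 0.85, above the limit
quiet = decide([0.20, 0.10], limits)       # average 0.15, below the limit
add_cmd = ipvs_command("add", "10.0.0.1:80", "10.0.0.12:80")
remove_cmd = ipvs_command("remove", "10.0.0.1:80", "10.0.0.12:80")
```

With these made-up numbers `spike` comes out as `"scale_out"` and `quiet` as `"scale_in"`. In practice the decision would feed a notification first, and the provisioning call to the hosting provider's API would sit between the decision and the IPVS change.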