In this first post of a series on Ruby’s EventMachine, I will introduce EventMachine and explain why event-based programming is good for your wallet.
EventMachine, which just turned 1.0.0 this week, is more than just a gem. It is a new paradigm for many Ruby programmers and is not always easy to drop into your existing stack. As the name suggests, it gives you event-based programming.
I have been doing event-based programming for several years now, in Perl (Event::Lib and POE), Python (Tornado), C (nginx) and now Ruby (EventMachine), and I think it is the way to get the most out of your server’s resources.
For the sake of example, I’m going to side-step and look at a switch that you may have already made, from Apache to Nginx. If you have switched over from Apache to Nginx and thought, “hey, this thing has a much smaller footprint than Apache”, then you have touched on one of the main benefits of Nginx: it is built using event-based programming.
With Apache, if a new client connects, you need a new process, or at least the use of a process that another client is not using. If the client is slow in sending its HTTP request, then your process will be tied up until that client has finished its role in the request-response cycle. When other client connections come in, they either have to sit in a queue, waiting, or new Apache processes are fired up (or taken from a pool) to handle the connections.
Therefore, with Apache, your throughput is governed by the number of Apache processes you can run simultaneously divided by the average time it takes to process a request. The number of Apache processes you can run simultaneously is dependent on the RAM you have on your machine, and RAM is expensive. The average time it takes to process a request is partly dependent on you (how long does your application server take to process the request?) and partly on your users (how slowly are they sending and receiving the HTTP data?). The second one, your users, is not under your control. Any time something outside your control affects your system’s performance, you are on shaky ground, and you make denial-of-service attacks so much easier.
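To put rough numbers on that relationship, here is a small Ruby sketch. The process count and request times are made up for illustration, and `max_requests_per_second` is just a name I picked for the calculation, not anything Apache provides.

```ruby
# Upper bound on throughput for a process-per-connection server:
# (number of processes) / (average time each request holds a process).
# All figures below are hypothetical.
def max_requests_per_second(process_count, avg_request_ms)
  process_count * 1000.0 / avg_request_ms
end

PROCESS_COUNT = 50 # limited by how much RAM the machine has

fast_clients = max_requests_per_second(PROCESS_COUNT, 100) # each request holds a process for 100 ms
slow_clients = max_requests_per_second(PROCESS_COUNT, 500) # slow clients stretch that to 500 ms

puts "fast clients: #{fast_clients} requests/second"
puts "slow clients: #{slow_clients} requests/second"
```

With the same 50 processes, clients that take five times longer cut the ceiling from 500 to 100 requests per second, and nothing in your own code changed.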
When Nginx receives a new client connection, it simply creates a new socket and not much more. Therefore, the same process can potentially handle tens of thousands of connections. Instead of blocking the process while waiting for data to be sent or received on the socket, it registers a callback and says to the operating system, “hey, let me know when some more data comes in on this socket. I’ve got other things to do”. By doing this, the process is free to do anything else until something new happens with that socket. In this way, Nginx’s performance is largely unaffected by a few slow clients. Whether a client is fast or slow, Nginx uses approximately the same system resources, whereas Apache holds onto resources that might be sitting dormant.
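The “let me know when data comes in” idea can be sketched with nothing but Ruby’s standard library. This is not how Nginx or EventMachine are implemented; it is a minimal readiness loop built on `IO.select`, with pipes standing in for client sockets. The method name `drain_when_ready` and the "fast"/"slow" labels are my own, and the extra thread only plays the part of a slow remote client.

```ruby
# A single loop serving several connections: ask the OS which descriptors
# are readable, handle only those, then go back to waiting.
def drain_when_ready(connections)
  received = Hash.new { |hash, name| hash[name] = +"" }
  names    = connections.invert  # io => label
  pending  = connections.values

  until pending.empty?
    # Sleeps until at least one descriptor has data (or has been closed).
    ready, = IO.select(pending)
    ready.each do |io|
      begin
        received[names[io]] << io.read_nonblock(4096)
      rescue IO::WaitReadable
        next                     # spurious wake-up: nothing to read yet
      rescue EOFError
        pending.delete(io)       # the peer closed its end
        io.close
      end
    end
  end
  received
end

fast_reader, fast_writer = IO.pipe
slow_reader, slow_writer = IO.pipe

# Simulated slow client: trickles its request in, piece by piece.
slow_client = Thread.new do
  ["GET ", "/slow", "\r\n"].each do |part|
    slow_writer.write(part)
    sleep 0.05
  end
  slow_writer.close
end

# Simulated fast client: sends everything at once.
fast_writer.write("GET /fast\r\n")
fast_writer.close

result = drain_when_ready("fast" => fast_reader, "slow" => slow_reader)
slow_client.join
p result
```

The loop never sits inside a read waiting on the slow client. It only touches a connection when the operating system reports that there is something to read, so the fast request is fully received while the slow one is still dribbling in.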
Out of the box, Rails is much like Apache. It is a single process, and any work you do, such as communicating with the client over HTTP or talking to the database, is done in a blocking manner. The process sits and waits for I/O responses and nothing else can get in during that time.
Thin is a Ruby web application server built by Marc-André Cournoyer. Thin is built using EventMachine. Things like client connections are handled in an event-based manner. Therefore, if you have used Thin, you have used EventMachine.
If you have unwittingly used EventMachine, via Thin, then it is likely that you have also severely degraded Thin’s potential performance.
Within an event-based application, everything must be non-blocking to get the real benefit of event-based programming. The event loop runs in a single thread within a single process. Therefore, the first time you do something that blocks, the whole event loop stops and waits for you. That means while you are talking to your database for “just 10 milliseconds”, nothing else happens. You would be surprised at how many operations you could have performed in the 10ms you spent halting your entire event loop. In that time, potentially hundreds of connections that only hit RAM could have come and gone, but instead they get queued waiting for your blocking action.
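Here is a small simulation of that queueing effect. It does not use EventMachine; it just models a loop that runs one handler at a time, in arrival order. The `Request` struct, the `latencies` method and every timing figure (in microseconds) are hypothetical, chosen only to make the effect visible.

```ruby
# Model of a single-threaded event loop: one handler runs at a time.
# arrives_at and handler_cost are in microseconds (made-up values).
Request = Struct.new(:name, :arrives_at, :handler_cost)

# Returns { name => time from arrival to completion } for each request.
def latencies(requests)
  clock = 0
  requests.sort_by(&:arrives_at).map do |request|
    start = [clock, request.arrives_at].max # wait if the loop is still busy
    clock = start + request.handler_cost
    [request.name, clock - request.arrives_at]
  end.to_h
end

# Three requests that only touch RAM: 50 microseconds of work each.
ram_only = (1..3).map { |i| Request.new("cache-#{i}", 100 * i, 50) }

# One handler that blocks on the database for "just 10 milliseconds".
blocking_db = Request.new("db", 0, 10_000)

without_blocking = latencies(ram_only)
with_blocking    = latencies([blocking_db] + ram_only)

p without_blocking
p with_blocking
```

On their own, the RAM-only requests each finish in 50 microseconds. Queued behind the single blocking call, they take 9,950, 9,900 and 9,850 microseconds, roughly two hundred times longer, even though their own work did not change.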
The perfect event loop is one which passes every blocking request to the operating system, via the event library. This basically means any time you touch the disk or the network, but it also includes talking to other processes.
Achieving a perfectly non-blocking event loop is hard in practice. The packages you generally use are not designed this way, and finding event-friendly alternatives is hard. This is the biggest issue I have hit with every language I have done event-based programming in.
As I mentioned briefly at the beginning of the post, event-based programming is good for your wallet. This is due to the smaller number of processes you need to run and how much you can squeeze out of each process. Fewer processes means less RAM, and RAM is often the biggest cost in firing up servers. This is especially true if you work with cloud platforms such as Amazon EC2 or Heroku.
This is the first of several blog posts I will be writing on EventMachine. In future posts, I’ll be touching on topics such as talking to your data-store, connecting to 3rd party services (such as Twitter), shelling out to other processes and running background tasks. Most importantly, I’ll cover how to do all this in a non-blocking, event-based way, using EventMachine inside your Rails application.
If there is a particular topic that you are interested in, then please leave a comment below.