The Limits to Scaling Ruby

When discussing whether a web technology "scales", usually the conversation alternates between two basic questions:
  1. Does the implementation perform acceptably? Does it continue to perform acceptably as load increases by orders of magnitude?
  2. Can the implementation be maintained and enhanced?
To show that a technology "scales", one must illustrate how both questions can be answered at the same time without sacrificing one for the other.

The simplest way to run Ruby from a web server is to use CGI. Unfortunately, with CGI applications every request starts a new interpreter and loads a new instance of the program. This startup cost is the reason for CGI's notoriously bad performance under increased load.
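
To make the per-request cost concrete, here is a rough sketch rather than a benchmark. It assumes only that a `ruby` executable is available through `RbConfig.ruby`; the handler body and the method names `serve_cgi_style` and `serve_persistent_style` are hypothetical. The first method starts a fresh interpreter for every request, as CGI does, and the second answers from an interpreter that is already running.

```ruby
require 'rbconfig'
require 'benchmark'

# Hypothetical handler: the same trivial response in both styles.
HANDLER_SOURCE = 'print "Content-Type: text/plain\r\n\r\nhello"'

# CGI style: every request pays for starting a brand new interpreter.
def serve_cgi_style
  IO.popen([RbConfig.ruby, '-e', HANDLER_SOURCE], &:read)
end

# Persistent style: the interpreter is already running, so a request
# is just a method call.
def serve_persistent_style
  "Content-Type: text/plain\r\n\r\nhello"
end

REQUESTS = 5
cgi_time = Benchmark.realtime { REQUESTS.times { serve_cgi_style } }
persistent_time = Benchmark.realtime { REQUESTS.times { serve_persistent_style } }

puts format('CGI style:        %.4fs for %d requests', cgi_time, REQUESTS)
puts format('persistent style: %.6fs for %d requests', persistent_time, REQUESTS)
```

The absolute numbers depend on the machine, but the gap grows once the program has real libraries to load on every start.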

Ruby has several ways of serving web pages which offer better performance:
  • mod_ruby: embeds the interpreter into the Apache server
  • fastcgi: using the FastCGI protocol, scripts run in a loop and hang around to serve multiple requests
  • webrick / mongrel: web servers written entirely in Ruby. Often these are integrated with a major web server like Apache or IIS via proxying.
Note that each of these implementations keeps an interpreter alive across requests:
  • mod_ruby: one interpreter per Apache process
  • fastcgi: one pool (1-10) of interpreters per script
  • webrick / mongrel: one interpreter per "site", but different proxying strategies can lead to the interpreter being shared in various ways.
When the same interpreter is used across different requests, there are important consequences for the programmer:
  • require 'lib'
    The first call is processed, and the rest are skipped. If you are dynamically adjusting your load path, the question becomes: Which 'lib' was loaded?

  • overloading core methods
    Can one be sure that adjustments to the Array class are ok for every instance of an Array on the site?

  • global variables leading to memory leaks
    If variables can still be referenced after a request has finished, those references prevent garbage collection and lead to memory leaks.
These concerns illustrate that when scaling Ruby, your scripts have become an Application [1]. They also illustrate the difficulties that ISPs face when hosting Ruby, since for security reasons an ISP must enforce "Share Nothing" behaviour between users.
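
The three concerns listed above can be reproduced in a few lines. This is a sketch under stated assumptions: the three "requests" are simulated as method calls inside one long-lived interpreter, and the library name `scaling_demo_lib`, the global `$seen_sessions`, and `handle_request` are all hypothetical.

```ruby
require 'tmpdir'

$seen_sessions = []        # global state lives as long as the interpreter
before_patch = [1, 2].to_s # Array#to_s before any library has touched it

# Hypothetical request handler running inside a persistent interpreter.
def handle_request(id)
  loaded_now = require 'scaling_demo_lib' # true only on the very first call
  $seen_sessions << "session-#{id}"       # never released between requests
  { loaded_now: loaded_now, array_text: [1, 2].to_s, retained: $seen_sessions.size }
end

results = []
Dir.mktmpdir do |dir|
  # A hypothetical library that overrides a core method for its own use.
  File.write(File.join(dir, 'scaling_demo_lib.rb'), <<~LIB)
    class Array
      def to_s
        join(',')
      end
    end
  LIB
  $LOAD_PATH.unshift(dir)
  results = (1..3).map { |id| handle_request(id) }
  $LOAD_PATH.delete(dir)
end

puts "before patch: #{before_patch}"
results.each { |r| puts r.inspect }
```

Only the first request actually loads the library, yet every later request sees the patched `Array#to_s`, and the global array gains one entry per request.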

Scaling Ruby to provide performance forces a change in development mentality. This transition from writing scripts to writing applications increases the cost of maintenance. For this reason, I do not think that Ruby scales on the Web [2].

Luckily, I consider this to be a failure of the current Ruby web implementation, not a fact of the web development problem space. We have examples which can be followed.

The commercial web scripting languages (ASP and ColdFusion) allow casual "scripts" to perform well. A more useful example is PHP, a web scripting language which performs acceptably without requiring developers to go into "Application Mode":
  • All caching is performed by the implementation under the covers. It is impossible for hosted code to share anything between requests.
  • Every request is given a fresh and identical operating environment, regardless of webserver.
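
The "fresh and identical environment" property can be imitated in Ruby, at a cost. The sketch below is an analogy and makes no claim about how PHP is implemented. It assumes a platform where `fork` is available (not Windows), and `handle_in_fresh_process` and `$counter` are hypothetical names. Each request runs in a forked child, so whatever the request changes disappears when the child exits.

```ruby
$counter = 0 # state that a careless script might mutate

# Run one request in a forked child and return its response body.
def handle_in_fresh_process(id)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    $counter += 1 # the change exists only inside this child
    writer.write("request #{id} saw counter=#{$counter}")
    writer.close
    exit!(0)      # leave at once, skipping inherited at_exit handlers
  end
  writer.close
  body = reader.read
  reader.close
  Process.wait(pid)
  body
end

responses = (1..3).map { |id| handle_in_fresh_process(id) }
puts responses
puts "parent counter is still #{$counter}"
```

Every request starts from the same state and the parent is never modified, which is the guarantee a shared host needs.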
It's worth dwelling on PHP for a moment longer. With PHP, a novice can deploy several large-scale, multi-developer projects (phpBB, WordPress, Drupal) on the same $4/month shared hosting environment.

It is imperative for the future of Ruby that we recognize how the implementation of PHP has led to its dominance of the Web scripting market [3]. Ruby must outgrow its Perl roots and discard the notion that large-scale web programming requires the "sobriety" of Application development.


Footnotes
  1. Note that the change in developer mentality has also been observed in mod_perl development.
  2. There's another insidious property of Ruby's implementation. Not only does the implementation push developers to think in Application Mode, it also pushes developers to choose one Application Framework per web server. The forthcoming arguments of Rails vs XXXX vs XXXX will be a waste of time.
  3. Even though the (easy) comparison to Java is a seductive angle, it is important to note that every other environment also shines when compared to Java. Tcl, Python, and now even mod_perl are niche markets. Ruby will remain a popular novelty unless it compares itself to the real competition.



~ Patrick May