Rehash of scaling limits discussion
This is a rehash of something I've written before, but I keep returning to this issue.
I wrote this in response to the ongoing post about PHP, as a follow-up to the pro-PHP rant that closed the discussion.
Harry Fuecks is on target when he points to the execution model, but I think I can get more specific about two ways that PHP's execution model has made it successful:
- Cheap Hosting (a short point)
- It's Still a Scripting Language (a longer point)
Cheap Hosting
Fuecks points out how the execution model eliminates hard-to-debug userland memory leaks. He also points out how this makes PHP servers easier to administer.
I'd also like to point out that the execution model has no dependency on a particular web server scaling model. This allows PHP to be ported to almost every existing web server.
By making PHP hosting an easy business to manage, and by eliminating server- and OS-specific niches in the hosting market, the execution model has turned PHP hosting into a cheap commodity. It's basically free.
It's Still a Scripting Language
Moving on to the rant. Basically, I'd like to raise awareness of this single point:
Scope management is as important as dynamic typing in the development of general purpose scripting languages.
Why? Because I would really like to use ruby online, but the architecture insists that I write fire-breathing applications rather than the simple pet scripts that I prefer.
Aside from dynamic typing, I believe that the most distinctive feature of general purpose scripting languages (perl, python, ruby, etc) is the support for writing programs of a limited scope. I can write a script confident that it will only affect the problem at hand. For all inputs? Who cares, it only needs to work for this input.
Using ruby as my most familiar example, in a basic command line script I have a range of scopes available:
+ Create a ruby file (script)
+ Dynamically decide how to set up globals (per invocation)
+ Require a library in my home directory (user)
+ Use some standard libraries (ruby)
+ Install some gems (machine)
Just by installing ruby I can write little scripts focused on my problem while leveraging shared code from the larger community. And I am in control of when I pull in successively "more" global code.
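To make the list of scopes concrete, here is a minimal sketch of a command line script that touches each one. The VERBOSE environment variable, the ~/lib/my_helpers.rb path, and the gem name some_gem are all made up for illustration; the script is written so it still runs when none of them exist.

```ruby
#!/usr/bin/env ruby
# scopes.rb: a sketch of the scopes available to a command line script.

# ruby scope: the standard library ships with the interpreter.
require 'json'

# per invocation scope: globals decided when the script starts.
# VERBOSE is a hypothetical environment variable used only in this sketch.
$verbose = ENV['VERBOSE'] == '1'

# user scope: a library kept in my home directory (hypothetical path).
user_lib = File.join(ENV.fetch('HOME', Dir.pwd), 'lib', 'my_helpers.rb')
require user_lib if File.exist?(user_lib)

# machine scope: a gem installed for the whole box.
# 'some_gem' is a placeholder name; the script carries on without it.
begin
  require 'some_gem'
rescue LoadError
  warn 'some_gem not installed, continuing without it' if $verbose
end

# script scope: this only has to work for the problem at hand.
def summarize(words)
  { count: words.size, longest: words.max_by(&:length) }
end

puts JSON.generate(summarize(%w[perl python ruby]))
# prints {"count":3,"longest":"python"}
```

Each require is an explicit step outward to "more" global code, and nothing set in this run survives into the next one.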
But when I go online, I lose some of this control. The most common means of scaling general purpose scripting languages is to simply span the interpreter across multiple requests (with various tricks). I believe that this lazy route causes a loss of "scripting language" features. Continuing to use ruby as an example:
CGI: Create an interpreter per invocation
CGI keeps the same scopes available as a script run from my home directory, except that even a trivial web site causes far more invocations than a command line script would ever see. It doesn't matter that I still have all these scopes available, because the server dies under the load.
FastCGI: Create 1 to 10 interpreters per script, reuse the interpreters
- limit on the number of scripts
Each of those interpreter instances will consume between 5 and 10 megabytes of memory. That overhead adds up, and is enough to discourage me from creating many scripts. Unfortunately, this loses a nice feature of scripting -- the ease of forking.
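A rough back-of-the-envelope version of that overhead, assuming the 5 to 10 megabyte figure applies to each interpreter and using the 1 to 10 interpreters per script from above. The script counts are arbitrary examples, not measurements.

```ruby
# Figures from the text: 1 to 10 interpreters per script, 5 to 10 MB each.
# Treating the memory figure as per-interpreter is an assumption.
INTERPRETERS_PER_SCRIPT = (1..10)
MB_PER_INTERPRETER      = (5..10)

# Best and worst case memory use, in megabytes, for a number of scripts.
def fastcgi_footprint_mb(scripts)
  low  = scripts * INTERPRETERS_PER_SCRIPT.min * MB_PER_INTERPRETER.min
  high = scripts * INTERPRETERS_PER_SCRIPT.max * MB_PER_INTERPRETER.max
  [low, high]
end

[1, 5, 20].each do |n|
  low, high = fastcgi_footprint_mb(n)
  puts "scripts: #{n}, memory: #{low} to #{high} MB"
end
# prints:
# scripts: 1, memory: 5 to 100 MB
# scripts: 5, memory: 25 to 500 MB
# scripts: 20, memory: 100 to 2000 MB
```

Twenty little scripts is nothing in a home directory, but at these numbers it is a real budget decision on a server.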
- per invocation scope lost
I get to worry about memory leaks since the interpreters are re-used. Another feature of scripting lost. Plus, I can't do any dependency injection tricks because I need to be careful about setting globals per invocation: whatever one request sets is still there for the next. This is 'free' in PHP and in commandline ruby, but for ruby online it's another feature of scripting lost.
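The globals problem can be sketched without any FastCGI machinery at all, just by calling a handler twice in the same interpreter. The handler, the $admin_mode global, and the 'admin' parameter are hypothetical names for this illustration, not part of any real FastCGI binding.

```ruby
# A stand-in for a request handler living in a long-running interpreter.
# $admin_mode is a hypothetical global set from a request parameter.
$admin_mode = false

def handle(params)
  $admin_mode = true if params['admin'] == '1' # bug: never reset
  $admin_mode ? 'admin page' : 'public page'
end

# Re-used interpreter: the first request's global leaks into the second.
first  = handle({ 'admin' => '1' })
second = handle({})
puts "reused interpreter: #{first}, then #{second}"
# prints "reused interpreter: admin page, then admin page"

# What a fresh interpreter per request gives for free has to be done by hand:
# reset every piece of per-invocation state before handling the request.
def handle_isolated(params)
  $admin_mode = false
  handle(params)
end

third  = handle_isolated({ 'admin' => '1' })
fourth = handle_isolated({})
puts "reset per request: #{third}, then #{fourth}"
# prints "reset per request: admin page, then public page"
```

In a command line script or a CGI invocation the reset never has to be written, because the process exits and takes its globals with it.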
Mod Ruby: Create 1 interpreter per web server process
- script scope lost
- per invocation scope lost
- user scope lost
Mod Ruby is quite bloody. Since the same interpreter is used across all requests to the web server, you lose every scope smaller than the machine. All development happens at the same scope as the web server, which is usually the machine level. You have the memory leak / globals issues of FastCGI, made worse by the fact that unrelated scripts will now interfere with each other.
This doesn't sound like scripting at all to me :-)
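Here is a small sketch of what "unrelated scripts will interfere with each other" can look like. It does not use mod_ruby itself; it just loads two scripts into one interpreter with eval, which is the situation a shared interpreter puts them in. The class name Helper and the file names blog.rb and shop.rb are invented for the example.

```ruby
# Two unrelated "scripts", held as strings so one interpreter can load both.
# Helper, blog.rb and shop.rb are made-up names for this sketch.
BLOG_SCRIPT = <<~RUBY
  class Helper
    def self.title; 'My Blog'; end
  end
RUBY

SHOP_SCRIPT = <<~RUBY
  class Helper
    def self.title; 'My Shop'; end
  end
RUBY

# Shared interpreter: the blog loads first and sees its own Helper.
eval(BLOG_SCRIPT, TOPLEVEL_BINDING, 'blog.rb')
blog_before = Helper.title

# Then the shop loads, reopens the same class, and replaces the method.
eval(SHOP_SCRIPT, TOPLEVEL_BINDING, 'shop.rb')
blog_after = Helper.title

puts "blog title before shop.rb loads: #{blog_before}"
puts "blog title after shop.rb loads:  #{blog_after}"
# prints:
# blog title before shop.rb loads: My Blog
# blog title after shop.rb loads:  My Shop
```

Neither script did anything wrong by command line standards. Each one just assumed it owned the interpreter.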
Conclusion
As far as I'm concerned, ruby is not a web scripting language. Neither is perl, python, or tcl. Once used online, scaling these languages turns them into a variant of Smalltalk, i.e. a dynamically typed systems language. PHP is the only open source web scripting language of which I am aware.
It's worth noting that ColdFusion (another horrid language with incomprehensible success) does a good job of separating requests from each other. While there is a less than obvious auto-include of application.cfm files, requests are independent.
