Full text indexing with Swish-e

Swish-e is a command-line full-text indexer similar to ht://Dig, and it sports a brilliant hack. When indexing, you can ask Swish-e to run a program and index whatever it prints:
swish-e -S prog -i ./output_documents.rb
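The output_documents.rb script dumps out a series of HTML documents, each preceded by a few headers that tell Swish-e the document's path, modification time, and size: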
#!/usr/bin/ruby

require 'dbi'

dbh = DBI.connect("dbi:Mysql:test:localhost", "user", "pass")
# fetch every object that should appear in the index
artbase = dbh.select_all("SELECT * from objects")

artbase.each do |object|
  path = '/object/' + object['object_id'].to_s
  mtime = object['modified'].to_time.to_i
  html = <<HTML
<html>
<head>
<title>#{object['title']}</title>
<meta name="author" content="#{object['from_string']}">
</head>
<body>
#{object['body']}
</body>
</html>
HTML
  # each document is a block of headers, a blank line, then the content;
  # Content-Length is counted in bytes, not characters
  print "Path-Name: #{path}\r\n"
  print "Last-Mtime: #{mtime}\r\n"
  print "Content-Length: #{html.bytesize}\r\n"
  print "\r\n"
  print html
end

dbh.disconnect
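If you want to check the record format without a database handy, here is a minimal sketch of just the formatting step. The helper name swish_record and the sample document are made up for illustration; the headers are the same three the script above prints.

```ruby
# Hypothetical helper (not part of Swish-e): formats one document as
# headers, a blank line, then the document body.
def swish_record(path, mtime, html)
  headers = [
    "Path-Name: #{path}",
    "Last-Mtime: #{mtime}",
    # bytesize, not length: multi-byte characters count once per byte
    "Content-Length: #{html.bytesize}"
  ]
  headers.join("\r\n") + "\r\n\r\n" + html
end

# Made-up sample document containing one non-ASCII character.
sample = swish_record('/object/1', 0, "<html><body>caf\u00e9</body></html>\n")
print sample
```

The point of the non-ASCII sample is the length header: if it reports characters rather than bytes, the indexer would read too little of the document and the next record's headers would be misparsed.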
Swish-e is FAST, in every way that matters:

* Installation is simple -- it's just a few commands
* Indexing and searching are speedy
* Integration is a breeze, since you are just executing commands.

Discovering Swish-e is like witnessing the birth of Athena, fully grown and ready for battle. I highly recommend it.

There is an important security note for use with websites. I'll post more when I work out the Ruby equivalent of the recommended Perl techniques.
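
In the meantime, here is a rough sketch of the general idea, assuming the concern is the usual one for web scripts: a user's search string reaching a shell. The function names, the index path, and the result limit are my own placeholders. The -f, -m, and -w switches are the Swish-e options for index file, maximum results, and search words.

```ruby
# Hypothetical index location; adjust to wherever your index lives.
INDEX_FILE = './index.swish-e'

# Build the argument list for a search. Each argument stays a separate
# array element, so no shell ever gets a chance to interpret the query.
def swish_search_argv(query, index_file = INDEX_FILE, max_results = 20)
  ['swish-e', '-f', index_file, '-m', max_results.to_s, '-w', query.to_s]
end

# Run the search and return the raw output lines.
# Given an Array, IO.popen executes the program directly, without /bin/sh.
def swish_search(query)
  IO.popen(swish_search_argv(query)) { |io| io.readlines }
end
```

This only deals with shell metacharacters. A query that begins with a dash could still be mistaken for a switch by swish-e itself, so some validation of the input is probably still in order.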


meat and gravy

In the other half of my day, I am a painter. For about half a year, I've been capturing snapshots of my work on some digital prints. Until now I've just been showing these running down a page, but I haven't been happy with that:
http://hexane.org/slider/artwork-old.html
Last night I pulled this work together using DHTML:
http://hexane.org/slider/artwork.html
The programming is nothing too exciting: script.aculo.us is cool, browsers are still annoying, but all it takes is a few hours of fiddling.

The insight for me was: given interesting enough data (meat), all you need is an evening of whiz-bang JavaScript (gravy) to put together something nifty.

Of course, if you prefer the running-down-the-page version, let me know. That way, I can go to bed earlier tomorrow night :-)



~ Patrick May