XSLT is AWK in 2000…

At work, I have been building a bit of functionality that lets the users of our system fetch live quotes from a vendor without leaving it. This particular vendor offers an XML API as the one true way to handle all of this: you send them a query crafted in XML and you get back an answer crafted in XML. However, XML is not really human readable or, at least, not human presentable. I had heard of XSLT before and knew that it was a way to transform XML documents, but I had never had the opportunity to use it, as I have fairly little contact with XML. This seemed like the perfect opportunity to learn and use a tool ideally suited to what I was trying to accomplish, so I dug in.

First, before I go any farther, let me say that I LOVE XSLT! At least, insofar as I can love anything that comes in an XML package. It allowed me to do exactly what I needed and I can already foresee other uses for it.

So, to sum up, this isn’t a rant, but XSLT ends up being almost a letdown. When you hear the acronym and see the terminology being slung around, it sounds like there is a great deal more involved. XSLT, though, is basically awk in the twenty-first century. In transformation mode, you list a series of templates along with what you would like the output to look like, using special tags to insert values from the current node. Compare, if you will:

Awk:

$2 ~ /foo/ { print "hi" }

XSLT (forgive the square brackets being used in place of angled; a hazard of WordPress):

[xslt:template match="foo"]
hi
[/xslt:template]

The example is somewhat contrived and makes assumptions as to what Awk will regard as field and record separators. However, the idea is demonstrated: in both cases the “program” (awkinology) or “stylesheet” (xsltinology) is a list of rules, each of which is applied to the input. Indeed, the whole approach is not unlike the several attempts made at an XML-comprehending Awk, though, in my opinion, XMLGawk is a little more awkward than XSLT.

One important difference between XSLT and Awk is that, if multiple rules match, Awk applies them all whereas XSLT has a ranking method to decide which rule to apply.
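
The difference between the two dispatch models can be sketched in a few lines of Python. This is only a toy illustration, not how either tool is implemented: the Rule class, the numeric priority field, and the two dispatch functions are hypothetical names made up for the example. Real XSLT also takes import precedence and default pattern priorities into account, and the tie-break used here (the later rule wins) is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """A hypothetical pattern/action pair, standing in for an Awk rule or an XSLT template."""
    matches: Callable[[str], bool]
    action: Callable[[str], str]
    priority: float = 0.0  # consulted only by the XSLT-style dispatcher


def awk_style(rules: List[Rule], record: str) -> List[str]:
    """Awk model: run every rule whose pattern matches, in program order."""
    return [rule.action(record) for rule in rules if rule.matches(record)]


def xslt_style(rules: List[Rule], record: str) -> List[str]:
    """XSLT model (simplified): run only the best-ranked matching rule.

    Assumption: highest priority wins, and the later rule wins a tie.
    """
    candidates = [(rule.priority, index, rule)
                  for index, rule in enumerate(rules)
                  if rule.matches(record)]
    if not candidates:
        return []
    _, _, best = max(candidates, key=lambda c: (c[0], c[1]))
    return [best.action(record)]


# Two rules that both match any record starting with "foo".
rules = [
    Rule(lambda s: "foo" in s, lambda s: "hi"),
    Rule(lambda s: s.startswith("foo"), lambda s: "hello", priority=1.0),
]

if __name__ == "__main__":
    print(awk_style(rules, "foobar"))   # both rules fire
    print(xslt_style(rules, "foobar"))  # only the higher-priority rule fires
```

With the record "foobar", the Awk-style dispatcher runs both actions while the XSLT-style one runs only the rule with the higher priority.
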
It is not in vain that it is written that “there is nothing new under the sun.”

Prettying up PHP Code

I love functional programming languages. I really do. One of the things (among others) that is very nice about functional languages is that they are far more predisposed towards writing beautiful code (the benefits of which are a common enough topic). At work, I don’t work with functional languages; I work with PHP. While PHP 5.3 is well on its way towards having the key features of a functional language, its roots are clearly as a simplified Perl. As such, it is a language particularly prone to having ugly code written in it. I was working on some code that I knew would be fairly ugly and, having read of some PHP pretty printers in the past, I regoogled for a bit. I eventually settled on the PHP_Beautifier tool at PEAR.

It seems to work fairly well and the filter architecture looks promising. After some tinkering, the following command got (for me) fairly good results:

php_beautifier -f $1 --filters "IndentStyles(style=allman) ArrayNested NewLines(before=T_CLASS,after=T_COMMENT,before=if)"

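In simple terms this specifies the following rules:

  • IndentStyles(style=allman): indent in the Allman style, with each opening brace on its own line
  • ArrayNested: break nested arrays across lines and indent them
  • NewLines(...): add a newline before class declarations, after comments, and before if statements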

It certainly isn’t a perfect match for my coding style (the definition of perfect, right? ;-), but it came pretty close on a pretty good test. Things I would like to tweak but couldn’t get to work with the current filters (it would probably require creating a new filter or altering an existing one):

  • Newlines after certain blocks. For example, I usually put a newline after the end of an if/elseif/else sequence but I couldn’t find a way to make this work reliably.
  • Indentation of HTML in sequences of echoes. In the code I have to maintain, there are long strings of echo foo, echo bar, echo baz. Not the way I would do it (I prefer either using templates or, if that is not an option, using heredocs with echo), but I also don’t want to have to rewrite it all.
  • The option to indent all code between the opening and closing tags. This helps when it is interjected into a ton of HTML. I honestly don’t know if I would use it all the time, but it would be nice to have.

Admittedly, this is nitpicking stuff. php_beautifier is passably documented (finding command line invocation methods in the docs is a bit of a pain, as they are something of an afterthought in a sea of library docs) and does an excellent job of pretty printing PHP code.

Windows Terminals

A colleague of mine was out for a few days this week and, for reasons of efficiency in our hectic, you-never-know-what-is-going-to-happen schedule, I moved from a back room to his desk. Of course, I had to borrow his PC with the desk, as lugging PCs around for a couple of days would be a waste of time. So, I settled down to bang out code on his Vista box, whereas I have been spoiled all this time by a Kubuntu 8.04 desktop running KDE 3.5. Now, to put this in perspective, my ideal IDE is screen, a shell (bash being my favorite, for now), and my array of other tools (grep, find, awk, sed, etc.). This friend of mine had installed Cygwin which, by default, runs under the DOS terminal emulator under Windows. I do not think it unfair to say that the DOS terminal emulator is, perhaps, the worst I have ever used. Being spoilt by far better terminals and preferring to work from the terminal, I went off on a quick search for what general purpose terminal emulators there are out there for Windows. Here is what I found:

• PuTTY – PuTTY has a built-in terminal emulator that, in my opinion, is wonderful. However, it only handles remote connections (SSH, Telnet, and the like), so I can’t run a Cygwin shell through PuTTY without pulling some stupid trick like running the SSH service from Cygwin and then logging in through PuTTY.
• Poderosa – very nice emulator, written in .NET 2.0 (so it is Windows specific) and backed, from what I understand, by the Japanese government. Poderosa sports a tabbed interface and nice point and clicky love for configuration. In that sense, it is not unlike Konsole or Gnome’s console.
• Rxvt – Cygwin ships with its own terminal emulator, rxvt. This emulator is, almost inexplicably, not installed by default but is, rather, an add-on package. To make it the default, you also need to edit C:\cygwin\Cygwin.bat.
• Terminator – Terminator is, according to the author, written primarily in Java with a smidgeon of Ruby and [what was it?]

In addition to making life quite livable on someone else’s PC, I have also gone ahead and added these to my own Vista partition. As for which is my favorite…I can’t really say, though I think that Poderosa is becoming a favorite as it is the closest equivalent to Konsole that I have seen for Windows. PuTTY, on the other hand, has long been a companion of mine. A single, small executable, PuTTY is, perhaps, the most convenient way to get an SSH connection going from a borrowed computer. Rxvt is, of course, more in the *NIX tradition of terminal emulators.

The point here is less to indicate a preference than to create a good old-fashioned list.

References
http://en.poderosa.org/
http://blasphemousbits.wordpress.com/2007/02/12/rxvt-solves-many-cygwin-woes/
http://www.chiark.greenend.org.uk/~sgtatham/putty/