A Quick Note on Building noweb on Cygwin

My laptop has bitten the dust. Until I have the chance to open it up and see if the damage is fixable, I have been borrowing my wife’s computer to tinker (to her annoyance, I’m sure, but she used my laptop until we replaced the desktop, so all’s fair). I went to install noweb on Cygwin and hit the following error during the build:

In file included from notangle.nw:28:
getline.h:4: error: conflicting types for 'getline'
/usr/include/sys/stdio.h:37: error: previous declaration of 'getline' was here
getline.h:4: error: conflicting types for 'getline'
/usr/include/sys/stdio.h:37: error: previous declaration of 'getline' was here

As I had built noweb before, this error struck me as a little strange. It turns out that Cygwin’s stdio.h includes its own declaration of getline, which conflicts with the one in noweb’s getline.h; the Unix-likes I had built on before apparently had no such conflict. A quick search showed that this is not unique to noweb and that other packages have run into similar difficulties. The answer that worked for me is here:

http://ftp.tug.org/mail/archives/pdftex/2006-February/006370.html

In short, all one has to do is open /usr/include/sys/stdio.h and comment out the line that reads:

ssize_t _EXFUN(getline, (char **, size_t *, FILE *));

For safety’s sake, I reinstated the line after installing noweb and everything seems to be running fine.

Literature Review: PEGs

Parsing Expression Grammars, or PEGs, are a syntax-oriented formalism for describing parsers, and the basis for a family of parser generators meant to ease the task of building rich programming languages. I had tinkered with PEGs a little and finally got around to reading the original paper (available here: http://pdos.csail.mit.edu/~baford/packrat/popl04/). My reading notes from the paper can be downloaded here:

http://www.mad-computer-scientist.com/blog/wp-content/uploads/2011/06/peg.html

I am fully aware that this is not, as it were, a new paper. It came up originally in my searches for a good parsing library in Common Lisp. For the project that it was intended for, I ultimately moved on to using Ometa. While Ometa is a fine system, it actually did not win on power grounds because, quite simply, I do not need the extra expressiveness for what I am working on. It won out because the implementation was better than the PEG library I had tried.

As it is kind of old territory, my review has little to say. In reality, when I first ran across PEGs I felt strangely out of the loop, but here goes anyway:

PEGs are a powerful mechanism for defining parsing grammars. The notation is similar to standard EBNF in its general layout, but a PEG describes how to recognize a string rather than how to generate one. It avoids the ambiguities inherent to Context Free Grammars by using prioritized selection of paths through the grammar: alternatives are tried in order and the first one that matches wins. As a result, an input never has more than one parse, and PEGs can even recognize some languages that traditional CFGs cannot, while being simpler to use.
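To make the prioritized selection concrete, here is a minimal sketch in Python. It is not taken from the paper or from any PEG library; the names lit, seq, and choice are made up for illustration. A parser here is just a function from a text and a position to a new position, or None on failure.

```python
from typing import Callable, Optional

# A parser takes (text, position) and returns the new position on success,
# or None on failure. All names below are illustrative only.
Parser = Callable[[str, int], Optional[int]]


def lit(s: str) -> Parser:
    """Match the literal string s at the current position."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def seq(*parsers: Parser) -> Parser:
    """Match every parser in order; fail as soon as one fails."""
    def parse(text: str, pos: int) -> Optional[int]:
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            pos = result
        return pos
    return parse


def choice(*parsers: Parser) -> Parser:
    """Prioritized choice: the first alternative that matches wins."""
    def parse(text: str, pos: int) -> Optional[int]:
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse


# The same two alternatives in a different order give different parsers.
short_first = choice(lit("a"), lit("ab"))
long_first = choice(lit("ab"), lit("a"))

consumed_short = short_first("ab", 0)  # first alternative "a" matches
consumed_long = long_first("ab", 0)    # first alternative "ab" matches
```

Because choice stops at the first success, there is never more than one result to pick from. The price is that the order of alternatives matters: short_first never gets around to trying "ab".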

While PEGs seem to have caught on a lot better than their predecessors (discussed in the paper), they seem to receive less notice than Ometa, which builds further on PEGs.

How WPF gets GUI programming right

WPF is another in a long line of Microsoft UI related technologies, each promising more than the one before. WPF is basically Silverlight for the desktop (or, if you prefer, Silverlight is WPF for the web). We have been building an application in WPF of late at my place of employment, and I thought I’d post what I think WPF does right.

The biggest thing is that WPF builds UIs declaratively. I cannot stress enough how important I think this really is. The biggest pain about using Java’s Swing framework was writing long sequences of code that initialized controls in a form’s constructor. Under the hood, Windows Forms works pretty much the same way. The main difference is that Microsoft ships a nice designer with Visual Studio, so the raw kludginess of the approach is hidden from most programmers, since they look at everything through the lens of the designer.

The declarativeness goes beyond declaring that widgets exist: it extends to their layout (via the Grid mechanisms; really, these should be used by default and the designer left on the shelf) and to their data flow. The latter is particularly interesting. ASP.NET has data binding, but the version employed by WPF is far more sophisticated. When I jumped back to an ASP.NET project, I immediately found myself missing the power of WPF databinding, but adding it to a web framework would, I suspect, require a continuation-based framework like the one employed by Weblocks or Seaside.

The importance here is that both the interface and how it interacts with data can be declared. Many GUI designers and markup languages have come along that allowed one to declare the layout, but few, if any, mainstream GUI designers have allowed so much expressiveness.
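For readers who have not seen declarative data binding, here is a rough sketch of the idea in Python. This is not WPF and not how WPF implements binding; Model, Label, and bind are hypothetical names. The point is only that the link between a widget property and a model attribute is stated once, as data, and a small engine keeps the two in sync.

```python
class Model:
    """Toy model object that notifies listeners when an attribute changes."""

    def __init__(self, **fields):
        # Bypass our own __setattr__ while setting up internal state.
        object.__setattr__(self, "_listeners", [])
        for name, value in fields.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        for listener in self._listeners:
            listener(name, value)

    def on_change(self, listener):
        self._listeners.append(listener)


class Label:
    """Stand-in for a widget with a single displayed property."""
    text = ""


def bind(widget, declarations, model):
    """Wire widget properties to model attributes from a declaration table.

    declarations maps widget property -> model attribute. This sketch
    supports one widget property per model attribute.
    """
    by_source = {source: prop for prop, source in declarations.items()}
    for prop, source in declarations.items():
        setattr(widget, prop, getattr(model, source))  # initial value

    def update(name, value):
        if name in by_source:
            setattr(widget, by_source[name], value)

    model.on_change(update)


person = Model(name="Ada")
label = Label()
bind(label, {"text": "name"}, person)  # the declaration: text comes from name
person.name = "Grace"                  # label.text follows automatically
```

Nothing in the calling code copies values into the label by hand; the one-line declaration replaces the event-handler plumbing that would otherwise be written for every field.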

The hard part about all this is that C# is a statically typed language and, as a result, a lot of these features are based heavily on reflection, which is a performance hit because the JIT compiler cannot really optimize these things. Perhaps it was just my imagination, but I feel pretty sure that WPF applications lag behind their Windows Forms cousins in terms of speed.

All in all, though, WPF is a fine framework.

Polymorphism, Multiple Inheritance, & Interfaces…Pick 2.

The title of this post comes from a statement that a coworker told me had been put to him. The overall point of this post is simple. Given that choice, your answer should be obvious: you want polymorphism and multiple inheritance, because there is nothing that you can do with interfaces that you cannot do with multiple inheritance.

Interfaces provide two things, depending on their use: a form of multiple inheritance in languages that do not otherwise support it and design-by-contract capabilities. Clearly, in the former case, you are better off with multiple inheritance, as you receive the full power of the feature. In the latter case, it is trivial to create an almost-empty class that acts as an interface, if that is the effect you are after.

The main objection raised was a counterexample: what if you have a class Animal and another class Plant? Surely you do not want a programmer to inherit from both? That would not make sense. To which I would answer: why not? If it makes sense to whoever wrote it, why prevent it? They might, after all, be creating something for the Little Shop of Horrors.

Largely, I think the notion that interfaces are somehow superior to multiple inheritance comes from never having used multiple inheritance in a system built from the ground up to support it (like CLOS in Common Lisp), as multiple inheritance strictly supersedes interfaces.
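As a small illustration, here is the Animal and Plant example sketched in Python, which (like CLOS) supports multiple inheritance natively. The class names and methods are invented for this post. Feedable is the "almost-empty class that acts as an interface", built with the standard abc module.

```python
from abc import ABC, abstractmethod


class Feedable(ABC):
    """An almost-empty class playing the role of an interface."""

    @abstractmethod
    def feed(self) -> str:
        ...


class Animal:
    def move(self) -> str:
        return "creeps toward the customer"


class Plant:
    def grow(self) -> str:
        return "grows a new tendril"


class CarnivorousPlant(Animal, Plant, Feedable):
    """Inherits behavior from two classes and a contract from a third."""

    def feed(self) -> str:
        return "Feed me!"


audrey = CarnivorousPlant()
```

The contract still holds: a subclass of Feedable that forgets to define feed cannot be instantiated. So the interface use case falls out of multiple inheritance, while move and grow come along as real, inherited behavior.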

The Literature

Looking back at my last few posts, something occurred to me: a lot of the more exotic focus of this blog has been lost. While I enjoy examining MVVM and QuickBooks, one of the whole points of this blog was to offer a fusion between useful code monkey concepts and computer science (hence, the domain name of this site). Lately, there has not been much “scientist” at the mad computer scientist.

One of my new series of posts is going to be literature reviews. I have a massive reading list of computer science papers queued up as well as some other materials. In these posts, I will read a journal article or watch a lecture and post my notes and thoughts about it. The first one will be coming soon, so look out for it.

More on Microblogging and Programming

I have been rolling around some thoughts on microblogging and programming since my last blog post. First of all, I found it interesting that Twitter started life as an internal project before getting VC funding. This reinforces, to me, the value of what I was saying, which is that microblogging for more limited audiences and topics is more useful than the present state of affairs, where we have people microblogging about brushing their teeth.

I have also been interested in doing more work on Sheepshead. According to gitorious, my last commit was over a month ago. Such are the results of having a family, a job, and a life–but I really want to get back to working on it. As I start gearing it all up again, I have decided to try a little experiment. Instead of simply waiting on someone else to try out microblogging for a small development team, I am going to try to bootstrap a small team while microblogging. As I develop Sheepshead and push it forward, I am going to try and use microblogging to mull over design decisions and announce progress.

The service I have decided to use for this endeavor is Identi.ca (you can see the stream here), rather than the more ubiquitous Twitter. I did this for a few reasons, chief among them being that I expect there to be more engineering types as well as more open source-minded individuals on Identi.ca. Another important consideration is that Identi.ca allows its users to export data. My intention is to keep backups of the information on the feed, so that if something were to happen to Identi.ca and the project attained a meaningful size, a StatusNet instance could be set up, even if only as a stopgap.

We will see how this all goes (or if it does–I can definitely see how Sheepshead is sort of a niche development). In the meantime, I am going to try and get some code written.

Linq to Sql is not fit for GUI Applications

The title is a little incendiary, I admit, but I think it is a good place to start.

We are building a database-driven application with WPF (using MVVM) & Linq to SQL and, in the process, a few caveats about Linq to SQL have come out in a truly fine way.

The issues all revolve around that little innocuous thing known as a DataContext. For those of you who may not be familiar with the idea, in Linq to SQL a DataContext is “the source of all entities mapped over a database connection. It tracks changes that you made to all retrieved entities and maintains an “identity cache” that guarantees that entities retrieved more than one time are represented by using the same object instance.”

Further down the reference page for the DataContext we read that

In general, a DataContext instance is designed to last for one “unit of work” however your application defines that term. A DataContext is lightweight and is not expensive to create. A typical LINQ to SQL application creates DataContext instances at method scope or as a member of short-lived classes that represent a logical set of related database operations.

so the most logical place to create and dispose of our DataContexts is in the methods that implement the business logic. This works perfectly well for retrieving data, and for updates on entities that have no relationships, but fails with a

Cannot Attach An Entity that already exists.

exception when an update is made to entity relationships. The problem is that Linq to SQL cannot move objects between DataContexts, so if one context was used to look up the object in question and another was used to look up an object used in a relation (say, to a lookup table), then Linq throws the fit seen here. In a web application, it is much easier to keep this from ever happening, as a single DataContext will likely be used to do the work from a BL call (or, at least, the calls will be sufficiently separate as not to tread on each other’s feet).
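The failure is easier to see in a toy model. The following Python sketch is not the Linq to SQL API; ToyContext and its methods are hypothetical, and I am only assuming that the real check resembles this one. It mimics the two behaviors described above: an identity cache that hands back the same object for the same key, and a refusal to attach an object that a different context produced for a key this context already tracks.

```python
class ToyContext:
    """Toy stand-in for a change-tracking context (not the Linq to SQL API)."""

    def __init__(self, rows):
        self._rows = rows      # fake table: key -> name
        self._identity = {}    # identity cache: key -> tracked object

    def get(self, key):
        # Same key, same object, for the lifetime of this context.
        if key not in self._identity:
            self._identity[key] = {"id": key, "name": self._rows[key]}
        return self._identity[key]

    def attach(self, entity):
        # Refuse a foreign object for a key we already track.
        tracked = self._identity.get(entity["id"])
        if tracked is not None and tracked is not entity:
            raise ValueError("Cannot attach an entity that already exists.")
        self._identity[entity["id"]] = entity


rows = {1: "Widget"}
ctx_a = ToyContext(rows)
ctx_b = ToyContext(rows)

from_a = ctx_a.get(1)
from_b = ctx_b.get(1)   # same row, but a different object

try:
    ctx_a.attach(from_b)
    error = None
except ValueError as exc:
    error = str(exc)
```

Each context is internally consistent; the trouble only starts when an object loaded by one context is handed to the other.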

If the context is moved up to the business object layer (i.e. as a static member), the problem is partially alleviated and partially aggravated. It is alleviated in that all of the objects of a certain type will, at least, have been pulled from a central DataContext and so will have no issues amongst themselves. However, there is still an issue when a property is set (via databinding) to an object from a list that was pulled by another DataContext. An easy, and genuine, example is where one entity (call it A) has an attribute named “type”, which must be one of the entries in a lookup table (which we will call entity B). If a drop-down list is databound to lookup entries pulled by entity B’s DataContext (the most logical choice), the same error message as above is hit, unless, of course, all of the entries are re-pulled by entity A’s DataContext before saving: a labor-intensive, inefficient, and maintenance-heavy process. The application could be written this way, but not without a great deal of effort to re-pull and re-merge data with a single context.

Finally, one could move the context up to the application layer–the entire application shares a single datacontext. The problem with this is that, in an application where multiple tabs or windows can be open, if any single object attempts to save its changes via SubmitChanges, the pending changes for all windows will get submitted, even if the user comes back and hits “Cancel”. The result in this scenario is utter and complete chaos.

Ultimately, what we did in this scenario was to create a single DataContext per ViewModel (only where we had experienced these issues, not universally) and pass it through all of the data fetching operations. The bookkeeping was certainly a little tedious to write, but it worked. From a conceptual standpoint, this is very dirty, as it makes the presentation layer aware, even in a limited sense, of what is being done by the data access layer. While Linq to Sql is very nice, it has some very bad shortcomings when used in GUI applications.
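The shape of that workaround, sketched in Python rather than C#: every name below (ToyContext, fetch_customer, fetch_customer_types, CustomerViewModel) is hypothetical and stands in for our real classes. Each ViewModel owns one context and passes it to every fetch, so saving or cancelling in one window leaves the pending changes of other windows alone.

```python
class ToyContext:
    """Hypothetical unit of work; it only records fetches and pending changes."""

    def __init__(self):
        self.fetched = []
        self.pending = []

    def submit_changes(self):
        # Hand back what was submitted and start with a clean slate.
        submitted, self.pending = self.pending, []
        return submitted


def fetch_customer(ctx, customer_id):
    """Data-access call that takes the caller's context instead of making one."""
    ctx.fetched.append(("customer", customer_id))
    return {"id": customer_id, "type_id": 1}


def fetch_customer_types(ctx):
    """Lookup-table fetch, routed through the same context."""
    ctx.fetched.append(("customer_types", None))
    return [{"id": 1, "name": "Retail"}, {"id": 2, "name": "Wholesale"}]


class CustomerViewModel:
    def __init__(self, customer_id):
        self.ctx = ToyContext()  # one context per ViewModel
        self.customer = fetch_customer(self.ctx, customer_id)
        self.types = fetch_customer_types(self.ctx)

    def choose_type(self, type_id):
        self.customer["type_id"] = type_id
        self.ctx.pending.append(("customer", self.customer["id"], type_id))

    def save(self):
        return self.ctx.submit_changes()

    def cancel(self):
        # Drops this window's pending changes only.
        self.ctx.pending.clear()


first = CustomerViewModel(10)
second = CustomerViewModel(20)
first.choose_type(2)
second.choose_type(2)
saved = first.save()  # submits the first window's change only
```

The cost is visible in the signatures: the ViewModel has to know that a context exists at all, which is exactly the layering leak complained about above.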

One too many Tiers

Something has been nagging me lately about the three tier architecture–quite simply, it has too many tiers. If you subscribe to the full three tier architecture, you have an application that, at the end of the day, looks like this:

Yet, if you are using that architecture, you are almost certainly using it with an object oriented programming language–and if both things are true, there is a problem. Its nature may not be immediately obvious, but it is there nonetheless: this flavor of the n-tier architecture defeats the entire point of object oriented programming.

To review, one of the upsides of object orientation is that data and the operations performed on it are encapsulated into a single structure. When so-called business rules (operations, really) are split into ancillary classes (the BL classes), encapsulation is broken. In effect, we are using object oriented techniques to implement procedural programming with dumb C-style structs.

The true value in the multitiered architecture is actually far simpler than this birthday-cake methodology that has been faithfully copied into so many projects: keep presentation and logic separate. Any good methodology gets this much right (like MVC).

In conclusion, the remedy is simple: if you have or are building an application with a multitiered architecture, make your code base cleaner and more intuitive by merging the BO and BL layers.
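A small before-and-after, sketched in Python with invented class names (InvoiceBO, InvoiceBL, Invoice), shows what merging the two layers amounts to:

```python
from dataclasses import dataclass


# Three-tier style: a dumb data holder plus a separate "business logic" class.
@dataclass
class InvoiceBO:
    subtotal: float
    tax_rate: float


class InvoiceBL:
    @staticmethod
    def total(invoice: InvoiceBO) -> float:
        # The rule lives apart from the data it operates on.
        return invoice.subtotal * (1 + invoice.tax_rate)


# Merged style: the data and the operation on it live in one class.
@dataclass
class Invoice:
    subtotal: float
    tax_rate: float

    def total(self) -> float:
        return self.subtotal * (1 + self.tax_rate)


split_total = InvoiceBL.total(InvoiceBO(100.0, 0.25))
merged_total = Invoice(100.0, 0.25).total()
```

Both versions compute the same number. The difference is that in the first one InvoiceBO is a struct and InvoiceBL is a bag of procedures, whereas the second is an ordinary object.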

A Short Introduction to MVVM

Our team is building an application using WPF with the Model-View-ViewModel design pattern and I wanted to take a few minutes to give an introduction to MVVM. The pattern itself is comparable to the venerable MVC pattern, though by no means identical. Let’s begin by examining each piece and then looking at how they fit together.

  1. Model–the model is very much the same thing as the model in MVC or business objects in a three tiered architecture. It is a straight up model of the data being manipulated without any display logic of any variety.
  2. View–the view is, again, very much the same as the view in MVC. It is the formatting or display.
  3. ViewModel–if you are familiar with MVC or similar patterns, the ViewModel is the largest departure. There are two ways to look at a ViewModel, which will become clearer after reading through some code.
    1. The ViewModel as an adapter between the model and the view. This is, perhaps, the most familiar and comforting way to view it, though it is also the least accurate, as the logic behind a view is also encapsulated in the ViewModel.
    2. The ViewModel can be seen as an encapsulation of the logic and state of the view, independent of any display logic. In short, a ViewModel Models a View.

On the ViewModel, the second explanation is the best, though I did find #1 helpful when first examining the pattern.

MVVM is a fairly new pattern, seeing most (or all?) of its use in some of the newer Microsoft technologies, WPF and Silverlight. As a result, the fit between framework and pattern is often subpar. The easiest example (which does not seem to arise in Silverlight) is that of popping up a dialog in a WPF application. If the ViewModel knows how to pop up a dialog, then we are clearly violating the pattern, as the ViewModel is supposed to model a view’s operations and state and leave such details to the view.

After all, the whole idea here is that we should be able to bolt multiple views onto a single Model-ViewModel pair, especially (and here is where the aims differ a little from MVC, if not in theory, at least in practice) views that cross paradigms. For example, a WPF view and a Silverlight view, allowing the application to exist as both a desktop application and a web-based application.

If you do not do something, though, you are unable to perform an elementary task: prompt the user (after some fashion or another) for input. In practice, we are using a mediator to allow the ViewModel to send messages which the View can then receive and act on as its implementation mandates.

On one hand, this works well and I like how it falls out in practice. The View and the ViewModel remain separate and mockups or tests could be written that simply interact with the mediator.
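To show the shape of the mediator arrangement without dragging in WPF, here is a minimal sketch in Python. None of these classes exist in our code base or in any framework; Mediator, DocumentViewModel, and FakeView are illustrative names. The ViewModel only sends a message, and whatever view is registered decides how to ask the user.

```python
from collections import defaultdict


class Mediator:
    """Routes named messages from senders to whoever subscribed to them."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, message, handler):
        self._handlers[message].append(handler)

    def send(self, message, payload=None):
        # Return every handler's answer so the sender can act on them.
        return [handler(payload) for handler in self._handlers[message]]


class DocumentViewModel:
    """Knows it needs a confirmation, but not how one is displayed."""

    def __init__(self, mediator):
        self.mediator = mediator
        self.closed = False

    def request_close(self):
        answers = self.mediator.send("confirm", "Discard unsaved changes?")
        if answers and all(answers):
            self.closed = True


class FakeView:
    """Stands in for a window; a real view would pop up a dialog here."""

    def __init__(self, mediator, answer):
        self.prompts = []
        self.answer = answer
        mediator.subscribe("confirm", self.on_confirm)

    def on_confirm(self, text):
        self.prompts.append(text)
        return self.answer


mediator = Mediator()
view = FakeView(mediator, answer=True)
view_model = DocumentViewModel(mediator)
view_model.request_close()
```

A test can swap in a FakeView that answers yes or no and inspect what the ViewModel did, with no window ever being shown.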

From a more theoretical standpoint, it makes me uneasy because it is plastering over a severe weakness in the pattern that, perhaps, ought to be addressed at the pattern level instead of at the implementation level. Moreover, what is a mediator, really? It is very much like an ad-hoc event handling system. Would it not be better to simply use events as they were meant to be used?

Another thing I noticed causing some people angst was that the MSDN description of MVVM (see the section entitled “Relaying Command Logic”) says that the codebehind for a XAML file should be empty. While I certainly think the idea of the View itself not doing anything, as it were, is a good one, there is sometimes logic that is View-specific and should, therefore, be kept in the view. A better formulation, in my humble opinion, is that there should be only tasks specific to the view itself in the codebehind. For example, if you are writing the basic set of CRUD operations for some object, the act of saving the object will not be view specific. Taking care of some rendering details might be. The optimum case is, of course, that all logic finds its way into the ViewModel. Until WPF and MVVM are a better fit, there will still be oddball cases that mandate violating the principle.

To wrap up, the most important thing about MVVM is that the ViewModel acts as a model for a view rather than a traffic controller (like the Controller in MVC) so that, in theory, one could bolt entirely different UIs on top of one Model-ViewModel set. In practical terms, MVVM is in its infancy and, consequently, there are still some rough edges that developers should be aware of when writing code.

No, I do not want to reboot…

What is it with Windows and this urgent, burning desire to reboot?

Here is how the last couple of weeks on my work PC have gone, as an example.

I boot up my computer (which is hard, because every so often, for no discernible reason, the PC hangs on boot) and log in. Windows chipperly informs me that it updated everything. Yay! Butterflies and daisies and happiness. I start up my usual army of suspects. Visual Studio. Firefox. Bug tracker. Et cetera, et cetera.

Then AVG Professional pops up a message. All perky, it tells me that it has finished updating. I need to reboot, how about now?

Grrrrrr. I’m just getting down to work and you want to reboot? Heck no.

But I can’t say no. I can postpone it. For 60 minutes.

Fine. 1 hour. Just get the heck out of my face.

So, every hour or so I tell AVG it better flipping not reboot my computer.

Then the shiny, dolphin blue box pops up. Windows has, like, just finished installing the most totally awesome bunch of updates. How ’bout rebooting now?

NO. GO AWAY. I DO NOT WANT TO REBOOT.

Well, all right then. We can postpone for four hours.

FINE POSTPONE IT FOR 4 HOURS. I JUST WANT TO GET SOME WORK DONE.

Then, as it turns out, Flash and Java want to update too.