In Search of C# Omnicomplete for Vim

By day, I write in C#, mostly on a stock .NET install (version 4, as of this writing; I expect that the principles laid out here will transfer forward as the Vim ecosystem is fairly stable). I often find myself switching back and forth between Visual Studio 2010 (with VsVim) and gvim 7.3. Frankly, I should like to spend more time on the gvim side than I do. While a great deal of time and effort has gone into customizing my vimrc for .NET development, I often find myself switching back to VS in order to get the benefits of Intellisense when working with parts of the very hefty .NET Framework that I do not recall from memory.

Every so often, I do some fishing around for something useful to make my Vim Omnicomplete more productive. In this post, I will lay out my newest attempt and analyze the findings. As such, this post may or may not be a tutorial on what you should do. In any event, it will be a science experiment in the plainest sense of the word.

First, the hypothesis. While checking out the Vim documentation on Omnicomplete, we see that the Omnicomplete function for C makes heavy use of an additional tag file, generated from the system headers, and that this file is used in conjunction with what Omnicomplete understands about the C programming language to make a good guess as to what the programmer likely intends.

It should be possible then, with minimum fuss, to generate a similar tag file for C#. It may also be necessary to tweak the completion function parameters. We will look at that after we have checked the results of the tag file generation.

It turns out that Microsoft releases the .NET 4 Framework’s source code under a reference-only license. The initial vector of attack will be to download the reference code and build a tag file from it (this seems well in keeping with the intent behind the license; if this is not so, I will gladly give up the exercise). The link with the relevant source is the first one (Product Name “.NET” and version “8.0” as of this writing). The source was placed under RefSrc in Documents.

After running:

ctags -R -f dotnet4tags *

in the RefSrc\Source\.Net\4.0\DEVDIV_TFS\Dev10\Releases\RTMRel directory, we got our first pass at a tag file. A little googling prompted the change to this:

ctags -R -f dotnet4tags --exclude="bin" --extra=+fq --fields=+ianmzS --c#-kinds=cimnp *

Then, as the documentation says, we added the tag file to our list of tag files to search:

set tags+=~/Documents/RefSrc/Source/.Net/4.0/DEVDIV_TFS/Dev10/Releases/RTMRel/dotnet4tags

When this is used in conjunction with tag completion (CTRL-X CTRL-]), the results are superior to any previous attempts, particularly in conjunction with the Taglist plugin.

With this alone, we do not get any real contextual searching. For example, if we type something like:

File f = File.O

and then initiate the matching, we get practically any method that begins with an O regardless of whether or not said method is a member of the File class.
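To make the missing piece concrete, here is a rough sketch of the kind of filtering a context-aware completion would need: keep only the tags whose class scope matches the type before the dot. It is written in Python rather than Vim script purely for illustration. The sample tag lines are made up (not taken from the real tag file), the helper names are hypothetical, and it assumes that the --fields options above cause ctags to emit tab-separated kind: and class: extension fields after the ex command.

```python
# Made-up tag lines in the extended ctags format: name, file, ex command,
# then tab-separated "key:value" extension fields. Not real framework data.
SAMPLE_TAGS = [
    'Open\tFile.cs\t/^public static FileStream Open(/;"\tkind:m\tclass:System.IO.File',
    'OpenRead\tFile.cs\t/^public static FileStream OpenRead(/;"\tkind:m\tclass:System.IO.File',
    'Delete\tFile.cs\t/^public static void Delete(/;"\tkind:m\tclass:System.IO.File',
    'Open\tFileInfo.cs\t/^public FileStream Open(/;"\tkind:m\tclass:System.IO.FileInfo',
    'Offset\tDateTimeOffset.cs\t/^public TimeSpan Offset/;"\tkind:p\tclass:System.DateTimeOffset',
]


def parse_tag_line(line):
    """Split one tag line into (name, path, extension-field dict).

    Assumes the ex command contains no tab and no ';"' of its own,
    which is good enough for a sketch but not for a robust parser.
    """
    name, path, rest = line.rstrip("\n").split("\t", 2)
    _, _, extension = rest.partition(';"')
    fields = {}
    for item in extension.split("\t"):
        key, separator, value = item.partition(":")
        if separator:
            fields[key] = value
    return name, path, fields


def complete_member(tag_lines, class_name, prefix):
    """Return member names of class_name that start with prefix."""
    matches = set()
    for line in tag_lines:
        if line.startswith("!_TAG_"):  # skip tag file header lines
            continue
        name, _path, fields = parse_tag_line(line)
        # Compare only the last component of the scope, e.g. "File".
        short_scope = fields.get("class", "").rsplit(".", 1)[-1]
        if short_scope == class_name and name.startswith(prefix):
            matches.add(name)
    return sorted(matches)


print(complete_member(SAMPLE_TAGS, "File", "O"))  # ['Open', 'OpenRead']
```

Plain tag completion corresponds to dropping the class check, which is why every method beginning with O shows up.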

If we stop here, we still have a leg up over what we had before. We can navigate to .NET framework methods and fetch their signatures through the Taglist browser—but we would still like to do better.

The only reason the resulting tag file is not included here is that it is fairly large: not huge, but much too large to be a simple attachment to this post.

A Quick Note on Building noweb on Cygwin

My laptop has bitten the dust. Until I have the chance to open it up and see if the damage is fixable, I have been borrowing my wife’s computer to tinker (to her annoyance, I’m sure, but she used my laptop until we replaced the desktop so all’s fair). I was going to install noweb on cygwin, and hit the following error on build:

In file included from notangle.nw:28:
getline.h:4: error: conflicting types for 'getline'
/usr/include/sys/stdio.h:37: error: previous declaration of 'getline' was here
getline.h:4: error: conflicting types for 'getline'
/usr/include/sys/stdio.h:37: error: previous declaration of 'getline' was here

As I had built noweb before, this error struck me as a little strange. It turns out that, in stdio.h, Cygwin includes its own declaration of getline, unlike on standard Unix-likes. A quick googling turned up that this was not unique to noweb, but that other packages had encountered similar difficulties. The answer that worked for me is here:

In short, all one has to do is open /usr/include/sys/stdio.h and comment out the line that reads:

ssize_t _EXFUN(getline, (char **, size_t *, FILE *));

For safety’s sake, I reinstated the line after installing noweb and everything seems to be running fine.

Literature Review: PEGs

Parsing Expression Grammars, or PEGs, are a recognition-based formalism for describing syntax, meant to ease the task of building rich programming languages. I had had the opportunity to tinker with PEGs sparingly and, finally, I got around to reading the original paper (available here:). My reading notes from the paper can be downloaded here:

I am fully aware that this is not, as it were, a new paper. It came up originally in my searches for a good parsing library in Common Lisp. For the project that it was intended for, I ultimately moved on to using Ometa. While Ometa is a fine system, it actually did not win on power grounds because, quite simply, I do not need the extra expressiveness for what I am working on. It won out because the implementation was better than the PEG library I had tried.

As it is kind of old territory, my review has little to say. In reality, when I first ran across PEGs I felt strangely out of the loop, but here goes anyway:

PEGs are a powerful mechanism for defining parsing grammars. The form of the language itself is similar to standard EBNF in its general layout, but allows native creation of grammars. It avoids the ambiguities inherent to Context Free Grammars by using prioritized selection of paths through the grammar: alternatives are tried in order and the first one that matches wins. As a result, PEGs can express some languages that no CFG can (the paper gives a^n b^n c^n as an example) while being simpler to use.
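Prioritized selection is easy to show in a few lines. The following is a minimal sketch in Python, not the notation from the paper or the API of any PEG library; the combinator names literal, sequence, and choice are made up for illustration. Each parser takes a string and a position and returns the new position, or None on failure.

```python
def literal(text):
    """Match an exact string at the current position."""
    def parse(source, position):
        if source.startswith(text, position):
            return position + len(text)
        return None
    return parse


def sequence(*parsers):
    """Match every parser in order; fail if any one of them fails."""
    def parse(source, position):
        for parser in parsers:
            position = parser(source, position)
            if position is None:
                return None
        return position
    return parse


def choice(*parsers):
    """Prioritized choice: the first alternative that succeeds wins."""
    def parse(source, position):
        for parser in parsers:
            end = parser(source, position)
            if end is not None:
                return end  # later alternatives are never consulted
        return None
    return parse


short_first = choice(literal("a"), literal("ab"))
long_first = choice(literal("ab"), literal("a"))

print(short_first("ab", 0))                       # 1
print(long_first("ab", 0))                        # 2
print(sequence(short_first, literal("c"))("abc", 0))  # None
print(sequence(long_first, literal("c"))("abc", 0))   # 3
```

The third call fails even though the input could be split as "ab" followed by "c": once the first alternative has matched, the choice is settled. That is the price of having exactly one parse for any input, and it is why the order of alternatives matters when writing a PEG.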

While PEGs seem to have caught on a lot better than their predecessors (discussed in the paper), they seem to receive less notice than Ometa, which builds further on PEGs.

How WPF gets GUI programming right

WPF is another in a long line of Microsoft UI-related technologies, each promising more than the one before. WPF is basically Silverlight for the desktop (or, if you prefer, Silverlight is WPF for the web). We have been building an application in WPF as of late at my place of employment, and I thought I’d post what I think WPF does right.

The biggest thing is that WPF builds UIs declaratively. I cannot stress enough how important I think this really is. The biggest pain about using Java’s Swing framework was writing long sequences of code that initialized controls in a form’s constructor. Under the hood, Windows Forms works pretty much the same way. The biggest difference is that Microsoft ships a nice designer with Visual Studio, so the raw kludginess of the approach is hidden from most programmers, since they look at everything through the lens of the designer.

The declarativeness goes beyond simply allowing one to declare that widgets will exist to their layout (via the Grid mechanisms–really, these should be used by default and the designer left on the shelf) and their data flow. The latter is particularly interesting. ASP.NET has data binding, but the version employed by WPF is far more sophisticated. When I jumped back to an ASP.NET project, I immediately found myself missing the power of WPF databinding, but to add it to a web framework would unquestionably require a continuation based framework like the one employed by Weblocks or Seaside.

The importance here is that both the interface and how it interacts with data can be declared. Many GUI designers and markup languages have come along that allowed one to declare the layout, but few, if any, mainstream GUI designers have allowed so much expressiveness.

The hard part about all this is that C# is a statically typed language and, as a result, a lot of these features are based heavily on reflection, which is a performance hit because the JIT compiler cannot really optimize these things. Perhaps it was just my imagination, but I feel pretty sure that WPF applications lag behind their Windows Forms cousins in terms of speed.

All in all, though, WPF is a fine framework.

Polymorphism, Multiple Inheritance, & Interfaces…Pick 2.

The title for this post comes from a statement that was brought up by a coworker as having been said to him. The overall point of this post is simple: given that choice, your answer should be obvious. You want polymorphism and multiple inheritance, because there is nothing that you can do with interfaces that you cannot do with multiple inheritance.

Interfaces provide two things, depending on their use: a form of multiple inheritance in languages that do not otherwise support it and design-by-contract capabilities. Clearly, in the former case, you are better off with multiple inheritance, as you receive the full power of the feature. In the latter case, it is trivial to create an almost-empty class that acts as an interface, if that is the effect you are after.
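As an illustration of the almost-empty class standing in for an interface, here is a small sketch in Python, chosen only because it supports multiple inheritance directly. The class names Comparable, Printable, and Version are made up for the example.

```python
class Comparable:
    """An almost-empty class acting as an interface: it only states a contract."""

    def compare_to(self, other):
        raise NotImplementedError("subclasses must implement compare_to")


class Printable:
    """A second parent that, unlike an interface, carries real behavior."""

    def describe(self):
        return "<" + type(self).__name__ + ">"


class Version(Comparable, Printable):
    """Inherits the contract from one parent and an implementation from the other."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def compare_to(self, other):
        mine = (self.major, self.minor)
        theirs = (other.major, other.minor)
        # -1, 0 or 1, in the style of a typical compare method
        return (mine > theirs) - (mine < theirs)


older = Version(1, 2)
newer = Version(1, 3)
print(older.compare_to(newer))        # -1
print(isinstance(older, Comparable))  # True
print(older.describe())               # <Version>
```

Comparable plays the role an interface would, while Printable shows what the interface-only approach gives up: a second parent that contributes working code.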

The main objection raised was the counterexample: what if you have a class Animal and another class Plant? Surely you do not want a programmer to inherit from both? That would not make sense. To which I would answer: why not? If it makes sense to whoever wrote it, why prevent it? They might, after all, be creating something for the Little Shop of Horrors.

Largely, I think the belief that interfaces are somehow superior to multiple inheritance comes from never having used multiple inheritance in a system built from the ground up to support it (like CLOS in Common Lisp), as multiple inheritance strictly supersedes interfaces.

The Literature

Looking back at my last few posts, something occurred to me: a lot of the more exotic focus of this blog has been lost. While I enjoy examining MVVM and QuickBooks, one of the whole points of this blog was to offer a fusion between useful code monkey concepts and computer science (hence, the domain name of this site). Lately, there has not been much “scientist” at the mad computer scientist.

One of my new series of posts is going to be literature reviews. I have a massive reading list of computer science papers queued up as well as some other materials. In these posts, I will read a journal article or watch a lecture and post my notes and thoughts about it. The first one will be coming soon, so look out for it.

More on Microblogging and Programming

I had been rolling around some thoughts on microblogging and programming since my last blog post. First of all, I found it interesting that Twitter started life as an internal project before getting VC funding. This reinforces, to me, the value of what I was saying, which is that microblogging for more limited audiences and topics is more useful than the present day and age where we have people microblogging about brushing their teeth.

I have also been interested in doing more work on Sheepshead. According to gitorious, my last commit was over a month ago. Such are the results of having a family, a job, and a life–but I really want to get back to working on it. As I start gearing it all up again, I have decided to try a little experiment. Instead of simply waiting on someone else to try out microblogging for a small development team, I am going to try to bootstrap a small team while microblogging. As I develop Sheepshead and push it forward, I am going to try and use microblogging to mull over design decisions and announce progress.

The service I have decided to use for this endeavor is (you can see the stream here), rather than the more ubiquitous Twitter. I did this for a few reasons, chief among them being that I expect there to be more engineering types as well as more open source-minded individuals on it. Another important consideration is that it allows its users to export data. My intention is to keep backups of the information on the feed, so that if something were to happen to it and the project attained a meaningful size, a StatusNet instance could be set up, even if only as a stopgap.

We will see how this all goes (or if it does–I can definitely see how Sheepshead is sort of a niche development). In the meantime, I am going to try and get some code written.

Linq to Sql is not fit for GUI Applications

The title is a little incendiary, I admit, but I think it is a good place to start.

We are building a database-driven application with WPF (using MVVM) & Linq to SQL and, in the process, a few caveats about Linq to SQL have come out in a truly fine way.

The issues all revolve around that little innocuous thing known as a DataContext. For those of you who may not be familiar with the idea, in Linq to SQL a DataContext is “the source of all entities mapped over a database connection. It tracks changes that you made to all retrieved entities and maintains an “identity cache” that guarantees that entities retrieved more than one time are represented by using the same object instance.”

Further down the reference page for the DataContext we read that

In general, a DataContext instance is designed to last for one “unit of work” however your application defines that term. A DataContext is lightweight and is not expensive to create. A typical LINQ to SQL application creates DataContext instances at method scope or as a member of short-lived classes that represent a logical set of related database operations.

so the most logical place to create and dispose of our DataContexts is in the methods that implement the business logic. This works perfectly well for retrieving data, and for updates on entities that have no relationships, but fails with a

Cannot Attach An Entity that already exists.

exception when an update is made to entity relationships. The problem is that Linq to SQL cannot move objects between DataContexts, so if one context was used to look up the object in question and another was used to look up one used in a relation (say, to a lookup table), then Linq throws the fit seen here. In a web application, it is much easier to keep this from ever happening, as a single DataContext will likely be used to do the work from a BL call (or, at least, the calls will be sufficiently separate as not to tread on each other’s feet).
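The failure mode is easier to see in a toy model. The sketch below is in Python and is emphatically not the Linq to SQL API: ToyContext, Entity, relate, and can_relate are made-up names, and the rule that a relation may only join objects loaded by the same context is a deliberate simplification of whatever the real DataContext does internally.

```python
class Entity:
    def __init__(self, key, value, owner):
        self.key = key
        self.value = value
        self.owner = owner    # the context that loaded this object
        self.related = None


class ToyContext:
    """Toy stand-in for a unit-of-work object with an identity cache."""

    def __init__(self, rows):
        self._rows = rows      # pretend database: key -> value
        self._identity = {}    # identity cache: key -> Entity

    def get(self, key):
        # The same key always yields the same object within one context.
        if key not in self._identity:
            self._identity[key] = Entity(key, self._rows[key], self)
        return self._identity[key]

    def relate(self, entity, related):
        # Simplified rule: both ends must have been loaded by this context.
        if entity.owner is not self or related.owner is not self:
            raise ValueError("Cannot attach an entity that already exists.")
        entity.related = related


ROWS = {"order:1": "widgets", "type:7": "rush"}


def can_relate(order_context, lookup_context):
    """Load an order and a lookup row, then try to link them."""
    order = order_context.get("order:1")
    kind = lookup_context.get("type:7")
    try:
        order_context.relate(order, kind)
    except ValueError:
        return False
    return True


first, second = ToyContext(ROWS), ToyContext(ROWS)
print(can_relate(first, second))  # False
print(can_relate(first, first))   # True
```

Reading data never trips the check, which matches the experience described above: the trouble only appears once a relationship ties together objects that came from two different contexts.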

If the context is moved up to the business object layer (i.e. as a static member), the problem is partially alleviated and partially aggravated. It is somewhat alleviated in that all of the objects of a certain type will, at least, have been pulled from a central DataContext and so will have no issues amongst themselves. However, there is still the issue of when an object is set (via databinding) from a list that was pulled by another DataContext. An easy, and genuine, example is where one entity (call it A) has an attribute named “type”, which must be one of the entries in a lookup table (which we will call entity B). If a drop-down list is databound to the entries in the lookup table, and those entries are pulled by entity B’s DataContext (the most logical choice), the same error message as above is hit–unless, of course, all of the entries are repulled by entity A’s DataContext before saving, which is a labor-intensive, inefficient, and maintenance-heavy process. At any rate, the application could be written this way, but not without a great deal of effort to repull and remerge data with a single context.

Finally, one could move the context up to the application layer–the entire application shares a single datacontext. The problem with this is that, in an application where multiple tabs or windows can be open, if any single object attempts to save its changes via SubmitChanges, the pending changes for all windows will get submitted, even if the user comes back and hits “Cancel”. The result in this scenario is utter and complete chaos.

Ultimately, what we did in this scenario was to create a single DataContext per ViewModel (only where we had experienced these issues, not universally) and pass it through all of the data fetching operations. The bookkeeping was certainly a little tedious to write, but it worked. From a conceptual standpoint, this is very dirty as it makes the presentation layer aware, even in a limited sense, of what is being done by the data access layer. While Linq to Sql is very nice, it has some very bad shortcomings when used in GUI applications.
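The difference between one application-wide context and one context per ViewModel can be sketched with another toy model. Again this is Python and not the Linq to SQL API; PendingContext, ViewModel, and run_scenario are hypothetical names, and the "database" is just a dictionary.

```python
class PendingContext:
    """Toy unit of work: changes are held back until submit_changes is called."""

    def __init__(self, database):
        self._database = database
        self._pending = {}

    def change(self, key, value):
        self._pending[key] = value

    def submit_changes(self):
        # Everything pending on this context is written, whoever queued it.
        self._database.update(self._pending)
        self._pending.clear()

    def discard(self):
        self._pending.clear()


class ViewModel:
    """Receives its context from outside and uses it for every operation."""

    def __init__(self, context):
        self.context = context

    def edit(self, key, value):
        self.context.change(key, value)

    def save(self):
        self.context.submit_changes()

    def cancel(self):
        self.context.discard()


def run_scenario(shared):
    """Two windows edit different rows; one saves, the other cancels."""
    database = {"customer": "Ann", "invoice": "draft"}
    first_context = PendingContext(database)
    second_context = first_context if shared else PendingContext(database)
    window_one = ViewModel(first_context)
    window_two = ViewModel(second_context)
    window_one.edit("customer", "Bob")
    window_two.edit("invoice", "sent")
    window_one.save()      # the user saves in window one
    window_two.cancel()    # the user cancels in window two
    return database


print(run_scenario(shared=True))   # {'customer': 'Bob', 'invoice': 'sent'}
print(run_scenario(shared=False))  # {'customer': 'Bob', 'invoice': 'draft'}
```

With the shared context, the cancelled invoice edit is written anyway because it was already pending when window one saved. With one context per ViewModel, the cancel does what the user expects.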

One too many Tiers

Something has been nagging me lately about the three tier architecture–quite simply, it has too many tiers. If you subscribe to the full three tier architecture, you have an application that, at the end of the day, looks like this:

Yet, if you are using that architecture, you are almost certainly using it with an object oriented programming language–and if both things are true, there is a problem. Its nature may not be immediately obvious, but it is there nonetheless: this flavor of the n-tier architecture defeats the entire point of object oriented programming.

To review, one of the upsides of object orientation is that data and the operations performed on it are encapsulated into a single structure. When so-called business rules (operations, really) are split into ancillary classes (the BL classes), encapsulation is broken. In effect, we are using object oriented techniques to implement procedural programming with dumb C-style structs.

The true value in the multitiered architecture is actually far simpler than this birthday-cake methodology that has been faithfully copied into so many projects: keep presentation and logic separate. Any good methodology gets this much right (like MVC).

In conclusion, the remedy is simple: if you have or are building an application with a multitiered architecture, make your code base cleaner and more intuitive by merging the BO and BL layers.
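A small before-and-after sketch of that merge, in Python for brevity. The invoice example and all of its names (InvoiceData, InvoiceLogic, Invoice) are invented for illustration and do not come from any particular project.

```python
from dataclasses import dataclass, field


# Before: a dumb "business object" plus a separate "business logic" class.
@dataclass
class InvoiceData:
    lines: list = field(default_factory=list)  # (quantity, unit_price) pairs


class InvoiceLogic:
    @staticmethod
    def total(invoice):
        return sum(quantity * price for quantity, price in invoice.lines)


# After: the data and the operations on it live in one class.
@dataclass
class Invoice:
    lines: list = field(default_factory=list)

    def add_line(self, quantity, unit_price):
        # The rule sits next to the data it protects.
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self.lines.append((quantity, unit_price))

    def total(self):
        return sum(quantity * price for quantity, price in self.lines)


split = InvoiceData(lines=[(2, 5), (1, 3)])
print(InvoiceLogic.total(split))  # 13

merged = Invoice()
merged.add_line(2, 5)
merged.add_line(1, 3)
print(merged.total())  # 13
```

Both versions compute the same total. The difference is that nothing stops other code from putting a bad line into InvoiceData, whereas Invoice can enforce its own rules. The presentation layer stays separate in either case.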

A Short Introduction to MVVM

Our team is building an application using WPF with the Model-View-ViewModel design pattern and I wanted to take a few minutes to give an introduction to MVVM. The pattern itself is comparable to the venerable MVC pattern, though by no means identical. Let’s begin by examining each piece and then looking at how they fit together.

  1. Model–the model is very much the same thing as the model in MVC or business objects in a three tiered architecture. It is a straight up model of the data being manipulated without any display logic of any variety.
  2. View–the view is, again, very much the same as the view in MVC. It is the formatting or display.
  3. ViewModel–if you are familiar with MVC or similar patterns, the ViewModel is the largest departure. There are two ways to look at a ViewModel, which will become clearer after reading through some code.
    1. The ViewModel as an adapter between the model and the view. This is, perhaps, the most familiar and comforting way to view it, though it is also the least accurate, as the logic behind a view is also encapsulated in the ViewModel.
    2. The ViewModel can be seen as an encapsulation of the logic and state of the view, independent of any display logic. In short, a ViewModel models a View.

On the ViewModel, the second explanation is the best, though I did find #1 helpful when first examining the pattern.

MVVM is a fairly new pattern, seeing most (or all?) of its use in some of the newer Microsoft technologies, WPF and Silverlight. As a result, the fit between framework and pattern is often subpar. The easiest example (which does not seem to arise in Silverlight) is that of popping up a dialog in a WPF application. If the ViewModel knows how to pop up a dialog, then we are clearly violating the pattern, as the ViewModel is supposed to model a view’s operations and state and leave such details to the view.

After all, the whole idea here is that we should be able to bolt multiple views onto a single Model-ViewModel pair, especially (and here is where the aims differ a little from MVC, if not in theory, at least in practice) views that cross paradigms. For example, a WPF view and a Silverlight view, allowing the application to exist as both a desktop application and a web-based application.

If you do not do something, though, you are unable to perform an elementary task: prompt the user (after some fashion or another) for input. In practice, we are using a mediator to allow the ViewModel to send messages which the View can then receive and act on as its implementation mandates.

On one hand, this works well and I like how it falls out in practice. The View and the ViewModel remain separate and mockups or tests could be written that simply interact with the mediator.
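A minimal sketch of the mediator arrangement, written in Python rather than C# and not taken from our code base. Mediator, DeleteViewModel, FakeView, and the "confirm" message name are all hypothetical. The ViewModel only sends a message; whatever is registered on the other side decides how to ask the user.

```python
class Mediator:
    """Routes named messages from senders to whoever registered for them."""

    def __init__(self):
        self._handlers = {}

    def register(self, message, handler):
        self._handlers.setdefault(message, []).append(handler)

    def send(self, message, payload=None):
        # Returns every handler's answer so the sender can act on them.
        return [handler(payload) for handler in self._handlers.get(message, [])]


class DeleteViewModel:
    """Knows that it needs a confirmation, but not how one is displayed."""

    def __init__(self, mediator, items):
        self.mediator = mediator
        self.items = list(items)

    def delete(self, item):
        answers = self.mediator.send("confirm", "Delete " + item + "?")
        if any(answers):
            self.items.remove(item)


class FakeView:
    """Test double: records prompts and gives a canned answer, no dialog."""

    def __init__(self, mediator, answer):
        self.answer = answer
        self.prompts = []
        mediator.register("confirm", self.on_confirm)

    def on_confirm(self, prompt):
        self.prompts.append(prompt)
        return self.answer


mediator = Mediator()
view = FakeView(mediator, answer=True)
view_model = DeleteViewModel(mediator, ["alpha", "beta"])
view_model.delete("alpha")
print(view_model.items)  # ['beta']
print(view.prompts)      # ['Delete alpha?']
```

A real WPF view would register a handler that pops up a dialog box instead. In this sketch, a message that nobody has registered for counts as a refusal, which is one of several defaults one could pick.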

From a more theoretical standpoint, it makes me uneasy because it is plastering over a severe weakness in the pattern that, perhaps, ought to be addressed at the pattern level instead of at the implementation level. Moreover, what is a mediator, really? It is very much like an ad-hoc event handling system. Would it not be better to simply use events as they were meant to be used?

Another thing I noted was causing some people angst was that the MSDN description of MVVM (see the section entitled “Relaying Command Logic”) said that the codebehind for a xaml file should be empty. While I certainly think the idea of the View itself not doing anything, as it were, is a good one, there is sometimes logic that is View-specific and should, therefore, be kept in the view. A better formulation, in my humble opinion, is that there should be only tasks specific to the view itself in the codebehind. For example, if you are writing the basic set of CRUD operations for some object, the act of saving the object will not be view specific. Taking care of some rendering details might be. The optimum case is, of course, that all logic find its way into the ViewModel. Until WPF and MVVM are a better fit, there will still be oddball cases that mandate violating the principle.

To wrap up, the most important thing about MVVM is that the ViewModel acts as a model for a view rather than a traffic controller (like the Controller in MVC) so that, in theory, one could bolt entirely different UIs on top of one Model-ViewModel set. In practical terms, MVVM is in its infancy and, consequently, there are still some rough edges that developers should be aware of when writing code.