Announcing SpiralWeb version 0.1

Version 0.1 of SpiralWeb is available for download at http://pypi.python.org/pypi/spiralweb/0.1. To install, make sure that you have Python and pip, then run pip install spiralweb to download and install. The project home page can be found at https://gitorious.org/spiralweb.

About SpiralWeb

SpiralWeb is a Literate Programming system written in Python. Its primary aims are to facilitate the use of literate programming in real-world applications by using lightweight text-based markup systems (rather than TeX or LaTeX) and by integrating painlessly into build scripts. It is language agnostic. The default typesetting backend is Pandoc’s extended Markdown, but new backends can be readily added for any other system desired.

For more information on literate programming, please see literateprogramming.com.

Usage

The syntax is minimal:

@doc (Name)? ([option=value,option2=value2...])?

Denotes a document chunk. At the moment, the only option that is used is the out parameter, which specifies a path (either absolute or relative to the literate file it appears in) for the woven output to be written to.

@code (Name)? ([option=value,option2=value2...])?

Denotes the beginning of a code chunk. At present, the following options are used:

  • out, which specifies a path (either absolute or relative to the literate file it appears in) for the tangled output to be written to.
  • lang which specifies a language that the code is written in. This attribute is not used except in the weaver, which uses it when emitting markup, so that the code can be highlighted properly.

@<Name>

Within a code chunk, this indicates that another chunk will be inserted at this point in the final source code. It is important to note that SpiralWeb is indentation-sensitive, so any spaces or tabs preceding the reference will be used as the indentation for every line in the chunk's output, even if there is also indentation in the chunk.
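The indentation rule is easiest to see in a small example. The following is a minimal sketch in Python of how such an expansion could work; it is not SpiralWeb's actual implementation. The expand function, the REFERENCE pattern, and the sample chunks are all made up for illustration, and the sketch assumes that a reference sits alone on its line and that chunks do not refer to themselves.

```python
import re

# Hypothetical pattern: optional indentation, then an @<Name> reference
# that is the only thing on its line.
REFERENCE = re.compile(r"^([ \t]*)@<(.+?)>[ \t]*$")


def expand(name, chunks):
    """Return the lines of chunk `name` with every reference expanded."""
    lines = []
    for line in chunks[name].splitlines():
        match = REFERENCE.match(line)
        if match is None:
            lines.append(line)
            continue
        indent, referenced = match.groups()
        # Prefix every line of the referenced chunk with the reference's
        # own indentation, on top of whatever indentation the chunk has.
        lines.extend(indent + inner for inner in expand(referenced, chunks))
    return lines


# Made-up chunks: "main" refers to "body" from inside an indented block.
chunks = {
    "main": "def answer():\n    @<body>",
    "body": "value = 6 * 7\nif value:\n    return value",
}
print("\n".join(expand("main", chunks)))
```

Every line of body picks up the four spaces that precede the reference, and the line that was already indented inside body ends up indented by eight.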

@@

At any spot in a literate file, this directive results in a simple @ symbol.

Putting it all Together: Reverting a Single Commit in Git

First, we locate the hash of the commit we want to revert with git log.

Then we dump it out as a patch:

git format-patch -1 05bc54585b5e6bea5e87ec59420a7eb3de5c7f10 --stdout > changes.patch

(Note that the -1 switch limits the output to that single commit. Without it, git format-patch emits a patch for every commit since the one named, which is usually many more than wanted.)

Once we know that we have the patches that we wish to roll back, we run this command:

git apply --reverse --reject changes.patch

Finally, we commit the reverse-changes.

The big thing is the -1 switch on git format-patch. Many of the articles I found were pulling a large number of patches, and I did not need that.
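For anyone who wants to script the steps, here is a rough sketch in Python that wraps the same two git commands. The function names and the default patch file name are my own choices for illustration, and the final commit is deliberately left to be done by hand after inspecting the result.

```python
import subprocess


def format_patch_command(commit):
    """Command that writes the single commit `commit` to stdout as a patch."""
    return ["git", "format-patch", "-1", commit, "--stdout"]


def reverse_apply_command(patch_path):
    """Command that applies a patch file in reverse, leaving .rej files on conflict."""
    return ["git", "apply", "--reverse", "--reject", patch_path]


def reverse_single_commit(commit, patch_path="changes.patch"):
    """Dump `commit` to `patch_path`, then apply it in reverse to the working tree."""
    patch = subprocess.run(
        format_patch_command(commit), check=True, capture_output=True
    ).stdout
    with open(patch_path, "wb") as handle:
        handle.write(patch)
    subprocess.run(reverse_apply_command(patch_path), check=True)
```

Calling reverse_single_commit with the hash found in git log, from inside the repository, leaves the reversed changes in the working tree ready to be reviewed and committed.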

In Search of C# Omnicomplete for Vim

By day, I write in C#, mostly on a stock .NET install (version 4, as of this writing; I expect that the principles laid out here will transfer forward as the Vim ecosystem is fairly stable). I often find myself switching back and forth between Visual Studio 2010 (with VsVim) and gvim 7.3. Frankly, I should like to spend more time on the gvim side than I do. While a great deal of time and effort has gone into customizing my vimrc for .NET development, I often find myself switching back to VS in order to get the benefits of Intellisense when working with parts of the very hefty .NET Framework that I do not recall from memory.

Every so often, I do some fishing around for something useful to make my Vim Omnicomplete more productive. In this post, I will lay out my newest attempt and analyze the findings. As such, this post may or may not be a tutorial on what you should do. In any event, it will be a science experiment in the plainest sense of the word.

First, the hypothesis. While checking out the Vim documentation on Omnicomplete, we see that the Omnicomplete function for C makes heavy use of an additional tag file, generated from the system headers [http://vimdoc.sourceforge.net/htmldoc/insert.html#ft-c-omni], and that this file is used in conjunction with what Omnicomplete understands about the C programming language to make a good guess as to what the programmer likely intends.

It should be possible then, with minimum fuss, to generate a similar tag file for C#. It may also be necessary to tweak the completion function parameters. We will look at that after we have checked the results of the tag file generation.

It turns out that Microsoft releases the .NET 4 Framework’s source code under a reference-only license [http://referencesource.microsoft.com/netframework.aspx]. The initial vector of attack will be to download the reference code and build a tag file from it (this seems well in keeping with the intent behind the license; if this is not so, I will gladly give up the exercise). The link with the relevant source is the first one (Product Name “.NET” and version “8.0” as of this writing). The source was placed under RefSrc in Documents.

After running:

ctags -R -f dotnet4tags *

in the RefSrc\Source\.Net\4.0\DEVDIV_TFS\Dev10\Releases\RTMRel directory, we got our first pass at a tag file. A little googling [http://arun.wordpress.com/2009/04/10/c-and-vim/] prompted the change to this:

ctags -R -f dotnet4tags --exclude="bin" --extra=+fq --fields=+ianmzS --c#-kinds=cimnp *

Then, as the documentation says, we added the tag file to our list of tag files to search:

set tags+=~/Documents/RefSrc/Source/.Net/4.0/DEVDIV_TFS/Dev10/Releases/RTMRel/dotnet4tags

When this is used in conjunction with tagfile completion (C-X C-]) the results are superior to any previous attempts, particularly in conjunction with the Taglist plugin [http://www.thegeekstuff.com/2009/04/ctags-taglist-vi-vim-editor-as-sourece-code-browser/].

With this alone, we do not get any real contextual searching. For example, if we type something like:

File f = File.O

and then initiate the matching, we get practically any method that begins with an O regardless of whether or not said method is a member of the File class.

If we stop here, we still have a leg up over what we had before. We can navigate to .NET framework methods and fetch their signatures through the Taglist browser—but we would still like to do better.
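To get a feel for what contextual matching would involve, here is a small sketch in Python (not Vim script, and not a working completion function) that filters tag lines by the class recorded in the tag's extension fields. It assumes the extended tags format, with tab-separated key:value fields after the ;" marker, which is what the --fields switch above asks ctags to emit. The sample lines are made up for illustration rather than copied from the real tag file, and matching on the last dotted component of the class field is my own guess at how qualified names would need to be handled.

```python
def members_of(tag_lines, class_name, prefix):
    """Names of tags starting with `prefix` whose class field names `class_name`."""
    matches = []
    for line in tag_lines:
        line = line.rstrip("\n")
        if line.startswith("!_TAG_"):
            continue  # pseudo-tags describing the file itself
        head, separator, extension = line.partition(';"\t')
        if not separator:
            continue  # no extension fields on this line
        name = head.split("\t", 1)[0]
        fields = dict(f.split(":", 1) for f in extension.split("\t") if ":" in f)
        # Compare against the last component of a dotted class name.
        owner = fields.get("class", "").rsplit(".", 1)[-1]
        if owner == class_name and name.startswith(prefix):
            matches.append(name)
    return matches


# Made-up sample lines in the extended tags format.
sample_tags = [
    '!_TAG_FILE_FORMAT\t2\t/extended format/',
    'Open\tFile.cs\t/^    public static FileStream Open(string path)$/;"\tkind:m\tclass:System.IO.File',
    'OpenRead\tFile.cs\t/^    public static FileStream OpenRead(string path)$/;"\tkind:m\tclass:System.IO.File',
    'OnClick\tButton.cs\t/^    protected void OnClick(EventArgs e)$/;"\tkind:m\tclass:System.Windows.Forms.Button',
]
print(members_of(sample_tags, "File", "O"))
```

The hard part, which this sketch skips entirely, is working out from the text before the cursor that the class in question is File in the first place.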

The only reason the resulting tag file is not included here is that it is fairly large: not huge, but much too large to be a simple attachment to this post.

A Quick Note on Building noweb on Cygwin

My laptop has bitten the dust. Until I have the chance to open it up and see if the damage is fixable, I have been borrowing my wife’s computer to tinker (to her annoyance, I’m sure, but she used my laptop until we replaced the desktop so all’s fair). I was going to install noweb on cygwin, and hit the following error on build:

In file included from notangle.nw:28:
getline.h:4: error: conflicting types for 'getline'
/usr/include/sys/stdio.h:37: error: previous declaration of 'getline' was here
getline.h:4: error: conflicting types for 'getline'
/usr/include/sys/stdio.h:37: error: previous declaration of 'getline' was here

As I had built noweb before, this error struck me as a little strange. It turns out that, unlike standard Unix-likes, Cygwin includes its own declaration of getline in stdio.h. A quick search showed that this was not unique to noweb; other packages had encountered similar difficulties. The answer that worked for me is here:

http://ftp.tug.org/mail/archives/pdftex/2006-February/006370.html

In short, all one has to do is open /usr/include/sys/stdio.h and comment out the line that reads:

ssize_t _EXFUN(getline, (char **, size_t *, FILE *));

For safety’s sake, I reinstated the line after installing noweb and everything seems to be running fine.
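If you would rather not edit the header by hand, the change can be sketched as a small Python helper that works on the text of the file. This is only an illustration under my own assumptions: the function name is made up, the sample text below stands in for the real header, and anyone pointing it at the actual file should keep a backup copy so that the line can be reinstated afterwards.

```python
def comment_out_declaration(header_text, declaration):
    """Wrap every line that exactly matches `declaration` in a C comment."""
    result = []
    for line in header_text.splitlines(keepends=True):
        if line.strip() == declaration:
            body = line.rstrip("\r\n")
            ending = line[len(body):]  # preserve the original line ending
            line = "/* " + body + " */" + ending
        result.append(line)
    return "".join(result)


GETLINE = "ssize_t _EXFUN(getline, (char **, size_t *, FILE *));"

# Stand-in text for illustration; the real header is much longer.
sample = "#include <stdarg.h>\n" + GETLINE + "\nint other(void);\n"
patched = comment_out_declaration(sample, GETLINE)
print(patched)
```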

Literature Review: PEGs

Parsing Expression Grammars, or PEGs, are a syntax-oriented formalism for describing parsers, meant to ease the task of building rich programming languages. I had had the opportunity to tinker with PEGs sparingly and, finally, I got around to reading the original paper (available here: http://pdos.csail.mit.edu/~baford/packrat/popl04/). My reading notes from the paper can be downloaded here:

http://www.mad-computer-scientist.com/blog/wp-content/uploads/2011/06/peg.html

I am fully aware that this is not, as it were, a new paper. It came up originally in my searches for a good parsing library in Common Lisp. For the project that it was intended for, I ultimately moved on to using Ometa. While Ometa is a fine system, it actually did not win on power grounds because, quite simply, I do not need the extra expressiveness for what I am working on. It won out because the implementation was better than the PEG library I had tried.

As it is kind of old territory, my review has little to say. In reality, when I first ran across PEGs I felt strangely out of the loop, but here goes anyway:

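PEGs are a powerful mechanism for defining parsing grammars. The notation is similar to standard EBNF in its general layout, but a PEG describes how to recognize a language rather than how to generate it, so the grammar doubles as a description of the parser. It avoids the ambiguities inherent to Context Free Grammars by using prioritized selection of paths through the grammar: alternatives are tried in order, and the first one that matches wins. As a result, PEGs can express some languages that traditional CFGs cannot (the paper's example is a^n b^n c^n) while being simpler to use, although the paper leaves open whether every context-free language has a PEG.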
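Prioritized choice is simple enough to sketch in a few lines of Python. The combinators below are a toy of my own for illustration, not the paper's notation and not any particular PEG library. Each parser takes the input and a position and returns the new position, or None on failure.

```python
def literal(text):
    """Parser that matches `text` exactly at the current position."""
    def parse(source, position):
        if source.startswith(text, position):
            return position + len(text)
        return None
    return parse


def sequence(*parsers):
    """Parser that runs each parser in turn, failing if any of them fails."""
    def parse(source, position):
        for parser in parsers:
            position = parser(source, position)
            if position is None:
                return None
        return position
    return parse


def choice(*parsers):
    """Prioritized choice: the first alternative that matches wins."""
    def parse(source, position):
        for parser in parsers:
            result = parser(source, position)
            if result is not None:
                return result
        return None
    return parse


short_first = choice(literal("a"), literal("ab"))
long_first = choice(literal("ab"), literal("a"))

print(short_first("ab", 0))                            # consumes only "a"
print(long_first("ab", 0))                             # consumes "ab"
print(sequence(short_first, literal("c"))("abc", 0))   # fails: "a" was committed to
```

The last line is the interesting one. A context-free grammar with the rule (a | ab) c would accept "abc", whereas the ordered choice commits to "a" and never revisits the second alternative, which is how the ambiguity disappears and also why the order of alternatives matters.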

While PEGs seem to have caught on a lot better than their predecessors (discussed in the paper), they seem to receive less notice than Ometa, which builds further on PEGs.

How WPF gets GUI programming right

WPF is another in a long line of Microsoft UI-related technologies, each promising more than the one before. WPF is basically Silverlight for the desktop (or, if you prefer, Silverlight is WPF for the web). We have been building an application in WPF as of late at my place of employment, and I thought I’d post what I think WPF does right.

The biggest thing is that WPF builds UIs declaratively. I cannot stress enough how important I think this really is. The biggest pain about using Java’s Swing framework was writing long sequences of code that initialized controls in a form’s constructor. Under the hood, Windows Forms works pretty much the same way. The biggest difference is that Microsoft ships a nice designer with Visual Studio, so the raw kludginess of the approach is hidden from most programmers, since they look at everything through the lens of the designer.

The declarativeness goes beyond simply allowing one to declare that widgets will exist; it extends to their layout (via the Grid mechanisms, which really should be used by default, with the designer left on the shelf) and their data flow. The latter is particularly interesting. ASP.NET has data binding, but the version employed by WPF is far more sophisticated. When I jumped back to an ASP.NET project, I immediately found myself missing the power of WPF databinding, but to add it to a web framework would unquestionably require a continuation-based framework like the one employed by Weblocks or Seaside.

The importance here is that both the interface and how it interacts with data can be declared. Many GUI designers and markup languages have come along that allowed one to declare the layout, but few, if any, mainstream GUI designers have allowed so much expressiveness.

The hard part about all this is that C# is a statically typed language and, as a result, a lot of these features are based heavily on reflection, which is a performance hit because the JIT compiler cannot really optimize these things. Perhaps it was just my imagination, but I feel pretty sure that WPF applications lag behind their Windows Forms cousins in terms of speed.

All in all, though, WPF is a fine framework.

Polymorphism, Multiple Inheritance, & Interfaces…Pick 2.

The title for this post comes from a statement that a coworker brought up as having been said to him. The overall point of this post is simple: given that choice, your answer should be obvious. You want polymorphism and multiple inheritance, because there is nothing that you can do with interfaces that you cannot do with multiple inheritance.

Interfaces provide two things, depending on their use: a form of multiple inheritance in languages that do not otherwise support it and design-by-contract capabilities. Clearly, in the former case, you are better off with multiple inheritance, as you receive the full power of the feature. In the latter case, it is trivial to create an almost-empty class that acts as an interface, if that is the effect you are after.

The main objection raised was the counterexample: what if you have a class Animal and another class Plant? Surely you do not want a programmer to inherit from both? That would not make sense. To which I would answer: why not? If it makes sense to whoever wrote it, why prevent it? They might, after all, be creating something for the Little Shop of Horrors.
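Python happens to support multiple inheritance, so both halves of the argument can be sketched there. The class names below are invented for illustration: Feedable plays the part of the almost-empty class that acts as an interface, and AudreyII inherits from Animal and Plant at once.

```python
from abc import ABC, abstractmethod


class Feedable(ABC):
    """An almost-empty class standing in for an interface."""

    @abstractmethod
    def feed(self, meal):
        """Subclasses must say what happens when they are fed."""


class Animal:
    def move(self):
        return "creeps toward the counter"


class Plant:
    def photosynthesize(self):
        return "soaks up the sun"


class AudreyII(Animal, Plant, Feedable):
    """Both an animal and a plant, and bound by the Feedable contract."""

    def feed(self, meal):
        return f"eats {meal}"


audrey = AudreyII()
print(audrey.move(), "/", audrey.photosynthesize(), "/", audrey.feed("a fly"))
```

The contract is still enforced: a subclass of Feedable that does not define feed cannot be instantiated, which is all that the interface was buying in the first place.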

Largely, I think the belief that interfaces are somehow superior to multiple inheritance comes from never having used multiple inheritance in a system built from the ground up to support it (like CLOS in Common Lisp), as multiple inheritance strictly supersedes interfaces.

The Literature

Looking back at my last few posts, something occurred to me: a lot of the more exotic focus of this blog has been lost. While I enjoy examining MVVM and QuickBooks, one of the whole points of this blog was to offer a fusion between useful code monkey concepts and computer science (hence, the domain name of this site). Lately, there has not been much “scientist” at the mad computer scientist.

One of my new series of posts is going to be literature reviews. I have a massive reading list of computer science papers queued up as well as some other materials. In these posts, I will read a journal article or watch a lecture and post my notes and thoughts about it. The first one will be coming soon, so look out for it.

More on Microblogging and Programming

I had been rolling around some thoughts on microblogging and programming since my last blog post. First of all, I found it interesting that Twitter started life as an internal project before getting VC funding. This reinforces, to me, the value of what I was saying, which is that microblogging for more limited audiences and topics is more useful than the present day and age where we have people microblogging about brushing their teeth.

I have also been interested in doing more work on Sheepshead. According to gitorious, my last commit was over a month ago. Such are the results of having a family, a job, and a life–but I really want to get back to working on it. As I start gearing it all up again, I have decided to try a little experiment. Instead of simply waiting on someone else to try out microblogging for a small development team, I am going to try to bootstrap a small team while microblogging. As I develop Sheepshead and push it forward, I am going to try and use microblogging to mull over design decisions and announce progress.

The service I have decided to use for this endeavor is Identi.ca (you can see the stream here), rather than the more ubiquitous Twitter. I did this for a few reasons, chief among them being that I expect there to be more engineering types as well as more open source-minded individuals on Identi.ca. Another important consideration is that Identi.ca allows its users to export data. My intention is to keep backups of the information on the feed, so that if something were to happen to Identi.ca and the project attained a meaningful size, a StatusNet instance could be set up, even if only as a stopgap.

We will see how this all goes (or if it does; I can definitely see how Sheepshead is sort of a niche development). In the meantime, I am going to try and get some code written.

Linq to Sql is not fit for GUI Applications

The title is a little incendiary, I admit, but I think it is a good place to start.

We are building a database-driven application with WPF (using MVVM) & Linq to SQL and, in the process, a few caveats about Linq to SQL have come out in a truly fine way.

The issues all revolve around that little innocuous thing known as a DataContext. For those of you who may not be familiar with the idea, in Linq to SQL a DataContext is “the source of all entities mapped over a database connection. It tracks changes that you made to all retrieved entities and maintains an “identity cache” that guarantees that entities retrieved more than one time are represented by using the same object instance.”

Further down the reference page for the DataContext we read that

In general, a DataContext instance is designed to last for one “unit of work” however your application defines that term. A DataContext is lightweight and is not expensive to create. A typical LINQ to SQL application creates DataContext instances at method scope or as a member of short-lived classes that represent a logical set of related database operations.

so the most logical place to create and dispose of our DataContexts is in the methods that implement the business logic. This works perfectly well for retrieving data, and for updates on entities that have no relationships, but fails with a

Cannot Attach An Entity that already exists.

exception when an update is made to entity relationships. The problem is that Linq to SQL cannot move objects between DataContexts, so if one context was used to look up the object in question and another was used to look up one used in a relation (say, to a lookup table), then Linq throws the fit seen here. In a web application, it is much easier to keep this from ever happening, as a single DataContext will likely be used to do the work from a BL call (or, at least, the calls will be sufficiently separate as not to tread on each other’s toes).

If the context is moved up to the business object layer (i.e. as a static member), the problem is partially alleviated and partially aggravated. It is somewhat alleviated in that all of the objects of a certain type will, at least, have been pulled from a central DataContext and so will have no issues amongst themselves. However, there is still the issue of an object being set (via databinding) from a list that was pulled by another datacontext. An easy, and genuine, example is where one entity (call it A) has an attribute named “type”, which must be one of the entries in a lookup table (which we will call entity B). If a drop-down list is databound to the entries in the lookup table, and those entries are pulled by entity B (the most logical choice), the same error message as above is hit, unless, of course, all of the entities are repulled by entity A’s datacontext before saving: a labor-intensive, inefficient, and maintenance-heavy process. At any rate, the application could be written this way, but not without a great deal of effort to repull and remerge data with a single context.

Finally, one could move the context up to the application layer, so that the entire application shares a single datacontext. The problem with this is that, in an application where multiple tabs or windows can be open, if any single object attempts to save its changes via SubmitChanges, the pending changes for all windows will get submitted, including changes in a window where the user later hits “Cancel”. The result in this scenario is utter and complete chaos.

Ultimately, what we did in this scenario was to create a single DataContext per ViewModel (only where we experienced these issues, not universally) and pass it through all of the data fetching operations. The bookkeeping was certainly a little tedious to write, but it worked. From a conceptual standpoint, this is very dirty, as it makes the presentation layer aware, even in a limited sense, of what is being done by the data access layer. While Linq to SQL is very nice, it has some very bad shortcomings when used in GUI applications.
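The shape of the problem and of the workaround can be sketched with a toy model in Python. To be clear, nothing below is Linq to SQL's API. ToyContext, Entity, ViewModel, and the sample rows are all invented for illustration, and the check in set_type only imitates the rule that entities loaded by different contexts cannot be related to one another.

```python
class Entity:
    def __init__(self, key, data, owner):
        self.key = key
        self.data = data
        self.owner = owner  # the context that loaded this entity


class ToyContext:
    """Toy stand-in for a DataContext: it tracks the entities it has loaded."""

    def __init__(self, rows):
        self._rows = rows            # pretend database table
        self._identity_cache = {}    # key -> the one Entity for that key

    def get(self, key):
        # Repeated lookups of the same key return the same object.
        if key not in self._identity_cache:
            self._identity_cache[key] = Entity(key, dict(self._rows[key]), self)
        return self._identity_cache[key]

    def set_type(self, entity, lookup_entry):
        # Imitates the restriction: both sides must come from this context.
        if entity.owner is not self or lookup_entry.owner is not self:
            raise ValueError("entity was loaded by a different context")
        entity.data["type"] = lookup_entry.key


class ViewModel:
    """Owns one context and passes it through every fetch it performs."""

    def __init__(self, rows):
        self.context = ToyContext(rows)

    def load(self, key):
        return self.context.get(key)


# Made-up rows: one entity A and one lookup entry B.
ROWS = {"order-1": {"type": None}, "type-rush": {"label": "Rush"}}

view_model = ViewModel(ROWS)
order = view_model.load("order-1")
rush = view_model.load("type-rush")
view_model.context.set_type(order, rush)    # same context: fine

stray = ToyContext(ROWS).get("type-rush")   # loaded by some other context
try:
    view_model.context.set_type(order, stray)
except ValueError as error:
    print("rejected:", error)
```

Because every fetch made on behalf of a ViewModel goes through that ViewModel's own context, the drop-down entries and the entity being edited always share an owner, and cancelling one window never touches another window's pending changes.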