Wednesday 29 February 2012

I hate make

Actually, I hate make and shell (all forms) and sed and, basically, the whole *nix command structure.

The main reason: escapes. I just can’t seem to ever get them right.  In Makefiles they are a nightmare!

Just try issuing a sed -e command after a dependency — AFAICT§ there are three levels of escape to consider: 
  1. the make level (where $ is special, and you double it to get a real one), 
  2. the shell level (where $ is special outside of single quotes) and 
  3. the sed pattern level, where $ is special and you “escape” it with a backslash to make it not special again.
Here is the line (almost) I had to get right:
sed -e 's:^ABC=$:ABC=${VAR}:' file >file.tmp

Now, execute this on the command line (bash shell) and it should replace each complete line “ABC=” in the file called file by the complete line “ABC=${VAR}”, writing the result to file.tmp.

I have used (single) quotes so that the shell doesn’t try to substitute the (apparent) shell variable ${VAR}, but the second $ made me nervous: $ means ‘end-of-line’ to sed, and I had just used it to mean exactly that in the first part of the substitution. As it turns out, sed only treats $ as an anchor in the pattern, not in the replacement, so this is the one level where I could have got away with it. Still, to be safe, I settled on:

sed -e 's:^ABC=$:ABC=\${VAR}:' file >file.tmp

using a backslash escape to tell sed to treat the second $ literally (GNU sed, at least, accepts this).
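
For anyone who wants to poke at the command-line form on its own, here is a minimal sketch. It assumes GNU sed; the scratch directory and the three sample lines are made up for the test and are not from my real build.

```shell
#!/bin/sh
# Sketch: try the command-line form on a scratch file (hypothetical contents).
dir=$(mktemp -d)
printf 'ABC=\nABC=1\nXYZ=\n' > "$dir/file"

# Single quotes stop the shell touching ${VAR}; the backslash is for sed.
sed -e 's:^ABC=$:ABC=\${VAR}:' "$dir/file" > "$dir/file.tmp"

result=$(cat "$dir/file.tmp")
printf '%s\n' "$result"
rm -rf "$dir"
```

Only the bare “ABC=” line should change; “ABC=1” and “XYZ=” pass through untouched because the pattern is anchored at both ends.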

So this second version goes in the Makefile, and of course it fails.  (And, it takes me some time to find this out, since the make is executed in the heart of some long-winded build process, and if I don’t execute it there it will not have the context to make it work — bórza!)

The reason is that single quotes don’t protect anything from make, so it tries to substitute ${VAR} as a make variable. (The same goes for the end-of-line $: make reads “$:” as a reference to a variable named “:”, so that one has to be doubled as well.)  “Aha,” I cried, “I’ll have to escape the backslash and the $ for make.”  Like this:
sed -e 's:^ABC=$$:ABC=\\$${VAR}:' file >file.tmp

But I was wrong. The backslash isn’t sufficiently special in make: well, it is in places, but a double backslash is not reduced to a single backslash. Make leaves it as a double backslash, and the single quotes still offer no protection.  sed then reads the \\ as one literal backslash, so we end up with “ABC=\${VAR}” in the output.

Round I go again:

sed -e 's:^ABC=$$:ABC=\$${VAR}:' file >file.tmp
and this works.  Finally.
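
To watch all three levels at once without a long-winded build around it, here is a minimal sketch that writes a throwaway Makefile and runs it. It assumes make and sed are on the PATH; the scratch directory, the target name “all” and the sample lines are my own inventions.

```shell
#!/bin/sh
# Sketch: push the recipe line through a real make (hypothetical scratch setup).
dir=$(mktemp -d)
printf 'ABC=\nABC=1\n' > "$dir/file"

# A quoted here-document keeps the recipe text exactly as typed.
# Both dollars are doubled for make: a lone "$:" would be read by make
# as a reference to a variable named ":" and silently vanish.
recipe=$(cat <<'EOF'
sed -e 's:^ABC=$$:ABC=\$${VAR}:' file >file.tmp
EOF
)

# printf supplies the real tab that make insists on before a recipe line.
printf 'all:\n\t%s\n' "$recipe" > "$dir/Makefile"

(cd "$dir" && make -s)
first=$(head -n 1 "$dir/file.tmp")
printf '%s\n' "$first"
rm -rf "$dir"
```

Make turns each $$ into $, the shell leaves the single-quoted text alone, and sed finally strips the backslash, so the first line of file.tmp should read ABC=${VAR}.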

Why all this fuss?

Because *nix systems and their utilities have internalized all this escapery, inherited from an old-fashioned and stupid macro-language-style command structure, and we are now stuck with it.  Everywhere.  And of course, *nix being the democratic organisation it is, everyone does it differently, with different rules and with different exceptions.  Ugh.

So... I think we should start to unravel it.  We could start with make, for example. Let’s create a version of make where all the expressions are evaluated with referential transparency, which is to say that a literal string, once established, is never rescanned and re-interpreted, even if I pass it around in a variable and it appears in another expression.  Substitutions are made only once, and only if I say they should happen.  If I write a literal string with a $ in it, that $ cannot be interpreted as a variable indicator anywhere else.

Then, the utilities that we call from make should have an ‘uninterpreted’ interface. That is, it should be possible to pass them a literal string without worrying about the contents of the string.  If we are a utility passing strings we haven’t checked, we no longer have to worry about what they might contain and scan them for things to escape.  That way, we also don’t have to worry about what the utility is that we are calling, nor where it might send this string.
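
Some utilities already have something close to this. As a rough sketch of what I mean by an ‘uninterpreted’ interface, here is the same whole-line replacement done with awk, with the replacement text handed over through the environment. REPL is a made-up variable name and the scratch file is invented for the test; the only thing relied on is awk’s standard ENVIRON array.

```shell
#!/bin/sh
# Sketch: pass the replacement text to awk through the environment, where it
# is never rescanned. REPL and the scratch file are hypothetical.
tmp=$(mktemp)
printf 'ABC=\nABC=1\nXYZ=\n' > "$tmp"

# The text contains $, { and } but awk reads ENVIRON["REPL"] as plain data.
out=$(REPL='ABC=${VAR}' awk '
    $0 == "ABC=" { print ENVIRON["REPL"]; next }   # whole-line literal compare
    { print }                                      # everything else unchanged
' "$tmp")

printf '%s\n' "$out"
rm -f "$tmp"
```

The shell’s single quotes are the only escaping left, and there is no pattern and no replacement syntax for the string to collide with. Put this in a Makefile, of course, and every $ still has to be doubled.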

I guess we would have to tackle the shell, too.

There would be consequences: some of our tricks (for example: escaping the right number of times so that the cascade of string passings remove the escapes and trigger interpretation at just the right level; or, varying the name of a reference using the value of another) would be harder to achieve — but let’s face it, who needs these really? And when you do, how long does it take to get it right?  And have you ever got it (really) right?
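
For the record, this is the sort of trick I mean by varying the name of a reference using the value of another. It is a minimal sketch in plain sh; the variable names are invented for the illustration.

```shell
#!/bin/sh
# Sketch: indirect reference by deliberate rescanning (hypothetical names).
colour_sky=blue
name=colour_sky

# First scan (the double-quoted string): \$ becomes a literal $, and $name
# becomes colour_sky, giving the text  value=$colour_sky
# Second scan (eval): that text is run as a command, so value is set to blue.
eval "value=\$$name"

printf '%s\n' "$value"
```

One backslash too few or too many and the interpretation fires at the wrong level, which is exactly the cascade a referentially transparent shell would take away.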

Yup, a proper referentially transparent shell would be a first step.  Or maybe a grep....

hmm...

§ I also hate all these cheesy acronyms: they are just a linguistic barrier to ensure the proles can’t easily join the club.

PS: Did you notice that the ‘footnote’ didn’t link properly? I don’t know how to do this in this form — how do I link to an anchor in the same page?