Sunday, September 19, 2010

Programming: Things We Should Take For Granted

Since an anniversary is coming up, and I've accumulated a nice chunk of experience in the years since I started learning how to program, I've been reflecting on what languages have taught me about getting stuff done. This is also, obviously, a rant.

That's the cardinal rule: getting stuff done. I don't give a shit if it has more fanboys than Apple, or if it's the most brilliant idea since they blamed the invention of fire on Prometheus. It has to be useful for getting stuff done!

Here are some knick-knacks that you should expect from programming languages since the 1970s; oh, how little we have moved forward...







Serious Scoping


The ability to define modular scopes. In most cases this simply has to do with managing namespaces, either via built-in language constructs (C++, PHP 5.3, C#) or the natural order of modules in general (C, Java, Python), but really it's trivial. It follows from building on the whole 'scope' concept in general. In other words, we could say that a namespace is a kind of scope. Looking at the Wikipedia page, it seems someone agrees with that notion. Using this language feature, you essentially have object oriented programming at your disposal, sans the constructors and destructors (the 'tors) hidden behind the curtain. Some languages even throw those in, making OOP a real snap. While the inheritance model is popular and the less well known prototype model is interesting, they are not required for achieving the core aims of object oriented programming; at least not the ones touted ;).

Anonymous Executable Code Blocks

Basically anonymous functions. This means we can write higher order functions, in other words functions can take and return functions. We can even do part of this in C through the use of function pointers (and C++ also has function objects, which are nicer if more verbose). But we can't define those functions on the fly, right there and then, making things harder. C++ and Java will at long fucking last be gaining a solution to this, if we live long enough to see it happen. Most languages worth learning have made doing all this pretty easy.
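
To make the "functions can take and return functions" bit concrete, here is a minimal sketch in Python. The names compose and add_one_then_double are made up for illustration; nothing here comes from a particular library.

```python
from typing import Callable

IntFunc = Callable[[int], int]


def compose(f: IntFunc, g: IntFunc) -> IntFunc:
    """Higher order function: takes two functions, returns a new one.

    The returned function applies g first, then f.
    """
    return lambda value: f(g(value))


# Both arguments are anonymous functions defined on the fly,
# right where they are needed.
add_one_then_double = compose(lambda n: n * 2, lambda n: n + 1)

result = add_one_then_double(3)  # (3 + 1) * 2
```

In C you could pass function pointers to something like compose, but you could not write the two little functions inline at the call site, which is the whole complaint.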

Abuse of Lexical Scoping

This usually means lexical closures, which in turn makes the aforementioned anonymous functions a prerequisite for happiness in normal cases. Language designers could also cook up something different but equivalent for a change, but alas, don't count on that! So the anon' func' thing is where it's at for now. If the language lacks closures, it's obviously a fucking moron, or it had better be old. C at least has that excuse. What about you, Java? The road to 7 is a sick joke. Anyway, as I was saying, if you have both serious scoping and a way to define executable blocks without a name, you ought to be able to treat such a block just like anything else that has a value. Here's an example in pseudo code:

var x = function {
  var y = 0
  return function { y = y+1; return y }
}

test1 = x()
test2 = x()

do { test1(); test2() } for 0 to 5


You can expect either of two things to happen here:

  1. Every function returned by x will increment the instance of y stored in x.
  2. Every function returned by x will get its own private copy of y.

The latter is the norm. I rather think the former is better, if you also provide a syntactic means of doing the latter; this makes the closing over of data more explicit. In terms of how it would look, think about the static and new keywords in many statically typed OO languages, plus copy constructors, and you'll get some sort of idea. I blame my point of view on this on spending too much time with Python. Ditto for my thoughts on memory allocation, courtesy of C.
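
For the curious, here is a rough Python rendering of both behaviours. It is only a sketch: the name shared_x and the trick of hanging y off the function object are my own stand-ins for the "static" flavour, and I am assuming "for 0 to 5" means six passes.

```python
def x():
    """Behaviour 2: every returned function gets its own private y."""
    y = 0

    def increment():
        nonlocal y  # close over the y of this particular call to x
        y = y + 1
        return y

    return increment


def shared_x():
    """Behaviour 1: every returned function bumps one y stored on shared_x."""

    def increment():
        shared_x.y += 1  # the state lives on the factory, not in the closure
        return shared_x.y

    return increment


shared_x.y = 0  # assumption: emulate a 'static' variable with an attribute

test1 = x()
test2 = x()
private_results = [(test1(), test2()) for _ in range(6)]

s1 = shared_x()
s2 = shared_x()
shared_results = [s1(), s2(), s1()]
```

With private copies, test1 and test2 count in lock step; with the shared variant, every call moves the same counter no matter which function you call.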

While we're at it, if the vaguely EcmaScript-like pseudo code above were a real language, you should expect function x {} and test1(), test2() for ... to be sugar for the relevant snippets out of the above, or else you ought to break out the tar and feathers! I wrote them as I did to make it painfully obvious what's happening to people still struggling at var x = function {}. Kudos if you figured out why I wrote do { } for instead of for {}. If you're laughing after reading the above paragraph, yippee :-). If you're now scrolling back up, then sorry.


Just shuddup & build it!

This is so sorely missed in most compiled languages. There should be no need to define the relationship between modules or how to build them; it should be inferable from the source code. So should the type of most things, if the Lisp, SML, C#, and C++0x people are anyone to listen to. Building a cross platform C or C++ app is a bitch. Java is no better. Most dynamic languages are fine, so long as you keep everything in that language and have no C/C++ extensions to build. The closest you can get to this "Just shuddup & build it" concept in most mainstream languages depends on dedicated build infrastructures built on top of make tools (FreeBSD, Go, etc.) or IDEs. Either case is a crock of shit until it comes as part of the language, not a language implementation or a home brewed system. Build tools like CMake and SCons can kiss my ass. It's basically a no win situation. JVM/CLI languages seem to take a stab at things but still fall flat; Go manages better.

Dependency management also complicates things, because building your dependencies and using them becomes part of building your project. Most dynamic languages have grown some method of doing this; compiled ones are still acting like it's the 70s. For what it's worth, Google's goinstall is the least painful solution I've encountered. Ruby's gems would come in second place if it wasn't for Microsoft Windows and C/C++ issues cropping up here and there. Python eggs and Perl modules are related.

Here's an example of brain damage:


using Foo;

class X {
  public void test() {
    var x = new Foo.Bar();
    // do something to x
  }
}


Which should be enough to tell the compiler how to use Foo. Not just that you can type new Bar() instead of new Foo.Bar(), should you wish to do so. Compiling the above would look something like:


dot-net> csc -r:Foo.dll /r:SomeDependencyOfFoo.dll
mono$ mcs -r:Foo.dll -r:SomeDependencyOfFoo.dll


Which is a lazy crock of shit excuse for not getting creative with C#'s syntax (see also the /r:alias=file syntax in Microsoft's compiler documentation for an interesting idea). The only thing a using directive cannot tell the compiler is what file is being referenced, i.e. where to find the Foo namespace at run time. Some languages like Java (and many .NET developers seem to agree) impose a convention about organising namespaces and file systems. It's one way, but I don't care for it.

What is so fucking hard about this:


using "Foo-2.1" as Foo;


in order to say that Foo is found in Foo-2.1.dll instead of Foo.dll. If that sounds like weird syntax, just look closer at P/Invoke, how Java does packages, and Python's import as syntax.
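
As a rough illustration that naming the file and the alias in source is nothing exotic, here is a sketch using Python's standard importlib. The helper name using and the throwaway Foo-2.1.py file are hypothetical, created only so the example runs on its own; this is not how any C# compiler works.

```python
import importlib.util
import tempfile
from pathlib import Path


def using(file_path, alias):
    """Load the module stored at file_path and hand it back under alias.

    Hypothetical helper, loosely mimicking: using "Foo-2.1" as Foo;
    """
    spec = importlib.util.spec_from_file_location(alias, str(file_path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # run the file's code inside the module
    return module


with tempfile.TemporaryDirectory() as workdir:
    # Stand-in for Foo-2.1.dll: a versioned file that defines Bar.
    library = Path(workdir) / "Foo-2.1.py"
    library.write_text("class Bar:\n    version = '2.1'\n")

    Foo = using(library, "Foo")
    bar = Foo.Bar()
```

The source code says both which file to load and what to call it, which is all the proposed using syntax is asking for.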


So obviously we can use the language's syntax to provide whatever linking info should be needed at compile time; figuring out where to find the needed files at compile and run time is left as an exercise for the reader (and easily thunk about if you know a few language implementations).


In short, beyond exporting an environment variable saying where else to look for dependencies at compile/run time, we should not have to be arsed about any of that bullshit. If you're smart, you will have noted most of what I just talked about has to do with compiling and linking against dependencies, not building them. Good point. If we're talking about building things without a hassle, we also have to talk about building dependencies.

Well guess what folks, if you can build a program without having to fuck about with linking to your libraries, you shouldn't have to worry about building them either. That's the sense behind the above. What's so hard about issuing multiple commands (one per dependency and then again for your program) or doing it recursively? Simple. But no, just about every programming language has to suck at this. Some language implementations (Visual C++) make a respectable crack at it, so long as you rely on their system. Being outside the language standards, such things can suck my rebel dick before I'll consider them in this context.
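
The "doing it recursively" part really is that simple. A minimal sketch, assuming the dependency graph has already been read out of the source and has no cycles; build_order and the target name app are made up for illustration.

```python
def build_order(target, dependencies, done=None):
    """Return the targets to build, dependencies first, target last.

    dependencies maps each target to the list of things it needs.
    Assumption: the graph has no cycles; a cycle would recurse forever.
    """
    if done is None:
        done = []
    for dep in dependencies.get(target, []):
        if dep not in done:
            build_order(dep, dependencies, done)  # build what we need first
    if target not in done:
        done.append(target)
    return done


deps = {
    "app": ["Foo"],                   # hypothetical program that uses Foo
    "Foo": ["SomeDependencyOfFoo"],
}

order = build_order("app", deps)
```

One build command per entry in order, front to back, and you are done.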

Let's take some fairly typical languages as an example: C# and Java for starters, and to a lesser extent C and C++, as far as their standards allow. We can infer a few things by looking at the source code. Simply put, if the main method required for a program is not there, obviously the bundle of files refers to a library or libraries. If it is there, you're building a program. Bingo, we have a weener! Now if we're making life easier, you might ask about a case where the code is laid out in the file system in a way that makes all that harder to figure out. Well guess what, that means you laid it out wrong, or you shouldn't be caring about the static versus dynamic linking stuff. Duh.
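
Here is a deliberately naive sketch of that inference. The regular expressions are my own rough approximations of a main method in C#, Java, and C/C++, and classify is a made-up name; a real tool would ask the parser rather than grep the text, since comments and strings can fool this.

```python
import re
from typing import Mapping

# Rough textual approximations of an entry point, one per language family.
MAIN_PATTERNS = [
    re.compile(r"\bstatic\s+(?:void|int)\s+Main\s*\("),   # C#
    re.compile(r"\bpublic\s+static\s+void\s+main\s*\("),  # Java
    re.compile(r"\bint\s+main\s*\("),                     # C and C++
]


def classify(sources: Mapping[str, str]) -> str:
    """Guess whether a bundle of source files is a program or a library.

    sources maps file names to their text. If any file appears to define
    a main method, call it a program; otherwise call it a library.
    """
    for text in sources.values():
        if any(pattern.search(text) for pattern in MAIN_PATTERNS):
            return "program"
    return "library"


library_bundle = {"X.cs": "class X { public void test() { } }"}
program_bundle = {
    "X.cs": "class X { public void test() { } }",
    "Program.cs": "class Program { static void Main(string[] args) { } }",
}

kinds = (classify(library_bundle), classify(program_bundle))
```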


Rundown of the Mainstream

Just about every language has had serious scoping built in from day one, or has grown it over the years, like FORTRAN and COBOL. Some languages get a bit better at it; C#, for example, is very good at it. C++ and Python get the job done but are a bit faulty, since you can circumvent the scoping in certain cases; although I reckon that's possible in any language that can compile down to a level below such scoping concepts. Some might also wonder why I listed PHP 5.3 earlier when talking about "Serious Scoping". Well, 5.3 basically fixes the main faults PHP had; the only problem is that in common practice PHP is more often written like an unstructured BASIC. Die idiots Die. Languages like Java and Ruby are fairly typical here. By contrast, Perl mostly puts it in your hands by abusing lexical scope. I love Perl.

In terms of anonymous functions, Perl, Lisp, and SML are excellent at it and modern C# manages quite nicely (if you don't mind the type system). Whereas C, C++, and Java are just hopeless on this subject. Python and Ruby also make things somewhat icky: functions are not first class objects in Ruby, so you have to deal with Proc to have much fun; likewise Python's lambdas are limited to a single expression, so you usually have to resort to Pascalesque scope abuse to manage something useful, in the way of nesting named functions in the scope. Lisp is queen here; Perl and JavaScript are also very sexy beasts when getting into anonymous functions.
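
To show what the Python gripe looks like in practice, here is a small sketch. The names square and make_logger are invented for the example.

```python
# A lambda is fine as long as the body is one expression.
square = lambda n: n * n


def make_logger(prefix):
    """Return a function needing statements, which a lambda cannot hold."""
    lines = []

    def log(message):
        # An if statement and a separate append: too much for a lambda,
        # so the function has to be nested and given a name instead.
        if not message:
            return len(lines)
        lines.append(f"{prefix}: {message}")
        return len(lines)

    return log, lines


log, lines = make_logger("build")
counts = [log("start"), log(""), log("done")]
```

The nested def does the job, but you pay for it with a name and a few extra lines every time.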

In terms of lexical closures across languages, I'll just leave it to the text books.


As far as being able to shout "Just shuddup & build it!", they all suck!!! Most of the build related stuff is not iron clad defined by the C99/C++03 standards documents, and you are a fucking moron if you expect the entire world to use any given tool X to build their C/C++ software \o/. That rules out things like Visual Studio (vcbuild/msbuild), Make, and all the wide world of build tools you can think of. The most common dynamic languages in the mainstream are better. In order of least to most suckiness, Perl, Ruby, and Python make the process less painful; my reason for rating them thus is extension modules. They provide tools to help you build and distribute your extension modules and your native modules. The only gripes come from the principal language implementations being implemented in C. For pure modules (i.e. no C extensions), they are excellent! PHP and Java, I'm not even going to rate. The least painful language that I've gotten to use has been Go, which has the best thing since CPAN. Which is still harder than it should be \o/.





The question I wonder about is this: if it took until like the 2000s to get closures going strong after the serious scoping stuff took off by the late 1960s/early 1970s, will I have to survive until the 2030s-2050s to be able to bask in the wake of just being able to build stuff in peace? Most likely old friend C will still be there, but other languages should reach the mainstream in another 30+ years... not just the ones we have now. That, or we could go back to Lisp.... hehehe.

