Monday, March 9, 2009

Terry's Portable SHell

Hmm, my recent pet project (tpsh) is actually shaping up quite nicely. It is almost at what I would consider a usable interactive state; the only big irk left, I guess, is the pipe handling and a few odds and ends. Development started about last month (the first commit in my master branch is dated 2009-02-28) and has been a very on/off thing in my spare time; a lot remains to be done, but progress is quite nice.


I rather like the Korn shells, but I find it annoying to keep track of which implementation of korn, from whom, behaves in what way. OpenBSD's customized pdksh is the best of the lot in my opinion; the only problem with the korn *family* is consistency in the wild... cmd.exe just barely counts as a command shell, better for scripting than for interactive use. I can't help but think cmd.exe's filename completion was either written by a 5th grader or is a holdover from command.com (I haven't used command.com in years). Currently I use zsh most of the day on FreeBSD, ksh on OpenBSD, and cmd.exe on Windows XP. So it is really important to me to have a single and SANE shell.

I used to use tcsh a lot, but after getting into Bourne-style shell scripting (nota bene: I know nothing about csh scripting) I got pissed and sought out a shell that would allow me to test sh snippets at the prompt. To be honest, I do not have any issues with the GNU Bourne Again SHell (bash), some distros' default init files aside. Yet just like tcsh/zsh -- I have no big desire to cart bash around with me everywhere (and I do not like Cygwin).


The solution? Create a shell that behaves more or less consistently across platforms, does what I want, and is not a pain in the ass to port along for the ride.


Line editing is completely delegated to Term::ReadLine. Someday writing something like readline or curses for terminal XYZ on an alien OS would be one thing; having to implement shell line editing that is portable between the UNIX and Windows NT environments is another: I wouldn't do it if you paid me. (And even if drunk, I would still likely use normal libraries!)


On the interactive level, line editing for me is an absolute must. Command completion I can live without for a while. Unix names are usually short, and I like fairly terse and structured ones myself; winsucks is a bit of a different story. Given the childlike way cmd.exe does command completion, it's almost better to do without completion on Windows haha!


Right now the basic expansions work, ${VAR} and $VAR, and line editing and history are via Term::ReadLine (I've been using the Term::ReadLine::Zoid and Term::ReadLine::Perl implementations). There's only rudimentary handling of quoting for the time being, something that needs more labouring in order to handle nested quotes (e.g. echo "`ls 'C:/Program Files'`"), but it does the job well enough for right now. The 'single' and "double" quotes mostly work as expected. The `backticks` work as expected in simple cases, but since there is no proper support for shell script or -c "command" yet, things like `find /some/where -name foo -print` will work as expected, while `find /some/where -name foo -print | xargs whatever` will not. Once things are more evolved on the scripting side, it should be trivial to implement backticks correctly (nested quotes aside); the idea is to pass the backticks on to a child shell. Right now the only big catch-22 with quotes is that only one set can be used and they do not nest yet; fixing that will be fun.
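For the curious, the rough idea for backticks looks something like this sketch (expand_backquotes() is just a made-up name for illustration, not what is in the tree, and it still chokes on nested quotes):

  #!/usr/bin/perl -w
  # Sketch of the "hand backticks to a child shell" idea, once -c works.
  use strict;

  sub expand_backquotes {
      my ($word, $shell) = @_;

      $word =~ s{`([^`]*)`}{
          my $out = qx($shell -c '$1');   # let a child shell run the command
          $out =~ s/\s+$//;               # trim trailing newlines, like sh does
          $out;
      }ge;
      return $word;
  }

  print expand_backquotes(q{today is `date`}, '/bin/sh'), "\n";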


I/O redirection and pipes also work for starters, but not completely. The >, >>, and < operators do work, but don't accept an optional file descriptor number yet (e.g. foobitz 2>/dev/null) or other tricks of the trade. Basically just the I/O redirection I use a lot is implemented right now; since the focus is on interactive use by yours truly, it will probably stay that way for a while. command1 | command2 also works well enough right now, but I've yet to implement pipelines properly (e.g. cat f1 f2 f3 | sort | uniq | less doesn't work yet); there just wasn't enough time to shift from prototype to finished without sleep lol. Only one use of pipes is currently supported, until I have the time to code unlimited segments in a pipeline.
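The single-pipe case is basically the usual pipe/fork/exec dance on unix; roughly like the sketch below (not the actual tpsh code, and Win32 needs a different trick entirely):

  #!/usr/bin/perl -w
  # Minimal sketch of cmd1 | cmd2: wire the left command's stdout to the
  # right command's stdin through a pipe, then wait for both children.
  use strict;

  sub run_pipe {
      my ($left, $right) = @_;                  # array refs: [cmd, args...]
      pipe(my $rd, my $wr) or die "pipe: $!";

      my $pid1 = fork();
      die "fork: $!" unless defined $pid1;
      if ($pid1 == 0) {                         # left side writes the pipe
          open STDOUT, '>&', $wr or die "dup stdout: $!";
          close $rd; close $wr;
          exec @$left or die "exec $$left[0]: $!";
      }

      my $pid2 = fork();
      die "fork: $!" unless defined $pid2;
      if ($pid2 == 0) {                         # right side reads the pipe
          open STDIN, '<&', $rd or die "dup stdin: $!";
          close $rd; close $wr;
          exec @$right or die "exec $$right[0]: $!";
      }

      close $rd; close $wr;
      waitpid($_, 0) for $pid1, $pid2;
  }

  run_pipe([qw(ls -l)], [qw(sort -r)]);         # ls -l | sort -r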

Aliases even work ok at the moment, but no (damn!) addictive runtime parameter expansions have been added yet (e.g. alias qux="xuq -f $1 -x $2"). That will have to be done someday, hehe. In point of fact, $*, $@, and the things related to positional parameters are not even implemented yet.
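When that does get done, the substitution itself should be simple enough; here is a toy sketch of the idea (expand_alias_args() is a made-up name, and only $1, $2, ... and $* are handled):

  #!/usr/bin/perl -w
  # Toy version of runtime alias parameters: replace $N and $* in the
  # alias body with the words typed after the alias name.
  use strict;

  sub expand_alias_args {
      my ($body, @args) = @_;
      $body =~ s/\$\*/join(' ', @args)/ge;                        # $* -> all args
      $body =~ s/\$(\d+)/defined $args[$1 - 1] ? $args[$1 - 1] : ''/ge;
      return $body;
  }

  # alias qux="xuq -f $1 -x $2"; typing `qux foo bar` would then run:
  print expand_alias_args('xuq -f $1 -x $2', 'foo', 'bar'), "\n";  # xuq -f foo -x bar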


One benefit of my development environment is that I can use I/O redirection to run a simple test suite. I created a 'test-sh' script that strips the environment of all but the interesting stuff before invoking tpsh. Then I have a test file with the commands written out, so I can just do:

 $ ./test-sh < file
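test-sh itself does not need to be anything fancy; a rough guess at the idea, in perl for consistency (the kept-variable list and the ./tpsh path here are assumptions, not the real script):

  #!/usr/bin/perl -w
  # Throw away the environment except for a few interesting variables,
  # then exec the shell under test with whatever arguments were given.
  use strict;

  my @keep  = qw(HOME PATH TERM USER);
  my %clean = map { $_ => $ENV{$_} } grep { defined $ENV{$_} } @keep;

  %ENV = %clean;                                 # everything else is gone
  exec './tpsh', @ARGV or die "exec tpsh: $!";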
With that, I can watch the results of the test file fly across my screen, as if typed interactively (I freaking love unix shells). It is not quite a unit test, but it is enough to be able to run the test file and monitor the results for unexpected regressions or fuck ups, before pushing the code to vectra.

Currently shell globbing is limited to Perl's glob(), probably to be implemented directly with File::Glob::bsd_glob() in the future, along with some way of letting an environment variable specify the desired GLOB_* flags (perl 5.8's glob() seems to do what I want). Exactly what the globbing behaviour will be like, I dunno yet, because to be honest I have never noticed any big difference between (t)csh, perl, and zsh when it comes to using globs (but I also admit that I rarely do more than abuse * and {x,y,z} on a daily basis). If at all possible, I would like to use File::Glob rather than roll my own (and modern perl glob() does use File::Glob in the background afaik); there is a rough sketch of the flags idea further down.

I have yet to decide how to handle tab completion across the Gnu, Perl, and Zoid ReadLine modules, but coding the completion functionality itself should be fairly easy; the issue is just how to plug it in properly.

I believe that as soon as a full pipeline implementation is done, I'll be ready to start coding it from a tpsh session lol.

Goal wise, where am I trying to go?

It is implemented in Perl, because perl is fun, perl is less pain to port than C/C++, and I'm not going to use Java or C#, period lol (my zsh setup starts up slowly enough as is). Python, Ruby, or PHP would also have been good choices, except I'm trying to avoid rubber-banding between languages. (Also note: I _hate_ writing C code on Win32.)

It should generally behave like a traditional Bourne shell most of the time, but not be a true sh knock-off. Whenever in doubt, mimic ash / sh behaviour. It's not meant to be POSIX compliant any more than it is meant to be a real sh: based on, not a clone of.

It is supposed to behave as close to identical under unix and winnt as possible, with the "pains of installing" meant to be just a working ReadLine module. A good example: perl -e 'code' should work on Windows without having to adapt to cmd.exe quoting rules. (Really, I think cmd.exe has shitty documentation for such an essential program.) One reason I like languages like Perl and Python is that they mostly behave the same on whatever OS, with a minimum of pain involved (especially Python).

It should be easy to use, providing all the expected features -- without massive bloat. And it behaves like _I_ expect, not what someone else expects (good example: unix-style shell, not dos-style shell).

It is not an interface to Perl. There are at least 2 different perl shells out there that I know of, but I do not want a shell that behaves like perl or even does its configuration in perl; I want a shell written in perl that behaves like a regular shell lol. It's also an easy project to take my mind off other matters in life right now...

tpsh's builtin eval command will work more or less like sh's eval, not CORE::eval!!! Maybe I will implement a peval or pleval builtin that works like the shell's eval but executes its argument as perl code rather than shell commands (basically type `peval perl code` instead of `perl -e 'perl code'`). Accessing perl code through tpsh just is not a priority; when I want to talk to a perl interpreter, I'll invoke one.
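Going back to the globbing bit for a second, here is a rough sketch of what mapping an environment variable onto File::Glob's GLOB_* flags could look like (TPSH_GLOB and tpsh_glob() are names made up for the example, not anything in the code yet):

  #!/usr/bin/perl -w
  # Pick bsd_glob() flags from a comma separated list in $TPSH_GLOB.
  use strict;
  use File::Glob qw(:glob);

  sub tpsh_glob {
      my ($pattern) = @_;
      my %flag = (
          brace  => GLOB_BRACE,     # {x,y,z} expansion
          tilde  => GLOB_TILDE,     # ~user expansion
          nocase => GLOB_NOCASE,    # case-insensitive matching
          mark   => GLOB_MARK,      # append / to directories
      );
      my $flags = 0;
      for my $name (split /[,\s]+/, lc($ENV{TPSH_GLOB} || 'brace,tilde')) {
          $flags |= $flag{$name} if exists $flag{$name};
      }
      return bsd_glob($pattern, $flags);
  }

  print "$_\n" for tpsh_glob('~/*.txt');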
Eventually I want it to be able to execute shell script in some form; the goal of which (obviously) will be handling my shell's initialization files (~/.${USER}_shrc and ~/.site_shrc) without going belly up. To pull that off, a Bourne-style shell, a which program or builtin, and support for functions / aliases are required. (My init files handle sh, bash, zsh, and several flavours of korn, plus several OSes, from one main file + a site-local settings file.) My init file is rather elaborate in design compared to most others I've seen, yet very conservative in what it demands of the shell: the most advanced sh feature required being function support lol. The basic idea of the features/priorities for tpsh:
 a usable shell for daily use (basically done)
 tab completion (filenames, then command; then pluggable, then 'smart' completion)
 code the remaining builtins and allow a coloured prompt ;-) (etc)
 expand the expression syntax into a more usable (read normal) form
 do shell functions
 handle job control (basic (i.e. bg, fg, &), then rest)
 allow shell scripting in a conventional way, rather than tpsh < cmdfile
 grow the control flow and things for shell script (if, while, for, ...)
 take POSIX into closer consideration for the minor details

The only part that worries me is implementing the [ builtin lol. (Mm, maybe feed the [ stuff ] into a syntax tree from which we can build a corresponding perl statement to eval().)
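Something like this toy translation is what I have in mind (builtin_test() is a made-up name, and a real [ would still need to handle !, -a/-o, parentheses, and friends):

  #!/usr/bin/perl -w
  # Translate a few common [ expressions into perl source and eval() it.
  use strict;

  my %file_test = map { $_ => 1 } qw(-e -f -d -r -w -x -s);
  my %str_op    = ('='  => 'eq', '!=' => 'ne');
  my %num_op    = ('-eq' => '==', '-ne' => '!=', '-lt' => '<',
                   '-le' => '<=', '-gt' => '>',  '-ge' => '>=');

  sub builtin_test {
      my @w = @_;                       # the words between [ and ]
      my $perl;
      if (@w == 2 && $file_test{$w[0]}) {
          $perl = "$w[0] q{$w[1]}";                     # e.g. -f q{/etc/motd}
      } elsif (@w == 3 && $str_op{$w[1]}) {
          $perl = "q{$w[0]} $str_op{$w[1]} q{$w[2]}";   # string comparison
      } elsif (@w == 3 && $num_op{$w[1]}) {
          $perl = "$w[0] $num_op{$w[1]} $w[2]";         # numeric comparison
      } else {
          warn "[: bad expression: @w\n";
          return 1;
      }
      return (eval $perl) ? 0 : 1;      # sh-style: 0 is true, 1 is false
  }

  printf "[ 5 -gt 3 ]   => %d\n", builtin_test(qw(5 -gt 3));
  printf "[ foo = bar ] => %d\n", builtin_test(qw(foo = bar));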
