from The Last Supplement to the Whole
Earth Catalog (1971); co-editor Paul Krassner's
response to Ken Kesey's 17-page screed
entitled "Tools From My Chest"
( www.ep.tc/realist/89 )
"As the UNIX system has spread, the fraction of its
users who are skilled in its application has decreased."
— Kernighan & Pike, "The UNIX Programming Environment" (1983)
I'm a software guy, so my tools are software tools. Over the years I've
picked up tools that have stayed with me; some I found and adapted to
my uses, while others I created myself. This blog entry is about some
of the found tools — as well as tools for building tools (meta-tools?)
— which have become my favorites.
The Search for Portability
Keeping my favorite tools around has been no easy feat, and frankly,
sometimes I'm astonished that I was able to do it at all. You see,
throughout my computer career (which has spanned about
60% of the history of commercially available computers), there has been an
incessant struggle by vendors to "lock in" customers to proprietary systems
and tools. This tended to work for them in the short term, but in the long
term they almost all failed, leaving orphaned software at every turn
when the proprietary systems became extinct. The only force working counter
to this was the occasional heroic struggles
of the users, first by banding together to demand standards, and then later
by writing and giving away free standards-based software. These brave
efforts have kept the overall trend towards extinction of tools
partially at bay.
At first I didn't understand how important this struggle was.
I began programming in the era of proprietary operating systems such as
OS/360 on IBM System/360 mainframes, RSTS on DEC PDP-11
minicomputers, AOS on Data General minicomputers, plus the toy
Apple DOS disk operating system on the Apple II personal computer,
and the equally toy DOS from Microsoft which ran on PCs.
I wrote several productivity tools during each of these technical epochs
that were all lost to the sands of time.
This began to bother me. I was pretty sure my dad had tools in his physical
workshop dating from the 1940s, such as screwdrivers, chisels and alligator
clips, that all still worked fine. Why couldn't I keep the tools I built
myself from one decade to the next?
When I first used UNIX in 1983 I was elated to find an OS that was somewhat
standard, and ran on multiple hardware platforms. Soon I began to appreciate
its robust set of small tools that can be recombined to quickly solve problems,
and its tendency to hide hardware details from programmers, and eventually
I also greatly appreciated its universal presence on nearly every hardware
platform. But, honestly, the big deal at the beginning was that so many
people I thought were smart were gung ho for UNIX.
A Voice in the Wilderness
cover of "Software Tools" by Kernighan & Plauger
I think I first heard about UNIX from some of my smart, geeky friends in
the late 1970s, but I do believe the first hard information I got about
it came from a book I read around 1980, at the recommendation of my friend
Wayne. It was "Software Tools" (1976) by Brian W. Kernighan and P. J. Plauger.
The funny thing was it hardly mentioned UNIX at all, but its co-author
Brian Kernighan was tightly wound into the social network at Bell Labs
which produced UNIX.
I have been re-reading this book this month, and I am amazed to find how much
of its teachings have been deeply integrated into my thinking about software.
It mentions UNIX in only two places I could locate, as an example of a
well-designed operating system, while explaining how to work around the
limitations of poorly-designed systems. This understated praise made UNIX
seem uber-cool.
When I'm reading a book I own, I usually mark interesting quotations with
notes on the inside front cover: a page number and some identifying words.
In a typical book I like I'll list a handful of quotes. In this one there
are over thirty. It's hard to pick just a few, but here are some I really
like.
Whenever possible we will build more complicated programs up from the
simpler; whenever possible we will avoid building at all, by finding new
uses for existing tools, singly or in combination. Our programs work
together; their cumulative effect is much greater than you could get
from a similar collection of programs that you couldn't easily connect.
* * * * * *
What sorts of tools [should we build]? Computing is a broad field, and we
can't begin to cover every kind of application. Instead we have concentrated
on an activity which is central to programming — tools that help us
develop other programs. They are programs that we use regularly, most of
them every day; we used essentially all of them while we were writing this book.
* * * * * *
How should we test [a program to count the lines in a file] to make sure it
really works? When bugs occur, they usually arise at the "boundaries" or
extremes of program operation. These are the "exceptions that prove the
rule." Here the boundaries are fairly simple: a file with no lines and a
file with one line. If the code handles these cases correctly, and the
general case of a file with more than one line, it is probable that it will
handle all inputs correctly.
In addition to these general principles, I learned to appreciate any operating
system that can string programs together, such as with the "pipes" mechanism
of UNIX.
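To make the pipes idea concrete, here is about the smallest possible example
(a sketch of my own, not from the book): neither program knows about the
other, yet chained together they answer a new question, namely how many
entries are in the current directory:

    ls | wc -l

The output of ls (one name per line when piped) becomes the input of wc -l,
which counts lines.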
So by the time I actually got to use UNIX, I already knew a little about what
made it so awesome.
Joining the UNIX Tribe
UNIX license plate ( www.unix.org/license-plate.html )
Before the World Wide Web (WWW), it was hard to learn about computers. It
took a rare genius to be able to simply read the manuals and figure out computer
hardware and software. Most of the time a tribe would form around an
installation, with oral traditions and hands-on demonstrations being passed
on to those who were less adept at learning from the manuals. In many cases
the field technical staff for the computer vendor would come in to do demos
and training, and pass along the oral tradition that way, and so there didn't
have to be a "seed" person who could figure it out by reading alone.
I remember asking the hypothetical question of whether it would be possible
for someone alone in an igloo with a generator to figure out a computer.
I was fortunate when I was learning to use UNIX to have just started a new
job where I had access to three very knowledgeable and clever people named Dan,
Phil and Jeff.
Dan was a full-time employee of the company, and the person who helped me
get the job. He had been adamant that the company get UNIX, accepting
no other option. (He had a "Live Free Or Die — UNIX" license plate
similar to the one shown above, a variation of the New Hampshire state motto,
on display in his office.) They ended up buying a VAX minicomputer from DEC,
but getting the operating system from a third-party vendor, Uniq,
which sold them a version of Berkeley Software Distribution (BSD) UNIX.
It was on this system that I began to learn.
Dan gave me the most hands-on help, setting me up with a login, teaching
me about the tricky but vital $PATH and $TERM variables and the squirrely
nature of backspaces, teaching me how to customize my prompt and giving me
templates for shell aliases and C programming language source code
(more on those in a minute).
He also taught me something I've never seen written down anywhere: for about
50% of the common UNIX commands (I counted), the two-letter abbreviation is
formed by the first letter followed by the next consonant. So:
- archive becomes ar
- copy becomes cp
- c compiler becomes cc
- link becomes ln
- list becomes ls
- move becomes mv
- names becomes nm
- remove becomes rm
- translate becomes tr
- visual editor becomes vi
And so on. Dan also explained to me that UNIX began in the minicomputer era
when most people used teletypes, which printed on paper, to communicate
with timesharing systems, and each keystroke took time and energy, so the
commands were made as short as possible. (We used to have a joke that
you were born with only so many key presses to use throughout your life,
so you had to conserve them.)
A similar impulse encouraged the liberal use of aliases, which allowed
a long command to be represented by a short abbreviation, such as 'd' in place
of 'ls -CF' (list files in columns, with markers for file types). Dan gave me my first
set of aliases; more on that below.
Through Dan's social network of UNIX experts the company found Phil and
Jeff, who worked as consultants. I didn't see them often — they
both frequently worked remotely from nearby UCSD — but they were
also very helpful.
Phil helped me understand the history and philosophy of UNIX. He told me how
most hardware manufacturers would rush their software people to finish an
operating system as soon as possible, so they could start shipping hardware
for revenue. At Bell Labs, UNIX was developed slowly and lovingly over a
period of about a dozen years. There was no time pressure because AT&T
(also known as "the phone company") was still a government-granted monopoly
for voice traffic, and to keep them from having an unfair advantage in other
communications they were prohibited from selling computer hardware or software.
UNIX was originally for internal use only at Bell Labs.
He also explained that every program was designed to do one thing well,
and be interconnected with others. One important principle was that any
program shouldn't know — or care — whether a human or another
program was providing its input. This is why programmers were encouraged
to have output be terse, or sometimes provide none at all.
The standard UNIX report from
a successful operation is no output. For example, the program
diff lists the differences between two
files. No differences results in no output. In the same vein, on most
traditional mainframe computers, if you connected to the command console
and pressed the Enter key without typing a command, you got an error
something like "ERROR #200232 -- ZERO LENGTH COMMAND" and then another
prompt. In UNIX you just get the prompt. The assumption is you know
what you're doing. Maybe you just wanted to see if the system is "up,"
or how responsive it is (a relic of the timesharing era which is also
useful when you run large simultaneous jobs on a PC).
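That silence pays off in scripts, too, because programs like diff also
report success through their exit status. A minimal sketch in Bourne-shell
syntax (the file names are hypothetical):

    if diff old.conf new.conf > /dev/null
    then
        echo "configuration unchanged"
    fi

diff exits with status 0 only when the files match, so the message appears
only in the quiet case.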
Jeff was the most terse. He was very busy writing original software for the
"kernel" of an embedded system, which is gnarly work, and usually had a
lot on his mind. I'd ask him, "How do you do X?" and he'd usually say "I am
not a manual." Once I said, "But there's six feet of manuals on the shelf;
I don't have time to read them all." "Use man," he replied. "What's that?"
I asked. "man man" he said cryptically. But when I typed "man man" into
a command prompt, I got something like this:
man(1) man(1)
NAME
man - format and display the on-line manual pages
SYNOPSIS
man [-acdfFhkKtwW] [--path] [-m system] [-p string] [-C config_file]
[-M pathlist] [-P pager] [-B browser] [-H htmlpager] [-S section_list]
[section] name ...
DESCRIPTION
man formats and displays the on-line manual pages. If you specify sec-
tion, man only looks in that section of the manual. name is normally
the name of the manual page, which is typically the name of a command,
function, or file.
......
Jeff taught me to be self-sufficient. He would help me if I already tried to
help myself, and hit a blockade. When I told him I couldn't find a "search"
or "find" feature, he said "man grep" and that did the trick.`
Nuts and Bolts
There are some specific tools that I learned 30 years ago which I still use
frequently today as I earn my daily bread. They include:
- the vi (visual) editor
( en.wikipedia.org/wiki/Vi )
The creation myth is that nearly every minicomputer with a teletype
interface had a "line editor"; in the UNIX world it was called
ed, and it evolved into ex, the "extended
editor." When cathode ray tube (CRT) terminals became cheap enough
for widespread use, it further evolved into
vi, the "visual editor."
Much to my astonishment I have been able to use vi, with almost
exactly the same features, on every computer I've owned for the
last two decades.
It's also been part of the evolution of other UNIX tools.
As the Wikipedia article on ed explains: "Aspects of ed went on to
influence ex, which in turn spawned vi. The non-interactive Unix
command grep was inspired by a common special use of qed and later
ed, where the command g/re/p means globally search for the regular
expression re and print the lines containing it. The Unix stream
editor, sed implemented many of the scripting features of qed that
were not supported by ed on Unix. In turn sed influenced the
design of the programming language AWK — which inspired
aspects of Perl."
For more details see "A History of UNIX before Berkeley: UNIX
Evolution: 1975-1984."
( www.darwinsys.com/history/hist.html )
- the C programming language
( en.wikipedia.org/wiki/C_language )
The other main thing these three guys got me started with was the C
programming language. C is infamous for being a language that
many consider too primitive, or too close to "the metal," and of
course it predates the "object oriented" revolution, but it is just
about perfect for implementing UNIX. For this reason and others
it probably will last a very long time.
Dan gave me a C code template which I use to this day. It could
use updating, but what the heck, the computer doesn't care, and it
still works.
IQ Tester ( www.lehmans.com/p-4542-original-iq-tester.aspx )
This program I wrote to solve an "IQ Tester" puzzle in 2007 follows
a template that evolved out of the template Dan gave me in 1983.
(In other words it has improvements I've made along the way).
And it runs fine on Windows, Mac and UNIX/Linux!
The definitive book on C is by the ubiquitous Brian W. Kernighan,
co-authored with Dennis Ritchie, who actually invented C:
"The C Programming Language" (1978).
- the C shell (csh) interface and scripting language
nautilus icon for csh in old SunOS UNIX
( toastytech.com/guis/sv4112.html )
( en.wikipedia.org/wiki/C_shell )
Phil explained to me that the "kernel" of UNIX had an application
program interface (API) so application programs could "call" into the
operating system. Then there were "layers" that could "wrap
around" the kernel, each adding a new "shell" which was a higher-
level interface. The symbol was a nautilus shell with its
logarithmic chambers. The term came to mean a text-based scripting
interface to the operating system. The first was Bourne's original
shell, which we now call the "Bourne Shell" but which he called
sh in fine UNIX naming tradition.
It was the famous Berkeley port (and re-invention) that introduced
the "C Shell," csh, a fine
pun in the UNIX humor tradition. This was the first shell I learned,
and I stick with it if I can. These days I'm often forced to
use bash, the
"Bourne-Again Shell" which is based on the Bourne Shell and runs
easily on Windows and Mac.
I'm writing this using vi, and I can "bang-bang out" to a command for
my local bash (precede it by two exclamation points). Here, I'll
make a deliberate error:
!!foo
/bin/bash: foo: command not found
That was my local bash responding, right into my text document. It seems
like I can't live without these things.
Perhaps the most mind-blowing thing Dan did for me was to teach me the
alias mechanism in the C Shell.
By way of example, I can type the echo command with arguments:
echo hello there
and I will get back the "hello there" from my local C Shell. Or I can type:
alias ht echo hello there
and it will appear that nothing has happened, but henceforth when I type
"ht" to my shell I will get "hello there" back. I have created the alias
ht and it's like my own personal custom
UNIX command.
Next, Dan typed this command for me to ponder:
alias a alias
What do you think it does?
- shell tools (UNIX utilities)
In preparing to write this blog I talked to family and friends
about the topic. "It's going to be about UNIX shell tools!" I would
gleefully state, and I kept getting that same glassy-eyed stare
that I'm used to getting if I bring up any of the following topics:
- "The Eastern Question" in nineteenth-century European
politics, dealing with instabilities caused by the
collapse of the Ottoman Empire,
- the Laplace Transform in advanced calculus, economics and
engineering, and
- one's own personal medical problems
And yet I persist. This stuff has been massively useful
to me, and I'm just trying to follow the "hacker ethic" (white hat
version) and share my best tips and tricks. I take it on faith
that there is someone out there who wants and needs this information.
I make frequent use of a whole kit of UNIX shell tools (i.e.,
software that can be invoked from a UNIX shell or equivalent); an
example of how they combine appears just below.
Thankfully, much of the oral tradition of the UNIX shell tools was captured
in the still-useful book "The UNIX Programming Environment" (1983) by
Brian W. Kernighan & Rob Pike.
If you do find yourself in that igloo with a generator it's nice to know it's
there.
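To give a flavor of how these tools combine, here is a word-frequency
counter in the spirit of that book, sketched from stock parts (the input
file name is hypothetical):

    tr -cs 'A-Za-z' '\n' < paper.txt | tr 'A-Z' 'a-z' |
        sort | uniq -c | sort -rn | head

Reading left to right: turn every run of non-letters into a newline (one
word per line), fold everything to lower case, sort so duplicates become
adjacent, count each run with uniq -c, sort again by count in reverse
numeric order, and show the top ten.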
Using What I'd Learned
In 1985 I made a mistake I would never repeat in this century: I quit one job
before another was firmly in hand. When the new opportunity slipped away I
not only faced the inconvenience of having to find another while earning no
money, I found myself going through UNIX withdrawal. There had been
a time when I didn't know what UNIX was; now I couldn't live without it.
I recalled that my email and conferencing account at "The Well"
included a C shell login, and I dialed up just to edit my aliases with vi
for old time's sake.
Then when I got another job it was on a project
maintaining some crufty FORTRAN code on a clunky little Hewlett Packard
minicomputer with a poorly-designed proprietary operating system whose
name I don't even remember. A book came to my rescue: good old "Software
Tools."
I didn't have the time to implement the "Ratfor" (Rational FORTRAN)
pre-processor provided in the book, but I did manage a few pieces, including
comment processing (FORTRAN comments have rigid requirements for placement in
column numbers), so I could be a little sloppier in my editing and still
produce working code quickly. The operating system didn't have UNIX pipes,
but I hacked together a work-around using suggestions from the book. And I
wrote a "find" program, which helped me make an automated index of all the
function and subroutine calls in the code base, which had never
been done before. This set of strategies made my life much easier and only
confirmed the productivity benefits of UNIX in my mind.
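On a modern system the heart of that call index would be a one-liner.
Something in this spirit (a sketch; the original ran on that forgotten HP
OS, and the file names here are hypothetical):

    grep -inw call *.f | sort

finds every CALL statement in the FORTRAN sources, tags each hit with its
file name and line number, and sorts the hits into a browsable index.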
The Gift of Linux
Linus Torvalds stamp
( uncyclopedia.wikia.com/wiki/Linus_Torvalds )
For a while I used tools
on a variety of "workstations" from DEC, IBM, Sun, HP, SGI, and others,
and my tools moved with me from one to the next. The problem was most of
these systems cost in excess of $40,000, and I couldn't afford one myself.
It was only at work that I had access to UNIX. Then with the arrival of
Windows 95 it looked like UNIX was destined for the scrap heap, overtaken
by another proprietary OS on a dirt cheap hardware platform. I worried
that my favorite OS and its whole ecology were going to become extinct before
my eyes.
And along came Linux, the open-source freeware UNIX work-alike that has
since taken over the world. I couldn't be happier about the way this has
worked out. My old tools have new life. I'd like to emphasize that I had
practically nothing to do with it, except being a cheerleader and a user. I
am extremely grateful for all the talented people in the open source world
who have made it possible for me to keep using my favorite tools on modern
computers.
Sometimes I think the greatest impediment to more widespread Linux usage is
the difficulty of pronouncing it correctly. Linux creator Linus Torvalds
has a name that sounds like a popular American comic strip character, Linus
from "Peanuts," but the comic character has a long I (as in "like") while
the Swedish programmer has a short I (as in "lick"), and so does Linux.
Perhaps a re-branding is needed.
Portability Survival Skills
19th century interface adapter
( ridevintage.com/railway-bicycles/ )
It wasn't until the last twenty years that I had Linux on a home computer.
Meanwhile, my wife uses Windows at home (Windows 7 at this point), but I prefer
Mac for my primary system. But during these same two decades I have always
had some form of Windows I was required to use in my work. In that world what
has come to my rescue is the tool set called Cygwin, from Cygnus.
One of the three founders of Cygnus Solutions is a friend of
mine; imagine my surprise at spotting him posed with the other two in a hot
tub on the cover of "Forbes" magazine in 1994.
The Cygwin tools give you most of the tools I describe above on Windows
platforms, and they're free. I always load them right away whenever I am
issued a PC.
Since 1996 I have been investing my time in the technologies of HTML and Java
for their portability. I create content (such as this blog) in raw HTML
using vi, confident that many other programs can read it.
The appeal of Java was the "write once, run anywhere" philosophy, which
mobilized Microsoft to embrace and sabotage Java, creating a "write once,
test everywhere" requirement in real deployment. That battle seems to be
over now, and instead we're watching Oracle, which bought Java's creator
Sun Microsystems, fumble the franchise in creative new ways. Still,
I have Java code that runs on Windows, Mac and Linux that I continue to
maintain and use.
Vindication
finally, a book that agrees with me
Lest you think that I'm this fossil who can only write C code to process text,
let me assure you that I've kept up with the changing world of software
development, and I've used modern languages such as
Objective C,
Javascript and
PHP,
Microsoft tools like Visual Basic and
Access, Integrated Development
Environments (IDEs) such as JBoss and
Visual Studio, Graphical User Interface
(GUI) frameworks such as XWindows,
Microsoft Foundation Classes, and
Java Swing, and cutting-edge development
methodologies such as Software Patterns
and Agile Development.
But at a point a few years back when I did some soul-searching on my "core
competencies," I ran across a rule of thumb from the book "Blink: The Power
of Thinking Without Thinking" (2007) by Malcolm Gladwell.
He said that to become a master at something you have to put in 10,000 hours
over the course of 10 years (if I remember right). I realized the one thing
I've done that long is code in C.
Amazingly, about the time I began to embrace my inner C programmer, I
discovered that, at least for a few months, C was the world's most popular
programming language, enjoying a new renaissance.
Eventually it seemed like history caught up with me. All the people who bet
on Visual Basic had to start over with C Sharp, as did all the people who
bet on Visual C++, but the UNIX/C code base just keeps on working. I was
gratified that books began to emerge by people who shared my views:
- "In the Beginning Was the Command Line" (1998) by Neal Stephenson
This delightful manifesto by cyberpunk sci-fi author Neal Stephenson of
"Snow Crash," "Diamond Age" and "Cryptonomicon" fame — one of
the few SF authors I know of who can actually program — explains
why real programmers still mostly use their keyboards, and there is no
"royal road" to clicking your way into software development.
His analogy of Macs to sports cars, Windows to clunky station wagons,
and Linux to urban tanks is priceless.
- "The Pragmatic Programmer: From Journeyman to Master" (1999) by Andrew Hunt and David Thomas
It is a rarity for me to read a book and be exclaiming "Yes!" with
nearly every page, but it happened with this one. It was even more
exciting when I got into the stuff I didn't know about, because
by then I trusted the authors completely.
One of the most important principles espoused here is "Don't Repeat
Yourself" (DRY), which often calls for code that writes other code (a tiny
example appears after this book list).
This has long been one of my favorite tricks, and it is almost magical
in its powers. Used correctly, it can prevent whole classes of errors,
as well as the tedium of hand-coding highly redundant source code.
- "Data Crunching: Solve Everyday Problems Using Java, Python, and more" (2005) by Greg Wilson
This follow-on book by the same publisher deals with the very issues
I grapple with weekly: approaching some unknown input file with a
need to rationalize, "cleanse" and reformat its contents.
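To make "code that writes code" concrete, here is a throwaway sketch (the
names are hypothetical) that generates the arms of a C switch statement
from a plain list, so the list lives in exactly one place:

    for name in RED GREEN BLUE
    do
        printf '        case %s: return "%s";\n' "$name" "$name"
    done

Pasted inside a switch, the generated lines stay mechanically consistent,
which is the whole point of DRY.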
Moving Forward
from article: "Linux Now Has 'Double' the Market Share of Windows"
( www.tomshardware.com/news/linux-windows-microsoft-android-ios,20220.html )
The new year brought a new client for my consulting, and I once again found
myself having to use nearly all the tools in my chest to get jobs done quickly.
As Kernighan & co. pointed out more than three decades ago, a lot of what
passes for programming involves processing text files in predictable ways.
I keep encountering the same "patterns" in the problems I solve, and there's
almost always a "sort" involved (one such pattern is sketched after the list below).
A lot of kibitzers tell me there are other ways to solve my problems,
but they can't seem to get things done as quickly as I can with the UNIX tools.
I went "grepping" through some of the projects I've worked on in the last
six months, and I found, in addition to programs I wrote in C, Java, Python
and the 'R' statistics language, I used these shell tools frequently:
- awk
- cat
- cp
- echo
- grep
- head
- join
- mv
- python
- rm
- sed
- sort
- tail
- uniq
- vi
- wc
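As a sample of those patterns, here is the sort-then-join idiom I reach for
constantly, sketched with hypothetical input files of whitespace-separated
fields (an id followed by a name in one file, an id followed by an amount
in the other):

    sort -k1,1 names.txt > names.sorted
    sort -k1,1 sales.txt > sales.sorted
    join names.sorted sales.sorted | sort -k3,3nr | head -5

join matches records from the two files on their shared first field (which
is why both must be sorted on it first), the second sort ranks the joined
records by the numeric third field, largest first, and head keeps the top
five.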
And I endeavor to continue to learn. For about four years I have been playing
with Python, and I find it quite promising. It doesn't hurt that its name
comes from Monty Python's Flying Circus, the British comedy troupe, and not
a snake. And just this year I finally began to dig into an old UNIX favorite:
the text processing program called awk.
It's named after the three people who created it: Alfred Aho, Peter Weinberger,
and... wait for it... the ubiquitous Brian W. Kernighan.
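As a first taste, here is the kind of one-line awk program that won me
over, run against a hypothetical file of whitespace-separated columns:

    awk '{ total += $2 } END { print NR, "lines, total", total }' expenses.txt

awk reads the file a line at a time, splitting each line into fields; this
program adds up the second field as it goes and, at end of input, prints
the number of lines read (NR) and the running total.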
Further Reading
One of the best ways I have found to absorb a new computer language quickly
is to study very short programs or scripts. So-called "one-liners" pack a
lot into a small space. The key is to find a tutorial web site that has
a maniacal obsession with explaining every little nuance of each one-liner.
Another great resource is the "Stack Overflow" web site.
Using a clever combination of curating, moderating and crowdsourcing they
maintain a highly reliable and timely question and answer site for programming.
Many of my questions have been answered there. (Note to self: maybe it's time
I gave something back.)
And of course there are good old books. Here are my recommendations for
a well-rounded programmer:
- "Computer Lib: You Can and Must Understand Computers Now" (1974)
by Theodor H. Nelson
I was fortunate to have this highly educational comic book and
manifesto to learn the inner guts of computing, just before starting
my first job in the industry.
- "The Mythical Man-Month: Essays on Software Engineering" (1975) by
Frederick P. Brooks Jr.
This classic of software management is still as relevant as the
day it was published, even though it's based on the work done
for a large IBM mainframe in the 1960s. Ask anybody.
- "Hackers: Heroes of the Computer Revolution" (1984) by Steven Levy
I firmly believe that in order to program well one must learn to
think like a programmer, which means learning about so-called
"hacker culture," including the "hacker ethic" and "hacker humor."
Technical journalism superstar Steven Levy packages the essential
history in this book. In addition it is useful to refer to the
following lexicons:
- "The New Hacker's Dictionary" (1996) by Eric S. Raymond (editor)
A book form of a long-lived and highly-evolved computer
file, with heavy contributions from Stanford and MIT.
- "The Devil's DP Dictionary" (1981) by Stan Kelly-Bootle
One man's sarcastic reworking of Ambrose Bierce's venerable
"Devil's Dictionary" (1911).
Dated by its mainframe point of view, but still hilarious
and educational. It has the classic definition,
"recursive — see recursive."
From these sources you will learn that the name UNIX
is a joke, a pun on the Multics operating system
developed by MIT, General Electric and Bell Labs.
- "The Information: A History, A Theory, A Flood" (2011) by James Gleick
Sometimes it's difficult to see the forest for the trees.
As the Information Revolution has washed over us through
the last 70 years — a human lifetime — it has changed
nearly every aspect of our technical civilization. Gleick, an
excellent technology journalist, puts the pieces together here
with a nice long view. He also provides a good overview of the
pioneering work of mathematician Alan Turing, and its relevance
to computing today.
Disclaimer
I receive a commission on everything you purchase from Amazon.com after
following one of my links, which helps to support my research. It does
not affect the price you pay.
This free service is brought to you by the book:
A Survival Guide for the Traveling Techie
travelingtechie.com