sam's Structural Regular Expressions
Categories: quote ed regex plan9.

This is the first post in a series of quotes from papers. These are the great turns of phrase, the intriguing ideas I’ve run into nowhere else, the starts of something that could have been great but probably fizzled.
First up: structural regular expressions, as introduced in the GUI text editor sam:
In other UNIX programs, regular expressions are used only for selection, as in the sam g command, never for extraction as in the x or y command. For example, patterns in awk are used to select lines to be operated on, but cannot be used to describe the format of the input text, or to handle newline-free text. The use of regular expressions to describe the structure of a piece of text rather than its contents, as in the x command, has been given a name: structural regular expressions. When they are composed, as in the above example, they are pleasantly expressive. Their use is discussed at greater length elsewhere.
x extracts every chunk of text matching the regex provided to it. Each chunk has the rest of the editing pipeline run on it. Want to change every n in a hunk of text to an m? Select it all in the window with button 1, focus the sam command window with button 1, and type in:
x/n/ c/m/
Hit return and this command pipeline runs on the (implicit range) dot, also known as “the current selection.” x grabs each n, and c then changes it to an m. You can layer on more commands, including g (guard), which works as an inline if statement. The command text stays in the command window in case you want to run it again.
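To make the pipeline concrete, here is a rough Python sketch of what x/n/ c/m/ does to dot, plus a guarded variant. This is not how sam is implemented: x_c and x_g_c are hypothetical helper names, dot is modelled as a (start, end) pair of character offsets, and Python's re syntax stands in for sam's own regular expression dialect.

```python
import re


def x_c(text, dot, pattern, replacement):
    """Roughly emulate sam's `x/pattern/ c/replacement/` on the selection.

    Hypothetical helper: `dot` is a (start, end) offset pair into `text`.
    """
    start, end = dot
    selected = text[start:end]
    # x: visit every match inside dot; c: swap that match for literal text.
    # A function replacement keeps backslashes in `replacement` literal.
    changed = re.sub(pattern, lambda match: replacement, selected)
    return text[:start] + changed + text[end:]


def x_g_c(text, dot, pattern, guard, replacement):
    """Roughly emulate `x/pattern/ g/guard/ c/replacement/`.

    The guard acts as an if: a chunk is changed only when it contains
    a match of `guard`; otherwise it is left alone.
    """
    start, end = dot

    def change(match):
        chunk = match.group(0)
        return replacement if re.search(guard, chunk) else chunk

    changed = re.sub(pattern, change, text[start:end])
    return text[:start] + changed + text[end:]
```

With the buffer "banana nan\nnone" and dot covering only the first line, x_c(buffer, (0, 10), r"n", "m") turns the first line into "bamama mam" and leaves "none" untouched, because it sits outside the selection.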
Boom! Instant macro, no memorization of registers required. Take that, vim qX…q @X.
This search for symbiosis between mouse and keyboard is what led to sam. Most UNIX editors bolt mouse input onto an established keyboard-centric paradigm. Sam rethinks editing to make the mouse an integral part of it. (Acme would later take this mouse integration to new heights. We’ll get to acme in time.)
Back to structural regular expressions now. Pike has a whole paper on the topic that I will doubtless get to eventually. But just this little bit is tantalizing enough.
I mean, think about awk, think about how you use regular expressions there, or how you use them in your editor du jour. Are these tools really making the most of regular expressions?
awk and friends just perform record splitting on a set of separator characters. Imagine how limited your regexes would be if all you got to do was specify what to stick between two character class brackets: [your characters here]{1,}. That’s all you get with this simple record separator construction.
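The contrast can be sketched in a few lines of Python. This is only a model of the two styles described here, not a reimplementation of awk or sam: split_records and extract_chunks are hypothetical names, and the sample text is made up for illustration.

```python
import re


def split_records(text, separators):
    """Separator style: the only choice is which characters go in [...]+.

    `separators` is assumed to be a non-empty string of characters.
    """
    pattern = "[" + re.escape(separators) + "]+"
    return [piece for piece in re.split(pattern, text) if piece]


def extract_chunks(text, pattern):
    """Structural style: the regex describes the chunks themselves."""
    return [match.group(0) for match in re.finditer(pattern, text)]


text = 'say "hello, world" then "bye"'

# Splitting on the quote character mixes quoted and unquoted pieces.
pieces = split_records(text, '"')

# Describing the structure pulls out just the quoted strings.
quoted = extract_chunks(text, r'"[^"]*"')
```

Here pieces comes out as ['say ', 'hello, world', ' then ', 'bye'], with nothing left to say which pieces were quoted, while quoted is ['"hello, world"', '"bye"'].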
And it’s not like we’ve made great strides: Search-and-replace in an IDE like Xcode or Eclipse gives you even less expressiveness.
I look forward to reading more about structural regular expressions in future.
For more on sam, see:
- Rob Pike’s paper introducing the editor. (Also available in PDF.)
- sam at cat-v