Wednesday, July 25, 2018

HISTORY: Brian Kernighan Remembers the Origins of 'grep' - The New Stack

Interesting new insights into old history from the co-author of the classic book on the C language:

Quoting:

"This month saw the release of a fascinating oral history, in which 76-year-old Brian Kernighan remembers the origins of the Unix command grep.



Kernighan is already a legend in the world of Unix — recognized as the man who coined the term Unix back in 1970. His last initial also became the “k” in awk — and the “K” when people cite the iconic 1978 “K&R book” about C programming. The original Unix Programmer’s Manual calls Kernighan an “expositor par excellence,” and since 2000 he’s been a computer science professor at Princeton University — after 30 years at the historic Computing Science Research Center at Bell Laboratories.

Cover of The C Programming Language, the “K&R” book (via Wikipedia, fair use)

In new interviews with the YouTube channel Computerphile, Kernighan has been sharing some memories…

Brian Kernighan – via Princeton.edu

Two years ago Kernighan remembered when he’d joined Bell Labs as a graduate student in 1967, studying electrical engineering because at the time there were no “computer science” majors. “It was a wonderful place because there was an enormous number of really good people doing really interesting things and nobody telling you what to do…”

“In one single, largish building there were probably 4,000 people, of who about 2,000 were probably PhDs in various forms of science, physics, chemistry, materials, and then on the, call it the softer end, mathematics and the relatively new field at that point of computer science.”"
(...)

"But there also wasn’t even enough memory to edit large files. Back then ed was the standard text editor, a small program written by Thompson for a world with only primitive monitors. “There was no cursor addressing, so you couldn’t move around within a line. The ed text editor reflected that kind of thing.”

ed let you specify regular expressions, which appeared between slashes, along with some operations (specified outside those slashes) to perform on the lines which matched. The operations were often indicated with a single letter, like ‘p’ for print or ‘a’ for append. There was also a ‘g’ prefix, which stood for global and would perform a command not just on one line but on every line of the file that matched the specified regular expression. That command could be, for example, the print command, ‘p’.

And if you wrote out that command — with g for “global” and p for “print”, applying it to a regular expression between the slashes — it would look like this:

g/re/p
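
To make the idea concrete, here is a minimal Python sketch of what g/re/p does: print every line that matches a regular expression. It is only an illustration, not Thompson's program. The function name grep_lines and the sample lines are made up for this example, and Python's regular expression syntax is not identical to ed's.

```python
import re
from typing import Iterable, Iterator


def grep_lines(pattern: str, lines: Iterable[str]) -> Iterator[str]:
    """Yield every line containing a match for pattern (the g/re/p idea)."""
    regex = re.compile(pattern)
    for line in lines:
        # search() looks for a match anywhere in the line, not only at the start
        if regex.search(line):
            yield line


# Hypothetical sample text, invented for illustration only
sample = [
    "upon the whole, the plan is sound",
    "the states retain their power",
    "whilst the people remain watchful",
]

for matching_line in grep_lines(r"upon|whilst", sample):
    print(matching_line)
```

Because the function walks its input one line at a time, it can be handed an open file object and never needs to hold the whole file in memory, which is the limitation described above for editing large files in ed.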

Kernighan supplies some crucial context in the form of a story. Their colleague Lee McMahon had wanted to study the Federalist Papers, which were written by several different authors (including Alexander Hamilton) but published under the same pseudonym. He hoped to find clues about who wrote which paper by locating every occurrence of specific words and phrases. Unfortunately, a plaintext version of the collection was one megabyte, “down in the noise by today’s standard,” but at the time it “wouldn’t fit. He couldn’t edit them all in ed.”

“He sent this to Ken Thompson, and then went home for dinner or something like that. And he came back the next day, and Ken had written him a program.

“And the program was called grep.”"