
grep /grep/ vi.

[from the qed/ed editor idiom g/re/p, where re stands for a regular expression, to Globally search for the Regular Expression and Print the lines containing matches to it, via Unix grep(1)] To rapidly scan a file or set of files looking for a particular string or pattern (when browsing through a large set of files, one may speak of `grepping around'). By extension, to look for something by pattern. "Grep the bulletin board for the system backup schedule, would you?" See also vgrep.

[It has been alleged that the source is from the title of a paper "A General Regular Expression Parser", but dmr confirms the g/re/p etymology -ESR]

--The Jargon File version 4.3.1, ed. ESR, autonoded by rescdsk.

Generally, to go through a list, keeping only items which meet a condition. The word originated with the ed editor, in which the command 'g/re/p', where 're' is a regular expression, would globally (that's the 'g') search for lines matching the regular expression in the file you were editing and print (that's the 'p') the results.
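If ed is not to hand, the same idiom can be tried with sed, which accepts the /re/p address-and-command form when run with -n. This is only a sketch: the sample file and its contents are made up for illustration, and the pattern "gre" stands in for 're'.

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' 'green eggs' 'ham' 'grep and ed' > "$sample"

# In ed you would type g/gre/p at the prompt.
# sed -n suppresses the default output, so /gre/p prints only matching lines.
matches=$(sed -n '/gre/p' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

The two lines containing "gre" are printed; "ham" is not.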

Presumably some time after ed was written, the program grep was born. grep, some version of which can be found on virtually every UNIX installation, searches through files and prints lines matching a regular expression. The grep program embodies the UNIX philosophy: a small, simple, reusable tool that handles plain text. grep is flexible: among its many options, you can tell it to print non-matching lines (-v), print some number of lines of context before (-B), after (-A) or around (-C) each match, count the matching lines (-c), ignore case (-i), print line numbers (-n), and search recursively through directories (-r). grep is perfect for use in pipe-linked shell commands.
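A quick way to get a feel for a few of these options is to run them against a small scratch file. The following is a minimal sketch; the file and its contents are hypothetical, and only -n, -c, -i and -v are shown.

```shell
# Hypothetical sample file; the contents are made up for this illustration.
log=$(mktemp)
printf '%s\n' 'Error: disk full' 'ok' 'error: retry' 'ok' > "$log"

numbered=$(grep -n 'ok' "$log")     # matching lines, prefixed with line numbers
count=$(grep -c 'ok' "$log")        # how many lines match
nocase=$(grep -i 'error' "$log")    # match regardless of case
inverted=$(grep -v 'ok' "$log")     # lines that do NOT match

printf '%s\n' "$numbered" "$count" "$nocase" "$inverted"
rm -f "$log"
```

In this particular file, -i 'error' and -v 'ok' happen to select the same two lines, which shows that there is often more than one way to phrase a search.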

Examples:

  • grep -n foo somefile

    Searches for the pattern "foo" in somefile, and prints lines found, with line numbers.

  • grep '^.b.$' /usr/share/dict/words

    Searches for three-letter words with the middle letter "b" in the dictionary: this kind of thing is great for cheating at crosswords.

  • last | grep daf | head

    Lists my last ten logins.

  • find /var/log/apache | grep 'access.*gz$' | xargs zcat | grep ja.net | awk '{print $1}' | sort | uniq

    Finds compressed Apache log files, gets their contents, searches for accesses from ja.net, extracts the hostname, sorts the results and removes duplicates. I used this, out of curiosity, to get a list of the JANET proxy servers that had accessed a web server, since they are named after pizza toppings. grep is used here twice: first to select filenames and then to select lines from those files.

See your local grep manual page for details, or see the GNU documentation for their version at http://www.gnu.org/software/grep/grep.html.

Since grep's invention, however, the verb "grep" has acquired a more general meaning: to search through a list of things, keeping only items which meet a certain condition. Some programming languages, notably Perl, have a grep function, a higher-order function that selects values from a list. A value is kept only if evaluating an expression with that value substituted in returns a true value. Other languages have an equivalent function, such as Scheme's filter (from SRFI 1), Ruby's Array#select, and Haskell's filter.

grep can be defined nicely in Haskell as

grep predicate list = [ item | item <- list, predicate item ]

Examples:

  • Perl
    grep { $_ > 3 } qw(1 2 3 4 5 6);
    
    # result: qw(4 5 6)
    
  • Ruby
    [1, 2, 3, 4, 5, 6].select { |x| x > 3 }
    
    # result: [4, 5, 6]
    
  • Haskell
    filter (>3) [1, 2, 3, 4, 5, 6]
    
    -- result: [4, 5, 6]
    
  • Scheme (filter comes from SRFI 1)
    (filter (lambda (x) (> x 3)) '(1 2 3 4 5 6))
    
    ; result: (4 5 6)
    

Here's a wonderfully easy mistake that you can make using grep:

grep -r whatever . > output.file

For those who don't know this command, what it says is "look in all files in this directory and its subdirectories for the string 'whatever' and write any matching lines to output.file."

Do you see the error? Neither did I, the first time I typed this in, hit enter, and waited. And waited. And waited.

Eventually, I hit ctrl-c, took a look at the output file, and began to laugh. (I wasn't sure who I was laughing at -- myself, or the person who wrote the command.) The problem was that I had told grep to look in every file in the directory. I never told it not to look into the file it was creating. So it was merrily churning along, reading its own output file, finding the search string on every line (of course), and appending new lines onto the end.

This is a classic mistake -- failure to remember that the computer is just a very fast idiot.

If you're wondering how to avoid this problem, here are two solutions. The explanation of why they work is left as an exercise.

grep -r whatever . > ../output.file
grep -r whatever . | cat > output.file
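A third variation on the first workaround, for when the results really do have to end up inside the searched directory, is to collect them somewhere else and move the file into place afterwards. This is only a sketch: the scratch tree, the file names and their contents are all made up for illustration.

```shell
# Hypothetical scratch tree, created only for this illustration.
tree=$(mktemp -d)
mkdir "$tree/sub"
printf '%s\n' 'whatever one' 'nothing here' > "$tree/a.txt"
printf '%s\n' 'whatever two' > "$tree/sub/b.txt"

# Collect the results in a file that lives outside the searched tree...
out=$(mktemp)
(cd "$tree" && grep -r whatever . > "$out")

# ...and move it into place only after grep has finished.
mv "$out" "$tree/output.file"
hits=$(wc -l < "$tree/output.file" | tr -d ' ')
echo "$hits matching lines"

rm -rf "$tree"
```

Because output.file does not exist inside the tree while grep is running, grep never gets the chance to read its own output.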

Disclaimer
I did this using Cygwin under Windows NT. I do not know whether it would behave the same way under other Unices, although I suspect it would. Also, while preparing this writeup, I made sure I could re-create this behavior, and found that it only occurs if there is at least one hit in a subdirectory. Somebody ambitious could read the source code and explain why this happens.
