GREP.EXE, the text search utility

From RAD Studio

GREP (Global Regular Expression Print) is a powerful text search program derived from the UNIX utility of the same name. GREP searches for a text pattern in one or more files or in its standard input stream.

Using GREP

Here is a quick example of a situation where you might want to use GREP. Suppose you wanted to find out which text files in your current directory contained the string Bob. You would type:

grep Bob *.txt

GREP responds with a list of the lines in each file (if any) that contained the string Bob. Because GREP does not ignore case by default, the strings bob and boB do not match.

GREP can do a lot more than match a single, fixed string. You can make GREP search for any string that matches a particular pattern. (See The Search String section in this topic.)

Command-Line Syntax

The general command-line syntax for GREP is

grep [-<options>] <searchstring> [<file(s)>...]


Command-Line Elements

Element Description
<options>

One or more letters, preceded by a hyphen -, that change the behavior of GREP.

<searchstring>

Gives the pattern to search for.

<file(s)>

Tells GREP which files to search. Files can be an explicit file name or a generic file name incorporating the ? and * wildcards. In addition, you can type a path (drive and directory information). If you list files without a path, GREP searches the current directory.
If you do not specify a file, GREP searches the standard input. This lets you feed GREP through pipes (the vertical bar |) and input redirection (the "less than" symbol <).


To display a help screen that lists the GREP command-line options, special characters, and defaults, enter:

grep ?

Command-Line Options

You can pass options to the GREP utility on the command line by specifying one or more single characters preceded by a hyphen -. Each individual character is a switch that you can turn on or off: a plus symbol + after a character turns the option on, a hyphen - after the character turns the option off. The + sign is optional; for example, -r means the same thing as -r+.
You can list multiple options individually (for example, -i -d -l), or you can combine them (for example, -ild, or -il -d).
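The switch rules above can be sketched in a few lines of Python. This is only an illustrative model, not GREP's actual parser: the function name parse_switches is hypothetical, and the sketch deliberately ignores the options that take extra text (-e, -u <filename>, and -w[set]).

```python
def parse_switches(args):
    """Return {letter: bool} for arguments such as '-i', '-d-', '-r+', or '-ild'.

    Hypothetical helper that models the documented rules only:
    a letter alone or followed by + is on, a letter followed by - is off.
    """
    switches = {}
    for arg in args:
        if not arg.startswith("-") or len(arg) < 2:
            raise ValueError(f"not an option argument: {arg!r}")
        i = 1
        while i < len(arg):
            letter = arg[i]
            if not letter.isalpha():
                raise ValueError(f"unexpected character {letter!r} in {arg!r}")
            state = True  # a bare letter means "on", the same as a trailing +
            if i + 1 < len(arg) and arg[i + 1] in "+-":
                state = arg[i + 1] == "+"
                i += 1  # skip the + or - that was just consumed
            switches[letter] = state
            i += 1
    return switches


separate = parse_switches(["-i", "-d", "-l"])
combined = parse_switches(["-ild"])
explicit = parse_switches(["-r-", "-i+"])
```

Under these assumptions, the separate and combined forms produce the same settings, and -r- records the r switch as off.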

GREP supports the command-line options listed in the following table:

GREP Command-Line Options

Option Description
?

Displays a help screen showing the options, special characters, and defaults for GREP.

-c-

Count only: Prints only a count of matching lines. For each file that contains at least one matching line, GREP prints the file name and a count of the number of matching lines. Matching lines are not printed.
This option is off by default.

-d-

Search subdirectories: For each file specified on the command line, GREP searches for all files that match the file specification, both in the directory specified and in all subdirectories below the specified directory. If you give a file without a path, GREP assumes the files are in the current directory.
This option is off by default.

-e

Search expression follows: Indicates that the next argument is the search expression. This option is useful when you want to search for an expression that begins with -.

-i-

Ignore case: GREP ignores upper/lowercase differences. When this option is on, GREP treats all letters a to z as identical to the corresponding letters A to Z in all situations.
This option is off by default.

-l-

List file names only: Prints only the name of each file containing a match. After GREP finds a match, it prints the file name and the processing immediately moves on to the next file.
This option is off by default.

-n-

Line numbers: Each matching line that GREP prints is preceded by its line number.
This option is off by default.

-o-

UNIX output format: Changes the output format of matching lines to support more easily the UNIX style of command-line piping. All lines of output are preceded by the name of the file that contained the matching line.
This option is off by default.

-r+

Regular expression search: The text defined by searchstring is treated as a regular expression instead of as a literal string.
This option is on by default.
A regular expression is one or more occurrences of one or more characters optionally enclosed in quotation marks.
The following symbols are treated specially (for more information, see the Special Characters section in this topic):

  • ^ -- start of line
  • . -- any character
  • * -- match zero or more occurrences of the preceding character
  • [aeiou0-9] -- match a, e, i, o, u, and 0-9
  • [^aeiou0-9] -- match all but a, e, i, o, u, and 0-9
  • $ -- end of line
  • \ -- quote next character
  • + -- match one or more occurrences of the preceding character

-u <filename>

Update options: Creates a copy of GREP.EXE, called <filename>.EXE. Any options included on the command line are saved as defaults in the new copy of GREP. Use the -u option to customize the default option settings. To verify that the defaults have been set correctly, type
filename ?
Each option on the help screen is followed by a + or - to indicate its default setting.

-v-

Nonmatch: Prints only nonmatching lines. Only lines that do not contain the search string are considered nonmatching lines.
This option is off by default.

-w-

Word search: Text found that matches the regular expression is considered a match only if the character immediately preceding and following cannot be part of a word. The default word character set includes A to Z, 0 to 9, and the underscore character _.
This option is off by default.
An alternate form of this option lets you specify the set of legal word characters. Its form is -w[set], where [set] is any valid regular expression.
If you define the set with alphabetic characters, it is automatically defined to contain both the uppercase and lowercase values for each letter in the set (regardless of how it is typed), even if the search is case-sensitive.
If you use the -w option in combination with the -u option, the new set of legal characters is saved as the default set.

-z-

Verbose: GREP prints the file name of every file searched. Each matching line is preceded by its line number. A count of matching lines in each file is given, even if the count is zero.
This option is off by default.
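To see how the case, nonmatch, count, and line-number options relate to one another, the following Python sketch models line selection on an in-memory list. It is an approximation under stated assumptions: search_lines is a hypothetical name, Python's re module stands in for GREP's own pattern matcher, line numbers are assumed to start at 1, and results are returned as data rather than imitating GREP's output format.

```python
import re


def search_lines(lines, pattern, ignore_case=False, invert=False):
    """Return (line_number, text) pairs for the lines that would be selected.

    ignore_case models -i and invert models -v; the line number in each
    pair is what -n would show (1-based numbering is an assumption).
    """
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    selected = []
    for number, text in enumerate(lines, start=1):
        matched = regex.search(text) is not None
        if matched != invert:  # keep matches normally, non-matches when inverted
            selected.append((number, text))
    return selected


lines = ["Bob went home", "bob stayed", "Alice left"]
hits = search_lines(lines, "Bob")                        # case-sensitive default
any_case = search_lines(lines, "Bob", ignore_case=True)  # like -i
others = search_lines(lines, "Bob", invert=True)         # like -v
count = len(hits)                                        # the number -c reports
```

Here only the first line is a case-sensitive match, so the count is 1; with case ignored, the first two lines are selected.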

The Search String

The value of <searchstring> defines the pattern GREP searches for. A search string can be either a regular expression or a literal string.

  • In a regular expression, certain characters have special meanings: they are operators that govern the search. (A regular expression is either a single character or a set of characters enclosed in brackets. A concatenation of regular expressions is a regular expression.)
  • In a literal string, there are no operators: each character is treated literally.

You can enclose the search string in quotation marks to prevent spaces and tabs from being treated as delimiters. To search for an expression that begins with -, use the -e option. The text matched by the search string cannot cross line boundaries; that is, all the text necessary to match the pattern must be on a single line.

When you use the -r option (on by default), the search string is treated as a regular expression (not a literal expression).
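The difference between the two interpretations can be illustrated with a short Python sketch. It assumes Python's re module as a stand-in for GREP's matcher, and line_matches is a hypothetical helper; the regex=False case models what happens when the -r option is turned off.

```python
import re


def line_matches(line, search_string, regex=True):
    """Report whether search_string is found in line.

    regex=True treats the string as a pattern (like -r+);
    regex=False treats every character literally (like -r-).
    """
    pattern = search_string if regex else re.escape(search_string)
    return re.search(pattern, line) is not None


as_regex = line_matches("abc", "a.c")                  # . matches any character
as_literal = line_matches("abc", "a.c", regex=False)   # . must be a real period
literal_hit = line_matches("a.c", "a.c", regex=False)
```

As a regular expression, a.c finds a match in abc because the period matches any character; as a literal string, it matches only text that contains an actual period.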

Special Characters

The following characters have special meanings:

Symbol Description
^

A circumflex at the start of the expression matches the start of a line.

$

A dollar sign at the end of the expression matches the end of a line.

.

A period matches any character.

*

An asterisk after a character matches zero or more occurrences of that character. For example, bo* matches b, bo, and boo. Because a pattern can match anywhere in a line, bo* also finds a match in bot.

+

A plus sign after a character matches one or more occurrences of that character. For example, bo+ matches bo and boo, and finds a match in bot, but does not match b or bt.

{}

Characters or expressions in braces are grouped so that the evaluation of a search pattern can be controlled and so grouped text can be referred to by number.

[]

Characters in brackets match any one character that appears in the brackets, but no others. For example [bot] matches b, o, or t.

[^]

A circumflex at the start of the string in brackets means NOT. Hence, [^bot] matches any character except b, o, or t.

[-]

A hyphen within the brackets signifies a range of characters. For example, [b-o] matches any character from b through o.

\

A backslash before a wildcard character tells GREP to treat that character literally, not as a wildcard. For example, \^ matches ^ and does not look for the start of a line.


Four of the "special" characters ($, ., *, and +) do not have any special meaning when used within a bracketed set. In addition, the character ^ is only treated specially if it immediately follows the beginning of the set definition (immediately after the [ delimiter).
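The operators in the table above behave the same way in most regular expression engines, so you can experiment with them outside GREP. The following sketch uses Python's re module as a stand-in (an assumption: it is not GREP's engine). Braces are left out because Python groups with parentheses instead. The helper name found is hypothetical.

```python
import re


def found(pattern, line):
    """True if pattern matches somewhere in line."""
    return re.search(pattern, line) is not None


checks = {
    "^ anchors at start of line": found("^bo", "bot") and not found("^bo", "robot"),
    "$ anchors at end of line": found("ot$", "robot") and not found("ot$", "robots"),
    ". matches any character": found("b.t", "bat") and not found("b.t", "bt"),
    "* allows zero occurrences": found("^bo*t$", "bt") and found("^bo*t$", "boot"),
    "+ needs at least one": found("^bo+t$", "bot") and not found("^bo+t$", "bt"),
    "[] matches one listed character": found("^[bot]$", "o") and not found("^[bot]$", "x"),
    "[^] matches one unlisted character": found("^[^bot]$", "x") and not found("^[^bot]$", "b"),
    "[-] matches a range": found("^[b-o]$", "k") and not found("^[b-o]$", "p"),
    "backslash quotes an operator": found(r"\^", "a^b") and not found(r"\^", "ab"),
    "$ . * + are literal inside brackets": found("[$.*+]", "a+b") and not found("[$.*+]", "ab"),
}

for description, passed in checks.items():
    print(f"{description}: {passed}")
```

Each entry pairs a line that should match with one that should not, so a True value confirms both halves of the rule.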

GREP Examples

Example 1 -- Redirecting Output from GREP

If the output of a GREP search is longer than one screen, you can redirect the output to a file.

For example, you can use this command:

GREP "Bob" *.txt > temp.txt

This command searches all files with the TXT extension in the current directory, then puts the results in a file called TEMP.TXT. (You can name this file anything you like.) Use any word processor to read TEMP.TXT (the results of the search).

Example 2

grep -r "[^a-z]main\ *\(" *.c

Matches:

main(i,j:integer)
if (main ()) halt;
if (MAIN ()) halt;

Does Not Match:

mymain()

Explanation: The search string tells GREP to search for the word "main" with no preceding lowercase letters [^a-z], followed by zero or more occurrences of blank spaces \ *, then a left parenthesis. Since spaces and tabs are normally considered command-line delimiters, you must quote them if you want to include them as part of a regular expression.

Example 3

grep -ri [a-c]:\\data\.fil *.c *.inc

Matches:

A:\data.fil
B:\DATA.FIL
c:\Data.Fil

Does Not Match:

d:\data.fil
a:data.fil

Explanation: Because the backslash \ and period . characters usually have special meaning in path and file names, you must place the backslash escape character immediately in front of them if you want to search for them. The -i option is used here, so the search is not case-sensitive.
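You can check the escaping in Example 3 with a short Python sketch. It assumes Python's re module in place of GREP's matcher, with re.IGNORECASE playing the role of the -i option; the sample lines are the ones listed above.

```python
import re

# Same pattern as Example 3: \\ is a literal backslash, \. is a literal period.
pattern = re.compile(r"[a-c]:\\data\.fil", re.IGNORECASE)

lines = [
    r"A:\data.fil",
    r"B:\DATA.FIL",
    r"c:\Data.Fil",
    r"d:\data.fil",   # drive letter outside the a-c range
    "a:data.fil",     # no backslash after the colon
]

matching = [line for line in lines if pattern.search(line)]
```

Only the first three lines are selected: the fourth fails the [a-c] range, and the fifth lacks the escaped backslash.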

Example 4

grep "search string with spaces" *.doc *.c

Matches:

A search string with spaces in it.

Does not match:

This search string has spaces in it.

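Explanation: Enclosing the search string in quotation marks makes GREP treat the spaces as part of the pattern instead of as command-line delimiters. The second line does not match because it does not contain the exact phrase "search string with spaces".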

Example 5

grep -rd "[ ,.:?'\"]"$ \*.doc

Matches:

He said hi to me.
Where are you going?
In anticipation of a unique situation,
Examples include the following:
"Many men smoke, but fu man chu."

Does not match:

He said "Hi" to me
Where are you going? I'm headed to the

Explanation: This example searches for any one of the characters " . : ? ' and , at the end of a line. The double quotation mark within the range is preceded by an escape character so it is treated as a normal character instead of as the ending quotation mark for the string. Also, the $ character appears outside of the quoted string. This demonstrates how regular expressions can be concatenated to form a longer expression.
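The pattern in Example 5 can be tried against the sample lines with the following Python sketch. It assumes Python's re module as a stand-in for GREP's matcher. As in the GREP command, the double quotation mark inside the set is escaped so that it does not end the quoted string.

```python
import re

# A set of punctuation characters (plus a space) anchored to the end of the line.
pattern = re.compile("[ ,.:?'\"]$")

lines = [
    "He said hi to me.",
    "Where are you going?",
    "In anticipation of a unique situation,",
    "Examples include the following:",
    '"Many men smoke, but fu man chu."',
    'He said "Hi" to me',
    "Where are you going? I'm headed to the",
]

matching = [line for line in lines if pattern.search(line)]
```

The last two lines are not selected because they end in a letter, even though they contain some of the listed characters elsewhere.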

Example 6

grep -w[=] = *.c

Matches:

i = 5;
j=5;
i += j;

Does not match:

if (i == t) j++;
/* ==================================== */

Explanation: This example redefines the current set of legal characters for a word as the assignment operator = only, then does a word search. It matches C assignment statements, which use a single equal sign =, but not equality tests, which use a double equal sign ==.
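The word-search rule behind Example 6 can be modeled in Python as follows. This is a simplified sketch, not GREP's implementation: word_match and DEFAULT_WORD_CHARS are hypothetical names, Python's re module stands in for GREP's matcher, the default set is assumed to include lowercase letters, and only the longest match at each position is tested.

```python
import re
import string

# Assumed default word characters: letters (both cases), digits, underscore.
DEFAULT_WORD_CHARS = set(string.ascii_letters + string.digits + "_")


def word_match(line, pattern, word_chars=DEFAULT_WORD_CHARS):
    """True if some match of pattern has no word character on either side."""
    regex = re.compile(pattern)
    for start in range(len(line) + 1):
        m = regex.match(line, start)
        if m is None or m.end() == m.start():
            continue  # no match here, or an empty match
        before = line[start - 1] if start > 0 else ""
        after = line[m.end()] if m.end() < len(line) else ""
        # The start and end of the line count as non-word boundaries.
        if before not in word_chars and after not in word_chars:
            return True
    return False


equals_only = {"="}  # models -w[=]
samples = ["i = 5;", "j=5;", "i += j;", "if (i == t) j++;", "/* ==== */"]
results = [word_match(line, "=", equals_only) for line in samples]
```

With = as the only word character, each = in == has another word character next to it, so the equality test and the comment line are rejected while the three assignments are accepted.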

See Also