How to exclude patterns, files and directories with grep


Since 1974, the Linux grep command has been helping people find strings in files. But sometimes grep is just too thorough. Here are several ways to tell grep to ignore different things.

The grep command

The grep command searches text files for strings that match the search patterns you provide on the command line. The power of grep lies in its use of regular expressions. These allow you to describe what you are looking for, rather than having to define it explicitly.

The birth of grep predates Linux. It was developed in the early 1970s on Unix. It takes its name from the g/re/p key sequence in the ed line editor (pronounced “ee-dee”, by the way). This stood for global, regular expression search, print matching lines.

grep is famously, perhaps notoriously, thorough and determined. Sometimes it will search files or directories that you’d rather it didn’t waste its time on, and the results can stop you from seeing the wood for the trees.

Of course, there are ways to master grep. You can tell it to ignore patterns, files, and directories so that grep completes its searches faster and you don’t get swamped with meaningless false positives.

Excluding patterns

To search with grep, you can pipe the output of another process into it, such as cat, or you can provide a filename as the last command line parameter.

We’re using a short file that contains the text of the poem Jabberwocky, by Lewis Carroll. In both of these examples, we’re searching for lines that match the search term “Jabberwock”.

cat jabberwocky.txt | grep "Jabberwock"
grep "Jabberwock" jabberwocky.txt

Two different ways to search in the same text file with grep

Lines that contain matches for the search term are listed for us, with the matching text in each line highlighted in red. It’s a simple search. But what if we want to exclude lines containing the word “Jabberwock” and print the rest?

We can accomplish this with the -v (invert match) option. This lists the lines that do not match the search term.

grep -v "Jabberwock" jabberwocky.txt

Using the -v invert match option with grep

Lines that do not contain “Jabberwock” are listed in the terminal window.

All lines that do not contain the word jabberwock
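To try this without the poem file to hand, here is a minimal sketch that builds a four-line sample in a temporary directory and runs the normal and inverted searches side by side. The file name sample.txt and the variable names are made up for illustration.

```shell
# Build a tiny sample file in a temporary directory (hypothetical data).
workdir=$(mktemp -d)
printf '%s\n' \
  'Beware the Jabberwock, my son!' \
  'The jaws that bite, the claws that catch!' \
  'Beware the Jubjub bird, and shun' \
  'The frumious Bandersnatch!' > "$workdir/sample.txt"

# Lines that match the pattern.
matching=$(grep "Jabberwock" "$workdir/sample.txt")

# Lines that do NOT match the pattern (-v inverts the match).
non_matching=$(grep -v "Jabberwock" "$workdir/sample.txt")

printf 'Matching:\n%s\n\nNot matching:\n%s\n' "$matching" "$non_matching"
```

Between them, the two searches account for every line in the file exactly once.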

We can exclude as many terms as we want. Let’s filter out all lines that contain “Jabberwock” and all lines that contain “and”. To achieve this, we will use the -e (expression) option. We have to use it for every search pattern we provide.

grep -v -e "Jabberwock" -e "and" jabberwocky.txt

Using multiple search clauses with grep

There is a corresponding drop in the number of lines in the output.

Lines of text that do not match any of the search terms

If we use the -E (extended regular expressions) option, we can combine the search patterns with “|”. In this context it does not indicate a pipe. It is the logical OR operator.

grep -Ev "Jabberwock|and" jabberwocky.txt

Using the logical OR operator with grep

We get exactly the same result as with the previous, longer command.

Lines of text that do not match any of the search terms
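As a quick check that the two forms are interchangeable, this sketch runs both against a small sample file and compares the output. The file and variable names are assumptions made for the example.

```shell
# Build a tiny sample file in a temporary directory (hypothetical data).
workdir=$(mktemp -d)
printf '%s\n' \
  'Beware the Jabberwock, my son!' \
  'The jaws that bite, the claws that catch!' \
  'Beware the Jubjub bird, and shun' \
  'The frumious Bandersnatch!' > "$workdir/sample.txt"

# One -e option per pattern.
with_e=$(grep -v -e "Jabberwock" -e "and" "$workdir/sample.txt")

# The same patterns joined with the extended-regex OR operator.
with_alternation=$(grep -Ev "Jabberwock|and" "$workdir/sample.txt")

if [ "$with_e" = "$with_alternation" ]; then
  echo "Both forms print the same lines:"
fi
printf '%s\n' "$with_e"
```

Note that “and” is matched as a substring, so the line containing “Bandersnatch” is filtered out along with the line containing the word “and” itself.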

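The command format is the same if you want to use a regex pattern instead of an explicit search term. This command will exclude all lines that start with any letter in the set “ACHT”. The square brackets are what turn the letters into a set. Without them, the pattern would only match lines that start with the literal string “ACHT”.

grep -Ev "^[ACHT]" jabberwocky.txt

Excluding lines that start with particular letters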
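The difference the square brackets make is easy to see on a small sample. This is a sketch only, with a made-up file name, that runs the search with and without the brackets.

```shell
# Five sample lines, four of which start with A, C, H, or T (hypothetical data).
workdir=$(mktemp -d)
printf '%s\n' \
  'And hast thou slain the Jabberwock?' \
  'Come to my arms, my beamish boy!' \
  'O frabjous day! Callooh! Callay!' \
  'He chortled in his joy.' \
  'The vorpal blade went snicker-snack!' > "$workdir/sample.txt"

# Bracket expression: exclude lines starting with any ONE of A, C, H, or T.
kept=$(grep -Ev "^[ACHT]" "$workdir/sample.txt")

# No brackets: only lines starting with the literal text "ACHT" are excluded,
# and there are none, so every line is printed.
literal=$(grep -Ev "^ACHT" "$workdir/sample.txt")

printf 'With brackets:\n%s\n\nWithout brackets:\n%s\n' "$kept" "$literal"
```

With the brackets, only the line beginning with “O” survives. Without them, nothing is filtered out.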

To see lines that contain one pattern but not another, we can pipe grep into grep. We’ll search for all lines that contain the word “Jabberwock” and then filter out any lines that also contain the word “slain”.

grep "Jabberwock" jabberwocky.txt | grep -v "slain"

Piping grep into grep to filter twice
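Here is the same two-stage filter as a self-contained sketch. The sample lines and the file name are assumptions chosen so that each stage removes something.

```shell
# Four sample lines: three mention the Jabberwock, one of those also says "slain".
workdir=$(mktemp -d)
printf '%s\n' \
  'Beware the Jabberwock, my son!' \
  'The Jabberwock, with eyes of flame,' \
  'And hast thou slain the Jabberwock?' \
  'He left it dead, and with its head' > "$workdir/sample.txt"

# Stage 1 keeps lines containing "Jabberwock".
# Stage 2 drops any of those lines that also contain "slain".
result=$(grep "Jabberwock" "$workdir/sample.txt" | grep -v "slain")

printf '%s\n' "$result"
```

The first grep narrows the input to three lines, and the second removes the one that mentions “slain”, leaving two.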

File exclusion

We can ask grep to search for a string or pattern in a collection of files. You can list each file on the command line, but with many files this approach doesn’t scale.

grep "vorpal" verse-1.txt verse-2.txt verse-3.txt verse-4.txt verse-5.txt verse-6.txt

Search in a list of named files

Note that the name of the file containing the matching line is displayed at the start of each line of output.

To reduce typing, we can use wildcards. At first this seems to work, but it can backfire.

grep "vorpal" *.txt

Using wildcard characters to search a collection of files

However, in this directory there are other TXT files, which have nothing to do with the poem. If we search for the word “sword” with the same command structure, we get a lot of false positives.

grep "sword" *.txt

Looking for "sword" through a collection of TXT files

The results we want are buried under a deluge of spurious results from the other files that have the TXT extension.

A large set of false positive results

The word “vorpal” didn’t match anything in those other files, but “sword” is contained in the word “password”, so it was found many times in some pseudo log files.

We need to exclude these files. To do this, we will use the --exclude option. To exclude a single file called “vol-log-1.txt” we would use this command:

grep --exclude=vol-log-1.txt "sword" *.txt

In this case, we want to exclude several log files whose names begin with “vol”. The syntax we need is:

grep --exclude=vol*.txt "sword" *.txt

Excluding Files with Wildcards
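The effect can be reproduced with a few throwaway files. This sketch assumes GNU grep, and the file names and contents are invented for the example. The exclusion pattern is quoted here so that it always reaches grep unchanged, even if a file in the current directory happened to match it.

```shell
# One poem file and two pseudo log files (hypothetical data).
workdir=$(mktemp -d)
printf 'He took his vorpal sword in hand\n' > "$workdir/verse-3.txt"
printf 'user dave: password accepted\n' > "$workdir/vol-log-1.txt"
printf 'user mary: password rejected\n' > "$workdir/vol-log-2.txt"

# Without exclusions, "sword" also matches inside "password".
all_hits=$(cd "$workdir" && grep "sword" *.txt)

# Skip every file whose name matches vol*.txt.
filtered=$(cd "$workdir" && grep --exclude='vol*.txt' "sword" *.txt)

printf 'All hits:\n%s\n\nAfter --exclude:\n%s\n' "$all_hits" "$filtered"
```

The unfiltered search reports a hit in all three files, while the filtered search reports only the line from the poem.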

When we use the -R (dereference-recursive) option, grep will search entire directory trees for us. By default, it will search all files in these locations. There may very well be several file types that we want to exclude.

Under the current directory on this test machine, there are nested directories containing log files, CSV files, and MD files. These are all types of text files we want to exclude. We could use a --exclude option for each file type, but we can achieve what we want more efficiently by grouping file types together.

This command excludes all files that have CSV or MD extensions, and all TXT files whose names begin with “vol” or “log”.

grep -R --exclude=*.{csv,md} --exclude={vol*,log*}.txt "sword" /home/dave/data/

Using multiple --exclude clauses and filename groupings
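The groupings in curly brackets are brace expansion, which is carried out by the bash shell before grep runs. grep itself only ever sees a series of separate --exclude options. The sketch below first asks bash to print the expanded command, then runs the expanded form against a small directory tree. It assumes bash and GNU grep are installed, and the directory layout and file names are made up for the example.

```shell
# Show what grep actually receives once bash has expanded the braces.
expanded=$(bash -c 'echo grep -R --exclude=*.{csv,md} --exclude={vol*,log*}.txt "sword" data/')
printf '%s\n' "$expanded"

# Build a small tree containing each kind of file (hypothetical data).
workdir=$(mktemp -d)
mkdir -p "$workdir/data/logs"
printf 'He took his vorpal sword in hand\n' > "$workdir/data/verse-3.txt"
printf 'password,reset\n' > "$workdir/data/users.csv"
printf 'Change your password often\n' > "$workdir/data/notes.md"
printf 'password accepted\n' > "$workdir/data/logs/vol-log-1.txt"
printf 'password rejected\n' > "$workdir/data/logs/log-2.txt"

# The expanded form written out by hand, with each pattern quoted.
hits=$(grep -R --exclude='*.csv' --exclude='*.md' \
            --exclude='vol*.txt' --exclude='log*.txt' \
            "sword" "$workdir/data")
printf '%s\n' "$hits"
```

Only the line from verse-3.txt is reported. The CSV, MD, and log files are skipped wherever they sit in the tree.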

Exclude directories

If the files we want to ignore are contained in directories and there are no files in those directories that we want to search, we can exclude those entire directories.

The concept is very similar to excluding files, except we use the --exclude-dir option and name the directories to ignore.

grep -R --exclude-dir=backup "vorpal" /home/dave/data

Exclude a directory from the search

We’ve excluded the “backup” directory, but we’re still looking in another directory called “backup2”.

It will not be surprising that we can use the --exclude-dir option multiple times in a single command. Note that the path to excluded directories must be given relative to the directory in which the search will begin. Do not use the absolute path from the root of the filesystem.

grep -R --exclude-dir=backup --exclude-dir=backup2 "vorpal" /home/dave/data

Exclude two directories from search

We can also use groupings. We can achieve the same thing more succinctly with:

grep -R --exclude-dir={backup,backup2} "vorpal" /home/dave/data

Exclude directories with grouping
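To see how the directory exclusions behave, this sketch builds a tree with the same line of text in three directories and searches it with one and then two exclusions. It assumes GNU grep, and the directory and file names are invented for the example.

```shell
# The same line of the poem in three directories (hypothetical layout).
workdir=$(mktemp -d)
mkdir -p "$workdir/data/poems" "$workdir/data/backup" "$workdir/data/backup2"
for dir in poems backup backup2; do
  printf 'He took his vorpal sword in hand\n' > "$workdir/data/$dir/verse-3.txt"
done

# Excluding "backup" does not exclude "backup2".
one_excluded=$(grep -R --exclude-dir=backup "vorpal" "$workdir/data")

# Naming both directories skips both of them.
both_excluded=$(grep -R --exclude-dir=backup --exclude-dir=backup2 "vorpal" "$workdir/data")

printf 'One exclusion:\n%s\n\nTwo exclusions:\n%s\n' "$one_excluded" "$both_excluded"
```

With one exclusion, matches come back from “poems” and “backup2”. With both, only the match in “poems” remains.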

You can combine file and directory exclusions in the same command. If you want to exclude all files in a directory and also exclude certain file types from the directories that are searched, use this syntax:

grep -R --exclude=*.{csv,md} --exclude-dir=backup/archive "frumious" /home/dave/data

Excluding file types and directories in the same command

Sometimes it’s what you leave out

Sometimes, using grep can feel like trying to find a needle in a haystack. Removing the haystack makes a big difference.
