This tutorial focuses on finding text in files using the grep command and regular expressions.

When working on a Linux system, finding text in files is a very common task done by system administrators every day.

You may want to search for specific lines in a log file in order to troubleshoot servers issues.

In some cases, you are interested in finding actions done by specific users or you want to restrict lines of a big file to a couple of lines.

Luckily for you, there are multiple ways of finding text into files on Linux but the most popular command is the grep command.

Developed by Ken Thompson in the early days of Unix, grep (globally search a regular expression and print) has been used for more than 45 years by system administrators all over the world.

In this tutorial, we will focus on the grep command and how it can help us effectively find text in files all over our system.

Ready?

Grep Syntax on Linux

As specified above, in order to find text in files on Linux, you have to use the grep command with the following syntax

$ grep <option> <expression> <path>

Note that the options and the path are optional.

grep-version

Before listing and detailing all the options provided by grep, let’s have a quick way to memorize the syntax of the grep command.

In order to remember the syntax of the grep command, just remember that grep can be written as grEP which means that the expression comes before the path.

This is a great way to remember the grep syntax and the find syntax at the same time but the find syntax is the exact opposite : path first and expression after.

Quick grep examples

There are complex options that can be used with grep, but let’s start with a set of very quick examples.

Listing users using grep

On Linux, as you already probably know it, user accounts are listed in a specific file called the passwd file.

In order to find the root account in a specific file, simply enter your text and the file you want to search into.

$ grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash

Another very popular way of using the grep command is to look for a specific process on your Linux system.

Filtering Processes using grep

As explained in one of our previous tutorials, you have to use the “ps” command in order to list all the processes currently running on your system.

You can pipe the “ps” command with the “grep” command in order to filter the processes you are interested in.

$ ps aux | grep <process>

If you are interested in bash processes for example, you can type the following command

$ ps aux | grep bash

root      1230  0.0  0.0  23068  1640 tty1     S+   Jan11   0:00 -bash
user      2353  0.0  0.1  23340  5156 pts/0    Ss   03:32   0:00 -bash
user      2473  0.0  0.0  14856  1056 pts/0    S+   03:45   0:00 grep --color=auto bash
user      6685  0.0  0.0  23140  1688 pts/2    Ss+  Nov09   0:00 bash
Note : if you are not sure about how to use pipes on Linux, here’s a complete guide on input and output redirection.

Inspecting Linux Kernel logs with grep

Another great usage of the grep command is to inspect the Linux Kernel buffer ring.

This is heavily used when performing troubleshooting operations on Linux systems because the kernel will write to its buffer ring when starting or booting up.

Let’s say for example that you introduced a new disk into your system and you are not sure about the name given to this new disk.

In order to find out this information, you can use the “dmesg” command and pipe it to the grep command.

$ dmesg | grep -E sd.{1}

list-disks-grep

Grep Command Options

The grep command is very useful by itself but it is even more useful when used with options.

The grep command literally has a ton of different options.

The following sections will serve as a guide in order to use those options properly and examples will be given along the way.

Search specific string using grep

In some cases, you may interested in finding a very specific string or text into a file.

In order to restrict the text search to a specific string, you have to use quotes before and after your search term.

$ grep "This is a specific text" .

To illustrate this option, let’s pretend that you are looking for a specific username on your system.

As many usernames may start with the same prefix, you have to search for the user using quotes.

$ grep "devconnected" /etc/passwd

grep-specific-text

Search text using regular expressions

One of the greatest features of the grep command is the ability to search for text using regular expressions.

Regular expressions are definitely a great tool to master : they allow users to search for text based on patterns like text starting with a specific letters or text that can be defined as an email address.

Grep supports two kinds of regular expressions : basic and extended regular expressions.

Basic Regular Expressions (BRE)

The main difference between basic and extended regular expressions is the fact that you can use regular expressions symbols with BRE (basic regular expressions) but they will have to be preceded by a backslash.

Most common regular expression patterns are detailed below with examples :

  • ^ symbol : also called the caret symbol, this little hat symbol is used in order to define the beginning of a line. As a consequence, any text after the caret symbol will be matched with lines starting by this text.

For example, in order to find all drives starting with “sd” (also called SCSI disks), you can use the caret symbol with grep.

$ lsblk | grep "^sb"

caret-symbol-linux

  • $ symbol : the dollar sign is the opposite of the caret symbol, it is used in order to define the end of the line. As a consequence, the pattern matching will stop right before the dollar sign. This is particularly useful when you want to target a specific term.

In order to see all users having bash shell on your system, you could type the following command

$ cat /etc/passwd | grep "bash$"

dollar

  •  (dot symbol) : the dot symbol is used to match one single character in a regular expression. This can be particularly handy when search terms contain the same letters at the beginning and at the end but not in the middle.

If for example, you have two users on your system, one named “bob” and one named “bab”, you could find both users by using the dot symbol.

$ cat /etc/passwd | grep "b.b"

dot-regex

  • [ ] (brackets symbol) : this symbol is used to match only a subset of characters. If you want to only match “a”, or “o”, or “e” characters, you would enclose them in brackets.

Back to the “bob” example, if you want to limit your search to “bob” and “bab”, you could type the following command

$ cat /etc/passwd | grep "b[ao]b"

brackets-regex

Using all the options provided before, it is possible to isolate single words in a file : by combining the caret symbol with the dollar symbol.

$ grep "^word$" <file|path>

word-grep

Luckily for you, you don’t have to type those characters every time that you want to search for single word entries.

You can use the “-w” option instead.

$ grep -w <expression> <file|path>

search-word

Extended Regular Expressions (ERE)

Extended regular expressions as its name states are regular expressions that are using more complex expressions in order to match strings.

You can use extended regular expressions in order to build an expression that is going to match an email address for example.

In order to find text in files using extended regular expressions, you have to use the “-E” option.

$ grep -E <expression> <path>

One great usage of the extended regular expressions is the ability to search for multiple search terms for example.

Searching multiple strings in a file

In order to search for multiple strings in a file, use the “-E” option and put your different search terms separated by straight lines (standing for OR operators in regular expressions)

$ grep -E "text1|text2|text3" <path>

Back to our previous grep example, you could find the root account and the bob account using extended regular expressions.

$ grep -E "root|bob" /etc/passwd

regular-expression-or

Search for IP addresses using grep

In some cases, you may want to isolate IP addresses in a single file : using extended regular expressions is a great way of finding IP addresses easily.

Plenty of different websites provide ready-to-use regular expressions : we are going to use this one for IP addresses.

"\b([0-9]{1,3}\.){3}[0-9]{1,3}\b"

How would you read this regular expression?

An IP address is made of 4 3-digit numbers separated by dots, this is exactly what this regular expression describes.

([0-9]{1,3}\.){3}      = 3 3-digits numbers separated by dots

[0-9]{1,3}             = the last 3-digits number ending the IP address

Here is how you would search for IP addresses using the grep command

grep -E "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" <file|path>

ip-address-regex

Search for URL addresses using grep

Similarly, it is entirely possible to search for URL addresses in a file if you are working with website administration on a daily basis.

Again, many websites provide regular expressions for URLs, but we are going to use this one.

grep -E '(http|https)://[^/"]+' <file|path>

url-regex (1)

Search for email addresses using grep

Finally, it is possible to search for email addresses using extended regular expressions.

To find email adresses, you are going to use the following regular expression

grep -E "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" <file|path>

grep-email

Now that you have seen how to use extended regular expressions with grep, let’s see how you can recursively find text in a file using options.

Find Text Recursively with grep

In order to find text recursively (meaning exploring every directory and its children) on Linux, you have to use “grep” with the “-r” option (for recursive)

$ grep -R <expression> <path>

For example, to search for all files containing the word “log” in the /var/log directory, you would type

$ grep -R "log$" /var/log

Using this command, it is very likely that you will see a lot of entries with permission denied.

In order to ignore those permission denied entries, redirect the output of your command to /dev/null

$ grep -R "log$" /var/log 2> /dev/null

recursive

In order to find text recursively, you can also use the “-d” option with the “recurse” action.

$ grep -d recurse "log$" /var/log

Searching text recursively can be pretty handy when you are trying to find specific configuration files on your system.

Printing line numbers with grep

As you can see in our previous examples, we were able to isolate lines matching the pattern specified.

Over, if files contain thousands of lines, it would be painful identifying the file but not the line number.

Luckily for you, the grep option has an option in order to print line numbers along with the different matches.

To display line numbers using grep, simply use the “-n” option.

$ grep -n <expression> <path>

Going back to our user list example, if we want to know on which line those entries are, we would type

$ grep -n -E "root|bob" /etc/passwd

line-numbers

Find text with grep using case insensitive option>

In some cases, you may not be sure if the text is written with uppercase or lowercase letters.

Luckily for you, the grep command has an option in order to search for text in files using a case insensitive option.

To search for text using the case insensitive option, simply use the “-i” option.

$ grep -i <expression> <path>

case-insensitive

Exclude patterns from grep

Grep is used as a way to identify files containing a specific files, but what if you wanted to do the exact opposite?

What if you wanted to find files not containing a specific string on your Linux system?

This is the whole purpose of the invert search option of grep.

To exclude files containing a specific string, use “grep” with the “-v” option.

$ grep -v <expression> <file|path>

As a little example, let’s say that you have three files but two of them contain the word “log”.

In order to exclude those files, you would have to perform an invert match with the “-v” option.

invert-grep

Show filenames using grep

In some cases, you are not interested in finding text inside files, but only in their filenames.

In order to print only filenames, and not the filename with the actual output, use the “-l” option.

$ grep -l <expression> <path>

Using our previous example, we would not get the content of the file, but only the filename.

grep-l

Conclusion

In this tutorial, you learnt how you can easily find text in files on Linux.

You learnt that you can use many different options : basic regular expressions or more advanced (extended) regular expressions if you are looking to match IP addresses or phone numbers for example.

You also discovered that you can perform inverse lookups in order to find files not matching a specific pattern on your system.

If you are interested in Linux system administration, we have a complete section dedicated to it on the website, so make sure to have a look!

Leave a Reply

Your email address will not be published.