The Power of Regular Expression (Part 2 - grep/egrep)

Basic Usage 

egrep or grep -E

Run grep with extended regular expressions.
Ignore case (ie uppercase, lowercase letters).
Return all lines which don't match the pattern.
Select only matches that form whole words.
Print a count of matching lines.
Can be combined with the -v option to print a count of non matchine lines.
Print the name of each file which contains a match.
Normally used when grep is invoked with wildcards for the file argument.
Print the line number before each line that matches.

Regular Expression 
A single character
Range. ie any one of these characters
Not range. A character that is not one of those enclosed.
Group these characters and remember for later.
Replace n with a number. Recall the charactes matched in that set of brackets.
May also be used to rename files or directories.
The logical 'or' operation.
In front of a character, removes it's special meaning.

Regular Expression Multiplier 
The preceding item is optional, it is matched zero or one times.
The preceding item will be matched zero or more times.
The preceding item will be matched one or more times.
The preceding item will be matched exactly n times.
The preceding item will be matched n or more times.
The preceding item will be matched between n and m times.

Regular Expressions Anchors 
From the beginning of the line.
To the end of the line.
At the beginning of a word.
At the end of a word.
Match either the beginning or end of a word.
egrep 'mellon' myfile.txt
Print every line in myfile.txt containing the string 'mellon'.
egrep -n 'mellon' myfile.txt
Same as above but print a line number before each line.
egrep '(.)bb\1' myfile.txt
Find every line with 2 b's and the same character both before and after those b's.
egrep -l '[0-9]{8,}' /files/projectx/*
Print each file in the directory projectx which contains a number of 8 digits or more.
egrep '\b[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,4}\b' myfile.txt
Print every line of myfiles.txt containing an email address.
Note: this is just a simple email matching pattern. There is a miniscule number of email addresses it will not match.


Popular Posts