Practical RegEx For Bash, C, C++, Perl, PHP, and Python

(Updated on 05/31/2017)

 

Regular expressions aren’t just about finding text patterns: more important than matching is knowing how to use them to make substitutions in a string or split it into parts.

As programming languages have their own peculiarities, this post will simplify your life by showing how they work in Bash, C, C++, Perl, PHP, and Python!

 

 

What’s This Post About

The last post introduced many of the metacharacters used by regular expressions; this one shows what purpose they actually serve, through practical examples.

Basic RE usage comprises the following:

  • How to write the RE that will be passed to the functions and/or methods, called here the raw regular expression;
  • How to search for text patterns;
  • How to make substitutions and split the text that matched;
  • How to configure the interpreter to ignore uppercase/lowercase;
  • How to enable the global search and replace functionality.

This set covers the basics of what you need to know for each programming language. As there is an enormous number of languages, only six were chosen for this post.

How This Guide Works

For each programming language, the index lists how the features mentioned above are provided, and the example code then shows how to use them.

The example code serves as a guide so you can see each feature in use. Where applicable, highlighted lines identify where each feature is used.

Examples’ Description

All examples follow the same procedure, applying the same REs to the palindrome “Sums are not set as a test on Erasmus” (a sentence that reads the same from left to right and from right to left).

First, the example searches for erasmus, with a lowercase e and without the flag that ignores case, so the search fails. Then the same pattern is searched again with that flag enabled, and it matches.

Next, the substitution replaces not with indeed and Erasmus with Campus; neither needs the ignore-case flag.

The split then extracts each word of the sentence, using the space character as the delimiter. Finally, the global substitution flag is demonstrated by replacing only the first space with an underscore and then replacing every space.
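As a rough illustration of the procedure just described, here is a minimal Python sketch using the standard re module. The variable names (SENTENCE, sensitive, insensitive, and so on) are chosen only for this illustration and are not taken from the post's own example code.

```python
import re

SENTENCE = "Sums are not set as a test on Erasmus"

# 1. Case-sensitive search: fails because the sentence has a capital "E".
sensitive = re.search(r"erasmus", SENTENCE)

# 2. Same pattern with the ignore-case flag: now it matches.
insensitive = re.search(r"erasmus", SENTENCE, re.IGNORECASE)

# 3. Substitutions that do not need the ignore-case flag.
replaced = re.sub(r"not", "indeed", SENTENCE)
replaced = re.sub(r"Erasmus", "Campus", replaced)

# 4. Split into words, using a single space as the delimiter.
words = re.split(r" ", SENTENCE)

# 5. Replace only the first space, then every space.
#    In Python, replacing all matches is the default; count=1 limits it.
first_only = re.sub(r" ", "_", SENTENCE, count=1)
everywhere = re.sub(r" ", "_", SENTENCE)

print(sensitive)            # None
print(insensitive.group())  # Erasmus
print(replaced)             # Sums are indeed set as a test on Campus
print(words)
print(first_only)           # Sums_are not set as a test on Erasmus
print(everywhere)           # Sums_are_not_set_as_a_test_on_Erasmus
```

The other five languages follow the same five steps; only the function names and the way flags are passed change, as the index below summarizes.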

Index

Programming Languages

Bash

Search: grep or egrep (egrep is the older name for grep -E)

Replace: sed

Split: Using arrays.

Raw RE: Doesn’t apply.

Ignore Case: Modifier: -i (grep/egrep) or i (sed; a GNU extension, also written I)

Global: Modifier: g (sed)

C

Search: regexec(), after compiling the pattern with regcomp()

Replace: None.

Split: None.

Raw RE: None.

Ignore Case: Flag REG_ICASE (passed to regcomp())

Global: None.

C++

Search: regex_match, regex_search

Replace: regex_replace

Split: None.

Raw RE: std::regex, typically built from a raw string literal such as R"(er)"

Ignore Case: std::regex_constants::icase

Global: It’s the default for regex_replace (std::regex_constants::format_first_only limits it to the first match).

Perl

Search: m//

Replace: s///

Split: split

Raw RE: 'er' or qr'er'

Ignore Case: Modifier: i

Global: Modifier: g

PHP

Search: preg_match

Replace: preg_replace

Split: preg_split

Raw RE: '/er/'

Ignore Case: Modifier: i

Global: preg_match_all for searching; preg_replace already replaces every match by default (its limit argument restricts the number of replacements).

Python

Search: match, search

Replace: sub, subn

Split: split

Raw RE: r'er'

Ignore Case: Flag: re.IGNORECASE (or re.I)

Global: It’s the default for sub and subn (the count argument limits the number of replacements).
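The Python entries above list two search functions and two replace functions, so a short sketch of how each pair differs may help. It assumes the same test sentence used throughout this post; the names pattern, at_start, anywhere, and the rest are only illustrative.

```python
import re

SENTENCE = "Sums are not set as a test on Erasmus"

# A raw string (r'...') keeps backslashes literal, which is why it is the
# usual way to write a pattern. compile() stores the pattern and its flags.
pattern = re.compile(r"er", re.IGNORECASE)

# match() only succeeds if the pattern matches at the start of the string.
at_start = pattern.match(SENTENCE)

# search() scans the whole string and returns the first match found.
anywhere = pattern.search(SENTENCE)

# match() does succeed when the pattern fits the beginning of the string.
leading = re.match(r"sums", SENTENCE, re.IGNORECASE)

# sub() returns the new string; subn() also returns how many
# replacements were made, as a (string, count) tuple.
underscored = re.sub(r" ", "_", SENTENCE)
underscored_n, how_many = re.subn(r" ", "_", SENTENCE)

print(at_start)                    # None
print(anywhere.group(), anywhere.start())
print(leading.group())             # Sums
print(underscored_n, how_many)
```

In short, search is usually the one to reach for when looking for a pattern anywhere in the text, and subn is handy when the number of replacements matters.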

Final Words

Don’t forget to leave any questions in the comments, in case you need some help, and good luck!