I find regular expressions obtuse, maddening, and at the same time, strangely compelling. For those who don’t know, regular expressions (usually shortened to “regex”) are a common way for computer programs to search text for particular patterns of characters, such as a date or an address. Regex gives you atoms that represent individual characters (“a”, “6”, “#”, etc.) or general character types (digits, symbols, uppercase…), along with a concise grammar that defines how these characters may appear. In other words, regex is a set of small building blocks plus specific rules for putting those blocks together to represent something bigger.
A quick example
How to match a date like 2014/12/24?
A digit is written as \d, so the following regex matches any single character from 0 to 9:
\d
What if you are looking for a series of 4 digits? You could do this:
\d\d\d\d
But that gets heavy if you are looking for 200 digits, so you can put the number of repetitions in curly braces instead:
\d{4}
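To see the repetition shorthand in action, here is a minimal sketch in Python (my choice of language, using the standard re module; the sample text is made up). It checks that the long form and the short form find the same thing, and that the 200-digit case really is just a change of number. One caveat: on ordinary Python strings, \d also accepts digits from other scripts unless you pass re.ASCII.

```python
import re

# Made-up sample text for illustration.
text = "Order 1234 shipped"

# Four \d atoms written out by hand...
long_form = re.search(r"\d\d\d\d", text)
# ...and the same thing with a repetition count.
short_form = re.search(r"\d{4}", text)

print(long_form.group())   # 1234
print(short_form.group())  # 1234

# The count scales without the pattern growing: 200 digits in a row.
two_hundred_digits = "7" * 200
print(re.fullmatch(r"\d{200}", two_hundred_digits) is not None)  # True
```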
Okay, so now you have matched the first 4 digits. How about matching the “/” character? Since it’s just a specific character, go ahead and type it in:
\d{4}/
Then match 2 digits, another “/”, and 2 more digits:
\d{4}/\d{2}/\d{2}
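Putting the pieces together, here is a small sketch of the finished date pattern at work, again in Python with the standard re module. The names DATE_PATTERN and find_dates are mine, purely for illustration, and so is the sample sentence.

```python
import re

# 4 digits, "/", 2 digits, "/", 2 digits (hypothetical name, for illustration).
DATE_PATTERN = re.compile(r"\d{4}/\d{2}/\d{2}")


def find_dates(text):
    """Return every substring of text shaped like YYYY/MM/DD."""
    return DATE_PATTERN.findall(text)


print(find_dates("Released on 2014/12/24, patched on 2015/01/07."))
# ['2014/12/24', '2015/01/07']

# The pattern only checks the shape, not whether the date exists.
print(find_dates("Nonsense date: 2014/99/99"))
# ['2014/99/99']
```

Note that the pattern describes what a date looks like, not what a valid date is. Rejecting month 99 would need either a stricter pattern or a separate check after the match.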
Building a regex is a mental puzzle that requires you to visualize how patterns can be broken down into concise nuggets of representative abstraction.
The following website turns a regex into a diagram, giving you a useful visual depiction of its logic.