Regular Expressions

A regular expression (regex for short) defines a search pattern for strings. The pattern can be anything from a single character or a fixed string to a complex expression containing special characters. Regular expressions are used to search, edit, and manipulate text.

The pattern is applied to the text from left to right. Once a source character has been used in a match, it cannot be reused. For example, the regex lol matches lolololol only twice, because the l that two neighboring candidates would share is consumed by the earlier match.
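The left-to-right, non-overlapping behavior can be checked with a small sketch. It assumes the standard java.util.regex classes; the class name NonOverlappingMatches and the helper countMatches are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original text.
class NonOverlappingMatches {

    // Counts how often the regex is found while scanning the input left to right.
    // Each find() call continues after the end of the previous match,
    // so characters that were already consumed are never reused.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMatches("lol", "lolololol")); // prints 2
    }
}
```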

Regex examples

The simplest regular expression is a literal string: the regex “Hi how are you” matches any text that contains exactly this sequence of characters. The following table lists several expressions and describes which patterns they match.

Regex            Matches
Hello world      Matches exactly “Hello world”
Hello\s+world    Matches the word “Hello” followed by one or more whitespace characters, followed by “world”
^\d+(\.\d+)?     ^ defines that the pattern must start at the beginning of a line. \d+ matches one or more digits. The ? makes the group in parentheses optional. \. matches “.”
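As a rough check of the expressions above, the sketch below tries them against a few sample strings. It assumes the patterns read Hello\s+world and ^\d+(\.\d+)? once their backslashes are restored; the class and constant names are illustrative. Note that inside a Java string literal every backslash of the regex has to be doubled.

```java
// Hypothetical demo class, not part of the original text.
class RegexExamples {

    // "Hello", one or more whitespace characters, then "world".
    static final String HELLO_WORLD = "Hello\\s+world";

    // One or more digits, optionally followed by a dot and more digits.
    static final String DECIMAL = "^\\d+(\\.\\d+)?";

    public static void main(String[] args) {
        System.out.println("Hello   world".matches(HELLO_WORLD)); // true
        System.out.println("Helloworld".matches(HELLO_WORLD));    // false
        System.out.println("42".matches(DECIMAL));                // true
        System.out.println("3.14".matches(DECIMAL));              // true
        System.out.println("3.".matches(DECIMAL));                // false
    }
}
```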

Matching symbols

The following table gives an overview of the available metacharacters that can be used in regular expressions.

Regex         Description
.             Matches any character (by default, any character except a line terminator)
^pattern      Finds pattern at the start of the line
pattern$      Finds pattern at the end of the line
[123]         Set definition: matches 1 or 2 or 3
[123][abc]    Compound set definition: matches 1 or 2 or 3, followed by a or b or c
[^123]        Negated set: matches any character except 1, 2, or 3
[1-5a-c]      Range: matches a digit from 1 to 5 or a letter from a to c
X|Y           Finds X or Y
12            Finds 1 directly followed by 2
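The symbols in this table can be tried out with a short sketch. The class MatchingSymbols and the helper found are hypothetical names; found looks for a partial match anywhere in the input, while String.matches requires the whole string to match.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original text.
class MatchingSymbols {

    // True if the regex is found anywhere in the input (partial match).
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("^pattern", "pattern first")); // true
        System.out.println(found("pattern$", "pattern first")); // false
        System.out.println("2b".matches("[123][abc]"));         // true
        System.out.println("4b".matches("[123][abc]"));         // false
        System.out.println("7".matches("[^123]"));              // true
        System.out.println("c".matches("[1-5a-c]"));            // true
        System.out.println("d".matches("[1-5a-c]"));            // false
        System.out.println("Y".matches("X|Y"));                 // true
        System.out.println(found("12", "0123"));                // true
    }
}
```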

Meta Characters

Regex    Description
\d       Any digit, short for [0-9]
\D       A non-digit, short for [^0-9]
\s       A whitespace character, short for [ \t\n\x0b\r\f]
\S       A non-whitespace character, short for [^\s]
\w       A word character, short for [a-zA-Z_0-9]
\W       A non-word character, short for [^\w]
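One way to see these shorthand classes in action is to classify single characters with them. This is only a sketch; the class CharacterClasses and the method classify are invented for the example.

```java
// Hypothetical demo class, not part of the original text.
class CharacterClasses {

    // Returns a label for the first shorthand class that matches the character.
    // \d is tested before \w because every digit is also a word character.
    static String classify(char c) {
        String s = String.valueOf(c);
        if (s.matches("\\d")) return "digit";
        if (s.matches("\\s")) return "whitespace";
        if (s.matches("\\w")) return "word character";
        return "non-word character";
    }

    public static void main(String[] args) {
        System.out.println(classify('7'));  // digit
        System.out.println(classify('\t')); // whitespace
        System.out.println(classify('_'));  // word character
        System.out.println(classify('-'));  // non-word character
    }
}
```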

Negative look ahead

Negative lookahead makes it possible to exclude a pattern. It is written as (?!pattern); for example, (?!b) succeeds only at a position that is not directly followed by b.
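A minimal sketch of a negative lookahead follows. The pattern a(?!b) is an illustrative choice that combines a literal a with the (?!b) lookahead; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original text.
class NegativeLookahead {

    // Returns the start index of every 'a' that is not directly followed by 'b'.
    // The lookahead only inspects the next character; it does not consume it.
    static List<Integer> findANotFollowedByB(String input) {
        Matcher matcher = Pattern.compile("a(?!b)").matcher(input);
        List<Integer> positions = new ArrayList<>();
        while (matcher.find()) {
            positions.add(matcher.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // The 'a' at index 0 is followed by 'b' and is skipped.
        System.out.println(findANotFollowedByB("ab ac a")); // prints [3, 6]
    }
}
```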

Specifying modes inside the regular expression

(?i) makes the regex case-insensitive.

(?s) for “single-line mode” makes the dot match all characters, including line breaks.

(?m) for “multi-line mode” makes the caret and dollar match at the start and end of each line in the subject string.
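The effect of each inline flag can be compared against the same pattern without it. The sketch below assumes java.util.regex semantics; the class InlineModes and the helper countMatches are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original text.
class InlineModes {

    // Counts non-overlapping matches of the regex in the input.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // (?i): case-insensitive matching
        System.out.println("HELLO".matches("hello"));     // false
        System.out.println("HELLO".matches("(?i)hello")); // true

        // (?s): the dot also matches the line break
        System.out.println("a\nb".matches("a.b"));        // false
        System.out.println("a\nb".matches("(?s)a.b"));    // true

        // (?m): ^ and $ match at the start and end of every line
        String lines = "12\nab\n34";
        System.out.println(countMatches("^\\d+$", lines));     // 0
        System.out.println(countMatches("(?m)^\\d+$", lines)); // 2
    }
}
```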

Regular expressions with String methods

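The String class provides four built-in methods that accept regular expressions. In each case x is the string being examined and s is the regex.

x.matches(String s): Evaluates whether the regex s matches x. Returns true only if the whole string is matched.

x.split(String s): Creates an array of the substrings of x, divided at each occurrence of s.

x.replaceFirst(String s, String r): Returns a copy of x in which the first occurrence of s is replaced with r.

x.replaceAll(String s, String r): Returns a copy of x in which every occurrence of s is replaced with r.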
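The String methods can be compared side by side on one sample string. This is a sketch with made-up names (StringRegexMethods, SAMPLE); replaceAll is shown as well because it is the companion of replaceFirst in java.lang.String.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original text.
class StringRegexMethods {

    // Sample input with two spaces between "one" and "two".
    static final String SAMPLE = "one  two three";

    public static void main(String[] args) {
        String x = SAMPLE;

        // matches: the regex has to cover the whole string
        System.out.println(x.matches("one.*")); // true
        System.out.println(x.matches("one"));   // false

        // split: cut the string at every run of whitespace
        System.out.println(Arrays.toString(x.split("\\s+"))); // [one, two, three]

        // replaceFirst / replaceAll: strings are immutable, so a new string is returned
        System.out.println(x.replaceFirst("\\s+", "-")); // one-two three
        System.out.println(x.replaceAll("\\s+", "-"));   // one-two-three
    }
}
```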

Quantifiers

Regex    Description                                                                            Examples
+        Occurs one or more times, short for {1,}                                               X+ finds one or more occurrences of the letter X
?        Occurs zero or one times, short for {0,1}                                              X? finds zero or exactly one occurrence of the letter X
{X}      Occurs exactly X times; {} specifies how often the preceding element is repeated       \d{3} searches for three digits, .{10} for any character sequence of length 10
{X,Y}    Occurs between X and Y times                                                           \d{1,4} means \d must occur at least once and at most four times
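The quantifiers can be checked against strings of different lengths. The sketch below uses String.matches, so each pattern must cover the whole input; the class and constant names are illustrative only.

```java
// Hypothetical demo class, not part of the original text.
class Quantifiers {

    static final String ONE_OR_MORE_X = "X+";
    static final String OPTIONAL_X = "X?";
    static final String THREE_DIGITS = "\\d{3}";
    static final String TEN_CHARS = ".{10}";
    static final String ONE_TO_FOUR_DIGITS = "\\d{1,4}";

    public static void main(String[] args) {
        System.out.println("XXX".matches(ONE_OR_MORE_X));        // true
        System.out.println("".matches(ONE_OR_MORE_X));           // false
        System.out.println("".matches(OPTIONAL_X));              // true
        System.out.println("XX".matches(OPTIONAL_X));            // false
        System.out.println("123".matches(THREE_DIGITS));         // true
        System.out.println("1234".matches(THREE_DIGITS));        // false
        System.out.println("abcdefghij".matches(TEN_CHARS));     // true
        System.out.println("1234".matches(ONE_TO_FOUR_DIGITS));  // true
        System.out.println("12345".matches(ONE_TO_FOUR_DIGITS)); // false
    }
}
```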