Regular Expressions

 

Regular expressions (regex) form a highly descriptive language commonly used to search through a set of data. Regex has several variants, but most share a similar syntax.

This section describes regex in the context of PSM. PSM allows you to use regular expressions to control how entries in the Network tree are categorized.

A regular expression is created using one or more constructs. The expression is then compared against a set of data, which, in the case of a branch in the Network tree, is the set of entries or names in that branch. The simplest way to envision this is to imagine a scanner that scans through the set of names one character at a time looking for matches to the given expression. [1]

Note

[1] In reality, the regex engine uses algorithmic methods to look for matches.
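The scanner picture can be illustrated with a short sketch. It uses Python's `re` module purely as a stand-in, since the engine PSM uses is not stated here and may behave differently. The branch names and the `matching_names` helper are hypothetical.

```python
import re

# Hypothetical entries in one branch of the Network tree (not taken from PSM).
branch_names = ["optical_node_1", "router_east", "optical_node_2"]

def matching_names(pattern, names):
    """Return the names in which the pattern matches at any position."""
    compiled = re.compile(pattern)
    # search() tries each starting position from left to right,
    # which corresponds to the "scanner" picture described above.
    return [name for name in names if compiled.search(name)]

print(matching_names(r"optical", branch_names))
print(matching_names(r"east", branch_names))
```

Names for which the pattern matches nowhere are left out of the result.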

Table 1: Common Regex Constructs

Construct

Description

Example

Characters and character classes

char

Literal. Matches if the specified char matches the character at the current position in the data set.

a matches the a in abc

. (dot)

Wildcard. Matches any single character at the current position in the data set. Does not match if there is no character at the current position (e.g. at the end of a name).

ab.d matches abcd and abdd

[chars]

Character class. Matches if any single char within the square brackets matches the character at the current position in the data set.

The characters in the brackets can be a list of characters, a range of characters, or a negation.

[a] matches the a in abc

[abc] matches the a, the b, and the c in abc

[a-zA-Z] matches any lowercase or uppercase letter

[^a0-9] matches any character that is not an a or a digit

\w

Word character, shorthand for [a-zA-Z0-9_].

\w matches the a in !@#-a-%$#

\d

Digit character, shorthand for [0-9].

\d matches the 2 in abc2def

\ (backslash)

Escape character.

When immediately preceding one of the following special characters, [\^$.|?*+(){} outside a character class, the \ suppresses the special character's meaning, and the special character is treated as a literal.

When immediately preceding one of the following special characters, ^-]\ inside a character class, the \ suppresses the special character's meaning, and the special character is treated as a literal.

\.com outside a character class matches the .com in mycompany.com
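The character constructs above can be tried out with a minimal sketch. It assumes Python's `re` module rather than PSM's own engine, so details may differ. The `first_match` helper is hypothetical, and `re.ASCII` is passed so that `\w` and `\d` behave like the shorthands in the table.

```python
import re

def first_match(pattern, text):
    """Return the first substring of text matched by pattern, or None."""
    # re.ASCII limits \w to [a-zA-Z0-9_] and \d to [0-9], as in the table.
    m = re.search(pattern, text, flags=re.ASCII)
    return m.group(0) if m else None

print(first_match(r"a", "abc"))                # literal: a
print(first_match(r"ab.d", "abdd"))            # wildcard: abdd
print(re.findall(r"[abc]", "abc"))             # class: ['a', 'b', 'c']
print(first_match(r"[^a0-9]", "a1b2"))         # negated class: b
print(first_match(r"\w", "!@#-a-%$#"))         # word character: a
print(first_match(r"\d", "abc2def"))           # digit: 2
print(first_match(r"\.com", "mycompany.com"))  # escaped dot: .com
print(first_match(r"\.com", "telecom"))        # no literal dot: None
print(first_match(r".com", "telecom"))         # unescaped dot: ecom
```

The last two lines show why the escape matters: without the backslash, the dot is a wildcard and also matches the e in telecom.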

Construct

Description

Example

Anchors

^ (caret)

Match at the start of the string or after a newline. [1]

Note: A starting ^ inside a character class is a negation.

Note: [1] Each name in the branch is separated by the newline character.

If the string is network_subnet, then:

^net matches the leading net in network_subnet

$ (dollar)

Match at the end of the string or before a newline.

If the string is network_subnet, then:

net$ matches the trailing net in network_subnet
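The anchor behavior can be sketched as follows, again assuming Python's `re` module as a stand-in for PSM's engine. In Python, `^` and `$` match at line boundaries only when `re.MULTILINE` is set. The names, joined by newlines as the note describes, are hypothetical.

```python
import re

# Hypothetical branch contents, one name per line.
names = "network_subnet\nsubnet_network\ninternet"

# Names that start with "net" and names that end with "net".
starts_with_net = re.findall(r"^net\w*", names, flags=re.MULTILINE)
ends_with_net = re.findall(r"\w*net$", names, flags=re.MULTILINE)

print(starts_with_net)  # ['network_subnet']
print(ends_with_net)    # ['network_subnet', 'internet']

# Character positions of the anchored matches within a single name.
print(re.search(r"^net", "network_subnet").span())  # (0, 3)
print(re.search(r"net$", "network_subnet").span())  # (11, 14)
```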

Construct

Description

Example

Quantification

* (asterisk)

Match the preceding item 0 or more times, as many times as possible (greedy match).

om* matches the o in optical

om* matches the omm in commissioning

+

Match the preceding item 1 or more times, as many times as possible (greedy match).

om+ matches the omm in commissioning

?

Match the preceding item 0 or 1 times, preferring 1 if possible (greedy match).

om? matches the o in optical

om? matches the om in commissioning
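The quantifier examples can be checked with a small sketch, assuming Python's `re` module, whose greedy quantifiers follow the same rules as described above. The `first_match` helper is hypothetical.

```python
import re

def first_match(pattern, text):
    """Return the first substring of text matched by pattern, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

print(first_match(r"om*", "optical"))        # o   (zero m's)
print(first_match(r"om*", "commissioning"))  # omm (as many m's as possible)
print(first_match(r"om+", "commissioning"))  # omm
print(first_match(r"om+", "optical"))        # None (at least one m is required)
print(first_match(r"om?", "optical"))        # o
print(first_match(r"om?", "commissioning"))  # om  (at most one m)
```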

Construct

Description

Example

Groupings

(expression1|expression2|...)

Alternation. Matches if any of the expressions matches.

(aaa|bbb) matches aaa or bbb

(?<=lb_expression)expression

Lookbehind. Matches if the main expression and the preceding lookbehind expression match. The part of the match due to lb_expression is not included in the matched result. The matched result consists of the match resulting from expression only.

This is useful when you want to search for a string and you want to exclude the first part of that string from the matched result itself.

(?<=www\.)\w*\.com matches www.mycompany.com

The resulting matched string is mycompany.com.

expression(?=la_expression)

Lookahead. Matches if the main expression and the following lookahead expression match. The part of the match due to la_expression is not included in the matched result. The matched result consists of the match resulting from expression only.

This is useful when you want to search for a string and you want to exclude the latter part of that string from the matched result itself.

\w*(?=@) matches john@mycompany.com

The resulting matched string is john.
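The grouping constructs can be sketched as follows, assuming Python's `re` module, which supports alternation, lookahead, and fixed-width lookbehind. PSM's engine may differ in details. The input strings are the examples from the table, plus one hypothetical string for alternation.

```python
import re

# Alternation: either expression may match.
alternatives = [m.group(0) for m in re.finditer(r"(aaa|bbb)", "aaa-ccc-bbb")]
print(alternatives)  # ['aaa', 'bbb']

# Lookbehind: "www." must precede the match but is not part of the result.
host = re.search(r"(?<=www\.)\w*\.com", "www.mycompany.com").group(0)
print(host)  # mycompany.com

# Lookahead: "@" must follow the match but is not part of the result.
user = re.search(r"\w*(?=@)", "john@mycompany.com").group(0)
print(user)  # john
```

In both lookaround cases, the text matched by the lookbehind or lookahead expression constrains where the match may occur without being included in the matched result.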