Regular Expressions
A regular expression (regex or regexp for short) is a special text string that describes a search pattern. It goes beyond simple wildcard (*) expressions and can be used to create specific and highly complex matching rules.
This article provides only basic information about regular expressions.
Special characters
Character | Meaning | Example |
abc | Literal characters, matches part of text. | lo matches hello! but not world |
\ | Indicates that the next character is special, or escapes a special character so that it is treated literally | |
^ | Matches beginning of text or line start | ^A matches ABC but not BAC. |
$ | Matches end of text or line end | C$ matches ABC but not ACB. |
* | Matches the preceding character zero or more times | test* matches test, tes or testttt |
+ | Matches the preceding character one or more times | test+ matches test or testttt but not tes |
? | Matches the preceding character zero or one time. Equivalent to {0,1} | test? matches test, tes but not testttt |
. | Matches any single character except the newline character | tes. matches test, tess but not tes |
x|y | Matches either x or y, i.e. boolean OR | gray|grey can match gray or grey |
() | Parentheses are used for grouping and priority, same as in basic math | gray|grey and gr(a|e)y are equivalent patterns which both describe the set of gray and grey |
{n} | Matches exactly n occurrences of the preceding character | a{2} matches aa but not a or aaa |
{n,} | Matches at least n occurrences of the preceding character | a{2,} matches aa or aaa but not a |
{n,m} | Matches at least n and at most m occurrences of the preceding character | a{2,4} matches aa, aaa and aaaa but not a or aaaaa |
[xyz] | Character set that matches any one of the enclosed characters | [xyz] matches x, y, or z |
[x-z] | Character set that matches any one character in the given range | Same as previous |
[^xyz] | Negated character set that matches any one character not enclosed (negation of the previous two) | [^xyz] matches a, b or any other character, but not x, y, or z |
\b | Matches a word boundary (space, newline, punctuation, end of string) | |
\B | Matches a non-word boundary | |
\d | Matches a digit character. Equivalent to [0-9] | |
\D | Matches any non-digit character. Equivalent to [^0-9] | |
\n | Matches a linefeed (new line) | |
\r | Matches a carriage return (new line) | |
\s | Matches a single white space character (space, tab, line feed) | |
\S | Matches a single character other than white space | |
\t | Matches a tab | |
\w | Matches any alphanumerical character including underscore. Equivalent to [A-Za-z0-9_] | |
\W | Matches any non-word character. Equivalent to [^A-Za-z0-9_] | |
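The examples in the table can be tried out in any regex engine. The following is a minimal sketch that assumes Python's standard re module; the helper names matches_whole and found_in are hypothetical and only serve the illustration. Some table examples describe a match anywhere in the text (anchors, literals), while others only hold when the whole text must match (quantifiers), so both kinds of check are shown.

```python
import re

def matches_whole(pattern: str, text: str) -> bool:
    """True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

def found_in(pattern: str, text: str) -> bool:
    """True if the pattern matches anywhere inside the text."""
    return re.search(pattern, text) is not None

# Literal characters and anchors
print(found_in("lo", "hello!"), found_in("lo", "world"))   # True False
print(found_in("^A", "ABC"), found_in("^A", "BAC"))        # True False
print(found_in("C$", "ABC"), found_in("C$", "ACB"))        # True False

# Quantifiers
print(matches_whole("test*", "tes"), matches_whole("test+", "tes"))       # True False
print(matches_whole("test?", "testttt"))                                  # False
print(matches_whole("a{2,4}", "aaaa"), matches_whole("a{2,4}", "aaaaa"))  # True False

# Alternation, grouping and character sets
print(matches_whole("gray|grey", "grey"), matches_whole("gr(a|e)y", "grey"))  # True True
print(matches_whole("[^xyz]", "a"), matches_whole("[^xyz]", "x"))             # True False

# Shorthand classes; raw strings keep the backslash intact
print(matches_whole(r"\d+", "2007"), matches_whole(r"\w+", "hello world"))    # True False
```

Other engines may differ in details such as supported shorthand classes, so treat the results above as specific to Python.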
YSoft SafeQ
Regular expressions in YSoft SafeQ are often matched against the whole tested text, i.e. they behave as if every regular expression had ^ and $ characters written around it. This means that if you write world as the regular expression and try to match it against Hello world, it will not match, because your expression is evaluated as ^world$. You need to write something like .*world.* to make it work.
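The difference between whole-text matching and searching can be sketched as follows. This is only an approximation that assumes Python's re module, where re.fullmatch behaves roughly like wrapping the pattern in ^ and $; the helper name safeq_like_match is hypothetical and is not part of YSoft SafeQ.

```python
import re

def safeq_like_match(pattern: str, text: str) -> bool:
    """Hypothetical helper: the pattern must cover the whole text."""
    return re.fullmatch(pattern, text) is not None

# The bare word does not cover the whole text, so it is rejected.
print(safeq_like_match("world", "Hello world"))       # False

# Padding the pattern with .* lets it cover the whole text.
print(safeq_like_match(".*world.*", "Hello world"))   # True

# A plain search, by contrast, finds the word anywhere in the text.
print(re.search("world", "Hello world") is not None)  # True
```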
Various examples
.*Word.* | Matches MS Word Document |
MS Word 200. | Matches MS Word 2007 but not MS Word 2010 |
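As a rough illustration of the two examples above, the sketch below filters a list of titles using whole-text matching. It assumes Python's re module; the titles list and the select helper are hypothetical. The last pattern, which is not taken from the table, shows how \d can be used when any two trailing digits should be accepted.

```python
import re

# Hypothetical document titles, used only for illustration.
titles = ["MS Word Document", "MS Word 2007", "MS Word 2010", "Excel Sheet"]

def select(pattern, items):
    """Keep only the items that the pattern matches as a whole."""
    return [item for item in items if re.fullmatch(pattern, item)]

print(select(".*Word.*", titles))
# ['MS Word Document', 'MS Word 2007', 'MS Word 2010']

print(select("MS Word 200.", titles))
# ['MS Word 2007']

print(select(r"MS Word 20\d\d", titles))
# ['MS Word 2007', 'MS Word 2010']
```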
External links
See these external sources for more information and for more complex examples and rules:
http://en.wikipedia.org/wiki/Regular_expression
http://www.regular-expressions.info/