For more than 100 detailed regular expression recipes that include equal coverage for eight programming languages (C#, Java, Java Script, Perl, PHP, Python, Ruby, and VB. We’ll explain more about the question mark after discussing the other types of tokens in this regular expression.
NET), get your very own copy of , so that your phone number records are consistent. The parentheses that appear without backslashes are capturing groups and are used to remember the values matched within them so that the matched text can be recalled later.
The first word boundary is relevant only when matching a number without parentheses, since the word boundary always matches between the opening parenthesis and the first digit of a phone number.
You can allow an optional, leading “1” for the country code (which covers the North American Numbering Plan region) via the addition shown in the following regex: .
The first group can optionally be enclosed with parentheses, and the first two groups can optionally be followed with a choice of three separators (a hyphen, dot, or space). It’s important that the hyphen appears first in this character class, because if it appeared between other characters, it would create a range, as with .
The following layout breaks the regular expression into its individual parts, omitting the redundant groups of digits: ^ # Assert position at the beginning of the string. Any quantifier that allows something to be repeated zero times effectively makes that element optional.
Two simple changes allow the previous regular expression to match phone numbers within longer text: matches the position between a word character and either a nonword character or the beginning or end of the text.
Letters, numbers, and underscore are all considered word characters (see Recipe 2.6).
The entire, added noncapturing group is also optional, but since the “1” is required within the group, the preceding plus sign and separator are not allowed if there is no leading “1”.A regular expression can easily check whether a user entered something that looks like a valid phone number. In this case, backreferences to the captured values are used in the replacement text so we can easily reformat the phone number as needed.By using capturing groups to remember each set of digits, the same regular expression can be used to replace the subject text with precisely the format you want. Two other types of tokens used in this regular expression are character classes and quantifiers.This includes the United States and its territories, Canada, Bermuda, and 16 Caribbean nations.It excludes Mexico and the Central American nations.
This is a textbook example of where we need a backslash to escape a special character so the regular expression treats it as literal input.