Regular Expressions
Last updated
This page contains recommendations for using regular expressions.
Do not use regular expressions when a clean non-regex solution exists, for example a substring search or plain if conditions.
Use a regular expression engine that guarantees linear-time matching, at least when the regular expression is user-provided or when a hard-coded expression is matched against user-controlled data; see the Linear time regular expression matching implementation section.
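Do not use multi-line matching mode in regexes that are used for validation. In that mode ^ and $ match at every line boundary, so a value that contains a newline can pass validation on one line only. If multi-line mode cannot be avoided, make sure that full-string matching with ^...$ works as expected, or rewrite the regex with stricter anchors such as \A...\z.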
Remember that in some engines multi-line matching mode is the default, for example in the built-in regex engine in Ruby.
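The difference between line anchors and whole-string anchors can be sketched in Python as follows. This is only an illustration: the pattern and the sample value are made up, and the example uses \Z, which in Python's re module matches only at the very end of the string (the role \z plays in Ruby and several other engines).

```python
import re

# Hypothetical attacker-controlled value: a valid first line followed by a payload.
value = "alice\n<script>alert(1)</script>"

# In multi-line mode ^ and $ match at line boundaries,
# so the first line alone satisfies the pattern.
multiline = re.search(r"^[a-z]+$", value, re.MULTILINE)

# \A and \Z anchor to the start and end of the whole string.
anchored = re.search(r"\A[a-z]+\Z", value)

# re.fullmatch requires the entire string to match, without explicit anchors.
full = re.fullmatch(r"[a-z]+", value)

# Even without multi-line mode, $ in Python also matches just before a trailing newline.
trailing = re.search(r"^[a-z]+$", "alice\n")

print(multiline is not None)  # True: validation is bypassed
print(anchored is not None)   # False
print(full is not None)       # False
print(trailing is not None)   # True: the trailing newline slips through
```

In Python, re.fullmatch is usually the simplest way to express "the whole string must match" for validation.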
Validate input strings before matching them against a regex, at least for string length and allowed characters; see the Input Validation page.
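A minimal sketch of such a pre-check in Python is shown here. The length limit, the allowed character set, the username pattern, and the function name is_valid_username are all assumptions chosen for illustration, not values taken from this page.

```python
import re
import string

MAX_LENGTH = 64  # assumed limit, pick one that fits the field
# Assumed allow-list of characters for this hypothetical field.
ALLOWED_CHARS = frozenset(string.ascii_letters + string.digits + "._-")
# Hypothetical pattern: a letter followed by 2 to 31 allowed characters.
USERNAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9._-]{2,31}")


def is_valid_username(value: str) -> bool:
    """Run cheap length and character checks before the regex sees the input."""
    if len(value) > MAX_LENGTH:
        return False
    if not ALLOWED_CHARS.issuperset(value):
        return False
    # fullmatch requires the whole string to match the pattern.
    return USERNAME_RE.fullmatch(value) is not None


print(is_valid_username("alice_01"))
print(is_valid_username("a" * 100))
print(is_valid_username("bob\n"))
```

The length and character checks run in time proportional to the input size, so oversized or unexpected input is rejected before any regex matching takes place.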
Use the following practices to simplify regular expressions and reduce the likelihood of problems with catastrophic backtracking:
Avoid nested quantifiers, for example (a+)+.
Try to be as precise as possible and avoid the . (any character) pattern.
Use reasonable ranges, for example {1,10}, for repeating patterns instead of the unbounded * and + quantifiers.
Keep character classes as narrow as possible, for example [ab] instead of [a-z0-9].
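The practices above can be illustrated with a small Python sketch. The patterns are toy examples, and the bound of 10 is an assumption; the rewritten pattern accepts the same strings as the nested one only up to that bound. The inputs are kept short on purpose, because the nested pattern can take exponential time on long non-matching input in a backtracking engine.

```python
import re

# Nested quantifier: a run of "a" can be split between the inner and outer +
# in many ways, so a failing match may backtrack through all of them.
RISKY = re.compile(r"^(a+)+$")

# A single bounded quantifier leaves only one way to consume the run.
SAFER = re.compile(r"a{1,10}")

samples = ["a", "aaaa", "aaab", "", "a" * 10]

for s in samples:
    risky_ok = RISKY.match(s) is not None
    safer_ok = SAFER.fullmatch(s) is not None
    print(repr(s), risky_ok, safer_ok)

# The bound also acts as a length limit: 11 characters are rejected.
print(SAFER.fullmatch("a" * 11) is not None)
```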
Log regex failures, especially when the regex is used for validation; see the Logging and Monitoring page.
Comply with requirements from the Error and Exception Handling page.
Use a regular expression engine that guarantees linear-time matching for all regular expressions; see the Linear time regular expression matching implementation section.
The re2 engine provides linear-time matching. Prefer a library that is based on re2.
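As a rough sketch of what this can look like in Python, the example below assumes the third-party google-re2 package is installed, which exposes a module named re2 with an interface similar to the standard re module. The helper name search_user_pattern is hypothetical. When the module is missing, the helper fails closed instead of silently falling back to a backtracking engine.

```python
try:
    # Third-party module from the google-re2 package (assumed to be installed).
    import re2
    HAVE_RE2 = True
except ImportError:
    HAVE_RE2 = False


def search_user_pattern(pattern: str, text: str) -> bool:
    """Search text with a user-provided pattern, using the re2 engine only."""
    if not HAVE_RE2:
        # Fail closed rather than fall back to a backtracking engine.
        raise RuntimeError("linear-time regex engine is not available")
    return re2.search(pattern, text) is not None
```

Note that re2 does not support some features of backtracking engines, such as backreferences and lookaround, so existing patterns may need to be rewritten.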