Next: Regexp Functions, Previous: Syntax of Regexps, Up: Regular Expressions
Here is a complicated regexp which was formerly used by Emacs to
recognize the end of a sentence together with any whitespace that
follows. (Nowadays Emacs uses a similar but more complex default
regexp constructed by the function sentence-end
.
See Standard Regexps.)
Below, we show first the regexp as a string in Lisp syntax (to distinguish spaces from tab characters), and then the result of evaluating it. The string constant begins and ends with a double-quote. ‘\"’ stands for a double-quote as part of the string, ‘\\’ for a backslash as part of the string, ‘\t’ for a tab and ‘\n’ for a newline.
"[.?!][]\"')}]*\\($\\| $\\|\t\\| \\)[ \t\n]*" ⇒ "[.?!][]\"')}]*\\($\\| $\\| \\| \\)[ ]*"
In the output, tab and newline appear as themselves.
This regular expression contains four parts in succession and can be deciphered as follows:
[.?!]
[]\"')}]*
\"
is Lisp syntax for a double-quote in
a string. The ‘*’ at the end indicates that the immediately
preceding regular expression (a character alternative, in this case) may be
repeated zero or more times.
\\($\\| $\\|\t\\| \\)
[ \t\n]*