Elasticsearch ships with a wide range of built-in analyzers, which can be used in any index without further configuration:
The standard analyzer divides text into terms on word boundaries, as defined
by the Unicode Text Segmentation algorithm. It removes most punctuation,
lowercases terms, and supports removing stop words.
The simple analyzer divides text into terms whenever it encounters a
character which is not a letter. It lowercases all terms.
The whitespace analyzer divides text into terms whenever it encounters any
whitespace character. It does not lowercase terms.
The stop analyzer is like the simple analyzer, but also supports removal
of stop words.
The keyword analyzer is a “noop” analyzer that accepts whatever text it is
given and outputs the exact same text as a single term.
The pattern analyzer uses a regular expression to split the text into terms.
It supports lowercasing and stop words.
Elasticsearch also provides language-specific analyzers, such as english or
french.
The fingerprint analyzer is a specialist analyzer that creates a
fingerprint of the text, which can be used for duplicate detection.
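
To make the differences between these analyzers concrete, the following is a minimal Python sketch that approximates a few of them outside Elasticsearch. It is not how Elasticsearch implements them: the function names, the tiny STOP_WORDS set, and the regex-based tokenization are all assumptions made for illustration, and the fingerprint steps (lowercase, fold accents, sort, deduplicate, join) use a simplified tokenizer.

```python
import re
import unicodedata

# Illustrative stop-word subset only; an assumption, not Elasticsearch's list.
STOP_WORDS = {"a", "an", "and", "is", "of", "the", "to"}


def simple_analyzer(text):
    """Split on every non-letter character and lowercase each term."""
    # [^\W\d_]+ matches runs of Unicode letters (no digits, no underscore).
    return [term.lower() for term in re.findall(r"[^\W\d_]+", text)]


def whitespace_analyzer(text):
    """Split on whitespace only; case and punctuation are left untouched."""
    return text.split()


def stop_analyzer(text, stop_words=STOP_WORDS):
    """Behave like simple_analyzer, then drop stop words."""
    return [term for term in simple_analyzer(text) if term not in stop_words]


def keyword_analyzer(text):
    """Return the whole input unchanged as a single term."""
    return [text]


def ascii_fold(text):
    """Strip combining accents after Unicode decomposition (rough folding)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fingerprint_analyzer(text):
    """Lowercase, fold, sort and deduplicate terms, then join into one term."""
    terms = re.findall(r"\w+", ascii_fold(text).lower())
    return [" ".join(sorted(set(terms)))]


sample = "The QUICK brown-fox's 2 bones!"
print(simple_analyzer(sample))      # letters only, lowercased
print(whitespace_analyzer(sample))  # split on spaces, nothing else changed
print(stop_analyzer(sample))        # as simple, minus stop words
print(keyword_analyzer(sample))     # one term: the original text
print(fingerprint_analyzer("Yes yes, Gödel said this sentence is consistent and."))
```

Because the fingerprint sketch sorts and deduplicates its terms, two inputs that differ only in word order, case, or repetition produce the same single term, which is what makes it usable for duplicate detection.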
If you do not find an analyzer suitable for your needs, you can create a
custom analyzer which combines the appropriate character filters,
tokenizer, and token filters.
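
As a rough sketch of what such a combination looks like, the Python snippet below builds an index settings body that defines one custom analyzer. The helper custom_analyzer_settings and the analyzer name my_custom_analyzer are hypothetical; html_strip, standard, lowercase, and asciifolding are assumed here as the chosen built-in character filter, tokenizer, and token filters. The snippet only prints the JSON and does not contact a cluster; the body would typically be sent when creating an index.

```python
import json


def custom_analyzer_settings(name, tokenizer, char_filters=(), token_filters=()):
    """Build an index settings body defining a single custom analyzer.

    The helper itself is hypothetical; the nested keys follow the
    settings -> analysis -> analyzer layout used for index analysis settings.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    name: {
                        "type": "custom",
                        # Character filters run first, on the raw text.
                        "char_filter": list(char_filters),
                        # Exactly one tokenizer splits the text into terms.
                        "tokenizer": tokenizer,
                        # Token filters then run in the order listed.
                        "filter": list(token_filters),
                    }
                }
            }
        }
    }


body = custom_analyzer_settings(
    "my_custom_analyzer",  # hypothetical analyzer name
    tokenizer="standard",
    char_filters=["html_strip"],
    token_filters=["lowercase", "asciifolding"],
)
print(json.dumps(body, indent=2))
```

Keeping the three stages as separate arguments mirrors the order in which they apply: character filters, then the tokenizer, then token filters.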