When the built-in analyzers do not fulfill your needs, you can create a
custom analyzer which uses the appropriate combination of zero or more
character filters, a tokenizer, and zero or more token filters.
The custom analyzer accepts the following parameters:
| Parameter | Description |
| --- | --- |
| tokenizer | A built-in or customised tokenizer. (Required) |
| char_filter | An optional array of built-in or customised character filters. |
| filter | An optional array of built-in or customised token filters. |
| position_increment_gap | When indexing an array of text values, Elasticsearch inserts a fake "gap" between the last term of one value and the first term of the next value to ensure that a phrase query doesn't match two terms from different array elements. Defaults to 100. |
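As a rough sketch of how these parameters fit together (this request is not from the original text; the index name, analyzer name, and the gap value of 50 are purely illustrative), a custom analyzer that sets all four parameters might look like this:

PUT my_sketch_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_sketch_analyzer": {
          "type": "custom",
          "char_filter": [ "html_strip" ],
          "tokenizer": "standard",
          "filter": [ "lowercase" ],
          "position_increment_gap": 50
        }
      }
    }
  }
}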
Here is an example that combines the following:
Character Filter: HTML Strip Character Filter
Tokenizer: Standard Tokenizer
Token Filters: Lowercase Token Filter, ASCII-Folding Token Filter
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type":      "custom",  "tokenizer": "standard",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  }
}
POST my_index/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "Is this <b>déjà vu</b>?"
}
          "tokenizer": "standard",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  }
}
POST my_index/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "Is this <b>déjà vu</b>?"
}| 
Setting type to custom tells Elasticsearch that we are configuring a custom analyzer.
The above example produces the following terms:
[ is, this, deja, vu ]
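The analyzer is usually attached to one or more fields rather than used only through the _analyze API. The request below is a sketch that is not part of the original example: the index name my_index_2 and the title field are illustrative, and the typeless mappings syntax assumes Elasticsearch 7.0 or later (earlier versions nest properties under a mapping type).

PUT my_index_2
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "char_filter": [ "html_strip" ],
          "filter": [ "lowercase", "asciifolding" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_custom_analyzer"
      }
    }
  }
}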
The previous example used tokenizer, token filters, and character filters with their default configurations, but it is possible to create configured versions of each and to use them in a custom analyzer.
Here is a more complicated example that combines the following:
Character Filter: Mapping Character Filter, configured to replace :) with _happy_ and :( with _sad_
Tokenizer: Pattern Tokenizer, configured to split on punctuation characters
Token Filters: Lowercase Token Filter, Stop Token Filter, configured to use the pre-defined list of English stop words
Here is an example:
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "char_filter": [
            "emoticons"
          ],
          "tokenizer": "punctuation",
          "filter": [
            "lowercase",
            "english_stop"
          ]
        }
      },
      "tokenizer": {
        "punctuation": {
          "type": "pattern",
          "pattern": "[ .,!?]"
        }
      },
      "char_filter": {
        "emoticons": {
          "type": "mapping",
          "mappings": [
            ":) => _happy_",
            ":( => _sad_"
          ]
        }
      },
      "filter": {
        "english_stop": {
          "type": "stop",
          "stopwords": "_english_"
        }
      }
    }
  }
}
POST my_index/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text":     "I'm a :) person, and you?"
}
The emoticons character filter, punctuation tokenizer and english_stop token filter are custom implementations which are defined in the same index settings.
The above example produces the following terms:
[ i'm, _happy_, person, you ]
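As an additional check that is not in the original text, the other emoticon mapping can be exercised the same way; with the definitions above, a request such as the following should produce the terms [ feeling, _sad_, today ]:

POST my_index/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "Feeling :( today"
}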