The path_hierarchy tokenizer takes a hierarchical value like a filesystem
path, splits on the path separator, and emits a term for each component in the
tree.
POST _analyze
{
  "tokenizer": "path_hierarchy",
  "text": "/one/two/three"
}The above text would produce the following terms:
[ /one, /one/two, /one/two/three ]
The path_hierarchy tokenizer accepts the following parameters:
| 
 | 
    The character to use as the path separator.  Defaults to  | 
| 
 | 
    An optional replacement character to use for the delimiter.
    Defaults to the  | 
| 
 | 
    The number of characters read into the term buffer in a single pass.
    Defaults to  | 
| 
 | 
    If set to  | 
| 
 | 
    The number of initial tokens to skip.  Defaults to  | 
In this example, we configure the path_hierarchy tokenizer to split on -
characters, and to replace them with /.  The first two tokens are skipped:
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "path_hierarchy",
          "delimiter": "-",
          "replacement": "/",
          "skip": 2
        }
      }
    }
  }
}
POST my_index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "one-two-three-four-five"
}The above example produces the following terms:
[ /three, /three/four, /three/four/five ]
If we were to set reverse to true, it would produce the following:
[ one/two/three/, two/three/, three/ ]