F.21. ltree

F.21. ltree
Prev	Up	Appendix F. Additional Supplied Modules	Home	Next

F.21.1. Definitions
F.21.2. Operators and Functions
F.21.3. Indexes
F.21.4. Example
F.21.5. Transforms
F.21.6. Authors

This module implements a data type ltree for representing labels of data stored in a hierarchical tree-like structure. Extensive facilities for searching through label trees are provided.

F.21.1. Definitions

A label is a sequence of alphanumeric characters and underscores (for example, in C locale the characters A-Za-z0-9_ are allowed). Labels must be less than 256 bytes long.

Examples: 42, Personal_Services

A label path is a sequence of zero or more labels separated by dots, for example L1.L2.L3, representing a path from the root of a hierarchical tree to a particular node. The length of a label path must be less than 65kB, but keeping it under 2kB is preferable.

Example: Top.Countries.Europe.Russia

The ltree module provides several data types:

ltree stores a label path.
lquery represents a regular-expression-like pattern for matching ltree values. A simple word matches that label within a path. A star symbol (*) matches zero or more labels. For example:
```
foo         Match the exact label path foo
*.foo.*     Match any label path containing the label foo
*.foo       Match any label path whose last label is foo
```
Star symbols can also be quantified to restrict how many labels they can match:
```
*{n}        Match exactly n labels
*{n,}       Match at least n labels
*{n,m}      Match at least n but not more than m labels
*{,m}       Match at most m labels — same as  *{0,m}
```
There are several modifiers that can be put at the end of a non-star label in lquery to make it match more than just the exact match:
```
@           Match case-insensitively, for example a@ matches A
*           Match any label with this prefix, for example foo* matches foobar
%           Match initial underscore-separated words
```
The behavior of % is a bit complicated. It tries to match words rather than the entire label. For example foo_bar% matches foo_bar_baz but not foo_barbaz. If combined with *, prefix matching applies to each word separately, for example foo_bar%* matches foo1_bar2_baz but not foo1_br2_baz.
Also, you can write several possibly-modified labels separated with | (OR) to match any of those labels, and you can put ! (NOT) at the start to match any label that doesn't match any of the alternatives.
Here's an annotated example of lquery:
```
Top.*{0,2}.sport*@.!football|tennis.Russ*|Spain
a.  b.     c.      d.               e.
```
This query will match any label path that:
1. begins with the label Top
2. and next has zero to two labels before
3. a label beginning with the case-insensitive prefix sport
4. then a label not matching football nor tennis
5. and then ends with a label beginning with Russ or exactly matching Spain.
ltxtquery represents a full-text-search-like pattern for matching ltree values. An ltxtquery value contains words, possibly with the modifiers @, *, % at the end; the modifiers have the same meanings as in lquery. Words can be combined with & (AND), | (OR), ! (NOT), and parentheses. The key difference from lquery is that ltxtquery matches words without regard to their position in the label path.
Here's an example ltxtquery:
```
Europe & Russia*@ & !Transportation
```
This will match paths that contain the label Europe and any label beginning with Russia (case-insensitive), but not paths containing the label Transportation. The location of these words within the path is not important. Also, when % is used, the word can be matched to any underscore-separated word within a label, regardless of position.

Note: ltxtquery allows whitespace between symbols, but ltree and lquery do not.

F.21.2. Operators and Functions

Type ltree has the usual comparison operators =, <>, <, >, <=, >=. Comparison sorts in the order of a tree traversal, with the children of a node sorted by label text. In addition, the specialized operators shown in Table F.13 are available.

Table F.13. ltree Operators

Operator	Returns	Description
`ltree` `@>` `ltree`	`boolean`	is left argument an ancestor of right (or equal)?
`ltree` `<@` `ltree`	`boolean`	is left argument a descendant of right (or equal)?
`ltree` `~` `lquery`	`boolean`	does `ltree` match `lquery`?
`lquery` `~` `ltree`	`boolean`	does `ltree` match `lquery`?
`ltree` `?` `lquery[]`	`boolean`	does `ltree` match any `lquery` in array?
`lquery[]` `?` `ltree`	`boolean`	does `ltree` match any `lquery` in array?
`ltree` `@` `ltxtquery`	`boolean`	does `ltree` match `ltxtquery`?
`ltxtquery` `@` `ltree`	`boolean`	does `ltree` match `ltxtquery`?
`ltree` `\|\|` `ltree`	`ltree`	concatenate `ltree` paths
`ltree` `\|\|` `text`	`ltree`	convert text to `ltree` and concatenate
`text` `\|\|` `ltree`	`ltree`	convert text to `ltree` and concatenate
`ltree[]` `@>` `ltree`	`boolean`	does array contain an ancestor of `ltree`?
`ltree` `<@` `ltree[]`	`boolean`	does array contain an ancestor of `ltree`?
`ltree[]` `<@` `ltree`	`boolean`	does array contain a descendant of `ltree`?
`ltree` `@>` `ltree[]`	`boolean`	does array contain a descendant of `ltree`?
`ltree[]` `~` `lquery`	`boolean`	does array contain any path matching `lquery`?
`lquery` `~` `ltree[]`	`boolean`	does array contain any path matching `lquery`?
`ltree[]` `?` `lquery[]`	`boolean`	does `ltree` array contain any path matching any `lquery`?
`lquery[]` `?` `ltree[]`	`boolean`	does `ltree` array contain any path matching any `lquery`?
`ltree[]` `@` `ltxtquery`	`boolean`	does array contain any path matching `ltxtquery`?
`ltxtquery` `@` `ltree[]`	`boolean`	does array contain any path matching `ltxtquery`?
`ltree[]` `?@>` `ltree`	`ltree`	first array entry that is an ancestor of `ltree`; NULL if none
`ltree[]` `?<@` `ltree`	`ltree`	first array entry that is a descendant of `ltree`; NULL if none
`ltree[]` `?~` `lquery`	`ltree`	first array entry that matches `lquery`; NULL if none
`ltree[]` `?@` `ltxtquery`	`ltree`	first array entry that matches `ltxtquery`; NULL if none

The operators <@, @>, @ and ~ have analogues ^<@, ^@>, ^@, ^~, which are the same except they do not use indexes. These are useful only for testing purposes.

The available functions are shown in Table F.14.

Table F.14. ltree Functions

Function	Return Type	Description	Example	Result
`subltree(ltree, int start, int end)`	`ltree`	subpath of `ltree` from position `start` to position `end`-1 (counting from 0)	`subltree('Top.Child1.Child2',1,2)`	`Child1`
`subpath(ltree, int offset, int len)`	`ltree`	subpath of `ltree` starting at position `offset`, length `len`. If `offset` is negative, subpath starts that far from the end of the path. If `len` is negative, leaves that many labels off the end of the path.	`subpath('Top.Child1.Child2',0,2)`	`Top.Child1`
`subpath(ltree, int offset)`	`ltree`	subpath of `ltree` starting at position `offset`, extending to end of path. If `offset` is negative, subpath starts that far from the end of the path.	`subpath('Top.Child1.Child2',1)`	`Child1.Child2`
`nlevel(ltree)`	`integer`	number of labels in path	`nlevel('Top.Child1.Child2')`	`3`
`index(ltree a, ltree b)`	`integer`	position of first occurrence of `b` in `a`; -1 if not found	`index('0.1.2.3.5.4.5.6.8.5.6.8','5.6')`	`6`
`index(ltree a, ltree b, int offset)`	`integer`	position of first occurrence of `b` in `a`, searching starting at `offset`; negative `offset` means start `-offset` labels from the end of the path	`index('0.1.2.3.5.4.5.6.8.5.6.8','5.6',-4)`	`9`
`text2ltree(text)`	`ltree`	cast `text` to `ltree`
`ltree2text(ltree)`	`text`	cast `ltree` to `text`
`lca(ltree, ltree, ...)`	`ltree`	longest common ancestor of paths (up to 8 arguments supported)	`lca('1.2.3','1.2.3.4.5.6')`	`1.2`
`lca(ltree[])`	`ltree`	longest common ancestor of paths in array	`lca(array['1.2.3'::ltree,'1.2.3.4'])`	`1.2`

F.21.3. Indexes

ltree supports several types of indexes that can speed up the indicated operators:

B-tree index over ltree: <, <=, =, >=, >
GiST index over ltree: <, <=, =, >=, >, @>, <@, @, ~, ?
Example of creating such an index:
```
CREATE INDEX path_gist_idx ON test USING GIST (path);
```
GiST index over ltree[]: ltree[] <@ ltree, ltree @> ltree[], @, ~, ?
Example of creating such an index:
```
CREATE INDEX path_gist_idx ON test USING GIST (array_path);
```
Note: This index type is lossy.

F.21.4. Example

This example uses the following data (also available in file contrib/ltree/ltreetest.sql in the source distribution):

CREATE TABLE test (path ltree);
INSERT INTO test VALUES ('Top');
INSERT INTO test VALUES ('Top.Science');
INSERT INTO test VALUES ('Top.Science.Astronomy');
INSERT INTO test VALUES ('Top.Science.Astronomy.Astrophysics');
INSERT INTO test VALUES ('Top.Science.Astronomy.Cosmology');
INSERT INTO test VALUES ('Top.Hobbies');
INSERT INTO test VALUES ('Top.Hobbies.Amateurs_Astronomy');
INSERT INTO test VALUES ('Top.Collections');
INSERT INTO test VALUES ('Top.Collections.Pictures');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Stars');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Galaxies');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Astronauts');
CREATE INDEX path_gist_idx ON test USING GIST (path);
CREATE INDEX path_idx ON test USING BTREE (path);

Now, we have a table test populated with data describing the hierarchy shown below:

                        Top
                     /   |  \
             Science Hobbies Collections
                 /       |              \
        Astronomy   Amateurs_Astronomy Pictures
           /  \                            |
Astrophysics  Cosmology                Astronomy
                                        /  |    \
                                 Galaxies Stars Astronauts

We can do inheritance:

ltreetest=> SELECT path FROM test WHERE path <@ 'Top.Science';
                path
------------------------------------
 Top.Science
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
(4 rows)

Here are some examples of path matching:

ltreetest=> SELECT path FROM test WHERE path ~ '*.Astronomy.*';
                     path
-----------------------------------------------
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
 Top.Collections.Pictures.Astronomy
 Top.Collections.Pictures.Astronomy.Stars
 Top.Collections.Pictures.Astronomy.Galaxies
 Top.Collections.Pictures.Astronomy.Astronauts
(7 rows)

ltreetest=> SELECT path FROM test WHERE path ~ '*.!pictures@.*.Astronomy.*';
                path
------------------------------------
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
(3 rows)

Here are some examples of full text search:

ltreetest=> SELECT path FROM test WHERE path @ 'Astro*% & !pictures@';
                path
------------------------------------
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
 Top.Hobbies.Amateurs_Astronomy
(4 rows)

ltreetest=> SELECT path FROM test WHERE path @ 'Astro* & !pictures@';
                path
------------------------------------
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology
(3 rows)

Path construction using functions:

ltreetest=> SELECT subpath(path,0,2)||'Space'||subpath(path,2) FROM test WHERE path <@ 'Top.Science.Astronomy';
                 ?column?
------------------------------------------
 Top.Science.Space.Astronomy
 Top.Science.Space.Astronomy.Astrophysics
 Top.Science.Space.Astronomy.Cosmology
(3 rows)

We could simplify this by creating a SQL function that inserts a label at a specified position in a path:

CREATE FUNCTION ins_label(ltree, int, text) RETURNS ltree
    AS 'select subpath($1,0,$2) || $3 || subpath($1,$2);'
    LANGUAGE SQL IMMUTABLE;

ltreetest=> SELECT ins_label(path,2,'Space') FROM test WHERE path <@ 'Top.Science.Astronomy';
                ins_label
------------------------------------------
 Top.Science.Space.Astronomy
 Top.Science.Space.Astronomy.Astrophysics
 Top.Science.Space.Astronomy.Cosmology
(3 rows)

F.21.5. Transforms

Additional extensions are available that implement transforms for the ltree type for PL/Python. The extensions are called ltree_plpythonu, ltree_plpython2u, and ltree_plpython3u (see Section 46.1 for the PL/Python naming convention). If you install these transforms and specify them when creating a function, ltree values are mapped to Python lists. (The reverse is currently not supported, however.)

F.21.6. Authors

All work was done by Teodor Sigaev (<teodor@stack.net>) and Oleg Bartunov (<oleg@sai.msu.su>). See http://www.sai.msu.su/~megera/postgres/gist/ for additional information. Authors would like to thank Eugeny Rodichev for helpful discussions. Comments and bug reports are welcome.

Prev	Up	Next
F.20. lo	Home	F.22. pageinspect