16.6. Shortest path planning

Shortest path finding in Cypher and how it is planned.

Planning shortest paths in Cypher can lead to different query plans depending on the predicates that need to be evaluated. Internally, Neo4j will use a fast bidirectional breadth-first search algorithm if the predicates can be evaluated whilst searching for the path. Therefore, this fast algorithm will always be certain to return the right answer when there are universal predicates on the path; for example, when searching for the shortest path where all nodes have the Person label, or where there are no nodes with a name property.

If the predicates need to inspect the whole path before deciding on whether it is valid or not, this fast algorithm cannot be relied on to find the shortest path, and Neo4j may have to resort to using a slower exhaustive depth-first search algorithm to find the path. This means that query plans for shortest path queries with non-universal predicates will include a fallback to running the exhaustive search to find the path should the fast algorithm not succeed. For example, depending on the data, an answer to a shortest path query with existential predicates — such as the requirement that at least one node contains the property name='Charlie Sheen' — may not be able to be found by the fast algorithm. In this case, Neo4j will fall back to using the exhaustive search to enumerate all paths and potentially return an answer.

The running times of these two algorithms may differ by orders of magnitude, so it is important to ensure that the fast approach is used for time-critical queries.

When the exhaustive search is planned, it is still only executed when the fast algorithm fails to find any matching paths. The fast algorithm is always executed first, since it is possible that it can find a valid path even though that could not be guaranteed at planning time.

Shortest path with fast algorithm

Query 

MATCH (ms:Person { name:'Martin Sheen' }),(cs:Person { name:'Charlie Sheen' }), p = shortestPath((ms)-[rels:ACTED_IN*]-(cs))
WHERE ALL (r IN rels WHERE exists(r.role))
RETURN p

This query can be evaluated with the fast algorithm — there are no predicates that need to see the whole path before being evaluated.

Query plan 

+--------------------+----------------+------+---------+-------------------+--------------------------------------+
| Operator           | Estimated Rows | Rows | DB Hits | Variables         | Other                                |
+--------------------+----------------+------+---------+-------------------+--------------------------------------+
| +ProduceResults    |              1 |    1 |       0 | p                 | p                                    |
| |                  +----------------+------+---------+-------------------+--------------------------------------+
| +ShortestPath      |              1 |    1 |       9 | p, rels -- cs, ms | all(r in rels where hasProp(r.role)) |
| |                  +----------------+------+---------+-------------------+--------------------------------------+
| +CartesianProduct  |              1 |    1 |       0 | ms -- cs          |                                      |
| |\                 +----------------+------+---------+-------------------+--------------------------------------+
| | +Filter          |              1 |    1 |       5 | cs                | cs.name == {  AUTOSTRING1}           |
| | |                +----------------+------+---------+-------------------+--------------------------------------+
| | +NodeByLabelScan |              5 |    5 |       6 | cs                | :Person                              |
| |                  +----------------+------+---------+-------------------+--------------------------------------+
| +Filter            |              1 |    1 |       5 | ms                | ms.name == {  AUTOSTRING0}           |
| |                  +----------------+------+---------+-------------------+--------------------------------------+
| +NodeByLabelScan   |              5 |    5 |       6 | ms                | :Person                              |
+--------------------+----------------+------+---------+-------------------+--------------------------------------+

Total database accesses: 31

Try this query live  CREATE (charlie:Person {name:'Charlie Sheen'}), (martin:Person {name: 'Martin Sheen'}), (michael:Person {name: 'Michael Douglas'}), (oliver:Person {name: 'Oliver Stone'}), (rob:Person {name: 'Rob Reiner'}), (wallStreet:Movie {title: 'Wall Street'}), (charlie)-[:ACTED_IN {role: 'Bud Fox'}]->(wallStreet), (martin)-[:ACTED_IN {role: 'Carl Fox'}]->(wallStreet), (michael)-[:ACTED_IN {role: 'Gordon Gekko'}]->(wallStreet), (oliver)-[:DIRECTED]->(wallStreet), (thePresident:Movie {title: 'The American President'}), (martin)-[:ACTED_IN {role: 'A.J. MacInerney'}]->(thePresident), (michael)-[:ACTED_IN {role: 'President Andrew Shepherd'}]->(thePresident), (rob)-[:DIRECTED]->(thePresident) CREATE INDEX ON :Person(name) MATCH (ms:Person {name:'Martin Sheen'} ), (cs:Person {name:'Charlie Sheen'}), p = shortestPath( (ms)-[rels:ACTED_IN*]-(cs) ) WHERE ALL(r in rels WHERE exists(r.role)) RETURN p

Shortest path with additional predicate checks on the paths

Consider using the exhaustive search as a fallback

Predicates used in the WHERE clause that apply to the shortest path pattern are evaluated before deciding what the shortest matching path is.

Query 

MATCH (cs:Person { name:'Charlie Sheen' }),(ms:Person { name:'Martin Sheen' }), p = shortestPath((cs)-[*]-(ms))
WHERE length(p)> 1
RETURN p

This query, in contrast with the one above, needs to check that the whole path follows the predicate before we know if it is valid or not, and so the query plan will also include the fallback to the slower exhaustive search algorithm

Query plan 

+--------------------------+----------------+------+---------+----------------------------------+-----------------------------------------------+
| Operator                 | Estimated Rows | Rows | DB Hits | Variables                        | Other                                         |
+--------------------------+----------------+------+---------+----------------------------------+-----------------------------------------------+
| +ProduceResults          |              0 |    1 |       0 | p                                | p                                             |
| |                        +----------------+------+---------+----------------------------------+-----------------------------------------------+
| +AntiConditionalApply    |              0 |    1 |       0 | cs -- anon[93], anon[112], ms, p |                                               |
| |\                       +----------------+------+---------+----------------------------------+-----------------------------------------------+
| | +Top1                  |              0 |    0 |       0 | anon[93], anon[112], ms, p       | anon[93]                                      |
| | |                      +----------------+------+---------+----------------------------------+-----------------------------------------------+
| | +Projection            |              0 |    0 |       0 | anon[93] -- anon[112], ms, p     | length(p)                                     |
| | |                      +----------------+------+---------+----------------------------------+-----------------------------------------------+
| | +Filter                |              0 |    0 |       0 | anon[112], ms, p                 | length(p) > {  AUTOINT2}                      |
| | |                      +----------------+------+---------+----------------------------------+-----------------------------------------------+
| | +Projection            |              0 |    0 |       0 | p -- anon[112], ms               | ProjectedPath(Set(anon[112], cs),<function2>) |
| | |                      +----------------+------+---------+----------------------------------+-----------------------------------------------+
| | +VarLengthExpand(Into) |              0 |    0 |       0 | anon[112], ms                    | (cs)-[:*]->(ms)                               |
| | |                      +----------------+------+---------+----------------------------------+-----------------------------------------------+
| | +Argument              |              1 |    0 |       0 |                                  |                                               |
| |                        +----------------+------+---------+----------------------------------+-----------------------------------------------+
| +Apply                   |              1 |    1 |       0 | cs, ms -- anon[112], p           |                                               |
| |\                       +----------------+------+---------+----------------------------------+-----------------------------------------------+
| | +Optional              |              1 |    1 |       0 | anon[112], p                     |                                               |
| | |                      +----------------+------+---------+----------------------------------+-----------------------------------------------+
| | +ShortestPath          |              0 |    1 |       1 | anon[112], p                     | length(p) > {  AUTOINT2}                      |
| | |                      +----------------+------+---------+----------------------------------+-----------------------------------------------+
| | +Argument              |              1 |    1 |       0 |                                  |                                               |
| |                        +----------------+------+---------+----------------------------------+-----------------------------------------------+
| +CartesianProduct        |              1 |    1 |       0 | cs -- ms                         |                                               |
| |\                       +----------------+------+---------+----------------------------------+-----------------------------------------------+
| | +Filter                |              1 |    1 |       5 | ms                               | ms.name == {  AUTOSTRING1}                    |
| | |                      +----------------+------+---------+----------------------------------+-----------------------------------------------+
| | +NodeByLabelScan       |              5 |    5 |       6 | ms                               | :Person                                       |
| |                        +----------------+------+---------+----------------------------------+-----------------------------------------------+
| +Filter                  |              1 |    1 |       5 | cs                               | cs.name == {  AUTOSTRING0}                    |
| |                        +----------------+------+---------+----------------------------------+-----------------------------------------------+
| +NodeByLabelScan         |              5 |    5 |       6 | cs                               | :Person                                       |
+--------------------------+----------------+------+---------+----------------------------------+-----------------------------------------------+

Total database accesses: 23

Try this query live  CREATE (charlie:Person {name:'Charlie Sheen'}), (martin:Person {name: 'Martin Sheen'}), (michael:Person {name: 'Michael Douglas'}), (oliver:Person {name: 'Oliver Stone'}), (rob:Person {name: 'Rob Reiner'}), (wallStreet:Movie {title: 'Wall Street'}), (charlie)-[:ACTED_IN {role: 'Bud Fox'}]->(wallStreet), (martin)-[:ACTED_IN {role: 'Carl Fox'}]->(wallStreet), (michael)-[:ACTED_IN {role: 'Gordon Gekko'}]->(wallStreet), (oliver)-[:DIRECTED]->(wallStreet), (thePresident:Movie {title: 'The American President'}), (martin)-[:ACTED_IN {role: 'A.J. MacInerney'}]->(thePresident), (michael)-[:ACTED_IN {role: 'President Andrew Shepherd'}]->(thePresident), (rob)-[:DIRECTED]->(thePresident) CREATE INDEX ON :Person(name) MATCH (cs:Person {name:'Charlie Sheen'}), (ms:Person {name:'Martin Sheen'}), p = shortestPath( (cs)-[*]-(ms) ) WHERE length(p) > 1 RETURN p

The way the bigger exhaustive query plan works is by using Apply/Optional to ensure that when the fast algorithm does not find any results, a NULL result is generated instead of simply stopping the result stream. On top of this, the planner will issue an AntiConditionalApply, which will run the exhaustive search if the path variable is pointing to NULL instead of a path.

Prevent the exhaustive search from being used as a fallback

Query 

MATCH (cs:Person { name:'Charlie Sheen' }),(ms:Person { name:'Martin Sheen' }), p = shortestPath((cs)-[*]-(ms))
WITH p
WHERE length(p)> 1
RETURN p

This query, just like the one above, needs to check that the whole path follows the predicate before we know if it is valid or not. However, the inclusion of the WITH clause means that the query plan will not include the fallback to the slower exhaustive search algorithm. Instead, any paths found by the fast algorithm will subsequently be filtered, which may result in no answers being returned.

Query plan 

+--------------------+----------------+------+---------+-----------------------------------+-----------------------------+
| Operator           | Estimated Rows | Rows | DB Hits | Variables                         | Other                       |
+--------------------+----------------+------+---------+-----------------------------------+-----------------------------+
| +ProduceResults    |              1 |    1 |       0 | p                                 | p                           |
| |                  +----------------+------+---------+-----------------------------------+-----------------------------+
| +Filter            |              1 |    1 |       0 | anon[146], anon[112], cs, ms, p   | anon[146]                   |
| |                  +----------------+------+---------+-----------------------------------+-----------------------------+
| +Projection        |              1 |    1 |       0 | anon[146] -- anon[112], cs, ms, p | p; length(p) > {  AUTOINT2} |
| |                  +----------------+------+---------+-----------------------------------+-----------------------------+
| +ShortestPath      |              1 |    1 |       1 | anon[112], p -- cs, ms            |                             |
| |                  +----------------+------+---------+-----------------------------------+-----------------------------+
| +CartesianProduct  |              1 |    1 |       0 | cs -- ms                          |                             |
| |\                 +----------------+------+---------+-----------------------------------+-----------------------------+
| | +Filter          |              1 |    1 |       5 | ms                                | ms.name == {  AUTOSTRING1}  |
| | |                +----------------+------+---------+-----------------------------------+-----------------------------+
| | +NodeByLabelScan |              5 |    5 |       6 | ms                                | :Person                     |
| |                  +----------------+------+---------+-----------------------------------+-----------------------------+
| +Filter            |              1 |    1 |       5 | cs                                | cs.name == {  AUTOSTRING0}  |
| |                  +----------------+------+---------+-----------------------------------+-----------------------------+
| +NodeByLabelScan   |              5 |    5 |       6 | cs                                | :Person                     |
+--------------------+----------------+------+---------+-----------------------------------+-----------------------------+

Total database accesses: 23

Try this query live  CREATE (charlie:Person {name:'Charlie Sheen'}), (martin:Person {name: 'Martin Sheen'}), (michael:Person {name: 'Michael Douglas'}), (oliver:Person {name: 'Oliver Stone'}), (rob:Person {name: 'Rob Reiner'}), (wallStreet:Movie {title: 'Wall Street'}), (charlie)-[:ACTED_IN {role: 'Bud Fox'}]->(wallStreet), (martin)-[:ACTED_IN {role: 'Carl Fox'}]->(wallStreet), (michael)-[:ACTED_IN {role: 'Gordon Gekko'}]->(wallStreet), (oliver)-[:DIRECTED]->(wallStreet), (thePresident:Movie {title: 'The American President'}), (martin)-[:ACTED_IN {role: 'A.J. MacInerney'}]->(thePresident), (michael)-[:ACTED_IN {role: 'President Andrew Shepherd'}]->(thePresident), (rob)-[:DIRECTED]->(thePresident) CREATE INDEX ON :Person(name) MATCH (cs:Person {name:'Charlie Sheen'}), (ms:Person {name:'Martin Sheen'}), p = shortestPath( (cs)-[*]-(ms) ) WITH p WHERE length(p) > 1 RETURN p