Shortest path finding in Cypher and how it is planned.
Planning shortest paths in Cypher can lead to different query plans depending on the predicates that need
to be evaluated. Internally, Neo4j will use a fast bidirectional breadth-first search algorithm if the
predicates can be evaluated whilst searching for the path. Therefore, this fast algorithm will always
be certain to return the right answer when there are universal predicates on the path; for example, when
searching for the shortest path where all nodes have the Person
label, or where there are no nodes with
a name
property.
If the predicates need to inspect the whole path before deciding on whether it is valid or not, this fast
algorithm cannot be relied on to find the shortest path, and Neo4j may have to resort to using a slower
exhaustive depth-first search algorithm to find the path. This means that query plans for shortest path
queries with non-universal predicates will include a fallback to running the exhaustive search to find
the path should the fast algorithm not succeed. For example, depending on the data, an answer to a shortest
path query with existential predicates — such as the requirement that at least one node contains the property
name='Charlie Sheen'
— may not be able to be found by the fast algorithm. In this case, Neo4j will fall back to using
the exhaustive search to enumerate all paths and potentially return an answer.
The running times of these two algorithms may differ by orders of magnitude, so it is important to ensure that the fast approach is used for time-critical queries.
When the exhaustive search is planned, it is still only executed when the fast algorithm fails to find any matching paths. The fast algorithm is always executed first, since it is possible that it can find a valid path even though that could not be guaranteed at planning time.
Shortest path with fast algorithm
Query
MATCH (ms:Person { name:'Martin Sheen' }),(cs:Person { name:'Charlie Sheen' }), p = shortestPath((ms)-[rels:ACTED_IN*]-(cs)) WHERE ALL (r IN rels WHERE exists(r.role)) RETURN p
This query can be evaluated with the fast algorithm — there are no predicates that need to see the whole path before being evaluated.
Query plan
+--------------------+----------------+------+---------+-------------------+--------------------------------------+ | Operator | Estimated Rows | Rows | DB Hits | Variables | Other | +--------------------+----------------+------+---------+-------------------+--------------------------------------+ | +ProduceResults | 1 | 1 | 0 | p | p | | | +----------------+------+---------+-------------------+--------------------------------------+ | +ShortestPath | 1 | 1 | 9 | p, rels -- cs, ms | all(r in rels where hasProp(r.role)) | | | +----------------+------+---------+-------------------+--------------------------------------+ | +CartesianProduct | 1 | 1 | 0 | ms -- cs | | | |\ +----------------+------+---------+-------------------+--------------------------------------+ | | +Filter | 1 | 1 | 5 | cs | cs.name == { AUTOSTRING1} | | | | +----------------+------+---------+-------------------+--------------------------------------+ | | +NodeByLabelScan | 5 | 5 | 6 | cs | :Person | | | +----------------+------+---------+-------------------+--------------------------------------+ | +Filter | 1 | 1 | 5 | ms | ms.name == { AUTOSTRING0} | | | +----------------+------+---------+-------------------+--------------------------------------+ | +NodeByLabelScan | 5 | 5 | 6 | ms | :Person | +--------------------+----------------+------+---------+-------------------+--------------------------------------+ Total database accesses: 31
Try this query live CREATE (charlie:Person {name:'Charlie Sheen'}), (martin:Person {name: 'Martin Sheen'}), (michael:Person {name: 'Michael Douglas'}), (oliver:Person {name: 'Oliver Stone'}), (rob:Person {name: 'Rob Reiner'}), (wallStreet:Movie {title: 'Wall Street'}), (charlie)-[:ACTED_IN {role: 'Bud Fox'}]->(wallStreet), (martin)-[:ACTED_IN {role: 'Carl Fox'}]->(wallStreet), (michael)-[:ACTED_IN {role: 'Gordon Gekko'}]->(wallStreet), (oliver)-[:DIRECTED]->(wallStreet), (thePresident:Movie {title: 'The American President'}), (martin)-[:ACTED_IN {role: 'A.J. MacInerney'}]->(thePresident), (michael)-[:ACTED_IN {role: 'President Andrew Shepherd'}]->(thePresident), (rob)-[:DIRECTED]->(thePresident) CREATE INDEX ON :Person(name) MATCH (ms:Person {name:'Martin Sheen'} ), (cs:Person {name:'Charlie Sheen'}), p = shortestPath( (ms)-[rels:ACTED_IN*]-(cs) ) WHERE ALL(r in rels WHERE exists(r.role)) RETURN p
Shortest path with additional predicate checks on the paths
Consider using the exhaustive search as a fallback
Predicates used in the WHERE
clause that apply to the shortest path pattern are evaluated before deciding
what the shortest matching path is.
Query
MATCH (cs:Person { name:'Charlie Sheen' }),(ms:Person { name:'Martin Sheen' }), p = shortestPath((cs)-[*]-(ms)) WHERE length(p)> 1 RETURN p
This query, in contrast with the one above, needs to check that the whole path follows the predicate before we know if it is valid or not, and so the query plan will also include the fallback to the slower exhaustive search algorithm
Query plan
+--------------------------+----------------+------+---------+----------------------------------+-----------------------------------------------+ | Operator | Estimated Rows | Rows | DB Hits | Variables | Other | +--------------------------+----------------+------+---------+----------------------------------+-----------------------------------------------+ | +ProduceResults | 0 | 1 | 0 | p | p | | | +----------------+------+---------+----------------------------------+-----------------------------------------------+ | +AntiConditionalApply | 0 | 1 | 0 | cs -- anon[93], anon[112], ms, p | | | |\ +----------------+------+---------+----------------------------------+-----------------------------------------------+ | | +Top1 | 0 | 0 | 0 | anon[93], anon[112], ms, p | anon[93] | | | | +----------------+------+---------+----------------------------------+-----------------------------------------------+ | | +Projection | 0 | 0 | 0 | anon[93] -- anon[112], ms, p | length(p) | | | | +----------------+------+---------+----------------------------------+-----------------------------------------------+ | | +Filter | 0 | 0 | 0 | anon[112], ms, p | length(p) > { AUTOINT2} | | | | +----------------+------+---------+----------------------------------+-----------------------------------------------+ | | +Projection | 0 | 0 | 0 | p -- anon[112], ms | ProjectedPath(Set(anon[112], cs),<function2>) | | | | +----------------+------+---------+----------------------------------+-----------------------------------------------+ | | +VarLengthExpand(Into) | 0 | 0 | 0 | anon[112], ms | (cs)-[:*]->(ms) | | | | +----------------+------+---------+----------------------------------+-----------------------------------------------+ | | +Argument | 1 | 0 | 0 | | | | | +----------------+------+---------+----------------------------------+-----------------------------------------------+ | +Apply | 1 | 1 | 0 | cs, ms -- anon[112], p | | | |\ +----------------+------+---------+----------------------------------+-----------------------------------------------+ | | +Optional | 1 | 1 | 0 | anon[112], p | | | | | +----------------+------+---------+----------------------------------+-----------------------------------------------+ | | +ShortestPath | 0 | 1 | 1 | anon[112], p | length(p) > { AUTOINT2} | | | | +----------------+------+---------+----------------------------------+-----------------------------------------------+ | | +Argument | 1 | 1 | 0 | | | | | +----------------+------+---------+----------------------------------+-----------------------------------------------+ | +CartesianProduct | 1 | 1 | 0 | cs -- ms | | | |\ +----------------+------+---------+----------------------------------+-----------------------------------------------+ | | +Filter | 1 | 1 | 5 | ms | ms.name == { AUTOSTRING1} | | | | +----------------+------+---------+----------------------------------+-----------------------------------------------+ | | +NodeByLabelScan | 5 | 5 | 6 | ms | :Person | | | +----------------+------+---------+----------------------------------+-----------------------------------------------+ | +Filter | 1 | 1 | 5 | cs | cs.name == { AUTOSTRING0} | | | +----------------+------+---------+----------------------------------+-----------------------------------------------+ | +NodeByLabelScan | 5 | 5 | 6 | cs | :Person | +--------------------------+----------------+------+---------+----------------------------------+-----------------------------------------------+ Total database accesses: 23
Try this query live CREATE (charlie:Person {name:'Charlie Sheen'}), (martin:Person {name: 'Martin Sheen'}), (michael:Person {name: 'Michael Douglas'}), (oliver:Person {name: 'Oliver Stone'}), (rob:Person {name: 'Rob Reiner'}), (wallStreet:Movie {title: 'Wall Street'}), (charlie)-[:ACTED_IN {role: 'Bud Fox'}]->(wallStreet), (martin)-[:ACTED_IN {role: 'Carl Fox'}]->(wallStreet), (michael)-[:ACTED_IN {role: 'Gordon Gekko'}]->(wallStreet), (oliver)-[:DIRECTED]->(wallStreet), (thePresident:Movie {title: 'The American President'}), (martin)-[:ACTED_IN {role: 'A.J. MacInerney'}]->(thePresident), (michael)-[:ACTED_IN {role: 'President Andrew Shepherd'}]->(thePresident), (rob)-[:DIRECTED]->(thePresident) CREATE INDEX ON :Person(name) MATCH (cs:Person {name:'Charlie Sheen'}), (ms:Person {name:'Martin Sheen'}), p = shortestPath( (cs)-[*]-(ms) ) WHERE length(p) > 1 RETURN p
The way the bigger exhaustive query plan works is by using Apply
/Optional
to ensure that when the
fast algorithm does not find any results, a NULL
result is generated instead of simply stopping the result
stream.
On top of this, the planner will issue an AntiConditionalApply
, which will run the exhaustive search
if the path variable is pointing to NULL
instead of a path.
Prevent the exhaustive search from being used as a fallback
Query
MATCH (cs:Person { name:'Charlie Sheen' }),(ms:Person { name:'Martin Sheen' }), p = shortestPath((cs)-[*]-(ms)) WITH p WHERE length(p)> 1 RETURN p
This query, just like the one above, needs to check that the whole path follows the predicate
before we know if it is valid or not. However, the inclusion of the WITH
clause means that the query
plan will not include the fallback to the slower exhaustive search algorithm. Instead, any
paths found by the fast algorithm will subsequently be filtered, which may result in no answers
being returned.
Query plan
+--------------------+----------------+------+---------+-----------------------------------+-----------------------------+ | Operator | Estimated Rows | Rows | DB Hits | Variables | Other | +--------------------+----------------+------+---------+-----------------------------------+-----------------------------+ | +ProduceResults | 1 | 1 | 0 | p | p | | | +----------------+------+---------+-----------------------------------+-----------------------------+ | +Filter | 1 | 1 | 0 | anon[146], anon[112], cs, ms, p | anon[146] | | | +----------------+------+---------+-----------------------------------+-----------------------------+ | +Projection | 1 | 1 | 0 | anon[146] -- anon[112], cs, ms, p | p; length(p) > { AUTOINT2} | | | +----------------+------+---------+-----------------------------------+-----------------------------+ | +ShortestPath | 1 | 1 | 1 | anon[112], p -- cs, ms | | | | +----------------+------+---------+-----------------------------------+-----------------------------+ | +CartesianProduct | 1 | 1 | 0 | cs -- ms | | | |\ +----------------+------+---------+-----------------------------------+-----------------------------+ | | +Filter | 1 | 1 | 5 | ms | ms.name == { AUTOSTRING1} | | | | +----------------+------+---------+-----------------------------------+-----------------------------+ | | +NodeByLabelScan | 5 | 5 | 6 | ms | :Person | | | +----------------+------+---------+-----------------------------------+-----------------------------+ | +Filter | 1 | 1 | 5 | cs | cs.name == { AUTOSTRING0} | | | +----------------+------+---------+-----------------------------------+-----------------------------+ | +NodeByLabelScan | 5 | 5 | 6 | cs | :Person | +--------------------+----------------+------+---------+-----------------------------------+-----------------------------+ Total database accesses: 23
Try this query live CREATE (charlie:Person {name:'Charlie Sheen'}), (martin:Person {name: 'Martin Sheen'}), (michael:Person {name: 'Michael Douglas'}), (oliver:Person {name: 'Oliver Stone'}), (rob:Person {name: 'Rob Reiner'}), (wallStreet:Movie {title: 'Wall Street'}), (charlie)-[:ACTED_IN {role: 'Bud Fox'}]->(wallStreet), (martin)-[:ACTED_IN {role: 'Carl Fox'}]->(wallStreet), (michael)-[:ACTED_IN {role: 'Gordon Gekko'}]->(wallStreet), (oliver)-[:DIRECTED]->(wallStreet), (thePresident:Movie {title: 'The American President'}), (martin)-[:ACTED_IN {role: 'A.J. MacInerney'}]->(thePresident), (michael)-[:ACTED_IN {role: 'President Andrew Shepherd'}]->(thePresident), (rob)-[:DIRECTED]->(thePresident) CREATE INDEX ON :Person(name) MATCH (cs:Person {name:'Charlie Sheen'}), (ms:Person {name:'Martin Sheen'}), p = shortestPath( (cs)-[*]-(ms) ) WITH p WHERE length(p) > 1 RETURN p