3.8. Utilizing Data Structures

Cypher can create and consume more complex data structures out of the box. As already mentioned you can create literal lists ([1,2,3]) and maps ({name: value}) within a statement.

There are a number of functions that work with lists. They range from simple ones like size(list) that returns the size of a list to reduce, which runs an expression against the elements and accumulates the results.

Let’s first load a bit of data into the graph. If you want more details on how the data is loaded, see the section called “Importing CSV”.

LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/intro/movies.csv" AS line
CREATE (m:Movie { id:line.id,title:line.title, released:toInt(line.year)});
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/intro/persons.csv" AS line
MERGE (a:Person { id:line.id })
ON CREATE SET a.name=line.name;
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/intro/roles.csv" AS line
MATCH (m:Movie { id:line.movieId })
MATCH (a:Person { id:line.personId })
CREATE (a)-[:ACTED_IN { roles: [line.role]}]->(m);
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/intro/movie_actor_roles.csv" AS
  line FIELDTERMINATOR ";"
MERGE (m:Movie { title:line.title })
ON CREATE SET m.released = toInt(line.released)
MERGE (a:Person { name:line.actor })
ON CREATE SET a.born = toInt(line.born)
MERGE (a)-[:ACTED_IN { roles:split(line.characters,",")}]->(m)

Now, let’s try out data structures.

To begin with, collect the names of the actors per movie, and return two of them:

MATCH (movie:Movie)<-[:ACTED_IN]-(actor:Person)
RETURN movie.title AS movie, collect(actor.name)[0..2] AS two_of_cast
movietwo_of_cast
4 rows

"The American President"

["Michael Douglas","Martin Sheen"]

"Back to the Future"

["Christopher Lloyd","Michael J. Fox"]

"Wall Street"

["Michael Douglas","Martin Sheen"]

"The Shawshank Redemption"

["Morgan Freeman"]

You can also access individual elements or slices of a list quickly with list[1] or list[5..-5]. Other functions to access parts of a list are head(list), tail(list) and last(list).

List Predicates

When using lists and arrays in comparisons you can use predicates like value IN list or any(x IN list WHERE x = value). There are list predicates to satisfy conditions for all, any, none and single elements.

MATCH path =(:Person)-->(:Movie)<--(:Person)
WHERE ANY (n IN nodes(path) WHERE n.name = 'Michael Douglas')
RETURN extract(n IN nodes(path)| coalesce(n.name, n.title))
extract(n IN nodes(pathcoalesce(n.name, n.title))
6 rows

["Martin Sheen","Wall S

et","Michael Douglas"]

["Charlie Sheen","Wall

eet","Michael Douglas"]

["Michael Douglas","Wal

treet","Martin Sheen"]

["Michael Douglas","Wal

treet","Charlie Sheen"]

["Martin Sheen","The Am

can President","Michael Douglas"]

["Michael Douglas","The

erican President","Martin Sheen"]

List Processing

Oftentimes you want to process lists to filter, aggregate (reduce) or transform (extract) their values. Those transformations can be done within Cypher or in the calling code. This kind of list-processing can reduce the amount of data handled and returned, so it might make sense to do it within the Cypher statement.

A simple, non-graph example would be:

WITH range(1,10) AS numbers
WITH extract(n IN numbers | n*n) AS squares
WITH filter(n IN squares WHERE n > 25) AS large_squares
RETURN reduce(a = 0, n IN large_squares | a + n) AS sum_large_squares
sum_large_squares
1 row

330

In a graph-query you can filter or aggregate collected values instead or work on array properties.

MATCH (m:Movie)<-[r:ACTED_IN]-(a:Person)
WITH m.title AS movie, collect({ name: a.name, roles: r.roles }) AS cast
RETURN movie, filter(actor IN cast WHERE actor.name STARTS WITH "M")
moviefilter(actor IN cast WHERE actor.name STARTS WITH "M")
4 rows

"The American President"

[{name -> "Michael Douglas", roles -> ["President Andrew Shepherd"]},{name -> "Martin Sheen", roles -> ["A.J. MacInerney"]}]

"Back to the Future"

[{name -> "Michael J. Fox", roles -> ["Marty McFly"]}]

"Wall Street"

[{name -> "Michael Douglas", roles -> ["Gordon Gekko"]},{name -> "Martin Sheen", roles -> ["Carl Fox"]}]

"The Shawshank Redemption"

[{name -> "Morgan Freeman", roles -> ["Ellis Boyd 'Red' Redding"]}]

Unwind Lists

Sometimes you have collected information into a list, but want to use each element individually as a row. For instance, you might want to further match patterns in the graph. Or you passed in a list of values but now want to create or match a node or relationship for each element. Then you can use the UNWIND clause to unroll a list into a sequence of rows again.

For instance, a query to find the top 3 co-actors and then follow their movies and again list the cast for each of those movies:

MATCH (actor:Person)-[:ACTED_IN]->(movie:Movie)<-[:ACTED_IN]-(colleague:Person)
WHERE actor.name < colleague.name
WITH actor, colleague, count(*) AS frequency, collect(movie) AS movies
ORDER BY frequency DESC LIMIT 3 UNWIND movies AS m
MATCH (m)<-[:ACTED_IN]-(a)
RETURN m.title AS movie, collect(a.name) AS cast
moviecast
3 rows

"The American President"

["Michael Douglas","Martin Sheen"]

"Back to the Future"

["Christopher Lloyd","Michael J. Fox"]

"Wall Street"

["Michael Douglas","Martin Sheen","Charlie Sheen","Michael Douglas","Martin Sheen","Charlie Sheen"]