29.1. CSV file header format

The header row of each data source specifies how the fields should be interpreted. The same delimiter is used for the header row as for the rest of the data.

The header describes each field in the format <name>:<field_type>. For property values, <name> is used as the property key; for all other field types it is ignored. The following <field_type> settings can be used for both nodes and relationships:

Property value
Use one of int, long, float, double, boolean, byte, short, char, string to designate the data type. If no data type is given, this defaults to string. To define an array type, append [] to the type. By default, array values are separated by ;. A different delimiter can be specified with --array-delimiter.
IGNORE
Ignore this field completely.

See below for the specifics of node and relationship data source headers.
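
As an illustration of the <name>:<field_type> format, the following minimal Python sketch splits a single header field into its parts. The function names parse_header_field and split_array_value are hypothetical and are not part of the import tool; the sketch only mirrors the rules described above (default type string, [] suffix for arrays, ; as the default array delimiter).

```python
# Hypothetical helpers, not part of the import tool itself.

def parse_header_field(field):
    """Split one header field of the form <name>:<field_type>."""
    name, _, field_type = field.partition(":")
    if not field_type:
        # No data type given: the type defaults to string.
        field_type = "string"
    is_array = field_type.endswith("[]")
    base_type = field_type[:-2] if is_array else field_type
    return {"name": name, "type": base_type, "array": is_array}


def split_array_value(value, delimiter=";"):
    """Split an array value; ';' is the default array delimiter."""
    return value.split(delimiter) if value else []


print(parse_header_field("age:int"))
print(parse_header_field("roles:string[]"))
print(split_array_value("Neo;Thomas Anderson"))
```

A field such as name (without a type) is treated as a string property, and :IGNORE yields an empty name with the type IGNORE.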

Nodes

The following field types additionally apply to node data sources:

ID
Each node must have a unique id which is used during the import. The ids are used to find the correct nodes when creating relationships. Note that the id has to be unique across all nodes in the import, even nodes with different labels.
LABEL
Read one or more labels from this field. Like array values, multiple labels are separated by ;, or by the character specified with --array-delimiter.
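
To make the node field types concrete, here is a small Python sketch that reads a made-up node data source and applies the ID and LABEL rules described above. The sample data, the read_nodes function, and the returned dictionary layout are assumptions for illustration only; the import tool's actual implementation is not shown here. Only the int type is converted, to keep the sketch short.

```python
import csv
import io

# Made-up sample data: an id, two properties and a label field.
NODES_CSV = """personId:ID,name,age:int,:LABEL
p1,Alice,34,Person
p2,Bob,41,Person;Employee
"""


def read_nodes(text, array_delimiter=";"):
    """Read node rows into a dict keyed by node id (illustrative only)."""
    rows = csv.reader(io.StringIO(text))
    header = [field.partition(":") for field in next(rows)]
    nodes = {}
    for row in rows:
        node_id, labels, properties = None, [], {}
        for (name, _, field_type), value in zip(header, row):
            if field_type == "ID":
                node_id = value
            elif field_type == "LABEL":
                # Multiple labels are separated like array values.
                labels = value.split(array_delimiter)
            elif field_type == "IGNORE":
                continue
            elif field_type == "int":
                properties[name] = int(value)
            else:
                # Untyped fields default to string.
                properties[name] = value
        if node_id in nodes:
            # Ids must be unique across all nodes in the import.
            raise ValueError(f"duplicate node id: {node_id}")
        nodes[node_id] = {"labels": labels, "properties": properties}
    return nodes


nodes = read_nodes(NODES_CSV)
print(nodes["p2"]["labels"])
```

In this sketch the second row ends up with two labels, Person and Employee, because the LABEL field is split on the array delimiter.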

Relationships

For relationship data sources, there are three mandatory fields:

TYPE
The relationship type to use for the relationship.
START_ID
The id of the start node of the relationship to create.
END_ID
The id of the end node of the relationship to create.
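
A relationship header can be checked for the three mandatory fields before an import is started. The following Python sketch shows one way to do that; missing_relationship_fields is a hypothetical helper, and the naive split on the delimiter assumes that the header contains no quoted fields.

```python
MANDATORY_FIELDS = ("START_ID", "END_ID", "TYPE")


def missing_relationship_fields(header_line, delimiter=","):
    """Return the mandatory field types that the header does not declare."""
    found = set()
    for field in header_line.split(delimiter):
        field_type = field.partition(":")[2]
        # Drop an optional id space suffix such as "(Person)".
        found.add(field_type.split("(", 1)[0])
    return [f for f in MANDATORY_FIELDS if f not in found]


print(missing_relationship_fields(":START_ID,role,:END_ID,:TYPE"))
print(missing_relationship_fields(":START_ID,role,:END_ID"))
```

The first call reports no missing fields; the second reports that TYPE is missing.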

ID spaces

The import tool assumes that node identifiers are unique across node files. If this isn’t the case then we can define an id space. Id spaces are defined in the ID field of node files.

For example, to specify the Person id space we would use the field type ID(Person) in our persons node file. We also need to reference that id space in our relationships file, using START_ID(Person) or END_ID(Person).
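
One way to picture id spaces is as a second component of the lookup key: a node is identified by the pair (id space, id) rather than by the id alone. The Python sketch below extracts the id space from a header field and uses such pairs as keys. The names id_space_of, DEFAULT_SPACE and register, as well as the Movie id space, are assumptions made for this illustration and do not describe the import tool's internals.

```python
import re

# Assumed name for the space used when a header declares none.
DEFAULT_SPACE = "global"

ID_FIELD = re.compile(
    r"^(?P<name>[^:]*):(?P<kind>ID|START_ID|END_ID)(?:\((?P<space>[^)]*)\))?$"
)


def id_space_of(field):
    """Return the id space of an ID, START_ID or END_ID field, else None."""
    match = ID_FIELD.match(field)
    if match is None:
        return None
    return match.group("space") or DEFAULT_SPACE


def register(index, space, node_id, node):
    """Store a node under (space, id); ids only need to be unique per space."""
    key = (space, node_id)
    if key in index:
        raise ValueError(f"duplicate id {node_id!r} in id space {space!r}")
    index[key] = node


index = {}
register(index, id_space_of("personId:ID(Person)"), "1", {"name": "Alice"})
register(index, id_space_of("movieId:ID(Movie)"), "1", {"title": "Example"})
print(sorted(index))
```

Both nodes use the id 1 without conflict, because they are registered in different id spaces.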