LOAD CSV
is used to import data from CSV files.
-
The URL of the CSV file is specified by using
FROM
followed by an arbitrary expression evaluating to the URL in question. -
It is required to specify a variable for the CSV data using
AS
. -
LOAD CSV
supports resources compressed with gzip, Deflate, as well as ZIP archives. -
CSV files can be stored on the database server and are then accessible using a
file:///
URL. Alternatively,LOAD CSV
also supports accessing CSV files via HTTPS, HTTP, and FTP. -
LOAD CSV
will follow HTTP redirects but for security reasons it will not follow redirects that changes the protocol, for example if the redirect is going from HTTPS to HTTP.
Configuration settings for file URLs
|
See the examples below for further details.
There is also a worked example, see Section 12.8, “Importing CSV files with Cypher”.
CSV file format
The CSV file to use with LOAD CSV
must have the following characteristics:
- the character encoding is UTF-8;
-
the end line termination is system dependent, e.g., it is
\n
on unix or\r\n
on windows; -
the default field terminator is
,
; -
the field terminator character can be change by using the option
FIELDTERMINATOR
available in theLOAD CSV
command; - quoted strings are allowed in the CSV file and the quotes are dropped when reading the data;
-
the character for string quotation is double quote
"
; -
the escape character is
\
.
Import data from a CSV file
To import data from a CSV file into Neo4j, you can use LOAD CSV
to get the data into your query.
Then you write it to your database using the normal updating clauses of Cypher.
artists.csv
"1","ABBA","1992" "2","Roxette","1986" "3","Europe","1979" "4","The Cardigans","1992"
Query
LOAD CSV FROM 'http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/artists.csv' AS line CREATE (:Artist { name: line[1], year: toInt(line[2])})
A new node with the Artist
label is created for each row in the CSV file.
In addition, two columns from the CSV file are set as properties on the nodes.
Result
Nodes created: 4 |
---|
Properties set: 8 |
Labels added: 4 |
|
Try this query live none LOAD CSV FROM 'http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/artists.csv' AS line CREATE (:Artist {name: line[1], year: toInt(line[2])})
Import data from a CSV file containing headers
When your CSV file has headers, you can view each row in the file as a map instead of as an array of strings.
artists-with-headers.csv
"Id","Name","Year" "1","ABBA","1992" "2","Roxette","1986" "3","Europe","1979" "4","The Cardigans","1992"
Query
LOAD CSV WITH HEADERS FROM 'http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/artists-with-headers.csv' AS line CREATE (:Artist { name: line.Name, year: toInt(line.Year)})
This time, the file starts with a single row containing column names.
Indicate this using WITH HEADERS
and you can access specific fields by their corresponding column name.
Result
Nodes created: 4 |
---|
Properties set: 8 |
Labels added: 4 |
|
Try this query live none LOAD CSV WITH HEADERS FROM 'http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/artists-with-headers.csv' AS line CREATE (:Artist {name: line.Name, year: toInt(line.Year)})
Import data from a CSV file with a custom field delimiter
Sometimes, your CSV file has other field delimiters than commas.
You can specify which delimiter your file uses using FIELDTERMINATOR
.
artists-fieldterminator.csv
"1";"ABBA";"1992" "2";"Roxette";"1986" "3";"Europe";"1979" "4";"The Cardigans";"1992"
Query
LOAD CSV FROM 'http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/artists-fieldterminator.csv' AS line FIELDTERMINATOR ';' CREATE (:Artist { name: line[1], year: toInt(line[2])})
As values in this file are separated by a semicolon, a custom FIELDTERMINATOR
is specified in the LOAD CSV
clause.
Result
Nodes created: 4 |
---|
Properties set: 8 |
Labels added: 4 |
|
Try this query live none LOAD CSV FROM 'http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/artists-fieldterminator.csv' AS line FIELDTERMINATOR ';' CREATE (:Artist {name: line[1], year: toInt(line[2])})
Importing large amounts of data
If the CSV file contains a significant number of rows (approaching hundreds of thousands or millions), USING PERIODIC COMMIT
can be used to instruct Neo4j to perform a commit after a number of rows.
This reduces the memory overhead of the transaction state.
By default, the commit will happen every 1000 rows.
For more information, see Section 12.9, “Using Periodic Commit”.
Query
USING PERIODIC COMMIT LOAD CSV FROM 'http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/artists.csv' AS line CREATE (:Artist { name: line[1], year: toInt(line[2])})
Result
Nodes created: 4 |
---|
Properties set: 8 |
Labels added: 4 |
|
Try this query live none USING PERIODIC COMMIT LOAD CSV FROM 'http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/artists.csv' AS line CREATE (:Artist {name: line[1], year: toInt(line[2])})
Setting the rate of periodic commits
You can set the number of rows as in the example, where it is set to 500 rows.
Query
USING PERIODIC COMMIT 500 LOAD CSV FROM 'http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/artists.csv' AS line CREATE (:Artist { name: line[1], year: toInt(line[2])})
Result
Nodes created: 4 |
---|
Properties set: 8 |
Labels added: 4 |
|
Try this query live none USING PERIODIC COMMIT 500 LOAD CSV FROM 'http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/artists.csv' AS line CREATE (:Artist {name: line[1], year: toInt(line[2])})
Import data containing escaped characters
In this example, we both have additional quotes around the values, as well as escaped quotes inside one value.
artists-with-escaped-char.csv
"1","The ""Symbol""","1992"
Query
LOAD CSV FROM 'http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/artists-with-escaped-char.csv' AS line CREATE (a:Artist { name: line[1], year: toInt(line[2])}) RETURN a.name AS name, a.year AS year, length(a.name) AS length
Note that strings are wrapped in quotes in the output here. You can see that when comparing to the length of the string in this case!
Result
name | year | length |
---|---|---|
1 row | ||
Nodes created: 1 | ||
Properties set: 2 | ||
Labels added: 1 | ||
|
|
|
Try this query live none LOAD CSV FROM 'http://neo4j.com/docs/3.1.0-SNAPSHOT/csv/artists-with-escaped-char.csv' AS line CREATE (a:Artist {name: line[1], year: toInt(line[2])}) return a.name as name, a.year as year, length(a.name) as length