23.7. Usage Data Collector

The Neo4j Usage Data Collector (UDC) is a sub-system that gathers usage data and reports it to the UDC server at udc.neo4j.org. It is easy to disable and does not collect any confidential data. The data being sent is listed under Technical Information below.

The Neo4j team uses this information as a form of automatic, effortless feedback from the Neo4j community. We want to verify that we are doing the right thing by matching download statistics with usage statistics. After each release, we can see whether the server software stays in use for longer.

The data collected is clearly stated here. If any future versions of this system collect additional data, we will clearly announce those changes.

The Neo4j team is very concerned about your privacy. We do not disclose any personally identifiable information.

Technical Information

To gather good statistics about Neo4j usage, UDC collects this information:

  • Kernel version: The build number, and whether the kernel has been modified.
  • Store id: A random, globally unique id generated when the database is created.
  • Ping count: UDC holds an internal counter which is incremented on every ping and reset on every restart of the kernel.
  • Source: This is either "neo4j" or "maven". If you downloaded Neo4j from the Neo4j website, it is "neo4j"; if you are using Maven to get Neo4j, it is "maven".
  • Java version: The referrer string shows which version of Java is being used.
  • Registration id: For registered server instances.
  • Tags about the execution context (e.g. test, language, web-container, app-container, spring, ejb).
  • Neo4j Edition (community, enterprise).
  • A hash of the current cluster name (if any).
  • Distribution information for Linux (rpm, dpkg, unknown).
  • User-Agent header for tracking usage of REST client drivers.
  • MAC address to uniquely identify instances behind firewalls.
  • The number of processors on the server.
  • The amount of memory on the server.
  • The JVM heap size.
  • The number of nodes, relationships, labels and properties in the database.

After startup, UDC waits for ten minutes before sending the first ping. It does this for two reasons: first, so that UDC does not slow down startup, and second, to keep pings from automated tests to a minimum. The ping to the UDC servers is sent as an HTTP GET request.
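The ping behaviour described above can be illustrated with a small Java sketch. It is not the actual UDC implementation: the class name, the query parameter names (version, storeId, pingCount, source), the URL path and the repeat interval are all assumptions made for illustration. Only the in-memory ping counter, the delayed first ping and the use of an HTTP GET to udc.neo4j.org come from the description above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; not the real UDC code.
final class UdcPingSketch {
    // Kept in memory only, so the count starts from zero again after a restart.
    private final AtomicInteger pingCount = new AtomicInteger();

    // Builds the URL for the next ping and increments the ping counter.
    // Parameter names and the "/" path are hypothetical.
    String nextPingUrl(String kernelVersion, String storeId, String source) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("version", kernelVersion);
        params.put("storeId", storeId);
        params.put("pingCount", Integer.toString(pingCount.incrementAndGet()));
        params.put("source", source);

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            // URLEncoder.encode(String, Charset) requires Java 10 or later.
            query.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return "http://udc.neo4j.org/?" + query;
    }

    // Schedules sendPing so that the first run happens only after the initial
    // delay (ten minutes in the text above). The repeat interval is an assumption;
    // the text does not state how often later pings are sent.
    ScheduledFuture<?> start(ScheduledExecutorService scheduler,
                             long initialDelayMinutes,
                             long intervalMinutes,
                             Runnable sendPing) {
        return scheduler.scheduleAtFixedRate(
                sendPing, initialDelayMinutes, intervalMinutes, TimeUnit.MINUTES);
    }

    public static void main(String[] args) {
        UdcPingSketch udc = new UdcPingSketch();
        // Two consecutive pings: the counter goes up by one each time.
        System.out.println(udc.nextPingUrl("3.0.0", "example-store-id", "maven"));
        System.out.println(udc.nextPingUrl("3.0.0", "example-store-id", "maven"));
    }
}
```

Because the counter lives only in the object, creating a new UdcPingSketch (as would happen on a kernel restart) starts the count again at one. The start method is shown separately so that the delay can be tested with a short value instead of waiting ten minutes.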

How to disable UDC

To turn UDC off, disable it in the database configuration: in neo4j.conf for Neo4j server, or in the configuration passed to the database in embedded mode. See UDC Configuration in the configuration section for details.
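As a rough illustration, a neo4j.conf entry that disables UDC might look like the fragment below. The setting name dbms.udc.enabled is an assumption based on Neo4j 3.x releases and is not stated in this section; confirm the exact name for your version in the UDC Configuration section.

```properties
# neo4j.conf
# Assumed setting name (Neo4j 3.x); verify against UDC Configuration.
dbms.udc.enabled=false
```

In embedded mode, the same key and value would be supplied through the configuration passed to the database when it is created.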