How to change the memory configuration of the neo4j-shell tool for Cypher batch import

I am trying to use the neo4j-shell command-line tool to do a Cypher batch import. I followed the instructions in "Import data into your neo4j database from the neo4j-shell command". Here is the command that I ran:
import-cypher -d "," -i c://temp//neo//import.csv -o c://temp//neo//out.csv start n=node:employee_idx(EmpID={emp_id}), m=node:permit_idx(PmtID={pmtid}) create n<-[:Assign{AssID:{assid}}]-m
With only 100000 records in import.csv, it ran perfectly. With 200000 records, I got this error: Error occurred in server thread; nested exception is: java.lang.OutOfMemoryError: Java heap space.
How do I change the default memory configuration of this tool?

You need to set the environment variable JAVA_OPTS to an appropriate value. On Linux, for example, you can set it for a single run like this:
JAVA_OPTS="-Xmx4G" bin/neo4j-shell
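If you start the shell several times, it may be more convenient to export the variable once per terminal session. The following is only a sketch: MAX_HEAP is a made-up variable name used for illustration, and 4G is simply the value from the answer above, not a recommendation for your data size.

```shell
# MAX_HEAP is a hypothetical helper variable; override it before running
# this snippet if you want a different maximum heap size.
MAX_HEAP="${MAX_HEAP:-4G}"

# Export JAVA_OPTS so that a neo4j-shell started later from this same
# terminal session sees the setting.
JAVA_OPTS="-Xmx${MAX_HEAP}"
export JAVA_OPTS
echo "JAVA_OPTS=${JAVA_OPTS}"

# Then start the shell as usual (commented out so the snippet runs anywhere):
# bin/neo4j-shell
```
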

Related

Running CQL file in Neo4j

I have a CQL file called Novis.cql. It's somewhere random on my hard drive, but I want to run it in Neo4j to create my graph (it contains 500+ lines of code).
Where do I have to place it? And what command do I have to run nowadays to get it working? I've read and searched for answers, but some of the commands, like neo4j-shell, don't seem to work any longer...
Any help would be much appreciated!
The cypher-shell tool has been available for a while (starting with version 3.0, if not earlier), and you can use it to execute a Cypher query from a file that can be anywhere in your file system.
For example (on a linux/unix system), a command line like this will work (if you are in the neo4j home directory):
cat /my/full/path/my_code.cql | bin/cypher-shell -u neo4j -p secret
In neo4j 4.0 a new -f option was added to make it simpler:
bin/cypher-shell -u neo4j -p secret -f /my/full/path/my_code.cql
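For repeated use, the stdin variant can be wrapped in a small function that checks the file first. This is a minimal sketch, not an official recipe: run_cypher_file, CYPHER_SHELL, NEO4J_USER and NEO4J_PASS are names invented here, and the default credentials are the placeholder ones from the commands above.

```shell
# CYPHER_SHELL is an assumed variable; it defaults to the relative path
# used above, which only works from the neo4j home directory.
CYPHER_SHELL="${CYPHER_SHELL:-bin/cypher-shell}"

# run_cypher_file FILE: feed FILE to cypher-shell on stdin.
# Using stdin mirrors the "cat ... |" form, so it does not rely on -f.
run_cypher_file() {
  if [ ! -r "$1" ]; then
    echo "cannot read Cypher file: $1" >&2
    return 1
  fi
  "$CYPHER_SHELL" -u "${NEO4J_USER:-neo4j}" -p "${NEO4J_PASS:-secret}" < "$1"
}

# Usage (commented out so the snippet runs without Neo4j installed):
# run_cypher_file /my/full/path/my_code.cql
```
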

neo4j-admin command is not found when importing files using import tool

I am trying to import nodes and relationships into a Neo4j graph database using the neo4j import tool. So far I have written a script, import.sh, which looks as follows:
neo4j-admin import \
--id-type=string \
--nodes:AGENT="import/nodes_1.csv,import/nodes_2.csv" \
--nodes:CUSTOMER="import/nodes_C1.csv,import/nodes_CUSTOMER_C2.csv" \
--relationship:related="import/rel_HEADER.csv,import/rel_test.csv"
I can run my neo4j service from the console with ./bin/neo4j console.
When I run my script with ./import.sh, I get:
./import.sh: line 1: neo4j-admin: command not found
I am running Neo4j 3.5.6 Community Edition on macOS. Am I missing something here? Kindly help me sort it out.
The error is telling you that your computer doesn't know what you mean by neo4j-admin. You need to tell it where this program is located.
If I had to guess, you need to replace neo4j-admin with ./bin/neo4j-admin in your command.
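If the script should work no matter which directory it is started from, it can look the tool up before calling it. A rough sketch, assuming a variable NEO4J_HOME (a name chosen here, not something the question mentions) that points at the unpacked Neo4j directory; find_neo4j_admin is likewise a hypothetical helper:

```shell
# find_neo4j_admin: print a usable path to neo4j-admin, or fail.
# NEO4J_HOME is an assumption; it falls back to the current directory.
find_neo4j_admin() {
  if command -v neo4j-admin >/dev/null 2>&1; then
    # Already on PATH, so the bare name works.
    command -v neo4j-admin
  elif [ -x "${NEO4J_HOME:-.}/bin/neo4j-admin" ]; then
    echo "${NEO4J_HOME:-.}/bin/neo4j-admin"
  else
    echo "neo4j-admin not found; set NEO4J_HOME or add its bin directory to PATH" >&2
    return 1
  fi
}

# Usage inside import.sh (commented out so the snippet runs anywhere):
# NEO4J_ADMIN="$(find_neo4j_admin)" || exit 1
# "$NEO4J_ADMIN" import --id-type=string ...
```

Adding the bin directory to PATH (export PATH="$NEO4J_HOME/bin:$PATH") has the same effect and leaves the original script unchanged.
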

Issue Importing the Paradise Papers Dataset into Neo4j

Hej all,
I am having an issue with importing the Paradise Papers dataset into a Neo4j (3.3.2) database.
It seems that the data is imported correctly into the database, as reported by neo4j-admin import.
...
IMPORT DONE in 1m 4s 889ms.
Imported:
867931 nodes
1657838 relationships
17838925 properties
Peak memory usage: 488.28 MB
...
However, after importing the data, the database seems to be empty, as reported by the Cypher queries MATCH (n) RETURN count(n); and CALL apoc.meta.graph();
...
count(n)
0
nodes, relationships
[], []
...
The following link points to a script, which should reproduce my issue. It is a Bash script for OS X/BSD (I think the -E switch for sed does not exist on Linux). Additionally, the script requires Docker to be installed and running on the system.
https://github.com/HelgeCPH/cypher_kernel/blob/master/example/import_data.sh
To run the script quickly:
wget https://raw.githubusercontent.com/HelgeCPH/cypher_kernel/master/example/import_data.sh
chmod u+x import_data.sh
./import_data.sh
I cannot see what I am doing wrong. Do I have to point to the database explicitly when running cypher-shell?
Checking on the container, the database files exist (ls -ltrh data/databases/graph.db) and their timestamps correspond to the time when importing the data.
Thanks in advance for your help!
Your script had multiple errors:
The nodes were not loaded because the :ID column is not set in the CSV files. That's why I added this part:
for file in import/csv_paradise_papers/*.nodes.*.csv
do
sed -i -E '1s/node_id/node_id:ID/' "$file"
done
The node labels were also not set. You can set them directly on the command line like this: --nodes:MyLabel
If you query Neo4j while the server is restarting, you will probably receive an error because the server is not ready yet. That's why I added a sleep 5 at the end.
A better approach would be to wait until the server responds, with something like this:
until curl --output /dev/null --silent --head --fail http://localhost:7474; do
printf '.'
sleep 1
done
Last point: I don't know why, but if you restart neo4j inside the container, you will not see the imported data. If you restart the container itself, the data shows up.
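The polling loop above waits forever if the server never comes up. A possible variant with an upper bound is sketched below; wait_until is a hypothetical helper written for this example, and the retry count of 60 is an arbitrary choice.

```shell
# wait_until MAX_TRIES CMD [ARGS...]
# Run CMD about once per second until it succeeds or MAX_TRIES attempts
# have failed. Returns 0 on success and 1 when the attempts are used up.
wait_until() {
  _tries="$1"
  shift
  while [ "$_tries" -gt 0 ]; do
    if "$@"; then
      return 0
    fi
    _tries=$((_tries - 1))
    sleep 1
  done
  return 1
}

# Usage with the same curl probe as above (commented out so the snippet
# runs without a Neo4j server):
# wait_until 60 curl --output /dev/null --silent --head --fail http://localhost:7474 \
#   || { echo "Neo4j did not respond in time" >&2; exit 1; }
```
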

neo4j-shell example of running a Cypher script

I need to run a Cypher query against a Neo4J database, from a command line (for batch scheduling purposes).
When I run this:
./neo4j-shell -file /usr/share/neo4j/scripts/query.cypher -path /usr/share/neo4j/neo4j-community-3.1.1/data/databases/graph.db
I get this error:
ERROR (-v for expanded information):
Error starting org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory, /usr/share/neo4j/neo4j-community-3.1.1/data/databases/graph.db
There is a running Neo4J instance on that database (localhost:7474). I need the script to perform queries against it.
NOTE: this is a split of the original question, for the sake of tidiness.
To execute (one or more) Cypher statements from a file while the neo4j server is running, you can use the APOC procedure apoc.cypher.runFile(file or url).
Since you mention "batch scheduling", the Job management and periodic execution APOC procedures may be helpful. Those procedures could, in turn, execute calls to apoc.cypher.runFile.
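From a command line, one way to reach that procedure could be to send the CALL statement through cypher-shell. This is only a sketch under several assumptions: the APOC plugin is installed on the server, cypher-shell is available for your Neo4j version, and the credentials are placeholders. run_file_statement is a made-up helper, and it does not escape single quotes in the path.

```shell
# run_file_statement PATH: print a Cypher statement that asks APOC to run
# the statements stored in PATH. PATH must not contain single quotes.
run_file_statement() {
  printf "CALL apoc.cypher.runFile('%s');\n" "$1"
}

# Show the generated statement for the file from the question.
run_file_statement /usr/share/neo4j/scripts/query.cypher

# To execute it against the running server (commented out so the snippet
# runs without Neo4j; user and password are placeholders):
# run_file_statement /usr/share/neo4j/scripts/query.cypher | bin/cypher-shell -u neo4j -p secret
```
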
Okay I just spun up a fresh instance of Neo4j-community-3.1.1 today and ran into the exact same problem. Note that I had already created a database using the bulk import tool, so one might need to make a directory for a database (mkdir data/databases/graph.db) before using a shell.
I believe your problem might be that a Neo4j process is already running against the database you are trying to access.
For me, shutting down Neo4j, and then starting the shell with an explicit path worked:
cd /path/to/neo4j-community-3.1.1/
bin/neo4j stop ## assuming it is already running (may need a port specifier)
bin/neo4j-shell -path data/databases/graph.db
For some reason I thought you could have both the shell and the server running, but apparently that is not the case. Hopefully someone will correct me if I am wrong.

Where to find dumped data (using dump command in Neo4j Shell) in Neo4j

I am wondering where to find the file produced by the dump command in neo4j-shell. I am running Neo4j v2.1.3 on Windows. Please help, thanks.
I believe the dump command just outputs to the console, so you need to redirect the output. Something like:
Neo4jShell.bat -c "dump match (n:People)-[:Shared]-(m) return n,m;" > social.connection.db\test2.cql
Edited with the Windows version of the command
For UNIX systems a similar command works:
neo4j-shell -c "dump match (n:People)-[:Shared]-(m) return n,m;" > social.connection.db/test2.cql
