I have been using TinkerPop and the OpenRDF Sail API to connect to a local Neo4j instance without problems:
String dB_DIR = "neo4j//data";
Sail sail = new GraphSail(new Neo4jGraph(dB_DIR));
sail.initialize();
With that I can import TTL or RDF files and then query them.
But now I want to connect to a remote Neo4j instance.
Can I use the Neo4j JDBC driver in this case?
Or does TinkerPop Blueprints have a way to do it?
(I did some searching but found no good answer.)
You should use the Sesame Repository API for accessing a Sail object.
Specifically, what you need to do is wrap your Sail object in a Repository object:
Sail sail = new GraphSail(new Neo4jGraph(dB_DIR));
Repository rep = new SailRepository(sail);
rep.initialize();
After this, use the Repository object to connect to your store and perform actions, e.g. to load a Turtle file and then run a query:
RepositoryConnection conn = rep.getConnection();
try {
    // load data
    File file = new File("/path/to/file.ttl");
    conn.add(file, file.toURI().toString(), RDFFormat.TURTLE);

    // do query and print result to STDOUT
    String query = "SELECT * WHERE {?s ?p ?o} LIMIT 10";
    TupleQueryResult result =
        conn.prepareTupleQuery(QueryLanguage.SPARQL, query).evaluate();
    while (result.hasNext()) {
        System.out.println(result.next().toString());
    }
}
finally {
    conn.close();
}
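When you are done with the store, release it as well; a one-line sketch (in Sesame, shutting down the Repository also shuts down the wrapped Sail):

rep.shutDown();  // releases the repository and the wrapped Sail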
See the Sesame documentation or Javadoc for more info and examples of how to use the Repository API.
(disclosure: I am on the Sesame development team)
Related
Hi, I have a couple of queries that I want to run and save results from in sequence, one after another, using Apache Beam. I've seen some similar questions but couldn't find an answer. I'm used to designing pipelines with Airflow and I'm fairly new to Apache Beam; I'm using the Dataflow runner. Here's my code right now. I would like query2 to run only after the query1 results are saved to the corresponding table. How do I chain them?
PCollection<TableRow> resultsStep1 = getData("Run Query 1",
"Select * FROM basetable");
resultsStep1.apply("Save Query1 data",
BigQueryIO.writeTableRows()
.withSchema(BigQueryUtils.toTableSchema(resultsStep1.getSchema()))
.to("resultsStep1")
.withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
.withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE)
);
PCollection<TableRow> resultsStep2 = getData("Run Query 2",
"Select * FROM resultsStep1");
resultsStep2.apply("Save Query2 data",
BigQueryIO.writeTableRows()
.withSchema(BigQueryUtils.toTableSchema(resultsStep2.getSchema()))
.to("resultsStep2")
.withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
.withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE)
);
And here's my getData function definition:
private PCollection<TableRow> getData(final String taskName, final String query) {
return pipeline.apply(taskName,
BigQueryIO.readTableRowsWithSchema()
.fromQuery(query)
.usingStandardSql()
.withCoder(TableRowJsonCoder.of()));
}
Edit (Update): Turns out:
You can’t sequence the completion of a BigQuery write with other steps of your pipeline.
Which I think is a big limitation for designing pipelines.
Source: https://beam.apache.org/documentation/io/built-in/google-bigquery/#limitations
You can use the Wait transform to do this. A contrived example is below:
PCollection<Void> firstWriteResults = data.apply(ParDo.of(...write to first database...));
data.apply(Wait.on(firstWriteResults))
// Windows of this intermediate PCollection will be processed no earlier than when
// the respective window of firstWriteResults closes.
.apply(ParDo.of(...write to second database...));
You can find more details in the API documentation here: https://beam.apache.org/releases/javadoc/2.17.0/index.html?org/apache/beam/sdk/transforms/Wait.html
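To make this concrete, here is a minimal sketch with the ellipses filled in. WriteToDbFn and its target argument are hypothetical stand-ins for your actual write logic; the point is only that the first write's output PCollection acts as a completion signal for Wait.on:

// Hypothetical DoFn standing in for "write to a database". It emits no
// elements, so its output PCollection<Void> serves purely as a signal whose
// windows close once the writes for that window have finished.
class WriteToDbFn extends DoFn<TableRow, Void> {
    private final String target;  // hypothetical connection identifier

    WriteToDbFn(String target) { this.target = target; }

    @ProcessElement
    public void processElement(@Element TableRow row) {
        // ... write `row` to the database identified by `target` ...
    }
}

PCollection<Void> firstWriteResults =
    data.apply("WriteFirst", ParDo.of(new WriteToDbFn("first-db")));

data.apply(Wait.on(firstWriteResults))
    .apply("WriteSecond", ParDo.of(new WriteToDbFn("second-db")));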
I want to send an email when my DB is down, but I don't know how to check from PHP whether Neo4j is running. I am using the neoxygen NeoClient library to connect to Neo4j. Is there any way to do this? I am using Neo4j 2.3.2.
As Neo4j is exposed via an HTTP REST interface, you just need to check whether the appropriate endpoint is reachable:
if (#fopen("http://localhost:7474/db/data/","r")) {
// database is up
}
(assuming it's running on localhost)
a) Upgrade to the GraphAware neo4j-php-client; neoxygen has been deprecated for months and was ported there more than a year ago.
b) You can just do a try/catch around a query:
try {
    $result = $client->run('RETURN 1 AS x');
    if (1 === $result->firstRecord()->get('x')) {
        // db is running
    }
} catch (\Exception $e) {
    // db is not running or a connection cannot be made
}
Is it possible to import data into Neo4j using the automatic indexing feature? I'm trying to import data using BatchInserter and BatchInserterIndex, as in the following example:
BatchInserter inserter = BatchInserters.inserter("/home/fmagalhaes/Neo4JDatabase");
BatchInserterIndexProvider indexProvider = new LuceneBatchInserterIndexProvider(inserter);
BatchInserterIndex nodeIndex = indexProvider.nodeIndex("node_auto_index", MapUtil.stringMap("type","exact"));
BatchInserterIndex relIndex = indexProvider.relationshipIndex("relationship_auto_index", MapUtil.stringMap("type","exact"));
...
inserter.createNode(vertexId, properties);
nodeIndex.add(vertexId, properties);
...
The problem is that when batch processing is completed and I open this database with the generic Blueprints API by doing the following:
Graph g = new Neo4jGraph("/home/fmagalhaes/Neo4JDatabase");
Set<String> nodeIndices = ((KeyIndexableGraph)g).getIndexedKeys(Vertex.class);
Set<String> relIndices = ((KeyIndexableGraph)g).getIndexedKeys(Edge.class);
both nodeIndices and relIndices come back empty. The auto-indexing feature is disabled when I open the graph database with the Blueprints API. Is it possible to create an automatic index during batch processing such that the index will be visible (and will continue to index data automatically as properties are added to vertices and edges) when I open the database with the Blueprints API?
You have to cleanly shut down both the batch index and the batch inserter.
You probably don't want to index all properties, just the key ones that you use to look up nodes.
You have to enable auto-indexing in the Neo4j config for the database you start afterwards, and for the same properties that you indexed during batch insertion; both steps are sketched below.
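For example, a minimal sketch of the shutdown order, based on the batch-insertion code from the question (the flush() calls make sure pending index entries are written out first):

// flush pending index entries, then shut down in this order
nodeIndex.flush();
relIndex.flush();
indexProvider.shutdown();  // the index provider first
inserter.shutdown();       // the batch inserter last

And the corresponding auto-indexing settings in neo4j.properties; the property names here are placeholders, so list the same keys you indexed during batch insertion:

node_auto_indexing=true
node_keys_indexable=type,name
relationship_auto_indexing=true
relationship_keys_indexable=type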
I am able to create nodes and relationships through Java on a Neo4j database. When I try to access the created nodes in the next run I get this error:
Exception in thread "main" org.neo4j.graphdb.NotFoundException: Node 27 not found
In the webadmin interface the dashboard shows the number of nodes/relationships created through Java, but when I issue this query: START n=node(*) RETURN n; I get only 1 node in the output.
(FYI, I have installed Neo4j on my Windows machine (local) and am using embedded-database Java code to create nodes.)
The Java code I used to connect to the DB:
final String dbpath = "C:\\neo4j-community-1.9.4\\data\\graph.db";
GraphDatabaseService graphdb = new GraphDatabaseFactory().newEmbeddedDatabase(dbpath);
The settings I have used in ne04j-server.properties are:
org.neo4j.server.database.location=/C:/neo4j-community-1.9.4/data/graph.db/
org.neo4j.server.webserver.https.keystore.location=data/keystore
org.neo4j.server.webadmin.rrdb.location=data/rrd
org.neo4j.server.webadmin.data.uri=/C:/neo4j-community-1.9.4/data/graph.db/
org.neo4j.server.webadmin.management.uri=/db/manage/
When I create a node through Java the data/keystore file does not get populated; it only gets populated when I create a node through the webadmin interface. Changing the path of the keystore file to an absolute path also did not work.
Can anybody point out the mistake in this scenario? Thanks.
The problem was that the created nodes were not committed. To commit the nodes you have to call finish():
final String dbpath = "/C:/neo4j-community-1.9.4/data/graph.db/";
GraphDatabaseService graphdb = new GraphDatabaseFactory().newEmbeddedDatabase(dbpath);
Transaction tx = graphdb.beginTx();
try {
    Node n1 = graphdb.createNode();
    n1.setProperty("type", "company");
    n1.setProperty("location", "india");
    ....
    ...
    tx.success();   // mark the transaction as successful
} catch (Exception e) {
    tx.failure();   // mark it for rollback instead
} finally {
    tx.finish();    // finish() commits (or rolls back) the transaction
}
Ranjith's answer was correct until recently, but tx.finish() has now been deprecated.
tx.close(); is now the correct way to commit or rollback the transaction - it will do one or the other depending on whether you've previously called tx.success().
They changed this so the transaction is AutoCloseable and can be used in a try-with-resources block, as sketched below.
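A minimal sketch of that form, assuming a Neo4j 2.x embedded GraphDatabaseService named graphdb:

try (Transaction tx = graphdb.beginTx()) {
    Node n = graphdb.createNode();
    n.setProperty("type", "company");
    tx.success();  // without this, close() rolls the transaction back
}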
Have you tried:
String dbpath = "C:/neo4j-community-1.9.4/data/graph.db";
My question is basically how to properly execute a SPARQL update using a SailGraph created by TinkerPop.
DELETE { ?i_id_iri rdfs:label "BII-I-1" }
INSERT { ?i_id_iri rdfs:label "BII-I-4" }
WHERE
{
?investigation rdf:type obi:0000011.
?i_id_iri rdf:type iao:0000577.
?i_id_iri iao:0000219 ?investigation.
}
I have this query so far, with the prefixes added on top from another file, but it does not work.
The code I run is as follows:
query = parser.parseUpdate(queryString, baseURI);
UpdateExpr expr = query.getUpdateExprs().get(0);
Dataset dataset = query.getDatasetMapping().get(expr);
GraphDatabase.getSailConnection().executeUpdate(expr, dataset, new EmptyBindingSet(), false);
I'm not particularly familiar with the TinkerPop GraphSail, but assuming it implements the Sesame SAIL API correctly, executing a SPARQL update is far easier if you just wrap it in a SailRepository, like so:
TinkerGraph graph = new TinkerGraph();
Sail sail = new GraphSail(graph);
Repository rep = new SailRepository(sail);
rep.initialize();
You can then use Sesame's Repository API, which is far more user-friendly than trying to do operations directly on the SAIL (which is not designed for that purpose).
To open a connection and execute a SPARQL update, for example, you do:
RepositoryConnection conn = rep.getConnection();
try {
String sparql = "INSERT {?s a <foo:example> . } WHERE {?s ?p ?o } ";
Update update = conn.prepareUpdate(QueryLanguage.SPARQL, sparql);
update.execute();
}
finally {
conn.close();
}
See the Sesame documentation for more details on the Repository API, or check out the Javadoc.
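As a quick usage follow-up, here is one (hypothetical) way to verify that the update took effect, reusing the same connection pattern:

RepositoryConnection conn = rep.getConnection();
try {
    // ASK returns true once at least one ?s has the inserted type
    String ask = "ASK { ?s a <foo:example> }";
    boolean updated = conn.prepareBooleanQuery(QueryLanguage.SPARQL, ask).evaluate();
    System.out.println("update applied: " + updated);
}
finally {
    conn.close();
}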