I'm using the Neo4j JDBC driver 2.0.1.
When I run the following query in my browser, I get the right data back.
MATCH (u:Person) RETURN u.name, u.lastname
I am executing this statement with the NeoJDBC driver (I am connected to the db, otherwise I would not have been able to create nodes before):
public static ResultSet executeCypher(String query)
{
    try (Statement stmt = TestUtils.connection.createStatement())
    {
        return stmt.executeQuery(query);
    }
    catch (SQLException e)
    {
        System.out.println(e.getMessage() + "\n\n" + e.getCause().toString() + "\n\n" + e.getErrorCode());
    }
}
When I iterate over the result set, there are 2 columns as expected, but only one row. This minimal snippet reproduces it:
//create some users here
//...
//check the database content
ResultSet rs = TestUtils.executeCypher("MATCH (u:Person) RETURN u.name, u.lastname");
while (rs.next())
{
    System.out.println(rs.getString("u.name"));
    System.out.println("print anything for test purposes");
}
Output:
Executing query: MATCH (u:Person) RETURN u.name as name, u.lastname
as lastname with params {} Starting the Apache HTTP client
John
print anything for test purposes
Why do I only get one row back although it should return multiple rows? I found some data in the "data" field of the ResultSet while debugging (see also Michael Hunger's answer here) where the "last" property is false. So I guess there is more data. But I don't know how to extract it.
How can I get all the data that is in the ResultSet (using an iterator)?
Related
I am creating nodes in Neo4j using the neo4j-java driver with the following Cypher query.
String cipherQuery = "CREATE (n:MLObsTemp { personId: " + personId + ",conceptId: " + conceptId
+ ",obsId: " + obsId + ",MLObsId: " + mlObsId + ",encounterId: " + encounterId + "}) RETURN n";
Function for creating query
createNeo4JObsNode(String cipherQuery);
Implementation of the Function
private void createNeo4JObsNode(String cipherQuery) throws Exception {
    try (ConNeo4j greeter = new ConNeo4j("bolt://localhost:7687", "neo4j", "qwas")) {
        System.out.println("Executing query : " + cipherQuery);
        try (Session session = driver.session()) {
            StatementResult result = session.run(cipherQuery);
        } catch (Exception e) {
            System.out.println("Error" + e.getMessage());
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
I create the relationships for these nodes with the code below:
String obsMatchQuery = "MATCH (m:MLObsTemp),(o:Obs) WHERE m.obsId=o.obsId CREATE (m)-[:OBS]->(o)";
createNeo4JObsNode(obsMatchQuery);
String personMatchQuery = "MATCH (m:MLObsTemp),(p:Person) WHERE m.personId=p.personId CREATE (m)-[:PERSON]->(p)";
createNeo4JObsNode(personMatchQuery);
String encounterMatchQuery = "MATCH (m:MLObsTemp),(e:Encounter) WHERE m.encounterId=e.encounterId CREATE (m)-[:ENCOUNTER]->(e)";
createNeo4JObsNode(encounterMatchQuery);
String conceptMatchQuery = "MATCH (m:MLObsTemp),(c:Concept) WHERE m.conceptId=c.conceptId CREATE (m)-[:CONCEPT]->(c)";
createNeo4JObsNode(conceptMatchQuery);
It is taking me 13 seconds on average for creating nodes and 12 seconds for making relations. I have 350k records in my database for which I have to create nodes and their respective relations.
How can I improve my code? Moreover, is this the best way for creating nodes in Neo4j using bolt server and neo4j-java driver?
EDIT
I am now using query parameters in my code:
HashMap<String, Object> parameters = new HashMap<String, Object>();
((HashMap<String, Object>) parameters).put("personId", 1390);
((HashMap<String, Object>) parameters).put("obsId", 14001);
((HashMap<String, Object>) parameters).put("conceptId", 5978);
((HashMap<String, Object>) parameters).put("encounterId", 10810);
((HashMap<String, Object>) parameters).put("mlobsId", 2);
String cypherQuery=
"CREATE (m:MLObsTemp { personId: $personId, ObsId: $obsId, conceptId: $conceptId, MLObsId: $mlobsId, encounterId: $encounterId}) "
+ "WITH m MATCH (p:Person { personId: $personId }) CREATE (m)-[:PERSON]->(p) "
+ "WITH m MATCH (e:Encounter {encounterId: $encounterId }) CREATE (m)-[:Encounter]->(e) "
+ "WITH m MATCH (o:Obs {obsId: $obsId }) CREATE (m)-[:OBS]->(o) "
+ "WITH m MATCH (c:Concept {conceptId: $conceptId }) CREATE (m)-[:CONCEPT]->(c) "
+ " RETURN m";
Creating Node function
try {
    ConNeo4j greeter = new ConNeo4j("bolt://localhost:7687", "neo4j", "qwas");
    try {
        Session session = driver.session();
        StatementResult result = session.run(cypherQuery, parameters);
        System.out.println(result);
    } catch (Exception e) {
        System.out.println("[WARNING] Null Row");
    }
} catch (Exception e) {
    e.printStackTrace();
}
I have also created the following uniqueness constraints, which add indexes, to speed up the process:
CREATE CONSTRAINT ON (P:Person) ASSERT P.personId IS UNIQUE
CREATE CONSTRAINT ON (E:Encounter) ASSERT E.encounterId IS UNIQUE
CREATE CONSTRAINT ON (O:Obs) ASSERT O.obsId IS UNIQUE
CREATE CONSTRAINT ON (C:Concept) ASSERT C.conceptId IS UNIQUE
Here is the plan for 1 cypher query-profile
Performance has improved, but not significantly. I am using neo4j-java-driver version 1.6.1. How can I batch my Cypher queries to improve performance further?
You should try to minimize redundant work in your Cypher queries.
MLObsTemp has a lot of redundant properties, and you are searching for it again to create every link. Relationships remove the need for foreign-key properties (node IDs).
I would recommend a single Cypher query that does everything together and uses parameters, like this:
CREATE (m:MLObsTemp)
WITH m MATCH (p:Person {id: $person_id}) CREATE (m)-[:PERSON]->(p)
WITH m MATCH (e:Encounter {id: $encounter_id}) CREATE (m)-[:Encounter]->(e)
WITH m MATCH (c:Concept {id: $concept_id}) CREATE (m)-[:CONCEPT]->(c)
// SNIP more MATCH/CREATE
RETURN m
This way, Neo4j doesn't have to find m repeatedly for every relationship. You don't need the ID properties, because that is effectively what the relationship you just created is. Neo4j is very efficient at walking edges (relationships), so just follow the relationship if you need the id value.
TIPS: (mileage may vary across Neo4j versions)
Inline is almost always more efficient than WHERE (MATCH (n {id: "rawr"}) vs MATCH (n) WHERE n.id = "rawr").
Parameters make frequent, similar queries more efficient, because Neo4j caches the plan for them (this is the $thing_id syntax used in the query above). They also protect you from Cypher injection (see SQL injection).
From a Session, you can create a Transaction (Session.run() actually creates a transaction for each run call). You can batch multiple Cypher queries in a single transaction (even using the results of previous queries from the same transaction), because a transaction lives in memory until you mark it as a success and close it. Note that if you are not careful, a large transaction can fail with an out-of-memory error, so remember to commit periodically, between batches. (Committing in batches of 10k records seems to be the norm when ingesting large data sets.)
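The batching tip above can be sketched in plain Java. This is only an illustration under stated assumptions: BatchPartitioner and partition are hypothetical names, the records are stand-in integers, and no driver call is made. With the 1.x Java driver, the marked spot is where you would typically open a transaction with session.beginTransaction(), run the parameterized query for each record, then mark it successful and close it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the Neo4j driver): splits the records to
// ingest into fixed-size batches so each batch can get its own transaction.
final class BatchPartitioner {

    // Returns consecutive sub-lists of at most batchSize elements each.
    static <T> List<List<T>> partition(List<T> records, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < records.size(); start += batchSize) {
            int end = Math.min(start + batchSize, records.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(records.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-in data: in the question these would be the 350k records.
        List<Integer> recordIds = new ArrayList<>();
        for (int i = 0; i < 25_000; i++) {
            recordIds.add(i);
        }
        for (List<Integer> batch : partition(recordIds, 10_000)) {
            // Assumption: one transaction per batch goes here. Run the
            // parameterized query once per record, then commit the batch.
            System.out.println("would commit a batch of " + batch.size() + " records");
        }
    }
}
```

Keeping the batch size as a parameter makes it easy to tune it against available heap. 10k is the figure from the tip above, not a measured optimum.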
I am trying to query my database to get a specific value. However, when I convert the query to a string, it doesn't return the selected value; instead it returns the whole SQL query as a string. I am stumped as to why this is happening.
public ActionResult StudiedModules()
{
    Model1 studiedModules = new Model1();
    List<StudiedModulesModel> listModules = new List<StudiedModulesModel>();
    using (EntityOne context = new EntityOne())
    {
        foreach (var module in context.StudiedModules)
        {
            studiedModules.School = context.ModuleDatas.Where(p => p.ModuleCode == module.ModuleCode).Select(u => u.School).ToString();
            studiedModules.Subject = context.ModuleDatas.Where(p => p.ModuleCode == module.ModuleCode).Select(u => u.Subject).ToString();
        }
    }
    var data = listModules;
    return View(data);
}
Calling ToString() on an Entity Framework Linq query like that will in fact return the SQL Query. So as it's written, your code is doing exactly what you wrote it to do.
If you want to select the first result from the IQueryable<T>, then you need to call First() before your ToString(). So, try changing
studiedModules.School = context.ModuleDatas.Where(p=>p.ModuleCode == module.ModuleCode).Select(u=>u.School).ToString();
studiedModules.Subject = context.ModuleDatas.Where(p=>p.ModuleCode == module.ModuleCode).Select(u=>u.Subject).ToString();
to
studiedModules.School = context.ModuleDatas.Where(p=>p.ModuleCode == module.ModuleCode).Select(u=>u.School).First().ToString();
studiedModules.Subject = context.ModuleDatas.Where(p=>p.ModuleCode == module.ModuleCode).Select(u=>u.Subject).First().ToString();
There are a whole lot more methods available depending on what you're trying to accomplish. If you want to get a list, use ToList(), or as Uroš Goljat pointed out, you can get a comma-separated list of values using the Aggregate( (a, b)=> a + ", " + b) method.
How about using Aggregate( (a, b)=> a + ", " + b) instead of ToString()?
Regards,
Uros
I have a sample database with 8 million users where the manage account page takes 8 seconds to render. It boils down to the method GetUserId calling Membership GetUser.
The GetUser sql looks like this:
SELECT [UserId] FROM [Users] WHERE (UPPER([UserName]) = @0)
When I run the following queries in the query analyzer I get these results:
SELECT [UserId] FROM [Users] WHERE [UserName] = 'CARL'
-- This query takes 11 milliseconds on my dev machine
SELECT [UserId] FROM [Users] WHERE UPPER([UserName]) = 'CARL'
-- This query takes 3.5 seconds on my dev machine
The UserName column has the following index:
CREATE NONCLUSTERED INDEX IX_Users_UserName ON dbo.Users (UserName)
Can the sql query be changed? Can the query performance be improved in any other way?
As per MS recommendation run the following SQL to improve your performance:
I learned this while trying to solve the same problem: because the query wraps the column in UPPER(), it cannot use the index on UserName.
Try this in the short term if you can afford the resources:
ALTER TABLE Users ADD NormalizedName AS UPPER(UserName);
CREATE NONCLUSTERED INDEX [IX_NormalizedName] ON [Users] ([NormalizedName] ASC);
After doing this I got very reasonable performance out of simple membership (enough to last me till I replace it with identity or the next best thing.)
http://i1.blogs.msdn.com/b/webdev/archive/2015/02/11/improve-performance-by-optimizing-queries-for-asp-net-identity-and-other-membership-providers.aspx
And modify the code yourself in the long run and replace the compiled version. Carl R pointed out this project is now open source too. So now you can rewrite it to taste.
https://aspnetwebstack.codeplex.com/SourceControl/latest#src/WebMatrix.WebData/SimpleMembershipProvider.cs
Can the sql query be changed?
No, the SQL query is hard-coded in the simple membership provider. Use Reflector to look at the WebMatrix.WebData.SimpleMembershipProvider.GetUserId method, which looks like this:
internal static int GetUserId(IDatabase db, string userTableName, string userNameColumn, string userIdColumn, string userName)
{
object obj2 = db.QueryValue("SELECT " + userIdColumn + " FROM " + userTableName + " WHERE (UPPER(" + userNameColumn + ") = @0)", new object[] { userName.ToUpperInvariant() });
if (<GetUserId>o__SiteContainer5.<>p__Site6 == null)
{
<GetUserId>o__SiteContainer5.<>p__Site6 = CallSite<Func<CallSite, object, bool>>.Create(Binder.UnaryOperation(CSharpBinderFlags.None, ExpressionType.IsTrue, typeof(SimpleMembershipProvider), new CSharpArgumentInfo[] { CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.None, null) }));
}
if (<GetUserId>o__SiteContainer5.<>p__Site7 == null)
{
<GetUserId>o__SiteContainer5.<>p__Site7 = CallSite<Func<CallSite, object, object, object>>.Create(Binder.BinaryOperation(CSharpBinderFlags.None, ExpressionType.NotEqual, typeof(SimpleMembershipProvider), new CSharpArgumentInfo[] { CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.None, null), CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.Constant, null) }));
}
if (!<GetUserId>o__SiteContainer5.<>p__Site6.Target(<GetUserId>o__SiteContainer5.<>p__Site6, <GetUserId>o__SiteContainer5.<>p__Site7.Target(<GetUserId>o__SiteContainer5.<>p__Site7, obj2, null)))
{
return -1;
}
if (<GetUserId>o__SiteContainer5.<>p__Site8 == null)
{
<GetUserId>o__SiteContainer5.<>p__Site8 = CallSite<Func<CallSite, object, int>>.Create(Binder.Convert(CSharpBinderFlags.ConvertExplicit, typeof(int), typeof(SimpleMembershipProvider)));
}
return <GetUserId>o__SiteContainer5.<>p__Site8.Target(<GetUserId>o__SiteContainer5.<>p__Site8, obj2);
}
You will have to write a custom membership provider if you want to change this behavior.
I have a query that brings back a cell in my table that contains XML. I can already print out what is in the cell, but without any delimiters. Now I need to take each individual element and link it to my object. Is there an easy way to do this?
def sql
def dataSource
static transactional = true
def pullLogs(String username, String id) {
    if(username != null && id != null) {
        sql = new Sql(dataSource)
        println "Data source is: " + dataSource.toString()
        def schema = dataSource.properties.defaultSchema
        sql.query('select USERID, AUDIT_DETAILS from DEV.AUDIT_LOG T WHERE XMLEXISTS(\'\$s/*/user[id=\"' + id + '\" or username=\"'+username+'\"]\' passing T.AUDIT_DETAILS as \"s\") ORDER BY AUDIT_EVENT', []) { ResultSet rs ->
            while (rs.next()) {
                def auditDetails = new XmlSlurper().parseText(rs.getString('AUDIT_EVENT_DETAILS'))
                println auditDetails.toString
            }
        }
        sql.close()
    }
}
This gives me the cell with the audit details in it. The problem is that it puts all the information from the field into one giant string without the element tags. How would I go through it and assign the values to an object? I have been trying to work with this example http://gallemore.blogspot.com/2008/04/groovy-xmlslurper.html with no luck, since that works with a file.
I must be missing something. I tried running another parseText(auditDetails), but had no luck with that either.
Any suggestions?
EDIT:
The XML in that field looks like
<user><username>scottsmith</username><timestamp>tues 5th 2009</timestamp></user>
^ similar to this, except mine is a lot longer. It comes out as "scottsmithtue 5th 2009" and so on. I need to take those tags and link them to my object instead of just printing them as one conjoined string.
Just do
auditDetails.username
Or
auditDetails.timestamp
To access the properties you require
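For anyone reading the same column from plain Java rather than Groovy, a rough counterpart to auditDetails.username can be sketched with the JDK's built-in DOM parser. This is not XmlSlurper, and AuditDetailsReader and childText are hypothetical names chosen for illustration. The sample XML is the one from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: pulls the text of one element out of an XML string,
// roughly what auditDetails.username does with XmlSlurper in Groovy.
final class AuditDetailsReader {

    // Returns the text of the first element named tagName, or null if none exists.
    static String childText(String xml, String tagName) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList matches = doc.getDocumentElement().getElementsByTagName(tagName);
            return matches.getLength() == 0 ? null : matches.item(0).getTextContent();
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked exceptions.
            throw new IllegalArgumentException("Could not parse audit details XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<user><username>scottsmith</username>"
                + "<timestamp>tues 5th 2009</timestamp></user>";
        System.out.println(childText(xml, "username"));
        System.out.println(childText(xml, "timestamp"));
    }
}
```

Each value can then be assigned to the matching field of your own object instead of being printed.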
I am trying to perform a query on my NameIndex in Neo4j using the Neo4jClient for .NET, but I get this error:
{"Received an unexpected HTTP status when executing the request.\r\n\r\nThe response status was: 500 Internal Server Error\r\n\r\nThe raw response body was: {\"exception\":\"NullPointerException\",\"stacktrace\":[\"org.apache.lucene.util.SimpleStringInterner.intern(SimpleStringInterner.java:54)\",\"org.apache.lucene.util.StringHelper.intern(StringHelper.java:39)\",\"org.apache.lucene.index.Term.<init>(Term.java:38)\",\"org.apache.lucene.queryParser.QueryParser.getFieldQuery(QueryParser.java:643)\",\"org.apache.lucene.queryParser.QueryParser.Term(QueryParser.java:1421)\",\"org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1309)\",\"org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1237)\",\"org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1226)\",\"org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:206)\",\"org.neo4j.index.impl.lucene.IndexType.query(IndexType.java:300)\",\"org.neo4j.index.impl.lucene.LuceneIndex.query(LuceneIndex.java:227)\",\"org.neo4j.server.rest.web.DatabaseActions.getIndexedNodesByQuery(DatabaseActions.java:977)\",\"org.neo4j.server.rest.web.DatabaseActions.getIndexedNodesByQuery(DatabaseActions.java:960)\",\"org.neo4j.server.rest.web.RestfulGraphDatabase.getIndexedNodesByQuery(RestfulGraphDatabase.java:692)\",\"java.lang.reflect.Method.invoke(Unknown Source)\"]}"}
My method looks as follows:
public IEnumerable GraphGetNodeByName(string NodeName)
{
    GraphOperationsLogger.Trace("Now entering GraphGetNodeByName() method");
    IEnumerable QueryResult = null;
    GraphOperationsLogger.Trace("Now performing the query");
    var query = client_connection.QueryIndex<GraphNode>("NameIndex", IndexFor.Node,
        //Here I want to pass in the NodeName into the query
        //@"Start n = node:NameIndex(Name = '"+ NodeName +"') return n;");
        //Here I am hard-coding the NodeName
        @"Start n = node:NameIndex(Name = ""Mike"") return n;");
    QueryResult = query.ToList();
    return QueryResult;
}
Ideally I would like to pass the NodeName into the query, but that does not work, so I tried hard-coding it, and that doesn't work either. Both scenarios produce the same error message.
The method you are calling, IGraphClient.QueryIndex is not a Cypher method. It's a wrapper on http://docs.neo4j.org/chunked/milestone/rest-api-indexes.html#rest-api-find-node-by-query. It's an older API, from before Cypher existed.
You're already half way there though, because your code comments include the Cypher query:
Start n = node:NameIndex(Name = "Mike")
return n;
So, let's just translate that into C#:
client
    .Cypher
    .Start(new CypherStartBitWithNodeIndexLookup("n", "NameIndex", "Name", "Mike"))
    .Return<Node<Person>>("n");
Always start your Cypher queries from IGraphClient.Cypher or NodeReference.StartCypher (which is just a shortcut to the former).
There are some other issues with your method:
You're returning a raw IEnumerable. What is in it? You should return IEnumerable<T>.
You're calling query.ToList(). I'd be surprised if that even compiles. You want to call ToList on the results so that the enumerable is hit.
In C#, your local variables should be in camelCase, not PascalCase. That is, queryResult instead of QueryResult.
Combining all of those points, your method should be:
public IEnumerable<Node<Person>> GetPeopleByName(string name)
{
    return graphClient
        .Cypher
        .Start(new CypherStartBitWithNodeIndexLookup("n", "NameIndex", "Name", name))
        .Return<Node<Person>>("n")
        .Results
        .ToList();
}