Longest path based on relationship property

Longest path based on relationship property - neo4j

I need to find the longest path in Neo4j graph based on order of specific property of the relationship.
Imagine that the relationship has property date of integer type like that
[:Like {date:20140812}]
Consider the path:
A-[r1:Like]->B-[r2:Like]->C
This is valid path only if r1.date < r2.date
The question is how to find the longest path with Cypher.

You can do something like this but it won't be very efficient in cypher.
I would still limit the max-path-length to prevent it from exploding.
I'd probably use the Java API to achieve something like that.
You can try to run it with the new query planner in Neo4j 2.1 with the cypher 2.1.experimental prefix.
MATCH (a:Label {prop:value})
MATCH path = (a)-[r:Like*10]->(b)
WHERE ALL(idx in range(1,length(path)-1) WHERE r[idx-1].date < r[idx].date)
RETURN path, length(path) as len
ORDER BY len DESC
LIMIT 1
In Java it would be something like this:
TraversalDescription traversal = db.traversalDescription().depthFirst().evaluator(new PathEvaluator.Adapter<Long>() {
#Override
public Evaluation evaluate(Path path, BranchState<Long> state) {
if (path.length() == 0) return Evaluation.INCLUDE_AND_CONTINUE;
Long date = (Long) path.lastRelationship().getProperty("date");
Long stateDate = state.getState();
state.setState(date);
if (stateDate != null) {
return stateDate < date ? Evaluation.INCLUDE_AND_CONTINUE : Evaluation.EXCLUDE_AND_PRUNE;
}
return Evaluation.INCLUDE_AND_CONTINUE;
}
});
int length = 0;
Path result;
for (Path path : traversal.traverse(startNode)) {
if (path.length() > length) {
result = path;
length = path.length();
}
}

Related

Base on condition, create node and return value

In my neo4j database, I have 2 node types:
public class Warrior
{
public string Id{get;set}
public string Name{get;set;}
}
public class Weapon
{
public string Id{get;set;}
public string Name{get;set;}
}
And I want my query has the same function like this :
var warrior = Match(warrior.Id == 1)
var weapon = Match(weapon.Id == 2)
if (warrior == null)
return 0;
if (weapon == null)
return 1;
CreateConnection(warrior-[:HAS]->Weapon)
return 2;
Thank you.
P/S : Neo4j or Neo4jClient query is accepted, I can convert them to each other.

The following query should do what you require.
optional match (warrior:Warrior) where warrior.id=1 with warrior
optional match (weapon:Weapon) where weapon.id=2 with warrior,weapon,
case when exists(warrior.id) and exists(weapon.id) then 2 when exists(weapon.id) then 1 else 0 end as output
foreach(temp in case when output=2 then [1] else [] end |
create (warrior)-[:Has]->(weapon)
)
return warrior,weapon
we use optional match instead of match so that the query does not fail if match is not succeeded.
We use a case statement to check if the ID value of the variable for warrior or weapon exists and store the resulting value (0 or 1 or 2) in an output variable.
Next we use the foreach loop case trick to check if the output has the value 2 and execute the create statement.
(reference for the trick. http://www.markhneedham.com/blog/2014/08/22/neo4j-load-csv-handling-empty-columns/)

Batching Neo4J updates for speed

My goal is to have a list of words in the Oxford dictionary with a relationship between them called IS_ONE_STEP_AWAY_FROM. Each word in relationship is the same length and varies by only one letter.
I am currently able to batch insert the words themselves, but how can I batch insert these relationships?
class Word
{
public string Value { get; set; }
}
public void SeedDatabase()
{
var words = new Queue<Word>();
EnqueueWords(words);
//Create the words as a batch
GraphClient.Cypher
.Create("(w:Word {words})")
.WithParam("words", words)
.ExecuteWithoutResults();
//Add relationships one word at a time
while (words.Count > 0)
{
var word = words.Dequeue();
var relatedWords = WordGroups[word.Value].Except(Enumerable.Repeat(word.Value, 1)).ToList();
if (relatedWords.Count > 0)
{
foreach (string relatedWord in relatedWords)
{
GraphClient.Cypher
.Match("(w1 :Word { Value : {rootWord} }), (w2 :Word { Value : {relatedWord} })")
.Create("(w1)-[r:IS_ONE_STEP_AWAY_FROM]->(w2)")
.WithParam("rootWord", word.Value)
.WithParam("relatedWord", relatedWord)
.ExecuteWithoutResults();
}
}
}
}

Peter do you have an index or constraint on :Word(Value) ?
I also don't fully understand how this batches:
GraphClient.Cypher
.Create("(w:Word {words})")
.WithParam("words", words)
.ExecuteWithoutResults();
And where you define the property-name (aka Value).
Do you run this currently concurrently?
For the relationships I'd recommend to group them by start-word.
then you could do something like:
MATCH (w1:Word {Value:{rootWord}})
UNWIND {relatedWords} as relatedWord
MATCH (w2:Word {Value:relatedWord}}
CREATE (w1)-[r:IS_ONE_STEP_AWAY_FROM]->(w2);
Neo4jClient also still have to learn to use the new transactional endpoint.

breeze query where clause is not executed

I'm quite new to using breeze and at the moment stuck with something which seems very simple.
I have a API call which returns 4 locations. Then using breeze, I'm trying to filter it down using a where clause as follows:
function getLocations(clientId) {
var self = this;
return EntityQuery.from('GetLocations')
.withParameters({ clientId: clientId })
.where("activeStatus", "==", "0")
.expand('LocationType')
.using(self.manager)
.execute()
.then(querySucceeded, this._queryFailed);
function querySucceeded(data) {
if (data.results.length > 1) {
locations = data.results;
}
return locations;
}
}
Ideally, this should give me 0 rows, because in all 4 rows the 'activeStatus' is 1. However, it still shows me all 4 results. I tried with another filter for locationType, and it's the same result. The breeze side where clause does not get executed.
Update to answer the questions:
Following is how the API call in my controller looks like:
public object GetLocations(int clientId) {
}
As you see it only accepts the clientId as a parameter hence I use the with parameter clause. I was thinking that breeze will take care of the activeStatus where clause and I don't have to do the filter on that in the back-end. Is that wrong?
Can someone help with this?

The Breeze documentation indicates that the withParameters is usually used with non-.NET backends or servers which do not recognize oData URIs. Is it possible that the where clause is being ignored because of .withParameters? Can't you rewrite the where clause using the clientID filter?
function getLocations(clientId) {
var self = this;
var p1 = new breeze.Predicate("activeStatus","==","0");
var p2 = new breeze.Predicate("clientId","==",clientId);
var p = p1.and(p2)
return EntityQuery.from('GetLocations')
.where(p)
.expand('LocationType')
.using(self.manager)
.execute()
.then(querySucceeded, this._queryFailed);
function querySucceeded(data) {
if (data.results.length > 1) {
locations = data.results;
}
return locations;
}
}
I'd try this first. Or put the where clause in the withParameters statement, depending on your backend. If that doesn't work, then there might be other options.
Good Luck.
EDIT: An example that I use:
This is the API endpoint that I query against:
// GET: breeze/RST_ClientHistory/SeasonClients
[HttpGet]
[BreezeQueryable(MaxExpansionDepth = 10)]
public IQueryable<SeasonClient> SeasonClients()
{
return _contextProvider.Context.SeasonClients;
}
And here is an example of a query I use:
// qFilters is object. Values are arrays or strings, keys are id fields. SeasonClients might also be Clients
// Setup predicates
var p, p1;
// link up the predicates for passed data
for (var f in qFilters) {
var compareOp = Object.prototype.toString.call(qFilters[f]) === '[object Array]' ? 'in' : '==';
if (!qFilters[f] || (compareOp == 'in' && qFilters[f].length == 0)) continue;
fLC = f.toLowerCase();
if (fLC == "countryid") {
p1 = breeze.Predicate("District.Region.CountryId", compareOp, qFilters[f]);
} else if (fLC == "seasonid") {
p1 = breeze.Predicate("SeasonId", compareOp, qFilters[f]);
} else if (fLC == "districtid") {
p1 = breeze.Predicate("DistrictId", compareOp, qFilters[f]);
} else if (fLC == "siteid") {
p1 = breeze.Predicate("Group.Site.SiteId", compareOp, qFilters[f]);
} else if (fLC == "groupid") {
p1 = breeze.Predicate("GroupId", compareOp, qFilters[f]);
} else if (fLC == "clientid" || fLC == 'seasonclientid') {
p1 = breeze.Predicate("ClientId", compareOp, qFilters[f]);
}
// Setup predicates
if (p1) {
p = p ? p.and(p1) : p1;
}
}
// Requires [BreezeQueryable(MaxExpansionDepth = 10)] in controller
var qry = breeze.EntityQuery
.from("SeasonClients")
.expand("Client,Group.Site,Season,VSeasonClientCredit,District.Region.Country,Repayments.RepaymentType")
.orderBy("DistrictId,SeasonId,GroupId,ClientId");
// Add predicates to query, if any exist
if (p) qry = qry.where(p);
return qry;
That's longer than it needs to be, but I wanted to make sure a full working example is in here. You will notice that there is no reason to use .withParameters. As long as the Context is set up properly on the server, chaining predicates (where clauses) should work fine. In this case, we are creating where clauses with up to 10 ANDs filtering with strict equality or IN a collection, depending on what is passed in the qFilters Object.
I think you should probably get rid of the parameter in your backend controller, make the method parameterless, and include the clientId match as an additional predicate in your query.
This approach also makes your API endpoint much more flexible -- you can use it for a wide variety of queries, even if the ClientId has nothing to do with them.
Does this help? Let me know if you have any more questions?

Out of Memory using Java API but Cypher query can work although it is slow

I have a graph database with 150 million nodes and a few hundred million relationships.
There are two types of nodes in the network: account node and transaction node. Each account node has a public key and each transaction node has a number (the amount of total bitcoin involved in this transaction).
There are also two types of relationships in the network. Each relationship connects an account node with a transaction node. One type of relationships is "send" and the other type is "receive". Each relationship also has a number to represent how much bitcoin it sends or receives.
This is an example:
(account: publickey = A)-[send: bitcoin=1.0]->(transaction :id = 1, Tbitcoin=1.0)-[receive: bitcoin=0.5]->(account: publickey = B)
(account: publickey = A)-[send: bitcoin=1.0]->(transaction :id = 1, Tbitcoin=1.0)-[receive: bitcoin=0.5]->(account: publickey = C)
As you can imagine, B or C can also send or receive bitcoins to or from other accounts which involves many different transactions.
What I wants to do is to find all paths with depth equaling to 4 between two accounts, e.g. A and C. I can do this by Cypher although it is slow. It takes about 20mins. My cypher is like this:
start src=node:keys(PublicKey="A"),dest=node:keys(PublicKey="C")
match p=src-->(t1)-->(r1)-->(t2)-->dest
return count(p);
However, when I try to do that using Java API, I got the OutOfMemoryError. Here is my function:
public ArrayList<Path> getPathsWithConditionsBetweenNodes(String indexName, String sfieldName, String sValue1, String sValue2,
int depth, final double threshold, String relType){
ArrayList<Path> res = null;
if (isIndexExistforNode(indexName)) {
try (Transaction tx = graphDB.beginTx()) {
IndexManager index = graphDB.index();
Index<Node> accounts = index.forNodes(indexName);
IndexHits<Node> hits = null;
hits = accounts.get(sfieldName, sValue1);
Node src = null, dest = null;
if(hits.iterator().hasNext())
src = hits.iterator().next();
hits = null;
hits = accounts.get(sfieldName, sValue2);
if(hits.iterator().hasNext())
dest = hits.iterator().next();
if(src==null || dest==null){
System.out.println("Either src or dest node is not avaialble.");
}
TraversalDescription td = graphDB.traversalDescription()
.depthFirst();
if (relType.equalsIgnoreCase("send")) {
td = td.relationships(Rels.Send, Direction.OUTGOING);
td = td.relationships(Rels.Receive, Direction.OUTGOING);
} else if (relType.equalsIgnoreCase("receive")) {
td= td.relationships(Rels.Receive,Direction.INCOMING);
td = td.relationships(Rels.Send,Direction.INCOMING);
} else {
System.out
.println("Traverse Without Type Constrain Because Unknown Relationship Type is Provided to The Function.");
}
td = td.evaluator(Evaluators.includingDepths(depth, depth))
.uniqueness(Uniqueness.RELATIONSHIP_PATH)
.evaluator(Evaluators.returnWhereEndNodeIs(dest));
td = td.evaluator(new Evaluator() {
#Override
public Evaluation evaluate(Path path) {
if (path.length() == 0) {
return Evaluation.EXCLUDE_AND_CONTINUE;
} else {
Node node = path.endNode();
if (!node.hasProperty("TBitcoin"))
return Evaluation.INCLUDE_AND_CONTINUE;
double coin = (double) node.getProperty("TBitcoin");
if (threshold!=Double.MIN_VALUE) {
if (coin<=threshold) {
return Evaluation.EXCLUDE_AND_PRUNE;
} else {
return Evaluation.INCLUDE_AND_CONTINUE;
}
} else {
return Evaluation.INCLUDE_AND_CONTINUE;
}
}
}
});
res = new ArrayList<Path>();
int i=0;
for(Path path : td.traverse(src)){
i++;
//System.out.println(path);
//res.add(path);
}
System.out.println();
tx.success();
} catch (Exception e) {
e.printStackTrace();
}
} else {
;
}
return res;
}
Can someone take a look at my function and give me some ideas why it is so slow and will cause out-of-memory error? I set Xmx=15000m while runing this program.

My $0.02 is that you shouldn't do this with java, you should do it with Cypher. But your query needs some work. Here's your basic query:
start src=node:keys(PublicKey="A"),dest=node:keys(PublicKey="C")
match p=src-->(t1)-->(r1)-->(t2)-->dest
return count(p);
There are at least two problems with this:
The intermediate r1 could be the same as your original src, or your original dest (which probably isn't what you want, you're looking for intermediaries)
You don't specify that t1 or t2 are send or receive. Meaning that you're forcing cypher to match both kinds of edges. Meaning cypher has to look through a lot more stuff to give you your answer.
Here's how to tighten your query so it should perform much better:
start src=node:keys(PublicKey="A"),dest=node:keys(PublicKey="C")
match p=src-[:send]->(t1:transaction)-[:receive]->(r1)-[:send]->(t2:transaction)-[:receive]->dest
where r1 <> src and
r1 <> dest
return count(p);
This should prune out a lot of possible edge and node traversals that you're currently doing, that you don't need to be doing.

If I have understood what you are trying to achieve and because you have a direction on your relationship I think that you can get away with something quite simple:
MATCH (src:keys{publickey:'A')-[r:SEND|RECEIVE*4]->(dest:keys{publickey:'C'})
RETURN COUNT(r)
Depending on your data set #FrobberOfBits makes a good point regarding testing equality of intermediaries which you cannot do using this approach, however with just the two transactions you are testing for cases where a Transaction source and destination are the same (r1 <> src and r1 <> dest), which may not even be valid in your model. If you were testing 3 or more transactions then things would get more interesting as you might want to exclude paths like (A)-->(T1)-->(B)-->(T2)-->(A)-->(T3)-->(C)
Shameless theft:
MATCH path=(src:keys{publickey:'A')-[r:SEND|RECEIVE*6]->(dest:keys{publickey:'C'})
WHERE ALL (n IN NODES(path)
WHERE (1=LENGTH(FILTER(m IN NODES(path)
WHERE m=n))))
RETURN COUNT(path)
Or traversal (caveat, pseudo code, never used it):
PathExpander expander = PathExapnder.forTypesAndDirections("SEND", OUTGOING, "RECEIVE", OUTGOING)
PathFinder<Path> finder = GraphAlgoFactory.allSimplePaths(expander, 6);
Iterable<Path> paths = finder.findAllPaths(src, dest);

neo4j cypher - how to find all nodes that have a relationship to list of nodes

I have nodes- named "options". "Users" choose these options. I need a chpher query that works like this:
retrieve users who had chosen all the options those are given as a list.
MATCH (option:Option)<-[:CHOSE]-(user:User) WHERE option.Key IN ['1','2','2'] Return user
This query gives me users who chose option(1), option(2) and option(3) and also gives me the user who only chose option(2).
What I need is only the users who chose all of them -option(1), option(2) and option(3).

For an all cypher solution (don't know if it's better than Chris' answer, you'll have to test and compare) you can collect the option.Key for each user and filter out those who don't have a option.Key for each value in your list
MATCH (u:User)-[:CHOSE]->(opt:Option)
WITH u, collect(opt.Key) as optKeys
WHERE ALL (v IN {values} WHERE v IN optKeys)
RETURN u
or match all the options whose keys are in your list and the users that chose them, collect those options per user and compare the size of the option collection to the size of your list (if you don't give duplicates in your list the user with an option collection of equal size has chosen all the options)
MATCH (u:User)-[:CHOSE]->(opt:Option)
WHERE opt.Key IN {values}
WITH u, collect(opt) as opts
WHERE length(opts) = length({values}) // assuming {values} don't have duplicates
RETURN u
Either should limit results to users connected with all the options whose key values are specified in {values} and you can vary the length of the collection parameter without changing the query.

If the number of options is limited, you could do:
MATCH
(user:User)-[:Chose]->(option1:Option),
(user)-[:Chose]->(option2:Option),
(user)-[:Chose]->(option3:Option)
WHERE
option1.Key = '1'
AND option2.Key = '2'
AND option3.Key = '3'
RETURN
user.Id
Which will only return the user with all 3 options.
It's a bit rubbishy as obviously you end up with 3 lines where you have 1, but I don't know how to do what you want using the IN keyword.
If you're coding against it, it's pretty simple to generate the WHERE and MATCH clause, but still - not ideal. :(
EDIT - Example
Turns out there is some string manipulation going on here (!), but you can always cache bits. Importantly - it's using Params which would allow neo4j to cache the queries and supply faster responses with each call.
public static IEnumerable<User> GetUser(IGraphClient gc)
{
var query = GenerateCypher(gc, new[] {"1", "2", "3"});
return query.Return(user => user.As<User>()).Results;
}
public static ICypherFluentQuery GenerateCypher(IGraphClient gc, string[] options)
{
ICypherFluentQuery query = new CypherFluentQuery(gc);
for(int i = 0; i < options.Length; i++)
query = query.Match(string.Format("(user:User)-[:CHOSE]->(option{0}:Option)", i));
for (int i = 0; i < options.Length; i++)
{
string paramName = string.Format("option{0}param", i);
string whereString = string.Format("option{0}.Key = {{{1}}}", i, paramName);
query = i == 0 ? query.Where(whereString) : query.AndWhere(whereString);
query = query.WithParam(paramName, options[i]);
}
return query;
}

MATCH (user:User)-[:CHOSE]->(option:Option)
WHERE option.key IN ['1', '2', '3']
WITH user, COUNT(*) AS num_options_chosen
WHERE num_options_chosen = LENGTH(['1', '2', '3'])
RETURN user.name
This will only return users that have relationships with all the Options with the given keys in the array. This assumes there are not multiple [:CHOSE] relationships between users and options. If it is possible for a user to have multiple [:CHOSE] relationships with a single option, you'll have to add some conditionals as necessary.
I tested the above query with the below dataset:
CREATE (User1:User {name:'User 1'}),
(User2:User {name:'User 2'}),
(User3:User {name:'User 3'}),
(Option1:Option {key:'1'}),
(Option2:Option {key:'2'}),
(Option3:Option {key:'3'}),
(Option4:Option {key:'4'}),
(User1)-[:CHOSE]->(Option1),
(User1)-[:CHOSE]->(Option4),
(User2)-[:CHOSE]->(Option2),
(User2)-[:CHOSE]->(Option3),
(User3)-[:CHOSE]->(Option1),
(User3)-[:CHOSE]->(Option2),
(User3)-[:CHOSE]->(Option3),
(User3)-[:CHOSE]->(Option4)
And I get only 'User 3' as the output.

For shorter lists, you can use path predicates in your WHERE clause:
MATCH (user:User)
WHERE (user)-[:CHOSE]->(:Option { Key: '1' })
AND (user)-[:CHOSE]->(:Option { Key: '2' })
AND (user)-[:CHOSE]->(:Option { Key: '3' })
RETURN user
Advantages:
Clear to read
Easy to generate for dynamic length lists
Disadvantages:
For each different length, you will have a different query that has to be parsed and cached by Cypher. Too many dynamic queries will watch your cache hit rate go through the floor, query compilation work go up, and query performance go down.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart

Longest path based on relationship property - neo4j

Related

Base on condition, create node and return value

Batching Neo4J updates for speed

breeze query where clause is not executed

Out of Memory using Java API but Cypher query can work although it is slow

neo4j cypher - how to find all nodes that have a relationship to list of nodes

Categories

Resources