Implement breadth-first search using Hama - breadth-first-search

I've done some research, and I seem to be missing one small part. I understand how a breadth-first search works, but I don't understand how to partition the nodes so that it can be computed in parallel using Hama. Is there any method to do it?
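One common way to think about the partitioning is to assign each vertex to a peer by hashing its ID and to run the search as a level-synchronous BFS, one superstep per level. The sketch below only simulates that idea in plain Python; it does not use Hama's API, and the names `partition_of` and `bsp_bfs`, the integer vertex IDs, and the modulo hash are all assumptions made for illustration.

```python
from collections import defaultdict


def partition_of(vertex, num_peers):
    """Assign a vertex to a peer by hashing its ID (integer IDs assumed)."""
    return vertex % num_peers


def bsp_bfs(adjacency, source, num_peers):
    """Level-synchronous BFS: each loop iteration plays the role of one superstep."""
    # Each simulated peer stores only the vertices that hash to it.
    local = [dict() for _ in range(num_peers)]
    for vertex, neighbours in adjacency.items():
        local[partition_of(vertex, num_peers)][vertex] = neighbours

    # A vertex is only ever written by the peer that owns it, so a single
    # dict is enough for this simulation.
    distance = {}

    # Messages are (vertex, proposed distance) pairs, grouped by target peer.
    inbox = defaultdict(list)
    inbox[partition_of(source, num_peers)].append((source, 0))

    while any(inbox.values()):
        outbox = defaultdict(list)
        for peer in range(num_peers):  # conceptually these run in parallel
            for vertex, dist in inbox[peer]:
                if vertex in distance:
                    continue  # already reached at this or an earlier level
                distance[vertex] = dist
                for neighbour in local[peer].get(vertex, []):
                    target = partition_of(neighbour, num_peers)
                    outbox[target].append((neighbour, dist + 1))
        inbox = outbox  # stands in for the barrier between supersteps
    return distance
```

Because every message sent in one superstep is delivered in the next, the first distance recorded for a vertex is its BFS level, regardless of how many peers the vertices are spread over.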

Related

why wrong to store a dictionary/map in neo4j as a property

I have a data structure in the form of a tree. So it has vertices within vertices. Neo4j would be a perfect match but alas someone has made a decision that a property can not be a dictionary/map.
I find this strange. Neo4j is all about vertices. So why not accept tree shaped data?
It would seem so intuitive.
I guess it must be for a good reason. Can it be difficult to manage updates? Or handling memory?
Does anyone know?
And does anyone know an alternative to Neo4j that can store a tree-structure? Or maybe an addon or something that handles that?
The presence of a map in your properties implies that the data structure is not fully converted to a graph. The node (:N {p: map}) implies the structure: (:N)-->(:P {map}). With the former structure you'd need to query items in the map using something like match (n:N) where n.p.k = v which I imagine would be a nightmare for indexing, etc. With the latter you can simply match (:N)-->(p:P) where p.k = v.
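The conversion described above, from a nested map to separate nodes, can be sketched in plain Python. This is only an illustration of the idea, not Neo4j code: the function name `tree_to_graph`, the `HAS` relationship type, and the output shape are assumptions.

```python
from itertools import count


def tree_to_graph(tree, label="N"):
    """Turn a nested dict into (nodes, relationships).

    Scalar values stay as properties on the node; each nested dict becomes
    its own child node, labelled with the key it was stored under.
    """
    ids = count()
    nodes, rels = [], []

    def visit(mapping, node_label):
        node_id = next(ids)
        props = {k: v for k, v in mapping.items() if not isinstance(v, dict)}
        nodes.append({"id": node_id, "label": node_label, "props": props})
        for key, value in mapping.items():
            if isinstance(value, dict):
                child_id = visit(value, key)
                # "HAS" is a hypothetical relationship type.
                rels.append((node_id, "HAS", child_id))
        return node_id

    visit(tree, label)
    return nodes, rels
```

For example, `tree_to_graph({"name": "root", "p": {"k": "v"}})` yields one node carrying `name` and a second node carrying `k`, joined by one relationship, which mirrors the (:N)-->(:P {map}) shape.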

How does path search work on Cypher and what types of filtering can be done during expansion?

I'm trying to understand Neo4j's mechanics when dealing with path searches. I studied the query patterns and execution plan operators in the developer manual, but I still have some questions.
Please correct me if I'm wrong, but from the content I read, and from some posts on Neo4j's blog, I understood that Cypher and Java traversals generally perform depth-first searches, more specifically informed searches, and that variable-length queries fit into it. I also read that shortest path planning uses a breadth-first bidirectional search, and a depth-first search as a fallback.
Is there any way to perform breadth-first searches in Neo4j other than that?
I know the APOC procedures library allows this kind of search through path expanders, but I'm limiting my scope to just the Cypher language for now.
Also, does the variable-length pattern run recursively?
And what kinds of filtering are executed during expansion? I read that functions like ALL normally are checked during expansion, but some are executed later.
The reason for these questions is to see to what extent I would be able to manipulate the data and make complex traversals using only Cypher and what already comes with Neo4j, without external libraries and without having to write procedures through the API.
Forgive me if these questions are trivial. Thanks in advance.
Because Cypher is a declarative query language, it does not control how the underlying engine searches for the pattern. By specifying a pattern we cannot specify which pattern-matching algorithm is used to find it, so we cannot force a breadth-first search using Cypher alone.
To match a variable-length pattern recursively, we can apply a Kleene star to a single edge label, for example :KNOWS*, or combine labels with Cypher's or operator, for example :KNOWS|FOLLOWS*. Essentially this means (KNOWS|FOLLOWS)* in Regular Path Query (RPQ) format, which is equivalent to (a|b)* in regular expression syntax. We can also use more edge labels, such as (a|b|c|....)*, in Cypher.
We can also specify variable-length bounds on the pattern, as in ()-[:KNOWS|FOLLOWS*1..6]->().
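To make the meaning of a bounded pattern such as [:KNOWS|FOLLOWS*1..6] concrete, here is a rough Python sketch that expands breadth-first over an edge list and keeps only the allowed edge types. It is a simplification under stated assumptions: it returns reachable end nodes rather than paths, it ignores Cypher's rule that a relationship is not reused within one path, and the function name `expand` and the triple format are made up for this example.

```python
from collections import deque


def expand(edges, start, allowed_types, min_hops=1, max_hops=6):
    """Nodes reachable from start over allowed edge types in min_hops..max_hops.

    edges is a list of (source, type, target) triples.
    """
    outgoing = {}
    for src, rel_type, dst in edges:
        if rel_type in allowed_types:
            outgoing.setdefault(src, []).append(dst)

    reached = set()
    seen = {(start, 0)}  # (node, hop count) pairs already queued
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops >= min_hops:
            reached.add(node)
        if hops == max_hops:
            continue  # upper bound of the variable-length pattern
        for nxt in outgoing.get(node, []):
            if (nxt, hops + 1) not in seen:
                seen.add((nxt, hops + 1))
                queue.append((nxt, hops + 1))
    return reached
```

Tracking (node, hop count) pairs instead of nodes alone keeps the loop finite on cyclic graphs while still honouring the lower and upper hop bounds.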
However, we cannot iterate recursively over a larger pattern, for example a subgraph in which a node has one incoming edge and at least two outgoing edges, such as (a)-[r]->(b)-[s]->(c), (b)-[v]->(d). This is an example of a conjunctive pattern.
Filtering is normally performed in the WHERE clause, based on property key-value pairs. We can set labels in the pattern expressed in the MATCH clause so that the underlying engine knows the start node, for example as done in the following query:
match (a:Person{name:'Keanu Reeves'})-[r]->(b) return * limit 5

How to implement fuzzy search

I'm using the Neo4j 3 REST API, and I have a node named customer with properties such as name. I need to search customers by name; for example, I should get results for the name "john" when my input is "joan". How can I implement fuzzy search to get the desired results?
Thanks in advance
First off, I want to make sure you know that if you're using Neo4j 3.x, it is currently in beta and isn't considered stable yet.
You have two options to implement a fuzzy search in Neo4j. You can use the legacy indexes to implement Lucene-based indexing. That should provide anything that Lucene can do, though you'd probably need to do a bit more work. You can also implement your own unmanaged extension, which will allow you to use Lucene a bit more directly.
Perhaps the easier alternative is to use Elasticsearch with Neo4j and have Elasticsearch do your full-text indexing. You might take a look at the Neo4j and ElasticSearch page on neo4j.com. There they provide a link to a GitHub repository for a Neo4j plugin that automagically updates Elasticsearch with data from Neo4j and provides an endpoint for querying your graph fuzzily. There is also a video tutorial on how to do this.
Try the Soundex search described at https://neo4j.com/developer/kb/how-to-perform-a-soundex-search/, which will work in this case. With an ordinary search, the input Joan will not return John, unless you give just jo as input, in which case you will get both. To get what you are expecting, you will have to use the Soundex search.
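To see why Soundex matches these two names, here is a small Python sketch of the standard American Soundex code. It is independent of how the linked article sets things up in Neo4j; the function name `soundex` is just for this example.

```python
# Letter groups of American Soundex; vowels, h, w and y have no code.
CODES = {
    letter: digit
    for digit, letters in {
        "1": "bfpv", "2": "cgjkqsxz", "3": "dt",
        "4": "l", "5": "mn", "6": "r",
    }.items()
    for letter in letters
}


def soundex(name):
    """Return the four-character Soundex code of name ('' if it has no letters)."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        if code and code != prev:
            digits.append(code)
        if c not in "hw":  # h and w do not separate repeated codes; vowels do
            prev = code
        # else: keep prev so that e.g. "s-h-c" collapses to a single 2
    return (first.upper() + "".join(digits) + "000")[:4]
```

Both "john" and "joan" reduce to J500, because the vowels and the h carry no code and only the n contributes a digit, so a lookup on the Soundex code treats them as the same name.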
Stepping back a little, what is the problem you are trying to solve with fuzzy matching?
My experience has been that misspellings and typos are far less common than you might think, and humans prefer exact matches whenever possible. If there is no exact match (often just missing a space between words), that's a good time to use a spellchecker, and that's where the fuzzy matching should kick in.
In addition, your example would match "joan" to "john", but some synonyms like "joanie" would be more useful. If you have a big corpus of content to work with, you may be able to extract some relationships, using fuzzy & machine learning to identify "joanne" and "joni" as possible synonyms and then submit that to a human curator. "Jon" looks like a related name but it's not, while "jo" and even "nonie" may or may not be nicknames in these groupings.
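The "exact match first, fuzzy only as a fallback" idea can be sketched with Python's standard difflib module. This is a rough illustration, not a recommendation of a particular similarity measure; the function name `search_names` and the cutoff value are assumptions.

```python
import difflib


def search_names(query, names, cutoff=0.6):
    """Return exact (case-insensitive) matches, else close matches."""
    wanted = query.strip().lower()
    exact = [name for name in names if name.lower() == wanted]
    if exact:
        return exact  # humans prefer exact matches, so stop here

    # Fall back to fuzzy matching only when nothing matched exactly.
    lowered = {}
    for name in names:
        lowered.setdefault(name.lower(), name)
    close = difflib.get_close_matches(wanted, list(lowered), n=5, cutoff=cutoff)
    return [lowered[key] for key in close]
```

With this ordering, a query for "john" never returns "joan", while a typo such as "jhon" still finds "John".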

How to programmatically add constraints to Neo4J Cypher queries

I am writing a server plugin for Neo4J. The plugin receives a Cypher query and executes it. Currently, my implementation uses a CypherExecutor.
I now need to further constrain the results. (For example, imagine that the results need to be filtered by ACLs.)
One approach is to filter the results after executing the query. I'd rather not do this, for performance reasons as well as other limitations (for example, any aggregate results would be wrong.)
I considered adding the constraints to the query itself. I've looked at the command.AbstractQuery subclasses produced using the CypherParser. That object model is immutable.
I am wondering whether I will need to resort to cloning Neo4J's ExecutionEngine and CypherCompiler, just to extend the ExecutionPlanBuilder... I would like to avoid this option if at all possible.
Any recommendations about how this can be done?
In my case, I am simply trying to simulate multiple isolated graphs. I am OK with how this might be modeled -- whether I add a 'tenantId' to each node, or maintain a tenant node and add (:Tenant)<-[:scopedTo]-(n) relationships to every node.

Finding distinct node groupings based on mutual relations in neo4j

Is there a query for a Neo4J graph that could traverse said graph and find nodes based on mutual relationships? For example, if Node A is related to Node B (bidirectionally), B is related to C, C is related to D, D is related to A, A is related to C, and B is related to D, such that there is a subgraph in which every node is connected to every other node, is there an efficient way to return that subgraph or group of Nodes?
I realize my explanation is poor, so I provide an example graph in the console: http://console.neo4j.org/r/qb2xmp
Here, I have created a graph, and I would like to return groups of 3 or more nodes that are all mutually related. In this case, I would ideally like to return the group of Scott, Josh, Frank, and Ben, as well as the group of Frank, Ben, and Eric. If possible, I would like to be able to identify who composes those individual groups.
This is an instance of the Clique Problem and is NP-Complete.
Here is a related question on SO with a good explanation!
Sorry I hit enter too soon. So there is no "efficient" way to do this. It is not unattemptable though in certain cases, and you will have the best luck looking for an algorithm that solves this general problem and implementing it in Neo4J.
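As an example of such a general algorithm, here is a minimal Python sketch of Bron-Kerbosch (without pivoting), which enumerates maximal cliques. It runs outside Neo4j on an in-memory adjacency map, its worst case is exponential, and the helper names `undirected` and `maximal_cliques` are made up for this example.

```python
def undirected(edges):
    """Build a node -> set-of-neighbours map from (a, b) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def maximal_cliques(adjacency, min_size=3):
    """Bron-Kerbosch without pivoting; returns sorted maximal cliques."""
    found = []

    def grow(clique, candidates, excluded):
        if not candidates and not excluded:
            # Nothing can extend the clique, so it is maximal.
            if len(clique) >= min_size:
                found.append(sorted(clique))
            return
        for node in list(candidates):
            grow(clique | {node},
                 candidates & adjacency[node],
                 excluded & adjacency[node])
            # node is done: never try it again in this branch.
            candidates = candidates - {node}
            excluded = excluded | {node}

    grow(set(), set(adjacency), set())
    return sorted(found)
```

Assuming edges that match the description in the question (Scott, Josh, Frank, and Ben all mutually related, and Eric related to Frank and Ben), the function returns exactly the two groups asked for, each listed with its members.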
Did you find any solutions for this?
I implemented something like this on my project for text network visualization but it launches Gephi Toolkit (on Java) to perform some metric calculations on the graph, detect communities, etc. But that's too heavy...
You might be interested to look into Gephi's algorithms though, especially the Force Atlas layout implemented in Sigma.Js and especially the modularity algorithm used in Gephi itself. This might give you some clues as to how to proceed...

Resources