I understand there are many advantages to using Spring Data Neo4j's advanced mapping rather than the simple mapping.
My question is: what are the cons of using advanced mapping instead of simple mapping?
I feel that there are almost no drawbacks to using the advanced mode. The only thing that has been bugging me is the relatively poor IDE support for AspectJ; it was initially hell to configure and get right. Apart from that, our application is a lot faster with the advanced mapping mode, so we never looked back.
According to Q5 in this post, simple mapping is favoured if you are talking to Neo4j via REST, while advanced mapping is favoured if you are using Neo4j embedded.
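For context, a simple-mapping entity is just an annotated POJO; the class and field names below are illustrative, and the annotations are from the Spring Data Neo4j 2.x/3.x line. Advanced mapping uses the same annotations but relies on AspectJ weaving so that field access reads and writes the underlying node directly.

```java
import java.util.Set;

import org.springframework.data.neo4j.annotation.GraphId;
import org.springframework.data.neo4j.annotation.NodeEntity;
import org.springframework.data.neo4j.annotation.RelatedTo;

// Simple mapping: a plain annotated POJO whose state is copied to and from
// the graph when entities are loaded and saved. Advanced mapping uses the
// same annotations, but AspectJ weaves the class so that field reads and
// writes go straight through to the underlying node.
@NodeEntity
public class Person {

    @GraphId
    private Long id;             // node id assigned by Neo4j

    private String name;         // stored as a node property

    @RelatedTo(type = "KNOWS")
    private Set<Person> friends; // mapped to KNOWS relationships

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public Set<Person> getFriends() { return friends; }
}
```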
My goal is to build an automated knowledge graph. I have decided to use Neo4j as my database. I intend to load a JSON file from my local directory into Neo4j. The data I will be using is the Yelp dataset (the JSON files are quite large).
I have seen some Neo4j examples with GraphAware and OpenNLP. I read that Neo4j has good support for Java apps. I have also read that Neo4j supports Python (I intend to use NLTK). Is it advisable to use Neo4j with Java (Maven/Gradle) and OpenNLP? Or should I use it with py2neo and NLTK?
I am really sorry that I don't have any prior experience with these tools. Any advice or recommendation will be greatly appreciated. Thank you so much!
Welcome to Stack Overflow! Unfortunately, this question asks for suggestions/opinions, so it isn't really appropriate for this forum.
However, this is an area I have worked in, so I can confidently say that Java (or Kotlin) is the best way to go for Neo4j. The reason is that it is Neo4j's native language, and there is significantly more community support and far more libraries available.
However, NLTK is much more powerful than OpenNLP. So, if your use case is simple enough for OpenNLP, then a purely Java/Kotlin approach is perfect. Alternatively, you can use Java as an interfacing layer for the stored graph, but use Python with NLTK for the language work feeding into the graph. This would, of course, dramatically increase the complexity of your project.
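If you do go with Java as the interfacing layer, a rough sketch of loading one of the Yelp files might look like the following. It assumes the official neo4j-java-driver (4.x), a local instance at bolt://localhost:7687, and that the business file is newline-delimited JSON with business_id and name fields; the URI, credentials, and file name are placeholders to adapt to your setup.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Session;
import org.neo4j.driver.Values;

public class YelpBusinessLoader {

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();

        // URI, credentials, and file path are placeholders; adjust to your setup.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                                                  AuthTokens.basic("neo4j", "password"));
             Session session = driver.session();
             Stream<String> lines = Files.lines(Paths.get("yelp_academic_dataset_business.json"))) {

            // The Yelp dump is newline-delimited JSON: one business object per line.
            lines.forEach(line -> {
                try {
                    JsonNode business = mapper.readTree(line);
                    session.run("MERGE (b:Business {id: $id}) SET b.name = $name",
                                Values.parameters(
                                    "id", business.get("business_id").asText(),
                                    "name", business.get("name").asText()));
                } catch (Exception e) {
                    // Skip malformed lines rather than aborting the whole load.
                }
            });
        }
    }
}
```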
Ultimately, the best approach depends on your exact use-case and which trade-offs make the most sense for you.
I'm working on a project (a social network) which uses Neo4j (v1.9) as the underlying datastore, together with Spring Data Neo4j.
I'm trying to add a tag system to the project and I'm searching for ways to efficiently implement tag recommendation using collaborative filtering strategies.
After a lot of research, I've come up with these options:
Cypher. It is Neo4j's built-in query language. No other framework is needed, and the computation times may well be better than with the other options. I could probably implement the queries easily using Spring Data Neo4j.
Apache Mahout. It offers machine learning algorithms focused primarily on collaborative filtering, clustering and classification. However, it isn't designed for graph databases and could potentially be slow.
Apache Giraph. Open source counterpart of Google Pregel.
Apache Spark. It is a fast and general engine for large-scale data processing.
reco4j. It looks like the best-suited solution so far, but the project seems dead.
Apache Spark GraphX + Mazerunner. Suggested in the answer by #johnymontana. I'm reading up on it; the main issue is that I don't know whether it supports collaborative filtering.
GraphAware Reco. Suggested by #ChristopheWillemsen in a comment. From the official site:
is an extensible high-performance recommendation engine skeleton for
Neo4j, allowing for computing and serving real-time as well as
pre-computed recommendations.
However, I haven't yet figured out whether it works with older versions of Neo4j (I can't upgrade the Neo4j version at the moment).
So, what do you suggest and why? Feel free to suggest other interesting frameworks not listed above.
Cypher is very fast when it comes to local traversals, but is not optimized for global graph operations. If you want to do something like compute similarity metrics between all pairs of users then using a graph processing framework (like Apache Spark GraphX) would be better. There is a project called Mazerunner that connects Neo4j and Spark that you might want to take a look at.
For a pure Cypher approach, here and here are a couple of recent blog posts demonstrating Cypher queries for recommendations.
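If you end up taking the pure-Cypher route through Spring Data Neo4j, a repository method along these lines sketches a simple item-based collaborative filter ("users who use the same tags as this user also use these other tags"). The Tag entity, the USED relationship type, and the query itself are illustrative assumptions, written in the 1.9-era START syntax:

```java
import org.springframework.data.neo4j.annotation.Query;
import org.springframework.data.neo4j.repository.GraphRepository;

public interface TagRepository extends GraphRepository<Tag> {

    // Item-based collaborative filtering in plain Cypher: find tags used by
    // users who share tags with the given user, rank them by how often they
    // co-occur, and exclude tags the user already uses.
    @Query("START user=node({0}) " +
           "MATCH user-[:USED]->t<-[:USED]-other-[:USED]->rec " +
           "WHERE NOT(user-[:USED]->rec) " +
           "WITH rec, count(*) AS freq " +
           "ORDER BY freq DESC LIMIT 10 " +
           "RETURN rec")
    Iterable<Tag> recommendTagsFor(Long userId);
}
```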
We need to write some RESTful services to access and create data in Neo4j. I have found many examples using the Traverser framework, but I would like to explore the Java Core API, since it is mentioned that the performance of the Java Core API is far better than that of the Traverser, as per this link.
Is it true that the Java Core API is better than the Traverser? Can someone point me to useful tutorials on the Java Core API for Neo4j?
Consider asking a different question here.
I don't dispute the performance finding that the traverser API is slower than the core API, but keep in mind that it's only for the kinds of things they were trying to do in that test.
Which API you should use depends on what you're trying to do. Without providing information on that, we can't suggest which will be the fastest for you.
Here are your trade-off options: if you use the core API, then you can perform exactly the low-level operations on the graph that you want. On the flip side, you have to do all of the work yourself. If the operations you're trying to do are complex, far-reaching, or order-sensitive, you'll find yourself writing so much code that you'll end up re-implementing a buggy version of the Traversal API on your own. Avoid this at all costs! The performance of the Traversal API is almost certainly better than what you'll write on your own.
On the other hand, if the operations you're performing are very simple (look up a node, grab its immediate neighbors by some relationship type, then return them), then the core API is an excellent choice. In this (very simple) case, you don't need all the bells and whistles that the Traversal API gives you.
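As a rough illustration of that simple case with the core API (this uses the 3.x-era embedded API; the relationship type and property name are made up):

```java
import java.util.ArrayList;
import java.util.List;

import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.graphdb.Transaction;

public class FriendLookup {

    // Illustrative relationship type; use whatever your domain defines.
    private enum RelTypes implements RelationshipType { KNOWS }

    // The "very simple case": look up a node by id, walk one hop over KNOWS
    // relationships, and return the neighbors' names.
    public static List<String> friendNames(GraphDatabaseService db, long personId) {
        List<String> names = new ArrayList<>();
        try (Transaction tx = db.beginTx()) {
            Node person = db.getNodeById(personId);
            for (Relationship r : person.getRelationships(RelTypes.KNOWS, Direction.OUTGOING)) {
                names.add((String) r.getOtherNode(person).getProperty("name"));
            }
            tx.success();
        }
        return names;
    }
}
```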
Bigger than just your question though: in general it's good to avoid "premature optimization". If a library or framework gives you a technique like the Traversal API, as a starting point it's a good bet to learn that abstraction and use it, because the developers gave it to you to make your life easier, not to make your code slower. If it turns out to be more than you need, or performance is really lagging -- then consider using the core API.
In the long run, if you're going to write RESTful services on top of Neo4j, you'll probably end up knowing both APIs. Bottom line: it's not a matter of choosing which one you should use, it's a matter of understanding their differences and which situations play to their strengths.
I'm in two minds at the moment about adopting a persistence framework for CRUD operations (against MySQL) with JSF 2. I've googled and read comments from pro-JPA and pro-JDO groups, but I still can't decide which to adopt.
Are there any good step-by-step tutorials (similar to those provided at balusc.blogspot.com) for JPA implementations (Hibernate, etc.) and JDO implementations (DataNucleus, etc.) available online? Perhaps going through these examples might help me understand a bit more about the two Java specifications and decide which to adopt in the end.
I'm a newbie when it comes to JPA and JDO implementations, so please be kind in your responses :-)
P.S: Please no reference to roseindia.net. Sorry, that site is just too crappy for me!...
Here is a link to a Maven archetype that should get you up to speed really fast.
http://www.andygibson.net/blog/projects/knappsack/
I suggest trying to use this one:
jee6-servlet-minimal-archetype
It gets you CDI + JSF 2.0 + JPA.
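For the JPA side specifically, a minimal entity plus a CRUD round trip looks roughly like this; the Book entity and the "examplePU" persistence unit are made-up names, and you would still need a persistence.xml pointing at your MySQL database:

```java
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Persistence;

@Entity
public class Book {

    @Id
    @GeneratedValue
    private Long id;         // primary key generated by the provider

    private String title;    // mapped to a "title" column by default

    protected Book() { }                              // JPA needs a no-arg constructor
    public Book(String title) { this.title = title; }

    public Long getId() { return id; }
    public String getTitle() { return title; }

    // Minimal CRUD round trip against a persistence unit named "examplePU".
    public static void main(String[] args) {
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("examplePU");
        EntityManager em = emf.createEntityManager();

        em.getTransaction().begin();
        Book book = new Book("Domain-Driven Design");
        em.persist(book);                             // create
        em.getTransaction().commit();

        Book found = em.find(Book.class, book.getId()); // read
        System.out.println(found.getTitle());

        em.close();
        emf.close();
    }
}
```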
The question is in the title.
It doesn't favor one over the other at all. It's just common to use LINQ to SQL in examples because it is simpler to set up and deploy, so it's easier to digest the sample code without getting distracted by something which deserves its own learning path.
I agree that it does not favor one over the other. I always assumed that LINQ to SQL tended to be used in examples because it was released about a year earlier; therefore, book writers were more familiar with LINQ to SQL and/or felt it was more stable.
I agree with Rex in that it makes more sense, when giving a tutorial about ASP.NET MVC, to keep other technology decisions simple. Since either DAL implementation can be used, it is easiest to teach MVC by using LINQ to SQL (the simpler of the two). LINQ to SQL is also widely considered to be more lightweight.
I must admit, it would be nice to have more open-source examples of projects using ASP.NET MVC along with Entity Framework. I can tell you that it works fine, because I am using it on one project. However, it can be a bit more difficult to figure out some of the idiosyncrasies. Here is another question that shows some links to examples.
I think this tendency to take the path of least resistance in examples is a disservice to new developers. How many times have you seen an example with the caveat that it is not production-worthy code, but with no explanation of why it is not appropriate, or any good direction on how to find what is best? Personally, I find longer examples that actually lead me to discover how something should be used to be more helpful.
In this particular case, using LINQ to Entities would be much more useful, as it is seemingly the future.
In my opinion, it doesn't favor it. It's just what you see in most examples because LINQ to SQL is the fastest way to get examples up and running. Rails follows the same convention: many examples use features (scaffolding, for example) that you would rarely see used on a production site.
As all the other posters have said - L2S samples are just a lot easier to put together, hence you'll see them quoted more. In reality, your MVC models may not use L2S directly - they could be hooking up to a separate services tier or to data transfer objects exposed by another system entirely.