I am looking to build a relatively complex Neo4j application, which I intend to split into two separate projects: a frontend and a backend. The frontend will be HTML5 and is not relevant to this question; the backend will expose a REST interface with Jersey, but it's the structure behind that REST interface that I have questions about.
At the moment, this is how I envision it:
RESTimpl <-DATA-> Service <-DTO-> Repository <-NODE-> DAO <--> Neo4j Singleton
The general flow would be that the RESTimpl receives JSON and converts it to simple Java objects like Strings, ints, and so on. Those are then passed to the service, which creates a DTO from them. That DTO is passed to the repository, which performs all the DAO calls needed to write it to the database (one DTO may require several nodes and relationships to be created). For the DAO I was thinking of creating both a Core API and a Cypher implementation, exposing just very basic graph functions like creating a node, creating a relationship, deleting a node, and so on. Basically, methods that are useful for all repositories. The Neo4j singleton would then contain my GraphDatabaseService instance and some configuration.
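As a minimal sketch, such a shared DAO contract might look like this (the interface and method names are illustrative, not from the question; a Core API class and a Cypher class could each implement it):

    import java.util.Map;

    // Hypothetical low-level graph DAO shared by all repositories.
    // One implementation could use the Neo4j Core API, another Cypher.
    public interface GraphDao {
        long createNode(Map<String, Object> properties);
        long createRelationship(long fromNodeId, long toNodeId,
                                String type, Map<String, Object> properties);
        void deleteNode(long nodeId);
    }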
This is a relatively complex structure, but I want the project to be very modular, which makes it easy to do dependency injection (everything will be written against an interface as well).
However, all the examples on the internet use a different implementation: they make their DTOs a wrapper around the Neo4j node, or at least store the underlying node in the DTO. That does allow for a simpler REST-Service-DAO structure, but it doesn't let me swap out the repository implementations and put a different database behind the application.
What would be the "most correct way" to do what I want to do?
I use exactly what you've described above and I find that it works very well (one project in production, one almost there) and does not mix concerns. I don't use Spring Data, but it is an option to consider.
I have defined my domain objects as standard POJOs, with no Neo4j stuff in them at all. Their persistence is managed by DAOs, which contain mostly Cypher queries and, in some cases, some Core API work depending on complexity.
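As a sketch of that separation (the class names and query are illustrative, and it assumes the Neo4j 2.0-era ExecutionEngine for Cypher):

    import java.util.Collections;
    import org.neo4j.cypher.javacompat.ExecutionEngine;

    // Plain domain object: no Neo4j types anywhere.
    public class Person {
        private final String name;
        public Person(String name) { this.name = name; }
        public String getName() { return name; }
    }

    // Persistence lives entirely in the DAO, expressed as Cypher.
    class PersonDao {
        private final ExecutionEngine engine; // injected

        PersonDao(ExecutionEngine engine) { this.engine = engine; }

        void save(Person person) {
            engine.execute("CREATE (p:Person {name: {name}})",
                    Collections.<String, Object>singletonMap("name", person.getName()));
        }
    }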
The GraphDatabase is injected (I have two contexts: the embedded graph implementation is injected for production, and the impermanent graph for tests).
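A minimal sketch of the two contexts, assuming the Neo4j 2.x factory classes (the actual wiring depends on your DI container):

    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.factory.GraphDatabaseFactory;
    import org.neo4j.test.TestGraphDatabaseFactory;

    public class GraphContexts {
        // Production context: a persistent embedded database on disk.
        static GraphDatabaseService production() {
            return new GraphDatabaseFactory().newEmbeddedDatabase("data/graph.db");
        }

        // Test context: an in-memory database that vanishes on shutdown.
        static GraphDatabaseService test() {
            return new TestGraphDatabaseFactory().newImpermanentDatabase();
        }
    }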
A REST service is exposed using Jersey, which deals with domain objects and/or DTOs.
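For example, a resource class along these lines (PersonService and PersonDto are assumed names, shown only to illustrate that the REST layer never sees Neo4j types):

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;

    // Hypothetical service-layer types, assumed for the sketch.
    interface PersonService { PersonDto findByName(String name); }
    class PersonDto {
        public final String name;
        PersonDto(String name) { this.name = name; }
    }

    @Path("/persons")
    public class PersonResource {
        private final PersonService service; // injected

        public PersonResource(PersonService service) { this.service = service; }

        @GET
        @Path("/{name}")
        @Produces(MediaType.APPLICATION_JSON)
        public PersonDto get(@PathParam("name") String name) {
            return service.findByName(name);
        }
    }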
So Neo4j code is only seen in the persistence layer.
I've got a couple of general convenience methods exposed to index and fetch by index, etc.
I wouldn't go the wrap-Node way. I tried it, but found that it brings along its own set of problems and results in a somewhat smelly design. Looks like you're on the right track (to me at least).
Is it possible to use Grails to provide the controllers and views, Neo4j as the database, and (self-written) domain classes that wrap the database access and CRUD operations, all without the Neo4j plugin?
The data I have (~10^6 nodes, ~10^7 relationships) is very well suited to being modeled by a graph database. Both the nodes and the relationships need labels and properties so they can be accessed through traversal methods that only follow certain paths in the graph. I want to use Grails for the web interface because I only started learning programming a few weeks ago and it seems like a good place to begin.
From what I understand so far, the Grails Neo4j plugin does not make it possible to give relationships properties and labels. It seems very appealing, and easy, to write the classes that relate to the data using the plain Neo4j Java API.
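For reference, setting a property on a relationship with the plain Neo4j Java API (2.0-era) is straightforward; this sketch is illustrative:

    import org.neo4j.graphdb.*;

    public class FriendshipExample {
        // Creates two labelled nodes and a relationship carrying a property,
        // which is the part the question says the plugin couldn't do.
        public static void link(GraphDatabaseService db) {
            try (Transaction tx = db.beginTx()) {
                Node alice = db.createNode(DynamicLabel.label("Person"));
                Node bob = db.createNode(DynamicLabel.label("Person"));
                Relationship knows = alice.createRelationshipTo(
                        bob, DynamicRelationshipType.withName("KNOWS"));
                knows.setProperty("since", 2013); // property on the relationship
                tx.success();
            }
        }
    }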
Additionally, if my database is already structured in a way that directly maps to objects, what is the benefit of using an ORM (or object-graph mapping, in this case)?
Unless you require Grails scaffolding or depend on domain classes in Grails, you can go without the GORM plugin and do the dirty work on your own.
Add the Neo4j jar dependencies to your BuildConfig.groovy and expose the GraphDatabaseService (and optionally the ExecutionEngine) to your application context; see http://grails.org/doc/latest/guide/spring.html#springdslAdditional.
In the near future there will be a 2.0 version of the Neo4j GORM plugin that uses labels and relies solely on Cypher. Relationship properties are high on the list after this release.
In Grails there is a plug-in, compile ":dto:0.2.4", for transferring domain objects to DTOs. When using that plug-in, the DTOs are created as Java classes.
For example, if there is a domain class like Person.groovy, the DTO is created as PersonDTO.java.
What is the intention of this kind of behavior? Any comment would be appreciated.
Peter Ledbrook answers your question in this blog post:
Despite that, DTOs still persist (pardon the pun). When you want to serialise data over RPC, they’re often one of the few options available to you. GWT-RPC is a case in point, and the reason for the Grails DTO plugin. Gilead allows you to transparently serialise Hibernate domain instances, but this only works if the domain class can be loaded by the client. Since GORM domain classes are typically Groovy, that’s not an option with GWT. Your typical Grails domain class also includes a bunch of stuff that the client is hardly going to be interested in, like the custom mappings.
So basically, it can be a lightweight version of your domain class, containing only the data that your client needs.
That's not the case in Grails, where domain classes have static methods for database queries, but if you have a DAO class, the DTO pattern can be used to ensure that your client cannot call the methods that touch the database. This is good for preventing inappropriate use of these objects in your presentation layer.
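As an illustration, a generated DTO is conceptually just a data holder (the fields here are invented, following the PersonDTO naming from the question):

    // Pure data holder handed to the presentation layer.
    public class PersonDTO {
        private final String name;
        private final String email; // illustrative field, not from the question

        public PersonDTO(String name, String email) {
            this.name = name;
            this.email = email;
        }

        public String getName()  { return name; }
        public String getEmail() { return email; }
        // No save()/delete()/finder methods: the client only gets data,
        // so it cannot touch the database through this object.
    }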
I'm getting started on a new MVC project where there are some peculiar rules and a little bit of strangeness, and it has me puzzled. Specifically, I have access to a database containing all of my data, but it has to be handled entirely through an external web service. Don't ask me why; I don't understand the reasons. That's just how it is.
So the CRUD will be handled via this API. I'm planning on creating a service layer that will wrap up all the calls, but I'm having trouble wrapping my head around the model. To create my model-based domain objects (customers, orders, and so on), should I:
Create them all manually
Create a dummy database and point an ORM at it
Point an ORM at the existing database but ignore the ORM's persistence and use the API instead.
I feel like I've got all the information I need to build this out, but I'm getting caught up with the API. Any pointers or advice would be greatly appreciated.
Depending on the scale of what you're doing, option 3 is dangerous because you're assuming the database model is the same as the one exposed by the external service. Options 1 and 2 aren't, IMHO, much different from each other: in either case you'll have to decide what your objects, properties and behaviours are going to be; it just boils down to whether you're more comfortable doing it in classes or database tables.
The key thing is to make sure that the external service calls are hidden behind some sort of wrapper. Personally, I'd then put a repository on top of that to handle querying the external service wrapper and returning domain objects.
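A rough sketch of that shape, as a Java analogue (all names here are invented; the original question is .NET, but the layering is the same):

    import java.util.List;

    // Hypothetical wrapper: the only code that knows about the external API.
    interface OrderServiceClient {
        List<OrderData> fetchOrders(String customerId); // raw API payloads
    }

    record OrderData(String id, double total) {} // shape of the API payload
    record Order(String id, double total) {}     // your domain object

    // The repository queries the wrapper and returns domain objects.
    class OrderRepository {
        private final OrderServiceClient client;

        OrderRepository(OrderServiceClient client) { this.client = client; }

        List<Order> ordersFor(String customerId) {
            return client.fetchOrders(customerId).stream()
                    .map(d -> new Order(d.id(), d.total()))
                    .toList();
        }
    }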
In general, ORMs are not known for their ability to generate clean domain model classes. ORMs are known for creating data layers, which you don't appear to need in this case.
You could probably use a code generation tool like T4 to generate a first pass at your domain model classes, based on either the web service or the database, if that would save you time. Otherwise, you would probably just create the domain objects manually. Even if you generate a first pass, it's unlikely there is a clean one-to-one mapping to your domain objects from either the database or the web service, so you will likely need to spend significant time manually editing the generated classes anyway.
I am confused about the limits of what gets defined in repositories and what to leave to services. Should the repository only create simple entities matching tables in the database, or can it create complex custom objects that combine those entities?
In other words: should services be making various LINQ to SQL queries against the repository, or should all the queries be predefined in the repository, with the business logic simply deciding which method to call?
You've actually raised a question here that's currently generating a lot of discussion in the developer community - see the follow-up comments to Should my repository expose IQueryable?
The repository can - and should - create complex combination objects containing multiple associated entities. In domain-driven design, these are called aggregates - collections of associated objects organized into some cohesive structure. Your code doesn't have to call GetCustomer(), GetOrdersForCustomer(), GetInvoicesForCustomer() separately - you just call myCustomerRepository.Load(customerId), and you get back a deep customer object with those properties already instantiated. I should also add that if you're returning individual objects based on specific database tables, then that's a perfectly valid approach, but it's not really a repository per se - it's just a data-access layer.
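In code, the aggregate idea boils down to something like this Java analogue (types invented for illustration):

    import java.util.List;

    record Order(String id) {}
    record Invoice(String id) {}
    record Customer(String id, List<Order> orders, List<Invoice> invoices) {}

    interface CustomerRepository {
        // One call returns the customer with orders and invoices already
        // populated, instead of three separate Get...ForCustomer() calls.
        Customer load(String customerId);
    }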
On one hand, there is a compelling argument that LINQ to SQL objects, with their 'smart' properties and their deferred execution (i.e. not loading Customer.Orders until you actually use it), are a completely valid implementation of the repository pattern, because you're not actually running database code, you're running LINQ statements (which are then translated into DB code by the underlying LINQ provider).
On the other hand, as Matt Briggs' post points out, L2S is fairly tightly coupled to your database structure (one class per table) and has limitations (no many-to-many mappings, for example), so you may be better off using L2S for data access within your repository, but then mapping the L2S objects onto your own domain model objects and returning those.
A repository should be the only piece of your application that knows anything about your data access technology. So it should not be returning objects generated by L2S at all, but rather map those properties onto model objects of your own.
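That mapping step might look like this Java analogue (the entity and domain types are invented; the point is that nothing tool-generated escapes the repository):

    // What the data-access tool hands back (generated, table-shaped).
    class CustomerEntity {
        String id;
        String name;
    }

    // Your own domain model, free of any persistence baggage.
    record Customer(String id, String name) {}

    class CustomerMapper {
        static Customer toDomain(CustomerEntity e) {
            return new Customer(e.id, e.name);
        }
    }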
If you are using this sort of pattern, you may want to rethink L2S. It generates a data access layer for you, but doesn't really handle the impedance mismatch; you have to do that manually. If you look at something like NHibernate, that mapping is done in a more robust fashion. L2S is more for a two-tier application where you want a quick and dirty DAL that you can extend easily.
If you're using LINQ, then my belief is that the repository should be a container for your LINQ syntax. This gives you a level of abstraction between the database access routines and your model object interfacing. As Dylan mentions above, there are other views on the matter; some people like to return the IQueryable so they can continue to query the database at a later point, after the repository. There is nothing wrong with either of these approaches, as long as you're clear about the standards for your application. There is more information on the best practices I use for LINQ repositories here.
I have been looking at the MVC Storefront and see that IQueryable is returned from the repository classes. I'm wondering: if you are not using LINQ, does it make sense to return that object? In the case of LINQ it makes sense because of deferred execution, so adding filtering in the service layer works; but if you don't use LINQ, you would want to filter in the DB in many cases. In that case, would I just add methods that do the filtering to the repository? And if I do, is the service layer really useful?
Arguments can be made either way, see this recent blog posting: Should my repository expose IQueryable?
The IQueryable stuff that Rob Conery put into the MVC Storefront is innovative, but by no means the norm when it comes to creating repositories. Usually, a repository is responsible for mapping your domain to and from the database. Returning IQueryable doesn't really perform any mapping and relies on the service layer to do that. This has its advantages and disadvantages, but suffice it to say that it's not the only way to do it.
You will notice, however, that your services end up becoming a little smelly because of all the duplicated code. For instance, if you just want to get a list of all the users in your database, you'd have to define that function in both the repository and the service layer. Where the service layer shines, however, is when multiple transactions to/from the database are needed for one operation.
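The duplication looks roughly like this (a Java analogue with invented names):

    import java.util.List;

    record User(String name) {}

    interface UserRepository {
        List<User> getAllUsers();  // defined here...
    }

    class UserService {
        private final UserRepository repo;
        UserService(UserRepository repo) { this.repo = repo; }

        List<User> getAllUsers() { // ...and repeated here, adding nothing
            return repo.getAllUsers();
        }
    }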
The issue I have with exposing IQueryable to the service layer is that if you ever wanted to put the repository layer behind a web service without breaking the service layer code, you couldn't; well, not without using ADO.NET Data Services, but then all your repository code would essentially become redundant.
Whilst I think it can be pretty productive for small apps, when you start looking at scaling and distribution it does more harm than good.