Can someone tell me when the expand command in breeze will be available in combination with MongoDB?
Kind regards
Dominik
The EntityQuery 'expand' function is not likely to be implemented for MongoDB, because 'expand' conceptually requires a 'join', which is a feature that Mongo does not implement.
However, the idea within MongoDB is that an object's children (or relations, if you are coming from a relational background) are actually stored and returned with the parent. From a Breeze perspective this means that we treat all of these related child objects as complex objects that are automatically returned when you query the parent. In other words, all of the "expands" that you are likely to want are automatically part of the results of your queries.
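For example (a minimal sketch; the "Orders" collection, its embedded lineItems array, and the service endpoint are hypothetical, not from the original post):

import { EntityManager, EntityQuery } from "breeze-client";

// In MongoDB the line items live inside each order document, e.g.
// { _id: ..., customerName: "Acme", lineItems: [{ product: "Widget", qty: 2 }] }
const manager = new EntityManager("api/orders"); // hypothetical Node/Mongo endpoint

// No .expand() needed: the embedded lineItems come back with each order
// and are surfaced by Breeze as complex-type properties of the Order entity.
const query = EntityQuery.from("Orders").where("customerName", "eq", "Acme");

manager.executeQuery(query).then(data => {
  const order: any = data.results[0];
  console.log(order.lineItems); // already populated, no extra round trip
});

The embedded array is the Mongo-native equivalent of what 'expand' would otherwise have fetched.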
The only problem occurs when you actually try to use MongoDB in a relational manner, i.e. where you store the ID of an object in one collection as a property of an object in another collection. From a MongoDB (and Breeze) perspective this would mean you would need to perform another query to get this related data.
We did think about translating Breeze 'expand's into a series of nested queries, but that really goes against the MongoDB mindset, and the performance of such queries can be terrible... and we weren't sure it would be that useful or desirable to the majority of MongoDB developers.
In general, if this occurs a lot in your data, then MongoDB is probably not the right database to use, because you will end up manually "joining" your data, which is a very tedious process in Mongo. This is one of the cases where a relational database really is a better choice.
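For concreteness, the manual "join" this refers to looks something like the following (a sketch using the Node.js MongoDB driver; the database, collection, and field names are made up for illustration):

import { MongoClient, ObjectId } from "mongodb";

// Reference-style design: orders store only a customerId, so the related
// customer has to be fetched with a second query.
async function loadOrderWithCustomer(orderId: string) {
  const client = await MongoClient.connect("mongodb://localhost:27017");
  const db = client.db("shop");

  const order = await db.collection("orders").findOne({ _id: new ObjectId(orderId) });
  // The extra round trip the answer refers to:
  const customer = order
    ? await db.collection("customers").findOne({ _id: order.customerId })
    : null;

  await client.close();
  return { order, customer };
}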
I'm new to Neo4j and still experimenting with it and rethinking my whole approach to building databases with it.
My question is: given an object of type X that needs data from another object of type X (e.g. object 2 has a comment that we want to get when querying for object 1), is it faster to just store a duplicate of that comment in object 1, or does Neo4j work faster with relationships (maybe "faster" isn't the right term; can it scale?)? And which would be better if I want to be able to follow a "chain" of relationships (object 1 needing the comment of object 2, and the comment of the object that object 2 is pointing to... so object 3)?
Sorry if that's confusing.
Thanks!
is it faster to just store a duplicate of that comment in object 1?
Don't do that.
Can you imagine the effect on maintainability of duplicating data like that?
The essence and the whole benefit of Neo4j is to traverse nodes through relationships.
You're thinking of Neo4j as if it were just a document-oriented database.
It's a graph database.
In 95% of cases, you should model your Neo4j data as it is linked in real life, since the benefit of the graph is to model real life.
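As a sketch of the relationship-based approach (using the official neo4j-driver for JavaScript; the Thing label, the POINTS_TO relationship type, and the property names are invented for illustration):

import neo4j from "neo4j-driver";

const driver = neo4j.driver("bolt://localhost:7687", neo4j.auth.basic("neo4j", "password"));

// Each object keeps a single copy of its own comment; other objects reach it
// through a relationship instead of storing a duplicate.
async function commentChain(startId: string): Promise<string[]> {
  const session = driver.session();
  try {
    // Follow the POINTS_TO chain (object 1 -> object 2 -> object 3 ...)
    // and collect each object's comment along the way.
    const result = await session.run(
      "MATCH (start:Thing {id: $id})-[:POINTS_TO*0..]->(other:Thing) " +
        "RETURN other.comment AS comment",
      { id: startId }
    );
    return result.records.map(r => r.get("comment"));
  } finally {
    await session.close();
  }
}

Because the comment is stored once and reached by traversal, updating it in one place updates it for every object along the chain.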
I have a relational database (about 30 tables) and I would like to transpose it into a Neo4j graph database, but I don't know where to start...
Is there a general way to transpose tables and/or tuples into a graph model (relationships, properties, one or more graphs)? What are the best sources of documentation?
Thanks for any help,
Best regards
First, if at all possible, I'd suggest NOT using your relational DB as your "reference" for transposing to a graph model. All too often, mistakes and pitfalls from relational modelling get transferred over to the graph model and introduce other oddities. In fact, if you have a source ER diagram, that might be an even better starting point as it's really already a graph. And maybe even consider a re-modelling exercise for your domain!
That said, from a basic point of view, you can think of most tables as representing a node type (e.g. "User" or "Movie") with join tables and keys representing relationship types.
A great starting point, from my perspective anyway, is to determine some questions your graph/data source should answer. Write those questions down, and try to come up with Cypher queries that represent the questions. Often times, a graph model naturally arises from such an effort, and it's really not that difficult.
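For instance (a sketch; the User and Movie labels and the RATED relationship are assumptions for illustration, not from the question), a question like "Which movies has this user rated, and how highly?" maps almost word for word onto a Cypher pattern:

// "Which movies has this user rated, and how highly?"
const ratedMoviesForUser = `
  MATCH (u:User {name: $name})-[r:RATED]->(m:Movie)
  RETURN m.title AS title, r.stars AS stars
  ORDER BY r.stars DESC
`;

// "Which other users rated the same movies as this user?" -- one more hop.
const similarUsers = `
  MATCH (u:User {name: $name})-[:RATED]->(:Movie)<-[:RATED]-(other:User)
  RETURN DISTINCT other.name AS name
`;

Writing a handful of queries like these usually exposes which labels, relationship types, and properties the model actually needs.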
If you haven't already, I'd strongly recommend picking up a (free) copy of the Graph Databases ebook from here: http://graphdatabases.com/
It's jam-packed with a lot of good info on where to start with modelling your domain and even things to consider when you're used to doing things in a relational manner. It also contains some material on Cypher, although the Neo4j site (neo4j.org) has a reference manual with plenty of up-to-date info on Cypher.
Hope this helps!
There's not going to be a one-stop shop for this kind of conversion, as not all data models are appropriate for graph modeling, and every application is a unique special snowflake... but with that said:
Generally, your 'base' tables (e.g. User, Role, Order, Product) would become nodes, and your 'join tables' (a.k.a. buster tables) would be candidates for your relationships (e.g. UserRole, OrderLineItem). The key thing to remember is that in a graph you generally model only one relationship of a given type between two specific nodes, so in the above example, if your system allows the same product to be in an order twice, it would cause issues.
Foreign keys are your second source of relationships; look at them to decide whether each one makes more sense as a relationship or just a property.
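For illustration, here is how two typical join-table rows might translate (a sketch; the labels, relationship types, and property names are hypothetical, and the Cypher is shown as plain TypeScript strings):

// A UserRole row becomes a HAS_ROLE relationship between existing nodes.
const importUserRole = `
  MATCH (u:User {userId: $userId}), (r:Role {roleId: $roleId})
  CREATE (u)-[:HAS_ROLE]->(r)
`;

// An OrderLineItem row becomes a CONTAINS relationship that carries its own
// columns (quantity, price) as relationship properties.
const importOrderLine = `
  MATCH (o:Order {orderId: $orderId}), (p:Product {sku: $sku})
  CREATE (o)-[:CONTAINS {quantity: $quantity, price: $price}]->(p)
`;

// Caveat from the answer above: if the same product can appear on an order
// twice, a single Order-to-Product relationship no longer fits cleanly;
// promoting the line item to its own node is one common workaround.
const lineItemAsNode = `
  MATCH (o:Order {orderId: $orderId}), (p:Product {sku: $sku})
  CREATE (o)-[:HAS_LINE]->(:LineItem {quantity: $quantity, price: $price})-[:FOR]->(p)
`;

In practice each of these strings would be passed to a driver call such as session.run() with the corresponding row values as parameters, as in the earlier driver sketch.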
Just keep in mind what you are trying to solve with your data model: if it's traversing your objects to find relationships, distance, etc., then a graph may be a good fit. If you are modeling an eCommerce app where you are mostly manipulating a single nested object (e.g. order -> line item -> product -> sku), then a relational model may be the right fit.
Hope my $0.02 helps...
As has already been said, there is no magical transformation from a relational database model to a graph database model.
You should look at the original entities and how they are related in order to find your nodes, properties and relationships, always keeping in mind what type of queries you are going to perform.
As BtySgtMajor said, "Graph Databases" is a good book to start with, and it is free.
I have the following SQL query that I want to do using Core Data:
SELECT t1.date, t1.amount + SUM(t2.amount) AS importantvalue
FROM specifictable AS t1, specifictable AS t2
WHERE t1.amount < 0 AND t2.amount < 0 AND t1.date IS NOT NULL AND t2.date IS NULL
GROUP BY t1.date, t1.amount;
Now, it looks like CoreData fetch requests can only fetch from a single entity. Is there a way to do this entire query in a single fetch request?
The best way I know is to create an abstract parent entity for the entities you wish to fetch together.
So if you have 'Meat', 'Vegetables' and 'Fruits' entities, you can create an abstract parent entity 'Food' and then fetch all the sweet items from the 'Food' entity.
This way you will get all the sweet 'Meat', 'Vegetables' and 'Fruits'.
Look here:
Entity Inheritance in Apple documentation.
Nikolay,
Core Data is not a SQL system. It has a more primitive query language. While this appears to be a deficit, it really isn't. It forces you to bring things into RAM and do your complex calculations there instead of in the DB. The NSSet/NSMutableSet operations are extremely fast and effective. This also results in a faster app. (This is particularly apparent on iOS where the flash is slow and, hence, big fetches are to be preferred.)
In answer to your question, yes, a fetch request operates on a single entity. No, you are not limited to data on that entity. One uses key paths to traverse relationships in the predicate language.
Shannoga's answer is one good way to solve your problem. But I don't know enough about what you are actually trying to accomplish with your data model to judge whether using entity inheritance is the right path for your app. It may not be.
Your SQL schema from a server may not make sense in a CD app. Both the query language and how the data is used in the UI probably force a different structure. (For example, using a fetched results controller on iOS can force you to denormalize your data differently than you would on a server.)
Entity inheritance, like inheritance in OOP, is a rigid technology. It is hard to change. Hence, I use it carefully. When I do use it, I gain performance in some fetches and some simplification in other calculations. At other times, it is the wrong answer, performance-wise.
The real answer is a question: what are you really trying to do?
Andrew
I use sfPropelORMPlugin.
Lazy loading is OK if I operate on one object per web page, but if there are hundreds I get hundreds of separate DB queries. I'd like to completely disable lazy loading, or disable it for the columns I need on those particularly heavy pages, but I haven't found a way so far.
You should join all your relations when you build your query; that way you'll get all the data in a single query. Note that you have to use joinWithRelation(), where Relation is the related table name.
Elaborating on William Durand's answer, perhaps you should also look at the Propel function doSelectJoinAll(), which should pre-load all of the objects related to your relations. Just keep in mind this can be expensive in terms of memory.
Another technique is to create a custom criteria with your needed joins, then use a manual hydrate technique to add on to your base object. I do this often when the data I need is using aggregates or other columns that are not exactly mapped to objects. There are plenty of hydrate() examples around.
I added a utility method to the peer class so I can set which columns I want to load, using "pseudo columns" for this type of DB query. I also overrode hydrate() to understand this "markup". All was well until I found out that even though the data is hydrated, symfony won't recognize it and won't let you use it as intended.
PS: a join was never considered an option because the site is under fairly high load.
I've been seeing a lot of commentary (from an NHibernate perspective) about using Guid as opposed to an int (and presumably auto-id in the database), with the conclusion that using auto-identity breaks the UoW pattern.
This post has a short description of the issue, but it doesn't really tell me "why" it breaks the pattern (unless I'm misunderstanding, which is likely the case).
Can someone enlighten me?
There are a few major reasons.
Using a Guid gives you the ability to identify a single entity across many databases, including six relational databases with the same schema but different data, a document database, etc. This becomes important any time you have more than one single place where data goes - and that means your case too: you have a dev database and a prod database, right?
Using a Guid gives NHibernate the ability to batch more statements together, perform more database work at the very end of the unit of work / transaction, and reduce the total number of roundtrips to the database, increasing performance as well as conferring other benefits.
Comment:
Random Guids do not create poor indexes in general; rather, they create poor clustered indexes. There are two solutions.
Use a partially sequential Guid. With NHibernate, this means using the guid.comb id generator rather than the guid id generator. guid.comb is partially sequential for good performance, but retains a very high degree of randomness.
Have your Guid primary key be a nonclustered index, and put a clustered index on another auto-incrementing column. You may decide to map this column, in which case you lose the benefit of better batching and fewer roundtrips, but you regain all the benefits of short numbers that fit easily in a URL. Or you may decide not to map this column and have it remain completely within the database, in which case you gain better performance for Guids as primary keys as well as better performance for NHibernate doing fewer roundtrips.
My take would be that the key breaking factor is that getting the auto-incremented value requires an actual write to the database, which NHibernate would have deferred or possibly never performed.
When using identity in a parent-child scenario, a round trip to the database is needed to get the ID of the parent so that the child can be associated correctly. This means that the parent has to be committed at that point. Should there be a problem with the child, you would then need to delete the parent in order to exit the UoW correctly.