entity framework ref, deref, createref [duplicate] - entity-framework-4

This question is about why I would use the above keywords. I've found plenty of MSDN pages that explain how. I'm looking for the why.
What query would I be trying to write that means I need them? I ask because the examples I have found appear to be achievable in other ways...
To try and figure it out myself, I created a very simple entity model using the Employee and EmployeePayHistory tables from the AdventureWorks database.
One example I saw online demonstrated something similar to the following Entity SQL:
SELECT VALUE
DEREF(CREATEREF(AdventureWorksEntities3.Employee, row(h.EmployeeID))).HireDate
FROM
AdventureWorksEntities3.EmployeePayHistory as h
This seems to pull back the HireDate without having to specify a join?
Why is this better than the SQL below (that appears to do exactly the same thing)?
SELECT VALUE
h.Employee.HireDate
FROM
AdventureWorksEntities3.EmployeePayHistory as h
Looking at the above two statements, I can't work out what extra the CREATEREF, DEREF bit is adding since I appear to be able to get at what I want without them.
I'm assuming I have just not found the scenarios that demostrate the purpose. I'm assuming there are scenarios where using these keywords is either simpler or is the only way to accomplish the required result.
What I can't find is the scenarios....
Can anyone fill in the gap? I don't need entire sets of SQL. I just need a starting point to play with i.e. a brief description of a scenario or two... I can expand on that myself.

Look at this post
One of the benefits of references is that it can be thought as a ‘lightweight’ entity in which we don’t need to spend resources in creating and maintaining the full entity state/values until it is really necessary. Once you have a ref to an entity, you can dereference it by using DEREF expression or by just invoking a property of the entity

TL;DR - REF/DEREF are similar to C++ pointers. It they are references to persisted entities (not entities which have not be saved to a data source).
Why would you use such a thing?: A reference to an entity uses less memory than having the DEFEF'ed (or expanded; or filled; or instantiated) entity. This may come in handy if you have a bunch of records that have image information and image data (4GB Files stored in the database). If you didn't use a REF, and you pulled back 10 of these entities just to get the image meta-data, then you'd quickly fill up your memory.
I know, I know. It'd be easier just to pull back the metadata in your query, but then you lose the point of what REF is good for :-D

Related

Question regarding role-playing dimension

I hope you can be helpful in answering one question in regards to role-playing dimensions.
When using views for a role playing dimension, Does it then matter which view is referred to later in the analysis. Especially, when sorting on the role playing dimension, can this be done no matter which view is used?
Hope the question is clear enough. If not, let me know and I will elaborate.
Thanks in advance.
Do you mean you have created a view similar to "SELECT * FROM DIM" for each role the Dim plays? If that's all you've done then you could use any of these views in a subsequent SQL statement that joins the DIM to a FACT table - but obviously if you use the "wrong" view it's going to be very confusing for anyone trying to read your SQL (or you trying to understand what've you've written in 3 months time!)
For example, if you have a fact table with keys OrderDate and ShipDate that both reference your DateDim then you could create vwOrderDate and vwShipDate. You could then join FACT.OrderDate to vwShipDate and FACT.ShipDate to vwOrderDate and it will make no difference to the actual resultset your query produces (apart from, possibly, column names).
However, unless the applicable attributes are very different for different roles, I really wouldn't bother creating views for role-playing Dims as it's an unnecessary overhead and just going to cause confusion to anyone you've given access to at this level of the DB (who presumably have pretty strong SQL skills to be given this level of access?).
If you are trying to make life easier for end-users then either create these types of "views" in the models of the BI tool(s) they are using - and not directly in the DB - or, if they are being given access to the DB, then create View(s) across the Fact(s) and all their joined Dimensions

Find changes quickly in larger SQL database?

There is a Java Swing application which uses an Informix database. I have user rights granted for the Swing application (i.e. no source code), and read only access to a mirror of the database.
Sometimes I need to find a database column, which is backing a GUI element (TextBox, TableField, Label...). What would be best approach to find out which database column and table is holding the data shown e.g. in a TextBox?
My general approach is to capture the state of the database. Commit a change using the GUI and then capture the state of the database again. Then I need to examine the difference. I've already tried:
Use the nrows field of systables: Didn't work, because the number in nrows does not seem to be a realtime representation of the row count.
Create a script with SELECT COUNT(*) ... for all tables: didn't work because too many tables (> 5000). Also tried to optimize by removing empty tables, but there are still too many left.
Is there a simple solution that I'm missing?
Please look at the Change Data Capture API and check if this suits your needs
There probably isn't a simple solution.
You probably need to build yourself a map of the database, or a data dictionary for it. It sounds as though you can eliminate many of the tables from consideration since they're empty — at least for a preliminary pass. If you're dealing with information in a text box, the chances are it is some sort of character data; you can analyze which (non-empty) tables which contain longer character strings, and they'd be the primary targets of your searches. If the schema is badly designed with lots of VARCHAR(255) columns even though the columns normally only hold short strings, life is more difficult. Over time, you can begin to classify tables and columns so that you end up knowing where to look for parts of the application.
One problem to beware of: the tabid in informix.systables isn't necessarily as stable as you'd like. Your data dictionary needs to record its own dd_tabid for the table it describes, and can store the last known tabid from informix.systables, but it needs to be ready to find a new tabid value on occasion. You should probably only mark data in your dictionary for logical deletion.
To some extent, this assumes you can create a database in which to record this information. If you can't create an Informix database, you may have to use something else (MySQL, or SQLite, perhaps) to store the data dictionary. Alternatively, go to your DBA team and ask them for the information. Unless you're trying something self-evidently untoward, they're likely to help (but politics can get in the way — I've no idea how collegial your teams are).

SQL SELECT with table aliases in Core Data

I have the following SQL query that I want to do using Core Data:
SELECT t1.date, t1.amount + SUM(t2.amount) AS importantvalue
FROM specifictable AS t1, specifictable AS t2
WHERE t1.amount < 0 AND t2.amount < 0 AND t1.date IS NOT NULL AND t2.date IS NULL
GROUP BY t1.date, t1.amount;
Now, it looks like CoreData fetch requests can only fetch from a single entity. Is there a way to do this entire query in a single fetch request?
The best way I know is to crate an abstract parent entity for entities you wish to fetch together.
So if you have - 'Meat' 'Vegetables' and 'Fruits' entities, you can create a parent abstract entity for 'Food' and then fetch for all the sweet entities in the 'Food' entity.
This way you will get all the sweet 'Meat' 'Vegetables' and 'Fruits'.
Look here:
Entity Inheritance in Apple documentation.
Nikolay,
Core Data is not a SQL system. It has a more primitive query language. While this appears to be a deficit, it really isn't. It forces you to bring things into RAM and do your complex calculations there instead of in the DB. The NSSet/NSMutableSet operations are extremely fast and effective. This also results in a faster app. (This is particularly apparent on iOS where the flash is slow and, hence, big fetches are to be preferred.)
In answer to your question, yes, a fetch request operates on a single entity. No, you are not limited to data on that entity. One uses key paths to traverse relationships in the predicate language.
Shannoga's answer is one good way to solve your problem. But I don't know enough about what you are actually trying to accomplish with your data model to judge whether using entity inheritance is the right path for your app. It may not be.
Your SQL schema from a server may not make sense in a CD app. Both the query language and how the data is used in the UI probably force a different structure. (For example, using a fetched results controller on iOS can force you to denormalize your data differently than you would on a server.)
Entity inheritance, like inheritance in OOP, is a stiff technology. It is hard to change. Hence, I use it carefully. When I do use it, I gain performance in some fetches and some simplification in other calculations. At other times, it is the wrong answer, performance wise.
The real answer is a question: what are you really trying to do?
Andrew

How to change model lazyloadness at runtime in Symfony?

I use sfPropelORMPlugin.
Lazyload is ok if I operate on one object per web page. But if there are hundreds I get hundreds of separate DB queries. I'd like to completely disable lazyload or disable it for needed columns on those particularly heavy pages but couldn't find a way so far.
You should join all your relations when you build your query, that way you'll get all data in a single query. Note, you have to use joinWithRelation() where Relation is a related table name.
Elaborating on William Durand's answer, perhaps you should also look at the Propel function doSelectjoinAll(), which should pre-load all of the objects related to your relations. Just keep in mind this can be expensive as it relates to memory.
Another technique is to create a custom criteria with your needed joins, then use a manual hydrate technique to add on to your base object. I do this often when the data I need is using aggregates or other columns that are not exactly mapped to objects. There are plenty of hydrate() examples around.
Added utility method to peer to be able to set what columns I want to load. Using "pseudo columns" for this type of DB queries. Also I have overridden hydrate() to understand this "markup". All were good until I found out that even though data is hydrated symfony won't understand it and won't let you use it as intended.
PS join was never considered as an option because site is kind of high load.

SPROC to update record: how to handle unchanged values

I'm calling a update SPROC from my DAL, passing in all(!) fields of the table as parameters. For the biggest table this is a total of 78.
I pass all these parameters, even if maybe just one value changed.
This seems rather inefficent to me and I wondered, how to do it better.
I could define all parameters as optional, and only pass the ones changed, but my DAL does not know which values changed, cause I'm just passing it the model - object.
I could make a select on the table before updateing and compare the values to find out which ones changed but this is probably way to much overhead, also(?)
I'm kinda stuck here ... I'm very interested what you think of this.
edit: forgot to mention: I'm using C# (Express Edition) with SQL 2008 (also Express). The DAL I wrote "myself" (using this article).
Its maybe not the latest state of the art way (since its from 2006, "pre-Linq" so to say but Linq works only for local SQL instances in Express anyways) of doing it, but my main goal was learning C#, so I guess this isn't too bad.
If you can change the DAL (without changes being discarded once the layer is "regenerated" from the new schema when changes are made), i would recomend passing a structure containing the column being changed with values, and a structure kontaing key columns and values for the update.
This can be done using hashtables, and if the schema is known, should be fairly easy to manipulate this in the "new" update function.
If this is an automated DAL, these are some of the drawbacks using DALs
You could implement journalized change tracking in your model objects. This way you could keep track of any changes in your objects by saving the previous value of a property every time a new value is set.This information could be stored in one of two ways:
As part of each object's own private state
Centrally in a "manager" class.
In the first solution, you could easily implement this functionality in a base class and have it run in all model objects through inheritance.
In the second solution, you need to create some kind of container class that will keep a reference and a unique identifier to any model object that is created and record all changes in its state in a central store.This is similar to the way many ORM (Object-Relational Mapping) frameworks achieve this kind of functionality.
There are off the shelf ORMs that support these kinds of scenarios relatively well. Writing your own ORM will leave you without many features like this.
I find the "object.Save()" pattern leads to this kind of behavior, but there is no reason you need to follow that pattern (while I'm not personally a fan of object.Save(), I feel like I'm in the minority).
There are multiple ways your data layer can know what changed and most of them are supported by off the shelf ORMs. You could also potentially make the UI and/or business layer's smart enough to pass that knowledge into the data layer.
Two options that I prefer:
Generating/hand coding update
methods that only take the set of
parameters that tend to change.
Generating the update statements
completely on the fly.

Resources