Non-unicode columns using Entity Framework 4 - entity-framework-4

I am very new to EF so my descriptions may not make sense. Please ask me to clarify anything and I'll do my best to find the answer. In an existing application we are using EF4. 95% of our string columns in our db are varchar, but we do have 5% being nvarchar. In the edmx file, I see the columns have the proper Unicode property set to true or false. Then we use .tt file to generate our entity classes. The problem is that the generated queries are trying to convert everything to unicode which is obviously slowing down all of our queries.
I found the following answers here but I don't believe they will help me. The first is using ColumnAttribute but from what I can see, this was not available until v4.1. The second seems like it overrides on a global level (although I don't understand where). Because we do have some nvarchar columns, I don't think this will work either. I've also seen use of AsNonUnicode() method. I have not fully researched if this is available in v4 because that seems like it needs to be used specifically every time we send a query. This is a large application so this would be a huge undertaking. Are these my only options here? Am I missing something? Any advice is appreciated.
Entity Framework Data Annotations Set StringLength VarChar
EF Code First - Globally set varchar mapping over nvarchar

Related

Find changes quickly in larger SQL database?

There is a Java Swing application which uses an Informix database. I have user rights granted for the Swing application (i.e. no source code), and read only access to a mirror of the database.
Sometimes I need to find a database column, which is backing a GUI element (TextBox, TableField, Label...). What would be best approach to find out which database column and table is holding the data shown e.g. in a TextBox?
My general approach is to capture the state of the database. Commit a change using the GUI and then capture the state of the database again. Then I need to examine the difference. I've already tried:
Use the nrows field of systables: Didn't work, because the number in nrows does not seem to be a realtime representation of the row count.
Create a script with SELECT COUNT(*) ... for all tables: didn't work because too many tables (> 5000). Also tried to optimize by removing empty tables, but there are still too many left.
Is there a simple solution that I'm missing?
Please look at the Change Data Capture API and check if this suits your needs
There probably isn't a simple solution.
You probably need to build yourself a map of the database, or a data dictionary for it. It sounds as though you can eliminate many of the tables from consideration since they're empty — at least for a preliminary pass. If you're dealing with information in a text box, the chances are it is some sort of character data; you can analyze which (non-empty) tables which contain longer character strings, and they'd be the primary targets of your searches. If the schema is badly designed with lots of VARCHAR(255) columns even though the columns normally only hold short strings, life is more difficult. Over time, you can begin to classify tables and columns so that you end up knowing where to look for parts of the application.
One problem to beware of: the tabid in informix.systables isn't necessarily as stable as you'd like. Your data dictionary needs to record its own dd_tabid for the table it describes, and can store the last known tabid from informix.systables, but it needs to be ready to find a new tabid value on occasion. You should probably only mark data in your dictionary for logical deletion.
To some extent, this assumes you can create a database in which to record this information. If you can't create an Informix database, you may have to use something else (MySQL, or SQLite, perhaps) to store the data dictionary. Alternatively, go to your DBA team and ask them for the information. Unless you're trying something self-evidently untoward, they're likely to help (but politics can get in the way — I've no idea how collegial your teams are).

Should I use a rails integer id for an imported public data table that uses fixed length strings for id?

I need to use a public government data table, imported via csv as a table in my rails application (using postgresql). The table in question uses a fixed length 12 digit numeric string (often with at least one leading zero) as its primary key. I feel as if I have three choices here:
Add a rails-generated integer primary key upon import
Ask rails to interpret the string as an integer and use that as my primary key.
Force rails to use a string as the primary key (and then, subsequently as the foreign key in other associated tables as well)
I'm worried about doing choice 1 because I will likely need to re-import this government data wholesale at least yearly as it gets updated in order to keep my database current. It seems like it would be really complicated to ensure that the rails primary keys stay with the correct record after a re-import if records have been added and deleted
Choice 2 seems like the way to go, (it would solve the re-import problem) but I'm not clear how to go about it. Is it as simple as telling rails to import that column as an integer?
Choice 3 seems doable, but in posts I've read elsewhere, it's not a very "railsy" way to go about it.
I'd welcome any advice or out and out solutions on this.
Update
I ended up using choice 1, and it was the right choice. That's becauseI am new to rails, and had thought I'd be doing some direct, bulk import (on the back end, directly into Postgresql) would leave me with the problem I described above. However, I looked through Railscast 369 which explains how to build csv import capabilities into your application as an end-user function. It's remarkably easy and that's what I did. As a result of doing things this way, the application does a row-by-row import, and can thus have the appropriate checks built in at that level.

entity framework ref, deref, createref [duplicate]

This question is about why I would use the above keywords. I've found plenty of MSDN pages that explain how. I'm looking for the why.
What query would I be trying to write that means I need them? I ask because the examples I have found appear to be achievable in other ways...
To try and figure it out myself, I created a very simple entity model using the Employee and EmployeePayHistory tables from the AdventureWorks database.
One example I saw online demonstrated something similar to the following Entity SQL:
SELECT VALUE
DEREF(CREATEREF(AdventureWorksEntities3.Employee, row(h.EmployeeID))).HireDate
FROM
AdventureWorksEntities3.EmployeePayHistory as h
This seems to pull back the HireDate without having to specify a join?
Why is this better than the SQL below (that appears to do exactly the same thing)?
SELECT VALUE
h.Employee.HireDate
FROM
AdventureWorksEntities3.EmployeePayHistory as h
Looking at the above two statements, I can't work out what extra the CREATEREF, DEREF bit is adding since I appear to be able to get at what I want without them.
I'm assuming I have just not found the scenarios that demostrate the purpose. I'm assuming there are scenarios where using these keywords is either simpler or is the only way to accomplish the required result.
What I can't find is the scenarios....
Can anyone fill in the gap? I don't need entire sets of SQL. I just need a starting point to play with i.e. a brief description of a scenario or two... I can expand on that myself.
Look at this post
One of the benefits of references is that it can be thought as a ‘lightweight’ entity in which we don’t need to spend resources in creating and maintaining the full entity state/values until it is really necessary. Once you have a ref to an entity, you can dereference it by using DEREF expression or by just invoking a property of the entity
TL;DR - REF/DEREF are similar to C++ pointers. It they are references to persisted entities (not entities which have not be saved to a data source).
Why would you use such a thing?: A reference to an entity uses less memory than having the DEFEF'ed (or expanded; or filled; or instantiated) entity. This may come in handy if you have a bunch of records that have image information and image data (4GB Files stored in the database). If you didn't use a REF, and you pulled back 10 of these entities just to get the image meta-data, then you'd quickly fill up your memory.
I know, I know. It'd be easier just to pull back the metadata in your query, but then you lose the point of what REF is good for :-D

Moving lookup / reference tables to a new schema

We are building ASP.NET MVC3 web applications using Visual Studio, SQL Server 2008 R2 & EF Code First 4.1.
Quite often we have smaller, what we call, "lookup" tables. For example a "Status" table contain an "Id" and a "Name". As the application grows these tables become quite frequent and I would like to know the best way to "group" these lesser important tables away from the crux of the application.
It has been suggest to me to add a prefix like "LkStatus" to help me but what about moving all the lookup tables out of dbo and into there own schema?
Can anyone see any drawbacks in this method?
Thanks Paul
No drawbacks with this method. I'm a fan of schemas personally. I'd use Lookup though
To change your table schema, you have two ways:
ALTER SCHEMA Lookup TRANSFER dbo.SomeTable
or
ALTER AUTHORIZATION ON dbo.SomeTable TO Lookup
This is going to be down to preference. There really isn't a "gotcha" either way. I prefer a table prefix but wouldn't be bothered either way. We use LU_*. As long as either option is enforced that maintenance down the line will be easy.
Since the tables are small, what about grouping them together into a single table? Instead of using the table name as a pseudo-key, use a real key. For example, you could have a table called Lookup, with an Id, Type, Name and Value, where Type = 'Status' for your status values. Seting the clustered index to (Type, Name) would physically group all rows of the same type together, which would make it fast to read them all as a group, if needed.
If your Names can have different data types, add an extra column for each required type: one for integers, one for strings, one for floats, etc. You can do something similar using an XML column; the T-SQL takes just a little more effort.

EF4 QueryView or DefiningQuery?

I am in the middle of trying to complete a design for a project and have basically come to a fork in the road. I have made up my mind that I want to use EF4 as my data persistence layer, but my existing database is causing me some pains. Changing or augmenting the database is not an option. I have a single table that really serves multiple purposes and contains 120 columns (I didn't design this table!!! - it is a DB2 carryover after a SQL Server conversion long ago). I have designed a class diagram that creates five entities from this table, at varying levels of aggregation. In my research of what to do in these situations, I have narrowed it down to either using a “QueryView” in my MSL layer or a “DefiningQuery” in my SSDL layer to create the entities I need from this monolith table. The resultant data will only need to be read-only. I’d prefer getting back a proper entity, but anonymous types or dbdatarecord would be okay.
I have attempted to use a QueryView in MSL with my entity defined in my CSDL but the MSL keeps getting regenerated and my changes lost when I compile. Why?
Can anyone provide input as to what I should do here? Is using a DefiningQuery or QueryView preferable in this situation? Any input as to keeping these changes after updating my model from the database or compiling would be also very appreciated.
QueryView should not be regenerated. I'm not sure how QueryView behaves when you do update from database. I'm sure that DefiningQuery will be deleted when doing Update from database because DefiningQuery is defined in SSDL which is completely deleted during Update from database. I have some workaround for custom DefiningQueries by using two different EDMXs - one just for queries and second for entities updated from database. General concept is described here.
Difference between QueryView and DefiningQuery is the level where these constructs are included. QueryView is MSL element built as custom ESQL query on top of existing entity so your 120 columns entity must exists in EDMX. From unknown reason QueryView has no support for aggregations. DefiningQuery is SSDL element build as custom SQL query. It is by default used for database views (btw. probably best choice for you).

Resources