Many Duplicate Queries in Entity Framework 5 Code-First (N+1) - asp.net-mvc

One of our contractors implemented a repository pattern with a code-first approach. We use Service Locator as our DI pattern; when we retrieve data from the DB, we pass an interface to the GetQueryable function and get the data back. However, I am seeing serious performance issues in our application, so I implemented MiniProfiler and MiniProfiler.EF to see where the bottleneck is.
We have a Case table with quite a few fields (around 25), and some of those fields are associated with other tables, one-to-one except for a single field with a one-to-many relation to another table. When I try to view the case detail, around 400 SQL queries run, and SQL takes around 40 percent of the load time as far as MiniProfiler is concerned. Here are our GetQueryable and Find methods:
public IQueryable<T> GetQueryable<T>(params string[] includes)
{
    Type type = _impls.Value[typeof(T).Name].GetType();
    DbSet dbSet = Db.Set(type);
    foreach (var include in includes)
    {
        dbSet.Include(include);
    }
    return ((IQueryable<T>) dbSet);
}
I added includes to this method to eager-load the related tables, but it did not make any difference. And here is the Find method:
public T Find<T>(long? id)
{
    Type type = _impls.Value[typeof(T).Name].GetType();
    return (T) Db.Set(type).Find(id);
}
I have tried to apply pretty much every performance improvement, but the number of SQL queries has not gone down. I tried to disable lazy loading, but it caused many problems in other parts of the application.
Some additional information: the Case table has 70,000 rows and our Dialogs table has 500,000 rows. Case and Dialog are associated as one-to-many, and each case has 20-40 dialog entries.
My questions are:
Why does Include make no difference when I use it?
Is there any other way to reduce the number of queries run?
Do you think the implementation is the problem?
Thanks

Include returns a new IQueryable and does not modify the source query, so the result of dbSet.Include(include) is thrown away in your loop. In addition, you can use the generic version of Set, which simplifies the code a bit:
public IQueryable<T> GetQueryable<T>(params string[] includes)
{
    IQueryable<T> query = Db.Set<T>();
    foreach (var include in includes)
    {
        query = query.Include(include);
    }
    return query;
}
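A call site can then chain further operators before anything hits the database. A minimal sketch (the Case entity and Dialogs navigation property are assumptions based on the question):

var caseDetail = repository
    .GetQueryable<Case>("Dialogs")          // the include is now actually applied
    .FirstOrDefault(c => c.Id == caseId);

With the include applied, EF issues a single joined query instead of one lazy load per navigation-property access.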

Step 1: Fire your contractor. Seriously. Like right now. That is some awful code. Not only did they miss something as simple and basic as using the generic version of Set, but they've successfully only made working with Entity Framework more complex, because all the repository does is proxy Entity Framework methods with its own unique and bastardized API.
That said, there's really not enough here to diagnose your problem. The use of Include may give you larger queries, but it should actually reduce the overall number of queries issued. It's possible you're just not using includes where you should be.
Now, the fact that you "tried to disable lazy loading, but it caused many problems in other parts of the application", means that you're relying too heavily on lazy-loading. Basically, you're loading in stuff you don't even know about, which is the antithesis of optimization. Ironically, you'd actually be best served by going ahead and disabling lazy-loading, and then tracking down where your code fails because of that. If you want to actually lazy-load that thing, you can use .Load (see: Explicit Loading). But, if you want to eager-load to reduce queries, then you know what includes you need to add.
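As a rough illustration of the two options (entity and property names here are assumptions, not from the question):

// Explicit loading: lazy loading stays off; load the collection on demand.
db.Entry(someCase).Collection(c => c.Dialogs).Load();

// Eager loading: pull the collection in the same query as the case itself.
var loaded = db.Cases.Include("Dialogs").First(c => c.Id == caseId);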

Related

IEnumerable vs IQueryable in OData and Repository Pattern

I watched this video and read this blog post. There is something in the last part of the post that confused me. In that part, Mosh emphasized that a repository should never return IQueryable, because it results in performance issues. But I have read things that sound contradictory.
This is the confusing part:
IEnumerable: While querying data from a database, IEnumerable executes the select query on the server side, loads the data in-memory on the client side, and then filters it. Hence it does more work and is slower.
IQueryable: While querying data from a database, IQueryable executes the select query on the server side with all filters applied. Hence it does less work and is faster.
This is another answer about IQueryable vs IEnumerable in the Repository pattern.
These are the opposite of Mosh's advice. If they are true, why should we not use IQueryable instead of IEnumerable?
And something else: what about situations where we want to use OData? As you know, it's better to use IQueryable instead of IEnumerable when querying via OData.
One more thing: is it good or bad to use OData for querying e-commerce website APIs?
Please let me know your opinion.
Thank you
A repository should never return an IQueryable. But not due to performance; it's due to complexity. A repository is about reducing complexity in the business layer.
By exposing an IQueryable you increase the complexity in two ways:
You leak persistence knowledge into the business domain. There are things you must know about the underlying LINQ-to-SQL provider to write effective queries.
You must design the business entities so that querying them is possible (i.e. they are not pure business entities).
Examples:
var blockedUsers = _repository.GetBlockedUsers();
// vs
var blockedUsers = _dbContext.Users.Where(x => x.State == 1);

var user = _repos.GetById(1);
// an enum is used internally in the user class
user.Block();
_repos.Update(user);
// vs
var user = _dbContext.Users.FirstOrDefault(x => x.Id == 1);
user.State = 1;
_dbContext.SaveChanges();
By wrapping everything behind your repository, you can design your business entities in a way that makes them easy to work with (child entities, enums, date management, etc.), and design the repository so that those entities can be stored in an efficient way. No compromises, and code that is more easily maintained.
Regarding OData: Do not use the repository pattern. It doesn't add any value in that case.
If you insist on using IQueryable in your business domain, do not use the repository pattern. It would only complicate things without adding any value.
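If you do expose data over OData, a minimal sketch might look like this (assuming ASP.NET Web API with the OData package; the User entity and context are illustrative):

public class UsersController : ODataController
{
    private readonly AppDbContext _db = new AppDbContext();

    // Returning IQueryable lets the OData layer translate $filter, $orderby,
    // $top, etc. into the database query itself.
    [EnableQuery]
    public IQueryable<User> Get()
    {
        return _db.Users;
    }
}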
Finally:
Business logic that uses properly designed repositories is much easier to unit test. Code where LINQ and business logic are mixed must ALWAYS be covered by integration tests (against a DB), since LINQ to SQL differs from LINQ to Objects.
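For example, a sketch of what that buys you in a unit test (the interface and fake are assumptions):

public interface IUserRepository
{
    IEnumerable<User> GetBlockedUsers();
}

// A hand-rolled fake: business logic can be unit tested with no database
// and no LINQ-provider quirks.
public class FakeUserRepository : IUserRepository
{
    public IEnumerable<User> GetBlockedUsers()
    {
        return new[] { new User() };  // one canned "blocked" user
    }
}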

MVC 4 Strongly Typed Data to View without ORM

I've seen other questions regarding using straight SQL for MVC data access, for example here. Most responses don't answer the question but instead ask:
"Why would you not want to use ORM, EF, Linq, etc?"
My group does custom reporting out of a data warehouse that requires a lot of complex, highly tuned Oracle queries that are manipulated based on user GUI parameter selections.
My newest project is to develop a SQL plugin reporting tool for SQL report developers. They would create a pre-tuned SQL statement for a report, with pseudo-parameters, and enter (and store) it via the GUI. The GUI would then prompt them for the parameter definitions (name and type) that need to be displayed/requested at run time to ultimately replace the pseudo-variables.
So a SQL statement may look like:
SELECT * FROM orders WHERE order_date BETWEEN '<Date1>' AND '<Date2>'
And the report developer would then, via the GUI, add two parameters named Date1 and Date2, and flag them as date fields.
End users would then select the report, get prompted for Date1 and Date2, and the GUI would do the substitution and run the SQL.
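Roughly, the runtime substitution could work like this (a sketch only; I would likely convert the pseudo-parameters to bind variables rather than splice strings, assuming the System.Data.OracleClient API):

// Swap each quoted pseudo-parameter for a bind variable, then bind the values.
string sql = storedSql.Replace("'<Date1>'", ":Date1").Replace("'<Date2>'", ":Date2");
using (var cmd = new OracleCommand(sql, connection))
{
    cmd.Parameters.Add("Date1", OracleType.DateTime).Value = date1;
    cmd.Parameters.Add("Date2", OracleType.DateTime).Value = date2;
    using (var reader = cmd.ExecuteReader())
    {
        // map rows to the report output here
    }
}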
As you can see, I have no choice but to use straight SQL (especially in the second scenario), and I understand I would have to forgo strong typing there as well.
So my questions are:
When it is necessary to bypass EF/LINQ (and there are definitely reasons to), what is best practice in MVC 4?
And how best to strongly type the results when I do know the output columns ahead of time?
And what about CRUD processing?
Can anyone point me to any examples of non-EF/Linq based coding in this regard?
I think this is a bit of an open-ended question, so here's my 2c. (If tl;dr, skip to the last section.)
To me, it's not so much "bypassing EF/LINQ" as the need to choose the appropriate data-persistence library. I have used PetaPoco, ADO.NET, NHibernate/ActiveRecord, Linq2Sql, and EF (my main choice) with MVC.
Best practice actually comes from realising that controllers are STILL part of the presentation layer, and that they should not deal with anything other than HttpContext-related operations plus calls to business-logic service classes.
I arrange my classes as:
Presentation (MVC) -> Logic Services (Simple classes) -> Data Access (Context wrapped in "repositories").
So I can't quite imagine how using EF or not would have any implication for ASP.NET MVC itself.
For me, Data Access returns data as DTOs, e.g.
public List<Foo> GetAllFoos()
Whether that method string-concatenates SQL from an XML file, etc., or does a simple Context.Foos.ToList() is irrelevant to the rest of the application. All I care about is that Data Access does NOT return me a DataSet with string matching for columns. Those stay in the DAL.
See points 1 and 2. My repositories have CRUD methods on them. How it's done is irrelevant to the rest of the application. Take one of the most basic interfaces to my repository classes:
public interface IFooRepository
{
    void Save(Foo foo);
    Foo Get(int id);
    void Create(Foo foo);
    void Delete(int id);
}
One point not mentioned yet: DI is also crucial. The concrete implementation FooRepository may choose to request dependencies such as web services, context classes, etc. Those are, again, completely irrelevant to the caller, who depends only on the interface.
If you still require an example after the three points above, drop a comment and I'll whip up something extremely simple using ADO.NET.
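Something extremely simple along those lines might look like this (a sketch only; the connection string, table, and column names are assumptions):

public List<Foo> GetAllFoos()
{
    var result = new List<Foo>();
    using (var conn = new SqlConnection(_connectionString))
    using (var cmd = new SqlCommand("SELECT Id, Name FROM Foo", conn))
    {
        conn.Open();
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                // Map columns by ordinal once, here, so nothing else in the
                // app ever does string matching against a DataSet.
                result.Add(new Foo { Id = reader.GetInt32(0), Name = reader.GetString(1) });
            }
        }
    }
    return result;
}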
===========================================================================
To EF or not to EF.
For me, if I am starting a new project with a new schema, I use EF code first.
Fitting new code to an old database + an old project with no ORM mapping I can reuse = PetaPoco.
===========================================================================
In the context of your project:
The "SQL plugin reporting tool for SQL report developers". "The" sql reporting service? I'm not sure why you need to do anything? Doesn't SSRS already do that? (Enter sql statement/data source, generate form for parameter, etc).
If not I'd question the design decision. IMVHO, the need for users of an application (I don't care if it's "report developer" or w/e) to enter SQL statements is usually stemmed from "architectural astronauts". How do you debug the SQL statement when you enter via GUI as a string? How do you know the tables and the relationships? You either dig into SSMS and come back to gui, or you build complex UI (aka rebuild SSMS).
At the end of day, if you want bazillion reports for gazillion different users, you have to pay for it. I see too many "architectural astronauts" who exposes application to accept SQL statements only to make everyone waste time guessing what should be put into it. No cost saving at all.
Ok, if you must do that, well eh... Good luck. Best bet is to return as a DataTable and dump the rows/columns/data on to the view with nested foreach looping through rows then columns.
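A minimal Razor sketch of that dump (the model type and markup are assumptions):

@model System.Data.DataTable
<table>
    <tr>
        @foreach (System.Data.DataColumn col in Model.Columns)
        {
            <th>@col.ColumnName</th>
        }
    </tr>
    @foreach (System.Data.DataRow row in Model.Rows)
    {
        <tr>
            @foreach (var cell in row.ItemArray)
            {
                <td>@cell</td>
            }
        </tr>
    }
</table>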

Proper implementation of Repository Pattern using IQueryables?

I'm building a Repository layer for my MVC application with methods like GetObject, UpdateObject, DeleteObject, etc.
This is what I have now:
public List<Object> GetObjects()
{
    return _db.Objects.Where(o => o.IsArchived == false).ToList();
}
But I'm wondering if it would be better to return IQueryable for lists, so that the least amount of data gets sent to the client when filters are applied in the UoW or service layers. Would it be best to do something like this?
public IQueryable<Object> GetObjects()
{
    return _db.Objects.Where(o => o.IsArchived == false);
}
The downside of returning IQueryable is that if you ever have a different implementation of the repository, say using a different ORM, or storing data in a non-SQL database, the cloud, or an XML file, it will be hard to implement the same interface. It is much easier if you return more generic collections of domain objects, such as IEnumerable. You can always pass filtering criteria in.
The other drawback of returning IQueryable is that when you actually run the query, your object context may already be disposed (depending on your implementation), or it may be kept in memory longer than required.
A leaky abstraction such as IQueryable can also cause subtle problems. For example, imagine you want to get some data from the database and order it by Guid. If you enumerate the query by calling ToList() prior to sorting, you'll get different results than if you sort first. The reason is that in the first case the sorting happens in .NET, but in the other case it happens in SQL, which uses a completely different ordering.
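A sketch of the trap (assuming the entity has a Guid column):

var sortedBySql = repo.GetObjects().OrderBy(o => o.Guid).ToList();          // ordered by SQL Server
var sortedByNet = repo.GetObjects().ToList().OrderBy(o => o.Guid).ToList(); // ordered by .NET
// SQL Server compares uniqueidentifier values by byte groups in a different
// order than .NET's Guid.CompareTo, so the two lists can disagree.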
The nice thing about returning IQueryable is that you can continue to build the query up further without hitting the database. Once you call ToList() it will hit the database, and you can't refine your query further without hitting the database a second time.

ASP Mvc Nhibernate Issue

I am experiencing some bizarre problems with NHibernate within my MVC web application.
There is no one consistent error; I keep getting loads of random ones:
Transaction not successfully started
New request is not allowed to start because it should come with valid transaction descriptor
Unexpected row count: -1; expected: 1
To give a little context to the setup: I am using Ninject to inject the sessions and other NHibernate-related objects. Currently I am using RequestScope, though I have tried SingletonScope. I have a large and complicated data model, which is read out as a whole but persisted back in separate parts, as these can all be edited and saved individually.
An example would be a Customer object which contains an Address object, a Contact object, a Friends object, a PreviousOrders object, etc.
So the whole object is read out, mapped to the UI domain models, and then displayed in different partials within the page. Each partial can be updated individually via AJAX, so you may update one section or you could update them all together. It mainly seems to give me problems when I try to persist them all together (2-4 simultaneous AJAX requests persisting chunks of the model).
Now, I have integration tests that work fine; they just test the persistence and retrieval of entities, both as a whole and individually, and they all pass. In the web app, however, it just seems to keep throwing random exceptions, and originally it refused to persist outside of the NHibernate cache. I found a way around this by wrapping most units of work within transactions, which got the data persisting but started adding new errors to the mix.
Originally I was thinking of just scrapping NHibernate from the project; although I really want its persistence/caching layer, it just didn't seem to be flexible enough for my domain. That seems odd, as I have used it before without much problem, although it doesn't like one-to-one mappings.
So, has anyone else had flaky transaction/NHibernate issues like this within an ASP.NET MVC app? I know this may be a bit vague, as the errors don't point to one thing and it doesn't always error, so it's like stabbing in the dark, but I am out of ideas, so any help would be great!
-- Update --
I cannot post all the relevant code as the project is huge, but the transaction bit looks like:
using (var transaction = sessionManager.Session.BeginTransaction(IsolationLevel.ReadUncommitted))
{
    try
    {
        // Do unit of work
        transaction.Commit();
    }
    catch (Exception)
    {
        transaction.Rollback();
        throw;
    }
}
Some of the main problems I have had on this project have stemmed from:
There are some one-to-one relationships with composite keys, though logically it makes sense.
The NHibernate domain entities go through a mapping layer to become the UI domain entities, and vice versa when saving. The problem here is that, with the one-to-one mappings, when persisting the example Address I have to make a surrogate Customer object with the correct Id and then merge.
There is a LOT of AJAX that deals with chunks of the overall model (I talk as if there is one single model, but there are quite a few top-level models, just one that is most important).
Some notes that may help; I use Windsor, but I imagine the concepts are the same. It sounds like there may be a combination of things:
The SessionFactory should be created as a singleton and the session should be per web request. Something like:
Bind<ISessionFactory>()
    .ToProvider<SessionFactoryBuilder>()
    .InSingletonScope();

Bind<ISession>()
    .ToMethod(context => context.Kernel.Get<ISessionFactory>().OpenSession())
    .InRequestScope();
Be careful of keeping transactions open for too long; keep them as short-lived as possible to avoid deadlocks.
Check that your queries are running as expected by using a tool like NHProf. Often people load up too much of the object graph, which impacts performance and can create deadlocks.
Check your mappings for things like not.lazyload(), see if you actually need the additional data in the queries, and keep the results returned to a minimum. Check your queries' execution plans and ensure adequate indexes are in place.
I have had issues with MVC 3 action filters being cached, which meant transactions were not always started but would still attempt to be closed, causing issues. I moved all my transaction commits into ActionResults in the controllers to keep each transaction as short as possible and as close to the action as possible (see the sketch after this list).
Check the cascades in your mappings and keep the updates to a minimum.
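A sketch of committing inside the action itself (names are illustrative, not from the original project):

public ActionResult UpdateAddress(AddressModel model)
{
    // The transaction opens and commits within this single action, so a
    // cached filter can never leave it dangling.
    using (var tx = _session.BeginTransaction())
    {
        var customer = _session.Get<Customer>(model.CustomerId);
        customer.UpdateAddress(model.ToAddress());  // hypothetical domain call
        tx.Commit();
    }
    return Json(new { success = true });
}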

How to clean/Reset ObjectContext in EF 4

I have a hierarchical structure with millions of records.
I'm doing a recursive scan of the DB in order to update some of the connections and some of the data.
The problem is that I get an OutOfMemoryException, since the entire DB is eventually loaded into the context (lazily); data that I no longer need stays in the context with no way of removing it.
I also can't use using(context...), since I need the context kept alive because I'm doing a recursive scan.
Please take the recursion as a fact.
Thanks
This sort of operation is really not handled well, nor does it scale well, using entities. I tend to resort to stored procedures for batch ops.
If you do want to remove/dump objects from the context, I believe this post has some info (the solution to your problem is at the bottom).
I just ran into the same problem. I used NHibernate before I used EF as an ORM tool, and it had exactly the same issue. These frameworks just keep the objects in memory as long as the context is alive, which has two consequences:
a serious performance slowdown: the framework does comparisons between the objects in memory (e.g. to see whether an object already exists), so you will notice a gradual degradation of performance when processing many records;
you will eventually run out of memory.
If possible I always try to do large batch operations on the database using pure SQL (as the post above states clearly), but in this case that wasn't an option. To solve it: NHibernate has a Clear() method on the session, which throws away all objects in memory that refer to database records (new ones, added ones, corrupt ones...).
I tried to mimic this method in Entity Framework as follows (building on the post described above):
public partial class MyEntities
{
    public IEnumerable<ObjectStateEntry> GetAllObjectStateEntries()
    {
        return ObjectStateManager.GetObjectStateEntries(EntityState.Added |
                                                        EntityState.Deleted |
                                                        EntityState.Modified |
                                                        EntityState.Unchanged);
    }

    public void ClearEntities()
    {
        foreach (var objectStateEntry in GetAllObjectStateEntries())
        {
            // Relationship entries have a null Entity; detaching null throws.
            if (objectStateEntry.Entity != null)
            {
                Detach(objectStateEntry.Entity);
            }
        }
    }
}
The GetAllObjectStateEntries() method is split out because it's useful for other things. This goes into a partial class with the same name as your entities class (the one EF generates; MyEntities in this example), so it is available on your entities instance.
I now call this clear method after every 1,000 records I process, and my application, which used to run for about 70 minutes (only about 400k entities to process, not even millions), now does it in 25 minutes. Memory used to peak at 300 MB; now it stays around 50 MB.
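The calling pattern is roughly this (a sketch; ProcessNode and the collection are hypothetical):

int processed = 0;
foreach (var node in nodesToScan)
{
    ProcessNode(node);  // per-record work

    // Periodically flush and detach so the context cannot grow unbounded.
    if (++processed % 1000 == 0)
    {
        context.SaveChanges();
        context.ClearEntities();
    }
}
context.SaveChanges();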
