IEnumerable vs IQueryable in OData and Repository Pattern - odata

I watched this video and read this blog post. Something in the last part of the post confused me. There, Mosh emphasized that a repository should never return IQueryable, because it results in performance issues. But I have read something that sounds contradictory.
This is the confusing part:
IEnumerable: While querying data from a database, IEnumerable executes the select query on the server side, loads the data into memory on the client side, and then filters it. Hence it does more work and becomes slow.
IQueryable: While querying data from a database, IQueryable executes the select query on the server side with all filters. Hence it does less work and becomes fast.
This is another answer about IQueryable vs IEnumerable in the Repository pattern.
These are the opposite of Mosh's advice. If they are true, why shouldn't we use IQueryable instead of IEnumerable?
And something else: what about situations where we want to use OData? As you know, it's better to use IQueryable instead of IEnumerable when querying through OData.
One more thing: is it good or bad to use OData for querying e-commerce website APIs?
Please let me know your opinion.
Thank you

A repository should never return an IQueryable, but not due to performance. It's due to complexity. A repository is about reducing complexity in the business layer.
By exposing an IQueryable you increase the complexity in two ways:
You leak persistence knowledge into the business domain. There are things you must know about the underlying LINQ to SQL provider to write effective queries.
You must design the business entities so that querying them is possible (i.e. not pure business entities).
Examples:
var blockedUsers = _repository.GetBlockedUsers();
//vs
var blockedUsers = _dbContext.Users.Where(x => x.State == 1);
var user = _repos.GetById(1);
//and an enum is used internally in the user class
user.Block();
_repos.Update(user);
// vs
var user = _dbContext.Users.FirstOrDefault(x => x.Id == 1);
user.State = 1;
_dbContext.SaveChanges();
By wrapping everything behind your repository, you design your business entities in a way that makes it easy to work with them (child entities, enums, date management, etc.), and you design the repository so that those entities can be stored in an efficient way. No compromises, and code that is more easily maintained.
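To make that concrete, here is a minimal sketch of the kind of contract this implies; the names (IUserRepository, UserState, Block) are illustrative, not from the original post:
using System.Collections.Generic;

public enum UserState { Active = 0, Blocked = 1 }

public class User
{
    public int Id { get; set; }
    internal UserState State { get; set; }

    // Business behaviour lives on the entity; callers never touch the enum directly.
    public void Block()
    {
        State = UserState.Blocked;
    }
}

public interface IUserRepository
{
    // Materialized results only; no IQueryable leaks out of the repository.
    IEnumerable<User> GetBlockedUsers();
    User GetById(int id);
    void Update(User user);
}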
Regarding OData: Do not use the repository pattern. It doesn't add any value in that case.
If you insist on using IQueryable in your business domain, do not use the repository pattern. It would only complicate things without adding any value.
Finally:
Business logic that uses properly designed repositories is so much easier to test (unit tests). Code where LINQ and business logic are mixed must ALWAYS be covered by integration tests (against a DB), since LINQ to SQL differs from LINQ to Objects.

Related

Many Duplicate Queries in Entity Framework 5 Code-First (N+1)

One of our contractors implemented a repository pattern with a code-first approach. We use Service Locator as our DI pattern. When we retrieve data from the DB, we pass an interface to the GetQueryable function and get the data. However, I see serious performance issues in our application. I implemented MiniProfiler and MiniProfiler.EF to see where the bottleneck is.
We have a case table which has quite a few fields (around 25), and some of those fields are associated with other tables as one-to-one and one-to-many (only one field has a one-to-many relation to another table). When I try to view the case detail, it runs around 400 SQL queries, and SQL takes around 40 percent of the load time as far as MiniProfiler is concerned. Here are our GetQueryable and Find methods:
public IQueryable<T> GetQueryable<T>(params string[] includes)
{
    Type type = _impls.Value[typeof (T).Name].GetType();
    DbSet dbSet = Db.Set(type);
    foreach (var include in includes)
    {
        dbSet.Include(include);
    }
    return ((IQueryable<T>) dbSet);
}
I added includes to this method to attach the other related tables, but it did not make any difference. And here is the Find method:
public T Find<T>(long? id)
{
    Type type = _impls.Value[typeof(T).Name].GetType();
    return (T) Db.Set(type).Find(id);
}
I have tried to apply pretty much all the performance improvements, but the number of SQL queries has not gone down. I tried disabling lazy loading, but it caused many problems in other parts of the application.
Just some additional information: the case table has 70,000 rows and our dialogs table has 500,000 rows. Case and Dialog are associated as one-to-many, and each case has 20-40 dialog entries.
My questions are:
Why does Include not make any difference when I use it?
Is there any other way to cut down the number of queries run?
Do you think the implementation is the problem?
Thanks
Include returns a new IQueryable and does not modify the source query. In addition, you can use the generic version of Set, which simplifies the code a bit:
public IQueryable<T> GetQueryable<T>(params string[] includes)
{
    IQueryable<T> query = Db.Set<T>();
    foreach (var include in includes)
    {
        query = query.Include(include);
    }
    return query;
}
Step 1: Fire your contractor. Seriously. Like right now. That is some awful code. Not only did they miss something as simple and basic as using the generic version of Set, but they've successfully only made working with Entity Framework more complex, because all the repository does is proxy Entity Framework methods with its own unique and bastardized API.
That said, there's really not enough here to diagnose what your problem is. The use of Include may give you larger queries, but it should actually serve to reduce the overall number of queries issued. It's possible you're just not using includes where you should be.
Now, the fact that you "tried to disable lazy loading, but it caused many problems in other parts of the application", means that you're relying too heavily on lazy-loading. Basically, you're loading in stuff you don't even know about, which is the antithesis of optimization. Ironically, you'd actually be best served by going ahead and disabling lazy-loading, and then tracking down where your code fails because of that. If you want to actually lazy-load that thing, you can use .Load (see: Explicit Loading). But, if you want to eager-load to reduce queries, then you know what includes you need to add.
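As a rough illustration of the two options, assuming a typed DbContext with a Cases set and a Case.Dialogs navigation property as in the question (EF 5/6 DbContext API):
using System.Data.Entity; // brings in the lambda overload of Include

// Eager loading: the dialogs come back in the same query as the case.
var eager = db.Cases
    .Include(c => c.Dialogs)
    .FirstOrDefault(c => c.Id == caseId);

// Explicit loading: load the collection on demand for an entity that is already tracked.
var theCase = db.Cases.Find(caseId);
db.Entry(theCase).Collection(c => c.Dialogs).Load();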

MVC Pattern - Is this the correct approach for Repository / Unit of Work

I have been doodling and reading and just want to ensure the approach I am taking is correct. I am using MVC5 with EF, implementing the Repository and Unit of Work patterns.
EntityModel -> <- SomeRepository
SomeRepository -> <- SomeController
SomeController -> SomeViewModel
SomeViewModel -> SomeView
SomeView -> SomeController
SomeController -> <- SomeRepository
etc ..
In the controller I am planning on using something like AutoMapper to map the ViewModel to the EntityModel (and vice versa) which can then be passed to my repository / view.
Also, with this approach I am not 100% sure where my business logic should go. For instance, if I have an EntityModel for products and I wanted to add a GetAssociatedProducts method, would this go against the EntityModel or should another tier be introduced so the EntityModel is just a straightforward mapping class to the DB?
Should the ViewModel contain any logic at all? I.e. creating a dictionary to populate a dropdown on the view based on values from the EntityModel?
I am trying to avoid the issues associated with just starting to code without thinking too much about the how, which is the reason for this question.
Note: I am also implementing IoC with Autofac but I don't think that's relevant at this point (saying just in case it is).
Well, you're already thinking too much.
First, since you specifically mention MVC, let me just say that the vast majority of what you're talking about is not MVC. MVC stands for Model-View-Controller. In the strictest sense, your model is the haven of all business logic for your application. The controller merely connects your model to your view, and your view merely presents the data to the client in a readable format.
Despite its name, ASP.NET MVC does not truly follow the MVC pattern. You could call it Microsoft's take on MVC. The controllers and views track pretty closely (though there is some very noticeable and repugnant bleed-over, such as ViewBag). But the "model" bit is very unclearly defined. Since Entity Framework is integrated, most people latch on to the entity and call this the model, but entities are really bad models. They're just (or at least should just be) a programmatic representation of a database table: a way for Entity Framework to take data from your table rows and put it into some structure that lets you get at it easily.
If you look at other MVC implementations such as Ruby on Rails or Django, their "model" is more of a database-backed factory. Instead of the class simply holding data returned from the database, it is itself the gateway to the database for that type. It can create itself, update itself, query itself and its colleagues, etc. This allows you to add much more robust business logic to the class than you can or should with an "entity" in C#. Because of that, the closest you can get a true MVC model is your domain or service layer, which isn't really factored in at all by default in ASP.NET MVC.
That said, though, if you're implementing a repository / unit of work pattern with Entity Framework, you're probably making a mistake. Entity Framework already does this so you don't have to. The DbContext is your Unit of Work and each DbSet is a repository. Any repository you create, dimes to dollars, will simply end up proxying your repository methods to methods on your DbSet, which is your first sign that something's not right. Now, that's not to say that a certain amount of abstraction isn't still a good idea, but go with something like a service pattern instead: something lightweight and flexible that will truly abstract logic instead of just creating a matryoshka doll of code that will only serve to make your application harder to maintain.
Finally, your view model (which is actually a rip from the MVVM pattern) should simply be whatever your view needs it to be. If your view needs a drop-down list, then your view model should contain that. Whether your view model should generate it is a slightly different question that depends on the complexity of the data involved. I don't think your view model should know how to query a database, so if you need to pull the data from a database, you should let the controller handle that and just feed it to the view model. But if it's something like a list of months, an enum structure, a statically defined list, etc., it might be appropriate for the view model to have the logic to construct that list.
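For example, a view model that owns a purely static list might reasonably look like this (OrderFilterViewModel is an illustrative name, not something from the question):
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Web.Mvc;

public class OrderFilterViewModel
{
    public int SelectedMonth { get; set; }

    // Static data like month names needs neither the controller nor a database;
    // the view model can build it itself.
    public IEnumerable<SelectListItem> Months
    {
        get
        {
            return Enumerable.Range(1, 12).Select(m => new SelectListItem
            {
                Value = m.ToString(),
                Text = CultureInfo.CurrentCulture.DateTimeFormat.GetMonthName(m)
            });
        }
    }
}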
UPDATE
No, they are actually implementing a repository. I'm not sure why in the world the introductory MVC articles on MSDN advocate this, but as one who fell into the same trap early on, I can say from personal experience, and many other long-time MVC developers will tell you the same, you don't want to actually follow this advice. Like I said, most of your repository methods end up just proxying to Entity Framework methods, and you end up having to add a ton of boilerplate code for each new entity. And, the further you go down the rabbit hole, the harder it is to recover, leading inevitably to some major refactoring once you finally grow tired of the repetitive code.
A service pattern is a lot simpler. There may still be some proxying for things like updates and deletes, where there's very little unique from one entity to another, but the real difference will be seen with selects. With a repository, you'd do something like the following in your controller:
repo.Posts.Where(m => m.BlogId == blog.Id && m.PublishDate <= DateTime.Now && m.Status == PostStatus.Published).OrderByDescending(o => o.PublishDate).Take(10).ToList();
While with a service you would do:
service.Posts.GetPublishedPostsForBlog(blog, limit: 10);
And all that logic about what is a "published" post, how blog is connected to post, etc., goes into your service method instead of your controller. The other big difference is that service methods should return fully-baked data, i.e. a list type rather than a queryable. The goal with a service is to return exactly what you need, while the goal with a repository is to provide an endpoint to query into.
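A hedged sketch of what might sit behind that call (BlogContext is an assumed EF context; Post, Blog, and PostStatus are taken from the example above; service.Posts would expose an instance of something like this):
using System;
using System.Collections.Generic;
using System.Linq;

public class PostService
{
    private readonly BlogContext _db;

    public PostService(BlogContext db)
    {
        _db = db;
    }

    // Returns fully-baked data: the query is composed and executed here,
    // so the controller never sees an IQueryable.
    public List<Post> GetPublishedPostsForBlog(Blog blog, int limit = 10)
    {
        return _db.Posts
            .Where(p => p.BlogId == blog.Id
                && p.PublishDate <= DateTime.Now
                && p.Status == PostStatus.Published)
            .OrderByDescending(p => p.PublishDate)
            .Take(limit)
            .ToList();
    }
}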

MVC 4 Strongly Typed Data to View without ORM

I've seen other questions regarding using straight SQL for MVC data access, for example here. Most responses don't answer the question but ask
"Why would you not want to use ORM, EF, Linq, etc?"
My group does custom reporting out of a data warehouse that requires a lot of complex, highly tuned Oracle queries that are manipulated based on user GUI parameter selections.
My newest project is to develop a SQL plugin reporting tool for SQL report developers. They would create a pre-tuned SQL statement for the report with pseudo-parameters and enter (and store) it via the GUI. Then the GUI would prompt them for the parameter definitions (name and type) that need to be displayed/requested at run time to ultimately replace the pseudo-variables.
So a SQL statement may look like:
SELECT * FROM orders WHERE order_date BETWEEN '<Date1>' AND '<Date2>'
And the report developer would then, via the GUI, add two parameters named Date1 and Date2, and flag them as date fields.
End users would then select the report, get prompted for Date1 and Date2, and the GUI would do the substitution and run the SQL.
As you can see, I have no choice but to use straight SQL (especially in the second case, and I understand I would have to forgo strong typing there as well).
So my questions are:
When is it necessary to bypass EF/Linq (and there are definitely reasons to), what is best practice in MVC 4?
And how best to strongly type when I do know the output columns ahead of time?
And CRUD processing?
Can anyone point me to any examples of non-EF/Linq based coding in this regard?
I think this is a bit of an open-ended question, so here's my 2c. (If TL;DR, go to the last section.)
To me, it's not so much "bypassing EF/LINQ" as the need to choose the appropriate data persistence library. I have used PetaPoco, ADO.NET, NHibernate/ActiveRecord, Linq2Sql, and EF (my main choice) with MVC.
Best practice actually comes from realising that controllers are STILL part of the presentation layer, and that they should not deal with anything other than HttpContext-related operations plus calling business-logic service classes.
I arrange my classes as:
Presentation (MVC) -> Logic Services (Simple classes) -> Data Access (Context wrapped in "repositories").
So I can't quite see how the choice of whether or not to use EF would have any implication for ASP.NET MVC.
For me, Data Access returns data as DTOs, e.g.
public List<FooDto> GetAllFoos()
Whether that method concatenates SQL strings, reads from an XML file, or just does a simple Context.Foos.ToList() is irrelevant to the rest of the application. All I care about is that Data Access does NOT return me a DataSet with string matching for columns. Those stay in the DAL.
See points 1 and 2. My repositories have CRUD methods on them. How they're implemented is irrelevant to the rest of the application. Take one of the most basic interfaces for my repository classes:
public interface IFooRepository
{
    void Save(Foo foo);
    Foo Get(int id);
    void Create(Foo foo);
    void Delete(int id);
}
One point not mentioned yet: DI is also crucial. The concrete implementation "FooRepository" may choose to request dependencies such as web services, context classes, etc. Those are, however, again completely irrelevant to the caller, who depends only on the interface.
If you still require an example after the 3 points above, drop a comment and I'll whip up something extremely simple using ADO.NET.
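In the meantime, here is a minimal sketch of what just the Get method of such an ADO.NET-backed repository could look like (the Foos table, its columns, and the shape of Foo are assumptions for illustration):
using System.Data.SqlClient;

public class AdoNetFooRepository : IFooRepository
{
    private readonly string _connectionString;

    public AdoNetFooRepository(string connectionString)
    {
        _connectionString = connectionString;
    }

    public Foo Get(int id)
    {
        using (var connection = new SqlConnection(_connectionString))
        using (var command = new SqlCommand("SELECT Id, Name FROM Foos WHERE Id = @id", connection))
        {
            command.Parameters.AddWithValue("@id", id);
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                if (!reader.Read())
                    return null;

                // Map the row to the domain object; no DataSet leaks out of the DAL.
                return new Foo
                {
                    Id = reader.GetInt32(0),
                    Name = reader.GetString(1)
                };
            }
        }
    }

    // Save, Create, and Delete would follow the same pattern.
    public void Save(Foo foo) { /* elided */ }
    public void Create(Foo foo) { /* elided */ }
    public void Delete(int id) { /* elided */ }
}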
===========================================================================
To EF or not to EF.
For me, if starting a new project with new schema, I use EF code first.
Fitting new code to old database + old project has no ORM mapping I can reuse = PetaPoco.
===========================================================================
In the context of your project:
The "SQL plugin reporting tool for SQL report developers". "The" sql reporting service? I'm not sure why you need to do anything? Doesn't SSRS already do that? (Enter sql statement/data source, generate form for parameter, etc).
If not I'd question the design decision. IMVHO, the need for users of an application (I don't care if it's "report developer" or w/e) to enter SQL statements is usually stemmed from "architectural astronauts". How do you debug the SQL statement when you enter via GUI as a string? How do you know the tables and the relationships? You either dig into SSMS and come back to gui, or you build complex UI (aka rebuild SSMS).
At the end of day, if you want bazillion reports for gazillion different users, you have to pay for it. I see too many "architectural astronauts" who exposes application to accept SQL statements only to make everyone waste time guessing what should be put into it. No cost saving at all.
OK, if you must do that, well, eh... good luck. Your best bet is to return a DataTable and dump the rows/columns/data onto the view with nested foreach loops, iterating through the rows and then the columns.
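If it helps, that nested foreach dump could look roughly like this in a Razor view bound to a DataTable (column and table contents come entirely from whatever SQL the report developer entered):
@model System.Data.DataTable
<table>
    <tr>
        @foreach (System.Data.DataColumn column in Model.Columns)
        {
            <th>@column.ColumnName</th>
        }
    </tr>
    @foreach (System.Data.DataRow row in Model.Rows)
    {
        <tr>
            @foreach (System.Data.DataColumn column in Model.Columns)
            {
                <td>@row[column]</td>
            }
        </tr>
    }
</table>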

Proper implementation of Repository Pattern using IQueryables?

I'm building a Repository layer for my MVC application with methods like GetObject, UpdateObject, DeleteObject, etc.
This is what I have now:
public List<Object> GetObjects()
{
    return _db.Objects.Where(o => o.IsArchived == false).ToList();
}
But I'm wondering if it would be better to return IQueryables for lists so that the least amount of data gets sent to the client when filters are applied in the UoW or Service layers. Would it be best to do something like this?
public IQueryable<Object> GetObjects()
{
    return _db.Objects.Where(o => o.IsArchived == false);
}
The not-so-nice thing about returning IQueryable is that if you ever have a different implementation of the repository, say one using a different ORM, or storing data in a non-SQL database, the cloud, or an XML file, it would be hard to implement the same interface. It would be much easier if you returned more generic collections of domain objects, for example IEnumerable. You can always pass filtering criteria in.
The other drawback of returning IQueryable is that when you actually run the query, your object context may already be disposed (depending on your implementation), or it may be kept in memory longer than required.
A leaky abstraction such as IQueryable can also cause subtle problems. For example, imagine you want to get some data from the database and order it by a Guid. If you enumerate the query by calling ToList() prior to sorting, you'll get different results than if you sort first. The reason is that in the first case the sorting happens in .NET, while in the other case it happens in SQL, which uses a completely different ordering, as shown in the sketch below.
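A small illustration of that Guid-ordering trap, assuming the IQueryable<Object> version of GetObjects() above and an entity with a Guid column (SomeGuid is illustrative):
using System.Linq;

// Sorting is translated to SQL: SQL Server orders uniqueidentifier values by its own byte order.
var sortedInSql = repository.GetObjects()
    .OrderBy(o => o.SomeGuid)
    .ToList();

// Sorting happens in memory: LINQ to Objects compares Guids differently,
// so the two lists can come back in different orders for the same data.
var sortedInMemory = repository.GetObjects()
    .ToList()
    .OrderBy(o => o.SomeGuid)
    .ToList();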
The nice thing about returning IQueryable here is that you can continue to build up your query further without hitting the db. Once you call ToList it will hit the db and you can't customize your query further without hitting the database a second time.

Time to start returning IQueryable<T> instead of IList<T> to my Web UI / Web API Layer?

I've got a multi-layer application that starts with the repository pattern for all data access and returns IQueryable to the services layer. The services layer, which includes all of the business logic, returns IList to the controllers (note: I'm using ASP.NET MVC for the UI layer). The benefit of returning IQueryable in the data access layer is that it keeps my repositories extremely simple and defers the database queries. However, I trigger the database queries in my services layer so that my unit tests are more reliable and so I don't give the controllers the flexibility to reshape my queries. However, I've recently encountered several situations where deferring the execution of queries down to the controllers would have been significantly more performant, because the controllers had to do some projections on the data that were UI specific. Additionally, with the emergence of things like OData, I was starting to wonder if endpoints (e.g. web UIs or web APIs) should be working directly with IQueryable. What are your thoughts? Is it time to start returning IQueryable from the services layer to the UI layer? Or stick with IList?
This thread here: To return IQueryable<T> or not return IQueryable<T>
seems to vouch for returning IList to the UI layers, but I was wondering if things are changing because of new emerging technologies and techniques.
I like to stick with the IQueryable interface when possible. The only problem is when you end up doing complex filtering or re-querying on demand at the controller level. Say you have something like:
//DATA ACCESS
public IQueryable<T> GetStudents()
{
    return db.Students;
}
And in your controller you do some reshaping because your client wants to filter that result; surely you will be tempted to do it at the controller level:
var result = obj.GetStudents().Where(d=>d...);
And for me that's OK, but just imagine that some other module needs to use that same filter: you can't call it, because it lives at the controller level.
So for me it's a matter of balance between DRY, flexibility, and how scalable the system is.
If you need a fully scalable system, you will need to add one or several overloads of the GetStudents() method and get rid of any reshaping at the controller level, as in the sketch below.
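One hedged reading of "overloads of GetStudents()", keeping the filter inside the data access layer so any module can reuse it (DepartmentId is an illustrative field, not from the original question):
//DATA ACCESS
public IQueryable<Student> GetStudents()
{
    return db.Students;
}

// The filter lives here, so controllers and other modules share one definition of it.
public IQueryable<Student> GetStudentsByDepartment(int departmentId)
{
    return db.Students.Where(s => s.DepartmentId == departmentId);
}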
