EF4, CodeFirst and Repository Pattern - Difference between using DbSet & DataContext - entity-framework-4

Trying to migrate my existing EF 4.2 project to use the Repository and Unit of Work patterns. In many people's samples I see them use the DbSet collections in the repositories, but to me this appears limiting as I can't use things like .Include(). Then again there are other samples using the DataContext (like this one http://www.efekaptan.com/repository-pattern-with-entity-framework-code-first-4.1).
So... is there a reason why I wouldn't want to use the DataContext?

You can use Include with DbSet<T>. You should reference DbContext in your repository (pass it to the repository instance through the constructor) because it is required for more advanced operations. Storing a reference to DbSet<T> is just a simplification / optimization to avoid calling Set<T>() on the context each time you want to access it.
You must not create an instance of the context in the repository implementation (as shown in the sample you linked), because that would go against Unit of Work.

Related

Unit of Work with Dependency Injection

I'm building a relatively simple webapp in ASP.NET MVC 4, using Entity Framework to talk to MS SQL Server. There's lots of scope to expand the application in future, so I'm aiming for a pattern that maximises reusability and adaptability in the code, to save work later on. The idea is:
Unit of Work pattern, to save problems with the database by only committing changes at the end of each set of actions.
Generic repository using BaseRepository<T> because the repositories will be mostly the same; the odd exception can extend and add its additional methods.
Dependency injection to bind those repositories to the IRepository<T> that the controllers will be using, so that I can switch data storage methods and such with minimal fuss (not just for best practice; there is a real chance of this happening). I'm using Ninject for this.
I haven't really attempted something like this from scratch before, so I've been reading up and I think I've got myself muddled somewhere. So far, I have an interface IRepository<T> which is implemented by BaseRepository<T>, which contains an instance of the DataContext which is passed into its constructor. This interface has methods for Add, Update, Delete, and various types of Get (single by ID, single by predicate, group by predicate, all). The only repository that doesn't fit this interface (so far) is the Users repository, which adds User Login(string username, string password) to allow login (the implementation of which handles all the salting, hashing, checking etc).
From what I've read, I now need a UnitOfWork class that contains instances of all the repositories. This unit of work will expose the repositories, as well as a SaveChanges() method. When I want to manipulate data, I instantiate a unit of work, access the repositories on it (which are instantiated as needed), and then save. If anything fails, nothing changes in the database because it won't reach the single save at the end. This is all fine. My problem is that all the examples I can find seem to do one of two things:
Some pass a data context into the unit of work, from which they retrieve the various repositories. This negates the point of DI by having my Entity-Framework-specific DbContext (or a class inherited from it) in my unit of work.
Some call a Get method to request a repository, which is the service locator pattern, which is at least unpopular, if not an antipattern, and either way I'd like to avoid it here.
Do I need to create an interface for my data source and inject that into the unit of work as well? I can't find any documentation on this that's clear and/or complete enough to explain.
EDIT
I think I've been overcomplicating it; I'm now folding my repository and unit of work into one - my repository is entirely generic so this just gives me a handful of generic methods (Add, Remove, Update, and a few kinds of Get) plus a SaveChanges method. This gives me a worker class interface; I can then have a factory class that provides instances of it (also interfaced). If I also have this worker implement IDisposable then I can use it in a scoped block. So now my controllers can do something like this:
using (var worker = DataAccess.BeginTransaction())
{
    Product item = worker.Get<Product>(p => p.ID == prodName);
    //stuff...
    worker.SaveChanges();
}
If something goes wrong before the SaveChanges(), then all changes are discarded when it exits the scope block and the worker is disposed. I can use dependency injection to provide concrete implementations to the DataAccess field, which is passed into the base controller constructor. Business logic is all in the controller and works with IQueryable objects, so I can switch out the DataAccess provider object for anything I like as long as it implements the IRepository interface; there's nothing specific to Entity Framework anywhere.
So, any thoughts on this implementation? Is this on the right track?
I prefer to have a UnitOfWork or a UnitOfWorkFactory injected into the repositories; that way I do not need to touch it every time a new repository is added. The responsibility of the UnitOfWork would be just to manage the transaction.
Here is an example of what I mean.
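The example originally linked from this answer is not reproduced here. What follows is only a minimal sketch of the idea, not the answerer's code: every name in it (IUnitOfWork, InMemoryUnitOfWork, RegisterNew, Store, Product, Customer) is hypothetical, and the "database" is just an in-memory list so that the sketch does not depend on Entity Framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction: the unit of work only tracks and commits changes.
public interface IUnitOfWork
{
    void RegisterNew(object entity);   // queue an insert
    void Commit();                     // apply everything queued so far
}

// Toy implementation that "persists" into an in-memory list on Commit().
public class InMemoryUnitOfWork : IUnitOfWork
{
    private readonly List<object> _pending = new List<object>();
    private readonly List<object> _store = new List<object>();

    public List<object> Store { get { return _store; } }

    public void RegisterNew(object entity) { _pending.Add(entity); }

    public void Commit()
    {
        _store.AddRange(_pending);
        _pending.Clear();
    }
}

// The repository receives the unit of work; it never commits on its own.
public class Repository<T> where T : class
{
    private readonly IUnitOfWork _unitOfWork;

    public Repository(IUnitOfWork unitOfWork)
    {
        if (unitOfWork == null) throw new ArgumentNullException("unitOfWork");
        _unitOfWork = unitOfWork;
    }

    public void Add(T entity) { _unitOfWork.RegisterNew(entity); }
}

public class Product { public string Name { get; set; } }
public class Customer { public string Name { get; set; } }
```

With this shape, adding a Repository<Order> later needs no change to the unit of work: it is created with the same IUnitOfWork instance as the other repositories, and a single Commit() at the end of the operation covers all of them.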

Is it considered bad design to pass a repository interface as an argument to a method on a domain class?

Our domain model is very anemic right now. Our entities are mostly empty shells, almost purely designed for holding values and navigating to collections.
We are using EF 4.1 code-first ORM, and the design so far has been to shield our novice developers against the dreaded "LINQ to Entities cannot translate blablabla to a store expression" exception when querying against the context during early iterations.
We have various aggregate root repository interfaces over EF. However some blocks of code in the impls seems like they should be the domain's responsibility. As long as the repository interface is declared in the domain, and the impl is in the infrastructure (dependency injected), is it considered bad design to pass a repository interface as an argument to a method on an entity (or other domain) class?
For example, would this be bad?
public class EntityAbc {
    public void SaveTo(IEntityAbcRepository repos) {...}
    public void DeleteFrom(IEntityAbcRepository repos) {...}
}
What if a particular entity needed access to other aggregate root repositories? Would this be ok or not, and why?
public void Save() {
    var abcRepos = DependencyInjector.Current.GetService<IEntityAbcRepository>();
    var xyzRepos = DependencyInjector.Current.GetService<IEntityXyzRepository>();
    // work with repositories
}
Update 1
I did not mention moving code to an application layer because I consider some of the code that uses IEntityAbcRepository to involve business rule enforcement. The repository impl should be as vanilla as possible, right? Its main responsibility should just be a simple abstraction over the ORM, allowing you to find / add / update / delete entities. Wrong?
Also, this question applies to methods on other non-entity domain classes -- factories, services, whatever pattern may be appropriate. Point being, I'm asking the question about any method on a domain class, not just an entity class. @Eranga, this is one place where you can use constructor injection because factories & services are not part of the ORM.
The application layer could then coordinate flow by injecting a repository impl into its constructor, and passing it as an argument to a domain service or factory. Is this bad practice?
Update 2
Adding another clarification here. What if the domain only needs access to the IEntityAbcRepository in order to execute its Find() method(s)? In the example above, the SaveTo and DeleteFrom methods would not invoke any add / update / delete methods on the repository interface.
So far we've combined the find / add / update / delete methods on a single aggregate root repository interface for simplicity. But I suppose there's nothing stopping us from separating them out into 2 interfaces, like so:
IEntityAbcReadRepository <-- defines all find method signatures
IEntityAbcWriteRepository <-- defines all add / update / delete method sigs
In this case, would it be bad practice to pass IEntityAbcReadRepository as a parameter to a domain method?
Your first approach is better compared to the second approach which uses "Service Locator" pattern. Dependencies are more obvious in the first approach.
Here are some links that explain why "Service Locator" is a bad choice:
Is it bad to use servicelocation instead of constructor injection
...
Singleton Vs ServiceLocator
Say no to ServiceLocator
Both of these solutions stem from the fact that EF does not allow you to use constructor injection. However you can use property injection as explained in this answer. But that does not guarantee that mandatory dependencies are present.
So your first approach is the better solution.
Short answer: Yes!
Long answer:
Consider creating an AbcService in your application service layer. This service layer sits between your domain and your infrastructure. You can inject as many repositories into AbcService as you want. Then let the service handle SaveTo and DeleteFrom.
SaveTo and DeleteFrom, unless you are saving to and deleting from another entity, i.e. no data access is involved, are methods that sound like they shouldn't be on a domain entity, IMO.
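As a rough sketch of the AbcService suggestion above: the entity stays persistence ignorant and the application service gets the repository through its constructor. The member names on IEntityAbcRepository (Add, Remove, Find), the Id and Name properties, and the in-memory implementation are assumptions made for the illustration; the question does not show them.

```csharp
using System;
using System.Collections.Generic;

// The entity knows nothing about how it is stored.
public class EntityAbc
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Declared in the domain; implemented in the infrastructure layer.
public interface IEntityAbcRepository
{
    void Add(EntityAbc entity);
    void Remove(EntityAbc entity);
    EntityAbc Find(int id);   // returns null when nothing matches
}

// Application service: sits between the domain and the infrastructure.
public class AbcService
{
    private readonly IEntityAbcRepository _repository;

    public AbcService(IEntityAbcRepository repository)
    {
        if (repository == null) throw new ArgumentNullException("repository");
        _repository = repository;
    }

    public void Save(EntityAbc entity) { _repository.Add(entity); }

    public bool Delete(int id)
    {
        EntityAbc existing = _repository.Find(id);
        if (existing == null) return false;
        _repository.Remove(existing);
        return true;
    }
}

// Stand-in implementation, for illustration only.
public class InMemoryEntityAbcRepository : IEntityAbcRepository
{
    private readonly Dictionary<int, EntityAbc> _items = new Dictionary<int, EntityAbc>();

    public void Add(EntityAbc entity) { _items[entity.Id] = entity; }
    public void Remove(EntityAbc entity) { _items.Remove(entity.Id); }

    public EntityAbc Find(int id)
    {
        EntityAbc found;
        return _items.TryGetValue(id, out found) ? found : null;
    }
}
```

The dependency is visible in the constructor signature, which is the property the earlier answer valued over the service locator version.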
Having persistence logic in your domain entities is IMO bad design in the first place. Good separation of concerns should mean that domain/business logic is separated from persistence logic, so your domain classes should be persistence ignorant.
Previous Entity Framework versions might not have allowed such a separation but I think the most recent versions solved that problem. I'm not that familiar with EF though, so I might be wrong.
With that said, where can you put methods such as Save() and Delete() ?
If you want to add to/remove your entity from its repository, Repository.Add() and Repository.Remove() are good choices. A repository basically serves as an illusion of an in-memory collection of your entities, so it makes sense for it to behave just like a collection or a list with the appropriate methods.
If you want to persist changes made to an existing entity, there are other ways to do that. You could have a Repository.Save() method but some consider it bad practice. Oftentimes the changes are part of a higher level operation handled in a transaction-like context such as a Unit of Work, in that case you can let the operation persist all the objects in its scope when it finishes. For instance, if you use an Open Session in View approach for your web application, changes are automatically persisted when the request ends.
Or you can rely on an ad-hoc call of your ORM's Save() method for your particular entity which hopefully shouldn't be grafted onto the entity code itself (with NHibernate, for instance, it's available at runtime on the proxied entity).
[Update]
Putting that in perspective with your subsequent questions (though I'm not sure I understand all of them well) :
I see no value in splitting your repository into a ReadRepository and a WriteRepository. In DDD, a repository's responsibility is clearly to provide a collection to query from as well as add to or remove from. It's still quite cohesive that way.
It's not an entity's responsibility to fiddle with its own persistence, so it shouldn't be aware of its own repository for that precise purpose. Otherwise, it's pretty rare that an entity rightfully needs to have knowledge of its own repository (usually it means that the entity has a relationship to another entity of the same type, like parent/child, and you want to get the other entity from the repository)
However, entities and other domain objects obviously do need to obtain references to other entities at times. In that case, try to get these references through traversal of other objects within the boundary of your aggregate first before looking for a repository. If you absolutely need a repository to get the object you want, it's a good idea to inject the repository through any flavour of injection you like. As Eranga pointed out, service locator might turn out to be a sub-par dependency injection ersatz though.
Last thing, the kind of injection you mentioned - SaveTo(IEntityAbcRepository repos) - is peculiar because it is neither constructor nor setter injection, but rather an ephemeral injection lasting just the time of a method. It implies that whoever calls your method must know what repository to pass at that precise moment, which is not obvious. It might be useful, but I'd say it's not the form of injection you would typically mainly use.

Do we need to use the Repository pattern when working in ASP.NET MVC with ORM solutions?

I am a bit curious as to what experience other developers have of applying the Repository pattern when programming in ASP.NET MVC with Entity Framework or NHibernate. It seems to me that this pattern is already implemented in the ORMs themselves: by DbContext and DbSet<T> in Entity Framework, and by ISession in NHibernate. Most of the concerns addressed by the Repository pattern, as catalogued in PoEAA and DDD, are pretty adequately covered by these ORMs. Namely, these concerns are:
Persistence
OO View of the data
Data Access Logic Abstraction
Query Access Logic
In addition, most of the implementations of the repository pattern that I have seen follow this implementation pattern (assuming that we are developing a blog application).
NHibernate implementation:
public class PostRepository : IPostRepository
{
    private ISession _session;
    public PostRepository(ISession session)
    {
        _session = session;
    }
    public void Add(Post post)
    {
        _session.Save(post);
    }
    // other crud methods.
}
Entity Framework:
public class PostRepository : IPostRepository
{
    private DbContext _session;
    public PostRepository(DbContext session)
    {
        _session = session;
    }
    public void Add(Post post)
    {
        // The base DbContext type has no Posts property; Set<Post>() returns the DbSet<Post>.
        _session.Set<Post>().Add(post);
        _session.SaveChanges();
    }
    // other crud methods.
}
It seems to me that when we are using ORMs such as NHibernate or Entity Framework, creating these repository implementations is redundant. Furthermore, since these pattern implementations do no more than what is already there in the ORMs, they act more as noise than as helpful OO abstractions. It seems that using the repository pattern in the situation described above is nothing more than developer self-aggrandizement: more pomp and ceremony without any realizable technical benefits. What are your thoughts?
The answer is no if you do not need to be able to switch ORM or be able to test any class that has a dependency to your ORM/database.
If you want to be able to switch ORM or be able to easily test your classes which uses the database layer: Yes you need a repository (with an interface specification).
You can also switch to an in-memory repository (which I do in my unit tests), an XML file or whatever if you use the repository pattern.
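To make the in-memory point concrete, here is a small sketch of what such a test double could look like. IPostRepository and Post come from the question; the GetAll member, the Title property and the BlogService class are hypothetical additions for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Post
{
    public string Title { get; set; }
}

public interface IPostRepository
{
    void Add(Post post);
    IEnumerable<Post> GetAll();   // hypothetical member, added for the example
}

// Test double: keeps posts in a list instead of a database.
public class InMemoryPostRepository : IPostRepository
{
    private readonly List<Post> _posts = new List<Post>();

    public void Add(Post post) { _posts.Add(post); }

    // Return a copy so callers cannot modify the internal list.
    public IEnumerable<Post> GetAll() { return _posts.ToList(); }
}

// Class under test; it depends only on the interface, not on an ORM.
public class BlogService
{
    private readonly IPostRepository _posts;

    public BlogService(IPostRepository posts) { _posts = posts; }

    public bool Publish(string title)
    {
        // Simple business rule: a post needs a non-blank title.
        if (string.IsNullOrWhiteSpace(title)) return false;
        _posts.Add(new Post { Title = title.Trim() });
        return true;
    }
}
```

A unit test can then construct BlogService with an InMemoryPostRepository and check the stored posts without touching a database; production code would pass in the EF- or NHibernate-backed implementation instead.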
Update
The problem with most repository pattern implementations which you can find by Googling is that they don't work very well in production. They lack options to limit the result (paging) and to order the result, which is kind of amazing.
The repository pattern comes into its glory when it's combined with a UnitOfWork implementation and has support for the Specification pattern.
If you find one having all of that, let me know :) (I do have my own, except for a well-working specification part)
Update 2
A repository is so much more than just accessing the database in an abstracted way, as can be done by ORMs. A normal Repository implementation should handle all the entities of an aggregate (for instance Order and OrderLine). By handling them in the same repository class you can always make sure that they are built correctly.
But hey, you say: that's done automatically for me by the ORM. Well, yes and no. If you create a website, you most likely want to edit only one order line. Do you fetch the complete order, loop through it to find the order line, and then add it to the view?
By doing so you introduce logic into your controller that does not belong there. How do you do it when a web service wants the same thing? Duplicate your code?
By using an ORM it's quite easy to fetch any entity from anywhere with myOrm.Fetch<User>(user => user.Id == 1), modify it and then save it. This can be quite handy, but it also adds code smells, since you duplicate code and have no control over how the objects are created, whether they are in a valid state, or whether they have the correct associations.
The next thing that comes to mind is that you might want to be able to subscribe on events like Created, Updated and Deleted in a centralized way. That's easy if you have a repository.
For me an ORM provides a way to map classes to tables and nothing more. I still like to wrap them in repositories to have control over them and get a single point of modification.
I think it makes sense only if you want to decrease the level of dependency. In the abstract you can have IPostRepository in your infrastructure package and several independent implementations of this interface built on top of EF or NH, or something else. It is useful for TDD.
In practice NH session (and EF context) implements something like the "Unit of Work" pattern. Furthermore with NH and the Repository pattern you can get a lot of bugs and architectural issues.
For example, an NH entity can be saved while bypassing your Repository implementation. You can get it from the session (Repository.Load), change one of its properties, and call session.Flush (at the end of the request, for example, because the Repository pattern itself says nothing about flushing), and your changes will be successfully written to the db.
You've only mentioned basic CRUD actions. Doing these directly does mean you have to be aware of transactions, flushing and other things that a repository can wrap up, but I guess the value of repositories becomes more apparent when you think about complex retrieval queries.
Imagine then that you do decide to use the NHibernate session directly in your application layer.
You will need to do the equivalent of WHERE clauses and ORDER BYs etc, using either HQL or NHibernate criteria. This means your code has to reference NHibernate, and contains ideas specific to NHibernate. This makes your application hard to test and harder for others unfamiliar with NH to follow. A call to repository.GetCompletedOrders is much more descriptive and reusable than one that includes something like "where IsComplete = true and IsDeleted = false..." etc.
You could use Linq to NHibernate instead, but now you have the situation where you can easily forget that you're working on an IQueryable. You could end up chaining Linq expressions which generate enormous queries when they execute, without realising it (I speak from experience)! Mike Hadlow sparked a conversation on essentially this topic in his post Should my repository expose IQueryable.
N.b. If you don't like having lots of methods on custom repositories for different queries (like GetCompletedOrders), you can use specification parameters (like Get(specification)), which allow you to specify filters, orderings etc. without using data access language.
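A bare-bones sketch of the Get(specification) idea follows. It is one possible shape, not a standard API: Specification<T>, OrderSpecifications and InMemoryOrderRepository are hypothetical names, the Order properties mirror the IsComplete / IsDeleted example above, and the repository filters an in-memory list where a real one would apply the same expression to the ORM's IQueryable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Order
{
    public int Id { get; set; }
    public bool IsComplete { get; set; }
    public bool IsDeleted { get; set; }
}

// A specification wraps a filter as an expression tree, so a LINQ provider
// could translate it instead of the caller writing HQL or criteria.
public class Specification<T>
{
    public Expression<Func<T, bool>> Predicate { get; private set; }

    public Specification(Expression<Func<T, bool>> predicate)
    {
        if (predicate == null) throw new ArgumentNullException("predicate");
        Predicate = predicate;
    }
}

// Named, reusable filters live in one place.
public static class OrderSpecifications
{
    public static Specification<Order> Completed()
    {
        return new Specification<Order>(o => o.IsComplete && !o.IsDeleted);
    }
}

// In-memory stand-in for a repository backed by an ORM.
public class InMemoryOrderRepository
{
    private readonly List<Order> _orders;

    public InMemoryOrderRepository(IEnumerable<Order> orders)
    {
        _orders = orders.ToList();
    }

    // Materializes the result, so no IQueryable leaks out to callers.
    public IList<Order> Get(Specification<Order> specification)
    {
        return _orders.AsQueryable().Where(specification.Predicate).ToList();
    }
}
```

Calling repository.Get(OrderSpecifications.Completed()) reads much like the GetCompletedOrders method mentioned earlier, but new filters do not require new repository methods.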
Going back to the list of benefits of repository that you gave:
Persistence
OO View of the data
Data Access Logic Abstraction
Query Access Logic
You can see that points 3 and 4 are not provided for by using the persistence framework classes directly, especially in real world retrieval scenarios.

Access to Entity Manager in ASP .NET MVC

Greetings,
Trying to sort through the best way to provide access to my Entity Manager while keeping the context open through the request to permit late loading. I am seeing a lot of examples like the following:
public class SomeController
{
    MyEntities entities = new MyEntities();
}
The problem I see with this setup is that if you have a layer of business classes that you want to make calls into, you end up having to pass the manager as a parameter to these methods, like so:
public static GetEntity(MyEntities entityManager, int id)
{
    return entityManager.Series.FirstOrDefault(s => s.SeriesId == id);
}
Obviously I am looking for a good, thread safe way, to provide the entityManager to the method without passing it. The way also needs to be unit testable, my previous attempts with putting it in Session did not work for unit tests.
I am actually looking for the recommended way of dealing with the Entity Framework in ASP .NET MVC for an enterprise level application.
Thanks in advance
Entity Framework v1.0 excels in Windows Forms applications where you can use the object context for as long as you like. In asp.net and mvc in particular it's a bit harder. My solution to this was to make the repositories or entity managers more like services that MVC could communicate with. I created a sort of generic all purpose base repository I could use whenever I felt like it and just stopped bothering too much about doing it right. I would try to avoid leaving the object context open for even a ms longer than is absolutely needed in a web application.
Have a look at EF4. I started using EF in production environment when that was in beta 0.75 or something similar and had no real issues with it except for it being "hard work" sometimes.
You might want to look at the Repository pattern (here's a write up of Repository with Linq to SQL).
The basic idea would be that instead of creating a static class, you instantiate a version of the Repository. You can pass in your EntityManager as a parameter to the class in the constructor -- or better yet, a factory that can create your EntityManager for the class so that it can do unit of work instantiation of the manager.
For MVC I use a base controller class. In this class you could create your entity manager factory and make it a property of the class so deriving classes have access to it. Allow it to be injected from a constructor but created with the proper default if the instance passed in is null. Whenever a controller method needs to create a repository, it can use this instance to pass into the Repository so that it can create the manager required.
In this way, you get rid of the static methods and allow mock instances to be used in your unit tests. By passing in a factory -- which ought to create instances that implement interfaces, btw -- you decouple your repository from the actual manager class.
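A rough sketch of that arrangement, without any dependency on ASP.NET MVC itself: all types here (IEntityManager, IEntityManagerFactory, DelegateEntityManagerFactory, SeriesRepository, SeriesController and so on) are hypothetical stand-ins, and the "default" factory simply hands out an empty in-memory manager where a real application would create its Entity Framework context.

```csharp
using System;
using System.Collections.Generic;

public interface IEntityManager : IDisposable
{
    string FindSeriesTitle(int seriesId);   // returns null when not found
}

public interface IEntityManagerFactory
{
    IEntityManager Create();
}

// Placeholder for the real manager (for example a wrapper around an EF context).
public class InMemoryEntityManager : IEntityManager
{
    private readonly IDictionary<int, string> _titles;

    public InMemoryEntityManager(IDictionary<int, string> titles) { _titles = titles; }

    public string FindSeriesTitle(int seriesId)
    {
        string title;
        return _titles.TryGetValue(seriesId, out title) ? title : null;
    }

    public void Dispose() { /* a real manager would release its connection here */ }
}

public class DefaultEntityManagerFactory : IEntityManagerFactory
{
    public IEntityManager Create()
    {
        return new InMemoryEntityManager(new Dictionary<int, string>());
    }
}

// Lets a unit test supply any manager, including a mock.
public class DelegateEntityManagerFactory : IEntityManagerFactory
{
    private readonly Func<IEntityManager> _create;
    public DelegateEntityManagerFactory(Func<IEntityManager> create) { _create = create; }
    public IEntityManager Create() { return _create(); }
}

// Instance (non-static) repository; the manager lives for one call only.
public class SeriesRepository
{
    private readonly IEntityManagerFactory _factory;
    public SeriesRepository(IEntityManagerFactory factory) { _factory = factory; }

    public string GetTitle(int id)
    {
        using (IEntityManager manager = _factory.Create())
        {
            return manager.FindSeriesTitle(id);
        }
    }
}

public abstract class BaseController
{
    protected IEntityManagerFactory ManagerFactory { get; private set; }

    protected BaseController(IEntityManagerFactory factory)
    {
        // Fall back to the default when nothing is injected.
        ManagerFactory = factory ?? new DefaultEntityManagerFactory();
    }
}

public class SeriesController : BaseController
{
    public SeriesController(IEntityManagerFactory factory = null) : base(factory) { }

    public string Details(int id)
    {
        return new SeriesRepository(ManagerFactory).GetTitle(id) ?? "not found";
    }
}
```

Because the repository creates and disposes the manager inside each call, nothing stays open across the request, which fits the earlier advice about not keeping the context open longer than needed; it also means lazy loading in the view is not available with this shape.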
Don't lazy load entities in the view. Don't make business layer calls in the view. Load all the entities the view will need up front in the controller, compute all the sums and averages the view will need up front in the controller, etc. After all, that's what the controller is for.

Repository Pattern vs DAL

Are they the same thing? I just finished watching Rob Connery's Storefront tutorial and they seem to be similar techniques. I mean, when I implement a DAL object I have the GetStuff, Add/Delete etc. methods and I always write the interface first so that I can switch db later.
Am I confusing things?
You're definitely not the one who confuses things. :-)
I think the answer to the question depends on how much of a purist you want to be.
If you want a strict DDD point of view, that will take you down one path. If you look at the repository as a pattern that has helped us standardize the interface of the layer that separates between the services and the database it will take you down another.
The repository, from my perspective, is just a clearly specified layer of access to data. Or, in other words, a standardized way to implement your Data Access Layer. There are some differences between different repository implementations, but the concept is the same.
Some people will put more DDD constraints on the repository while others will use the repository as a convenient mediator between the database and the service layer. A repository like a DAL isolates the service layer from data access specifics.
One implementation issue that seems to make them different, is that a repository is often created with methods that take a specification. The repository will return data that satisfies that specification. Most traditional DALs that I have seen, will have a larger set of methods where the method will take any number of parameters. While this may sound like a small difference, it is a big issue when you enter the realms of Linq and Expressions.
Our default repository interface looks like this:
public interface IRepository : IDisposable
{
    T[] GetAll<T>();
    T[] GetAll<T>(Expression<Func<T, bool>> filter);
    T GetSingle<T>(Expression<Func<T, bool>> filter);
    T GetSingle<T>(Expression<Func<T, bool>> filter, List<Expression<Func<T, object>>> subSelectors);
    void Delete<T>(T entity);
    void Add<T>(T entity);
    int SaveChanges();
    DbTransaction BeginTransaction();
}
Is this a DAL or a repository? In this case I guess it's both.
Kim
A repository is a pattern that can be applied in many different ways, while the data access layer has a very clear responsibility: the DAL must know how to connect to your data storage to perform CRUD operations.
A repository can be a DAL, but it can also sit in front of the DAL and act as a bridge between the business object layer and the data layer. Which implementation is used is going to vary from project to project.
One large difference is that a DAO is a generic way to deal with persistence for any entity in your domain. A repository on the other hand only deals with aggregate roots.
I was looking for an answer to a similar question and agree with the two highest-ranked answers. Trying to clarify this for myself, I found that if Specifications, which go hand-in-hand with the Repository pattern, are implemented as first-class members of the domain model, then I can
reuse Specification definitions with different parameters,
manipulate existing Specification instances' parameters (e.g. to specialize),
combine them,
perform business logic on them without ever having to do any database access,
and, of course, unit-test them independent of actual Repository implementations.
I may even go so far and state that unless the Repository pattern is used together with the Specification pattern, it's not really "Repository," but a DAL. A contrived example in pseudo-code:
specification100 = new AccountHasMoreOrdersThan(100)
specification200 = new AccountHasMoreOrdersThan(200)
assert that specification200.isSpecialCaseOf(specification100)
specificationAge = new AccountIsOlderThan('2000-01-01')
combinedSpec = new CompositeSpecification(
    SpecificationOperator.And, specification200, specificationAge)
for each account in Repository<Account>.GetAllSatisfying(combinedSpec)
    assert that account.Created < '2000-01-01'
    assert that account.Orders.Count > 200
See Fowler's Specification Essay for details (that's what I based the above on).
A DAL would have specialized methods like
IoCManager.InstanceFor<IAccountDAO>()
    .GetAccountsWithAtLeastOrdersAndCreatedBefore(200, '2000-01-01')
You can see how this can quickly become cumbersome, especially since you have to define each of the DAL/DAO interfaces with this approach and implement the DAL query method.
In .NET, LINQ queries can be one way to implement specifications, but combining Specification (expressions) may not be as smooth as with a home-grown solution. Some ideas for that are described in this SO Question.
My personal opinion is that it is all about mapping, see: http://www.martinfowler.com/eaaCatalog/repository.html. So the output/input from the repository are domain objects, which on the DAL could be anything. For me that is an important addition/restriction, as you can add a repository implementation for a database/service/whatever with a different layout, and you have a clear place to concentrate on doing the mapping. If you were not to use that restriction and have the mapping elsewhere, then having different ways to represent data can impact the code in places it shouldn't be changing.
It's all about interpretation and context. They can be very similar or indeed very different, but as long as the solution does the job, what is in a name!
In the external world (i.e. client code) repository is same as DAL, except:
(1) its insert/update/delete methods are restricted to taking the data container object as the parameter.
(2) for read operation it may take simple specification like a DAL (for instance GetByPK) or advanced specification.
Internally it works with a Data Mapper Layer (for instance entity framework context etc) to perform the actual CRUD operation.
What the Repository pattern doesn't mean:
People are also often confused by sample implementations of the repository pattern that expose a separate Save method alongside Insert/Update/Delete, where Save commits to the database all the in-memory changes made by those methods. A repository can certainly have a Save method, but separating the in-memory CUD (Create, Update, Delete) operations from the persistence step (the actual write to the database) is not the repository's responsibility; it is the responsibility of the Unit of Work pattern.
Hope this helps!
Repository is a pattern: a standardized way of implementing things so that code can be reused as much as possible.
Advantage of using repository pattern is to mock your data access layer, so that you can test your business layer code without calling DAL code. There are other big advantages but this seems to be very vital to me.
From what I understand they can mean basically the same thing - but the naming varies based on context.
For example, you might have a Dal/Dao class that implements an IRepository interface.
Dal/Dao is a data layer term; the higher tiers of your application think in terms of Repositories.
So in most of the (simple) cases DAO is an implementation of Repository?
As far as I understand, it seems that DAO deals precisely with db access (CRUD - no selects though?!) while Repository allows you to abstract the whole data access, perhaps being a facade for multiple DAOs (maybe different data sources).
Am I on the right path?
One could argue that a "repository" is a specific class and a "DAL" is the entire layer consisting of the repositories, DTOs, utility classes, and anything else that is required.
