Can I delete a single child entity without loading the entire collection?

I have 2 classes, like the below.
They can have very large collections - a Website may have 2,000+ WebsitePages and vice-versa.
    class WebsitePage
    {
        public int ID { get; set; }
        public string Title { get; set; }
        public List<Website> Websites { get; set; }
    }

    class Website
    {
        public int ID { get; set; }
        public string Title { get; set; }
        public List<WebsitePage> WebsitePages { get; set; }
    }
I am having trouble removing a WebsitePage from a Website, particularly when removing a WebsitePage from multiple Websites.
For example, I might have code like this:
    var pageToRemove = db.WebsitePages.FirstOrDefault();
    var websites = db.Websites.Include(i => i.WebsitePages).ToList();
    foreach (var website in websites)
        website.WebsitePages.Remove(pageToRemove);
    db.SaveChanges();
If each Website has to Include() 2k pages, you can imagine that the second line takes ages to load.
But if I don't Include() the WebsitePages when fetching the Websites, there is no child collection loaded for me to delete from.
I have tried to just Include() the pages that I need to delete, but of course when saving that gives me an empty collection.
Is there a recommended or better way to approach this?
I am working with an existing MVC site and I would rather not have to create an entity class for the join table unless absolutely necessary.

No, you can't... normally.
A many-to-many relationship (with a hidden junction table) can only be affected by adding/removing items in the nested collections. And for this the collections must be loaded.
But there are some options.
Option 1.
Delete data from the junction table by raw SQL. Basically this looks like
    DELETE FROM WebsiteWebsitePage WHERE WebsiteID = x AND WebsitePageID = y
(shown here without parameters).
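A parameterized version of that statement might look like the sketch below. It assumes EF6's DbContext API, where Database.ExecuteSqlCommand turns extra arguments into parameters named @p0, @p1, and so on. The helper name RemovePageFromWebsite is hypothetical, and the table and column names are taken from the statement above; they must match whatever junction table EF actually generated.

```csharp
using System.Data.Entity; // EF6 (assumption)

public static class JunctionCleanup
{
    // Hypothetical helper: deletes one junction row without loading any collection.
    // Table and column names are assumptions copied from the raw statement above.
    public static int RemovePageFromWebsite(DbContext db, int websiteId, int websitePageId)
    {
        // Extra arguments are sent as parameters (@p0, @p1), not concatenated into the SQL.
        return db.Database.ExecuteSqlCommand(
            "DELETE FROM WebsiteWebsitePage WHERE WebsiteID = @p0 AND WebsitePageID = @p1",
            websiteId, websitePageId);
    }
}
```

The return value is the number of rows affected, so 0 means there was no such link to remove.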
Option 2.
Include the junction into the class model, i.e. map the junction table to a class WebsiteWebsitePage. Both Website and WebsitePage will now have
public ICollection<WebsiteWebsitePage> WebsiteWebsitePages { get; set; }
and WebsiteWebsitePage will have reference properties to both Website and WebsitePage. Now you can manipulate the junctions directly through the class model.
I consider this the best option, because everything happens the standard way of working with entities with validations and tracking and all. Also, chances are that sooner or later you will need an explicit junction class because you're going to want to add more data to it.
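As a rough sketch of Option 2, assuming EF6 code first with data annotations: the junction class gets a composite key made of the two foreign keys, and a single link can then be deleted through a stub entity without loading any collection. The context name SiteContext and the helper RemoveLink are hypothetical.

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.ComponentModel.DataAnnotations.Schema;
using System.Data.Entity; // EF6 (assumption)

public class Website
{
    public int ID { get; set; }
    public string Title { get; set; }
    public virtual ICollection<WebsiteWebsitePage> WebsiteWebsitePages { get; set; }
}

public class WebsitePage
{
    public int ID { get; set; }
    public string Title { get; set; }
    public virtual ICollection<WebsiteWebsitePage> WebsiteWebsitePages { get; set; }
}

// Explicit junction entity with a composite primary key.
public class WebsiteWebsitePage
{
    [Key, Column(Order = 0)]
    public int WebsiteID { get; set; }

    [Key, Column(Order = 1)]
    public int WebsitePageID { get; set; }

    public virtual Website Website { get; set; }
    public virtual WebsitePage WebsitePage { get; set; }
}

// Hypothetical context name.
public class SiteContext : DbContext
{
    public DbSet<Website> Websites { get; set; }
    public DbSet<WebsitePage> WebsitePages { get; set; }
    public DbSet<WebsiteWebsitePage> WebsiteWebsitePages { get; set; }
}

public static class LinkRemover
{
    public static void RemoveLink(SiteContext db, int websiteId, int pageId)
    {
        // Stub entity: only the key values are needed to delete the row.
        var link = new WebsiteWebsitePage { WebsiteID = websiteId, WebsitePageID = pageId };
        db.Entry(link).State = EntityState.Deleted; // attaches the stub and marks it for deletion
        db.SaveChanges();                           // issues a single DELETE
    }
}
```

Note that SaveChanges throws a DbUpdateConcurrencyException if no row with that key exists, so this fits best when the link is known to be present.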
Option 3.
The box of tricks.
I tried to do this by removing a stub entity from the collection. In your case: create a WebsitePage object with a valid primary key value and remove it from Website.WebsitePages without loading the collection. But EF doesn't notice the change because it isn't tracking Website.WebsitePages, and the item is not in the collection to begin with.
But this made me realize I had to make EF track a Website.WebsitePages collection with 1 item in it and then remove that item. I got this working by first building the Website item and then attaching it to a new context. I'll show the code I used (a standard Product - Category model) to prevent typos.
    Product prd;

    // Step 1: build an object with 1 item in its collection
    Category cat = new Category { Id = 3 }; // Stub entity
    using (var db = new ProdCatContext())
    {
        db.Configuration.LazyLoadingEnabled = false;
        prd = db.Products.First();
        prd.Categories.Add(cat);
    }

    // Step 2: attach to a new context and remove the category.
    using (var db = new ProdCatContext())
    {
        db.Configuration.LazyLoadingEnabled = false;
        db.Products.Attach(prd);
        prd.Categories.Remove(cat);
        db.SaveChanges(); // Deletes the junction record.
    }
Lazy loading is disabled, otherwise the Categories would still be loaded when prd.Categories is addressed.
My interpretation of what happens here is: In the second step, EF not only starts tracking the product when you attach it, but also its associations, because it 'knows' you can't load these associations yourself in a many to many relationship. It doesn't do this, however, when you add the category in the first step.


Best practices with DTOs in ASP.NET MVC Entity Framework

What's the most preferred way to work with Entity Framework and DTOs?
Let's say that after mapping I have objects like:
    class Author {
        int id
        string name
        List<Book> books
    }
    class Book {
        int id
        string name
        Author author
        int authorID
    }

    class AuthorDTO {
        int id
        string name
    }
    class BookDTO {
        int id
        string name
        int authorID
    }
Since author can have a lot of books I don't want to retrieve all of them, when for example I'm only interested in authors.
But sometimes I might want to get few authors and filtered books or all books.
I could go with multiple queries AuthorDTO GetAuthor(int id) List<BookDTO> GetBooks(int authorID). But that means several accesses to database.
The ways I see it:
1. If AuthorDTO had a List<BookDTO> books field, the job could be done. But sometimes I would keep this list empty, for example if I listed only authors. That means some inconsistency, mess and a lot of details to remember.
2. Return Tuple<AuthorDTO, List<BookDTO>>. It might be a bit confusing.
3. Define a new DTO:
    AuthorDTO author
    List<BookDTO> books
The problem with sticking to a single AuthorDTO and selectively filling the List is that you are now forced to keep track of where that DTO came from. Is the list of Books not hydrated, or does this Author simply have no books? Do I have to go back to my controller and call a different method to get a different state of the same DTO? This lacks clarity from the consumer's standpoint.
In my experience, I've leaned toward more DTOs instead of trying to re-use a set of basic DTOs to represent multiple different sets of data. It requires a bit more "boilerplate", having to set up a whole bunch of similar DTOs and mappings between DTO and Entity, but in the end the specificity and clarity make the codebase easier to read and manage.
I think some clarification of the issues involved will actually solve your confusion here.
First and most importantly, your entity classes are DTOs. In fact, that's all they are. They're classes that represent a table structure in your database so that data from queries Entity Framework makes can be mapped on to them. In other words, they are literally objects that transfer data. The failing of Microsoft and subsequently far too many MVC developers is to conflate them with big-M Models described by the MVC pattern.
As a result, it makes absolutely zero sense to use Entity Framework to return one or more instances of an entity and then map that to yet another DTO class before finally utilizing it in your code. All you're doing is creating a pointless level of abstraction that adds nothing to your application but yet another thing to maintain.
As far as relationships go, that's where Entity Framework's lazy/eager loading comes in. In order to take advantage of it, though, the property representing the relationship must follow a very specific convention:
public virtual ICollection<Book> Books { get; set; }
If you type it as something like List<Book>, Entity Framework will not touch the relationship at all. It will not ever load the related entities and it will not persist changes made to that property when saving the entity back to the database. The virtual keyword allows Entity Framework to dynamically subclass your entity and override the collection property to add the logic for lazy-loading. Without that, the related entities will only ever be loaded if you explicitly use Load from the EF API.
Assuming your property is defined in that way, then you gain a whole world of abilities. If you want all books belonging to the author you can just interact with author.Books directly (iterate, query, whatever). No queries are made until you do something that requires evaluation of the queryset. EF issues just-in-time queries depending on the information you're requesting from the database. If you do want to load all the related books at the same time you retrieve the author, you can just use Include with your query:
var author = db.Authors.Include(m => m.Books).SingleOrDefault(m => m.Id == id);
My first question would be to ask why you are creating DTO's in the first place? Is there a consumer on the other end that is using this data? Is it a screen? Are you building DTO's just to build DTO's?
Since you tagged the question as MVC, I'm going to assume you are sending data to a view. You probably want a ViewModel. This ViewModel should contain all the data that is shown on the View that uses it. Then use Entity Framework to populate the view model. This may be done with a single query using projections or something more complex.
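To illustrate what populating a view model with a projection can look like, here is a minimal sketch. It runs against an in-memory IQueryable so it is self-contained; with EF the source would be something like db.Authors, and the assumption is that the same Select is translated into one SQL query. The view model AuthorWithTitlesViewModel and the helper AuthorsWithTitles are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Entity shapes as described in the question.
public class Book
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int AuthorId { get; set; }
}

public class Author
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<Book> Books { get; set; } = new List<Book>();
}

// Hypothetical view model for one specific page: authors plus matching book titles.
public class AuthorWithTitlesViewModel
{
    public int Id { get; set; }
    public string Name { get; set; }
    public IEnumerable<string> BookTitles { get; set; }
}

public static class AuthorQueries
{
    // Projects each author (and only the filtered book titles) into the view model.
    public static List<AuthorWithTitlesViewModel> AuthorsWithTitles(
        IQueryable<Author> authors, string titleFilter)
    {
        return authors
            .Select(a => new AuthorWithTitlesViewModel
            {
                Id = a.Id,
                Name = a.Name,
                BookTitles = a.Books
                    .Where(b => b.Name.Contains(titleFilter))
                    .Select(b => b.Name)
            })
            .ToList();
    }
}
```

Because the view model only carries what the page shows, there is no ambiguity about whether an empty list means "not loaded" or "no books".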
So after all that blathering. I would say you want option 3.
Just like the others said, for clarity reasons, you should avoid creating "generic" DTO's for specific cases.
When you want to sometimes have authors and some of their books then model a DTO for that.
When you need only the authors then create another DTO that is more suited for that.
Or maybe you don't need DTOs, maybe a List containing their names is enough. Or maybe you could in fact use an anonymous type, like new { AuthorId = author.Id, AuthorName = author.Name }. It depends on the situation.
If you're using ASP.NET MVC the DTO you'll want is in fact a ViewModel that best represents your page.
Based on what you've described, your view model could be something like this:
    public class BookViewModel {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public class AuthorViewModel {
        public int Id { get; set; }
        public string Name { get; set; }
        public List<BookViewModel> Books { get; set; } = new List<BookViewModel>();
    }

    public class AuthorsViewModel {
        public List<AuthorViewModel> Authors { get; set; } = new List<AuthorViewModel>();
        // add other properties to this class, like the filters used on the page...

        public void Load() {
            // here you can retrieve the data from your database.
            // you could do it like this:
            // step 1: retrieve data from DB via EF
            // step 2: fill in the Authors view models from the data at step 1
        }
    }

    // and in your controller you call the Load method to fill your view model with data from the db.
    public class AuthorsController : Controller {
        public ActionResult Index() {
            AuthorsViewModel model = new AuthorsViewModel();
            model.Load();
            return View(model);
        }
    }

.SaveChanges() stores duplicates in entity framework

for ( int i = 0; i < libraryList.Count; i++)
if (ModelState.IsValid)
A library contains an entity 'predefinedgoals' which is already set up in the DB. So when the above code runs it stores duplicates of 'predefinedgoals' and assigns new IDs to them.
I read that I should attach the existing entity to the context but I'm not sure how to do it in my scenario. The classes look like this:
    class library
    {
        int libraryID
        list<book> bks
    }
    class book
    {
        int bookID
        list<importantdates> impdts
    }
    class importantdate
    {
        int importantdateID
        predefinedgoal predfg
        int numberofresellers
    }
    class predefinedgoal
    {
        int predefinedgoalID
        string description
        int daysfrompublication
    }
I tried something like this right after ModelState.IsValid but I sense I'm doing it wrong:
var prdfgs= context.predefinedgoals.ToList();
foreach(var pg in prdfgs)
This answer is going to be based on a couple of assumptions, but I've seen this exact problem so many times that this is automatically my go-to answer.
What I think you're doing is that you're creating Library, Book, and ImportantDate objects (and setting up all of the relationships between them as well). In the process of doing all of this, however, you are trying to set the PreDefinedGoal navigational property on those ImportantDate objects, all the while leaving the actual int FK property (that would be something like PreDefinedGoalID), still set to 0. When that happens, Entity Framework disregards the fact that the object contained in the navigational property has an ID on it, and assumes that you are trying to create this PreDefinedGoal object from scratch, just like you're creating the ImportantDate object (as well as the others). It will then create a PreDefinedGoal object with the exact same data as the one you're actually trying to use, but it will create it as a separate, duplicate record in the database.
The solution to your problem then is simple: Don't set the navigational property. Just simply set the FK (ImportantDate.PreDefinedGoalID) to the ID of the PreDefinedGoal object that you want to hook up to it. When you do that, and you save it, Entity Framework will then reach out to the database for the correct object based on that ID, and thus you will avoid having duplicate PreDefinedGoal objects in your database.
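A minimal sketch of that advice, assuming EF6 code first. The class shapes are a guess at the asker's model with an explicit FK property added; LibraryContext and AddDate are hypothetical names used only for illustration.

```csharp
using System.Data.Entity; // EF6 (assumption)

public class PreDefinedGoal
{
    public int PreDefinedGoalID { get; set; }
    public string Description { get; set; }
}

public class ImportantDate
{
    public int ImportantDateID { get; set; }

    // Explicit FK property; by convention EF pairs it with the navigation property below.
    public int PreDefinedGoalID { get; set; }
    public virtual PreDefinedGoal PreDefinedGoal { get; set; }
}

// Hypothetical context name.
public class LibraryContext : DbContext
{
    public DbSet<ImportantDate> ImportantDates { get; set; }
    public DbSet<PreDefinedGoal> PreDefinedGoals { get; set; }
}

public static class ImportantDateWriter
{
    public static void AddDate(LibraryContext db, int existingGoalId)
    {
        // Only the FK is set. The navigation property stays null, so EF has
        // no PreDefinedGoal object in the graph that it could mark as Added.
        var date = new ImportantDate { PreDefinedGoalID = existingGoalId };
        db.ImportantDates.Add(date);
        db.SaveChanges(); // inserts one ImportantDate row pointing at the existing goal
    }
}
```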
FYI: I learned this from one of Julie Lerman's MSDN posts. If you're going to be working with EF, I highly recommend reading her posts and columns.
I am in the same situation and found a workaround. The way this workaround works makes me think that in this case EF is to blame for handling the situation badly.
In order to simplify the example I will just post an example with one object and its navigational property.
    public class Topic
    {
        public int Id { get; set; }
        public String Name { get; set; }
        public String Description { get; set; }
    }

    public class Course
    {
        public int Id { get; set; }
        public Topic Topic { get; set; }
        // additional properties don't matter now
    }
Note the absence of any foreign key or other data annotations. EF6 will correctly create the database schema from this and infer that Id is the primary key.
Without the workaround, adding a new course for an existing topic will create a new topic object with a new Id (overwriting the Id it was given!):
    db.Courses.Add(course);
    await db.SaveChangesAsync();
The braindead workaround:
    course.Topic = db.Topics.Find(course.Topic.Id);
    db.Courses.Add(course);
    await db.SaveChangesAsync();
In other words, if the topic has been loaded from the context directly, EF will recognize it as an existing topic and won't try to add it again.
Update: To just attach the entity without reloading it:
    db.Topics.Attach(course.Topic);
However, you will run into more issues with this setup; it is probably best to use ForeignKey attribute(s) and include the TopicId in the Course object. The following works OK but still looks ridiculous to me:
public int TopicId { get; set; }
public virtual Topic Topic { get; set; }
Would love to hear about a less redundant solution though.
The answer to why it stored duplicates in my scenario was that I performed tasks in two different classes - using different database context variables in each of them.
So class #1 is the one in my question; that's where I'm saving to the DB using context #1. In class #2 I retrieved all the PredefinedGoals and added them to ImportantDates, but to do this I created context #2. The IDs and objects were the same but retrieved from different context variables.
I solved it by retrieving the PredefinedGoals in class #1 with context variable #1 and sent them as an argument to class #2.

Changing relationships between tables to simplify data access in MVC3

MVC3 app with Entity Framework. Let's say I have 3 tables: Article, Category and Author.
I create relations between
Category.CategoryId -> Article.CategoryId and Author.AuthorId -> Article.AuthorId
Using code first navigation properties
public virtual Category Category { get; set; }
public virtual Author Author { get; set; }
That means that when I view a list of the articles I have to:
    return View(db.Article.Include(a => a.Category).Include(a => a.Author).ToList());
in order to have access to the names of categories and authors and not just their IDs.
How much would it hurt to break this classic schema and not create relationships between these tables? Then I could just return SelectLists from the Author and Category tables in a ViewModel and populate the Category and Author fields in my Article table directly with the corresponding names, not the IDs, and also preserve data integrity.
My query would be simplified to just:
return View(db.Article.ToList());
I suppose I will have to create indexes for those fields to speed up searches.
Is this being done somewhere or is it completely wrong?
Does it have better or worse performance?
@Panos, your original approach is correct; deleting the foreign keys would be a mistake. With the Includes you avoid lazy loading in this scenario and you get good performance.
public virtual Category Category { get; set; }
public virtual Author Author { get; set; }
You defined Category and Author as virtual, which means these objects won't be loaded with the article unless you add an Include to your query. You may use a select list in your grid without removing these relations, because the relations carry no real load unless you Include them in your query.
But be careful with .ToList(): it loads all records of your query, and later on that may become a large amount of data.

How to delete all many-to-many relations without loading related entities in Entity Framework?

I have a db schema where the Product table has a many-to-many relation to the Color table. I'm using EF and created POCO objects:
    public class Product
    {
        public Guid Id { get; set; }
        public ICollection<Color> Colors { get; set; }
    }

    public class Color
    {
        public Guid Id { get; set; }
        public ICollection<Product> Products { get; set; }
    }
In many situations it is necessary to delete all colors related to a product and set new colors. So I want to delete all many-to-many relations without knowing the exact ids of the related colors. Is it possible to delete them without additional queries to the db? I know I can just write a stored procedure which will delete all relations with colors for a specified product, but it would be better to find a general approach through Entity Framework.
If you don't know keys of colors you cannot delete them without loading them first - EF deletes records one by one so it needs to know which record to delete.
The straightforward option is executing a SQL DELETE directly:
    dbContext.Database.ExecuteSqlCommand("DELETE FROM dbo.ProductColors WHERE ProductId = @p0", product.Id);

Entity framework many to many relation bottleneck in inserting data

I have a view that shows the user a form; the user should upload a file and choose all the categories associated with it.
The controller that is responsible for submitting the data should:
- retrieve the file info and insert data in the file category
- retrieve the related category ids and insert them as well in the table that is abstracted by EF

In other words, just insert the file and the category ids.
This is my problem: the controller only gets some info about the category, not all of it. Basically it only needs the ids for the insertion.
I can't use
public ActionResult SaveFile(File file, List<Category> Checkbox, HttpPostedFileBase FileUpload)
//some stuff
//for example got the first category and named it to category1
I asked someone and he told me I have to select the category I want to insert.
Is this really necessary? I only need a category id and a file id to make the insert, so why would I fire another request to the database that I don't really need?
I am using EF 4.
It is better to select category first because it will save you a lot of possible problems but it is not necessary. You can use dummy category object:
var category = new Category { Id = receivedId };
This only creates a new Category instance with its PK set. Now you need to handle the file insertion, where you must explicitly instruct the ObjectContext to insert only the file (because your categories already exist in the database):
context.Files.Attach(file); // now whole object graph is attached but marked as Unchanged
context.ObjectStateManager.ChangeObjectState(file, EntityState.Added); // mark only file entity as inserted
You can also take opposite direction:
context.Files.AddObject(file); // all objects in object graph are marked for insertion
    foreach (var category in file.Categories)
    {
        // you don't want to insert categories again
        context.ObjectStateManager.ChangeObjectState(category, EntityState.Unchanged);
    }
This scenario works if you know that all categories exist in your database. If you want to insert new categories together with saving file you will need to query categories first or add some information about which category is new and which is existing.
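For comparison, here is how the same stub idea might look with the DbContext API (EF 4.1 and later) rather than the EF 4 ObjectContext used above. This is only a sketch under that assumption: FilesContext, FileSaver and SaveFile are hypothetical names, and the File and Category shapes are guessed from the question. Stubs are attached first so they are tracked as Unchanged, and the file is added afterwards.

```csharp
using System.Collections.Generic;
using System.Data.Entity; // DbContext API, EF 4.1+ (assumption)
using System.Linq;

public class Category
{
    public int Id { get; set; }
    public virtual ICollection<File> Files { get; set; }
}

public class File
{
    public int Id { get; set; }
    public string Name { get; set; }
    public virtual ICollection<Category> Categories { get; set; } = new List<Category>();
}

// Hypothetical context name.
public class FilesContext : DbContext
{
    public DbSet<File> Files { get; set; }
    public DbSet<Category> Categories { get; set; }
}

public static class FileSaver
{
    public static void SaveFile(FilesContext db, File file, IEnumerable<int> categoryIds)
    {
        // Distinct() avoids attaching two stubs with the same key, which would throw.
        foreach (var id in categoryIds.Distinct())
        {
            var stub = new Category { Id = id }; // stub entity: only the PK is set
            db.Categories.Attach(stub);          // tracked as Unchanged, no query issued
            file.Categories.Add(stub);
        }

        db.Files.Add(file); // the file is Added; the already-tracked stubs stay Unchanged
        db.SaveChanges();   // inserts the file and the junction rows only
    }
}
```

As with the ObjectContext version, this only works when every id refers to a category that already exists in the database.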
