DBContext (entity framework) and pre-loaded entities - asp.net-mvc

I use code first in a web application where I have a form to upload text files and import the data into my database.
Each file may have up to 20.000+ records for import.
To speed things up I preload some entities so that I don't have to query the DbContext every time. Then when I create an object for insert, I do for example:
myNewObject.Category = preloadedCategories.First(p => p.Code == code);
I have read some articles on the web because EF is extremely slow on batch inserts, so what I do is:
first use Configuration.AutoDetectChangesEnabled = false;
then every 1000 records I dispose the context and create a new one.
BUT! Since the preloaded entities were loaded from a DbContext that has since been disposed, after creating a new DbContext I have a problem with preloadedCategories.First(p => p.Code == code). When I call SaveChanges(), EF also tries to save the preloadedCategories.First(p => p.Code == code) object and fails.
So how can I achieve this? I don't want to ask the DbContext every time to load some (non-changing) objects. Is it possible?
thanks

When dealing with a large number of records in EF, a few things will help:
As @janhartmann states, use .AsNoTracking()
As you stated, use Configuration.AutoDetectChangesEnabled = false, which will require the next point
Use context.Entry(category).State = EntityState.Modified to attach a disconnected entity to a context and mark it as modified
Also check that preloadedCategories is no longer an IQueryable, so that the data really is local and is not being lazy loaded from the database.
If there are no changes to your Category object and you just want to link your myNewObject to an existing category, you have two options:
Set the foreign key on myNewObject instead of the navigation property
Use context.Entry(myNewObject).State = EntityState.Added instead of context.Products.Add(myNewObject) to avoid it adding the entire graph of navigation properties
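The foreign key option can be sketched without touching EF at all. This is only an illustration under assumptions: the Category and Product shapes, the CategoryId property and the ImportSketch helper are all hypothetical, since the real model is not shown in the question. The idea is to turn the preloaded categories into an in-memory lookup once and copy only the key onto each new object, so no Category instance from a disposed context ends up in the graph that is saved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shapes; the real model is not shown in the question.
public class Category
{
    public int Id { get; set; }
    public string Code { get; set; }
}

public class Product
{
    public string Name { get; set; }
    public int CategoryId { get; set; }   // assumed foreign key property
}

public static class ImportSketch
{
    // Builds new products that reference categories by key only, so no
    // Category instance from a disposed context is part of the new graph.
    public static List<Product> BuildProducts(
        IEnumerable<Category> preloadedCategories,
        IEnumerable<(string Name, string Code)> rows)
    {
        // Materialize once into an in-memory lookup: Code -> Id.
        Dictionary<string, int> idByCode =
            preloadedCategories.ToDictionary(c => c.Code, c => c.Id);

        return rows
            .Select(r => new Product { Name = r.Name, CategoryId = idByCode[r.Code] })
            .ToList();
    }
}
```

The resulting objects could then be added to whichever context is current for the batch; a dictionary lookup also avoids scanning the preloaded list with First() for every record.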
Good luck

Related

Fix unnecessary copy of NSManagedObject

I'm sorry if the title is misleading, since I'm not so good at English. Let me describe my problem below (you may skip to the TL;DR version at the bottom of this question).
In Core Data, I designed a Product entity. In the app, I download products from a server. It returns a JSON string, which I parse and then save to Core Data.
After some time has passed, I search for a product on that server again and interact with the server. Let me call the online product XProduct. This product may not exist in Core Data, and I also don't want to save it to Core Data since it may not belong to this system (it comes from another warehouse, not my current warehouse).
Since XProduct has the same properties as Product but does not belong to Core Data, the previous developer designed another object, XProduct, and copied everything (the code) from Product. Wow. The other difference between the two is that XProduct has some methods to interact with the server, like: - (void)updateStock:(NSInteger)qty;
Now, whenever I want to change the Product properties, I have to update XProduct as well. And I have to handle the two separately, like:
id product = anArrayContainsProducts[indexPath.row];
if ([product isKindOfClass:[XProduct class]]) {
    // Some stuff with the xproduct
}
else {
    // Probably the same display to the cell.
}
TL;DR
Basically, I want to create a scenario like this:
1. Get data from the server.
2. Check whether it exists in Core Data.
3. If 2 is true => add it to the array (and possibly update some data from the server).
4. If 2 is false => create an object (with the same structure as the NSManagedObject) from the JSON dictionary => add it to the array.
The object created in step 4 will never exist in Core Data.
Questions
1. How can I create an NSManagedObject without adding it to an NSManagedObjectContext, and make sure the app still runs fine?
2. If 1 is not encouraged, please suggest a better approach. I really don't like duplicating so much code like that.
Update
I was thinking about inheritance (XProduct : Product), but that would still make XProduct a subclass of NSManagedObject, so I don't think that is a good approach.
There are a couple of possibilities that might work.
One is just to create the managed objects but not insert them into a context. When you create a managed object, the context argument is allowed to be nil. For example, the NSManagedObject initializer initWithEntity:insertIntoManagedObjectContext: accepts a nil context. That gives you an instance of the managed object that's not going to be saved. Such objects have the same lifetime as any other object.
Another is to use a second Core Data stack for these objects, with an in-memory persistent store. If you use NSInMemoryStoreType when adding the persistent store (instead of NSSQLiteStoreType), you get a complete, working Core Data stack. Except that when you save changes, they only get saved in memory. It's not really persistent, since it disappears when the app exits, but aside from that it's exactly the same as any other Core Data stack.
I'd probably use the second approach, especially if these objects have any relationships, but either should work.

Entity Framework - persisting changes to database only partially

Is it possible to select which portion of the Entity Framework entities is persisted back to the database?
ObjectContext.SaveChanges() saves everything but if I want to persist only certain items, how to do that?
You need to detach the objects you don't want persisted from the ObjectContext. You do this by calling Detach on the context:
context.Detach(context.Products.First());
or, if you are using the DbContext API, by setting the entry's state to Detached:
context.Entry(context.Products.First()).State = EntityState.Detached;
Use multiple contexts to keep track of different sets of data:
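The following pseudocode should help you. Clearly there is more than one way to do this.
// MyEntities is a placeholder for your own ObjectContext-derived class.
using (var context1 = new MyEntities())
{
    using (var context2 = new MyEntities())
    {
        // Make the changes you want to keep on entities loaded from context2,
        // and everything else on entities loaded from context1.

        // Only the changes tracked by context2 are saved to the database.
        context2.SaveChanges();
    }
}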

Why is Entity framework loading data from the db when I set a property?

I have two tables (there are more in the database but only two are involved here).
Account and AccountStatus; an account can have an AccountStatus (active, inactive, etc.).
I create a new Account and set a couple of properties but when I reach this code:
1. var status = db.AccountStatuses.SingleOrDefault(s => s.ID == (long)AccountStatusEnum.Active);
2. account.AccountStatus = status;
3. db.Accounts.AddObject(account);
The first line executes fine, but when I reach the second line it takes a REALLY long time, and when I step into the code it seems that every single account is loaded from the database.
I don't see why it should even want to load all the accounts?
We use Entity Framework 4 and Poco and we have lazy loading enabled.
Any suggestions?
Cheers
/Jimmy
You have to be careful which constructs you use to fetch data, as some will pull in the whole set and filter afterward. (Aside: the long delay may be the database being created and seeded if there isn't one already; that happens the first time you touch it, most likely with a query of some sort. Also remember that when you retrieve a whole dataset, what you actually hold may be just a compiled query that won't be evaluated until you interact with it.)
Try this form instead and see if you have the same issue:
var status = db.AccountStatuses.Where(s => s.ID == (long)AccountStatusEnum.Active);
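The point about a query not being evaluated until you interact with it can be seen with plain LINQ to Objects. This is just a sketch and it does not touch Entity Framework or a database; DeferredExecutionSketch and its counter are made up for the illustration. It counts how often the filter runs before and after the query is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Returns how many times the predicate ran before and after enumeration.
    public static (int BeforeEnumeration, int AfterEnumeration) Run()
    {
        int predicateCalls = 0;
        var ids = new List<long> { 1, 2, 3 };

        // Building the query does not run the predicate yet.
        IEnumerable<long> query = ids.Where(id =>
        {
            predicateCalls++;
            return id == 2;
        });
        int before = predicateCalls;

        // Enumerating (here via ToList) is what actually executes the filter.
        List<long> result = query.ToList();
        int after = predicateCalls;

        return (before, after);
    }
}
```

With an IQueryable against a database the same deferral applies, which is why the slow step is not always the line that looks like the query.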

Setting a collection of related entities in the correct way in EF4 using POCO's (src is the DB)

I have a POCO entity Report with a collection of a related POCO entity Reference. When creating a Report I get an ICollection<int> of ids. I use this collection to query the reference repository to get an ICollection<Reference> like so:
from r in referencesRepository.References
where viewModel.ReferenceIds.Contains(r.Id)
select r
I would like to connect the collection straight to Report like so:
report.References = from r in referencesRepository.References
where viewModel.ReferenceIds.Contains(r.Id)
select r;
This doesn't work because References is an ICollection and the result is an IEnumerable. I can do ToList(), but I think I will then load all of the references into memory. There also is no AddRange() function.
I would like to be able to do this without loading them into memory.
My question is very similar to this one. There, the only solution was to loop through the items and add them one by one. Except in this question the list of references does not come from the database (which seemed to matter). In my case, the collection does come from the database. So I hope that it is somehow possible.
Thanks in advance.
When working with Entity Framework you must load objects into memory if you want to work with them, so basically you can do something like this:
report.References = (from r in referencesRepository.References
where viewModel.ReferenceIds.Contains(r.Id)
select r).ToList();
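Why ToList() resolves the type mismatch can be shown in isolation. The sketch below uses in-memory stand-ins (the Reference and Report classes and the ToListSketch helper are hypothetical, and no repository or context is involved): the filtered query is only an IEnumerable, while the List it is materialized into implements ICollection and holds just the matching items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entities in the question.
public class Reference { public int Id { get; set; } }
public class Report { public ICollection<Reference> References { get; set; } }

public static class ToListSketch
{
    public static Report Build(IEnumerable<Reference> source, ICollection<int> ids)
    {
        // The query alone is an IEnumerable<Reference>; it cannot be assigned
        // to an ICollection<Reference> property.
        IEnumerable<Reference> query = source.Where(r => ids.Contains(r.Id));

        // ToList() materializes only the matching items into a List<Reference>,
        // which implements ICollection<Reference>.
        return new Report { References = query.ToList() };
    }
}
```

Against a database, the filter should be applied by the query itself before ToList() runs, so only the selected references are loaded rather than the whole table.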
Another approach is to use dummy objects, but it can cause other problems. A dummy object is a new instance of Reference that has only its Id set to the PK of an existing object in the DB; it will act like that existing object. The problem is that when you add the Report object to the context, you must manually set each Reference instance to the Unchanged state in the ObjectStateManager, otherwise it will be inserted into the DB.
report.References = viewModel.ReferenceIds.Select(i => new Reference { Id = i }).ToList();
// later in Report repository
context.Reports.AddObject(report);
foreach (var reference in report.References)
{
    context.ObjectStateManager.ChangeObjectState(reference, EntityState.Unchanged);
}

Tables from 2 databases in one LINQ to SQL class

I want to include a table (Events) from another database in my LINQ to SQL class.
How should I format the Data Source of that table?
I have tried with IP.dbo.Events, IP.DatabaseName.dbo.Events and so on, but can not get it to work.
The reason for all this is to select the Events that a Client has attended.
I have also tried putting the table in another LINQ to SQL class, but then it complains about the Data Context.
public static IQueryable<Event> ByClientID(this IQueryable<Event> events, int clientID)
{
    events = from e in events
             from c in new MyDataContext().ClientCourses
             where e.EventID == c.CourseID &&
                   c.ClientID == clientID
             select e;
    return events;
}
You can only use tables that reside on the same physical SQL Server in two different instances. I did this once as someone had "cleverly" put an application's DB across two database instances.
There is a blog post on it here that may help.
Could you create a view that returns the data from the 2nd database and use this instead? (Not tried this so absolutely no idea if it'll work :)
Obviously this is no good if you need to be saving to the other database too..
