Navigation Property Not Evaluated EF Core - ef-core-2.0

I have an issue in EF Core where I am trying to get a related entity and all its dependent structures, but am not having much success with it.
Currently, I have a query like this:
var user = new Guid(id);
var userCustAffs = _data.UserCustomerAffiliation.Include(x => x.Customer)
.ThenInclude(x => x.Brand).Where(x => x.UserId.Equals(user)).ToList();
var result = userCustAffs.Select(p => p.Customer).ToList();
When I should be able to do something like this to simplify it (and avoid unnecessary work being evaluated locally rather than in the db):
var user = new Guid(id);
var userCustAffs = _data.UserCustomerAffiliation.Include(x => x.Customer)
.ThenInclude(x => x.Brand).Where(x => x.UserId.Equals(user))
.Select(y => y.Customer).ToList();
However, when I do the latter query, I get an error that
The Include operation for navigation '[x].Customer.Brand' is unnecessary and was ignored
because the navigation is not reachable in the final query results
However, Brand is very important, as it drives some of the properties off of the Customer model. What is the proper way to restructure this query so that I get the results I want (e.g. Customer with its relevant Brand, limited by the userId affiliated on the UserCustomerAffiliation table).
I have seen a recommendation before to "start" the query from the Customer instead of UserCustomerAffiliation, but that seems contrary to every instinct I have from a DB optimization standpoint (and Customer does not have a navigation property back to UserCustomerAffiliation atm).

The answer to why this happens (after some research) is quite interesting, and a good example of why knowing how EF Core works is important to using it.
Linq in general works on the idea of deferred execution. To put it very simply, if I make a Linq statement on a particular line, it may not get evaluated or executed until the data is "needed." Most of the time we shortcut this with .ToList() which forces immediate execution. The general idea here is that sometimes datasets are not needed (say, if an exception occurs before it gets evaluated but after it would be 'loaded').
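A minimal sketch of deferred execution, using plain LINQ to Objects rather than EF Core. The names (numbers, evaluated, evens) are made up for illustration; the side effect inside the lambda is only there to make the moment of evaluation visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var evaluated = new List<int>();
var numbers = new[] { 1, 2, 3, 4 };

// Building the query runs nothing yet: the lambda has not been called.
var evens = numbers.Where(n =>
{
    evaluated.Add(n);   // side effect so we can observe when evaluation happens
    return n % 2 == 0;
});
int countBefore = evaluated.Count;   // nothing has been inspected so far

// ToList() forces execution; every element is inspected now.
var materialized = evens.ToList();
int countAfter = evaluated.Count;

Console.WriteLine($"before ToList: {countBefore}, after ToList: {countAfter}");
// prints "before ToList: 0, after ToList: 4"
```

With an EF Core IQueryable the same principle applies, except that "execution" means sending SQL to the database.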
EF Core takes this one step further and ties the idea of deferred execution with database optimization. If, for example, I get a subset of data from the database:
var result = _context.SomeTable.Where(x => x.name == "SomeValue");
But later all I care about is the size of the dataset:
return result.Count();
The DB call can be optimized to
select count(*) from SomeTable where name = 'SomeValue';
instead of
select * from SomeTable where name = 'SomeValue';
Similarly, the query I have above was being optimized away. Because I chained the whole thing before it was evaluated, the EF Core optimizer threw away a table I needed.
The reason this works:
var user = new Guid(id);
var userCustAffs = _data.UserCustomerAffiliation.Include(x => x.Customer)
.ThenInclude(x => x.Brand).Where(x =>
x.UserId.Equals(user)).ToList();
var result = userCustAffs.Select(p => p.Customer).ToList();
Is because I force execution of the query that is something like
Select u.*, c.*, b.* from usercustomeraffiliation u
inner join Customer c on u.customerid = c.id
inner join Brand b on c.brandid = b.id
where u.userid = 'userId';
And then strip out the customer object (and the brand object underneath it) in memory. It would be more efficient to be able to generate a query like:
Select c.*, b.* from Customer c
inner join Brand b on c.brandid = b.id
where c.id in (select u.customerid from usercustomeraffiliation u
where u.userid = 'userId');
But, that gets optimized away.

Related

EFCORE 2.2.6 Simple join throws Must be a reducible node

I'm upgrading an application to EF 2.2, but using EFCore 2.2.6 keeps throwing "Must be a reducible node" ArgumentException when I try to do a simple join.
like
var list = (from a in db.TableA().Include("TableC")
join b in inMemoryList on a.Id equals b.AId
select a).ToList();
If I change to
var list = (from a in db.TableA().ToList()
join b in inMemoryList on a.Id equals b.AId
select a).ToList();
It works, but it slows down the process.
Any ideas? Thanks
This happens from time to time between EF Core versions.
In general, avoid joins to in-memory collections if you can - they've never been supported well.
Use Contains for single field filtering where possible, for instance
var ids = inMemoryList.Select(x => x.AId); // has to be outside the query expression tree
var list = db.TableA.Where(a => ids.Contains(a.Id)).ToList();

Accessing columns in cross reference table via LINQ with Entity Framework

I have a Users table, a Roles table, and a cross reference table Users_Roles which has the following columns:
User_ID Role_ID Beamline_ID Facility_ID Laboratory_ID
The Beamline_ID, Facility_ID, Laboratory_ID are only filled in depending on the Role_ID. If someone has the Role_ID of 2 ("Lab Admin") then they will have an entry in Laboratory_ID.
I am trying to figure out how to get the Laboratory_ID for a specific row in this table. For example, I know I have User_ID = 1. I want to get the Laboratory_ID for User_ID = 1 and Role_ID = 2 ("Lab Admin").
This is obviously simple when dealing with SQL but I am new to Entity Framework and I am trying to do this with Entities and I'm having some trouble. I am using MVC so in my controller I have done this:
User user = new User();
user.GetUser(User.Identity.Name);
var labID = user.Users_Roles.Where(r => r.Role_ID == 2);
That should get me the "row" of that user when the Role = Lab Admin but I don't know how to grab the Laboratory_ID column now. I thought maybe it would be:
var labID = user.Users_Roles.Where(r => r.Role_ID == 2).Select(l => l.Laboratory_ID);
But that is not correct. Any help would be greatly appreciated.
EDIT: I am using the database first approach and I am using DBContext. So typically I would access the context like this:
var context = new PASSEntities();
As for why the code you've already tried doesn't work, at first glance I think you just need to tack on a .Single() to the end:
var labID = user.Users_Roles.Where(r => r.Role_ID == 2)
.Select(l => l.Laboratory_ID)
.Single();
Select() returns a sequence, and it looks like you want a single value, so I think that might be the cause of the problem you were having. Single() will throw an exception if there's not exactly one result, which sounds appropriate for your table structure, but there's also First() if you don't care about enforcing that.
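To make the difference concrete, here is a small sketch using LINQ to Objects. The usersRoles array is a hypothetical stand-in for user.Users_Roles, and the values in it are invented for the example.

```csharp
using System;
using System.Linq;

// Hypothetical rows standing in for user.Users_Roles.
var usersRoles = new[]
{
    new { Role_ID = 1, Laboratory_ID = (int?)null },
    new { Role_ID = 2, Laboratory_ID = (int?)7 },
};

// Where + Select alone gives a sequence (here with one element), not a value.
var sequence = usersRoles.Where(r => r.Role_ID == 2).Select(r => r.Laboratory_ID);

// Single() unwraps it, and throws if there are zero or several matches.
int? labId = sequence.Single();

// First() only needs at least one match.
int? firstLabId = sequence.First();

bool singleThrew = false;
try
{
    // No row has Role_ID == 99, so Single() throws InvalidOperationException.
    usersRoles.Where(r => r.Role_ID == 99).Select(r => r.Laboratory_ID).Single();
}
catch (InvalidOperationException)
{
    singleThrew = true;
}

Console.WriteLine($"labId={labId}, first={firstLabId}, singleThrew={singleThrew}");
// prints "labId=7, first=7, singleThrew=True"
```

If "no match" is a normal case rather than an error, SingleOrDefault() or FirstOrDefault() return the default value instead of throwing.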
But for this kind of thing, you might also think about just querying manually using your DbContext, instead of trying to load entities individually and then traversing down through their navigation properties -- that results in more local searching than you probably need, and, depending on the circumstances, might be performing more/larger queries against your context and database than necessary too.
Here's an example LINQ query you could use (you might have to adjust the names of your tables and so on):
var labID = (from ur in context.Users_Roles
where ur.User_ID == 1 && ur.Role_ID == 2
select ur.Laboratory_ID).Single();
When you execute that statement, it will get translated into SQL roughly equivalent to this (Single() requests up to two rows so that it can throw if more than one matches):
SELECT TOP 2 Laboratory_ID FROM Users_Roles WHERE User_ID = 1 AND Role_ID = 2
...so it's pretty efficient. (Note that a LINQ query like this is always sent to the database, even if the matching record is already tracked by the context; only DbSet.Find(), which looks an entity up by its primary key, checks the context's cache first.)
That's a very bare-bones query, though; more than likely, you're eventually going to want to query based on properties of the user or role, instead of just searching for hard-coded IDs you know in advance. In that case, you can adjust your query to something like this:
var labID = (from u in context.Users
join ur in context.Users_Roles on u.User_ID equals ur.User_ID
join r in context.Roles on ur.Role_ID equals r.Role_ID
where u.UserName == "John Smith" && r.RoleName == "Lab Admin"
select ur.Laboratory_ID).Single();
That would, as you can probably tell, return the Lab ID for the user John Smith under the role Lab Admin.
Does that make any sense? LINQ syntax can take a little getting used to.

Alternative to using String.Join in Linq query

I am trying to use the Entity Framework in my ASP MVC 3 site to bind a Linq query to a GridView datasource. However since I need to pull information from a secondary table for two of the fields I am getting the error
LINQ to Entities does not recognize the method 'System.String Join(System.String, System.Collections.Generic.IEnumerable`1[System.String])' method, and this method cannot be translated into a store expression.
I would like to be able to do this without creating a dedicated view model. Is there an alternative to using String.Join inside a Linq query?
var grid = new System.Web.UI.WebControls.GridView();
//join a in db.BankListAgentId on b.ID equals a.BankID
var banks = from b in db.BankListMaster
where b.Status.Equals("A")
select new
{
BankName = b.BankName,
EPURL = b.EPURL.Trim(),
AssociatedTPMBD = b.AssociatedTPMBD,
FixedStats = String.Join("|", from a in db.BankListAgentId
where a.BankID == b.ID &&
a.FixedOrVariable.Equals("F")
select a.AgentId.ToString()),
VariableStats = String.Join("|", from a in db.BankListAgentId
where a.BankID == b.ID &&
a.FixedOrVariable.Equals("V")
select a.AgentId.ToString()),
SpecialNotes = b.SpecialNotes,
};
grid.DataSource = banks.ToList();
grid.DataBind();
If you're not overly worried about performance (since it has subqueries, it may generate n+1 queries to the database, and if the database rows are large, you may fetch unnecessary data), the simplest fix is to add an AsEnumerable() to do the String.Join on the web/application side:
var banks = (from b in db.BankListMaster
where b.Status.Equals("A") select b)
.AsEnumerable()
.Select(x => new {...})
At the point of the call to AsEnumerable(), the rest of the Linq query will be done on the application side instead of the database side, so you're free to use any operators you need to get the job done. Of course, before that you'll want to filter the result as much as possible.
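As a rough illustration of that split, the sketch below uses in-memory collections as hypothetical stand-ins for db.BankListMaster and db.BankListAgentId (the sample rows are invented, and only a few of the original fields are shown). The filter sits before AsEnumerable(), the String.Join after it.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for db.BankListMaster and db.BankListAgentId.
var bankListMaster = new[]
{
    new { ID = 1, BankName = "Alpha", Status = "A" },
    new { ID = 2, BankName = "Beta",  Status = "I" },
}.AsQueryable();
var bankListAgentId = new[]
{
    new { BankID = 1, AgentId = 10, FixedOrVariable = "F" },
    new { BankID = 1, AgentId = 11, FixedOrVariable = "F" },
    new { BankID = 1, AgentId = 12, FixedOrVariable = "V" },
}.ToList();

var rows = bankListMaster
    .Where(b => b.Status == "A")   // still part of the translatable query
    .AsEnumerable()                // everything after this runs in application code
    .Select(b => new
    {
        b.BankName,
        // With a real context, each of these lookups would be a separate query per bank.
        FixedStats = string.Join("|", bankListAgentId
            .Where(a => a.BankID == b.ID && a.FixedOrVariable == "F")
            .Select(a => a.AgentId.ToString())),
        VariableStats = string.Join("|", bankListAgentId
            .Where(a => a.BankID == b.ID && a.FixedOrVariable == "V")
            .Select(a => a.AgentId.ToString())),
    })
    .ToList();

foreach (var row in rows)
    Console.WriteLine($"{row.BankName}: fixed={row.FixedStats}, variable={row.VariableStats}");
// prints "Alpha: fixed=10|11, variable=12"
```

The resulting list of anonymous objects can be assigned to grid.DataSource in the same way as in the question.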

How can I set a many-to-many EntityCollection in Entity Framework efficiently?

When Entity Framework generates an ObjectContext for two database tables (let's say Table1 and Table2) connected with a many-to-many relationship table, it doesn't create an object for the xref table, opting instead for collection properties on either end of the relationship. So on Table1 you have EntityCollection<Table2> Table2s and on Table2 you have EntityCollection<Table1> Table1s. In most cases that's actually pretty great...
However, in this scenario, I have a list of integers that represent the database IDs of the Table2 rows that should be in the Table1.Table2s collection.
I can't see any way to just set that collection using the entity keys, so I'm stuck selecting these into the ObjectContext, which is already a ton of work to do for no reason. I let myself hope that LINQ-to-Entities will intelligently defer the execution and perform it all on the SQL server like I would like (though my Where uses Contains, which may or may not be correctly translated to IN() in SQL). So I can go as far as:
table1instance.Table2s.Clear();
var table2sToInclude = context.Table2s.Where(
t =>
listOfTable2DatabaseIds.Contains(t.Id));
But there's no EntityCollection<T>.AddRange(IEnumerable<T>) or anything, nor is there an IEnumerable<T>.ToEntityCollection<T>() extension method of course, so I don't know what to do with these results at this point. All I can do is
foreach (var table2 in table2sToInclude)
{
table1instance.Table2s.Add(table2);
}
which seems ridiculous and I know will force a lot of unnecessary evaluation.
Is there a "correct", or, perhaps, "less lame" way to do this?
No, EF will not defer any query execution in the way you are hoping. There is nothing like an insert-from-select. LINQ to Entities is just a query language, and a query's only responsibility is to execute and return results. It is strictly separated from the persistence functionality offered by EF itself.
If you want to create relations between an existing item from table1 and existing items from table2, you can use code like this:
using (var ctx = new YourContext())
{
    // Stub entity for the existing Table1 row; only the key is needed
    var table1 = new Table1 { Id = 123 };
    ctx.Table1s.Attach(table1);
    // Build stub Table2 entities from the known IDs instead of querying for them
    foreach (var table2 in listOfTable2DatabaseIds.Select(id => new Table2 { Id = id }))
    {
        ctx.Table2s.Attach(table2);
        table1.Table2s.Add(table2);
    }
    ctx.SaveChanges();
}
This code creates relations between the Table1 item with ID 123 and all Table2 items whose IDs are in listOfTable2DatabaseIds, without loading a single record from the database.
What makes adding records one by one "lame"? Consider what the benefit of AddRange actually is: in a typical collection, AddRange extends the capacity of the internal array and copies the items into it. EntityCollection is not a typical array-backed collection; it must process each added entity. So even if there were an AddRange, it would internally iterate over the items and process them one by one.

ASP.NET MVC & EF4 Entity Framework - Are there any performance concerns in using the entities vs retrieving only the fields i need?

Let's say we have 3 tables: Users, Products, Purchases.
There is a view that needs to display the purchases made by a user.
I could lookup the data required by doing:
from p in DBSet<Purchases>.Include("User").Include("Product") select p;
However, I am concerned that this may have a performance impact because it will retrieve the full objects.
Alternatively, I could select only the fields I need:
from p in DBSet<Purchases>.Include("User").Include("Product") select new SimplePurchaseInfo() { UserName = p.User.name, Userid = p.User.Id, ProductName = p.Product.Name ... etc };
So my question is:
What's the best practice in doing this?
== EDIT
Thanks for all the replies.
[QUESTION 1]: I want to know whether all views should work with flat ViewModels with very specific data for that view, or should the ViewModels contain the entity objects.
Real example: User reviews Products
var query = from dr in productRepository.FindAllReviews()
where dr.User.UserId == "userid"
select dr;
string sql = ((ObjectQuery)query).ToTraceString();
SELECT [Extent1].[ProductId] AS [ProductId],
[Extent1].[Comment] AS [Comment],
[Extent1].[CreatedTime] AS [CreatedTime],
[Extent1].[Id] AS [Id],
[Extent1].[Rating] AS [Rating],
[Extent1].[UserId] AS [UserId],
[Extent3].[CreatedTime] AS [CreatedTime1],
[Extent3].[CreatorId] AS [CreatorId],
[Extent3].[Description] AS [Description],
[Extent3].[Id] AS [Id1],
[Extent3].[Name] AS [Name],
[Extent3].[Price] AS [Price],
[Extent3].[Rating] AS [Rating1],
[Extent3].[ShopId] AS [ShopId],
[Extent3].[Thumbnail] AS [Thumbnail],
[Extent3].[Creator_UserId] AS [Creator_UserId],
[Extent4].[Comment] AS [Comment1],
[Extent4].[DateCreated] AS [DateCreated],
[Extent4].[DateLastActivity] AS [DateLastActivity],
[Extent4].[DateLastLogin] AS [DateLastLogin],
[Extent4].[DateLastPasswordChange] AS [DateLastPasswordChange],
[Extent4].[Email] AS [Email],
[Extent4].[Enabled] AS [Enabled],
[Extent4].[PasswordHash] AS [PasswordHash],
[Extent4].[PasswordSalt] AS [PasswordSalt],
[Extent4].[ScreenName] AS [ScreenName],
[Extent4].[Thumbnail] AS [Thumbnail1],
[Extent4].[UserId] AS [UserId1],
[Extent4].[UserName] AS [UserName]
FROM [ProductReviews] AS [Extent1]
INNER JOIN [Users] AS [Extent2] ON [Extent1].[UserId] = [Extent2].[UserId]
LEFT OUTER JOIN [Products] AS [Extent3] ON [Extent1].[ProductId] = [Extent3].[Id]
LEFT OUTER JOIN [Users] AS [Extent4] ON [Extent1].[UserId] = [Extent4].[UserId]
WHERE N'615005822' = [Extent2].[UserId]
or
from d in productRepository.FindAllProducts()
from dr in d.ProductReviews
where dr.User.UserId == "userid"
orderby dr.CreatedTime
select new ProductReviewInfo()
{
product = new SimpleProductInfo() { Id = d.Id, Name = d.Name, Thumbnail = d.Thumbnail, Rating = d.Rating },
Rating = dr.Rating,
Comment = dr.Comment,
UserId = dr.UserId,
UserScreenName = dr.User.ScreenName,
UserThumbnail = dr.User.Thumbnail,
CreateTime = dr.CreatedTime
};
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name],
[Extent1].[Thumbnail] AS [Thumbnail],
[Extent1].[Rating] AS [Rating],
[Extent2].[Rating] AS [Rating1],
[Extent2].[Comment] AS [Comment],
[Extent2].[UserId] AS [UserId],
[Extent4].[ScreenName] AS [ScreenName],
[Extent4].[Thumbnail] AS [Thumbnail1],
[Extent2].[CreatedTime] AS [CreatedTime]
FROM [Products] AS [Extent1]
INNER JOIN [ProductReviews] AS [Extent2] ON [Extent1].[Id] = [Extent2].[ProductId]
INNER JOIN [Users] AS [Extent3] ON [Extent2].[UserId] = [Extent3].[UserId]
LEFT OUTER JOIN [Users] AS [Extent4] ON [Extent2].[UserId] = [Extent4].[UserId]
WHERE N'userid' = [Extent3].[UserId]
ORDER BY [Extent2].[CreatedTime] ASC
[QUESTION 2]: What's with the ugly outer joins?
In general, retrieve only what you need, but retrieve enough that your application is not too chatty. If you can batch several things together, do so; otherwise you pay the network cost every time you go back to the database for more data.
In this case, assuming you will only need those info, I would go with the second approach (if that's what you really need).
Eager loading with .Include doesn't really play nice when you want filtering (or ordering for that matter).
That first query is basically this:
select p.*, u.*, p2.*
from purchases p
left outer join users u on p.userid = u.userid
left outer join products p2 on p.productid = p2.productid
where u.userid = @p1
Is that really what you want?
There is a view that needs to display the purchases made by a user.
Well then why are you including "Product"?
Shouldn't it just be:
from p in DBSet<Purchases>.Include("User") select p;
Your second query will error. You must project to an entity on the model, or an anonymous type - not a random class/DTO.
To be honest, the easiest and most well performing option in your current scenario is to query on the FK itself:
var purchasesForUser = DBSet<Purchases>.Where(x => x.UserId == userId);
That should produce:
select p.*
from purchases p
where p.UserId = @p1
The above query of course requires you to include the foreign keys in the model.
If you don't have the FK's in your model, then you'll need more LINQ-Entities trickery in the form of anonymous type projection.
Overall, don't go out looking to optimize. Create queries which align with the scenario/business requirement, then optimize if necessary - or look for alternatives to LINQ-Entities, such as stored procedures, views or compiled queries.
Remember: premature optimization is the root of all evil.
EDIT - In response to Question Update
[QUESTION 1]: I want to know whether all views should work with flat ViewModels with very specific data for that view, or should the ViewModels contain the entity objects.
Yes - ViewModels should only contain what is required for that View. Otherwise why have the ViewModel? You may as well bind straight to the EF model. So, set up the ViewModel with only the fields it needs for the view.
[QUESTION 2]: What's with the ugly outer joins?
That is default behaviour for .Include. .Include always produces a left outer join.
I think the second query will throw an exception because you can't map the result to an unmapped .NET type in LINQ to Entities. You have to return an anonymous type and map it to your object in LINQ to Objects, or you have to use more advanced projection concepts: QueryView (projections in ESQL) or DefiningQuery (a custom SQL query mapped to a new read-only entity).
Generally it is more about the design of your entities. If you select a single small entity, loading it whole instead of projecting makes little difference. If you are selecting a list of entities, you should consider projections, especially if the tables contain columns like nvarchar(max) or varbinary(max) that are not needed in your result!
Both create almost the same query: a select from one table with two inner joins. The only thing that changes from a database perspective is the number of fields returned, but that shouldn't matter much.
I think DRY wins here over the performance hit (if there even is one), so my call is to go for the first option.

Resources