I have been trying to find the correct LINQ LAMBDA expression for the last month, with zero successful results.
I cannot use the following tools:
Linqer is unable to run because the Microsoft tool it uses to create the SQL connection (the dbml file) refuses to install on my Win8.1 system
LinqPad doesn’t provide an actual translation until you actually buy the product (which makes the “trial” fundamentally broken in the first place)
I have three levels of tables that I need to bring back into a single viewmodel.
Level one is a company table. Easy as s**t.
Level two is a "cycle" table. This is where I have gotten hung up, since many cycles can exist for a company, but I need to grab only the latest cycle by date.
Level three is a pair of tables that hang off the cycle. I only need a true/false test for content in those tables for the cycle in question. I haven't even tried this yet.
So far I have come up with a minimally functional SQL script (that only deals with the first two levels), but my MVC project is making use of several tools that hook straight into LAMBDA expressions, including PagedList. I need a LAMBDA expression and not a pure SQL expression.
My SQL:
SELECT
co.CompanyId
, co.CompanyName
, co.CompanyCity
, co.NumberOfEmployees
, co.ProspectingScore
, cd.PDFResourceLibrary
, cd.PresentationDone
, cd.MOUDone
FROM Company AS co
OUTER APPLY (
SELECT
TOP 1 MAX(CycleDate) AS CycleDate
, PDFResourceLibrary
, PresentationDone
, MOUDone
FROM Cycle AS cy
WHERE cy.CompanyId = co.CompanyId
GROUP BY PDFResourceLibrary, PresentationDone, MOUDone
) AS cd
ORDER BY co.ProspectingScore DESC
I have tried a number of lambda expressions to date:
db.Company
.GroupJoin(
db.Cycle
, co => co.CompanyId
, cy => cy.CycleId
, (x, y) => new { Company = x, Cycle = y }
).Select(
y => y.Cycle.OrderByDescending(y => y.CycleDate).SingleOrDefault()
).ToList();
But this throws "a local variable cannot have the same name as a method type parameter" as well as a "Cannot implicitly convert type System.Collections.Generic.List to System.Collections.Generic.IEnumerable" error.
Converting it to a Join flags the OrderByDescending as invalid, but I need that to drop all but the latest cycle.
I also can't seem to do a join to save my effin' life. All the examples out there fail with my system.
For example, a join that comes sooooooo close is:
db.Company.Join(
db.Cycle.OrderByDescending(x => x.CycleDate).SingleOrDefault()
, co => co.CompanyId
, cy => cy.CycleId
, (x, y) => new { Company = x, Cycle = y }
).ToList();
but then it claims The type arguments for method Queryable Join cannot be inferred from the usage. Like, --what??
I have also tried the following:
db.Company.Include(
x => x.Cycle.OrderByDescending(y => y.CycleDate).SingleOrDefault()
).ToList();
which works for the company, but I cannot seem to drill past the company and into the cycle. When I write @(item.Cycle.PresentationDone), it says that ICollection<Cycle> does not contain a definition for PresentationDone, even though I have an ICollection for Cycle in my Company model. It's right there, but the system won’t see it to follow.
An attempt with
db.Company.Select(x => new { Company = x, Cycle = x.Cycle.OrderByDescending(y => y.CycleDate).Single() }).ToList();
also throws the List to IEnumerable conversion error.
As a final note, please keep in mind that I am bringing two models into the same page, and the second model is the same as the first but focuses only on the company. It works. It has no problem pulling data out of the DB:
viewModel.AllCompanies = db.Company.ToList();
Because it does not need to pull anything from the cycle -- I am ignoring anything beneath the Company level. But for the first query, I have to bring several items off of the cycle, and so I need to query for the most recent cycle.
EDIT:
With the generous assistance of Ivan Stoev I have assembled the following:
var query = (IPagedList<DashboardUserData>)(
from co in db.Company
join cy in db.Cycle on co.CompanyId equals cy.CycleId into cycles
from cd in cycles.OrderByDescending(cy => cy.CycleDate).Take(1).DefaultIfEmpty()
orderby co.ProspectingScore descending
select new {
CompanyId = co.CompanyId,
CompanyName = co.CompanyName,
CompanyCity = co.CompanyCity,
NumberOfEmployees = co.NumberOfEmployees,
ProspectingScore = co.ProspectingScore,
PDFResourceLibrary = (bool?)cd.PDFResourceLibrary,
PresentationDone = (bool?)cd.PresentationDone,
MOUDone = (bool?)cd.MOUDone
}).ToPagedList(regionPageIndex, pageSize);
And my model is such:
public IPagedList<DashboardUserData> RegionalCompanies { get; set; }
public class DashboardUserData {
public Guid CompanyId { get; set; }
public string CompanyName { get; set; }
public string CompanyCity { get; set; }
public int? NumberOfEmployees { get; set; }
public int? ProspectingScore { get; set; }
public bool? PDFResourceLibrary { get; set; }
public bool? PresentationDone { get; set; }
public bool? MOUDone { get; set; }
}
But for some reason I am unable to attach the data to the model. I get the following error:
Unable to cast object of type 'PagedList.PagedList`1[<>f__AnonymousType9`8[System.Guid,System.String,System.String,System.Nullable`1[System.Int32],System.Nullable`1[System.Int32],System.Nullable`1[System.Boolean],System.Nullable`1[System.Boolean],System.Nullable`1[System.Boolean]]]' to type 'PagedList.IPagedList`1[CCS.Models.DashboardUserData]'.
It's probably something stupidly simple, but I'm missing it.
EDIT 2:
When I add a filter to the original table, Company:
var query = (IPagedList<DashboardUserData>)(
from co in db.Company
where co.RegionId == new Guid(User.GetClaimValue("Region"))
I now get an issue of:
Only parameterless constructors and initializers are supported in LINQ to Entities
Still have that model issue from the first Edit, tho.
Edit 3:
Huh, I may have solved Edit 1:
viewModel.RegionalCompanies = (
from co in db.Company
where co.RegionId == regionId
join cy in db.Cycle on co.CompanyId equals cy.CycleId into cycles
from cd in cycles.OrderByDescending(cy => cy.CycleDate).Take(1).DefaultIfEmpty()
orderby co.ProspectingScore descending
select new DashboardUserData {
CompanyId = co.CompanyId,
CompanyName = co.CompanyName,
CompanyCity = co.CompanyCity,
NumberOfEmployees = co.NumberOfEmployees,
ProspectingScore = co.ProspectingScore,
PDFResourceLibrary = cd.PDFResourceLibrary,
PresentationDone = cd.PresentationDone,
MOUDone = cd.MOUDone
}).ToPagedList(regionPageIndex, pageSize);
But now the second viewModel:
viewModel.AllOtherCompanies = await db.Company.Where(c => c.RegionId != regionId).Include(c => c.Province).ToPagedListAsync(allPageIndex, pageSize);
return View(viewModel);
Is throwing one very confusing error message:
The method 'Skip' is only supported for sorted input in LINQ to Entities. The method 'OrderBy' must be called before the method 'Skip'.
which I have never seen before, even with the original code. Googling right now.
Edit 4:
OMFG, I think I have it working. I was concentrating on getting the first model to work properly with IPagedList, and forgot that IPagedList requires an order in order to page properly. This is coming once I get column sorting and paging implemented on the page side and the correct code in the controller, but once I stuck in a temporary .OrderBy() in the second viewModel, everything suddenly stood up properly.
A big shout-out to Ivan Stoev, your reply was a massive kick in the right direction!! Thank you!!
When working with complex queries, I would suggest using the LINQ query syntax for most parts of the query, because it's much easier to follow and modify thanks to the transparent identifiers. It also maps more naturally to the SQL query.
For instance, here is the LINQ equivalent of your SQL query:
var query =
from co in db.Company
from cd in (
from cy in db.Cycle
where cy.CompanyId == co.CompanyId
group cy by new { cy.PDFResourceLibrary, cy.PresentationDone, cy.MOUDone } into g
select new
{
CycleDate = g.Max(cy => cy.CycleDate),
g.Key.PDFResourceLibrary,
g.Key.PresentationDone,
g.Key.MOUDone
}
)
.OrderByDescending(cy => cy.CycleDate).Take(1) // TOP 1
.DefaultIfEmpty() // OUTER
orderby co.ProspectingScore descending
select new
{
co.CompanyId,
co.CompanyName,
co.CompanyCity,
co.NumberOfEmployees,
co.ProspectingScore,
cd.PDFResourceLibrary,
cd.PresentationDone,
cd.MOUDone
};
The SQL query that EF generates from the above:
SELECT
[Extent1].[CompanyId] AS [CompanyId],
[Extent1].[CompanyName] AS [CompanyName],
[Extent1].[CompanyCity] AS [CompanyCity],
[Extent1].[NumberOfEmployees] AS [NumberOfEmployees],
[Extent1].[ProspectingScore] AS [ProspectingScore],
[Limit1].[PDFResourceLibrary] AS [PDFResourceLibrary],
[Limit1].[PresentationDone] AS [PresentationDone],
[Limit1].[MOUDone] AS [MOUDone]
FROM [dbo].[Company] AS [Extent1]
OUTER APPLY (SELECT TOP (1) [Project1].[PDFResourceLibrary] AS [PDFResourceLibrary], [Project1].[PresentationDone] AS [PresentationDone], [Project1].[MOUDone] AS [MOUDone]
FROM ( SELECT
[GroupBy1].[A1] AS [C1],
[GroupBy1].[K1] AS [PDFResourceLibrary],
[GroupBy1].[K2] AS [PresentationDone],
[GroupBy1].[K3] AS [MOUDone]
FROM ( SELECT
[Extent2].[PDFResourceLibrary] AS [K1],
[Extent2].[PresentationDone] AS [K2],
[Extent2].[MOUDone] AS [K3],
MAX([Extent2].[CycleDate]) AS [A1]
FROM [dbo].[Cycle] AS [Extent2]
WHERE [Extent2].[CompanyId] = [Extent1].[CompanyId]
GROUP BY [Extent2].[PDFResourceLibrary], [Extent2].[PresentationDone], [Extent2].[MOUDone]
) AS [GroupBy1]
) AS [Project1]
ORDER BY [Project1].[C1] DESC ) AS [Limit1]
ORDER BY [Extent1].[ProspectingScore] DESC
This should cover the question of how to convert the original SQL query.
But do you really need to follow the original SQL query? According to this requirement:
many cycles can exist for a company, but I need to grab only the latest cycle by date
it looks more natural to use something like this:
var query =
from co in db.Company
join cy in db.Cycle on co.CompanyId equals cy.CycleId into cycles
from cd in cycles.OrderByDescending(cy => cy.CycleDate).Take(1).DefaultIfEmpty()
orderby co.ProspectingScore descending
select new
{
co.CompanyId,
co.CompanyName,
co.CompanyCity,
co.NumberOfEmployees,
co.ProspectingScore,
cd.PDFResourceLibrary,
cd.PresentationDone,
cd.MOUDone
};
which generates:
SELECT
[Extent1].[CompanyId] AS [CompanyId],
[Extent1].[CompanyName] AS [CompanyName],
[Extent1].[CompanyCity] AS [CompanyCity],
[Extent1].[NumberOfEmployees] AS [NumberOfEmployees],
[Extent1].[ProspectingScore] AS [ProspectingScore],
[Limit1].[PDFResourceLibrary] AS [PDFResourceLibrary],
[Limit1].[PresentationDone] AS [PresentationDone],
[Limit1].[MOUDone] AS [MOUDone]
FROM [dbo].[Company] AS [Extent1]
OUTER APPLY (SELECT TOP (1) [Project1].[PDFResourceLibrary] AS [PDFResourceLibrary], [Project1].[PresentationDone] AS [PresentationDone], [Project1].[MOUDone] AS [MOUDone]
FROM ( SELECT
[Extent2].[CycleDate] AS [CycleDate],
[Extent2].[PDFResourceLibrary] AS [PDFResourceLibrary],
[Extent2].[PresentationDone] AS [PresentationDone],
[Extent2].[MOUDone] AS [MOUDone]
FROM [dbo].[Cycle] AS [Extent2]
WHERE [Extent1].[CompanyId] = [Extent2].[CycleId]
) AS [Project1]
ORDER BY [Project1].[CycleDate] DESC ) AS [Limit1]
ORDER BY [Extent1].[ProspectingScore] DESC
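To see the "latest cycle per company" shape in isolation, here is a minimal in-memory sketch (LINQ to Objects, not EF). The Company and Cycle classes are cut-down stand-ins, the LatestCycleDemo name is made up for illustration, and the sketch assumes the cycle's foreign key is Cycle.CompanyId, as in the WHERE clause of the original SQL. In memory, DefaultIfEmpty() yields null for a company without cycles, hence the explicit null check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down stand-ins for the real entities (assumption: Cycle.CompanyId is the foreign key).
public class Company { public int CompanyId; public string CompanyName; }
public class Cycle { public int CycleId; public int CompanyId; public DateTime CycleDate; public bool PresentationDone; }

public static class LatestCycleDemo
{
    // For each company, take its most recent cycle, or null when it has none.
    public static List<(string Name, bool? PresentationDone)> Query(
        IEnumerable<Company> companies, IEnumerable<Cycle> cycles)
    {
        return (
            from co in companies
            join cy in cycles on co.CompanyId equals cy.CompanyId into companyCycles
            from cd in companyCycles.OrderByDescending(c => c.CycleDate).Take(1).DefaultIfEmpty()
            select (Name: co.CompanyName,
                    PresentationDone: cd == null ? (bool?)null : cd.PresentationDone)
        ).ToList();
    }

    public static void Main()
    {
        var companies = new[]
        {
            new Company { CompanyId = 1, CompanyName = "Acme" },
            new Company { CompanyId = 2, CompanyName = "NoCycles Inc" }
        };
        var cycles = new[]
        {
            new Cycle { CycleId = 10, CompanyId = 1, CycleDate = new DateTime(2015, 1, 1), PresentationDone = false },
            new Cycle { CycleId = 11, CompanyId = 1, CycleDate = new DateTime(2016, 1, 1), PresentationDone = true }
        };
        foreach (var row in Query(companies, cycles))
            Console.WriteLine($"{row.Name}: {row.PresentationDone?.ToString() ?? "(no cycle)"}");
    }
}
```

Every company appears exactly once, because Take(1) limits the inner sequence to a single row and DefaultIfEmpty() keeps companies that have no cycle at all.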
I have a 'Skill' table where I store skills, and in the 'Job' table I store all the required skills when a job is posted, like on UpWork. Employers have checkboxes to select all the required skills, but I store the skill IDs like 1,5,6,8 in the Job table. When I retrieve the job details, I want to get the names of all the skills, because I want to show the SkillName values along with the other details of the job from the Job table. My Web API:
[HttpGet]
[Route("api/JobApi/BrowseJobs/")]
public object BrowseJobs()
{
var skills = db.Skills.ToDictionary(d => d.SkillID, n => n.SkillName);
var jobData = (from j in db.Jobs where j.Preference==2
//from cj in j.ClosedJobs.DefaultIfEmpty()
join cj in db.ClosedJobs.DefaultIfEmpty()
on j.JobID equals cj.JobID into closedJob
where !closedJob.Any()
join c in db.Categories on j.Category equals c.CategoryID
join jobContract in
(
from appliedJob in db.AppliedJobs.DefaultIfEmpty()
from offer in appliedJob.JobOffers.DefaultIfEmpty()
from contract in db.Contracts.DefaultIfEmpty()
select new { appliedJob, offer, contract }
).DefaultIfEmpty()
on j.JobID equals jobContract.appliedJob.JobID into jobContracts
where !jobContracts.Any(jobContract => jobContract.contract.CompletedDate != null)
select new
{
JobTitle = j.JobTitle,
JobID = j.JobID,
ReqSkillCommaSeperated = j.ReqSkill,
Category = c.CategoryName,
Budget=j.Budget,
Deadline=j.Deadline,
JobDetails=j.JobDetails,
PublishDate=j.PublishDate,
TotalApplied=(from ap in db.AppliedJobs where j.JobID == ap.JobID select ap.AppliedJobID).DefaultIfEmpty().Count()
}).AsEnumerable()
.Select(x => new
{
JobID = x.JobID,
JobTitle = x.JobTitle,
Category = x.Category,
Budget = x.Budget,
Deadline = x.Deadline,
JobDetails = x.JobDetails,
PublishDate = x.PublishDate,
SkillNames = GetSkillName(x.ReqSkillCommaSeperated, skills),
TotalApplied = (from ap in db.AppliedJobs where x.JobID == ap.JobID select ap.AppliedJobID).DefaultIfEmpty().Count()
}).ToList();
return jobData.AsEnumerable();
}
private string GetSkillName(string reqSkill, Dictionary<int, string> skills)
{
if (reqSkill == null) return string.Empty;
var skillArr = reqSkill.Split(',');
var skillNameList = skillArr.Select(skillId => skills[Convert.ToInt32(skillId)])
.ToList();
return String.Join(",", skillNameList);
}
My problem is that the code works well in my VS 2013, but when I uploaded it to a GoDaddy live server, it doesn't work! It returns a 500 internal server error.
Now I want to write a SQL query instead of LINQ. Can I get my desired result with SQL?
===================Edited=====================
Your SQL code works well, but I have other conditions to apply.
1. I need to show only those jobs which are not closed yet (the ClosedJobs table holds the IDs of closed jobs). If a job ID is found in the ClosedJobs table, it must not be returned in the list.
join cj in db.ClosedJobs.DefaultIfEmpty()
on j.JobID equals cj.JobID into closedJob
where !closedJob.Any()
2. Those jobs which are not found in the Contracts table (the Contracts table holds the JobID of a job that has started as a contract).
2nd Edit===================
join jobContract in
(
from appliedJob in db.AppliedJobs.DefaultIfEmpty()
from offer in appliedJob.JobOffers.DefaultIfEmpty()
from contract in db.Contracts.DefaultIfEmpty()
select new { appliedJob, offer, contract }
).DefaultIfEmpty()
on j.JobID equals jobContract.appliedJob.JobID into jobContracts
where !jobContracts.Any(jobContract => jobContract.contract.CompletedDate != null)
EXP: The Job table is related to the AppliedJobs table, AppliedJobs is related to JobOffers, and JobOffers is related to Contracts.
I don't want to show jobs that are completed (Contracts.CompletedDate != null). When a contract starts, the CompletedDate field is set to null. After the contract is completed, it is changed from null to the completion date.
Where should I apply the condition?
How can I do that? Can you help me? @John Cappelletti
EDIT - Removed OUTER APPLY
Below is a simple example of using Stuff() and XML. If the sequence is important, then we must split the string first.
To be clear, @Skills and @YourData are table variables and purely demonstrative.
Example
Declare @Skills table (SkillID int,SkillName varchar(50))
Insert Into @Skills values
(1,'ASP')
,(2,'JavaScript')
,(3,'AngularJS')
,(4,'WordPress')
,(5,'Joomla')
Declare @YourData table (ID int,ReqSkill varchar(50))
Insert Into @YourData values
(1,'2,3,4,5,1')
,(2,'3')
,(3,'3,4,5,2')
,(4,null)
Select A.ID
,Skills = Stuff((Select ',' +SkillName
From @Skills
Where charindex(concat(',',SkillID,','),','+A.ReqSkill+',')>0
For XML Path ('')),1,1,'')
From @YourData A
-- Your WHERE Statement Here --
Returns
ID Skills
1 ASP,JavaScript,AngularJS,WordPress,Joomla
2 AngularJS
3 JavaScript,AngularJS,WordPress,Joomla
4 NULL
In my Person table is a RequestedLocation column which stores location IDs. The IDs match the LocationId column in the Locations table. The Locations table also has the text location names, in the LocationName column.
In my view, which has the Person model passed to it, I need to display the string LocationName. The view displays a List of people in a Telerik grid. Currently it works great, except the RequestedLocation column is all integers.
I am populating all my grids with methods containing LINQ queries. Here is the method that currently works:
public List<Person> GetPeople()
{
var query = from p in _DB.Person.ToList()
select p;
return query.ToList();
}
Here is the regular SQL query that works, and I need to convert into LINQ:
SELECT ApplicantID
,FirstName
,LastName
,MiddleName
,DateofBirth
,Gender
,RequestedVolunteerRole
,RequestedVolunteerLocation
,l.LocationName
FROM Form.Person p
JOIN dbo.Location l ON p.RequestedVolunteerLocation = l.LocationID
Order BY ApplicantID
Here is my attempt to convert to LINQ:
public List<NewApplicantViewModel> GetPeople()
{
var query = from pl in _DB.Person.ToList()
join l in _Elig_DB.Locations.ToList() on pl.RequestedVolunteerLocation equals l.LocationID
select new
{
pl.RequestedVolunteerLocation = l.LocationName
};
return query.ToList();
The number of errors I get from this are numerous, but most are along the lines of:
Cannot convert from type Anonymous to type List<NewApplicantModel>
and
Invalid anonymous type declarator.
Please help, and thank you for reading my post.
Oh, and I have only been programming for a couple months, so if I am going about this all wrong, please let me know. Only thing I have to stick with is the table structure because it is an existing app that I am updating, and changing the location or person tables would have large consequences.
public List<NewApplicantViewModel> GetPeople()
{
var query = from pl in _DB.Person
join l in _Elig_DB.Locations on pl.RequestedVolunteerLocation
equals l.LocationID
select new NewApplicantViewModel
{
LocationName = l.LocationName,
OtherProperty = pl.Property // placeholder: map any other Person property you need (the range variable is pl)
};
return query.ToList();
}
Beware of calling _DB.Person.ToList(): it loads all persons from the DB, because ToList() executes the query immediately, and the join is then performed in memory (not in the DB).
The reason you are getting an error is that you are projecting an anonymous type:
select new
{
pl.RequestedVolunteerLocation = l.LocationName
};
Instead, you need to project a NewApplicantViewModel:
select new NewApplicantViewModel
{
RequestedVolunteerLocation = l.LocationName
};
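For a runnable picture of the difference, here is a small in-memory sketch (LINQ to Objects) that joins people to locations and projects straight into a named view model, so the result already has the method's return type. The property lists are assumptions for illustration: the real Person, Location and NewApplicantViewModel classes are not shown in the question, and PeopleDemo is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal shapes; only the members needed for the join are included.
public class Person { public int ApplicantID; public string FirstName; public int RequestedVolunteerLocation; }
public class Location { public int LocationID; public string LocationName; }

public class NewApplicantViewModel
{
    public int ApplicantID { get; set; }
    public string FirstName { get; set; }
    public string LocationName { get; set; }
}

public static class PeopleDemo
{
    public static List<NewApplicantViewModel> GetPeople(
        IEnumerable<Person> people, IEnumerable<Location> locations)
    {
        var query =
            from p in people
            join l in locations on p.RequestedVolunteerLocation equals l.LocationID
            orderby p.ApplicantID
            // Projecting into the named type makes query an IEnumerable<NewApplicantViewModel>.
            select new NewApplicantViewModel
            {
                ApplicantID = p.ApplicantID,
                FirstName = p.FirstName,
                LocationName = l.LocationName
            };
        return query.ToList();
    }

    public static void Main()
    {
        var people = new[]
        {
            new Person { ApplicantID = 2, FirstName = "Bo", RequestedVolunteerLocation = 20 },
            new Person { ApplicantID = 1, FirstName = "Al", RequestedVolunteerLocation = 10 }
        };
        var locations = new[]
        {
            new Location { LocationID = 10, LocationName = "North" },
            new Location { LocationID = 20, LocationName = "South" }
        };
        foreach (var row in GetPeople(people, locations))
            Console.WriteLine($"{row.ApplicantID} {row.FirstName} {row.LocationName}");
    }
}
```

With `select new { ... }` the same method would not compile, because a list of anonymous objects cannot be returned as List<NewApplicantViewModel>.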
I am trying to write a query that includes 2 joins.
1 StoryTemplate can have multiple Stories
1 Story can have multiple StoryDrafts
I am starting the query on the StoryDrafts object because that is where it's linked to the UserId.
I don't have a reference from the StoryDrafts object directly to the StoryTemplates object. How would I build this query properly?
public JsonResult Index(int userId)
{
return Json(
db.StoryDrafts
.Include("Story")
.Include("StoryTemplate")
.Where(d => d.UserId == userId)
,JsonRequestBehavior.AllowGet);
}
Thank you for any help.
Try to flatten your hierarchy if it works for you. Here is a sample, and you may want to customize it for your needs.
var result = from c in db.Customers
join o in db.Orders
on c equals o.Customers
select new
{
custid = c.CustomerID,
cname = c.CompanyName,
address = c.Address,
orderid = o.OrderID,
freight = o.Freight,
orderdate = o.OrderDate
};
If flattening does not meet your requirements, then you need to use a query that returns a nested group. Finally, look at the following link for more references: LINQ Query Expressions.
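Applied to the drafts/stories/templates shape from the question, the flattening idea could look roughly like the sketch below. It runs in memory (LINQ to Objects), and the key names StoryId and StoryTemplateId, as well as the Title and Text members, are guesses: the question does not show the actual entity classes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal shapes; the real entities will have different or additional members.
public class StoryTemplate { public int Id; public string Title; }
public class Story { public int Id; public int StoryTemplateId; }
public class StoryDraft { public int Id; public int StoryId; public int UserId; public string Text; }

public static class StoryDraftDemo
{
    // Walks draft -> story -> template with two joins and returns one flat row per draft.
    public static List<(int DraftId, int StoryId, string TemplateTitle)> DraftsForUser(
        IEnumerable<StoryDraft> drafts, IEnumerable<Story> stories,
        IEnumerable<StoryTemplate> templates, int userId)
    {
        var query =
            from d in drafts
            where d.UserId == userId
            join s in stories on d.StoryId equals s.Id
            join t in templates on s.StoryTemplateId equals t.Id
            select (DraftId: d.Id, StoryId: s.Id, TemplateTitle: t.Title);
        return query.ToList();
    }

    public static void Main()
    {
        var templates = new[] { new StoryTemplate { Id = 1, Title = "Fairy tale" } };
        var stories = new[] { new Story { Id = 5, StoryTemplateId = 1 } };
        var drafts = new[]
        {
            new StoryDraft { Id = 100, StoryId = 5, UserId = 7, Text = "Once upon a time" },
            new StoryDraft { Id = 101, StoryId = 5, UserId = 8, Text = "Another user's draft" }
        };
        foreach (var row in DraftsForUser(drafts, stories, templates, 7))
            Console.WriteLine($"{row.DraftId} {row.StoryId} {row.TemplateTitle}");
    }
}
```

The draft never needs a direct reference to the template; the second join reaches it through the story.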
When I retrieve a list of items from a database including some children (via .Include), and order them randomly, EF gives me an unexpected result: it creates/clones additional items.
To explain myself better, I've created a small and simple EF CodeFirst project to reproduce the problem.
First I shall give you the code for this project.
The project
Create a basic MVC3 project and add the EntityFramework.SqlServerCompact package via Nuget.
That adds the latest versions of the following packages:
EntityFramework v4.3.0
SqlServerCompact v4.0.8482.1
EntityFramework.SqlServerCompact v4.1.8482.2
WebActivator v1.5
The Models and DbContext
using System.Collections.Generic;
using System.Data.Entity;
namespace RandomWithInclude.Models
{
public class PeopleContext : DbContext
{
public DbSet<Person> Persons { get; set; }
public DbSet<Address> Addresses { get; set; }
}
public class Person
{
public int ID { get; set; }
public string Name { get; set; }
public virtual ICollection<Address> Addresses { get; set; }
}
public class Address
{
public int ID { get; set; }
public string AdressLine { get; set; }
public virtual Person Person { get; set; }
}
}
The DB Setup and Seed data: EF.SqlServerCompact.cs
using System.Collections.Generic;
using System.Data.Entity;
using System.Data.Entity.Infrastructure;
using RandomWithInclude.Models;
[assembly: WebActivator.PreApplicationStartMethod(typeof(RandomWithInclude.App_Start.EF), "Start")]
namespace RandomWithInclude.App_Start
{
public static class EF
{
public static void Start()
{
Database.DefaultConnectionFactory = new SqlCeConnectionFactory("System.Data.SqlServerCe.4.0");
Database.SetInitializer(new DbInitializer());
}
}
public class DbInitializer : DropCreateDatabaseAlways<PeopleContext>
{
protected override void Seed(PeopleContext context)
{
var address1 = new Address {AdressLine = "Street 1, City 1"};
var address2 = new Address {AdressLine = "Street 2, City 2"};
var address3 = new Address {AdressLine = "Street 3, City 3"};
var address4 = new Address {AdressLine = "Street 4, City 4"};
var address5 = new Address {AdressLine = "Street 5, City 5"};
context.Addresses.Add(address1);
context.Addresses.Add(address2);
context.Addresses.Add(address3);
context.Addresses.Add(address4);
context.Addresses.Add(address5);
var person1 = new Person {Name = "Person 1", Addresses = new List<Address> {address1, address2}};
var person2 = new Person {Name = "Person 2", Addresses = new List<Address> {address3}};
var person3 = new Person {Name = "Person 3", Addresses = new List<Address> {address4, address5}};
context.Persons.Add(person1);
context.Persons.Add(person2);
context.Persons.Add(person3);
}
}
}
The controller: HomeController.cs
using System;
using System.Data.Entity;
using System.Linq;
using System.Web.Mvc;
using RandomWithInclude.Models;
namespace RandomWithInclude.Controllers
{
public class HomeController : Controller
{
public ActionResult Index()
{
var db = new PeopleContext();
var persons = db.Persons
.Include(p => p.Addresses)
.OrderBy(p => Guid.NewGuid());
return View(persons.ToList());
}
}
}
The View: Index.cshtml
@using RandomWithInclude.Models
@model IList<Person>
<ul>
@foreach (var person in Model)
{
<li>
@person.Name
</li>
}
</ul>
This should be all, and your application should compile :)
The problem
As you can see, we have 2 straightforward models (Person and Address), and a Person can have multiple Addresses.
We seed the generated database with 3 persons and 5 addresses.
If we get all the persons from the database, including the addresses, randomize the results and just print out the names of those persons, that's where it all goes wrong.
As a result, I sometimes get 4 persons, sometimes 5 and sometimes 3, and I expect 3. Always.
e.g.:
Person 1
Person 3
Person 1
Person 3
Person 2
So.. it's copying/cloning data! And that's not cool..
It just seems that EF loses track of which addresses are children of which person..
The generated SQL query is this:
SELECT
[Project1].[ID] AS [ID],
[Project1].[Name] AS [Name],
[Project1].[C2] AS [C1],
[Project1].[ID1] AS [ID1],
[Project1].[AdressLine] AS [AdressLine],
[Project1].[Person_ID] AS [Person_ID]
FROM ( SELECT
NEWID() AS [C1],
[Extent1].[ID] AS [ID],
[Extent1].[Name] AS [Name],
[Extent2].[ID] AS [ID1],
[Extent2].[AdressLine] AS [AdressLine],
[Extent2].[Person_ID] AS [Person_ID],
CASE WHEN ([Extent2].[ID] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2]
FROM [People] AS [Extent1]
LEFT OUTER JOIN [Addresses] AS [Extent2] ON [Extent1].[ID] = [Extent2].[Person_ID]
) AS [Project1]
ORDER BY [Project1].[C1] ASC, [Project1].[ID] ASC, [Project1].[C2] ASC
Workarounds
If I remove the .Include(p => p.Addresses) from the query, everything goes fine, but of course the addresses aren't loaded, and accessing that collection will make a new call to the database every time.
I can first get the data from the database and randomize later by just adding a .ToList() before the .OrderBy, like this: var persons = db.Persons.Include(p => p.Addresses).ToList().OrderBy(p => Guid.NewGuid());
Does anybody have any idea of why it is happening like this?
Might this be a bug in the SQL generation?
As one can work out by reading AakashM's answer and Nicolae Dascalu's answer, it strongly seems that LINQ OrderBy requires a stable ranking function, which NEWID/Guid.NewGuid is not.
So we have to use another random generator that is stable within a single query.
To achieve this, before each query, use a .NET Random generator to get a random number. Then combine this random number with a unique property of the entity to get a random sort key. And to 'randomize' the result a bit, checksum it. (checksum is a SQL Server function that computes a hash; the original idea was found on this blog.)
Assuming Person Id is an int, you could write your query this way:
// Random instances should be stored and reused, not instantiated at each usage.
// But beware, Random is not thread safe. If you want to share it between threads, you
// would have to use locks, see its documentation:
// https://learn.microsoft.com/en-us/dotnet/api/system.random
// But using locks is a bad idea for scalability, especially in a Web context.
var randomGenerator = new Random();
// ...
var rnd = randomGenerator.NextDouble();
var persons = db.Persons
.Include(p => p.Addresses)
.OrderBy(p => SqlFunctions.Checksum(p.Id * rnd));
Like the NewGuid hack, this is very probably not a good random generator with a good distribution and so on. But it does not cause entities to get duplicated in results.
Beware:
If your query ordering does not guarantee a unique ranking of your entities, you must complement it to guarantee one. For example, if you use a non-unique property of your entities for the checksum call, then add something like .ThenBy(p => p.Id) after the OrderBy.
If the ranking is not unique for your queried root entity, its included children may get mixed with the children of other entities having the same ranking, and then the bug will remain.
Note:
I would prefer to use the .Next() method to get an int and then combine it through an xor (^) with a unique int property of the entity, rather than using a double and multiplying by it. But SqlFunctions.Checksum unfortunately does not provide an overload for the int data type, though the SQL Server function is supposed to support it. You may use a cast to overcome this, but to keep it simple I finally chose to go with the multiplication.
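The underlying idea (one random salt per query, mixed deterministically with the entity's unique ID) can be tried out in memory with the sketch below. It is only an illustration of the principle in LINQ to Objects: the Key mixing function is an arbitrary integer hash picked for the example, not SQL Server's CHECKSUM, and StableOrderDemo is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StableOrderDemo
{
    // Deterministic sort key: the same (id, salt) pair always produces the same value,
    // so every row belonging to one entity gets one and the same rank within a query.
    public static int Key(int id, int salt)
    {
        unchecked
        {
            uint x = (uint)(id ^ salt);
            x ^= x >> 16;
            x *= 0x45d9f3bu;
            x ^= x >> 16;
            x *= 0x45d9f3bu;
            x ^= x >> 16;
            return (int)x;
        }
    }

    // Orders IDs by the salted key; ThenBy breaks ties so the ranking stays unique.
    public static List<int> Order(IEnumerable<int> ids, int salt)
    {
        return ids.OrderBy(id => Key(id, salt)).ThenBy(id => id).ToList();
    }

    public static void Main()
    {
        var rng = new Random();
        int salt = rng.Next(); // drawn once per query, not once per row
        Console.WriteLine(string.Join(", ", Order(Enumerable.Range(1, 10), salt)));
    }
}
```

A different salt gives a different order on the next query, while within one query the key of a given entity never changes, which is the property the volatile NewGuid key lacks.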
tl;dr: There's a leaky abstraction here. To us, Include is a simple instruction to stick a collection of things onto each single returned Person row. But EF's implementation of Include is done by returning a whole row for each Person-Address combo, and reassembling at the client. Ordering by a volatile value causes those rows to become shuffled, breaking apart the Person groups that EF is relying on.
When we have a look at ToTraceString() for this LINQ:
var people = c.People.Include("Addresses");
// Note: no OrderBy in sight!
we see
SELECT
[Project1].[Id] AS [Id],
[Project1].[Name] AS [Name],
[Project1].[C1] AS [C1],
[Project1].[Id1] AS [Id1],
[Project1].[Data] AS [Data],
[Project1].[PersonId] AS [PersonId]
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name],
[Extent2].[Id] AS [Id1],
[Extent2].[PersonId] AS [PersonId],
[Extent2].[Data] AS [Data],
CASE WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1]
FROM [Person] AS [Extent1]
LEFT OUTER JOIN [Address] AS [Extent2] ON [Extent1].[Id] = [Extent2].[PersonId]
) AS [Project1]
ORDER BY [Project1].[Id] ASC, [Project1].[C1] ASC
So we get one row for each P-A combination (n rows for a P with n As), plus 1 row for each P without any As.
Adding an OrderBy clause, however, puts the thing-to-order-by at the start of the ordered columns:
var people = c.People.Include("Addresses").OrderBy(p => Guid.NewGuid());
gives
SELECT
[Project1].[Id] AS [Id],
[Project1].[Name] AS [Name],
[Project1].[C2] AS [C1],
[Project1].[Id1] AS [Id1],
[Project1].[Data] AS [Data],
[Project1].[PersonId] AS [PersonId]
FROM ( SELECT
NEWID() AS [C1],
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name],
[Extent2].[Id] AS [Id1],
[Extent2].[PersonId] AS [PersonId],
[Extent2].[Data] AS [Data],
CASE WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2]
FROM [Person] AS [Extent1]
LEFT OUTER JOIN [Address] AS [Extent2] ON [Extent1].[Id] = [Extent2].[PersonId]
) AS [Project1]
ORDER BY [Project1].[C1] ASC, [Project1].[Id] ASC, [Project1].[C2] ASC
So in your case, where the ordered-by-thing is not a property of a P, but is instead volatile, and therefore can be different for different P-A records of the same P, the whole thing falls apart.
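The effect can be reproduced without a database. The sketch below uses a deliberately simplified model of the client-side reassembly (an assumption for illustration, not EF's actual materializer): it starts a new person whenever the person ID changes between consecutive rows. The per-row keys stand in for the NEWID() values and are fixed by hand so the run is repeatable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VolatileSortDemo
{
    // Simplified stand-in for the reassembly step: a new person is emitted each time
    // the person ID differs from the one on the previous row.
    public static List<int> MaterializePersonIds(IEnumerable<int> personIdPerRow)
    {
        var result = new List<int>();
        int? previous = null;
        foreach (var personId in personIdPerRow)
        {
            if (personId != previous) result.Add(personId);
            previous = personId;
        }
        return result;
    }

    public static void Main()
    {
        // One row per person-address pair; Key plays the role of the per-row NEWID().
        var rows = new[]
        {
            (PersonId: 1, AddressId: 1, Key: 3),
            (PersonId: 1, AddressId: 2, Key: 1),
            (PersonId: 2, AddressId: 3, Key: 2),
            (PersonId: 3, AddressId: 4, Key: 5),
            (PersonId: 3, AddressId: 5, Key: 4)
        };

        // Ordered by person: rows of the same person stay adjacent.
        var grouped = MaterializePersonIds(rows.OrderBy(r => r.PersonId).Select(r => r.PersonId));
        // Ordered by the volatile key first: person 1's rows get split apart.
        var shuffled = MaterializePersonIds(
            rows.OrderBy(r => r.Key).ThenBy(r => r.PersonId).Select(r => r.PersonId));

        Console.WriteLine(string.Join(", ", grouped));
        Console.WriteLine(string.Join(", ", shuffled));
    }
}
```

With these keys the second ordering yields four "persons" from three, because person 1 shows up in two separate runs of rows.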
I'm not sure where on the working-as-intended ~~~ cast-iron bug continuum this behaviour falls. But at least now we know about it.
I don't think there is an issue in query generation, but there is definitely an issue when EF tries to convert rows into objects.
It looks like there is an inherent assumption here that data for the same person in a joined statement will be returned grouped together, order by or not.
For example, the result of a joined query will always be
P.Id P.Name A.Id A.StreetLine
1 Person 1 10 ---
1 Person 1 11
2 Person 2 12
3 Person 3 13
3 Person 3 14
Even if you order by some other column, the same person would always appear one after the other.
This assumption is mostly true for any joined query.
But there is a deeper issue here, I think. OrderBy is for when you want data in a certain order (as opposed to random), so that assumption does seem reasonable.
I think you should really get the data out and then randomize it by some other means in your code.
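One way to do that in code, once the query has been materialized (for instance with ToList()), is a Fisher-Yates shuffle. This is a generic sketch; the ShuffleDemo name and the idea of passing the Random in from the caller are choices made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShuffleDemo
{
    // Fisher-Yates shuffle on a copy of the input; the source sequence is left untouched.
    public static List<T> Shuffle<T>(IEnumerable<T> source, Random rng)
    {
        var list = source.ToList();
        for (int i = list.Count - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1); // 0 <= j <= i
            (list[i], list[j]) = (list[j], list[i]);
        }
        return list;
    }

    public static void Main()
    {
        var names = new[] { "Person 1", "Person 2", "Person 3" };
        Console.WriteLine(string.Join(", ", Shuffle(names, new Random())));
    }
}
```

Because the shuffle happens after the entities and their children have been assembled, it always returns exactly the items it was given.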
From theory:
To sort a list of items, the compare function should be stable relative to the items; this means that for any 2 items x, y the result of x < y should be the same however many times it is queried (called).
I think the issue is related to a misunderstanding of the specification (documentation) of the OrderBy method:
keySelector - A function to extract a key from an element.
The documentation doesn't explicitly say whether the provided function must return the same value for the same object however many times it is called (in your case it returns different/random values), but I think the term "key" used in the documentation implicitly suggests this.
When you define a query path to define the query results, (use Include), the query path is only valid on the returned instance of ObjectQuery. Other instances of ObjectQuery and the object context itself are not affected. This functionality lets you chain multiple "Includes" for eager loading.
Therefore, your statement translates into
from person in db.Persons.Include(p => p.Addresses).OrderBy(p => Guid.NewGuid())
select person
instead of what you intended.
from person in db.Persons.Include(p => p.Addresses)
select person
.OrderBy(p => Guid.NewGuid())
Hence your second workaround works fine :)
Reference: Loading Related Objects While Querying A Conceptual Model in Entity Framework - http://msdn.microsoft.com/en-us/library/bb896272.aspx
I also ran into this problem, and solved it by adding a Randomizer Guid property to the main class I was fetching. I then set the column's default value to NEWID() like this (using EF Core 2)
builder.Entity<MainClass>()
.Property(m => m.Randomizer)
.HasDefaultValueSql("NEWID()");
When fetching, it gets a bit more complicated. I created two random integers to function as my order-by indexes, then ran the query like this
var rand = new Random();
var randomIndex1 = rand.Next(0, 31);
var randomIndex2 = rand.Next(0, 31);
var taskSet = await DbContext.MainClasses
.Include(m => m.SubClass1)
.ThenInclude(s => s.SubClass2)
.OrderBy(m => m.Randomizer.ToString().Replace("-", "")[randomIndex1])
.ThenBy(m => m.Randomizer.ToString().Replace("-", "")[randomIndex2])
.FirstOrDefaultAsync();
This seems to be working well enough, and should provide enough entropy for even a large dataset to be fairly randomized.
I'm quite new to LINQ, so please bear with me.
I'm working on an ASP.NET web page and I want to add a "search function" (a textbox where the user enters a name, a surname, both, or just parts of them, and gets back all related information). I have two tables ("Person" and "Application") and I want to display some columns from Person (name and surname) and some from Application (score, position, ...). I know how I could do it in SQL, but I want to learn more about LINQ, so I want to do it with LINQ.
So far I have two main ideas:
1.)
var person = dataContext.GetTable<Person>();
var application = dataContext.GetTable<Application>();
var p1 = from p in Person
where(p.Name.Contains(tokens[0]) || p.Surname.Contains(tokens[1]))
select new {Id = p.Id, Name = p.Name, Surname = p.Surname}; //or maybe without this line
//I don't know how to do the following properly
var result = from a in Application
where a.FK_Application.Equals(index) //just to get the "right" type of application
//this is not right, but I don't know how to do it better
join p1
on p1.Id == a.FK_Person
2.) The other idea is to go through "Application" directly and, instead of "join p1 ...", use:
var result = from a in Application
where a.FK_Application.Equals(index) //just to get the "right" type of application
join p from Person
on p.Id == a.FK_Person
where p.Name.Contains(tokens[0]) || p.Surname.Contains(tokens[1])
I think the first idea is better for queries without the first "where" condition, which I also intend to use. Regardless of which is better (faster), I still don't know how to write it in LINQ. In the end I also want to display / select just some parts (columns) of the result (joined tables + filtering conditions).
I really want to know how to do this kind of thing in LINQ, because I'll also be dealing with similar problems on local data, where I can only use LINQ.
Could somebody please explain to me how to do it? I have spent days trying to figure it out and searching the Internet for answers.
var result = from a in dataContext.Applications
join p in dataContext.Persons
on a.FK_Person equals p.Id
where (p.Name.Contains("blah") || p.Surname.Contains("foo")) && a.FK_Application == index
select new { Id = p.Id, Name = p.Name, Surname = p.Surname, a.Score, a.Position };
Well, as Odrahn pointed out, this will give you flat results, with possibly many rows for a single person, since a person could join to multiple applications that all have the same FK. Here's a way to find all the right people first, and then add the relevant application to the results:
var p1 = from p in dataContext.Persons
where(p.Name.Contains(tokens[0]) || p.Surname.Contains(tokens[1]))
select new {
Id = p.Id, Name = p.Name, Surname = p.Surname,
BestApplication = dataContext.Applications.FirstOrDefault(a => a.FK_Application == index /* && ???? */)
};
Sorry - it looks like this second query will result in a roundtrip per person, so it clearly won't be scalable. I assumed L2S would handle it better.
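One possible way around the per-person lookup is a group join, which pairs each matching person with all of their applications in a single query expression. The sketch below runs against in-memory collections only; the Person, Application and PersonResult classes are hypothetical stand-ins that reuse the column names from the question, and picking the highest Score as the "best" application is purely an assumption for illustration. How LINQ to SQL translates the same shape has not been checked here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the mapped tables.
public class Person { public int Id; public string Name; public string Surname; }
public class Application { public int FK_Person; public int FK_Application; public int Score; public int Position; }
public class PersonResult { public int Id; public string Name; public string Surname; public Application BestApplication; }

public static class SearchSketch
{
    public static List<PersonResult> Search(
        IEnumerable<Person> persons, IEnumerable<Application> applications,
        string term, int index)
    {
        var query =
            from p in persons
            where p.Name.Contains(term) || p.Surname.Contains(term)
            // "into apps" makes this a group join: one group of applications per person
            join a in applications.Where(x => x.FK_Application == index)
                on p.Id equals a.FK_Person into apps
            select new PersonResult
            {
                Id = p.Id,
                Name = p.Name,
                Surname = p.Surname,
                // Assumption: "best" means highest score; null when the person has none
                BestApplication = apps.OrderByDescending(x => x.Score).FirstOrDefault()
            };
        return query.ToList();
    }
}
```

Each person appears once in the result, whether they have zero, one or many applications of the requested type.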
To answer this properly, I need to know whether Application and Person are directly related (i.e. does a Person have many Applications?). From reading your post, I'm assuming they are, because Application seems to have a foreign key to Person.
If so, then you could create a custom PersonModel which will be populated by the fields you need from the different entities like this:
public class PersonModel
{
public string Name { get; set; }
public string Surname { get; set; }
public List<int> Scores { get; set; }
public List<int> Positions { get; set; }
}
Then to populate it, you'd do the following:
// Select the first person matching the Name and Surname inputs
// (note: there may be many matches - do you need to account for this?)
var person = dataContext.Persons.Where(p => p.Name.Contains("firstname") || p.Surname.Contains("surname")).FirstOrDefault();
if (person != null)
{
var scores = new List<int>();
var positions = new List<int>();
scores.AddRange(person.Applications.Select(i => i.Score));
positions.AddRange(person.Applications.Select(i => i.Position));
var personModel = new PersonModel
{
Name = person.Name,
Surname = person.Surname,
Scores = scores,
Positions = positions
};
}
Because of the relationship between Person and Application, where a person can have many applications, I've had to account for the possibility of there being many scores and positions (hence the Lists).
Also note that I've used lambda expressions instead of LINQ query syntax for the simple selects, so that you can easily see what's going on.
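If more than one person can match the search, the same idea extends to projecting every match into a PersonModel rather than taking only the first. The following is a minimal in-memory sketch; the Person and Application classes are hypothetical stand-ins for the generated entities, and BuildModels is a name invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the generated entity classes.
public class Application
{
    public int Score { get; set; }
    public int Position { get; set; }
}

public class Person
{
    public string Name { get; set; }
    public string Surname { get; set; }
    public List<Application> Applications { get; set; } = new List<Application>();
}

public class PersonModel
{
    public string Name { get; set; }
    public string Surname { get; set; }
    public List<int> Scores { get; set; }
    public List<int> Positions { get; set; }
}

public static class PersonModelSketch
{
    // Builds one PersonModel per matching person instead of only the first match.
    public static List<PersonModel> BuildModels(
        IEnumerable<Person> persons, string namePart, string surnamePart)
    {
        return persons
            .Where(p => p.Name.Contains(namePart) || p.Surname.Contains(surnamePart))
            .Select(p => new PersonModel
            {
                Name = p.Name,
                Surname = p.Surname,
                Scores = p.Applications.Select(a => a.Score).ToList(),
                Positions = p.Applications.Select(a => a.Position).ToList()
            })
            .ToList();
    }
}
```

A person without applications simply ends up with empty Scores and Positions lists.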