How to deal with Null Values in OUTER JOINS using Entity Framework 6

How to deal with Null Values in OUTER JOINS using Entity Framework 6 - asp.net-mvc

I need to process the query at the bottom of this post in C#. The query works, but I don't know how to use it in EF6. I used a method and a viewmodel for it (variable query = the query below). But when it encounters null values in the OUTER JOIN, int32 cant accept this value when calling .toList(). What's the best way to deal with it?
var result = context.Database.SqlQuery<TourQueryViewModel>(query);
var reslist = result.ToList();
I tried my first steps with LINQ, but I dont get it how to translate into LINQ itself or the query-methods, that are equivalent to it. So I am hoping for your help.
SELECT toursdata.TourId AS TourId, toursdata.Tourname AS Tourname,toursdata.Tourdate Tourdate,
toursdata.VehicleId AS VehicleId, toursdata.VehicleName AS VehicleName, toursdata.LicenseNumber AS LicenseNumber,
Employees.EmployeeId AS EmployeeId, Employees.Gender AS Gender, Employees.Forename AS Forename, Employees.Surname AS Surname
FROM (
SELECT te.TourId, te.Tourname, te.Tourdate,
Vehicles.VehicleId, Vehicles.VehicleName, Vehicles.LicenseNumber,
TourEmployees.EmployeeId
FROM TourEmployees RIGHT OUTER JOIN Tours AS te ON TourEmployees.TourId = te.TourId,
Tours AS tv INNER JOIN Vehicles ON tv.VehicleId = Vehicles.VehicleId
WHERE tv.TourId = te.TourId
) toursdata
LEFT OUTER JOIN Employees ON toursdata.EmployeeId = Employees.EmployeeId

To eliminate the null problem, I changed the data type of the corresponding entity-attribute to a nullable-type
int turned to int?.
Didnt know about that language feature.

Related

How to use the exceptjoin in Cognos-11?

I don't get an except join to work in Cognos-11. Where or what am I missing?
Some understanding for a beginner in this branch would be nice ;-)
What I've tried so far is making two queries. The first one holds data items like "customer", "BeginningDate" and "Purpose". The second query holds data items like "customer", "Adress" and "Community".
What I'd like to accomplish is to get in query3: the "customers" from query1 that are not available in query2. To me it sounds like an except-join.
I went to the query work area, created a query3 and dragged an "except-join" icon on it. Then I dragged query1 into the upper space and query2 into the lower. What I'm used to getting with other joins, is a possibility to set a new link, cardinality and so on. Now double clicking the join isn't opening any pop-up. The properties of the except-join show "Set operation = Except", "Duplicates = remove", "Projection list = Manual".
How do I get query3 filled with the data item "customer" that only holds a list of customers which are solely appearing in query1?

In SQL terms, you want
select T2.C1
from T1
left outer join T2 on T1.C1 = T2.C1
where T2.C1 is null
So, in the query pane of a Cognos report...
Use a regular join.
Join using customer from both queries.
Change the cardinality to 1..1 on the query1 side and 0..1 on the query2 side.
In the filters for query3, add a filter for query2.customer is null.

EXCEPT is not a join. It is used to compare two data sets.
https://learn.microsoft.com/en-us/sql/t-sql/language-elements/set-operators-except-and-intersect-transact-sql?view=sql-server-2017
What you need is an INNER JOIN. That would be the join tool in the Toolbox in Cognos.

Cannot figure out this complex query

I have three tables and I am having trouble getting to the bottom of this one.
So I am given the ID of a vehicle, and I need to write a query to output a list of its seat rows.
Vehicle
{
id,
vehicleTypeId,
color
}
VehicleType
{
id,
type,
}
VehicleSeats
{
id,
description
}
This is as far as my query has gotten and I am just as a complete loss on getting this one out right. I need it to output a list of seats, not a list of types, I just do not know how to take it deeper.
var vehicleSeatsList = (from c in db.Vehicle
where c.VehicleTypeID == id
select c).ToList();
Here was my final solution. I have yet to plug it in to know if it is right. I was being a dummy. Dont know why I wasn't thinking of just doing a join...
var VehicleTypeSeats = (from c in db.VehicleTypeSeats
join a in db.Vehicles on c.AircraftTypeID equals a.VehicleTypeID
where c.VehicleTypeID == id
select c).ToList();

Assuming this is in SQL, there is no relationship between your VehicleSeats and Vehicle tables. You would need to add VehicleSeats.vehicleID or something like that. If you did, the below would work:
SELECT VehicleSeats.description
FROM VehicleSeats, Vehicle
WHERE VehicleSeats.vehicleID = Vehicle.id
AND Vehicle.id = {your given vehicle ID};

Simply inner join Vehicle and Vehicle seats on their common column then add the where statement.
var VehicleTypeSeats = (from c in db.VehicleTypeSeats
join a in db.Vehicles on c.AircraftTypeID equals a.VehicleTypeID
where c.VehicleTypeID == id
select c).ToList();

How to convert SQL statement "delete from TABLE where someID not in (select someID from Table group by property1, property2)

I'm trying to convert the following SQL statement to Core Data:
delete from SomeTable
where someID not in (
select someID
from SomeTable
group by property1, property2, property3
)
Basically, I want to retrieve and delete possible duplicates in a table where a record is deemed a duplicate if property1, property2 and property3 are equal to another record.
How can I do that?
PS: As the title says, I'm trying to convert the above SQL statement into iOS Core Data methods, not trying to improve, correct or comment on the above SQL, that is beyond the point.
Thank you.

It sounds like you are asking for SQL to accomplish your objective. Your starting query won't do what you describe, and most databases wouldn't accept it at all on account of the aggregate subquery attempting to select a column that is not a function of the groups.
UPDATE
I had initially thought the request was to delete all members of each group containing dupes, and wrote code accordingly. Having reinterpreted the original SQL as MySQL would do, it seems the objective is to retain exactly one element for each combination of (property1, property2, property3). I guess that makes more sense anyway. Here is a standard way to do that:
delete from SomeTable st1
where someID not in (
select min(st2.someId)
from SomeTable st2
group by property1, property2, property3
)
That's distinguished from the original by use of the min() aggregate function to choose a specific one of the someId values to retain from each group. This should work, too:
delete from SomeTable st1
where someID in (
select st3.someId
from SomeTable st2
join SomeTable st3
on st2.property1 = st3.property1
and st2.property2 = st3.property2
and st2.property3 = st3.property3
where st2.someId < st3.someId
)
These two queries will retain the same rows. I like the second better, even though it's longer, because the NOT IN operator is kinda nasty for choosing a small number of elements from a large set. If you anticipate having enough rows to be concerned about scaling, though, then you should try both, and perhaps look into optimizations (for example, an index on (property1, property2, property3)) and other alternatives.
As for writing it in terms of Core Data calls, however, I don't think you exactly can. Core Data does support grouping, so you could write Core Data calls that perform the subquery in the first alternative and return you the entity objects or their IDs, grouped as described. You could then iterate over the groups, skip the first element of each, and call Core Data deletion methods for all the rest. The details are out of scope for the SO format.
I have to say, though, that doing such a job in Core Data is going to be far more costly than doing it directly in the database, both in time and in required memory. Doing it directly in the database is not friendly to an ORM framework such as Core Data, however. This sort of thing is one of the tradeoffs you've chosen by going with an ORM framework.
I'd recommend that you try to avoid the need to do this at all. Define a unique index on SomeTable(property1, property2, property3) and do whatever you need to do to avoid trying to creating duplicates or to gracefully recover from a (failed) attempt to do so.

DELETE SomeTable
FROM SomeTable
LEFT OUTER JOIN (
SELECT MIN(RowId) as RowId, property1, property2, property3
FROM SomeTable
GROUP BY property1, property2, property3
) as KeepRows ON
SomeTable.RowId = KeepRows.RowId
WHERE
KeepRows.RowId IS NULL

A few pointers for doing this in iOS: Before iOS 9 the only way to delete objects is individually, ie you will need to iterate through an array of duplicates and delete each one. (If you are targeting iOS9, there is a new NSBatchDeleteRequest which will help delete them all in one go - it does act directly on the store but also does some cleanup to eg. ensure relationships are updated where necessary).
The other problem is identifying the duplicates. You can configure a fetch to group its results (see the propertiesToGroupBy of NSFetchRequest), but you will have to specify NSDictionaryResultType (so the results are NOT the objects themselves, just the values from the relevant properties.) Furthermore, CoreData will not let you fetch properties (other than aggregates) that are not specified in the GROUP BY. So the suggestion (in the other answer) to use min(someId) will be necessary. (To fetch an expression such as this, you will need to use an NSExpression, embed it in an NSExpressionDescription and pass the latter in propertiesToFetch of the fetch request).
The end result will be an array of dictionaries, each holding the someId value of your prime records (ie the ones you don't want to delete), from which you have then got to work out the duplicates. There are various ways, but none will be very efficient.
So as the other answer says, duplicates are better avoided in the first place. On that front, note that iOS 9 allows you to specify attributes that you would like to be unique (individually or collectively).
Let me know if you would like me to elaborate on any of the above.

Group-wise Maximum:
select t1.someId
from SomeTable t1
left outer join SomeTable t2
on t1.property1 = t2.property1
and t1.property2 = t2.property2
and t1.property3 = t2.property3
and t1.someId < t2.someId
where t2.someId is null;
So, this could be the answer
delete SomeTable
where someId not in
(select t1.someId
from SomeTable t1
left outer join SomeTable t2
on t1.property1 = t2.property1
and t1.property2 = t2.property2
and t1.property3 = t2.property3
and t1.someId < t2.someId
where t2.someId is null);
Sqlfiddle demo

You can use exists function to check for each row if there is another row that exists whose id is not equal to the current row and all other properties that define the duplicate criteria of each row are equal to all the properties of the current row.
delete from something
where
id in (SELECT
sm.id
FROM
sometable sm
where
exists( select
1
from
sometable sm2
where
sm.prop1 = sm2.prop1
and sm.prop2 = sm2.prop2
and sm.prop3 = sm2.prop3
and sm.id != sm2.id)
);

I think you could easily handle this by creating a derived duplicate_flg column and set it to 1 when all three property values are equal. Once that is done, you could just delete those records where duplicate_flg = 1. Here is a sample query on how to do this:
--retrieve all records that has same property values (property1,property2 and property3)
SELECT *
FROM (
SELECT someid
,property1
,property2
,property3
,CASE
WHEN property1 = property2
AND property1 = property3
THEN 1
ELSE 0
END AS duplicate_flg
FROM SomeTable
) q1
WHERE q1.duplicate_flg = 1;
Here is a sample delete statement:
DELETE
FROM something
WHERE someid IN (
SELECT someid
FROM (
SELECT someid
,property1
,property2
,property3
,CASE
WHEN property1 = property2
AND property1 = property3
THEN 1
ELSE 0
END AS duplicate_flg
FROM SomeTable
) q1
WHERE q1.duplicate_flg = 1
);

Simply, if you want to remove duplicate from table you can execute below Query :
delete from SomeTable
where rowid not in (
select max(rowid)
from SomeTable
group by property1, property2, property3
)

if you want to delete all duplicate records try the below code
WITH tblTemp as
(
SELECT ROW_NUMBER() Over(PARTITION BY Property1,Property2,Property3 ORDER BY Property1) As RowNumber,* FROM Table_1
)
DELETE FROM tblTemp where RowNumber >1
Hope it helps

Use the below query to delete the duplicate data from that table
delete from SomeTable where someID not in
(select Min(someID) from SomeTable
group by property1+property2+property3)

Using distinct in a join

I'm still a novice at SQL and I need to run a report which JOINs 3 tables. The third table has duplicates of fields I need. So I tried to join with a distinct option but hat didn't work. Can anyone suggest the right code I could use?
My Code looks like this:
SELECT
C.CUSTOMER_CODE
, MS.SALESMAN_NAME
, SUM(C.REVENUE_AMT)
FROM C_REVENUE_ANALYSIS C
JOIN M_CUSTOMER MC ON C.CUSTOMER_CODE = MC.CUSTOMER_CODE
/* This following JOIN is the issue. */
JOIN M_SALESMAN MS ON MC.SALESMAN_CODE = (SELECT SALESMAN_CODE FROM M_SALESMAN WHERE COMP_CODE = '00')
WHERE REVENUE_DATE >= :from_date
AND REVENUE_DATE <= :to_date
GROUP BY C.CUSTOMER_CODE, MS.SALESMAN_NAME
I also tried a different variation to get a DISTINCT.
/* I also tried this variation to get a distinct */
JOIN M_SALESMAN MS ON MC.SALESMAN_CODE =
(SELECT distinct(SALESMAN_CODE) FROM M_SALESMAN)
Please can anyone help? I would truly appreciate it.
Thanks in advance.

select distinct
c.customer_code,
ms.salesman_code,
SUM(c.revenue_amt)
FROM
c_revenue c,
m_customer mc,
m_salesman ms
where
c.customer_code = mc.customer_code
AND mc.salesman_code = ms.salesman_code
AND ms.comp_code = '00'
AND Revenue_Date BETWEEN (from_date AND to_date)
group by
c.customer_code, ms.salesman_name
The above will return you any distinct combination of Customer Code, Salesman Code and SUM of Revenue Amount where the c.CustomerCode matches an mc.customer_code AND that same mc record matches an ms.salesman_code AND that ms record has a comp_code of '00' AND the Revenue_Date is between the from and to variables. Then, the whole result will be grouped by customer code and salesman name; the only thing that will cause duplicates to appear is if the SUM(revenue) is somehow different.
To explain, if you're just doing a straight JOIN, you don't need the JOIN keywords. I find it tends to convolute things; you only need them if you're doing an "odd" join, like an LEFT/RIGHT join. I don't know your data model so the above MIGHT still return duplicates but, if so, let me know.

ASP.NET MVC & EF4 Entity Framework - Are there any performance concerns in using the entities vs retrieving only the fields i need?

Lets say we have 3 tables, Users, Products, Purchases.
There is a view that needs to display the purchases made by a user.
I could lookup the data required by doing:
from p in DBSet<Purchases>.Include("User").Include("Product") select p;
However, I am concern that this may have a performance impact because it will retrieve the full objects.
Alternatively, I could select only the fields i need:
from p in DBSet<Purchases>.Include("User").Include("Product") select new SimplePurchaseInfo() { UserName = p.User.name, Userid = p.User.Id, ProductName = p.Product.Name ... etc };
So my question is:
Whats the best practice in doing this?
== EDIT
Thanks for all the replies.
[QUESTION 1]: I want to know whether all views should work with flat ViewModels with very specific data for that view, or should the ViewModels contain the entity objects.
Real example: User reviews Products
var query = from dr in productRepository.FindAllReviews()
where dr.User.UserId = 'userid'
select dr;
string sql = ((ObjectQuery)query).ToTraceString();
SELECT [Extent1].[ProductId] AS [ProductId],
[Extent1].[Comment] AS [Comment],
[Extent1].[CreatedTime] AS [CreatedTime],
[Extent1].[Id] AS [Id],
[Extent1].[Rating] AS [Rating],
[Extent1].[UserId] AS [UserId],
[Extent3].[CreatedTime] AS [CreatedTime1],
[Extent3].[CreatorId] AS [CreatorId],
[Extent3].[Description] AS [Description],
[Extent3].[Id] AS [Id1],
[Extent3].[Name] AS [Name],
[Extent3].[Price] AS [Price],
[Extent3].[Rating] AS [Rating1],
[Extent3].[ShopId] AS [ShopId],
[Extent3].[Thumbnail] AS [Thumbnail],
[Extent3].[Creator_UserId] AS [Creator_UserId],
[Extent4].[Comment] AS [Comment1],
[Extent4].[DateCreated] AS [DateCreated],
[Extent4].[DateLastActivity] AS [DateLastActivity],
[Extent4].[DateLastLogin] AS [DateLastLogin],
[Extent4].[DateLastPasswordChange] AS [DateLastPasswordChange],
[Extent4].[Email] AS [Email],
[Extent4].[Enabled] AS [Enabled],
[Extent4].[PasswordHash] AS [PasswordHash],
[Extent4].[PasswordSalt] AS [PasswordSalt],
[Extent4].[ScreenName] AS [ScreenName],
[Extent4].[Thumbnail] AS [Thumbnail1],
[Extent4].[UserId] AS [UserId1],
[Extent4].[UserName] AS [UserName]
FROM [ProductReviews] AS [Extent1]
INNER JOIN [Users] AS [Extent2] ON [Extent1].[UserId] = [Extent2].[UserId]
LEFT OUTER JOIN [Products] AS [Extent3] ON [Extent1].[ProductId] = [Extent3].[Id]
LEFT OUTER JOIN [Users] AS [Extent4] ON [Extent1].[UserId] = [Extent4].[UserId]
WHERE N'615005822' = [Extent2].[UserId]
or
from d in productRepository.FindAllProducts()
from dr in d.ProductReviews
where dr.User.UserId == 'userid'
orderby dr.CreatedTime
select new ProductReviewInfo()
{
product = new SimpleProductInfo() { Id = d.Id, Name = d.Name, Thumbnail = d.Thumbnail, Rating = d.Rating },
Rating = dr.Rating,
Comment = dr.Comment,
UserId = dr.UserId,
UserScreenName = dr.User.ScreenName,
UserThumbnail = dr.User.Thumbnail,
CreateTime = dr.CreatedTime
};
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name],
[Extent1].[Thumbnail] AS [Thumbnail],
[Extent1].[Rating] AS [Rating],
[Extent2].[Rating] AS [Rating1],
[Extent2].[Comment] AS [Comment],
[Extent2].[UserId] AS [UserId],
[Extent4].[ScreenName] AS [ScreenName],
[Extent4].[Thumbnail] AS [Thumbnail1],
[Extent2].[CreatedTime] AS [CreatedTime]
FROM [Products] AS [Extent1]
INNER JOIN [ProductReviews] AS [Extent2] ON [Extent1].[Id] = [Extent2].[ProductId]
INNER JOIN [Users] AS [Extent3] ON [Extent2].[UserId] = [Extent3].[UserId]
LEFT OUTER JOIN [Users] AS [Extent4] ON [Extent2].[UserId] = [Extent4].[UserId]
WHERE N'userid' = [Extent3].[UserId]
ORDER BY [Extent2].[CreatedTime] ASC
[QUESTION 2]: Whats with the ugly outer joins?

In general, only retrieve what you need, but keep in mind to retrieve enough information so your application is not too chatty, so if you can batch a bunch of things together, do so, otherwise you'll pay network traffic cost everytime you need to go back to the database and retrieve some more stuffs.
In this case, assuming you will only need those info, I would go with the second approach (if that's what you really need).

Eager loading with .Include doesn't really play nice when you want filtering (or ordering for that matter).
That first query is basically this:
select p.*, u.*, p2.*
from products p
left outer join users u on p.userid = u.userid
left outer join purchases p2 on p.productid = p2.productid
where u.userid == #p1
Is that really what you want?
There is a view that needs to display the purchases made by a user.
Well then why are you including "Product"?
Shouldn't it just be:
from p in DBSet<Purchases>.Include("User") select p;
Your second query will error. You must project to an entity on the model, or an anonymous type - not a random class/DTO.
To be honest, the easiest and most well performing option in your current scenario is to query on the FK itself:
var purchasesForUser = DBSet<Purchases>.Where(x => x.UserId == userId);
That should produce:
select p.*
from products p
where p.UserId == #p1
The above query of course requires you to include the foreign keys in the model.
If you don't have the FK's in your model, then you'll need more LINQ-Entities trickery in the form of anonymous type projection.
Overall, don't go out looking to optimize. Create queries which align with the scenario/business requirement, then optimize if necessary - or look for alternatives to LINQ-Entities, such as stored procedures, views or compiled queries.
Remember: premature optimization is the root of all evil.
*EDIT - In response to Question Update *
[QUESTION 1]: I want to know whether all views should work with flat ViewModels with very specific data for that view, or should the ViewModels contain the entity objects.
Yes - ViewModel's should only contain what is required for that View. Otherwise why have the ViewModel? You may as well bind straight to the EF model. So, setup the ViewModel which only the fields it needs for the view.
[QUESTION 2]: What's with the ugly outer joins?
That is default behaviour for .Include. .Include always produces a left outer join.

I think the second query will throw exception because you can't map result to unmapped .NET type in Linq-to-entities. You have to return annonymous type and map it to your object in Linq-to-objects or you have to use some advanced concepts for projections - QueryView (projections in ESQL) or DefiningQuery (custom SQL query mapped to new readonly entity).
Generally it is more about design of your entities. If you select single small entity it is not a big difference to load it all instead of projection. If you are selecting list of entities you should consider projections - expecially if tables contains columns like nvarchar(max) or varbinar(max) which are not needed in your result!

Both create almost the same query: select from one table, with two inner joins. The only thing that changes from a database perspective is the amount of fields returned, but that shouldn't really matter that much.
I think here DRY wins from a performance hit (if it even exists): so my call is go for the first option.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart