Efficient way to get a huge number of records sorted via Linq. Also some help regarding editing an existing DB entry - asp.net-mvc

The first part of my question is:
suppose I have a POCO class
public class shop{
public virtual string fruitName {get;set;}
public virtual double numberOfFruitsLeftToConsume {get;set;}
public virtual double numberOfFruitsLeftForStorage {get;set;}
public virtual List<Locations> shopLocations {get;set;}
}
I add new fruits to the DB by creating a new shop object, adding it via my context, and then saving it.
Now to retrieve the data:
Will it be more efficient to first filter by fruit name into a List and then run a second query on that collection to sort by the number of fruits to consume, or should I put it all into one query? Supposing the site has more than 1000 hits/sec and a massive DB, which method will be more efficient?
List<shop> sh = context.shopDB.Where(p => p.fruitName == "mango" &&
p.fruitName == "apple").ToList();
List<shop> sh = sh.Where(f => f.numberOfFruitsLeftToConsume >= 100 &&
f.numberOfFruitsLeftForStorage <= 100).ToList();
The example itself is meaningless; I just wanted to show the type of query I am using.
The second part of my question: when I initialize the shop class I do not initialize the List within it. Later on, when I try to add a list to it, the list does not get saved. The shop is connected to the user.
ApplicationUser user = await usemanager.FindByEmailAsync("email");
if(user.shops.shopLocations == null){
user.shops.shopLocation = new List<Location>();
user.shops.shopLocation.Add(someLocation);
await context.shopDB.SaveChangesAsync();
}
////already tried
//List<Location> loc = new List<Location>();
//loc.Add(someLocation);
//user.shops.shopLocation = loc;
//await context.shopDB.SaveChangesAsync();
I tried both the methods in a try catch block and no exception is thrown.
If you need any more details or if something is not clear to you please ask.
Thank you.
If I add Location and LocationId properties to shop and then save, I can only view the LocationId; the Location property still remains null.
To rule out one possibility: if I save a location individually, it saves. So I don't think I'm providing wrong data.

Now to retrieve the data: will it be more efficient to first filter by fruit name into a List and then run a second query on that collection to sort by the number of fruits to consume, or should I put it all into one query? Supposing the site has more than 1000 hits/sec and a massive DB, which method will be more efficient?
Only you can answer that question, by measuring the query performance. As a general rule, though, putting everything into one query and letting the database do most of the work (tuning it later by creating appropriate indexes, if needed) is usually preferable.
As for the second part of the question (which is basically a different question), this block
if (user.shops.shopLocations == null)
{
user.shops.shopLocation = new List<Location>();
user.shops.shopLocation.Add(someLocation);
await context.shopDB.SaveChangesAsync();
}
looks suspicious. Your shopLocations member is declared virtual, which means it is intended for lazy loading, so it will most probably never be null. And even if it is null, only the initialization belongs inside the if; adding the location and saving should happen outside it, like this:
if (user.shops.shopLocations == null)
    user.shops.shopLocations = new List<Location>();
user.shops.shopLocations.Add(someLocation);
await context.shopDB.SaveChangesAsync();

1st Question
Because you call .ToList() at the end of each query, the first query runs immediately and every row it matches is loaded into memory; the second Where then filters in memory instead of in the database. It will be much faster to do all your filtering in one LINQ .Where() call, like this:
List<shop> sh = context.shopDB.Where(p => p.fruitName == "mango" && p.fruitName == "apple" && p.numberOfFruitsLeftToConsume >= 100 && p.numberOfFruitsLeftForStorage <= 100).ToList();
But if you don't call .ToList() at the end of the first LINQ query, splitting your query into two calls is totally fine and will yield the same performance as the previous approach, like this:
var sh = context.shopDB.Where(p => p.fruitName == "mango" &&
p.fruitName == "apple");
List<shop> shList = sh.Where(f => f.numberOfFruitsLeftToConsume >= 100 &&
f.numberOfFruitsLeftForStorage <= 100).ToList();
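The deferred behaviour described above can be seen in a small sketch. It uses in-memory LINQ as a stand-in (an assumption made for the sake of a runnable example; against a database context the provider would compose both filters into one SQL statement rather than run delegates), and the names DeferredDemo and PredicateCalls are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how often the first predicate runs, to show when filtering happens.
    public static int PredicateCalls;

    public static List<int> Run()
    {
        PredicateCalls = 0;
        var stock = new[] { 50, 120, 300, 80 };

        // No ToList() here: this only describes the filter, nothing is evaluated yet.
        IEnumerable<int> atLeast100 = stock.Where(n => { PredicateCalls++; return n >= 100; });

        // Still deferred: the second Where is composed on top of the first.
        IEnumerable<int> between = atLeast100.Where(n => n <= 200);

        int callsBeforeToList = PredicateCalls;

        // ToList() triggers a single pass that applies both filters.
        List<int> result = between.ToList();
        Console.WriteLine($"{callsBeforeToList} calls before ToList, {PredicateCalls} after");
        return result;
    }

    public static void Main() => Console.WriteLine(string.Join(", ", Run()));
}
```

Nothing is evaluated until the final ToList(), which is why splitting the query only costs extra when an early ToList() forces the first half to execute on its own.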
2nd Question
When you initialize the Location for the shop, you must set its shopId property, and then it should work. If not, the problem might be with your database relationships.

Related

How can I improve this query in Entity framework 6.0?

Can someone please tell me what is wrong with this query? How can I improve its performance and reduce the time it takes to execute?
IQueryable<Mapper> query = null;
query = (from c in entities.Users
where c.UserEmailAddress == emailAddress
&& c.UserPassword == password
&& c.IsAccountVerified == true
select new Mapper()
{
UserId= c.UserID,
Name = c.UserName
});
custObj = query.ToList<Mapper>().FirstOrDefault();
I am using EF profiler it alerts me following warning:-
Query on unindexed column
Column Type mismatch
More than one session per request
FYI:-
EmailAddress - varchar(50) - Non ClusteredIndex
Password - varchar(max) - No Index
IsAccountVerified - bool - No Index
Even locally, I notice it takes 2-4 seconds to execute.
Apart from that, can someone suggest important guidelines for fine-tuning queries in EF?
I am using EF6.0
I think the problem is that you're using an unnecessarily complex query, since EmailAddress is probably unique. You are currently checking three conditions to select your record, but the email address alone should be enough. I would rather select the user based on EmailAddress (and maybe IsAccountVerified) and then check the password hash in code.
The code would be something like this (I haven't checked it):
var user = entities.Users.FirstOrDefault(u => u.UserEmailAddress == emailAddress);
Mapper custObj = null;
// Check verification and password in code, after the indexed lookup by email.
if (user != null && user.IsAccountVerified && user.UserPassword == password)
{
    custObj = new Mapper
    {
        UserId = user.UserID,
        Name = user.UserName
    };
}
Now you are no longer querying on a non-indexed column, and the results will be the same.
I checked a similar case on an MS SQL database. Selecting on a single condition against an indexed column sped the query up nicely (0ms instead of 13ms in my case).

Add multiple relationships between 2 nodes with Neo4jClient

I am building a social graph application using Neo4jClient to store data, and I am trying to come up with the best strategy to model where a user has worked and currently works. My User is connected by the relationship "USER_WORK" to several Work nodes, which have StartDate/EndDate properties. If EndDate is not set, I want to add another relationship, "CURRENT", between the user and work nodes so that I can fetch only current workplaces efficiently.
For some reason Neo4jClient does not let me do this. The query below executes without exceptions, and the work node and all relationships except "CURRENT" are added (and yes, I have checked that there is no problem with the EndDate logic :). I have also tried using Create instead of CreateUnique, but that doesn't solve the problem :(
var query = graphClient.Cypher
.Match("(city:City)", "(profession:Profession)", "(company:Company)", "(user:User)")
.Where((CityEntity city) => city.Id == model.CityId)
.AndWhere((ProfessionEntity profession) => profession.Id == model.ProfessionId)
.AndWhere((CompanyEntity company) => company.Id == model.CompanyId)
.AndWhere((UserEntity user) => user.Id == model.UserId)
.Merge("(work:Work { Id: {Id} })")
.OnCreate()
.Set("work = {entity}")
.WithParams(new
{
Id = entity.Id,
entity
})
.CreateUnique("work-[:WORK_AS_PROFESSION]->profession")
.CreateUnique("work-[:WORK_AT_COMPANY]->company")
.CreateUnique("work-[:WORK_IN_CITY]->city")
.CreateUnique("user-[:USER_WORK]->work");
if (model.EndDate == DateTime.MinValue)
{
query.CreateUnique("user-[:CURRENT]->work");
}
query.ExecuteWithoutResults();
When you call CreateUnique to create the user-[:CURRENT]->work relationship, the result of the call is discarded, so the clause is never actually appended to the query. You need to change that line to:
query = query.CreateUnique("user-[:CURRENT]->work");
This is what already happens for all the fluent methods chained in the first statement you wrote: the result of each call is passed on to the next and finally assigned to query. The easiest way to spot these things is to put a breakpoint on the query.ExecuteWithoutResults(); line and, when VS breaks there, hover over query and see whether the text matches what you think it should be.
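A minimal sketch of why the reassignment matters. FluentQuery here is a hypothetical stand-in written for illustration, not Neo4jClient's actual type; it only assumes, as the answer describes, that each fluent call returns a query with the clause appended instead of changing the object it was called on.

```csharp
using System;

// Hypothetical stand-in for a fluent query builder; not Neo4jClient's real class.
public sealed class FluentQuery
{
    public string Text { get; }

    public FluentQuery(string text) { Text = text; }

    // Returns a NEW instance with the clause appended; the receiver is left unchanged.
    public FluentQuery CreateUnique(string pattern) =>
        new FluentQuery(Text + "\nCREATE UNIQUE " + pattern);
}

public static class FluentDemo
{
    public static void Main()
    {
        var query = new FluentQuery("MATCH (user:User), (work:Work)")
            .CreateUnique("user-[:USER_WORK]->work");

        query.CreateUnique("user-[:CURRENT]->work");          // result discarded
        Console.WriteLine(query.Text.Contains("CURRENT"));    // False

        query = query.CreateUnique("user-[:CURRENT]->work");  // result kept
        Console.WriteLine(query.Text.Contains("CURRENT"));    // True
    }
}
```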

Query - Does Not Contain

I have a search query to lookup Customers.
I would like to use the Sounds Like function to return additional possible results, however this is returning some of the same results in my main search query.
I would like to only show the additional results in a partial view.
I basically need a DoesNotContain.
Here is what I have so far for my main query:
customer = customer.Where(c => SqlFunctions.StringConvert((double)c.CustomerID).Trim().Equals(searchString)
|| c.CustomerName.ToUpper().Contains(searchString.ToUpper()));
And for the additional results:
customeradditional = customeradditional.Where(c => SqlFunctions.SoundCode(c.CustomerName.ToUpper()) == SqlFunctions.SoundCode(searchString.ToUpper()));
The only possible solution I can see at the minute is to do a Contains Query, loop through each item and get the IDs, then do another query for CustomerID != 1 or CustomerID != 2 or CustomerID != 3, etc.
Try using Except:
customeradditional = customeradditional
.Where(c => SqlFunctions.SoundCode(c.CustomerName.ToUpper()) == SqlFunctions.SoundCode(searchString.ToUpper()))
.Except(customer);
I am not sure I understood you correctly:
From what you have now, the customeradditional query returns some of the customers already returned by the customer query, and you only want the results that are not already contained in the customer query.
Then the solution would be:
customeradditional = customeradditional.Where(c =>
SqlFunctions.SoundCode(c.CustomerName.ToUpper()) ==
SqlFunctions.SoundCode(searchString.ToUpper()))
.Except(customer);
This way you are explicitly excluding every item that is present in the customer query.
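To illustrate what Except does, here is a small in-memory sketch. The Customer class, the sample names and the StartsWith filter (a stand-in for the SoundCode comparison) are assumptions made for the example. In-memory Except compares object references by default, which works here because both filters draw from the same list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Customer
{
    public int CustomerID { get; set; }
    public string CustomerName { get; set; }
}

public static class ExceptDemo
{
    public static List<string> AdditionalNames()
    {
        var all = new List<Customer>
        {
            new Customer { CustomerID = 1, CustomerName = "Smith" },
            new Customer { CustomerID = 2, CustomerName = "Smyth" },
            new Customer { CustomerID = 3, CustomerName = "Jones" },
        };

        // Stand-in for the main search: names containing the search string.
        var main = all.Where(c => c.CustomerName.ToUpper().Contains("SMITH"));

        // Stand-in for the sounds-like search (hard-coded instead of SoundCode).
        var soundsLike = all.Where(c => c.CustomerName.StartsWith("Sm"));

        // Except removes everything the main search already returned.
        return soundsLike.Except(main).Select(c => c.CustomerName).ToList();
    }

    public static void Main() =>
        Console.WriteLine(string.Join(", ", AdditionalNames())); // Smyth
}
```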

Entity Framework 4 Linq help - Pulling data from multiple tables filtered

Not sure how this is done. I have my .edmx set up so that the navigation properties match the foreign key relationships on the tables, but I'm not sure whether I still need to perform joins or whether EF will give me access to the related table data through the navigation properties automatically.
What I need to do is get all the ContentSections and their associated ContentItems based on the ContentView and filtered by the DiversionProgram.CrimeNumber.
I would like to get back an IEnumerable; each ContentSection should have access to its ContentItems via the navigation property ContentItems.
Thanks
Something like:
using (Entities context = new Entities())
{
    IEnumerable<ContentSection> sections = context.ContentSections
        .Include("ContentItems")
        // Any() supplies the bool the predicate needs; a bare Where() here would not compile.
        .Where(cs => cs.ContentView.ContentViewID == someID
            && cs.ContentItems.Any(ci => ci.DiversionProgram.CrimeNumber == someCrimeNumber))
        .AsEnumerable();
}
I've interpreted
based on the ContentView
as cs.ContentView.ContentViewID == someID
This will give you all the ContentSections for a given ContentView. And interpreted
filtered by the DiversionProgram.CrimeNumber
as cs.ContentItems.Any(ci => ci.DiversionProgram.CrimeNumber == someCrimeNumber)
which keeps the ContentSections where any of its ContentItems has a specific CrimeNumber.
Or did you mean something else with based on / filtered by? Maybe OrderBy, or returning only those ContentItems that have a certain CrimeNumber?
You can eager load to get all associated records, but when you want to start filtering/ordering, don't bother with Include.
Just do a projection with anonymous types and EF will work out what it needs to do. It's a bit hairy, but it'll work. If it gets too complicated, bite the bullet and use a SPROC.
Now, with that caveat, something like this (off the top of my head):
var query = ctx.ContentView
    .Select(x => new
    {
        ContentSections = x.ContentSections
            .Where(y => y.ContentItems
                .Any(z => z.DiversionProgram.CrimeNumber == 87))
    }).ToList().Select(x => x.ContentSections);
If you use CTP5, you can do something quite different. It looks like this:
var context = new YourEntitiesContext();
var query = context.ContentView.Include(cs => cs.ContentSections
    .Select(ci => ci.ContentItems
        .Select(dp => dp.DiversionProgram)
        .Where(dp => dp.CrimeNumber == crimeNumber)))
    .Where(cv => cv.ContentViewID == contentViewID).FirstOrDefault();
You can learn more about the CTP5 and how it can be used in Database first scenario here
var query = from t1 in studentManagementEntities.StudentRegistrations
join t2 in studentManagementEntities.StudentMarks
on t1.StudentID equals t2.StudentID
select new
{
t1.selected column name,
t2.selected column name
};

StackOverflowException caused by a linq query

edit #2: Question halfway solved. Look below.
As a follow-up question, does anyone know of a non-intrusive way to solve what I'm trying to do below (namely, linking objects to each other without triggering infinite loops)?
I am trying to create an ASP.NET MVC web application and I get a StackOverflowException. A controller triggers the following command:
public ActionResult ShowCountry(int id)
{
Country country = _gameService.GetCountry(id);
return View(country);
}
The GameService handles it like this (WithCountryId is an extension):
public Country GetCountry(int id)
{
return _gameRepository.GetCountries().WithCountryId(id).SingleOrDefault();
}
The GameRepository handles it like this:
public IQueryable<Country> GetCountries()
{
var countries = from c in _db.Countries
select new Country
{
Id = c.Id,
Name = c.Name,
ShortDescription = c.ShortDescription,
FlagImage = c.FlagImage,
Game = GetGames().Where(g => g.Id == c.GameId).SingleOrDefault(),
SubRegion = GetSubRegions().Where(sr => sr.Id == c.SubRegionId).SingleOrDefault(),
};
return countries;
}
The GetGames() method causes the StackOverflowException:
public IQueryable<Game> GetGames()
{
var games = from g in _db.Games
select new Game
{
Id = g.Id,
Name = g.Name
};
return games;
}
My Business objects are different from the linq2sql classes, that's why I fill them with a select new.
An unhandled exception of type 'System.StackOverflowException' occurred in mscorlib.dll
edit #1: I have found the culprit, it's the following method, it triggers the GetCountries() method which in return triggers the GetSubRegions() again, ad nauseam:
public IQueryable<SubRegion> GetSubRegions()
{
return from sr in _db.SubRegions
select new SubRegion
{
Id = sr.Id,
Name = sr.Name,
ShortDescription = sr.ShortDescription,
Game = GetGames().Where(g => g.Id == sr.GameId).SingleOrDefault(),
Region = GetRegions().Where(r => r.Id == sr.RegionId).SingleOrDefault(),
Countries = new LazyList<Country>(GetCountries().Where(c => c.SubRegion.Id == sr.Id))
};
}
Might have to think of something else here :) That's what happens when you think in an OO mindset because of too much coffee
Hai! I think your models are unintentionally calling a method recursively, which results in the stack overflow. For instance, your SubRegion object is trying to get Country objects, which in turn have to get SubRegions.
Anyhow, it always helps to check the call stack when you get a StackOverflowException. If you see a property being accessed over and over, it's most likely because you're doing something like this:
public object MyProperty { set { MyProperty = value; }}
Situations like yours, where method A calls method B which calls method A, are easier to spot, because you can see the same methods showing up two or more times in the call stack.
The problem might be this: countries have subregions and subregions have countries. I don't know how you implement the lazy list, but it might keep calling GetCountries and then GetSubRegions and so on. To find out, I would launch the debugger and set breakpoints on the GetCountries and GetSubRegions method headers.
I tried similar patterns with LinqToSql, but it's hard to make bidirectional navigation work without affecting performance too much. That's one of the reasons I'm using NHibernate right now.
To answer your edited question, namely: "linking objects to each other without triggering infinite loops":
Assuming you've got some sort of relation where both sides need to know about the other... get hold of all the relevant entities in both sides, then link them together, rather than trying to make the fetch of one side automatically fetch the other. Or just make one side fetch the other, and then fix up the remaining one. So in your case, the options would be:
Option 1:
Fetch all countries (leaving Subregions blank)
Fetch all Subregions (leaving Countries blank)
For each Subregion, look through the list of Countries and add the Subregion to the Country and the Country to the Subregion
Option 2:
Fetch all countries (leaving Subregions blank)
Fetch all Subregions, setting Subregion.Countries via the countries list fetched above
For each subregion, go through all its countries and add it to that country
(Or reverse country and subregion)
They're basically equivalent answers; the only difference is when you do some of the linking.
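Option 1 could be sketched roughly as follows. The classes are simplified stand-ins for the business objects in the question, and the SubRegionId property on Country as well as the Linker helper are assumptions made for the example. The sketch loops over the countries with a dictionary lookup instead of scanning the country list once per subregion, which produces the same links.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class SubRegion
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<Country> Countries { get; } = new List<Country>();
}

public sealed class Country
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int SubRegionId { get; set; }   // assumed foreign key kept on the business object
    public SubRegion SubRegion { get; set; }
}

public static class Linker
{
    // Links two flat lists in both directions; neither fetch has to call the other.
    public static void Link(IEnumerable<Country> countries, IEnumerable<SubRegion> subRegions)
    {
        var byId = subRegions.ToDictionary(sr => sr.Id);
        foreach (var country in countries)
        {
            if (byId.TryGetValue(country.SubRegionId, out var subRegion))
            {
                country.SubRegion = subRegion;
                subRegion.Countries.Add(country);
            }
        }
    }

    public static void Main()
    {
        // In the real application these two lists would come from two flat queries.
        var subRegions = new List<SubRegion> { new SubRegion { Id = 1, Name = "Benelux" } };
        var countries = new List<Country>
        {
            new Country { Id = 10, Name = "Belgium", SubRegionId = 1 },
            new Country { Id = 11, Name = "Netherlands", SubRegionId = 1 },
        };

        Link(countries, subRegions);
        Console.WriteLine(subRegions[0].Countries.Count);   // 2
        Console.WriteLine(countries[0].SubRegion.Name);     // Benelux
    }
}
```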

Resources