My LINQ query times out and the results are never retrieved.
using (var repository = PersistenceService.GetRepository<Organisation>(UserId))
{
    var organisation = repository.GetAll().FirstOrDefault(x => x.Id == organisationId);
    if (organisation != null)
    {
        isCourseApproverRole = organisation.Contacts.FirstOrDefault(con => con.RoleName == "CourseApprover" &&
            con.Individual.Id == individualId) != null;
    }
}
When I try doing all this in one query it works fine.
Can someone explain why the above query times out?
Note: organisation.Contacts contain about 18,000 rows for the selected organisation.
It's because of massive lazy loading.
The first command...
var organisation = repository.GetAll().FirstOrDefault(x => x.Id == organisationId);
... pulls an Organisation object into memory. That shouldn't be any problem.
But then you access organisation.Contacts. It doesn't matter which LINQ method you apply to this collection, the whole collection is pulled into memory by lazy loading. The LINQ filter is applied afterwards.
However, though highly inefficient, this still shouldn't cause a timeout. Fetching 18000 records by an indexed search (I assume) shouldn't take more than 30s (I assume) unless something else is terribly wrong (like low resources, bad network).
The part con.Individual.Id == individualId is the culprit. If you had monitored the executed SQL commands, you'd have seen that this causes one query for each Individual until the predicate Id == individualId is matched. These queries run while the result of the organisation.Contacts query is still being read. No doubt, this causes the timeout.
You could fix this by replacing the predicate with con.IndividualId == individualId (i.e. using the foreign key property). But I really don't understand why you don't do this in one query, which works fine, as you say. With the current approach you fetch large amounts of data, while in the end you only need one boolean value!
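The single-query shape can be sketched roughly as follows. This is only an illustration, not the asker's actual code: it runs LINQ to Objects over an in-memory array, and the flat row shape with OrganisationId and IndividualId foreign-key fields is an assumption (the question only confirms RoleName and Individual.Id). Against a real IQueryable, the same Any call would typically be translated into a single SQL statement by the provider.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for the Contacts table; the field names
// OrganisationId and IndividualId are assumptions made for this sketch.
var contacts = new[]
{
    new { OrganisationId = 1, IndividualId = 10, RoleName = "CourseApprover" },
    new { OrganisationId = 1, IndividualId = 11, RoleName = "Member" },
    new { OrganisationId = 2, IndividualId = 10, RoleName = "Member" },
};

// One predicate over foreign-key values: no Organisation object is loaded
// and no navigation property (con.Individual) is touched.
bool IsCourseApprover(int organisationId, int individualId) =>
    contacts.Any(con => con.OrganisationId == organisationId
                     && con.RoleName == "CourseApprover"
                     && con.IndividualId == individualId);

Console.WriteLine(IsCourseApprover(1, 10));
Console.WriteLine(IsCourseApprover(2, 10));
```

Because only a boolean is needed, Any avoids materialising either the organisation or its 18,000 contacts.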
I'm trying to display a table that counts webhooks and arranges the various counts into cells by date_sent, sending_ip, and esp (email service provider). Within each cell, the controller needs to count the webhooks that are labelled with the "opened" event, and the "sent" event. Our database currently includes several million webhooks, and adds at least 100k per day. Already this process takes so long that running this index method is practically useless.
I was hoping that Rails could break down the enormous model into smaller lists using a line like this:
@today_hooks = @m_webhooks.where(:date_sent => this_date)
I thought that the queries after this line would only look at the partial list, instead of the full model. Unfortunately, running this index method generates hundreds of SQL statements, and they all look like this:
SELECT COUNT(*) FROM "m_webhooks" WHERE "m_webhooks"."date_sent" = $1 AND "m_webhooks"."sending_ip" = $2 AND (m_webhooks.esp LIKE 'hotmail') AND (m_webhooks.event LIKE 'sent')
It appears that the "date_sent" attribute is included in all of the queries, which implies that SQL is searching through all 1M records with every single query.
I've read over a dozen articles about increasing performance in Rails queries, but none of the tips that I've found there have reduced the time it takes to complete this method. Thank you in advance for any insight.
m_webhooks.controller.rb
def index
  def set_sub_count_hash(thip)
    {
      gmail_hooks: {opened: a = thip.gmail.send(@event).size, total_sent: b = thip.gmail.sent.size, perc_opened: find_perc(a, b)},
      hotmail_hooks: {opened: a = thip.hotmail.send(@event).size, total_sent: b = thip.hotmail.sent.size, perc_opened: find_perc(a, b)},
      yahoo_hooks: {opened: a = thip.yahoo.send(@event).size, total_sent: b = thip.yahoo.sent.size, perc_opened: find_perc(a, b)},
      other_hooks: {opened: a = thip.other.send(@event).size, total_sent: b = thip.other.sent.size, perc_opened: find_perc(a, b)},
    }
  end
  @m_webhooks = MWebhook.select("date_sent", "sending_ip", "esp", "event", "email").all
  @event = params[:event] || "unique_opened"
  @m_list_of_ips = [#List of three ip addresses]
  end_date = Date.today
  start_date = Date.today - 10.days
  date_range = (end_date - start_date).to_i
  @count_array = []
  date_range.times do |n|
    this_date = end_date - n.days
    @today_hooks = @m_webhooks.where(:date_sent => this_date)
    @count_array[n] = {:this_date => this_date}
    @m_list_of_ips.each_with_index do |ip, index|
      thip = @today_hooks.where(:sending_ip => ip) # Stands for "Today Hooks ip"
      @count_array[n][index] = set_sub_count_hash(thip)
    end
  end
end
Well, your problem is very simple, actually. Remember that when you use where(condition), the query is not executed against the DB right away.
Rails is smart enough to detect when you need a concrete result (a list, an object, or a count or #size like in your case) and chain your queries while you don't need one. In your code, you keep chaining conditions to the main query inside a loop (date_range). And it gets worse, you start another loop inside this one adding conditions to each query created in the first loop.
Then you pass the query (not concrete yet, it was not yet executed and does not have results!) to the method set_sub_count_hash which goes on to call the same query many times.
Therefore you have something like:
10(date_range) * 3(ip list) * 8 # (times the query is materialized in the #set_sub_count method)
and then you have a problem.
What you want to do is to do the whole query at once and group it by date, ip and email. You should have a hash structure after that, which you would pass to the #set_sub_count method and do some ruby gymnastics to get the counts you're looking for.
I imagine the query something like:
main_query = @m_webhooks.where('date_sent > ?', 10.days.ago.to_date)
                        .where(sending_ip: @m_list_of_ips)
OK, now you have one query, which is nice, but I think you should split it into 4 (gmail, hotmail, yahoo and other), which gives you 4 queries (the first one, main_query, will not be executed until you ask for materialized results, don't forget that). Still, something like 100 times faster.
I think this is the result that should be grouped, mapped and passed to #set_sub_count instead of passing the raw query and calling methods on it every time and many times. It will be a little work to do the grouping, mapping and counting for sure, but hey, it's faster. =)
In case this helps anybody else, I learned how to fill a hash with counts in a much simpler way. More importantly, this approach runs a single query (as opposed to the 240 queries that I was running before).
@count_array[esp_index][j] = MWebhook.where('date_sent > ?', start_date.to_date)
                                     .group('date_sent', 'sending_ip', 'event', 'esp').count
I'll try to explain this well, as it's quite strange.
I have a .NET MVC 4 app with Linq2SQL where I'm developing a search where users are able to seek elements by one or more different values.
I have a method that receives a string like: "value1&value2|value3&value4"
I intend to return results where a given element contains value1 AND value2, OR that element contains value3 AND value4.
I'm using an algorithm like this:
if (!String.IsNullOrEmpty(figuresSelected)) {
    string[] groupsOR = figuresSelected.Split('|');
    IQueryable<CENTROSOPERACIONES> sqlAuxOr = null;
    foreach (string groupOR in groupsOR) {
        string[] groupsAND = groupOR.Split('&');
        IQueryable<CENTROSOPERACIONES> sqlAuxAnd = sql; //sql contains current query so far
        foreach (string group in groupsAND) {
            sqlAuxAnd = sqlAuxAnd.Where(o => o.CENTROS.FIGURASCENTROS.Any(f => f.FIGURAS.nombre == group));
        }
        if (sqlAuxAnd != null)
            if (sqlAuxOr == null)
                sqlAuxOr = sqlAuxAnd;
            else
                sqlAuxOr = sqlAuxOr.Union(sqlAuxAnd);
    }
}
Well, the problem here is that, on every iteration inside the inner loop, sqlAuxAnd queries from the original data container on sql, not from the subresult obtained on the previous loop. I've followed the loop iteration by iteration and I've checked that sqlAuxAnd contains the expected subresult on the first iteration, but when reaching a second one that value is lost and a full query from the whole data is stored on it.
I've tried storing the partial results of sqlAuxAnd in different elements (I used an array) and checked how they evolve. On the first iteration sqlAuxAnd[0] contains the expected result and sqlAuxAnd[1] is empty; on the second iteration both elements, sqlAuxAnd[0] and sqlAuxAnd[1], contain the result of querying the second value against the whole dataset.
It seems I'm missing something here... it's probably something trivial, but I suppose I'm too dizzy to see it. Does anyone have an idea to share?
Feel free to ask for clarification; English is not my mother tongue and I suppose the question is not as well written as it should be.
You need to capture the group variable from the foreach:
foreach (string group in groupsAND) {
    var captured = group;
    sqlAuxAnd = sqlAuxAnd.Where(o => o.CENTROS.FIGURASCENTROS.Any(f => f.FIGURAS.nombre == captured));
}
See Linq query built in foreach loop always takes parameter value from last iteration.
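The capture behaviour behind this fix can be reproduced without a database. The sketch below is an illustration under assumptions, not the asker's code: it uses a for loop, because from C# 5 onward foreach gives each iteration its own loop variable, so the original foreach only misbehaves on older compilers. The names shared and copied are made up for the example. As with a deferred Where, the lambdas run only after the loops have finished.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var shared = new List<Func<int>>();
for (int i = 0; i < 3; i++)
{
    // Every lambda captures the same variable i, not its value at this point.
    shared.Add(() => i);
}

var copied = new List<Func<int>>();
for (int i = 0; i < 3; i++)
{
    int captured = i; // a fresh variable per iteration, as in the fix above
    copied.Add(() => captured);
}

// The lambdas are evaluated only now, after both loops have completed.
int[] sharedResults = shared.Select(f => f()).ToArray();
int[] copiedResults = copied.Select(f => f()).ToArray();

Console.WriteLine(string.Join(",", sharedResults));
Console.WriteLine(string.Join(",", copiedResults));
```

The first list yields the final value of i three times, while the second yields one distinct value per iteration, which is the difference the local copy makes in the query above.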
I have an entity containing two optional to-many relationships (childA <<-> parent <->> childB). Each of these two child entities contains an optional string that I am interested in querying on.
Using the same format, I get the results I expect for one, but not the other. I understand that means I don't understand the tools I'm working with; and hoped for some insight. This is what the two queries look like:
childA.@count != 0 AND (0 == SUBQUERY(childA, $a, $a.string != NIL).@count)
childB.@count != 0 AND (0 == SUBQUERY(childB, $a, $a.string != NIL).@count)
I would expect to get back results from non-nil instances of both childA and childB only if each entity instance's string is also nil. My question is, why would one give the results that I expect, while the other does not?
Clarification:
I'm trying to solve the general problem where I'm searching for one of two things: a default value in an attribute or, when the attribute is optional, a nil attribute. The problem is further compounded because optional relationships should only be considered when they are populated. Without the relationship count != 0 condition, I get back all parents with a nil relationship. In one case, this is the desired behavior. In another case, this appears to diminish the returned parent count (to 0 results).
For the optional attribute case, the query might look like:
parent.@count != 0 AND (parent.gender == -1) OR (parent.gender == NIL)
Where there are optional relationships in the key-paths, the query takes the form exemplified in the first example.
Again, I have gotten the results I expected in all but one case, where there doesn't seem to be anything unique about its relationships or attribute characteristics. Or I should say, there's nothing unique about this exception in data model structure or query format...
Maybe you mixed up == and != in the second clause, and it should be
childA.@count != 0 AND (SUBQUERY(childA, $a, $a.string != NIL).@count != 0)
It would be clearer what you want to achieve if you could formulate the query in plain English first.
BTW, you can use expressionForSubquery:usingIteratorVariable:predicate: of class NSExpression to build the subquery for you. You might get useful error reporting more easily then.
I understand the problem now.
In my case, it's usually logically correct to first filter out NSSets with 0 count. But, in the problematic case, it's logically correct to return the results of both NSSets with 0 count, and NSSets with > 0 count where the attribute is nil (when optional), or the attribute is set to its default value. In other words, in the problematic case, the left condition needs to be removed, resulting in the following format:
(0 == SUBQUERY(childA, $a, $a.string != NIL).@count)
It seems I'll need to have the managed objects indicate which scenario is appropriate on a case by case basis...yuk!
I am designing a project on ASP.NET MVC 3.
I am using this query in my controller:
int batchSize = (int)db.ProductFormulation
.Where(r => r.ProductID == p)
.Min(r => r.Quantity);
Where p is input by the user.
When I run my project and the user enters a value of p that does not exist in my table, an error occurs.
How can I prevent this error? For example, a message box should state that no record exists for the value entered, and my project should keep running.
Please suggest what I should do. Thanks in advance.
You're getting an error because Min is operating over a sequence with no elements.
You're basically looking for MinOrDefault(), which doesn't exist in the LINQ framework.
This answer has a good implementation of how to achieve it.
Alternatively, if you don't want to do the aggregate operation server-side, you could materialize the sequence first then perform the min:
int batchSize = 0;
var results = db.ProductFormulation.Where(r => r.ProductID == p).ToList();
if (results.Count > 0)
batchSize = results.Min(x => x.Quantity);
Obviously if you've got a lot of records, the above isn't really suitable, and you're better off with the aforementioned extension method.
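One common way to get MinOrDefault-like behaviour without a custom extension method is to project to a nullable type, so that Min returns null for an empty sequence instead of throwing. The sketch below is a minimal illustration using LINQ to Objects. The in-memory formulations array stands in for db.ProductFormulation, Quantity is assumed to be an int, and whether a particular LINQ provider translates the cast server-side is something to verify rather than take from this example.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for db.ProductFormulation.
var formulations = new[]
{
    new { ProductID = 1, Quantity = 40 },
    new { ProductID = 1, Quantity = 25 },
    new { ProductID = 2, Quantity = 10 },
};

// Projecting to int? makes Min return null for an empty sequence
// instead of throwing InvalidOperationException.
int? MinQuantity(int productId) =>
    formulations.Where(r => r.ProductID == productId)
                .Min(r => (int?)r.Quantity);

int? batchSize = MinQuantity(99); // 99 is deliberately absent from the data
Console.WriteLine(batchSize.HasValue
    ? $"Batch size: {batchSize.Value}"
    : "No record exists for the value you entered.");
```

The null result gives the controller a natural place to show the "no record" message while the application keeps running.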
I'm attempting to implement complete search functionality in my ASP.NET MVC (C#, Linq-to-Sql) website.
The site consists of about 3-4 tables that have about 1-2 columns that I want to search.
This is what I have so far:
public List<SearchResult> Search(string Keywords)
{
    string[] split = Keywords.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    List<SearchResult> ret = new List<SearchResult>();
    foreach (string s in split)
    {
        IEnumerable<BlogPost> results = db.BlogPosts.Where(x => x.Text.Contains(s) || x.Title.Contains(s));
        foreach (BlogPost p in results)
        {
            // Skip posts already added for an earlier keyword.
            if (ret.Exists(x => x.BlogPostID == p.BlogPostID))
                continue;
            ret.Add(new SearchResult
            {
                PostTitle = p.Title,
                BlogPostID = p.BlogPostID,
                Text = p.Text
            });
        }
    }
    return ret;
}
As you can see, I have a foreach for the keywords and an inner foreach that runs over a table (I would repeat it for each table).
This seems inefficient and I wanted to know if there's a better way to create a search method for a database.
Also, what can I do to the columns in the database so that they can be searched faster? I read something about indexing them, is that just the "Full-text indexing" True/False field I see in SQL Management Studio?
Yes, enabling full-text indexing will normally go a long way towards improving performance for this scenario. But unfortunately it doesn't work automatically with the LIKE operator (and that's what your LINQ query is generating). So you'll have to use one of the built-in full-text searching functions like FREETEXT, FREETEXTTABLE, CONTAINS, or CONTAINSTABLE.
Just to explain, your original code will be substantially slower than full-text searching as it will typically result in a table scan. For example, if you're searching a varchar field named title with LIKE '%ABC%' then there's no choice but for SQL to scan every single record to see if it contains those characters.
However, the built-in full-text searching will actually index the text of every column you specify to include in the full-text index. And it's that index that drastically speeds up your queries.
Not only that, but full-text searching provides some cool features that the LIKE operator can't give you. It's not as sophisticated as Google, but it has the ability to search for alternate versions of a root word. But one of my favorite features is the ranking functionality where it can return an extra value to indicate relevance which you can then use to sort your results. To use that look into the FREETEXTTABLE or CONTAINSTABLE functions.
Some more resources:
Full-Text Search (SQL Server)
Pro Full-Text Search in SQL Server 2008
The following should do the trick. I can't say off the top of my head whether the let kwa = ... part will actually work or not, but something similar will be required to make the array of keywords available within the context of SQL Server. I haven't used LINQ to SQL for a while (I've been using LINQ to Entities 4.0 and NHibernate for some time now, which have a different set of capabilities). You might need to tweak that part to get it working, but the general principle is sound:
public List<SearchResult> Search(string keywords)
{
    var searchResults = from bp in db.BlogPosts
                        let kwa = keywords.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                        where kwa.Any(kw => bp.Text.Contains(kw) || bp.Title.Contains(kw))
                        select new SearchResult
                        {
                            PostTitle = bp.Title,
                            BlogPostID = bp.BlogPostID,
                            Text = bp.Text
                        };
    return searchResults.ToList();
}