Would it make sense to add inlineCount to the resultset when the data comes from the cache instead of the server ?
I store the count locally, so as long as don't leave the current url (I use angularjs), I can get it from a variable in my controller. But once I've left that url, if I go back to it, the data will still come from the cache, but my local variable is reset to initial value.
Update 12 March 2015
The requisite changes have been committed and should make the next release (after 1.5.3).
Of course the inlineCount is only available in async query execution.
The synchronous em.executeQueryLocally(query) returns immediately with the paged result array; there's no place to hold the inlineCount.
Here's an excerpt from the new, corresponding test in DocCode.queryTests: "can page queried customers in cache with inline count"
var query = EntityQuery.from('Customers')
.where('CompanyName', 'startsWith', 'A')
.orderBy('CompanyName')
.skip(2).take(2)
.inlineCount()
.using(breeze.FetchStrategy.FromLocalCache);
return em.executeQuery(query)
.then(localQuerySucceeded);
function localQuerySucceeded(data) {
var custs = data.results;
var count = custs.length;
equal(count, 2,
"have full page of cached 'A' customers now; count = " + count);
var inlineCount = data.inlineCount;
ok(inlineCount && inlineCount > 2,
'have inlineCount=' + inlineCount + ' which is greater than page size');
}
Update 9 May 2014
After more reflection, I've decided that you all are right and I have been wrong. I have entered feature request #2267 in our internal tracking system. No promise as to when we'll get it done (soon I trust); stay on us.
Original
I don't think it makes sense to support inlineCount for cache queries because Breeze can't calculate that value.
Walk this through with me. You issue a paged query to the server with the inline count and a page size of 5 items.
// Get the first 5 Products beginning with 'C'
// and also get the total of all products beginning with 'C'
var query = EntityQuery.from("Products")
.where("ProductName", "startsWith", "C")
.take(5)
.inlineCount()
.using(manager);
Suppose you run this once and the server reports that there are 142 'C' customers in the database.
Ok now take that same query and execute it locally:
query.using(FetchStrategy.FromLocalCache).execute().then(...).fail(...);
How is Breeze supposed to know the count? It only has 5 entities in cache. How can it know that there are 142 'C' customers in the database? It can't.
I think you're best option is to check the returned data object to see if it has an inlineCount value. Only reset your bound copy of the count if the query went remote.
Related
I am trying to sort out a performance issue in doing a data restore on iOS (i.e. data exists in a table, then I update some of it from a flat file backup). My actual case is 1250 times slower on iOS than Windows. I started with the raindrops example from Dexie.bulkPut, which does not exhibit this behavior (it's slower, but only by about 15%, and gradually modified it to be more like what I need to do.
What I have found is that if my table has a single compound key, I can bulkPut the data twice, and it takes nearly the same amount of time both times. But if there are two compound keys, the second write takes about the same time on the computer, but much, much longer on iOS (times in seconds).
Windows Windows iOS iOS
Records First write Second write First write Second write
20,000 2.393 1.904 5.057 131.127
40,000 5.231 3.941 9.533 509.616
60,000 7.808 8.331 14.188 1205.181
Here is my test program:
var db = new Dexie("test");
db.delete().then(function() {
db.version(1).stores({
raindrops: '[matrix+row+col], [matrix+done]'
});
db.open();
var t1 = new Date();
var drops = [];
for (var i=0;i<20000;++i) { // make a matrix
drops.push({matrix: 0, row: Math.floor(i/100)+1, col: i % 100 +1, done: i%2});
}
db.raindrops.bulkPut(drops).then(function(lastKey) {
t2 = new Date();
console.log("Time in seconds " + (t2.getTime() - t1.getTime())/1000);
db.raindrops.bulkPut(drops).then(function(lastKey) {
t3 = new Date();
console.log("Reputting -- Time in seconds " + (t3.getTime() - t2.getTime())/1000);
});
});
});
I am wondering if anyone has any suggestions. I need that second index but I also need for people to be able to do a restore in finite time (my users are not very careful about clearing browser data). Is there any chance of this performance improving? The fact that it's so much better for Windows suggests that it doesn't HAVE to be that way. Or is there a way I could drop the second index before doing a restore and then re-indexing? Done is 0 or 1 and that index is there so I can get a quick count of records with done set (or not set), but maybe it would be better to count manually.
I think I've found a bug with the date filtering on the delta API.
I'm finding on one of the email accounts I'm working with using Office 365 Graph API that the "messages" graph API delta request is returning a different number of items than are actually in a folder for the expected time range. There are 150,000 items covering 10 years in the folder but delta only returns the last 5,000-ish items covering the last 60 or so days.
Paging Works Fine
When querying the graph API for the folder "Inbox" it has 154,045 total items and 57456 unread items.
IUserMailFoldersCollectionPage foldersPage =
await client.Users[mailboxid].MailFolders.Request().GetAsync();
I can skip over 10,000, 50,000 or more messages using paging.
model.messages = await client.Users[mailboxid].MailFolders[folderid].Messages.Request().Top(top)
.Skip(skip).GetAsync();
Delta with Date Filter doesn't work
But when looping with nextToken and deltaTokens, the deltaToken appears after 5000 or so email messages. Basically it seems like it's only returning results for the last couple months even though the filter is saying find messages for the last 20 years.
Here is the example for how we generate the Delta request. The time is hardcoded here but in reality it is a variable.
var sFilter = $"receivedDateTime ge {DateTimeOffset.UtcNow.AddYears(-20).ToString("yyyy-MM-dd")}";
model.messages = await client.Users[mailboxid].MailFolders[folderid].Messages.Delta().Request()
.Header("Prefer", "odata.maxpagesize=" + maxpagesize)
.Filter(sFilter)
.OrderBy("receivedDateTime desc")
.GetAsync();
And then on each paging operation I do the following. "nexttoken" is either the next or delta link depending on what came back from the first request.
model.messages = new MessageDeltaCollectionPage();
model.messages.InitializeNextPageRequest(client, nexttoken);
model.messages = await model.messages.NextPageRequest
.Header("Prefer", "odata.maxpagesize=" + maxpagesize)
.GetAsync();
Delta without Filter works
If I do the exact same code for delta above but remove the "Filter" operation on date, then I get all the messages in the folder.
This isn't a great solution since I normally only need messages for the last year or 2 years and if there are 15 years of messages it is a huge waste to query everything.
Update on 12/3/2019
I'm still getting this issue. I recently switched back to trying to use Delta again whereas before I was querying everything from the server even though I might only need the last month of data. But that's super wasteful.
This code works fine for most mailboxes but sometimes I encounter a mailbox with this issue.
My code looks like this.
string sStartingTime = startingTime.ToString("yyyy'-'MM'-'dd'T'HH':'mm':'ss") + "Z";
var messageCollectionPage = await client.Users[mailboxsource.GetMailboxIdFromAccountID()].MailFolders[folder.Id].Messages.Delta().Request()
.Filter("receivedDateTime+ge+" + Uri.EscapeDataString(sStartingTime))
.Select(select)
.Header("Prefer", "odata.maxpagesize=" + preferredPageSize)
.OrderBy("receivedDateTime desc")
.GetAsync(cancellationToken);
At around 5000 results the Delta request just stops returning results even though there are 66K items in the folder.
Paul, my peers confirmed there is indeed a 5000-item limit if you apply $filter to a delta query of the message resource.
Within the next day, the docs will also be updated with this information. Thank you for your patience and support!
I have a case where we are doing a select on a domain like:
select * from mydomain where some_val = 'foo' and some_date < '2012-03-01T00:00+01:00'
When iterating the results of this query - we are doing some work and then updating the row and setting the field some_date to the current date/time. Marking that it's processed.
The question I have is will the nexttoken request break when it returns to simpledb to get the next set of records? When it returns to get the next batch - all of the ones in the first batch will now have some_date with a value that no longer is within the original query range.
I don't know how the next-token is implemented to know whether its just a pointer to the next item or whether it somehow is an offset that might "skip" a whole batch of records.
So if we retrieved 3 records at a time and I had this in my domain:
record 1, '2012-01-12T19:20+01:00'
record 2, '2012-02-14T19:20+01:00'
record 3, '2012-01-22T19:20+01:00'
record 4, '2012-01-21T19:20+01:00'
record 5, '2012-02-22T19:20+01:00'
record 6, '2012-01-20T19:20+01:00'
record 7, '2012-01-18T19:20+01:00'
record 8, '2012-01-17T19:20+01:00'
record 9, '2012-02-12T19:20+01:00'
My first execution I would get: record 1, 2, 3
If i set their some_date field to: '2012-03-12T19:20+01:00' before returning for the next-token batch - would the next-token request then return 4,5,6? Or would it return 7,8,9 (because the token was set to start at the 4th record and now 1,2,3 are no longer in the result set).
If it is important - we are using the boto library (python).
would the next-token request then return 4,5,6? Or would it return
7,8,9 [...]?
Good question, this can indeed be a bit confusing - still anything but the former (i.e. 4,5,6) wouldn't make sense for practical usage and Amazon SimpleDB works like so accordingly, see Select:
Operations that run longer than 5 seconds return a time-out error
response or a partial or empty result set. Partial and empty result
sets contain a NextToken value, which allows you to continue the
operation from where it left off [emphasis mine]
Please take note of the additional note in section Request Parameters though, which might be a bit surprising eventually:
Note
The response to a Select operation with ConsistentRead set to
true returns a consistent read. However, for any following Select
operation requests that include a NextToken value, Amazon SimpleDB
ignores the ConsistentRead field, and the subsequent results are
eventually consistent. [emphasis mine]
I am working on application built on ASP.NET MVC 3.0 and displaying the data in MVC WebGrid.
I am using LINQ to get the records from Entities to EntityViewModel. In doing this I have to convert the records from entity to EntityViewModel.
I have 30K records to be displayed in the grid, for each and every record there are 3 flags where It has to go 3 other tables and compare the existence of the record and paint with true or false and display the same in grid.
I am displaying 10 records at a time, but it is bit very slow as I am getting all the records and storing in my application.
The Paging is in place (I mean to say -only 10 records are being displayed in web grid) but all the records are getting loaded into the application which is taking 15-20 seconds. I have checked the place where this time is being spent by the processor. It's happening in the painting place(where every record is being compared with 3 other tables).
I have converted LINQ query to SQL and I can see my SQL query is getting executed under 2 seconds. By this , I can strongly say that, I do not want to spend time on SQL indexing as the speed of SQL query is good enough.
I have two options to implement
1) Caching for MVC
2) Paging(where I should get only first ten records).
I want to go with the paging technique for performance improvement .
Now my question is how do I pass the number 10(no of records to service method) so that It brings up only ten records. And also how do I get the next 10 records when clicking on the next page.
I would post the code, but I cannot do it as it has some sensitive data.
Any example how to tackle this situation, many thanks.
If you're using SQL 2005 + you could use ROW_NUMBER() in your stored procedure:
http://msdn.microsoft.com/en-us/library/ms186734(v=SQL.90).aspx
or else if you just want to do it in LINQ try the Skip() and Take() methods.
As simple as:
int page = 2;
int pageSize = 10;
var pagedStuff = query.Skip((page - 1) * pageSize).Take(pageSize);
You should always, always, always be limiting the amount of rows you get from the database. Unbounded reads kill applications. 30k turns into 300k and then you are just destroying your sql server.
Jfar is on the right track with .Skip and .Take. The Linq2Sql engine (and most entity frameworks) will convert this to SQL that will return a limited result set. However, this doesn't preclude caching the results as well. I recommend doing that as well. That fastest trip to SQL Server is the one you don't have to take. :) I do something like this where my controller method handles paged or un-paged results and caches whatever comes back from SQL:
[AcceptVerbs("GET")]
[OutputCache(Duration = 360, VaryByParam = "*")]
public ActionResult GetRecords(int? page, int? items)
{
int limit = items ?? defaultItemsPerPage;
int pageNum = page ?? 0;
if (pageNum <= 0) { pageNum = 1; }
ViewBag.Paged = (page != null);
var records = null;
if (page != null)
{
records = myEntities.Skip((pageNum - 1) * limit).Take(limit).ToList();
}
else
{
records = myEntities.ToList();
}
return View("GetRecords", records);
}
If you call it with no params, you get the entire results set (/GetRecords). Calling it will params will get you the restricted set (/GetRecords?page=3&items=25).
You could extend this method further by adding .Contains and .StartsWith functionality.
If you do decide to go the custom stored procedure route, I'd recommend using "TOP" and "ROW_NUMBER" to restrict results rather than a temp table.
Personally I would create a custom stored procedure to do this and then call it through Linq to SQL. e.g.
CREATE PROCEDURE [dbo].[SearchData]
(
#SearchStr NVARCHAR(50),
#Page int = 1,
#RecsPerPage int = 50,
#rc int OUTPUT
)
AS
SET NOCOUNT ON
SET FMTONLY OFF
DECLARE #TempFound TABLE
(
UID int IDENTITY NOT NULL,
PersonId UNIQUEIDENTIFIER
)
INSERT INTO #TempFound
(
PersonId
)
SELECT PersonId FROM People WHERE Surname Like '%' + SearchStr + '%'
SET #rc = ##ROWCOUNT
-- Calculate the final offset for paging --
DECLARE #FirstRec int, #LastRec int
SELECT #FirstRec = (#Page - 1) * #RecsPerPage
SELECT #LastRec = (#Page * #RecsPerPage + 1)
-- Final select --
SELECT p.* FROM People p INNER JOIN #TempFound tf
ON p.PersonId = tf.PersonId
WHERE (tf.UID > #FirstRec) AND (tf.UID < #LastRec)
The #rc parameter is the total number of records found.
You obviously have to model it to your own table, but it should run extremely fast..
To bind it to an object in Linq to SQL, you just have to make sure that the final selects fields match the fields of the object it is to be bound to.
I am trying to get the count of the items in a sharepoint document library programatically. The scale I am working with is 30-70000 items. We have usercontrol in a smartpart to display the count . Ours is a TEAM site.
This is the code to get the total count:
SPList VoulnterrList = web.Lists[ListTitle];
SPQuery query = new SPQuery();
query.ViewAttributes = "Scope=\"Recursive\"";
string queries = "<Where><Eq><FieldRef Name='ApprovalStatus' /><Value Type='Choice'>Pending</Value></Eq></Where>";
query.Query = queries;
SPListItemCollection lstitemcollAssoID = VoulnterrList.GetItems(query);
lblCount.Text = "Total Proofs: " + VoulnterrList.Items.Count.ToString() + " Pending Proofs: " + lstitemcollAssoID.Count.ToString();
The problem is this has serious performance issue it takes 75 to 80 sec to load the page. if we comment this page load will decrees to 4 sec. Any better approch for this problem
Ours is sharepoint 2007
Use VoulnterrList.ItemCount instead of VoulnterrList.Items.Count.
When List.Items is used, all items in the list are loaded from the content database. Since we don't actually need the items to get the count this is wasted overhead.
This will fix performance at line 8, but you may still have issues at line 9 depending on the number of results returned by the query.
You can do two optimizations here:
Create an index on one of the column of your list
Use that column in <ViewFields> section of your CAML query so that only that indexed column is retrieved.
This should speed up. See this article on how to create index on column:
http://sharepoint.microsoft.com/Blogs/GetThePoint/Lists/Posts/Post.aspx?ID=162