ZF2 paginator performance with fetchAll without limit? - zend-framework2

In most cases in ZF2 I would build a paginator like this:
public function fetchAll()
{
    $resultSet = $this->tableGateway->select();
    return $resultSet;
}
$iteratorAdapter = new \Zend\Paginator\Adapter\Iterator($this->getSampleTable()->fetchAll());
$paginator = new \Zend\Paginator\Paginator($iteratorAdapter);
The problem with this approach is that it does not limit the query result: fetchAll() returns all the rows, producing heavy traffic between the database and PHP.
In plain PHP I would paginate like this:
$PAGE_NUM = $_GET['page_num'];
$result_num = mysql_query("SELECT * FROM table WHERE some condition");
$result = mysql_query("SELECT * FROM table WHERE some condition LIMIT $PAGE_NUM, 20");
echo do_paginator($result_num, 20, $PAGE_NUM); // this just renders the classic < 1 2 3 4 5 >
The advantage of this is that I fetch from the DB only the data I need for the current page, thanks to MySQL's LIMIT clause. This translates into good performance on queries that return too many records.
Can I use the ZF2 paginator with an effective limit on the results, or do I need to write my own paginator?

Use the Zend\Paginator\Adapter\DbSelect adapter instead; it does exactly what you're asking for, applying LIMIT and OFFSET to the query you give it and fetching only those records.
Additionally this adapter does not fetch all records from the database in order to count them. Instead, the adapter manipulates the original query to produce a corresponding COUNT query. Paginator then executes that COUNT query to get the number of rows. This does require an extra round-trip to the database, but this is many times faster than fetching an entire result set and using count(), especially with large collections of data.
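For illustration, a minimal sketch of that approach, reusing the TableGateway-based table class from the question (the fetchAllPaginated() method name is just for illustration):
// At the top of the table class file
use Zend\Db\Sql\Select;
use Zend\Paginator\Adapter\DbSelect;
use Zend\Paginator\Paginator;

public function fetchAllPaginated()
{
    // Build a Select for the gateway's table; DbSelect applies
    // LIMIT/OFFSET to it and derives the COUNT query from it
    $select = new Select($this->tableGateway->getTable());
    $paginatorAdapter = new DbSelect($select, $this->tableGateway->getAdapter());
    return new Paginator($paginatorAdapter);
}
The controller then only has to pick the page and page size:
$paginator = $this->getSampleTable()->fetchAllPaginated();
$paginator->setCurrentPageNumber((int) $this->params()->fromQuery('page', 1));
$paginator->setItemCountPerPage(20);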

Related

apply ordering on limit query of active record

I need to find 10 records first and then apply ordering on that.
Model.all.limit(10).order('cast(code as integer)')
The above code applies the order to the model first and then the limit, so I get the same codes in my listing for the given model. But I want the limit applied first, and the ordering applied to the fetched results.
When you call .all on a model, it executes the query against the DB and returns all records. To apply a limit you have to write it before .all (Model.limit(10).all), but then you can't use SQL functions to order the data. So to get the first 10 records and apply ordering to them, try this:
records = Model.limit(10).all.sort_by{|o| o.code.to_i}
or
records = Model.first(10).sort_by{|o| o.code.to_i}
Try this:
Model.limit(10).sort{|p, q| q.cost <=> p.cost}
All you need to do is to remove the .all method:
Model.limit(10).order('cast(code as integer)')
If you want to get 10 random records and then sort them without getting all records from the database then you could use
Model.limit(10).order("RANDOM()")
This uses PostgreSQL's RANDOM() function to randomise the records.
Now you have an array of 10 random records and you can use .sort_by as described by BitOfUniverse.
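Putting the two answers together, a one-liner that fetches 10 random rows and then orders them in Ruby (a sketch against the Model above):
# 10 random records from the DB, then sorted in Ruby by the integer value of code
records = Model.limit(10).order("RANDOM()").sort_by { |m| m.code.to_i }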

How to reduce Azure Table Storage latency?

I have a rather huge Table on Azure: 30 million rows, 5–100 KB each.
Each RowKey is a Guid and the PartitionKey is the first part of the Guid, for example:
PartitionKey = "1bbe3d4b"
RowKey = "1bbe3d4b-2230-4b4f-8f5f-fe5fe1d4d006"
Table has 600 reads and 600 writes (updates) per second with an average latency of 60ms. All queries use both PartitionKey and RowKey.
BUT, some reads take up to 3000ms (!). On average, more than 1% of all reads take longer than 500ms, and there's no correlation with entity size (a 100 KB row may be returned in 25ms and a 10 KB one in 1500ms).
My application is an ASP.NET MVC 4 web site running on 4-5 Large instances.
I have read all the MSDN articles on Azure Table Storage performance targets and have already done the following:
UseNagle is turned Off
Expect100Continue is also disabled
MaxConnections for the table client is set to 250 (raising it to 1000–5000 makes no difference)
Also I checked that:
Storage account monitoring counters have no throttling errors
There are "waves" in performance, though they do not depend on load
What could be the reason for such performance issues, and how can I improve things?
I use the MergeOption.NoTracking setting on the DataServiceContext.MergeOption property for extra performance if I have no intention of updating the entity anytime soon. Here is an example:
var account = CloudStorageAccount.Parse(RoleEnvironment.GetConfigurationSettingValue("DataConnectionString"));
var tableStorageServiceContext = new AzureTableStorageServiceContext(account.TableEndpoint.ToString(), account.Credentials);
tableStorageServiceContext.RetryPolicy = RetryPolicies.Retry(3, TimeSpan.FromSeconds(1));
tableStorageServiceContext.MergeOption = MergeOption.NoTracking;
tableStorageServiceContext.AddObject(AzureTableStorageServiceContext.CloudLogEntityName, newItem);
tableStorageServiceContext.SaveChangesWithRetries();
Another problem might be that you are retrieving the entire entity with all its properties even though you intend to use only one or two of them. This is of course wasteful, but can't easily be avoided. However, if you use Slazure you can use query projections to retrieve only the entity properties you are interested in from table storage, which gives you better query performance. Here is an example:
using SysSurge.Slazure;
using SysSurge.Slazure.Linq;
using SysSurge.Slazure.Linq.QueryParser;
namespace TableOperations
{
    public class MemberInfo
    {
        public string GetRichMembers()
        {
            // Get a reference to the table storage
            dynamic storage = new QueryableStorage<DynEntity>("UseDevelopmentStorage=true");
            // Build the table query and make sure it only returns members who earn
            // more than $60k/yr, using a "Where" query filter, and that only the
            // "Name" and "Salary" entity properties are retrieved from the table
            // storage, to make the query quicker.
            QueryableTable<DynEntity> membersTable = storage.WebsiteMembers;
            var memberQuery = membersTable.Where("Salary > 60000").Select("new(Name, Salary)");
            var result = "";
            // Cast the query result to a dynamic so that we can access its dynamic properties
            foreach (dynamic member in memberQuery)
            {
                // Show some information about the member
                result += "LINQ query result: Name=" + member.Name + ", Salary=" + member.Salary + "<br>";
            }
            return result;
        }
    }
}
Full disclosure: I coded Slazure.
You could also consider pagination if you are retrieving large data sets, for example:
// Skip the first 50 members, then retrieve the next 50
var memberQuery = membersTable.Where("Salary > 60000").Skip(50).Take(50);
Typically, if a specific query requires scanning a large number of rows, it will take longer. Is the behavior you are seeing specific to a query/data? Or are you seeing the performance vary for the same data and query?

Limit the amount of results returned in CloudKit

Is there any way to limit the amount of results that are returned in a CKQuery?
In SQL, it is possible to run a query like SELECT * FROM Posts LIMIT 10,15. Is there anything like the last part of the query, LIMIT 10,15 in CloudKit?
For example, I would like to load the first 5 results and then, once the user scrolls down, load the next 5 results, and so on. In SQL it would be LIMIT 0,5, then LIMIT 5,5, and so on.
One thing that would work is a for loop, but it would be very expensive: I would have to select all of the values from iCloud and then loop through them to figure out which 5 to select. I'm anticipating a lot of different posts in the database, so I would like to load only the ones that are needed.
I'm looking for something like this:
var limit: NSLimitDescriptor = NSLimitDescriptor(5, 10)
query.limit = limit
CKContainer.defaultContainer().publicCloudDatabase.addOperation(CKQueryOperation(query: query))
//fetch values and return them
You submit your CKQuery to a CKQueryOperation. CKQueryOperation has the concepts of a cursor and a resultsLimit, which let you fetch your query results in batches. As described in the documentation:
To perform a new search:
1) Initialize a CKQueryOperation object with a CKQuery object containing
the search criteria and sorting information for the records you want.
2) Assign a block to the queryCompletionBlock property so that you can
process the results and execute the operation.
If the search yields many records, the operation object may deliver a
portion of the total results to your blocks immediately, along with a
cursor for obtaining the remaining records. If a cursor is provided,
use it to initialize and execute a separate CKQueryOperation object
when you are ready to process the next batch of results.
3) Optionally configure the return results by specifying values for the
resultsLimit and desiredKeys properties.
4) Pass the query operation object to the addOperation: method of the
target database to execute the operation against that database.
So it looks like:
var q = CKQuery(/* ... */)
var qop = CKQueryOperation(query: q)
qop.resultsLimit = 5
qop.queryCompletionBlock = { (c: CKQueryCursor!, e: NSError!) -> Void in
    if nil != c {
        // there is more to do; create another op
        var newQop = CKQueryOperation(cursor: c!)
        newQop.resultsLimit = qop.resultsLimit
        newQop.queryCompletionBlock = qop.queryCompletionBlock
        // Hang on to it, if we must
        qop = newQop
        // submit
        ....addOperation(qop)
    }
}
....addOperation(qop)

ASP.NET MVC3 - Performance improvement through paging concept, I need an example?

I am working on an application built on ASP.NET MVC 3.0 that displays data in the MVC WebGrid.
I am using LINQ to get the records from my entities into EntityViewModel objects, so each record has to be converted from entity to EntityViewModel.
I have 30K records to display in the grid, and each record carries 3 flags; for each flag the code has to visit 3 other tables, check whether the record exists there, and paint true or false in the grid.
I am displaying 10 records at a time, but it is very slow, as I am fetching all the records and holding them in the application.
Paging is in place (I mean only 10 records are displayed in the WebGrid), but all the records get loaded into the application, which takes 15-20 seconds. I have checked where this time is spent: it happens in the painting step (where every record is compared against the 3 other tables).
I have converted the LINQ query to SQL and can see the SQL executes in under 2 seconds. From this I can confidently say I do not want to spend time on SQL indexing, as the speed of the SQL query is good enough.
I have two options to implement
1) Caching for MVC
2) Paging (where I fetch only the first ten records).
I want to go with the paging technique for the performance improvement.
Now my question is: how do I pass the number 10 (the number of records) to the service method so that it brings back only ten records, and how do I get the next 10 records when the user clicks the next page?
I would post the code, but I cannot do it as it has some sensitive data.
Any example of how to tackle this situation would be appreciated, many thanks.
If you're using SQL Server 2005+ you could use ROW_NUMBER() in your stored procedure:
http://msdn.microsoft.com/en-us/library/ms186734(v=SQL.90).aspx
or else if you just want to do it in LINQ try the Skip() and Take() methods.
As simple as:
int page = 2;
int pageSize = 10;
var pagedStuff = query.Skip((page - 1) * pageSize).Take(pageSize);
You should always, always, always limit the number of rows you get from the database. Unbounded reads kill applications. 30k turns into 300k, and then you are just destroying your SQL Server.
Jfar is on the right track with .Skip and .Take. The Linq2Sql engine (and most entity frameworks) will convert this to SQL that returns a limited result set. However, this doesn't preclude caching the results as well; I recommend doing that too. The fastest trip to SQL Server is the one you don't have to take. :) I do something like this, where my controller method handles paged or un-paged results and caches whatever comes back from SQL:
[AcceptVerbs("GET")]
[OutputCache(Duration = 360, VaryByParam = "*")]
public ActionResult GetRecords(int? page, int? items)
{
int limit = items ?? defaultItemsPerPage;
int pageNum = page ?? 0;
if (pageNum <= 0) { pageNum = 1; }
ViewBag.Paged = (page != null);
var records = null;
if (page != null)
{
records = myEntities.Skip((pageNum - 1) * limit).Take(limit).ToList();
}
else
{
records = myEntities.ToList();
}
return View("GetRecords", records);
}
If you call it with no params (/GetRecords), you get the entire result set. Calling it with params (/GetRecords?page=3&items=25) gets you the restricted set.
You could extend this method further by adding .Contains and .StartsWith functionality.
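As a sketch of that extension (the search parameter and the Name property are hypothetical, not part of the original code):
public ActionResult GetRecords(int? page, int? items, string search)
{
    int limit = items ?? defaultItemsPerPage;
    int pageNum = page ?? 1;
    if (pageNum <= 0) { pageNum = 1; }
    var query = myEntities.AsQueryable();
    if (!string.IsNullOrEmpty(search))
    {
        // Filter before paging so Skip/Take operate on the narrowed set
        query = query.Where(e => e.Name.Contains(search));
    }
    var records = query.Skip((pageNum - 1) * limit).Take(limit).ToList();
    return View("GetRecords", records);
}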
If you do decide to go the custom stored procedure route, I'd recommend using "TOP" and "ROW_NUMBER" to restrict results rather than a temp table.
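For reference, a minimal ROW_NUMBER page query might look like this (a sketch only; the People table and Surname ordering are borrowed from the example below):
DECLARE @Page int, @RecsPerPage int
SET @Page = 2
SET @RecsPerPage = 50
-- Number the rows once, then keep only the requested page
SELECT PersonId, Surname
FROM (
    SELECT PersonId, Surname,
        ROW_NUMBER() OVER (ORDER BY Surname) AS RowNum
    FROM People
) AS Numbered
WHERE RowNum BETWEEN (@Page - 1) * @RecsPerPage + 1 AND @Page * @RecsPerPage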
Personally I would create a custom stored procedure to do this and then call it through Linq to SQL. e.g.
CREATE PROCEDURE [dbo].[SearchData]
(
    @SearchStr NVARCHAR(50),
    @Page int = 1,
    @RecsPerPage int = 50,
    @rc int OUTPUT
)
AS
SET NOCOUNT ON
SET FMTONLY OFF
DECLARE @TempFound TABLE
(
    UID int IDENTITY NOT NULL,
    PersonId UNIQUEIDENTIFIER
)
INSERT INTO @TempFound
(
    PersonId
)
SELECT PersonId FROM People WHERE Surname LIKE '%' + @SearchStr + '%'
SET @rc = @@ROWCOUNT
-- Calculate the final offset for paging --
DECLARE @FirstRec int, @LastRec int
SELECT @FirstRec = (@Page - 1) * @RecsPerPage
SELECT @LastRec = (@Page * @RecsPerPage + 1)
-- Final select --
SELECT p.* FROM People p INNER JOIN @TempFound tf
    ON p.PersonId = tf.PersonId
WHERE (tf.UID > @FirstRec) AND (tf.UID < @LastRec)
The @rc parameter is the total number of records found.
You obviously have to adapt it to your own table, but it should run extremely fast.
To bind it to an object in LINQ to SQL, you just have to make sure that the final select's fields match the fields of the object it is bound to.
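For example, if the procedure is added to a LINQ to SQL DataContext (so the designer generates a SearchData method; the MyDataContext name is hypothetical), calling it could look like:
int? totalFound = 0;
using (var db = new MyDataContext())
{
    // Page 2, 50 records per page; totalFound receives the @rc OUTPUT value
    var people = db.SearchData("smith", 2, 50, ref totalFound).ToList();
    // totalFound can now drive the pager's page-count calculation
}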

Select certain number of records for batch processing

Hi, is it possible using Entity Framework and/or LINQ to select a certain number of rows? For example, I want to select rows 0 - 500000 and assign these records to the List VariableAList object, then select rows 500001 - 1000000 and assign them to the List VariableBList object, etc.
The Numbers object looks like ID, Number, DateCreated, DateAssigned, etc.
Sounds like you're looking for the .Take(int) and .Skip(int) methods
using (YourEntities db = new YourEntities())
{
    // EF requires sorted input before .Skip (see the note below), so order first
    var VariableAList = db.Numbers
        .OrderBy(n => n.ID)
        .Take(500000);
    var VariableBList = db.Numbers
        .OrderBy(n => n.ID)
        .Skip(500000)
        .Take(500000);
}
You may want to be wary of the size of these lists in memory.
Note: You also may need an .OrderBy clause prior to using .Skip or .Take--I vaguely remember running into this problem in the past.
