Querying local cache first before querying server in Breeze JS

I have an app using Breeze to query the data. I want to check the local cache first and then the server if no results are returned (I followed John Papa's SPA Jumpstart course). However, I have found a flaw in my logic which I am not sure how to fix. Assume I have 10 items that match my query.
Situation 1 (which works): I go to the list page (Page A) displaying all 10. It hits the server as the cache is empty and adds all 10 to the cache. Then I go to the page displaying 1 result (Page B), which is found in the cache. So all good.
Situation 2 (the problem): I go to the page displaying 1 record first (Page B). Then I go to my list page (Page A), which checks the cache, finds 1 record, and because of this line (if (recordsInCache.length > 0)) exits and only shows that 1 record.
I somehow need to know that there are more records on the server (9) that are NOT in the cache, i.e. the total record count for this query is actually 10, I have 1, and therefore I have to hit the server for the other 9.
Here is my query for Page A:
function getDaresToUser(daresObservable, criteria, forceServerCall)
{
    var query = EntityQuery.from('Dares')
        .where('statusId', '!=', enums.dareStatus.Deleted)
        .where('toUserId', '==', criteria.userId)
        .expand("fromUser, toUser")
        .orderBy('deadlineDate, changedDate');
    return dataServiceHelper.executeQuery(query, daresObservable, false, forceServerCall);
}
and here is my query for Page B (single item):
function getDare(dareObservable, criteria, forceServerCall)
{
    var query = EntityQuery.from('Dares')
        .expand("fromUser, toUser")
        .where('dareId', '==', criteria.dareId);
    return dataServiceHelper.executeQuery(query, dareObservable, true, forceServerCall);
}
function executeQuery(query, itemsObservable, singleEntity, forceServerCall)
{
    // check local cache first
    if (!manager.metadataStore.isEmpty() && !forceServerCall)
    {
        var recordsInCache = executeLocalQuery(query, itemsObservable, singleEntity);
        if (recordsInCache.length > 0)
        {
            callCompleted();
            return Q.resolve();
        }
    }
    return manager.executeQuery(query)
        .then(querySucceeded)
        .fail(queryFailed);
}
function executeLocalQuery(query, itemsObservable, singleEntity)
{
    var recordsInCache = manager.executeQueryLocally(query);
    if (recordsInCache.length > 0)
    {
        processQueryResults(recordsInCache, itemsObservable, singleEntity, true);
    }
    return recordsInCache;
}
Any advice appreciated...

If you want to just hit the server for comparison purposes, then at some point (either when loading up your app or when you hit the list page) call inlineCount to compare the total on the server vs what you already have, as shown in this answer: stackoverflow.com/questions/16390897/counts-in-breeze-js/…
A way you can use this creatively while you are querying for the single record would be like this:
Set some variable in your view model (or somewhere similar) equal to the total count:
var totalCount = 0;
When you query the single record, get the inline count:
var query = EntityQuery.from('Dares')
    .expand("fromUser, toUser")
    .where('dareId', '==', criteria.dareId)
    .inlineCount(true);
and set totalCount = data.inlineCount. Do the same thing when you get the full items list: set totalCount to the inline count then too, so you always know whether you have all of the entities.
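A hedged sketch of how the executeQuery helper from the question could use that running total. The names totalCounts and queryKey are hypothetical, not Breeze APIs; manager, processQueryResults and queryFailed are from the question:

// Sketch only: remember the server's total per query so the cache-first check
// knows whether the cache is complete.
var totalCounts = {};

function executeQuery(query, itemsObservable, singleEntity, queryKey)
{
    if (!manager.metadataStore.isEmpty())
    {
        var recordsInCache = manager.executeQueryLocally(query);
        // Only trust the cache when we know the server total and have all of it.
        if (totalCounts[queryKey] !== undefined && recordsInCache.length >= totalCounts[queryKey])
        {
            processQueryResults(recordsInCache, itemsObservable, singleEntity, true);
            return Q.resolve();
        }
    }
    return manager.executeQuery(query.inlineCount(true))
        .then(function (data) {
            totalCounts[queryKey] = data.inlineCount; // remember the server total
            processQueryResults(data.results, itemsObservable, singleEntity, false);
        })
        .fail(queryFailed);
}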

I’ve been thinking about this problem more in the last year (and have since moved from Durandal + Breeze to Angular + Breeze). In Angular you can cache the service call easily using
return $resource(xyz + params, {'query': { method: 'GET', cache: true, isArray: true }}).query(successArrayDataLoad, errorDataLoad);
I guess Angular caches the params of this query and knows when it already has it. So when I switch this method to use Breeze I lose this functionality and all my List calls hit the server every time.
So the real problem here is List data. Single Entities can always check the local cache and if nothing is returned then check the server (because you expect exactly 1).
However, List data varies by params. For example, if I have a GetGames call which takes in a CreatedByUserId, every time I supply a new CreatedByUserId I have to go back to the server.
So I think what I really need to do here to cache my List calls is to cache the Key for each call which is a combination of the QueryName and the Params.
For example, GetGames1 for UserID 1 and then GetGames2 for UserId 2.
The logic would be: Check the Angular cache to see if this call has been made before in this session. If it has, then check the local cache first. If nothing is returned, check the server.
If it has not, check the server as the local cache MIGHT have some data in it for this query but it's not guaranteed to be the full set.
The only other way around it would be to hit the server each time first to get a count for that Query + Params and then hit the local cache and compare the count, but that is more inefficient.
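A minimal sketch of the query-key idea above, assuming a Breeze EntityManager named manager; queriedKeys and buildKey are illustrative names, not part of Breeze or Angular:

// Track which (queryName + params) combinations have been fetched this session.
var queriedKeys = {};

function buildKey(queryName, params)
{
    return queryName + ':' + JSON.stringify(params); // e.g. 'GetGames:{"createdByUserId":1}'
}

function getList(query, queryName, params)
{
    var key = buildKey(queryName, params);
    if (queriedKeys[key])
    {
        // This exact call was made before, so the cache holds the full set.
        return Q.resolve(manager.executeQueryLocally(query));
    }
    return manager.executeQuery(query).then(function (data) {
        queriedKeys[key] = true; // the full set for this key is now cached
        return data.results;
    });
}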
Thoughts?

Related

ASP.Net MVC PagedList not working properly

I am using PagedList to display paging on my search payment results page. I want to display only 5 payments on each page. The search criteria I am testing returns 15 records. I am expecting only 5 records on first page with page numbers 1,2,3 at bottom. I see the page numbers as expected at the bottom but all 15 records get displayed on every page. I have debugged the code and found out that StaticPagedList function is returning 15 records instead of 5. My controller action code is as given below:
public ViewResult ViewPayment(int? billerId, int? billAccount, int? page)
{
    var pageIndex = (page ?? 1) - 1;
    var pageSize = 5;
    List<Payment> paymentList = new List<Payment>();
    paymentList = _paymentBusiness.GetPayments(billerId, billAccount);
    var paymentsAsIPagedList = new StaticPagedList<Payment>(paymentList, pageIndex + 1, pageSize, paymentList.Count);
    ViewBag.OnePageOfPayments = paymentsAsIPagedList;
    return View(paymentList);
}
Please let me know if I have mistaken anything.
You should be querying only 5 records from your business layer. Right now you are not passing the page number or anything there. It's a bit of a waste to query all of them if you are going to only display some of them anyway.
public ViewResult ViewPayment(int? billerId, int? billAccount, int? page)
{
    int pageNum = page ?? 1;
    int pageSize = 5;
    IPagedList<Payment> paymentPage = _paymentBusiness.GetPayments(billerId, billAccount, pageNum, pageSize);
    return View(paymentPage);
}

// Business layer
public IPagedList<Payment> GetPayments(int? billerId, int? billAccount, int page, int pageSize)
{
    IQueryable<Payment> payments = db.Payments.Where(p => ....).OrderBy(p => ...);
    return new PagedList<Payment>(payments, page, pageSize);
}
I would suggest you do something like the above. Change it so the business/data layer gives you back the paged list. It can get the 5 results and the total count with two queries, and return the page model to your controller.
The example gets a page using PagedList<T>, which runs Skip() and Take() internally. Remember to order your results before creating the page.
Importantly, we now do not fetch all the items from the database, only the small subset we are interested in.
If you are using e.g. ADO.NET that requires you to use plain SQL, you can use a query like:
SELECT * FROM Payments ORDER BY id OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY;
Offset should be set to (page - 1) * pageSize, and the number after FETCH NEXT is the page size. Note this only works on SQL Server 2012+. Other databases have similar abilities.
Also, with ADO.NET you will have to make the two queries needed manually (page + total count), and use StaticPagedList instead of PagedList, which allows you to give it the subset directly.
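To make the offset arithmetic concrete, a tiny sketch (JavaScript here, purely for illustration; the clause itself is the SQL Server syntax shown above, and the column name 'id' is an assumption):

// Illustrative only: build the paging clause described above.
function pagingClause(page, pageSize) {
    var offset = (page - 1) * pageSize; // rows to skip before the requested page
    return 'ORDER BY id OFFSET ' + offset + ' ROWS FETCH NEXT ' + pageSize + ' ROWS ONLY';
}
// pagingClause(3, 5) -> "ORDER BY id OFFSET 10 ROWS FETCH NEXT 5 ROWS ONLY"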
An alternate approach to PagedList (which does not provide async methods) is DataTables.net (https://datatables.net).
It is a client-side JavaScript framework for paged tables and can be configured down to very low levels. It would allow you to do what you need, and also gives you custom sorting, caching, searching, and many other features out of the box.
Just a suggestion: I have used the PagedList library myself in the past, and since discovering DataTables.net I have not looked back. Great library, and it makes your life easy.

Observed Lists and Maps, and Firebase, oh my. How can I improve this mess?

So this is what I ended up with to get realtime starring/liking (of communities, in my case) working with a Firebase datastore. It's a mess and surely I'm missing some fundamentals.
Here my element gets communities, each as a Map community stored in an observed List communities. It has to rewrite that List several times as it changes each community Map based on the changed star count, the user's starred state, and some other fun:
getCommunities() {
  // Since we call this method a second time after user
  // signed in, clear the communities list before we recreate it.
  if (communities.length > 0) { communities.clear(); }

  var firebaseRoot = new db.Firebase(firebaseLocation);
  var communityRef = firebaseRoot.child('/communities');

  // TODO: Undo the limit of 20; https://github.com/firebase/firebase-dart/issues/8
  communityRef.limit(20).onChildAdded.listen((e) {
    var community = e.snapshot.val();

    // snapshot.name is Firebase's ID, i.e. "the name of the Firebase location",
    // so we'll add that to our local item list.
    community['id'] = e.snapshot.name();
    print(community['id']);

    // If the user is signed in, see if they've starred this community.
    if (app.user != null) {
      firebaseRoot.child('/users/' + app.user.username + '/communities/' + community['id']).onValue.listen((e) {
        if (e.snapshot.val() == null) {
          community['userStarred'] = false;
          // TODO: Add community star_count?!
        } else {
          community['userStarred'] = true;
        }
        print("${community['userStarred']}, star count: ${community['star_count']}");

        // Replace the community in the observed list w/ our updated copy.
        communities
          ..removeWhere((oldItem) => oldItem['alias'] == community['alias'])
          ..add(community)
          ..sort((m1, m2) => m1["updatedDate"].compareTo(m2["updatedDate"]));
        communities = toObservable(communities.reversed.toList());
      });
    }

    // If no updated date, use the created date.
    if (community['updatedDate'] == null) {
      community['updatedDate'] = community['createdDate'];
    }

    // Handle the case where no star count yet.
    if (community['star_count'] == null) {
      community['star_count'] = 0;
    }

    // The live-date-time element needs parsed dates.
    community['updatedDate'] = DateTime.parse(community['updatedDate']);
    community['createdDate'] = DateTime.parse(community['createdDate']);

    // Listen for realtime changes to the star count.
    communityRef.child(community['alias'] + '/star_count').onValue.listen((e) {
      int newCount = e.snapshot.val();
      community['star_count'] = newCount;

      // Replace the community in the observed list w/ our updated copy.
      // TODO: Re-writing the list each time is ridiculous!
      communities
        ..removeWhere((oldItem) => oldItem['alias'] == community['alias'])
        ..add(community)
        ..sort((m1, m2) => m1["updatedDate"].compareTo(m2["updatedDate"]));
      communities = toObservable(communities.reversed.toList());
    });

    // Insert each new community into the list.
    communities.add(community);

    // Sort the list by the item's updatedDate, then reverse it.
    communities.sort((m1, m2) => m1["updatedDate"].compareTo(m2["updatedDate"]));
    communities = toObservable(communities.reversed.toList());
  });
}
Here we toggle the star, which again replaces the observed communities List a few times as we update the count in the affected community Maps and thus rewrite the List to reflect that:
toggleStar(Event e, var detail, Element target) {
  // Don't fire the core-item's on-click, just the icon's.
  e.stopPropagation();

  if (app.user == null) {
    app.showMessage("Kindly sign in first.", "important");
    return;
  }

  bool isStarred = (target.classes.contains("selected"));
  var community = communities.firstWhere((i) => i['id'] == target.dataset['id']);

  var firebaseRoot = new db.Firebase(firebaseLocation);
  var starredCommunityRef = firebaseRoot.child('/users/' + app.user.username + '/communities/' + community['id']);
  var communityRef = firebaseRoot.child('/communities/' + community['id']);

  if (isStarred) {
    // If it's starred, time to unstar it.
    community['userStarred'] = false;
    starredCommunityRef.remove();

    // Update the star count.
    communityRef.child('/star_count').transaction((currentCount) {
      if (currentCount == null || currentCount == 0) {
        community['star_count'] = 0;
        return 0;
      } else {
        community['star_count'] = currentCount - 1;
        return currentCount - 1;
      }
    });

    // Update the list of users who starred.
    communityRef.child('/star_users/' + app.user.username).remove();
  } else {
    // If it's not starred, time to star it.
    community['userStarred'] = true;
    starredCommunityRef.set(true);

    // Update the star count.
    communityRef.child('/star_count').transaction((currentCount) {
      if (currentCount == null || currentCount == 0) {
        community['star_count'] = 1;
        return 1;
      } else {
        community['star_count'] = currentCount + 1;
        return currentCount + 1;
      }
    });

    // Update the list of users who starred.
    communityRef.child('/star_users/' + app.user.username).set(true);
  }

  // Replace the community in the observed list w/ our updated copy.
  communities.removeWhere((oldItem) => oldItem['alias'] == community['alias']);
  communities.add(community);
  communities.sort((m1, m2) => m1["updatedDate"].compareTo(m2["updatedDate"]));
  communities = toObservable(communities.reversed.toList());
  print(communities);
}
There's also some other craziness where we have to get the list of communities again when app.changes fires, because we only load app.user after the app and list initially load, and now that we have the user we need to turn on the appropriate stars. So my attached() looks like:
attached() {
  app.pageTitle = "Communities";
  getCommunities();
  app.changes.listen((List<ChangeRecord> records) {
    if (app.user != null) {
      getCommunities();
    }
  });
}
There, it seems I could just be getting the stars and updating each affected community Map, then repopulating the observed communities List, but that's the least of it.
The full thing: https://gist.github.com/DaveNotik/5ccdc9e74429cf87d641
How can I improve all this Map/List management, e.g. where every time I change a community Map, I have to rewrite the whole communities List? Should I be thinking of it differently?
What about all this querying Firebase? Surely, there's a better way, but it seems I need to do a lot to keep it realtime, and also the element gets attached and detached, so it seems I need to run getCommunities() each time. Unless the OOP way is objects get created, and they're always there to be observed whenever the element is attached? I'm missing those fundamentals.
This app.changes business to handle the case where we load the list before we have the app.user (which then means we want to load her stars) - is there a better way?
Other ridiculousness?
Big question, I know. Thank you for helping me get a handle on the right approach as I move forward!
I think there are two different ways to choose from if you want to keep your application's data in real-time sync with the server database:
1 Polling (pull method, i.e. the client pulls the data from the server)
The application polls, i.e. requests the updated data from the server. Polling can be automatic (for example with an interval of 60s) or requested by the user (= refresh). A short automatic interval will cause high load on the server, and with a long interval you lose the real-time feeling.
2 Full-duplex (push method, i.e. the server can push the data to the client)
The application and the server have a full-duplex connection between them, and the server is able to send the data, or a notification that data is available, to the client. Then the client can decide whether or not to retrieve the data.
This is the modern method, because it keeps the network traffic and the server load to a minimum while still providing real-time updates.
Firebase boasts this kind of updates, but I'm not sure whether it's full-duplex or just a clever way of polling. The WebSocket protocol is a real full-duplex connection, and Dart's server side supports it.
The updated data from a server can include:
1 A full dataset
Basically the server sends a full dataset (= the initial query) and doesn't "know" anything about the updated data. This is the easiest way to go if you have reasonably small datasets. Many times you'll have very small datasets among the big ones, so this way can be useful.
2 A dataset including new data only
The server can send a dataset based on a modified timestamp, i.e. every time a record in the database changes, a timestamp for the update is saved, and the query can be filtered based on this timestamp. In other words, the application knows when it last updated the data and then requests newer data.
3 A changed record
The server keeps track of updated data and sends it to the application. The data can be sent record by record as changes occur, or the server can collect the data into bigger chunks to be sent. This method requires the server to keep track of every connected client in order to send the correct data to each one. Once you add an authentication process for clients, i.e. not all data can be sent to everyone, it can get quite complicated.
I think the easiest way is to use method number 2 for the updated data.
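As a rough illustration of method 2, a sketch in JavaScript. All names here, like the /items endpoint and the updatedAt field, are assumptions:

// Pull only records changed since the last sync.
let lastSync = 0; // newest 'updatedAt' timestamp seen so far

async function pullChanges() {
  const res = await fetch('/items?since=' + lastSync);
  const changed = await res.json();
  for (const item of changed) {
    lastSync = Math.max(lastSync, item.updatedAt); // advance the watermark
    merge(item); // apply to the local model (see the merge sketch below)
  }
}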
Last thing...
What to do with the data received?
1 Handle everything as new
If the application receives updated data, it destroys/clears all the lists and maps and recreates/refills them with the new data. Typical problems with this are that the user loses their current position on a page, or the data the user was looking at jumps around. If the application has modified or extended the old data for some reason, all those modifications are lost. This method works OK if a user requests a refresh.
2 Update only the changed data
The application never clears the initial lists or maps, it just updates them with the newly received data. Typically you will construct a new combined map from the queried data for your specific need (for example a certain view). The combined map already has all the information you want to show in the specific view (default values even if the initial queries didn't have data for a field), and you just update new values in it.
If the updated information needs a new member in the list, you just add it at the end.
If the updated information requires a deletion from the list, it might be a good idea to use an extra field "active" and filter the list/map with it. With filtering you won't lose any references.
If you need to sort or filter the data, it should be done by the view or on user request. Basically the data is stored in the application and updated as needed. When a user needs to see the data in a specific way, the view should show it the proper way. This is called model-view-controller (MVC), and the main idea is to separate the data from the view.
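A small sketch of option 2 ("update only the changed data"), with purely illustrative names:

// Keep records in a map keyed by id; update in place, add at the end,
// and soft-delete via an 'active' flag instead of removing entries.
const store = new Map();

function merge(record) {
  const existing = store.get(record.id);
  if (existing) {
    Object.assign(existing, record); // update changed fields in place
  } else {
    store.set(record.id, Object.assign({ active: true }, record)); // new member
  }
}

function visibleItems() {
  // Filtering (and sorting) belongs to the view layer, per the MVC note above.
  return Array.from(store.values()).filter(item => item.active);
}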
I'm sorry this long answer didn't answer any of your questions directly, but I tried to cut this challenge into smaller chunks. Many times you can see an interface between these chunks, and you can design and organize your code nicely by using these interfaces.

Invalidate or clear certain entity types from breeze.js cache

Given a fairly standard scenario in an application of having a list of entities that can be added/edited/deleted, etc.: when I first load the page, I query the entities with Breeze. All good. If I edit an entity, Breeze sees this and the entity is saved AND updated in the cache, so when I return to the page with the list, the item shows any relevant changes. If I delete an entity, Breeze will again save that change AND update its cache so the entity no longer appears when I return to the list page and requery the data (locally from the cache, I should point out).
However, if I add a new entity, it does not appear on the list page (assuming it satisfies the requirements of the query). I'm guessing that breeze is caching the results of a query specific to that query, rather than actually querying the cache (does that make sense)?
Assuming that is the case, is there a way of telling breeze to remove or invalidate cache items relating to a specific entity type, as opposed to clearing the cache completely? I can always avoid querying the cache and go directly to the server, but that seems a waste when I know breeze has the newly created entity in the cache, it just isn't showing it to me.
I use a single method for a lot of my queries, so it may just be the way I'm handling the local querying in that method, therefore I have included it below, in case that is the cause.
var defaultQuery = function (observable, resourceName, orderby, where, expand, forceRemote, localEntityName, localCountThreshold, page, count) {
    var query = EntityQuery.from(resourceName);
    if (orderby) {
        query = query.orderBy(orderby);
        if (page) {
            query = query.skip(pageSize * (page()));
            query = query.take(pageSize);
            query = query.inlineCount();
        }
    }
    if (where)
        query = query.where(where);
    if (expand)
        query = query.expand(expand);
    if (!forceRemote) {
        if (localEntityName) {
            query = query.toType(localEntityName);
        }
        var localResult = manager.executeQueryLocally(query);
        if (localCountThreshold) {
            if (localResult.length > localCountThreshold) {
                observable(localResult);
                return Q.resolve();
            }
        } else {
            if (localResult.length > 0) {
                observable(localResult);
                return Q.resolve();
            }
        }
    }
    return query.using(manager)
        .execute()
        .then(function (data) {
            if (observable) {
                observable(data.results);
            }
            if (count) {
                count(data.inlineCount);
            }
            var logMsg = 'Retrieved ' + resourceName + ' from remote data source with order by {' + orderby + '}, where clause {' + where + '} and expand properties {' + expand + '}';
            if (page)
                logMsg += ' for page {' + page() + '} with total record count {' + count() + '}';
            log(logMsg, data, true);
        })
        .fail(queryFailed);
};
Usually in questions like this, I have just misunderstood some aspect of how breeze works, so if anyone can correct me, I'd be very appreciative.
Thanks!
Update
I have included the flow of actions step by step to show in more detail what I meant.
Execute query for all objects
- Query is executed locally first; returns nothing as no cache data exists.
- Query is then executed against the database; returns [A,B,C].
Edit A - rename to AA - call saveChanges()
Execute the same query for all objects
- Query is executed locally first; returns [AA,B,C].
- Query skips executing against the database as results were found in the cache.
Delete B (setDeleted()) - call saveChanges()
Execute the same query for all objects
- Query is executed locally first; returns [AA,C].
- Query skips executing against the database as results were found in the cache.
Create a new entity - set its name to D - call saveChanges()
Execute the same query for all objects
- Query is executed locally first; returns [AA,C] (does not include the new D object!)
The point I'm trying to highlight is that local queries return saved changes for edits and deletes, but not for adds. I could remove entities from the cache using setDetached as Jay suggested, but I would need to do that for all entities of a specific type one by one. That could be a big process.
Update again
It appears the behaviour I saw was the result of some mistake of my own. Having double- and triple-checked everything as a result of Jay's assertion that the results should be there (see below), the 'added' objects are now appearing, but I honestly can't explain what I did to prevent them in the first place.
Just to be clear, Breeze updates the cache when you edit an entity, it does NOT save that entity until you call EntityManager.saveChanges(). So until you call "saveChanges" the cache and your database will be in different states.
What you might be seeing is a consequence of the fact that when you requery an entity that has already been changed in the EntityManager, the merging of the server-side data with the client-side cache is controlled by EntityQuery.queryOptions.mergeStrategy. By default, an EntityQuery has a MergeStrategy of PreserveChanges, which means that a server-side result will NOT overwrite any "modified" records in the cache.
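For completeness, a query can opt out of PreserveChanges per request; this uses Breeze's documented MergeStrategy option (the 'Dares' resource name is borrowed from the earlier question):

// Ask this query to overwrite cached values with server results.
var query = breeze.EntityQuery.from('Dares')
    .using(breeze.MergeStrategy.OverwriteChanges);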
Per the later part of your post, you can remove any entity from the local cache simply by calling the entity's "entityAspect.setDetached" method.
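So detaching every cached entity of one type could look like this; getEntities and entityAspect.setDetached are real Breeze APIs, and the 'Dare' type name is borrowed from the question:

// Remove all cached 'Dare' entities without clearing the rest of the cache.
manager.getEntities('Dare').forEach(function (entity) {
    entity.entityAspect.setDetached();
});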

Returning Updated Results from DBSet.SqlQuery

I want to use the following method to flag people in the Person table so that they can be processed. These people must be flagged as "In Process" so that other threads do not operate on the same rows.
In SQL Management Studio the query works as expected. When I call the method in my application I receive the row for the person but with the old status.
Status is one of many navigation properties off of Person and when this query returns it is the only property returned as a proxy object.
// This is how I'm calling it (obvious, I know)
var result = PersonLogic.GetPeopleWaitingInLine(100);

// And here is my method.
public IList<Person> GetPeopleWaitingInLine(int count)
{
    const string query =
        @"UPDATE TOP (@count) PERSON
          SET PERSON_STATUS_ID = @inProcessStatusId
          OUTPUT INSERTED.PERSON_ID,
                 INSERTED.PERSON_STATUS_ID
          FROM PERSON
          WHERE PERSON_STATUS_ID = @queuedStatusId";

    var queuedStatusId = StatusLogic.GetStatus("Queued").Id;
    var inProcessStatusId = StatusLogic.GetStatus("In Process").Id;

    return Context.People.SqlQuery(query,
        new SqlParameter("count", count),
        new SqlParameter("queuedStatusId", queuedStatusId),
        new SqlParameter("inProcessStatusId", inProcessStatusId))
        .ToList();
}

// update | if I refresh the result set then I get the correct results,
// but I'm not sure about this solution since it will require 2 DB calls
Context.ObjectContext().Refresh(RefreshMode.StoreWins, result);
I know it is an old question but this could help somebody.
It seems you are using a global context for your query. EF is designed to retain cached info, so if you always need fresh data you must use a fresh context to retrieve it, like this:
using (var tmpContext = new Context())
{
    // your query here
}
This creates the context and recycles it. This means no cache is kept around, and next time it gets fresh data from the database, not from the cache.

How to implement pagination when using amazon Dynamo DB in rails

I want to use Amazon DynamoDB with Rails, but I have not found a way to implement pagination.
I will use AWS::Record::HashModel as the ORM.
This ORM supports limits like this:
People.limit(10).each {|person| ... }
But I could not figure out how to implement the following MySQL query in DynamoDB.
SELECT *
FROM `People`
LIMIT 1 , 30
You issue queries using LIMIT. If the subset returned does not contain the full table, a LastEvaluatedKey value is returned. You use this value as the ExclusiveStartKey in the next query. And so on...
From the DynamoDB Developer Guide.
You can provide 'page-size' in your query to set the result set size.
The response from DynamoDB contains 'LastEvaluatedKey', which will indicate the last key as per the page size. If the response does not contain 'LastEvaluatedKey', it means there are no results left to fetch.
Use the 'LastEvaluatedKey' as 'ExclusiveStartKey' when fetching the next page.
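A minimal Node.js sketch of that loop, using the AWS SDK v2 DocumentClient; the table name and limit are assumptions:

const AWS = require('aws-sdk');
const dc = new AWS.DynamoDB.DocumentClient();

async function getPage(startKey) {
  const params = { TableName: 'People', Limit: 30 };
  if (startKey) params.ExclusiveStartKey = startKey; // resume where the last page ended
  const page = await dc.scan(params).promise();
  // page.LastEvaluatedKey is undefined once the last page has been read.
  return { items: page.Items, nextKey: page.LastEvaluatedKey };
}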
I hope this helps.
DynamoDB Pagination
Here's a simple copy-paste-run proof of concept (Node.js) for stateless forward/reverse navigation with dynamodb. In summary; each response includes the navigation history, allowing user to explicitly and consistently request either the next or previous page (while next/prev params exist):
GET /accounts -> first page
GET /accounts?next=A3r0ijKJ8 -> next page
GET /accounts?prev=R4tY69kUI -> previous page
Considerations:
If your ids are large and/or users might do a lot of navigation, the potential size of the next/prev params might become too large.
Yes, you do have to store the entire reverse path; if you only store the previous page marker (per some other answers) you will only be able to go back one page.
It won't handle changing pageSize midway; consider baking pageSize into the next/prev value.
Base64-encode the next/prev values, and you could also encrypt them.
Scans are inefficient; while this suited my current requirement, it won't suit all!
// demo.js
const mockTable = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]

const getPagedItems = (pageSize = 5, cursor = {}) => {
  // Parse cursor
  const keys = cursor.next || cursor.prev || [] // fwd first
  let key = keys[keys.length-1] || null // eg ddb's PK

  // Mock query (mimic dynamodb response)
  const Items = mockTable.slice(parseInt(key) || 0, pageSize+key)
  const LastEvaluatedKey = Items[Items.length-1] < mockTable.length
    ? Items[Items.length-1] : null

  // Build response
  const res = {items:Items}
  if (keys.length > 0) // add reverse nav keys (if any)
    res.prev = keys.slice(0, keys.length-1)
  if (LastEvaluatedKey) // add forward nav keys (if any)
    res.next = [...keys, LastEvaluatedKey]

  return res
}

// Run test ------------------------------------
const runTest = () => {
  const PAGE_SIZE = 6
  let x = {}, i = 0

  // Page to end
  while (i == 0 || x.next) {
    x = getPagedItems(PAGE_SIZE, {next:x.next})
    console.log(`Page ${++i}: `, x.items)
  }

  // Page back to start
  while (x.prev) {
    x = getPagedItems(PAGE_SIZE, {prev:x.prev})
    console.log(`Page ${--i}: `, x.items)
  }
}

runTest()
I faced a similar problem.
The generic pagination approach is to use a "start index" (or "start page") and a "page length". The "ExclusiveStartKey" and "LastEvaluatedKey" based approach is very DynamoDB-specific.
I feel this DynamoDB-specific implementation of pagination should be hidden from the API client/UI.
Also, in case the application is serverless, using a service like Lambda, it will not be possible to maintain state on the server. The flip side is that the client implementation would become very complex.
I came up with a different approach, which I think is generic (and not specific to DynamoDB):
When the API client specifies the start index, fetch all the keys from the table and store them in an array.
Find the key in the array for the start index specified by the client.
Make use of ExclusiveStartKey and fetch the number of records specified in the page length.
If the start index parameter is not present, the above steps are not needed; we don't need to specify the ExclusiveStartKey in the scan operation.
This solution has some drawbacks:
We will need to fetch all the keys when the user requests pagination with a start index.
We will need additional memory to store the ids and the indexes.
There are additional database scan operations (one or more to fetch the keys).
But I feel this will be a very easy approach for the clients using our APIs. Backward paging will work seamlessly. If the user wants to see the nth page, this will be possible.
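A hedged sketch of that start-index approach, with the same assumed table and DocumentClient setup as the earlier sketch; 'id' is assumed to be the full primary key, and the 1 MB scan page limit is ignored for brevity:

const AWS = require('aws-sdk');
const dc = new AWS.DynamoDB.DocumentClient();

async function getPageByIndex(startIndex, pageSize) {
  const params = { TableName: 'People', Limit: pageSize };
  if (startIndex > 0) {
    // Key-only scan: fetch just the ids, then use the one just before the
    // requested index as the ExclusiveStartKey.
    const keys = (await dc.scan({ TableName: 'People', ProjectionExpression: 'id' }).promise()).Items;
    params.ExclusiveStartKey = keys[startIndex - 1];
  }
  return (await dc.scan(params).promise()).Items; // one page of full records
}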
In fact I faced the same problem, and I noticed that LastEvaluatedKey and ExclusiveStartKey do not work well, especially when using Scan. So I solved it like this:
GET /?page_no=1&page_size=10 =====> first page
The response will contain the count of records and the first 10 records.
Retry and increase the page number until all records have come back.
Code is below.
PS: I am using Python.
# 'response' is assumed to be the result of a prior full table scan,
# e.g. response = table.scan()
first_index = (page_no - 1) * page_size
second_index = page_no * page_size
if second_index > len(response['Items']):
    second_index = len(response['Items'])
return {
    'statusCode': 200,
    'count': response['Count'],
    'response': response['Items'][first_index:second_index]
}
