I have a table called Table1 which contains a field called parentfield, which holds an object (ObjectId) of Table2. I want to get the ObjectIds without duplicates and arrange them in ascending order. Here is the data in the table:
Table1
parentfield(id)
790112
790000
790001
790112
790000
790001
The result should be the first three elements, but I don't know how many ids will match. Is there a way to do that?
Unfortunately, there is no SELECT DISTINCT / GROUP BY operation in Parse.
See this thread: https://parse.com/questions/retrieving-unique-values
Suggested team solution:
There's no built-in query constraint that would return distinct values
based on a column.
As a workaround, you can query for all the rows, then iterate through
them and track the distinct values for the desired column
So the sad, bad, horrible idea is to write a Cloud Code function that fetches all the possible elements (keep in mind that Parse allows you to fetch at most 1000 elements per query) and then removes the duplicates from the resulting list. You can do all of this in the Cloud Code function and return the cleaned list to the client, or you can do it directly on your client devices (a rough client-side sketch follows below). All of this means that if you want the real SELECT DISTINCT equivalent under these conditions, you should first fetch all the elements (a loop of queries, retrieving 1000 items at a time) and then apply your own algorithm for removing the duplicates. I know, it's really long and frustrating, especially considering that Parse Cloud Code function execution has a timeout limit of 7-10 seconds. Maybe by moving to Parse background jobs you can populate a distinct temporary table, since there you get up to 15 minutes of execution before the timeout.
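To make that concrete, here is a rough client-side sketch in Swift (Parse iOS SDK) of the paged fetch plus de-duplication, using the Table1/parentfield names from the question and assuming parentfield is a pointer to Table2; error handling is omitted and this is only a sketch, not a drop-in implementation:
import Parse

func fetchDistinctParentIds() throws -> [String] {
    var distinctIds = Set<String>()
    var skip = 0
    while true {
        let query = PFQuery(className: "Table1")
        query.limit = 1000                    // Parse's per-query maximum
        query.skip = skip
        let objects = try query.findObjects() // synchronous fetch of one page
        for object in objects {
            // parentfield is assumed to be a pointer to Table2
            if let parent = object["parentfield"] as? PFObject, let id = parent.objectId {
                distinctIds.insert(id)
            }
        }
        if objects.count < 1000 { break }     // last page reached
        skip += 1000
    }
    return distinctIds.sorted()               // distinct ids in ascending order
}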
Another, more drastic, solution is to move your data to another server that supports a relational database (for example OpenShift, which keeps a free tier) and, with a Parse background job, synchronize the elements from Parse to the relational database, so you can redirect the client requests there instead of to Parse.
Hope it helps
I have an app with similar requirements: display each company name only once, along with the number of items it has.
Here is how I implemented the solution on the client side (Swift 3); my database is relatively small:
var companyItemsCount = [Int: Int]() // stores unique companyId and number of items (count)
query.findObjectsInBackground(block: { (objects: [PFObject]?, error: Error?) in
    var companyId: Int = 0
    if error == nil {
        // The find succeeded.
        // Do something with the found objects
        if let objects = objects {
            for object in objects {
                companyId = object["companyId"] as! Int
                if self.companyItemsCount[companyId] == nil {
                    // First time this company is seen: remember its name and id
                    self.companyNames.append(object["companyName"] as! String)
                    self.companyIds.append(companyId)
                    self.companyItemsCount[companyId] = 1
                } else {
                    self.companyItemsCount[companyId]! += 1
                }
            }
            // Reload once after processing all objects instead of once per row
            self.tableView.reloadData()
        }
    } else {
        // Log details of the failure
        print("Error: \(error!) \(error!.localizedDescription)")
    }
})
I am using Google Firestore geoqueries, following this documentation. Querying documents within a distance using a geohash works fine. The problem starts when I introduce a new condition: `.whereField("createdOn", isGreaterThan: <value for the time interval since 1970, 7 days ago>)`.
This throws an error saying that any inequality on a field requires that field to be the first orderBy parameter. When I add an orderBy parameter for that field, the query shows no error but no longer returns the document that is still within the searched distance.
Is it even possible to use Firestore geoqueries with additional query conditions?
I need to be able to limit the query by objects created within a certain timeframe, otherwise this will return a very large number of documents. Sorting these post-query will surely impact app performance. Maybe I am missing a more practical way of using geoqueries in Firestore?
let queryBounds = GFUtils.queryBounds(forLocation: center,
                                      withRadius: distanceM)
// test
let ref = Ref().databaseJobs
let currentTime = NSDate().timeIntervalSince1970
let intervalLastWeek = currentTime - (10080 * 60)
print("current time is: \(currentTime) and 7 days ago interval was \(intervalLastWeek)")
ref.whereField("createdOn", isGreaterThan: intervalLastWeek)

let queries = queryBounds.compactMap { (any) -> Query? in
    guard let bound = any as? GFGeoQueryBounds else { return nil }
    return ref
        .order(by: "geohash")
        .start(at: [bound.startValue])
        .end(at: [bound.endValue])
        .whereField("createdOn", isGreaterThan: intervalLastWeek)
}
Firestore can only filter on a range on a single field. Or, put more simply: you can only have a single orderBy clause in your query.
What you are trying to do requires two orderBy clauses, one for geohash and one for createdOn, which isn't possible. If you needed an equality check on a second field though, that would be possible, as that doesn't require an orderBy clause.
What I'm wondering is whether you can add a field createdOnDay that contains just the day part of createdOn in a fixed format, and then perform an in filter on that field with 7 values (one for each day of the past week), as sketched below.
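A minimal sketch of that idea, assuming a createdOnDay string field stored in "yyyy-MM-dd" format (both the field name and the format are assumptions, not part of the original schema):
// Build the day strings for the past week, e.g. ["2024-05-01", "2024-04-30", ...]
let formatter = DateFormatter()
formatter.dateFormat = "yyyy-MM-dd"
let lastSevenDays: [String] = (0..<7).compactMap { offset in
    guard let day = Calendar.current.date(byAdding: .day, value: -offset, to: Date()) else { return nil }
    return formatter.string(from: day)
}

let queries = queryBounds.compactMap { (any) -> Query? in
    guard let bound = any as? GFGeoQueryBounds else { return nil }
    return ref
        .whereField("createdOnDay", in: lastSevenDays) // "in" filter: no orderBy required
        .order(by: "geohash")                          // geohash stays the only range/orderBy field
        .start(at: [bound.startValue])
        .end(at: [bound.endValue])
}
Seven values also stay within Firestore's limit on how many values an in filter accepts.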
I work with R2DBC and I need to execute a query which returns a Flux of my entities; after that I need to convert these entities to DTOs, but to create a DTO I need to make another query to the database for each entity, which returns some extra info from other tables. For example:
This code doesn't work when the total number of ids exceeds 512:
orderRepository.findByIds(listIds).flatMap { order ->
    eventRepository.findByOrderId(order.id).map { events ->
        entityToDtoMapper.map(order, events, OrderWithEventsDto::class.java)
    }
}
concatMap doesn't help.
But this code works
orderRepository.findByIds(listIds).collectList().flatMapMany { orders ->
    Flux.fromIterable(orders)
}.flatMap { order ->
    eventRepository.findByOrderId(order.id).collectList().map { events ->
        entityToDtoMapper.map(order, events, OrderWithEventsDto::class.java)
    }
}
I think there’s a better solution to this problem. How am I supposed to do these queries right?
I have a rather huge table in Azure Table Storage (30 million rows, ranging from 5 to 100 KB each).
Each RowKey is a Guid and PartitionKey is a first Guid part, for example:
PartitionKey = "1bbe3d4b"
RowKey = "1bbe3d4b-2230-4b4f-8f5f-fe5fe1d4d006"
The table gets 600 reads and 600 writes (updates) per second with an average latency of 60 ms. All queries use both PartitionKey and RowKey.
BUT some reads take up to 3000 ms (!). On average, more than 1% of all reads take longer than 500 ms, and there's no correlation with entity size (a 100 KB row may be returned in 25 ms and a 10 KB one in 1500 ms).
My application is an ASP.NET MVC 4 website running on 4-5 Large instances.
I have read all MSDN articles regarding Azure Table Storage performance goals and already did the following:
UseNagle is turned Off
Expect100Continue is also disabled
MaxConnections for table client is set to 250 (setting 1000–5000 doesn't make any sense)
Also I checked that:
Storage account monitoring counters have no throttling errors
There are some kind of "waves" in performance, though they do not depend on load
What could be the reason for such performance issues, and how can I improve the situation?
I use the MergeOption.NoTracking setting on the DataServiceContext.MergeOption property for extra performance if I have no intention of updating the entity anytime soon. Here is an example:
var account = CloudStorageAccount.Parse(RoleEnvironment.GetConfigurationSettingValue("DataConnectionString"));
var tableStorageServiceContext = new AzureTableStorageServiceContext(account.TableEndpoint.ToString(), account.Credentials);
tableStorageServiceContext.RetryPolicy = RetryPolicies.Retry(3, TimeSpan.FromSeconds(1));
tableStorageServiceContext.MergeOption = MergeOption.NoTracking;
tableStorageServiceContext.AddObject(AzureTableStorageServiceContext.CloudLogEntityName, newItem);
tableStorageServiceContext.SaveChangesWithRetries();
Another problem might be that you are retrieving the entire entity with all its properties even though you only intend to use one or two of them - this is of course wasteful but can't easily be avoided. However, if you use Slazure you can use query projections to retrieve only the entity properties you are interested in from table storage and nothing more, which would give you better query performance. Here is an example:
using SysSurge.Slazure;
using SysSurge.Slazure.Linq;
using SysSurge.Slazure.Linq.QueryParser;

namespace TableOperations
{
    public class MemberInfo
    {
        public string GetRichMembers()
        {
            // Get a reference to the table storage
            dynamic storage = new QueryableStorage<DynEntity>("UseDevelopmentStorage=true");

            // Build the table query and make sure it only returns members that earn more than $60k/yr
            // by using a "Where" query filter, and make sure that only the "Name" and
            // "Salary" entity properties are retrieved from the table storage to make the
            // query quicker.
            QueryableTable<DynEntity> membersTable = storage.WebsiteMembers;
            var memberQuery = membersTable.Where("Salary > 60000").Select("new(Name, Salary)");

            var result = "";

            // Cast the query result to a dynamic so that we can access its dynamic properties
            foreach (dynamic member in memberQuery)
            {
                // Show some information about the member
                result += "LINQ query result: Name=" + member.Name + ", Salary=" + member.Salary + "<br>";
            }

            return result;
        }
    }
}
Full disclosure: I coded Slazure.
You could also consider pagination if you are retrieving large data sets, for example:
// Skip the first 50 members and retrieve the next 50
var memberQuery = membersTable.Where("Salary > 60000").Skip(50).Take(50);
Typically, if a specific query requires scanning a large number of rows, it will take longer. Is the behavior you are seeing specific to a particular query or data? Or are you seeing the performance vary for the same data and query?
Is there any way to limit the number of results that are returned by a CKQuery?
In SQL, it is possible to run a query like SELECT * FROM Posts LIMIT 10,15. Is there anything like the last part of the query, LIMIT 10,15 in CloudKit?
For example, I would like to load the first 5 results, then, once the user scrolls down, I would like to load the next 5 results, and so on. In SQL, it would be LIMIT 0,5, then LIMIT 6,10, and so on.
One thing that would work is to use a for loop, but it would be very intensive, as I would have to select all of the values from iCloud and then loop through them to figure out which 5 to select. I'm anticipating a lot of different posts in the database, so I would like to load only the ones that are needed.
I'm looking for something like this:
var limit: NSLimitDescriptor = NSLimitDescriptor(5, 10)
query.limit = limit
CKContainer.defaultContainer().publicCloudDatabase.addOperation(CKQueryOperation(query: query))
//fetch values and return them
You submit your CKQuery to a CKQueryOperation. The CKQueryOperation has concepts of cursor and of resultsLimit; they will allow you to bundle your query results. As described in the documentation:
To perform a new search:
1) Initialize a CKQueryOperation object with a CKQuery object containing
the search criteria and sorting information for the records you want.
2) Assign a block to the queryCompletionBlock property so that you can
process the results and execute the operation.
If the search yields many records, the operation object may deliver a
portion of the total results to your blocks immediately, along with a
cursor for obtaining the remaining records. If a cursor is provided,
use it to initialize and execute a separate CKQueryOperation object
when you are ready to process the next batch of results.
3) Optionally configure the return results by specifying values for the
resultsLimit and desiredKeys properties.
4) Pass the query operation object to the addOperation: method of the
target database to execute the operation against that database.
So it looks like:
var q = CKQuery(/* ... */)
var qop = CKQueryOperation(query: q)
qop.resultsLimit = 5
qop.queryCompletionBlock = { (c: CKQueryCursor!, e: NSError!) -> Void in
    if nil != c {
        // there is more to do; create another op
        var newQop = CKQueryOperation(cursor: c!)
        newQop.resultsLimit = qop.resultsLimit
        newQop.queryCompletionBlock = qop.queryCompletionBlock
        // Hang on to it, if we must
        qop = newQop
        // submit
        ....addOperation(qop)
    }
}
....addOperation(qop)
I have a mnesia table for this record.
-record(peer, {
    peer_key,    %% key is the tuple {FileId, PeerId}
    last_seen,
    last_event,
    uploaded = 0,
    downloaded = 0,
    left = 0,
    ip_port,
    key
}).
peer_key is a tuple {FileId, ClientId}. Now I need to extract the ip_port field from all peers that have a specific FileId.
I came up with a workable solution, but I'm not sure if this is a good approach:
qlc:q([IpPort || #peer{peer_key={FileId,_}, ip_port=IpPort} <- mnesia:table(peer), FileId=:=RequiredFileId])
Thanks.
Using an ordered_set table type with a tuple primary key like {FileId, PeerId} and then partially binding a prefix of the tuple like {RequiredFileId, _} will be very efficient, as only the range of keys with that prefix will be examined rather than a full table scan. You can use qlc:info/1 to examine the query plan and ensure that any selects that occur bind the key prefix.
Your query time will grow linearly with the table size, as it requires scanning through all rows. So benchmark it with realistic table data to see if it really is workable.
If you need to speed it up, you should focus on being able to quickly find all peers that carry the file id. This could be done with a bag-type table with [fileid, peerid] as attributes. Given a file id you would get all peer ids, and with those you could construct the peer-table keys to look up.
Of course, you would also need to maintain that bag-type table inside every transaction that changes the peer table.
Another option would be to repeat fileid in its own column and add a mnesia index on that column. I am just not that into mnesia's own secondary indexes.