Load Large Data from multiple tables in parallel using multithreading - iOS

I'm trying to load about 10K records from 6 different tables in my Ultralite DB.
I have created a different function for each of the 6 tables.
I have tried to load these in parallel using NSInvocationOperation, NSOperation, GCD, and subclassing NSOperation, but nothing is working out.
Loading 10K rows from one table takes 4 seconds, and from another 5 seconds; if I put these two in a queue it takes 9 seconds. This means my code is not running in parallel.
How can I fix this performance problem?

There may be multiple ways of doing it.
What I suggest is:
Set the number of rows for the table view to the exact count (10K in your case).
A table view is optimised to create only a few cells at start (it follows a pull model), so cellForRowAtIndexPath will be called only a few times at first.
Keep an array and fetch only 50 entries at start, along with a counter variable.
When the user scrolls the table view and the counter passes 50, fetch the next 50 items (it takes very little time) and populate the cells with the next 50 records.
Keep doing the same thing as the user scrolls.
Hope it works.
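A minimal sketch of this batched approach, assuming a hypothetical Record type and a fetchRecords(offset:limit:) helper that wraps the Ultralite query (neither name is from the original post):

    import UIKit

    struct Record { let title: String }

    // Placeholder for the actual Ultralite query; fetches `limit` rows starting at `offset`.
    func fetchRecords(offset: Int, limit: Int) -> [Record] {
        return []
    }

    final class RecordsViewController: UITableViewController {
        private var records: [Record] = []   // rows fetched so far
        private let pageSize = 50
        private let totalCount = 10_000      // known total row count
        private var isLoading = false

        override func tableView(_ tableView: UITableView,
                                numberOfRowsInSection section: Int) -> Int {
            return totalCount
        }

        override func tableView(_ tableView: UITableView,
                                cellForRowAt indexPath: IndexPath) -> UITableViewCell {
            let cell = tableView.dequeueReusableCell(withIdentifier: "Cell", for: indexPath)
            if indexPath.row < records.count {
                cell.textLabel?.text = records[indexPath.row].title
            } else {
                cell.textLabel?.text = "Loading..."
                loadNextPage()               // the user has reached rows not fetched yet
            }
            return cell
        }

        private func loadNextPage() {
            guard !isLoading, records.count < totalCount else { return }
            isLoading = true
            DispatchQueue.global(qos: .userInitiated).async { [weak self] in
                guard let self = self else { return }
                let page = fetchRecords(offset: self.records.count, limit: self.pageSize)
                DispatchQueue.main.async {
                    self.records.append(contentsOf: page)
                    self.isLoading = false
                    self.tableView.reloadData()
                }
            }
        }
    }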

You should fetch records in chunks (i.e. fetch 50-60 records at a time per table), and then when the user reaches the end of the table, load the next 50-60 records. Try this library: Bottom Pull to refresh more data in a UITableView
Regarding parallelism, go with GCD and reload the respective table when GCD's completion block is called.
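A rough sketch of running the per-table loads concurrently with GCD, assuming hypothetical loadTableA()/loadTableB() functions standing in for the six loaders (extend the array to all six tables):

    import Foundation

    // Placeholders for the per-table Ultralite queries.
    func loadTableA() -> [Any] { return [] }
    func loadTableB() -> [Any] { return [] }

    func loadAllTablesConcurrently(completion: @escaping ([[Any]]) -> Void) {
        let queue = DispatchQueue(label: "db.load", qos: .userInitiated, attributes: .concurrent)
        let group = DispatchGroup()
        let loaders: [() -> [Any]] = [loadTableA, loadTableB]   // add the other tables here
        var results = [[Any]](repeating: [], count: loaders.count)
        let lock = NSLock()

        for (index, loader) in loaders.enumerated() {
            queue.async(group: group) {
                let rows = loader()          // each table loads on its own worker thread
                lock.lock()
                results[index] = rows        // protect the shared results array
                lock.unlock()
            }
        }

        // Runs on the main queue once every table has finished loading.
        group.notify(queue: .main) {
            completion(results)
        }
    }

Note that this only runs in parallel if the database layer allows concurrent reads (for example, one connection per thread); if Ultralite serializes access to a single connection, the loads will still complete one after another, which would explain the 4 + 5 = 9 second behaviour described in the question.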

OK, you have to use Para and Time functions; look them up online for more info.

Related

How to decrease request size to insert data into multiple tables? (Back4App)

I was inserting data (Back4App) into two tables at the same time, but it took a lot of requests. I tried it for 3 hours and inserted 3.28k records (there was no other activity in the program). In the end, it took 6.88k requests. Is that expected? And how can I decrease the number of requests?

Optimize array data size while pagination

I know how to implement pagination with UITableView, but my question is that we always append the data of the next page to the existing complete data array, so the array grows with every page.
For example: we get 50 records on the first page, request the next page, get another 50 records, and append them to the existing array, so the complete array now holds 100 records. I am requesting around 100 pages, so my array will end up with 5000 records. Holding the data of the early pages is not a good idea, as we hardly ever come back to the first pages after visiting 100 of them.
Is there any way to optimize the array size? Please help me with this; I searched a lot but didn't find a good answer.
I would be very grateful for help, and sorry for my bad English.
I think you can achieve that by writing the "old" data to local storage, then retrieving it and inserting it back into your array when needed.
So, imagine that you've already fetched, let's say, 200 items. When the user scrolls down and you fetch the next page (the next 20 items), you "cut" items 0 to 99 from your array and write them to a file. Now your array has 120 items. Then, when the user keeps scrolling and the count again reaches 220 (array.count >= 220), repeat the same logic, and so on.
Now the most interesting part: if the user scrolls back and the index of the top visible cell is < 100, you read the previously written data from the file (and remove it from the file) and insert it into your array at position 0.
And of course it'd be better to clear all files of that kind on app launch.
Of course the numbers I wrote above are magic numbers, and you should play with them to find the ones that best fit your needs.
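A minimal sketch of that idea, assuming Codable items and JSON files in the temporary directory; the type name and thresholds are illustrative, not from the answer:

    import Foundation

    final class PagedStore<Item: Codable> {
        private(set) var items: [Item] = []       // items currently held in memory
        private var offloadedFiles: [URL] = []    // offloaded chunks, oldest first
        private let offloadThreshold = 200        // the "magic numbers" to tune, as noted above
        private let chunkSize = 100
        private let directory = FileManager.default.temporaryDirectory

        // Append a freshly fetched page, offloading old items if the array grew too big.
        func append(page: [Item]) {
            items.append(contentsOf: page)
            offloadIfNeeded()
        }

        private func offloadIfNeeded() {
            guard items.count >= offloadThreshold else { return }
            let chunk = Array(items.prefix(chunkSize))
            items.removeFirst(chunkSize)
            let url = directory.appendingPathComponent(UUID().uuidString + ".json")
            if let data = try? JSONEncoder().encode(chunk) {
                try? data.write(to: url)
                offloadedFiles.append(url)
            }
        }

        // Call when the user scrolls back near the top: restore the most recently
        // offloaded chunk to the front of the array and delete its file.
        func restorePreviousChunk() {
            guard let url = offloadedFiles.popLast() else { return }
            if let data = try? Data(contentsOf: url),
               let chunk = try? JSONDecoder().decode([Item].self, from: data) {
                items.insert(contentsOf: chunk, at: 0)
            }
            try? FileManager.default.removeItem(at: url)
        }
    }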

SELECT queries performance impact when the Clickhouse table is continuously populated with INSERT INTO

The ClickHouse table, using the MergeTree engine, is continuously populated with “INSERT INTO … FORMAT CSV” queries, starting empty. The average input rate is 7000 rows per second, and the insertion happens in batches of a few thousand rows. This has a severe performance impact when SELECT queries are executed concurrently. As described in the ClickHouse documentation, the system needs at most 10 minutes to merge the data of a specific table (re-index). But this is not happening while the table is continuously populated.
This is also evident in the file system: the table folder has thousands of sub-folders and the index is over-segmented. If the data ingestion stops, after a few minutes the table is fully merged and the number of sub-folders drops to a dozen.
In order to work around the above weakness, the Buffer engine was used to buffer the table data ingestion for 10 minutes. Consequently, the buffer's maximum number of rows is on average 4,200,000.
The underlying table remains at most 10 minutes behind, as the buffer keeps the most recently ingested rows. The table finally gets merged, and the behaviour is the same as when the table stops being populated for a few minutes.
But the Buffer table, which corresponds to the combination of the buffer and the underlying table, is getting severely slower.
From the above it appears that, if the table is continuously populated, it does not merge, and indexing suffers. Is there a way to avoid this weakness?
The number of sub-folders in the table data directory is not a very representative value.
Indeed, each sub-folder contains a data part consisting of sorted (indexed) rows. When several data parts are merged into a new, bigger one, a new sub-folder appears.
However, source data parts are not removed instantly after the merge. There is a <merge_tree> setting, old_parts_lifetime, defining a delay after which the parts are removed; by default it is set to 8 minutes. There is also a cleanup_delay_period setting defining how often a background cleaner checks for and removes outdated parts; it is 30 seconds by default.
So, it is normal to have such an amount of sub-folders for about 8 minutes and 30 seconds after ingestion starts. If that is unacceptable to you, you can change these settings.
It makes sense to check only the number of active parts in a table (i.e. parts which have not been merged into a bigger one). To do so, you can run the following query: SELECT count() FROM system.parts WHERE database='db' AND table='table' AND active.
Moreover, ClickHouse does such checks internally: if the number of active parts in a partition is greater than parts_to_delay_insert=150 it will slow down INSERTs, and if it is greater than parts_to_throw_insert=300 it will abort insertions.

How to implement infinite scroll with multiple filter on data that get from Firebase in Swift?

I'm using Firebase for my iOS application and I'm having trouble implementing infinite scroll and data filtering together.
What I need to do is:
Display items ordered/filtered on multiple properties (location, category, status, ...)
Implement infinite scroll when the user has scrolled to the bottom of the screen.
I tried to think about some solutions:
First, I thought I'd query the data with the necessary conditions, limit the number of records with queryLimitedToFirst(N), and increase N when the next items need to be loaded. But Firebase can only filter on one property at a time, and it's also wasteful to reload the data, so I was thinking about a second solution.
As suggested by Frank van Puffelen (Query based on multiple where clauses in Firebase):
filter most on the server, do the rest on the client
Yes, exactly like that. I'll use queryOrderedByKey, queryStartingAtValue, and queryEndingAtValue to implement infinite scroll, pull down the remaining data, and filter it on the client; a sketch of this approach follows below. But there is one problem: I would not have enough items to display to the user if I filter on the client.
For example: each time I run the query, I receive 10 items. After filtering the data on the client, I may be left with only 5 (or even 0) items that meet the conditions to display to the user.
I don't want this, because the user may think there is a problem.
Can I please get some pointers on this? If I didn't structure the data properly, can I also get some tips there?
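A minimal sketch of the "filter on the server, finish on the client" approach described above: fetch key-ordered pages and keep fetching until a full page of matching items has been collected. The "items" path, the field names, and matches(_:) are illustrative assumptions, not from the original post:

    import FirebaseDatabase

    final class FilteredFeedLoader {
        private let ref = Database.database().reference(withPath: "items")
        private let pageSize: UInt = 10
        private var lastKey: String?                 // last key of the previous page

        func loadNextPage(completion: @escaping ([[String: Any]]) -> Void) {
            fetchUntilFilled(collected: [], completion: completion)
        }

        private func fetchUntilFilled(collected: [[String: Any]],
                                      completion: @escaping ([[String: Any]]) -> Void) {
            var query: DatabaseQuery = ref.queryOrderedByKey()
            if let lastKey = lastKey {
                // Ask for one extra row so the already-seen boundary key can be dropped.
                query = query.queryStarting(atValue: lastKey).queryLimited(toFirst: pageSize + 1)
            } else {
                query = query.queryLimited(toFirst: pageSize)
            }

            query.observeSingleEvent(of: .value) { [weak self] snapshot in
                guard let self = self else { return }
                var children = snapshot.children.allObjects as? [DataSnapshot] ?? []
                if self.lastKey != nil, !children.isEmpty {
                    children.removeFirst()           // drop the overlapping boundary row
                }
                guard !children.isEmpty else {
                    completion(collected)            // no more data on the server
                    return
                }
                self.lastKey = children.last?.key
                var matches = collected
                for child in children {
                    if let item = child.value as? [String: Any], self.matches(item) {
                        matches.append(item)
                    }
                }
                if matches.count < Int(self.pageSize) {
                    // Not enough matching items yet: fetch another page before telling the UI.
                    self.fetchUntilFilled(collected: matches, completion: completion)
                } else {
                    completion(matches)
                }
            }
        }

        // Client-side filter; replace with the real location/category/status checks.
        private func matches(_ item: [String: Any]) -> Bool {
            return (item["status"] as? String) == "active"
        }
    }

The trade-off is the one described above: the fewer items the client-side filter keeps, the more round trips this loop makes before the UI gets a full page.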

Data sorting and update of UICollectionViewCells. Is this a lost cause?

I have Core Data entries displayed in a collection view, sorted 1 2 3 ... n. New batches of entries are added as the user flips through the first n. The data is built from a JSON response obtained from a web server.
Because the first entry of the fetch request is associated with cell 0 (via the data source delegate), it's not possible to add a new batch at the bottom of the collection view. If it's added from cell 0, old cell contents are replaced by new ones; in short, the whole page seems to be replaced by new content, and the data the user was looking at is offset by the number of new entries. If the batch is large, it's simply buried. Furthermore, if the update is done from cell 0, all entries are made visible, which takes time and memory.
There are several options that I considered:
1) Data re-order: instead of getting the fetch result as 1 2 3 4 ... n, I need the opposite, n ... 3 2 1 (nothing to do with a fetch using reverse-order sorting), straight from the fetch request. I'm not sure it's possible; is there a Core Data mechanism that allows the fetch result to be re-ordered before it is presented to the UICollectionViewDataSource delegate?
2) Change the index path/cell association in "collectionView cellForItemAtIndexPath:", using (numberOfItemsInSection - indexPath.item); see the sketch after this list. It creates several edge cases, as entries can be removed/updated in the view (hence numberOfItemsInSection changes), so I'd rather avoid it if I can.
3) Adding new data from cell 0 is ruled out for the reason I explained. There may be a solution: has anyone achieved a satisfactory result by setting a view offset? For example, if 20 new entries are added, the content of cell 0 is moved to cell 20, so we just need to tell the view controller to display from cell 20 onwards. Any image flipping or side effects I might expect?
4) Download a big chunk of the data and simply use the built-in Core Data faulting mechanism. But that's suboptimal, because I'm not sure exactly how much I should download (it's user dependent) and the initial request (JSON + Core Data) might take too long. That's what lazy fetching is for anyway.
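A minimal sketch of the index mapping in option 2, using a plain array to stand in for the fetched results (Entry and the "Cell" identifier are illustrative):

    import UIKit

    struct Entry { let title: String }

    final class ReversedDataSource: NSObject, UICollectionViewDataSource {
        var entries: [Entry] = []              // kept in ascending (1 ... n) order

        func collectionView(_ collectionView: UICollectionView,
                            numberOfItemsInSection section: Int) -> Int {
            return entries.count
        }

        func collectionView(_ collectionView: UICollectionView,
                            cellForItemAt indexPath: IndexPath) -> UICollectionViewCell {
            let cell = collectionView.dequeueReusableCell(withReuseIdentifier: "Cell",
                                                          for: indexPath)
            // The edge case mentioned above: this mapping must be recomputed whenever
            // entries are inserted or removed, otherwise visible cells show stale rows.
            let entry = entries[entries.count - 1 - indexPath.item]
            configure(cell, with: entry)
            return cell
        }

        private func configure(_ cell: UICollectionViewCell, with entry: Entry) {
            // set up the cell's subviews from `entry` here
        }
    }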
Any advice from someone who has faced the same problem would be appreciated.
Thanks!
