Swift concurrent operation 2x slower - iOS

I have a large JSON array that I need to save to Realm. The problem is that this operation takes around 45 seconds, which is too long. I tried running the save operation concurrently for each element in the JSON array, like this:
for element in jsonArray { // jsonArray has about 25 elements
    DispatchQueue.global(qos: .userInitiated).async {
        let realm = try! Realm()
        let savedObject = realm.objects(MyObject.self).filter("name == '\(element.name)'").first!
        try! realm.write {
            for subElement in element { // element is an array of around 1000 sub-elements
                // MyModel initialization is a simple, lightweight copy of values from one model to another
                let myModel = MyModel(initWith: subElement)
                savedObject.models.append(myModel)
            }
        }
    }
}
When I run the same code but with DispatchQueue.main.async, it finishes roughly 2x faster even though it's not concurrent. I also tried running the code above with a quality of service of .userInteractive, but the speed is the same.
While this code runs, CPU utilization is about 30% and memory usage about 45 MB. Is it possible to speed up this operation, or have I reached a dead end?

The entire loop should be inside the DispatchQueue.global(qos: .userInitiated).async block.
As documented on the Realm website:
Realm write operations are synchronous and blocking, not asynchronous. If thread A starts a write operation, then thread B starts a write operation on the same Realm before thread A is finished, thread A must finish and commit its transaction before thread B’s write operation takes place. Write operations always refresh automatically on beginWrite(), so no race condition is created by overlapping writes.
This means you won't get any benefit from trying to write from multiple threads.
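A hedged sketch of that restructuring, reusing the MyObject/MyModel names from the question and batching everything into a single write transaction on one background thread (untested):
DispatchQueue.global(qos: .userInitiated).async {
    let realm = try! Realm()
    try! realm.write { // one transaction for the whole import
        for element in jsonArray {
            guard let savedObject = realm.objects(MyObject.self)
                .filter("name == '\(element.name)'").first else { continue }
            for subElement in element {
                savedObject.models.append(MyModel(initWith: subElement))
            }
        }
    }
}
Batching into one transaction typically also avoids per-commit overhead during a bulk import.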

Related

Asynchronous Xarray writing to Zarr

Hi all. I'm using a Dask Distributed cluster to write Zarr+Dask-backed Xarray Datasets inside of a loop, and the dataset.to_zarr call is blocking. This can really slow things down when there are straggler chunks that block the continuation of the loop. Is there a way to do the .to_zarr call asynchronously, so that the loop can continue with the next dataset write without being held up by a few straggler chunks?
With the distributed scheduler, you get async behaviour without any special effort. For example, if you are doing arr.to_zarr, then indeed you are going to wait for completion. However, you could do the following:
client = Client(...)
out = arr.to_zarr(..., compute=False)
fut = client.compute(out)
This will return a future, fut, whose status reflects the current state of the whole computation, and you can choose whether to wait on it or to continue submitting new work. You could also pass it to a progress bar (in the notebook), which will update asynchronously whenever the kernel is not busy.

How to limit gcd queue buffer size

I am trying to process a series of UIImages using Core Image & Metal and also display them. My problem is that I want to drop an incoming image if my GCD block is busy. How do I achieve this with GCD queues? How do I define the maximum buffer size of a queue?
There’s no native mechanism for this, but you can achieve what you want algorithmically with semaphores.
Before I dive into the “process 4 at a time, but discard any that come if we’re busy” scenario, let me first consider the simpler “process all, but not more than 4 at any given time” pattern. (I’m going to answer your question below, but building on this simpler situation.)
For example, let’s imagine that you had some preexisting array of objects and you want to process them concurrently, but not more than four at any given time (perhaps to minimize peak memory usage):
DispatchQueue.global().async {
    let semaphore = DispatchSemaphore(value: 4)

    for object in objects {
        semaphore.wait()             // wait for one of the four "slots" to free up
        processQueue.async {         // processQueue is assumed to be a concurrent queue
            self.process(object)
            semaphore.signal()       // release the slot for the next object
        }
    }
}
Basically, the wait function will, as the documentation says, “Decrement the counting semaphore. If the resulting value is less than zero, this function waits for a signal to occur before returning.”
So, we’re starting our semaphore with a count of 4. So if objects had 10 items in it, the first four would start immediately, but the fifth wouldn’t start until one of the earlier ones finished and sent a signal (which increments the semaphore counter back up by 1), and so on, achieving a “run concurrently, but a max of 4 at any given time” behavior.
So, let's return to your question. Let's say you wanted to process no more than four images at a time and drop any incoming image if four images were already being processed. You can accomplish that by telling wait not to really wait at all, i.e., to check right .now() whether the semaphore counter has already hit zero. Something like:
let semaphore = DispatchSemaphore(value: 4)
let processQueue = DispatchQueue(label: "com.domain.app.process", attributes: .concurrent)

func submit(_ image: UIImage) {
    if semaphore.wait(timeout: .now()) == .timedOut { return }

    processQueue.async {
        self.process(image)
        self.semaphore.signal()
    }
}
Note, we generally want to avoid blocking the main thread (like wait can do), but because I'm using a timeout of .now(), it will never block; we're just using the semaphore to keep track of where we are in a nice, thread-safe manner.
One final approach is to consider operation queues:
// create queue that will run no more than four at a time (no semaphores needed; lol)
let processQueue: OperationQueue = {
    let queue = OperationQueue()
    queue.maxConcurrentOperationCount = 4
    return queue
}()

func submit(_ image: UIImage) {
    // cancel all but the last three unstarted operations
    processQueue.operations
        .filter { $0.isReady && !$0.isFinished && !$0.isExecuting && !$0.isCancelled }
        .dropLast(3)
        .forEach { $0.cancel() }

    // now add new operation to the queue
    processQueue.addOperation(BlockOperation {
        self.process(image)
    })
}
The behavior is slightly different (keeping the most recent four images queued up, ready to go), but is something to consider.
GCD queues don't have a maximum queue size.
You can use a semaphore for this. Initialize it with the maximum queue length you want to support. Use dispatch_semaphore_wait() with DISPATCH_TIME_NOW as the timeout to try to reserve a spot before submitting a task to the queue. If it times out, don't enqueue the task (discard it, or whatever). Have the task signal the semaphore when it's complete, releasing the spot you reserved so it can be used for another task later.

Block Operation - Completion Block returning random results

My block operation's completion handler is displaying random results, and I'm not sure why. Everything I've read says it is similar to dispatch groups in GCD.
Please find my code below
import Foundation

let sentence = "I love my car"
let wordOperation = BlockOperation()
var wordArray = [String]()

for word in sentence.split(separator: " ") {
    wordOperation.addExecutionBlock {
        print(word)
        wordArray.append(String(word))
    }
}

wordOperation.completionBlock = {
    print(wordArray)
    print("Completion Block")
}

wordOperation.start()
I was expecting my output to be ["I", "love", "my", "car"] (it should display all of these words, either in order or shuffled).
But when I run it, the output is either ["my"] or ["love"] or ["I", "car"]; it randomly omits some of the expected values.
Not sure why this is happening. Please advise.
The problem is that those separate execution blocks may run concurrently with respect to each other, on separate threads. This is true whether you start the operation directly as you have, or add it to an operation queue with a maxConcurrentOperationCount of 1. As the documentation says, when dealing with addExecutionBlock:
The specified block should not make any assumptions about its execution environment.
On top of this, Swift arrays are not thread-safe. So in the absence of synchronization, concurrent interaction with a non-thread-safe object may result in unexpected behavior, such as what you've shared with us.
If you turn on TSAN, the thread sanitizer (found in "Product" » "Scheme" » "Edit Scheme...", or press ⌘+<, and then choose "Run" » "Diagnostics" » "Thread Sanitizer"), it will warn you about the data race.
So, bottom line, the problem isn't addExecutionBlock, per se, but rather the attempt to mutate the array from multiple threads at the same time. If you used a concurrent queue in conjunction with a dispatch group, you could experience similar problems (though, like many race conditions, they can be hard to reproduce).
Theoretically, one could add synchronization code to your snippet, and that would fix the problem. But then again, it would be silly to initiate a bunch of concurrent updates only to employ synchronization within them to prevent concurrent updates. It would work, but it would be inefficient. You only employ that pattern when the work done on the background threads is substantial in comparison to the time spent synchronizing updates to some shared resource. But that's not the case here.
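For completeness, here's a minimal sketch of what that synchronization could look like, using a private serial queue (the label is arbitrary) to serialize access to the array:
let sentence = "I love my car"
let wordOperation = BlockOperation()
var wordArray = [String]()
let syncQueue = DispatchQueue(label: "wordArray.sync") // serializes access to wordArray

for word in sentence.split(separator: " ") {
    wordOperation.addExecutionBlock {
        syncQueue.sync { // only one execution block mutates the array at a time
            wordArray.append(String(word))
        }
    }
}

wordOperation.completionBlock = {
    print(wordArray) // all four words, though their order is still unspecified
}
wordOperation.start()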

Why object's mutability is relevant to its Thread safe?

I am working with Core Image on iOS. As you know, CIContext and CIImage are immutable, so they can be shared across threads. The problem is that I am not sure why an object's mutability is so closely related to its thread safety.
I can guess the rough reason is to prevent multiple threads from doing something to a certain object at the same time. But can anybody provide some theoretical explanation or give a specific answer?
You are correct: mutable objects are not thread-safe because multiple threads can write to their data at the same time. This is in contrast to reading data, an operation multiple threads can do simultaneously without causing problems. Immutable types can only be read.
However, when multiple threads write to the same data, the operations may interfere. For example, an NSMutableArray that two threads are editing could easily be corrupted: one thread edits at one location, suddenly changing the memory the other thread was updating. For simple data types, iOS uses what are called atomic operations, which means the edit must fully finish before anything else can happen. This has an efficiency advantage over locks. Just search for this topic if you want to know more.
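As a rough illustration of that point, here's a minimal sketch (the SharedList type is purely hypothetical) of guarding a mutable value with a lock so that writes cannot interleave:
import Foundation

// Minimal sketch: an NSLock ensures only one thread mutates `values` at a time.
final class SharedList {
    private let lock = NSLock()
    private var values = [Int]()

    func append(_ value: Int) {
        lock.lock()
        values.append(value)   // no other thread can be mid-write here
        lock.unlock()
    }

    func snapshot() -> [Int] {
        lock.lock()
        defer { lock.unlock() }
        return values          // reads also take the lock, so they never observe a half-finished write
    }
}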
Well, actually, you are right.
If you have an immutable object, all you can do is read data from it. And every thread will get the same data, since the object cannot be changed.
When you have a mutable object, the problem is that (in general) read and write operations are not atomic. This means they are not performed instantaneously; they take time, so they can overlap each other. Even on a single-core processor, the system can switch between threads at arbitrary moments. Possible problems this can cause:
1) Two threads write to the same object simultaneously. In this case the object might become corrupted (for instance, half of the data comes from the first thread and the other half from the second); the result is unpredictable.
2) One thread writes the data while another reads it. One problem is that the reading thread might get already-outdated data. Another is that it might read corrupted data (if the writing thread hasn't finished writing yet).
Take the example of an array of values. Let's say I'm doing some computation and I find the count of the array to be 4:
var array = [1,2,3,4]
let count = array.count
Now maybe I want to loop through all the elements in the array, so I set up a loop that goes through indices i < 4, or something along those lines. So far so good.
This array is mutable, though, so on a separate thread I could easily remove an element. Perhaps I'm on thread 1, I get the count of 4, and I start looping through the array. Now we switch to thread 2 and I remove an element, so the same array now has only 3 values in it. If we return to thread 1 and I still assume there are 4 values while looping through the array, my program will crash when it tries to access the 4th element.
This is why immutability is desirable: it can guarantee some level of consistency across threads.
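To make that scenario concrete, here's a minimal, deliberately broken sketch; because it races, it may print all four values, print fewer, or trap at runtime:
import Dispatch

var array = [1, 2, 3, 4]

// Thread 2: shrink the array while thread 1 may still be iterating.
DispatchQueue.global().async {
    array.removeLast()
}

// Thread 1: capture the count, then index based on that (possibly stale) count.
let count = array.count
for i in 0..<count {
    print(array[i])   // may trap with "Index out of range" if the removal has already happened
}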

Implementing concurrent read exclusive write model with GCD

I am trying to understand the proper way of using Grand Central Dispatch (GCD) to implement concurrent read exclusive write model of controlling access to a resource.
Suppose there is an NSMutableDictionary that is read a lot and once in a while updated. What is the proper way of ensuring that reads always work with a consistent state of the dictionary? Sure, I could use a queue and serialize all read and write access to the dictionary, but that would unnecessarily serialize reads, which should be allowed to access the dictionary concurrently. At first, the use of groups here sounds promising. I could create a 'read' group and add every read operation to it. That would allow reads to happen at the same time. And then, when the time comes to do an update, I could use dispatch_group_notify() or dispatch_group_wait() as part of a write operation to make sure that all reads complete before the update is allowed to go on. But then how do I make sure that a subsequent read operation does not start until the write operation completes?
Here's an example with the dictionary I mentioned above:
R1: at 0 seconds, a read comes in which needs 5 seconds to complete
R2: at 2 seconds another read comes in which needs 5 seconds to complete
W1: at 4 seconds a write operation comes needing access to dictionary for 3 sec
R3: at 6 seconds another read comes in which needs 5 seconds to complete
W2: at 8 seconds another write operation comes in also needing 3 seconds to complete
Ideally the above should play out like this:
R1 starts at 0 seconds, ends at 5
R2 starts at 2 seconds, ends at 7
W1 starts at 7 seconds, ends at 10
R3 starts at 10 seconds, ends at 15
W2 starts at 15 seconds, ends at 18
Note: even though R3 came at 6 seconds, it was not allowed to start before W1 because W1 came earlier.
What is the best way to implement the above with GCD?
You've got the right idea, I think. Conceptually, what you want is a private concurrent queue that you can submit "barrier" blocks to, such that the barrier block waits until all previously submitted blocks have finished executing, and then executes all by itself.
GCD doesn't (yet?) provide this functionality out-of-the-box, but you could simulate it by wrapping your read/write requests in some additional logic and funnelling these requests through an intermediary serial queue.
When a read request reaches the front of the serial queue, dispatch_group_async the actual work onto a global concurrent queue. In the case of a write request, you should dispatch_suspend the serial queue, and call dispatch_group_notify to submit the work onto the concurrent queue only after the previous requests have finished executing. After this write request has executed, resume the queue again.
Something like the following could get you started (I haven't tested this):
dispatch_block_t CreateBlock(dispatch_block_t block, dispatch_group_t group, dispatch_queue_t concurrentQueue) {
    return Block_copy(^{
        dispatch_group_async(group, concurrentQueue, block);
    });
}

dispatch_block_t CreateBarrierBlock(dispatch_block_t barrierBlock, dispatch_group_t group, dispatch_queue_t concurrentQueue) {
    return Block_copy(^{
        dispatch_queue_t serialQueue = dispatch_get_current_queue();
        dispatch_suspend(serialQueue);
        dispatch_group_notify(group, concurrentQueue, ^{
            barrierBlock();
            dispatch_resume(serialQueue);
        });
    });
}
Use dispatch_async to push these wrapped blocks onto a serial queue.
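As an aside, later GCD releases do ship barrier support directly (dispatch_barrier_async, or the .barrier flag in Swift), so the same concurrent-read / exclusive-write pattern can be sketched in modern Swift roughly like this (untested; SynchronizedDictionary and the queue label are just illustrative names):
import Dispatch

// Sketch of a concurrent-read / exclusive-write wrapper around a dictionary.
final class SynchronizedDictionary<Key: Hashable, Value> {
    private var storage = [Key: Value]()
    private let queue = DispatchQueue(label: "sync.dictionary", attributes: .concurrent)

    subscript(key: Key) -> Value? {
        get {
            // Reads run concurrently with other reads.
            return queue.sync { storage[key] }
        }
        set {
            // The barrier waits for in-flight reads to finish, then runs exclusively,
            // so readers never observe a half-finished write.
            queue.async(flags: .barrier) { self.storage[key] = newValue }
        }
    }
}
Writes submitted through the barrier are ordered after previously submitted reads, which matches the timeline in the question where W1 waits for R1 and R2.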
