I'm processing ~10k events between two different classes. One is fetching them, and the other is storing them in a dictionary. Now since the fetching class is also doing more stuff with the data than just passing it to the second class, it really doesn't make a lot of sense to send them over as a big bulk, but rather I'm processing them like
actor Fetcher {
    let someProcessor = Processor()

    func getData() async {
        let results = await Rest.getData()
        for result in results {
            await someProcessor.doStuff(with: result)
            await someOtherObject.handle(result)
        }
    }
}

actor Processor {
    func doStuff(with result: Result) async {
        // ...
    }
}
Now maybe you can see the problem. With both of them being actors, I keep sending data back and forth between threads. Processing ~10k results thus takes 8 seconds, mostly because of thread switches. If I make my code non-thread-safe by removing the actor keyword, it takes less than a second. It would remove some functionality of my code if I did that, though. Is there a way to tell Swift that these two actors should always run on the same thread to avoid the switching?
Currently, I have the following code
class ShopViewController: UIViewController {
    @IBAction func buy(_ sender: Any) {
        Task {
            // Will run in main thread, because ShopViewController is
            // marked as @MainActor
            let success = await Store.INSTANCE.purchase(selectedShop)
        }
    }
}
I want the Task to execute on a non-main thread, so I refactored the code to the following
class ShopViewController: UIViewController {
    @IBAction func buy(_ sender: Any) {
        Task.detached { [weak self] in
            // Will run in non main thread.
            guard let self = self else { return }
            let success = await Store.INSTANCE.purchase(self.selectedShop)
        }
    }
}
Now, the Task runs on a non-main thread.
But I was wondering: is using Task.detached a best practice and the correct approach to make sure the Task is executed on a non-main thread?
There are a few considerations:
Which actor should this Task use?
Using the main actor with Task { … } is fine. As soon as it hits the await keyword, it will suspend execution on the main actor, free it to go do other stuff, and await the results from purchase. In fact, in practice, you are probably doing something with the success result (e.g. updating the UI), so you probably want to use Task { … } to keep it on the current actor.
Which actor should purchase use?
If Store.INSTANCE.purchase requires a particular actor, it would define which actor it wants to use, not the caller. It might be an actor type. It might be decorated with @MainActor. This is dictated by how Store.INSTANCE and purchase are defined. The actor used by buy(_:) is largely immaterial.
Does it matter which actor purchase uses?
You say that purchase is primarily performing network I/O. In that case, it doesn’t matter if it is on the main actor or not. If it was doing something slow and synchronous (e.g., parsing some exceptionally large responses, image processing, etc.), then maybe you would want to explicitly push those tasks off of the current actor, but otherwise it doesn’t matter.
Bottom line, the use of Task.detached is technically permissible, but generally would not be used in this case.
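To make the first point concrete, here is a minimal sketch of the Task { … } pattern without UIKit. The names Store.shared, ShopModel and statusText are hypothetical stand-ins (the original Store.INSTANCE is not shown in the question), and the purchase simply sleeps to simulate network I/O. The main actor is free while purchase is suspended, and the result is applied back on the main actor:

```swift
import Foundation

// Hypothetical stand-in for Store.INSTANCE. Being an actor, it decides
// for itself where purchase runs, regardless of the caller.
actor Store {
    static let shared = Store()

    func purchase(_ shop: String) async -> Bool {
        // Simulated network I/O; the caller is suspended, not blocked.
        try? await Task.sleep(nanoseconds: 10_000_000)
        return !shop.isEmpty
    }
}

// Hypothetical stand-in for the view controller (no UIKit needed here).
@MainActor
final class ShopModel {
    var selectedShop = "premium"
    private(set) var statusText = "idle"

    func buy() -> Task<Void, Never> {
        // Task { … } inherits the main actor from the enclosing context.
        Task {
            let success = await Store.shared.purchase(selectedShop)
            // Back on the main actor: safe to update UI-related state.
            statusText = success ? "purchased" : "failed"
        }
    }
}
```

Returning the Task from buy() is only there so a caller (or a test) can await its completion; a real @IBAction would just discard it.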
I recently saw that Swift had introduced concurrency support with the Actor model in Swift 5.5. This model enables safe concurrent code to avoid data races when we have a shared, mutable state.
I want to avoid main thread data races in my app's UI. For this, I am wrapping DispatchQueue.main.async at the call site wherever I set a UIImageView.image property or a UIButton style.
// Original function
func setImage(thumbnailName: String) {
myImageView.image = UIImage(named: thumbnailName)
}
// Call site
DispatchQueue.main.async {
myVC.setImage(thumbnailName: "thumbnail")
}
This seems unsafe because I have to remember to dispatch the method manually on the main queue. The other solution looks like:
func setImage(thumbnailName: String) {
DispatchQueue.main.async {
myImageView.image = UIImage(named: thumbnailName)
}
}
But this looks like a lot of boilerplate, and I wouldn't say I like using this for complex functions with more than one level of nesting.
The release of Swift support for Actors looks like a perfect solution for this. So, is there a way to make my code safer, i.e. always call UI functions on the main thread using Actors?
Actors in Swift 5.5 🤹♀️
Actor isolation and re-entrancy are now implemented in the Swift stdlib. So, Apple recommends using the model for concurrent logic with many new concurrency features to avoid data races. Instead of lock-based synchronisation (lots of boilerplate), we now have a much cleaner alternative.
Some UIKit classes, including UIViewController and UILabel, now have out of the box support for @MainActor. So we only need to use the annotation in custom UI-related classes. For example, in the code above, myImageView.image would automatically be dispatched on the main queue. However, the UIImage.init(named:) call is not automatically dispatched on the main thread outside of a view controller.
In the general case, @MainActor is useful for concurrent access to UI-related state and is the easiest option, even though we can also dispatch manually. I've outlined potential solutions below:
Solution 1
The simplest possible. This attribute could be useful in UI-related classes. Apple has made the process much cleaner using the @MainActor method annotation:
@MainActor func setImage(thumbnailName: String) {
    myImageView.image = UIImage(named: thumbnailName)
}
This code is roughly equivalent to wrapping in DispatchQueue.main.async, but the call site (from an async context outside the main actor) is now:
await setImage(thumbnailName: "thumbnail")
Solution 2
If you have custom UI-related classes, we can consider applying @MainActor to the type itself. This ensures that all methods and properties are dispatched on the main DispatchQueue.
We can then manually opt out from the main thread using the nonisolated keyword for non-UI logic.
@MainActor class ListViewModel: ObservableObject {
    func onButtonTap(...) { ... }

    nonisolated func fetchLatestAndDisplay() async { ... }
}
We don't need to specify await explicitly when we call onButtonTap within an actor.
Solution 3 (Works for blocks, as well as functions)
We can also call functions on the main thread outside an actor with:
func onButtonTap(...) async {
await MainActor.run {
....
}
}
Inside a different actor:
func onButtonTap(...) async {
    await MainActor.run {
        ....
    }
}
If we want to return from within a MainActor.run, simply specify that in the signature:
func onButtonTap(...) async -> Int {
let result = await MainActor.run { () -> Int in
return 3012
}
return result
}
This solution is slightly less clean than the above two, which are best suited for wrapping an entire function on the MainActor. However, MainActor.run also allows code running on different actors to be interleaved within one func (thx @Bill for the suggestion).
Solution 4 (Block solution that works within non-async functions)
An alternative to Solution 3 for scheduling a block on the @MainActor:
func onButtonTap(...) {
    Task { @MainActor in
        ....
    }
}
The advantage here over Solution 3 is that the enclosing func doesn't need to be marked as async. Do note however that this dispatches the block later rather than immediately as in Solution 3.
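The "later rather than immediately" difference between Solutions 3 and 4 can be seen in a small sketch. EventLog, scheduleWithTask and runInline are hypothetical names used only to record the order of events; nothing here comes from UIKit:

```swift
// Hypothetical log type used only to record the order of events.
@MainActor
final class EventLog {
    var entries: [String] = []
}

// Solution 4 style: the Task body is queued on the main actor and can
// only run after this synchronous main-actor function has returned.
@MainActor
func scheduleWithTask(_ log: EventLog) -> Task<Void, Never> {
    let task = Task { @MainActor in
        log.entries.append("task body")
    }
    log.entries.append("after Task created")
    return task
}

// Solution 3 style: the caller suspends until the block has finished,
// so the next statement runs strictly after the block.
func runInline(_ log: EventLog) async {
    await MainActor.run {
        log.entries.append("run body")
    }
    await MainActor.run {
        log.entries.append("after run")
    }
}
```

With Solution 4, "after Task created" is logged before "task body"; with Solution 3, "run body" is logged before "after run". That matters if the code following the block depends on its side effects.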
Summary
Actors make Swift code safer, cleaner and easier to write. Don't overuse them, but dispatching UI code to the main thread is a great use case. Note that since the feature is still in beta, the framework may change/improve further in the future.
Since we can easily use the actor keyword interchangeably with class or struct, I want to advise limiting the keyword only to instances where concurrency is strictly needed. Using the keyword adds extra overhead to instance creation and so doesn't make sense when there is no shared state to manage.
If you don't need a shared state, then don't create it unnecessarily. struct instance creation is so lightweight that it's better to create a new instance most of the time. e.g. SwiftUI.
I have an unknown number of asynchronous operations that need to be executed one after another because they update the same resource. At the end of all the executions, I want a single point of completion to be notified that they're all complete.
e.g. I have a basket that has an unknown number of eggs in it (numEggs). I need to call api.removeEgg(eggID:fromBasket:completion:) numEggs times, but I don't want the subsequent egg to be removed until the previous egg is completely removed, as they cannot modify the basket at the same time. When all of the eggs are removed, the client code should be notified once.
Which is the best mechanism to achieve this given the amount of tasks is unknown? I've attempted to use DispatchGroup, but it seems the asynchronous tasks are kicked off at the same time. OperationQueue would work the same way.
NOTE: This is not a duplicate of this question: Calling asynchronous tasks one after the other in Swift (iOS)
The difference is that I do not know the number of asynchronous tasks that need to be completed one-after-the-other. In the referenced post, the type of the tasks are known at compile time and can simply be chained - I don't know until runtime how many asynchronous tasks I'll need to execute.
You can add a dependency on a certain operation that has to happen afterwards.
For example,
let operation = RandomOperation()
let laterOperation = RandomOperation()
laterOperation.addDependency(operation)
Here, laterOperation waits to start until operation has finished.
Since you have an unknown number of operations to take care of, I recommend making a custom "group operation" class. All the custom class needs is an OperationQueue and an array of operations. At the end, you add the operations to the queue with
queue.addOperations(_:waitUntilFinished:)
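As a rough sketch of that idea, the dependency chain can be built in a loop when the number of operations is only known at runtime. Basket and this removeAllEggs signature are hypothetical, and the removal here is a synchronous block; a real network call would need an asynchronous Operation subclass that only finishes when its completion handler fires.

```swift
import Foundation

// Hypothetical shared resource; the lock keeps it safe to touch from
// whichever thread the queue runs an operation on.
final class Basket: @unchecked Sendable {
    private let lock = NSLock()
    private var removed: [String] = []

    func remove(_ id: String) {
        lock.lock(); defer { lock.unlock() }
        removed.append(id)
    }

    var removedIDs: [String] {
        lock.lock(); defer { lock.unlock() }
        return removed
    }
}

func removeAllEggs(_ eggIDs: [String], from basket: Basket,
                   completion: @escaping @Sendable () -> Void) {
    let queue = OperationQueue()
    var operations: [Operation] = []

    for id in eggIDs {
        let removal = BlockOperation { basket.remove(id) }
        // Each removal waits for the one before it.
        if let previous = operations.last {
            removal.addDependency(previous)
        }
        operations.append(removal)
    }

    // Single point of completion: runs once, after the last removal.
    let finished = BlockOperation(block: completion)
    if let last = operations.last {
        finished.addDependency(last)
    }
    operations.append(finished)

    // Blocks the calling thread to keep the sketch simple; do not call
    // it like this from the main thread of an app.
    queue.addOperations(operations, waitUntilFinished: true)
}
```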
Recursive function:
func removeAllEggs(eggIDs: [String], completion: @escaping () -> ()) {
    var ids = eggIDs
    // Nothing to remove: report completion right away.
    guard let id = ids.last else { return completion() }
    // endpoint, parameters and headers are defined elsewhere; parameters should carry id.
    Alamofire.request(endpoint, method: .post, parameters: parameters, encoding: JSONEncoding.default, headers: headers).responseData { response in
        // check response etc...
        if ids.count > 1 {
            ids.removeLast()
            removeAllEggs(eggIDs: ids, completion: completion)
        } else {
            // you just sent the last item
            return completion()
        }
    }
}
and to use it:
func callItHere() {
removeAllEggs(eggIDs: allEggs) {
// Done! poof! all gone!
}
}
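If Swift concurrency is available (this is a different technique from the completion-handler recursion above), the same one-after-another behaviour can be sketched as a plain loop. BasketAPI and its removeEgg method are hypothetical stand-ins for the real API, with a short sleep simulating the network round trip:

```swift
import Foundation

// Hypothetical API wrapper; the actor protects the basket contents.
actor BasketAPI {
    private(set) var eggs: [String]

    init(eggs: [String]) {
        self.eggs = eggs
    }

    func removeEgg(_ id: String) async {
        // Simulated network round trip.
        try? await Task.sleep(nanoseconds: 1_000_000)
        eggs.removeAll { $0 == id }
    }
}

// Each iteration waits for the previous removal to finish, and the
// function returning is the single point of completion.
func removeAllEggs(_ ids: [String], using api: BasketAPI) async {
    for id in ids {
        await api.removeEgg(id)
    }
}
```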
I am fetching data (news articles) in JSON format from a web service. The fetched data needs to be converted to an Article object and that object should be stored or updated in the database. I am using Alamofire for sending requests to the server and Core Data for database management.
My approach to this was to create a DataFetcher class for fetching JSON data and converting it to Article object:
class DataFetcher {
var delegate:DataFetcherDelegate?
func fetchArticlesFromUrl(url:String, andCategory category:ArticleCategory) {
//convert json to article
//send articles to delegate
getJsonFromUrl(url) { (json:JSON?,error:NSError?) in
if error != nil {
print("An error occurred while fetching json : \(error)")
}
if json != nil {
let articles = self.getArticleFromJson(json!,andCategory: category)
self.delegate?.receivedNewArticles(articles, fromCategory: category)
}
}
}
After I fetch the data I send it to DataImporter class to store it in database:
func receivedNewArticles(articles: [Article], fromCategory category:ArticleCategory) {
//update the database with new articles
//send articles to delegate
delegate?.receivedUpdatedArticles(articles, fromCategory:category)
}
The DataImporter class sends the articles to its delegate that is in my case the ViewController. This pattern was good when I had only one API call to make (that is fetchArticles), but now I need to make another call to the API for fetching categories. This call needs to be executed before the fetchArticles call in the ViewController.
This is the viewDidLoad method of my viewController:
override func viewDidLoad() {
super.viewDidLoad()
self.dataFetcher = DataFetcher()
let dataImporter = DataImporter()
dataImporter.delegate = self
self.dataFetcher?.delegate = dataImporter
self.loadCategories()
self.loadArticles()
}
My questions are:
What is the best way to ensure that one call to the API gets executed before the other one?
Is the pattern that I implemented good, since I need to make a different method for each API call?
What is the best way to ensure that one call to the API gets executed before the other one?
If you want to ensure that two or more asynchronous functions execute sequentially, you should first remember this:
If you implement a function which calls an asynchronous function, the calling function becomes asynchronous as well.
An asynchronous function should have a means to signal the caller that it has finished.
If you look at the network function getJsonFromUrl - which is an asynchronous function - it has a completion handler parameter which is one approach to signal the caller that the underlying task (a network request) has finished.
Now, fetchArticlesFromUrl calls the asynchronous function getJsonFromUrl and thus becomes asynchronous as well. However, in your current implementation it has no means to signal the caller that its underlying task (getJsonFromUrl) has finished. So, you first need to fix this, for example, through adding an appropriate completion handler and ensuring that the completion handler will eventually be called from within the body.
The same is true for your function loadArticles and loadCategories. I assume, these are asynchronous and require a means to signal the caller that the underlying task has finished - for example, by adding a completion handler parameter.
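A minimal sketch of what adding such a completion handler could look like, written in current Swift syntax rather than the older syntax of the question. Article, FetchError and getJSON are hypothetical stand-ins (the real getJsonFromUrl is not shown), and the stand-in calls back synchronously only to keep the example self-contained:

```swift
import Foundation

struct Article: Equatable {
    let title: String
}

enum FetchError: Error {
    case noData
}

// Hypothetical stand-in for getJsonFromUrl: delivers titles or an error.
func getJSON(from url: String,
             completion: @escaping ([String]?, Error?) -> Void) {
    if url.isEmpty {
        completion(nil, FetchError.noData)
    } else {
        completion(["A", "B"], nil)
    }
}

// The caller is now told when the underlying task has finished, and the
// completion handler is called on every path, success or failure.
func fetchArticles(from url: String,
                   completion: @escaping (Result<[Article], Error>) -> Void) {
    getJSON(from: url) { titles, error in
        if let titles = titles {
            completion(.success(titles.map { Article(title: $0) }))
        } else {
            completion(.failure(error ?? FetchError.noData))
        }
    }
}
```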
Once you have a number of asynchronous functions, you can chain them - that is, they will be called sequentially:
Given two asynchronous functions:
func loadCategories(completion: (AnyObject?, ErrorType?) -> ())
func loadArticles(completion: (AnyObject?, ErrorType?) -> ())
Call them as shown below:
loadCategories { (categories, error) in
if let categories = categories {
// do something with categories:
...
// Now, call loadArticles:
loadArticles { (articles, error) in
if let articles = articles {
// do something with the articles
...
} else {
// handle error:
...
}
}
} else {
// handle error:
...
}
}
Is the pattern that I implemented good, since I need to make a different method for each API call?
IMHO, you should not merge two functions into one where one performs the network request and the other processes the returned data. Just keep them separate. The reason is that you might want to explicitly specify the "execution context", that is, the dispatch queue where you want the code to be executed. Usually, Core Data, CPU-bound functions and network functions should not or cannot share the same dispatch queue, possibly also due to concurrency constraints. Because of this, you may want to control where your code executes through a parameter which specifies a dispatch queue.
If processing data may take perceivable time (e.g. > 100ms) don't hesitate and execute it asynchronously on a dedicated queue (not the main queue). Chain several asynchronous functions as shown above.
So, your code may consist of four asynchronous functions, network request 1, process data 1, network request 2, process data 2. Possibly, you need another function specifically for storing the data into Core Data.
Other hints:
Unless there's a parameter which can be set by the caller and which explicitly specifies the "execution context" (e.g. a dispatch queue) on which the completion handler should be called, it is preferable to submit the call of the completion handler on a concurrent global dispatch queue. This performs faster and avoids deadlocks. This is in contrast to Alamofire, which by default calls completion handlers on the main thread, is prone to deadlocks, and performs suboptimally. If you can configure the queue where the completion handler will be executed, please do so.
Prefer to execute functions and code on a dispatch queue which is not associated with the main thread, i.e. not the main queue. In your code, it seems that the bulk of the data processing will be executed on the main thread. Just ensure that UIKit methods execute on the main thread.
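One way to sketch the "execution context as a parameter" idea from the hints above: the function name process and both queue parameters are hypothetical, and summing integers stands in for real data processing.

```swift
import Foundation

// The caller chooses where the work runs and where the completion
// handler is called; both default to global (non-main) queues.
func process(_ values: [Int],
             on workQueue: DispatchQueue = .global(qos: .utility),
             completionQueue: DispatchQueue = .global(),
             completion: @escaping @Sendable (Int) -> Void) {
    workQueue.async {
        // CPU-bound work happens off the main queue.
        let sum = values.reduce(0, +)
        completionQueue.async {
            completion(sum)
        }
    }
}
```

A caller that needs to touch UIKit in the handler would pass completionQueue: .main explicitly; everything else stays off the main thread.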
I have an app which does hundreds of different network calls (HTTP GET requests) to a REST service. The calls are made from every single page of the app and there are many of them. But there is a requirement that two requests have to be made (on startup or awake) before any other network request happens. The result of these two requests is some config data which is needed before all other following requests. (This requirement has many reasons.)
I have one central method for all GET requests. It uses AFNetworking and (of course) asynchronous handlers:
func GET(path: String, var parameters: Dictionary<String, String>? = nil) -> Future<AnyObject!> {
let promise = Promise<AnyObject!>()
manager.GET(path, parameters: parameters, success: { (task: NSURLSessionDataTask!, response: AnyObject!) -> Void in
// some processing...
promise.success(response)
}) { (task: NSURLSessionDataTask!, error: NSError!) -> Void in
// some failure handling...
promise.failure(error)
}
return promise.future
}
The problem now is: how do I make these first two calls and block all other calls until those two succeed? The obvious solution would be a semaphore which blocks the thread (not main!) until those two calls return successfully, but if possible I want to avoid this solution (because of deadlocks, race conditions, how to do error handling, etc... the usual suspects).
So is there any better solution for this?
The synchronous order basically has to be:
1st call
wait for successful response of 1st call
2nd call
wait for successful response of 2nd call
allow all other calls (async) in any order
I cannot do this logic in the upper layers of the app because the GET requests could come from every part of the app, so I would need to rewrite everything. I want to do it centrally in this single GET method.
Maybe this is also possible with the Promise/Future pattern I already use, any hints are welcome.
Thanks.
There are a couple of approaches to tackle this class of problem:
You can use GCD (as shown in the answer you linked to) using semaphores or dispatch groups.
You can use asynchronous NSOperation custom subclass in conjunction with NSOperationQueue.
Or you can use promises.
I'd generally suggest you pick one of these three, but don't try to mix dispatch/operation queue patterns into your existing futures/promises code. People will suggest dispatch/operation queue approaches simply because this is how one generally solves this type of problem (futures have not gained popular acceptance yet, and there are competing promises/futures libraries out there). But promises/futures are designed to solve precisely this problem, and if you're using them, just stick with them.
On the specifics of the dispatch/operation approaches, I could explain why I don't like the GCD approach and try to convince you to use the NSOperationQueue approach, but that's moot if you're using promises/futures.
The challenge here, though, is that you appear to be using an old version of Thomvis/BrightFutures. The current version takes two types for the generic. So my code below probably won't work for you. But I'll offer it up as a suggestion, as it may be illustrative.
For example, let's imagine that you had a loginURL (the first request), and some second request that you wanted to perform after that, but then had an array urls that you wanted to run concurrently with respect to each other, but only after the first two requests were done. Using 3.2.2 of BrightFutures, it might look like:
GET(loginURL).flatMap { responseObject -> Future<AnyObject!, NSError> in
return self.GET(secondRequestURL)
}.flatMap { responseObject -> Future<Int, NSError> in
return urls.map { self.GET($0) }.fold(0) { sum, _ in sum + 1 }
}.onSuccess { count in
print("\(count) concurrent requests completed successfully")
}.onFailure { error in
print("not successful: \(error)")
}
Judging from your code snippet, you must be using an old version of BrightFutures, so the above probably won't work as written, but hopefully this illustrates the basic idea. Use the capabilities of BrightFutures to manage these asynchronous tasks and control which are done sequentially and which are done concurrently.
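For comparison only, here is how the same gate could be sketched with Swift concurrency instead of BrightFutures. This is a different technique from the one above, and GatedClient, send and the /config paths are all hypothetical. Every get first awaits one shared startup task, so the two config requests run once, in order, before anything else:

```swift
import Foundation

actor GatedClient {
    private var configTask: Task<String, Never>?
    private(set) var requestLog: [String] = []

    // Hypothetical transport; a real client would perform the HTTP call here.
    private func send(_ path: String) async -> String {
        try? await Task.sleep(nanoseconds: 1_000_000)
        requestLog.append(path)
        return "response for \(path)"
    }

    // Starts the two startup requests once; later callers reuse the task.
    private func loadConfig() async -> String {
        if let existing = configTask {
            return await existing.value
        }
        let task = Task { () -> String in
            let first = await self.send("/config/1")
            let second = await self.send("/config/2")
            return first + second
        }
        // Stored before any suspension, so concurrent callers see it.
        configTask = task
        return await task.value
    }

    // Central entry point: every request waits for the config first.
    func get(_ path: String) async -> String {
        _ = await loadConfig()
        return await send(path)
    }
}
```

Error handling is left out: a failing config request would need a throwing task and a policy for retrying, which depends on the app.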