Combine sink only receives outputs after they have all been published - iOS

I'm trying to perform some iterative work and use Combine to publish the progress (0.0 - 100.0) via a CurrentValueSubject, which my ViewModel then subscribes to.
(Edit: the ViewModel drives a SwiftUI ProgressView, which is why receive(on: DispatchQueue.main) is used.)
What I'm seeing is that the outputs are being published, but sink doesn't receive any of them until the publisher has completed.
Here's a simplified example:
// Class that performs iterative calculations and publishes its progress
class JobWorker {
private var subject: CurrentValueSubject<Double, Never>
private var progress = 0.0
init() {
self.subject = CurrentValueSubject<Double, Never>(progress)
}
func getPublisher() -> AnyPublisher<Double, Never> {
return subject.eraseToAnyPublisher()
}
func doWork() {
let tasks = [1,2,3,4,5]
tasks.forEach { num in
// ... perform some calculations ...
self.incrementProgress(20.0)
}
}
func incrementProgress(_ by: Double) {
progress += by
if progress >= 100.0 {
print("PUBLISH completion")
subject.send(completion: .finished)
} else {
print("PUBLISH value \(progress)")
subject.send(progress)
}
}
}
// ViewModel that subscribes to JobWorker's publisher and updates the progress in the view
final class JobViewModel: ObservableObject {
@Published var progress: Double = 0.0
private var cancellables = Set<AnyCancellable>()
private var jobWorker: JobWorker
init() {
self.jobWorker = JobWorker()
}
func runJob() {
self.jobWorker
.getPublisher()
.receive(on: DispatchQueue.main)
.handleEvents(
receiveSubscription: { _ in
print("RECEIVE subscription")
},
receiveOutput: { value in
print("RECEIVE output \(value)")
},
receiveCompletion: { _ in
print("RECEIVE completion")
},
receiveCancel: {
print("RECEIVE cancel")
},
receiveRequest: { _ in
print("RECEIVE demand")
}
)
.sink { [weak self] (completion) in
guard let self = self else { return }
print("SINK completion")
} receiveValue: { [weak self] (value) in
guard let self = self else { return }
print("SINK output \(value)")
self.progress = value
}
.store(in: &cancellables)
print("*** DO WORK ***")
self.jobWorker.doWork()
}
}
Calling JobViewModel.runJob results in the following output:
RECEIVE subscription
RECEIVE demand
RECEIVE output 0.0
SINK output 0.0
*** DO WORK ***
PUBLISH value 20.0
PUBLISH value 40.0
PUBLISH value 60.0
PUBLISH value 80.0
PUBLISH value 100.0
PUBLISH completion
RECEIVE output 20.0
SINK output 20.0
RECEIVE output 40.0
SINK output 40.0
RECEIVE output 60.0
SINK output 60.0
RECEIVE output 80.0
SINK output 80.0
RECEIVE output 100.0
SINK output 100.0
RECEIVE completion
SINK completion
After the CurrentValueSubject is first initialized, all of the outputs are published before handleEvents or sink receives anything.
Instead, I would have expected the output to show PUBLISH value x, RECEIVE output x, SINK output x for each of the outputs, followed by the completion.

The problem is that you are running your worker on the same thread where you are collecting the results.
Because you are doing a receive(on:) on the main DispatchQueue, each value that passes through receive(on:) is roughly equivalent to putting a new block on the main queue, to be executed when the queue is free.
Your worker fires up, executing synchronously on the main queue. While it's running, the main queue is tied up and not available for other work.
As the worker does its thing, it is publishing results to the subject, and as part of the publisher pipeline receive(on:) queues up the delivery of each result to the main queue, waiting for that queue to be free. The critical point, however, is that the main queue won't be free to handle those blocks, and report results, until the worker is done because the worker itself is tying up the main queue.
So none of your results are reported until after all the work is done.
I suspect what you want to do is run your work in a different context, off the main thread, so that it can complete asynchronously and only report the results on the main thread.
Here's a playground, based on your code, that does that:
import UIKit
import Combine
import PlaygroundSupport
class JobWorker {
private var subject = CurrentValueSubject<Double, Never>(0)
var publisher: AnyPublisher<Double, Never> {
get { subject.eraseToAnyPublisher() }
}
func doWork() async {
do {
for subtask in 1...5 {
guard !Task.isCancelled else { break }
print("doing task \(subtask)")
self.incrementProgress(by: 20)
try await Task.sleep(nanoseconds: 1 * NSEC_PER_SEC)
}
} catch is CancellationError {
print("The Tasks were cancelled")
} catch {
print("An unexpected error occured")
}
}
private func incrementProgress(by: Double) {
subject.value = subject.value + by;
if subject.value >= 100 {
subject.send(completion: .finished)
}
}
}
let worker = JobWorker()
let subscription = worker.publisher
.print()
.receive(on: DispatchQueue.main)
.sink { _ in
print("done")
} receiveValue: { value in
print("New Value Received \(value)")
}
Task {
await worker.doWork()
}
PlaygroundPage.current.needsIndefiniteExecution = true
I made your doWork function an async function so I could execute it from an independent Task. I also added a delay because it makes the asynchronous nature of the code a bit easier to see.
In the "main thread, I create a JobWorker and subscribe to its publisher, but to do the work I create a task and run doWork in that separate task. Progress is reported in the main thread, but the work is being done (and completed) in a different execution context.

Related

How to execute code *after* async function has finished and returned to caller's context?

I have an async function and I want to execute some task after the function has returned to the original context.
Here's a sample:
actor MyActor {
func doTask() async -> Int {
let result = await operation()
Task {
print("2. All tasks done!")
}
return result
}
}
let actor = MyActor()
let taskResult = await actor.doTask()
print("1. Task done")
I want output always to be
Task done
All tasks done!
However, sometimes the output is reversed.
I suppose this is because Swift might be switching threads when returning to the original caller.
Is there any way to actually wait for it so that "2. All tasks done" is called last?
I want to execute some task after the function has returned to the original context.
There's no magic here. If you want a task to execute at a certain point, then you need to start it at that point, not before.
What you are asking for is almost literally the same as
How do I make the print("2 ...") appear after the print("1 ...") in the following.
struct MyStruct {
func doTask() -> Int {
let result = operation()
print("2. All tasks done!")
return result
}
}
let actor = MyStruct()
let taskResult = actor.doTask()
print("1. Task done")
In your case, the only way to ensure the print("2 ...") appears after the print("1 ...") is to start the asynchronous task after you have done the print("1 ...").
You'll need to split the actor function in two.
actor MyActor {
func doOperation() async -> Int {
let result = await operation()
return result
}
func doTask(_ result: Int) async {
Task {
print("2. All tasks done!")
// Use result in some way if you need to
}
}
}
let actor = MyActor()
let taskResult = await actor.doOperation()
print("1. Task done")
await actor.doTask(taskResult)

EXC_BAD_ACCESS when initializing Dictionary of CurrentValueSubject in Swift

I am trying to create a class that executes data loading only once and returns the data to every caller that requested it while the load was in progress, so that the loading for the same item (identifier) is never performed more than once. The issue I am having is that it seems to crash on the first initialization of CurrentValueSubject for an identifier. This only happens if downloadStuff returns an Error. I have no idea what's wrong. Here is a reproduction of the issue.
Class that does the synchronization:
class FetchSynchronizer<T, ItemIdentifier: Hashable> {
typealias CustomParams = (isFirstLoad: Bool, result: Result<T, Error>)
enum FetchCondition {
// executes data fetching only once
case executeFetchOnlyOnce
// re-executes fetch if request failed
case retryOnlyIfFailure
// always executes fetch even if response is cached
case noDataCache
// custom condition
case custom((CustomParams) -> Bool)
}
struct LoadingState<T> {
let result: Result<T, Error>
let isLoading: Bool
init(result: Result<T, Error>? = nil, isLoading: Bool = false) {
self.result = result ?? .failure(NoResultsError())
self.isLoading = isLoading
}
}
private var cancellables = Set<AnyCancellable>()
private var isLoading: [ItemIdentifier: CurrentValueSubject<LoadingState<T>, Never>] = [:]
func startLoading(identifier: ItemIdentifier,
fetchCondition: FetchCondition = .executeFetchOnlyOnce,
loaderMethod: @escaping () async -> Result<T, Error>) async -> Result<T, Error> {
// initialize loading tracker for identifier on first execution
var isFirstExecution = false
if isLoading[identifier] == nil {
print("----0")
isLoading[identifier] = CurrentValueSubject<LoadingState<T>, Never>(LoadingState<T>())
isFirstExecution = true
}
guard let currentIsLoading = isLoading[identifier] else {
assertionFailure("Should never be nil because it's set above")
return .failure(NoResultsError())
}
if currentIsLoading.value.isLoading {
// loading in progress, wait for finish and call pending callbacks
return await withCheckedContinuation { continuation in
currentIsLoading.filter { !$0.isLoading }.sink { currentIsLoading in
continuation.resume(returning: currentIsLoading.result)
}.store(in: &cancellables)
}
} else {
// no fetching in progress, check if it can be executed
let shouldFetchData: Bool
switch fetchCondition {
case .executeFetchOnlyOnce:
// first execution -> fetch data
shouldFetchData = isFirstExecution
case .retryOnlyIfFailure:
// no cached data -> fetch data
switch currentIsLoading.value.result {
case .success:
shouldFetchData = false
case .failure:
shouldFetchData = true
}
case .noDataCache:
// always fetch
shouldFetchData = true
case .custom(let completion):
shouldFetchData = completion((isFirstLoad: isFirstExecution,
result: currentIsLoading.value.result))
}
if shouldFetchData {
currentIsLoading.send(LoadingState(isLoading: true))
// fetch data
return await withCheckedContinuation { continuation in
Task {
// execute loader method
let result = await loaderMethod()
let state = LoadingState(result: result,
isLoading: false)
currentIsLoading.send(state)
continuation.resume(returning: result)
}
}
} else {
// use existing data
return currentIsLoading.value.result
}
}
}
}
Example usage:
class Executer {
let fetchSynchronizer = FetchSynchronizer<Data?, String>()
func downloadStuff() async -> Result<Data?, Error> {
await fetchSynchronizer.startLoading(identifier: "1") {
return await withCheckedContinuation { continuation in
sleep(UInt32.random(in: 1...3))
print("-------request")
continuation.resume(returning: .failure(NSError() as Error))
}
}
}
init() {
start()
}
func start() {
Task {
await downloadStuff()
print("-----3")
}
DispatchQueue.global(qos: .utility).async {
Task {
await self.downloadStuff()
print("-----2")
}
}
DispatchQueue.global(qos: .background).async {
Task {
await self.downloadStuff()
print("-----1")
}
}
}
}
Start the execution:
Executer()
Crashes at
isLoading[identifier] = CurrentValueSubject<LoadingState<T>, Never>(LoadingState<T>())
Any guidance would be appreciated.
Swift Dictionary is not thread-safe.
You need to make sure it is being accessed from only one thread (i.e queue) or using locks.
EDIT - another solution, suggested by @Bogdan (the question writer), is to make the class an actor, so that concurrency safety is taken care of by the compiler!
By dispatching to a global queue, you increase the chance that two threads will try to write into the dictionary “at the same time”, which probably causes the crash.
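A minimal sketch of the actor idea from the EDIT above, using a pared-down hypothetical type (not the full FetchSynchronizer): the actor's isolation serializes every access to the dictionary, so no explicit locks are needed.
import Combine

// Hypothetical sketch: an actor owning the dictionary, so the compiler
// guarantees that mutations happen one at a time.
actor SubjectStore<Key: Hashable, Value> {
    private var subjects: [Key: CurrentValueSubject<Value, Never>] = [:]

    // Returns the existing subject for `key`, or creates one seeded with `initial`.
    func subject(for key: Key, initial: Value) -> CurrentValueSubject<Value, Never> {
        if let existing = subjects[key] { return existing }
        let created = CurrentValueSubject<Value, Never>(initial)
        subjects[key] = created
        return created
    }
}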
Take a look at these examples.
How to implement a Thread Safe HashTable (PhoneBook) Data Structure in Swift?
https://github.com/iThink32/Thread-Safe-Dictionary/blob/main/ThreadSafeDictionary.swift

Combine pipeline not receiving values

In the following code, which is a simplified version of a more elaborate pipeline, "Done processing" is never called for 2.
Why is that?
I suspect this is a problem due to the demand, but I cannot figure out the cause.
Note that if I remove the combineLatest() or the compactMap(), the value 2 is properly processed (but I need these combineLatest and compactMap for correctness, in my real example they are more involved).
var cancellables = Set<AnyCancellable>([])
func process<T>(_ value: T) -> AnyPublisher<T, Never> {
return Future<T, Never> { promise in
print("Starting processing of \(value)")
DispatchQueue.global(qos: .userInitiated).asyncAfter(deadline: .now() + 0.1) {
promise(.success(value))
}
}.eraseToAnyPublisher()
}
let s = PassthroughSubject<Int?, Never>()
s
.print("Combine->Subject")
.combineLatest(Just(true))
.print("Compact->Combine")
.compactMap { value, _ in value }
.print("Sink->Compact")
.flatMap(maxPublishers: .max(1)) { process($0) }
.sink {
print("Done processing \($0)")
}
.store(in: &cancellables)
s.send(nil)
// Give time for flatMap to finish
Thread.sleep(forTimeInterval: 1)
s.send(2)
It sounds like a bug in combineLatest. When a downstream subscriber requests additional demand "synchronously" (as seen in the print publisher output), that demand doesn't flow upstream.
One way to overcome this is to wrap the downstream of combineLatest in a flatMap:
s
.combineLatest(Just(true))
.flatMap(maxPublishers: .max(1)) {
Just($0)
.compactMap { value, _ in value }
.flatMap { process($0) }
}
.sink {
print("Done processing \($0)")
}
.store(in: &cancellables)
The outer flatMap now creates the back pressure, and the inner flatMap doesn't need it anymore.
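With the same driving code from the question, the value 2 should now reach the sink (a sketch, not verified here):
s.send(nil)
// Give the inner Future time to complete before sending the next value
Thread.sleep(forTimeInterval: 1)
s.send(2)   // expected to print "Done processing 2" with the wrapped pipeline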

Wait for DispatchQueue [duplicate]

How could I make my code wait until the task in DispatchQueue finishes? Does it need any CompletionHandler or something?
func myFunction() {
var a: Int?
DispatchQueue.main.async {
var b: Int = 3
a = b
}
// wait until the task finishes, then print
print(a) // - this will contain nil, of course, because it
// will execute before the code above
}
I'm using Xcode 8.2 and writing in Swift 3.
If you need to hide the asynchronous nature of myFunction from the caller, use DispatchGroups to achieve this. Otherwise, use a completion block. Find samples for both below.
DispatchGroup Sample
You can either get notified when the group's enter() and leave() calls are balanced:
func myFunction() {
var a = 0
let group = DispatchGroup()
group.enter()
DispatchQueue.main.async {
a = 1
group.leave()
}
// does not wait. But the code in notify() is executed
// after enter() and leave() calls are balanced
group.notify(queue: .main) {
print(a)
}
}
or you can wait:
func myFunction() {
var a = 0
let group = DispatchGroup()
group.enter()
// avoid deadlocks by not using .main queue here
DispatchQueue.global(qos: .default).async {
a = 1
group.leave()
}
// wait ...
group.wait()
print(a) // you could also `return a` here
}
Note: group.wait() blocks the current queue (probably the main queue in your case), so you have to dispatch.async on another queue (like in the above sample code) to avoid a deadlock.
Completion Block Sample
func myFunction(completion: @escaping (Int)->()) {
var a = 0
DispatchQueue.main.async {
let b: Int = 1
a = b
completion(a) // call completion after you have the result
}
}
// on caller side:
myFunction { result in
print("result: \(result)")
}
In Swift 3, there is no need for a completion handler when DispatchQueue finishes one task.
Furthermore, you can achieve your goal in different ways.
One way is this:
var a: Int?
let queue = DispatchQueue(label: "com.app.queue")
queue.sync {
for i in 0..<10 {
print("Ⓜ️" , i)
a = i
}
}
print("After Queue \(a)")
It will wait until the loop finishes but in this case your main thread will block.
You can also do the same thing like this:
let myGroup = DispatchGroup()
myGroup.enter()
//// Do your task
myGroup.leave() //// When your task completes
myGroup.notify(queue: DispatchQueue.main) {
////// do your remaining work
}
One last thing: If you want to use completionHandler when your task completes using DispatchQueue, you can use DispatchWorkItem.
Here is an example how to use DispatchWorkItem:
let workItem = DispatchWorkItem {
// Do something
}
let queue = DispatchQueue.global()
queue.async {
workItem.perform()
}
workItem.notify(queue: DispatchQueue.main) {
// Here you can notify you Main thread
}
Swift 5 version of the solution
func myCriticalFunction() {
var value1: String?
var value2: String?
let group = DispatchGroup()
group.enter()
//async operation 1
DispatchQueue.global(qos: .default).async {
// Network calls or some other async task
value1 = //out of async task
group.leave()
}
group.enter()
//async operation 2
DispatchQueue.global(qos: .default).async {
// Network calls or some other async task
value2 = //out of async task
group.leave()
}
group.wait()
print("Value1 \(value1) , Value2 \(value2)")
}
Use dispatch group
dispatchGroup.enter()
FirstOperation(completion: { _ in
dispatchGroup.leave()
})
dispatchGroup.enter()
SecondOperation(completion: { _ in
dispatchGroup.leave()
})
dispatchGroup.wait() // Waits here on this thread until the two operations complete executing.
In Swift 5.5+ you can take advantage of Swift Concurrency, which allows you to return a value from a closure dispatched to the main thread.
func myFunction() async {
var a : Int?
a = await MainActor.run {
let b = 3
return b
}
print(a)
}
Task {
await myFunction()
}
Swift 4
You can use an async function for these situations. When you use DispatchGroup(), deadlocks may sometimes occur.
var a: Int?
@objc func myFunction(completion: @escaping (Bool) -> ()) {
DispatchQueue.main.async {
let b: Int = 3
a = b
completion(true)
}
}
override func viewDidLoad() {
super.viewDidLoad()
myFunction { (status) in
if status {
print(self.a!)
}
}
}
Somehow the dispatchGroup enter() and leave() commands above didn't work for my case.
Using sleep(5) in a while loop on the background thread worked for me, though. Leaving this here in case it helps someone else; it didn't interfere with my other threads.

Combine framework serialize async operations

How do I get the asynchronous pipelines that constitute the Combine framework to line up synchronously (serially)?
Suppose I have 50 URLs from which I want to download the corresponding resources, and let's say I want to do it one at a time. I know how to do that with Operation / OperationQueue, e.g. using an Operation subclass that doesn't declare itself finished until the download is complete. How would I do the same thing using Combine?
At the moment all that occurs to me is to keep a global list of the remaining URLs and pop one off, set up that one pipeline for one download, do the download, and in the sink of the pipeline, repeat. That doesn't seem very Combine-like.
I did try making an array of the URLs and map it to an array of publishers. I know I can "produce" a publisher and cause it to publish on down the pipeline using flatMap. But then I'm still doing all the downloading simultaneously. There isn't any Combine way to walk the array in a controlled manner — or is there?
(I also imagined doing something with Future but I became hopelessly confused. I'm not used to this way of thinking.)
Use flatMap(maxPublishers:transform:) with .max(1), e.g.
func imagesPublisher(for urls: [URL]) -> AnyPublisher<UIImage, URLError> {
Publishers.Sequence(sequence: urls.map { self.imagePublisher(for: $0) })
.flatMap(maxPublishers: .max(1)) { $0 }
.eraseToAnyPublisher()
}
Where
func imagePublisher(for url: URL) -> AnyPublisher<UIImage, URLError> {
URLSession.shared.dataTaskPublisher(for: url)
.compactMap { UIImage(data: $0.data) }
.receive(on: RunLoop.main)
.eraseToAnyPublisher()
}
and
var imageRequests: AnyCancellable?
func fetchImages() {
imageRequests = imagesPublisher(for: urls).sink { completion in
switch completion {
case .finished:
print("done")
case .failure(let error):
print("failed", error)
}
} receiveValue: { image in
// do whatever you want with the images as they come in
}
}
That resulted in the downloads completing one at a time (timing screenshot omitted).
But we should recognize that you take a big performance hit doing them sequentially like that. For example, if I bump it up to 6 at a time, it's more than twice as fast (timing screenshot omitted).
Personally, I’d recommend only downloading sequentially if you absolutely must (which, when downloading a series of images/files, is almost certainly not the case). Yes, performing requests concurrently can result in them not finishing in a particular order, but we just use a structure that is order independent (e.g. a dictionary rather than a simple array), but the performance gains are so significant that it’s generally worth it.
But, if you want them downloaded sequentially, the maxPublishers parameter can achieve that.
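To illustrate the order-independent alternative mentioned above, a hedged sketch (the imagesByURL dictionary is illustrative, not part of the original answer): download concurrently with MergeMany and key the results by URL, so completion order doesn't matter.
// Hypothetical sketch: concurrent downloads collected into a dictionary keyed by URL.
var imagesByURL: [URL: UIImage] = [:]

let cancellable = Publishers.MergeMany(
    urls.map { url in
        URLSession.shared.dataTaskPublisher(for: url)
            .compactMap { UIImage(data: $0.data) }
            .map { (url, $0) }
            .eraseToAnyPublisher()
    }
)
.receive(on: RunLoop.main)
.sink(receiveCompletion: { _ in print("all done") },
      receiveValue: { url, image in imagesByURL[url] = image })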
I've only briefly tested this, but at first pass it appears that each request waits for the previous request to finish before starting.
I'm posting this solution in search of feedback. Please be critical if this isn't a good solution.
extension Collection where Element: Publisher {
func serialize() -> AnyPublisher<Element.Output, Element.Failure>? {
// If the collection is empty, we can't just create an arbitrary publisher
// so we return nil to indicate that we had nothing to serialize.
if isEmpty { return nil }
// We know at this point that it's safe to grab the first publisher.
let first = self.first!
// If there was only a single publisher then we can just return it.
if count == 1 { return first.eraseToAnyPublisher() }
// We're going to build up the output starting with the first publisher.
var output = first.eraseToAnyPublisher()
// We iterate over the rest of the publishers (skipping over the first.)
for publisher in self.dropFirst() {
// We build up the output by appending the next publisher.
output = output.append(publisher).eraseToAnyPublisher()
}
return output
}
}
A more concise version of this solution (provided by @matt):
extension Collection where Element: Publisher {
func serialize() -> AnyPublisher<Element.Output, Element.Failure>? {
guard let start = self.first else { return nil }
return self.dropFirst().reduce(start.eraseToAnyPublisher()) {
$0.append($1).eraseToAnyPublisher()
}
}
}
You could create a custom Subscriber whose receive(_:) returns Subscribers.Demand.max(1). In that case the subscriber will request the next value only after it has received one. The example is for an Int publisher, but a random delay in map mimics network traffic :-)
import PlaygroundSupport
import SwiftUI
import Combine
class MySubscriber: Subscriber {
typealias Input = String
typealias Failure = Never
func receive(subscription: Subscription) {
print("Received subscription", Thread.current.isMainThread)
subscription.request(.max(1))
}
func receive(_ input: Input) -> Subscribers.Demand {
print("Received input: \(input)", Thread.current.isMainThread)
return .max(1)
}
func receive(completion: Subscribers.Completion<Never>) {
DispatchQueue.main.async {
print("Received completion: \(completion)", Thread.current.isMainThread)
PlaygroundPage.current.finishExecution()
}
}
}
(110...120)
.publisher.receive(on: DispatchQueue.global())
.map {
print(Thread.current.isMainThread, Thread.current)
usleep(UInt32.random(in: 10000 ... 1000000))
return String(format: "%02x", $0)
}
.subscribe(on: DispatchQueue.main)
.subscribe(MySubscriber())
print("Hello")
PlaygroundPage.current.needsIndefiniteExecution = true
Playground print ...
Hello
Received subscription true
false <NSThread: 0x600000064780>{number = 5, name = (null)}
Received input: 6e false
false <NSThread: 0x60000007cc80>{number = 9, name = (null)}
Received input: 6f false
false <NSThread: 0x60000007cc80>{number = 9, name = (null)}
Received input: 70 false
false <NSThread: 0x60000007cc80>{number = 9, name = (null)}
Received input: 71 false
false <NSThread: 0x60000007cc80>{number = 9, name = (null)}
Received input: 72 false
false <NSThread: 0x600000064780>{number = 5, name = (null)}
Received input: 73 false
false <NSThread: 0x600000064780>{number = 5, name = (null)}
Received input: 74 false
false <NSThread: 0x60000004dc80>{number = 8, name = (null)}
Received input: 75 false
false <NSThread: 0x60000004dc80>{number = 8, name = (null)}
Received input: 76 false
false <NSThread: 0x60000004dc80>{number = 8, name = (null)}
Received input: 77 false
false <NSThread: 0x600000053400>{number = 3, name = (null)}
Received input: 78 false
Received completion: finished true
UPDATE
Finally I found .flatMap(maxPublishers:), which forced me to update this interesting topic with a slightly different approach. Please note that I am using a global queue for scheduling, not just some random delay, to be sure that receiving a serialized stream is not "random" or "lucky" behavior :-)
import PlaygroundSupport
import Combine
import Foundation
PlaygroundPage.current.needsIndefiniteExecution = true
let A = (1 ... 9)
.publisher
.flatMap(maxPublishers: .max(1)) { value in
[value].publisher
.flatMap { value in
Just(value)
.delay(for: .milliseconds(Int.random(in: 0 ... 100)), scheduler: DispatchQueue.global())
}
}
.sink { value in
print(value, "A")
}
let B = (1 ... 9)
.publisher
.flatMap { value in
[value].publisher
.flatMap { value in
Just(value)
.delay(for: .milliseconds(Int.random(in: 0 ... 100)), scheduler: RunLoop.main)
}
}
.sink { value in
print(" ",value, "B")
}
prints
1 A
4 B
5 B
7 B
1 B
2 B
8 B
6 B
2 A
3 B
9 B
3 A
4 A
5 A
6 A
7 A
8 A
9 A
Based on what is written here, the
.serialize()?
operator defined in Clay Ellis' accepted answer could be replaced by
.publisher.flatMap(maxPublishers: .max(1)){$0}
while the "unserialized" version must use
.publisher.flatMap{$0}
"real world example"
import PlaygroundSupport
import Foundation
import Combine
let path = "postman-echo.com/get"
let urls: [URL] = "... which proves the downloads are happening serially .-)".map(String.init).compactMap { (parameter) in
var components = URLComponents()
components.scheme = "https"
components.path = path
components.queryItems = [URLQueryItem(name: parameter, value: nil)]
return components.url
}
//["https://postman-echo.com/get?]
struct Postman: Decodable {
var args: [String: String]
}
let collection = urls.compactMap { value in
URLSession.shared.dataTaskPublisher(for: value)
.tryMap { data, response -> Data in
return data
}
.decode(type: Postman.self, decoder: JSONDecoder())
.catch {_ in
Just(Postman(args: [:]))
}
}
extension Collection where Element: Publisher {
func serialize() -> AnyPublisher<Element.Output, Element.Failure>? {
guard let start = self.first else { return nil }
return self.dropFirst().reduce(start.eraseToAnyPublisher()) {
return $0.append($1).eraseToAnyPublisher()
}
}
}
var streamA = ""
let A = collection
.publisher.flatMap{$0}
.sink(receiveCompletion: { (c) in
print(streamA, " ", c, " .publisher.flatMap{$0}")
}, receiveValue: { (postman) in
print(postman.args.keys.joined(), terminator: "", to: &streamA)
})
var streamC = ""
let C = collection
.serialize()?
.sink(receiveCompletion: { (c) in
print(streamC, " ", c, " .serialize()?")
}, receiveValue: { (postman) in
print(postman.args.keys.joined(), terminator: "", to: &streamC)
})
var streamD = ""
let D = collection
.publisher.flatMap(maxPublishers: .max(1)){$0}
.sink(receiveCompletion: { (c) in
print(streamD, " ", c, " .publisher.flatMap(maxPublishers: .max(1)){$0}")
}, receiveValue: { (postman) in
print(postman.args.keys.joined(), terminator: "", to: &streamD)
})
PlaygroundPage.current.needsIndefiniteExecution = true
prints
.w.h i.c hporves ht edownloadsa erh appeninsg eriall y.-) finished .publisher.flatMap{$0}
... which proves the downloads are happening serially .-) finished .publisher.flatMap(maxPublishers: .max(1)){$0}
... which proves the downloads are happening serially .-) finished .serialize()?
This seems very useful in other scenarios as well. Try using the default value of maxPublishers in the next snippet and compare the results :-)
import Combine
let sequencePublisher = Publishers.Sequence<Range<Int>, Never>(sequence: 0..<Int.max)
let subject = PassthroughSubject<String, Never>()
let handle = subject
.zip(sequencePublisher.print())
//.publish
.flatMap(maxPublishers: .max(1), { (pair) in
Just(pair)
})
.print()
.sink { letters, digits in
print(letters, digits)
}
"Hello World!".map(String.init).forEach { (s) in
subject.send(s)
}
subject.send(completion: .finished)
From the original question:
I did try making an array of the URLs and map it to an array of publishers. I know I can "produce" a publisher and cause it to publish on down the pipeline using flatMap. But then I'm still doing all the downloading simultaneously. There isn't any Combine way to walk the array in a controlled manner — or is there?
Here's a toy example to stand in for the real problem:
let collection = (1 ... 10).map {
Just($0).delay(
for: .seconds(Double.random(in:1...5)),
scheduler: DispatchQueue.main)
.eraseToAnyPublisher()
}
collection.publisher
.flatMap() {$0}
.sink {print($0)}.store(in:&self.storage)
This emits the integers from 1 to 10 in random order arriving at random times. The goal is to do something with collection that will cause it to emit the integers from 1 to 10 in order.
Now we're going to change just one thing: in the line
.flatMap {$0}
we add the maxPublishers parameter:
let collection = (1 ... 10).map {
Just($0).delay(
for: .seconds(Double.random(in:1...5)),
scheduler: DispatchQueue.main)
.eraseToAnyPublisher()
}
collection.publisher
.flatMap(maxPublishers:.max(1)) {$0}
.sink {print($0)}.store(in:&self.storage)
Presto, we now do emit the integers from 1 to 10, in order, with random intervals between them.
Let's apply this to the original problem. To demonstrate, I need a fairly slow Internet connection and a fairly large resource to download. First, I'll do it with ordinary .flatMap:
let eph = URLSessionConfiguration.ephemeral
let session = URLSession(configuration: eph)
let url = "https://photojournal.jpl.nasa.gov/tiff/PIA23172.tif"
let collection = [url, url, url]
.map {URL(string:$0)!}
.map {session.dataTaskPublisher(for: $0)
.eraseToAnyPublisher()
}
collection.publisher.setFailureType(to: URLError.self)
.handleEvents(receiveOutput: {_ in print("start")})
.flatMap() {$0}
.map {$0.data}
.sink(receiveCompletion: {comp in
switch comp {
case .failure(let err): print("error", err)
case .finished: print("finished")
}
}, receiveValue: {_ in print("done")})
.store(in:&self.storage)
The result is
start
start
start
done
done
done
finished
which shows that we are doing the three downloads simultaneously. Okay, now change
.flatMap() {$0}
to
.flatMap(maxPublishers:.max(1)) {$0}
The result now is:
start
done
start
done
start
done
finished
So we are now downloading serially, which is the problem originally to be solved.
append
In keeping with the principle of TIMTOWTDI, we can instead chain the publishers with append to serialize them:
let collection = (1 ... 10).map {
Just($0).delay(
for: .seconds(Double.random(in:1...5)),
scheduler: DispatchQueue.main)
.eraseToAnyPublisher()
}
let pub = collection.dropFirst().reduce(collection.first!) {
return $0.append($1).eraseToAnyPublisher()
}
The result is a publisher that serializes the delayed publishers in the original collection. Let's prove it by subscribing to it:
pub.sink {print($0)}.store(in:&self.storage)
Sure enough, the integers now arrive in order (with random intervals between).
We can encapsulate the creation of pub from a collection of publishers with an extension on Collection, as suggested by Clay Ellis:
extension Collection where Element: Publisher {
func serialize() -> AnyPublisher<Element.Output, Element.Failure>? {
guard let start = self.first else { return nil }
return self.dropFirst().reduce(start.eraseToAnyPublisher()) {
return $0.append($1).eraseToAnyPublisher()
}
}
}
Here is one-page playground code that depicts a possible approach. The main idea is to transform the async API calls into a chain of Future publishers, making a serial pipeline.
Input: a range of Ints from 1 to 10 that are asynchronously converted into Strings on a background queue
Demo of direct call to async API:
let group = DispatchGroup()
inputValues.map {
group.enter()
asyncCall(input: $0) { (output, _) in
print(">> \(output), in \(Thread.current)")
group.leave()
}
}
group.wait()
Output:
>> 1, in <NSThread: 0x7fe76264fff0>{number = 4, name = (null)}
>> 3, in <NSThread: 0x7fe762446b90>{number = 3, name = (null)}
>> 5, in <NSThread: 0x7fe7624461f0>{number = 5, name = (null)}
>> 6, in <NSThread: 0x7fe762461ce0>{number = 6, name = (null)}
>> 10, in <NSThread: 0x7fe76246a7b0>{number = 7, name = (null)}
>> 4, in <NSThread: 0x7fe764c37d30>{number = 8, name = (null)}
>> 7, in <NSThread: 0x7fe764c37cb0>{number = 9, name = (null)}
>> 8, in <NSThread: 0x7fe76246b540>{number = 10, name = (null)}
>> 9, in <NSThread: 0x7fe7625164b0>{number = 11, name = (null)}
>> 2, in <NSThread: 0x7fe764c37f50>{number = 12, name = (null)}
Demo of combine pipeline:
Output:
>> got 1
>> got 2
>> got 3
>> got 4
>> got 5
>> got 6
>> got 7
>> got 8
>> got 9
>> got 10
>>>> finished with true
Code:
import Cocoa
import Combine
import PlaygroundSupport
// Assuming there is some Asynchronous API with
// (eg. process Int input value during some time and generates String result)
func asyncCall(input: Int, completion: @escaping (String, Error?) -> Void) {
DispatchQueue.global(qos: .background).async {
sleep(.random(in: 1...5)) // wait for random Async API output
completion("\(input)", nil)
}
}
// There are some input values to be processed serially
let inputValues = Array(1...10)
// Prepare one pipeline item based on Future, which transforms Async -> Sync
func makeFuture(input: Int) -> AnyPublisher<Bool, Error> {
Future<String, Error> { promise in
asyncCall(input: input) { (value, error) in
if let error = error {
promise(.failure(error))
} else {
promise(.success(value))
}
}
}
.receive(on: DispatchQueue.main)
.map {
print(">> got \($0)") // << sideeffect of pipeline item
return true
}
.eraseToAnyPublisher()
}
// Create pipeline transforming input values into chain of Future publishers
var subscribers = Set<AnyCancellable>()
let pipeline =
inputValues
.reduce(nil as AnyPublisher<Bool, Error>?) { (chain, value) in
if let chain = chain {
return chain.flatMap { _ in
makeFuture(input: value)
}.eraseToAnyPublisher()
} else {
return makeFuture(input: value)
}
}
// Execute pipeline
pipeline?
.sink(receiveCompletion: { _ in
// << do something on completion if needed
}) { output in
print(">>>> finished with \(output)")
}
.store(in: &subscribers)
PlaygroundPage.current.needsIndefiniteExecution = true
In all of the other Reactive frameworks this is really easy; you just use concat to concatenate and flatten the results in one step and then you can reduce the results into a final array. Apple makes this difficult because Publisher.Concatenate has no overload that accepts an array of Publishers. There is similar weirdness with Publisher.Merge. I have a feeling this has to do with the fact that they return nested generic publishers instead of just returning a single generic type like rx Observable. I guess you can just call Concatenate in a loop and then reduce the concatenated results into a single array, but I really hope they address this issue in the next release. There is certainly the need to concat more than 2 publishers and to merge more than 4 publishers (and the overloads for these two operators aren't even consistent, which is just weird).
EDIT:
I came back to this and found that you can indeed concat an arbitrary array of publishers and they will emit in sequence. I have no idea why there isn't a function like ConcatenateMany to do this for you, but it looks like as long as you are willing to use a type-erased publisher, it's not that hard to write one yourself. This example shows that merge emits in temporal order while concat emits in the order of combination:
import PlaygroundSupport
import SwiftUI
import Combine
let p = Just<Int>(1).append(2).append(3).delay(for: .seconds(0.25), scheduler: RunLoop.main).eraseToAnyPublisher()
let q = Just<Int>(4).append(5).append(6).eraseToAnyPublisher()
let r = Just<Int>(7).append(8).append(9).delay(for: .seconds(0.5), scheduler: RunLoop.main).eraseToAnyPublisher()
let concatenated: AnyPublisher<Int, Never> = [q,r].reduce(p) { total, next in
total.append(next).eraseToAnyPublisher()
}
var subscriptions = Set<AnyCancellable>()
concatenated
.sink(receiveValue: { v in
print("concatenated: \(v)")
}).store(in: &subscriptions)
Publishers
.MergeMany([p,q,r])
.sink(receiveValue: { v in
print("merge: \(v)")
}).store(in: &subscriptions)
What about a dynamic array of URLs, something like a data bus?
var array: [AnyPublisher<Data, URLError>] = []
array.append(Task())
array.publisher
.flatMap { $0 }
.sink {
}
// it will be finished
array.append(Task())
array.append(Task())
array.append(Task())
Another approach, if you want to collect all the results of the downloads in order to know which ones failed and which succeeded, is to write a custom publisher that looks like this:
extension Publishers {
struct Serialize<Upstream: Publisher>: Publisher {
typealias Output = [Result<Upstream.Output, Upstream.Failure>]
typealias Failure = Never
let upstreams: [Upstream]
init<C: Collection>(_ upstreams: C) where C.Element == Upstream {
self.upstreams = Array(upstreams)
}
init(_ upstreams: Upstream...) {
self.upstreams = upstreams
}
func receive<S>(subscriber: S) where S : Subscriber, Self.Failure == S.Failure, Self.Output == S.Input {
guard let first = upstreams.first else { return Empty().subscribe(subscriber) }
first
.map { Result<Upstream.Output, Upstream.Failure>.success($0) }
.catch { Just(Result<Upstream.Output, Upstream.Failure>.failure($0)) }
.map { [$0] }
.append(Serialize(upstreams.dropFirst()))
.collect()
.map { $0.flatMap { $0 } }
.subscribe(subscriber)
}
}
}
extension Collection where Element: Publisher {
func serializedPublishers() -> Publishers.Serialize<Element> {
.init(self)
}
}
The publisher takes the first download task, converts its output/failure to a Result instance, and prepends it to the "recursive" call for the rest of the list.
Usage: Publishers.Serialize(listOfDownloadTasks), or listOfDownloadTasks.serializedPublishers().
One minor inconvenience of this implementation is the fact that the Result instance needs to be wrapped into an array, just to be flattened three steps later in the pipeline. Perhaps someone can suggest a better alternative to that.
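For illustration, a hedged usage sketch (assuming urls is an array of URL and the extension above is in scope); all downloads are delivered as a single array of Results, in order:
// Hypothetical usage: each download becomes a Result, delivered together as one array.
let tasks = urls.map { URLSession.shared.dataTaskPublisher(for: $0) }

let cancellable = tasks.serializedPublishers()
    .sink { results in
        for (url, result) in zip(urls, results) {
            switch result {
            case .success(let output):
                print(url, "downloaded \(output.data.count) bytes")
            case .failure(let error):
                print(url, "failed:", error)
            }
        }
    }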
