Swift Combine: Buffer upstream values and emit them at a steady rate?

Using the new Combine framework in iOS 13.
Suppose I have an upstream publisher sending values at a highly irregular rate - sometimes seconds or minutes may go by without any values, and then a stream of values may come through all at once. I'd like to create a custom publisher that subscribes to the upstream values, buffers them and emits them at a regular, known cadence when they come in, but publishes nothing if they've all been exhausted.
For a concrete example:
t = 0 to 5000ms: no upstream values published
t = 5001ms: upstream publishes "a"
t = 5002ms: upstream publishes "b"
t = 5003ms: upstream publishes "c"
t = 5004ms to 10000ms: no upstream values published
t = 10001ms: upstream publishes "d"
My publisher subscribed to the upstream would produce values every 1 second:
t = 0 to 5000ms: no values published
t = 5001ms: publishes "a"
t = 6001ms: publishes "b"
t = 7001ms: publishes "c"
t = 7002ms to 10000ms: no values published
t = 10001ms: publishes "d"
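To pin down the timing rule behind this example, here is a minimal sketch in plain Swift (no Combine) that computes when each value should be republished. The function name pacedEmissionTimes and the use of integer milliseconds are assumptions made for illustration; they are not part of any API.

```swift
/// Computes the republish time for each upstream arrival time (both in ms).
/// A value goes out as soon as it arrives, unless the previous value went out
/// less than `interval` ms earlier, in which case it waits for the interval to elapse.
func pacedEmissionTimes(arrivals: [Int], interval: Int) -> [Int] {
    var emissions: [Int] = []
    var earliestAllowed = Int.min  // nothing has been emitted yet
    for arrival in arrivals {
        let emission = max(arrival, earliestAllowed)
        emissions.append(emission)
        earliestAllowed = emission + interval
    }
    return emissions
}

let times = pacedEmissionTimes(arrivals: [5001, 5002, 5003, 10001], interval: 1000)
print(times) // [5001, 6001, 7001, 10001]
```

This only models the desired schedule; it says nothing yet about how to get Combine to follow it.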
None of the existing publishers or operators in Combine seem to quite do what I want here.
throttle and debounce would simply sample the upstream values at a certain cadence and drop the rest (e.g. they would only publish "a" if the cadence was 1000ms)
delay would add the same delay to every value, but not space them out (e.g. if my delay was 1000ms, it would publish "a" at 6001ms, "b" at 6002ms, "c" at 6003ms)
buffer seems promising, but I can't quite figure out how to use it - how to force it to publish a value from the buffer on demand. When I hooked up a sink to buffer it seemed to just instantly publish all the values, not buffering at all.
I thought about using some sort of combining operator like zip or merge or combineLatest and combining it with a Timer publisher, and that's probably the right approach, but I can't figure out exactly how to configure it to give the behavior I want.
Edit
Here's a marble diagram that hopefully illustrates what I'm going for:
Upstream Publisher:
-A-B-C-------------------D-E-F--------|>
My Custom Operator:
-A----B----C-------------D----E----F--|>
Edit 2: Unit Test
Here's a unit test that should pass if modulatedPublisher (my desired buffered publisher) works as desired. It's not perfect, but it stores events (including the time received) as they're received and then compares the time intervals between events, ensuring they are no smaller than the desired interval.
func testCustomPublisher() {
    let expectation = XCTestExpectation(description: "async")
    var events = [Event]()
    let passthroughSubject = PassthroughSubject<Int, Never>()
    let cancellable = passthroughSubject
        .modulatedPublisher(interval: 1.0)
        .sink { value in
            events.append(Event(value: value, date: Date()))
            print("value received: \(value) at \(self.dateFormatter.string(from:Date()))")
        }
    // WHEN I send 3 events, wait 6 seconds, and send 3 more events
    passthroughSubject.send(1)
    passthroughSubject.send(2)
    passthroughSubject.send(3)
    DispatchQueue.main.asyncAfter(deadline: .now() + .milliseconds(6000)) {
        passthroughSubject.send(4)
        passthroughSubject.send(5)
        passthroughSubject.send(6)
        DispatchQueue.main.asyncAfter(deadline: .now() + .milliseconds(4000)) {
            // THEN I expect the stored events to be no closer together in time than the interval of 1.0s
            for i in 1 ..< events.count {
                let interval = events[i].date.timeIntervalSince(events[i-1].date)
                print("Interval: \(interval)")
                // There's some small error in the interval but it should be about 1 second since I'm using a 1s modulated publisher.
                XCTAssertTrue(interval > 0.99)
            }
            expectation.fulfill()
        }
    }
    wait(for: [expectation], timeout: 15)
}
The closest I've gotten is using zip, like so:
public extension Publisher where Self.Failure == Never {
    func modulatedPublisher(interval: TimeInterval) -> AnyPublisher<Output, Never> {
        let timerBuffer = Timer
            .publish(every: interval, on: .main, in: .common)
            .autoconnect()
        return timerBuffer
            .zip(self, { $1 }) // should emit one input element ($1) every timer tick
            .eraseToAnyPublisher()
    }
}
This properly paces the first three events (1, 2, and 3), but not the second three (4, 5, and 6). The output:
value received: 1 at 3:54:07.0007
value received: 2 at 3:54:08.0008
value received: 3 at 3:54:09.0009
value received: 4 at 3:54:12.0012
value received: 5 at 3:54:12.0012
value received: 6 at 3:54:12.0012
I believe this is happening because zip has some internal buffering capacity. The first three upstream events are buffered and emitted on the Timer's cadence, but during the 6 second wait, the Timer's events are buffered, and when the second set of upstream events is fired, there are already Timer events waiting in the queue, so they're paired up and fired off immediately.
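That pairing behavior can be reproduced without waiting on a real Timer. The sketch below swaps the Timer for a PassthroughSubject called ticks (a stand-in assumed purely for illustration) so the order of events is deterministic. It assumes, as the output above suggests, that zip holds on to unpaired ticks when its downstream is a sink.

```swift
import Combine

// `ticks` stands in for the Timer publisher so the demo does not depend on real time.
let ticks = PassthroughSubject<Void, Never>()
let values = PassthroughSubject<Int, Never>()
var received: [Int] = []

let cancellable = ticks
    .zip(values) { _, value in value }
    .sink { received.append($0) }

// Three ticks arrive during a quiet period; zip stores them, waiting for partners.
ticks.send(())
ticks.send(())
ticks.send(())

// A burst of values arrives; each one should pair with a stored tick right away.
values.send(4)
values.send(5)
values.send(6)

print(received)
```

If the stored ticks are indeed consumed this way, all three values come out back to back, which matches the 3:54:12 lines in the output.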

This is an interesting problem. I played with various combinations of Timer.publish, buffer, zip, and throttle, but I couldn't get any combination to work quite the way you want. So let's write a custom subscriber.
What we'd really like is an API where, when we get an input from upstream, we also get the ability to control when the upstream delivers the next input. Something like this:
extension Publisher {
    /// Subscribe to me with a stepping function.
    /// - parameter stepper: A function I'll call with each of my inputs, and with my completion.
    ///   Each time I call this function with an input, I also give it a promise function.
    ///   I won't deliver the next input until the promise is called with a `.more` argument.
    /// - returns: An object you can use to cancel the subscription asynchronously.
    func step(with stepper: @escaping (StepEvent<Output, Failure>) -> ()) -> AnyCancellable {
        ???
    }
}
enum StepEvent<Input, Failure: Error> {
    /// Handle the Input. Call `StepPromise` when you're ready for the next Input,
    /// or to cancel the subscription.
    case input(Input, StepPromise)
    /// Upstream completed the subscription.
    case completion(Subscribers.Completion<Failure>)
}
/// The type of callback given to the stepper function to allow it to continue
/// or cancel the stream.
typealias StepPromise = (StepPromiseRequest) -> ()
enum StepPromiseRequest {
    // Pass this to the promise to request the next item from upstream.
    case more
    // Pass this to the promise to cancel the subscription.
    case cancel
}
With this step API, we can write a pace operator that does what you want:
extension Publisher {
    func pace<Context: Scheduler, MySubject: Subject>(
        _ pace: Context.SchedulerTimeType.Stride, scheduler: Context, subject: MySubject)
        -> AnyCancellable
        where MySubject.Output == Output, MySubject.Failure == Failure
    {
        return step {
            switch $0 {
            case .input(let input, let promise):
                // Send the input from upstream now.
                subject.send(input)
                // Wait for the pace interval to elapse before requesting the
                // next input from upstream.
                scheduler.schedule(after: scheduler.now.advanced(by: pace)) {
                    promise(.more)
                }
            case .completion(let completion):
                subject.send(completion: completion)
            }
        }
    }
}
This pace operator takes pace (the required interval between outputs), a scheduler on which to schedule events, and a subject on which to republish the inputs from upstream. It handles each input by sending it through subject, and then using the scheduler to wait for the pace interval before asking for the next input from upstream.
Now we just have to implement the step operator. Combine doesn't give us too much help here. It does have a feature called “backpressure”, which means a publisher cannot send an input downstream until the downstream has asked for it by sending a Subscribers.Demand upstream. Usually you see downstreams send an .unlimited demand upstream, but we're not going to. Instead, we're going to take advantage of backpressure. We won't send any demand upstream until the stepper completes a promise, and then we'll only send a demand of .max(1), so we make the upstream operate in lock-step with the stepper. (We also have to send an initial demand of .max(1) to start the whole process.)
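Before building the full subscriber, it may help to see backpressure in isolation. Here is a minimal sketch of a subscriber that asks for exactly one value up front and gets another only when told to. OneAtATimeSubscriber and requestNext() are names made up for this illustration, and the sketch skips the locking that a real implementation needs.

```swift
import Combine

/// A toy subscriber that never sends more than .max(1) demand at a time.
/// Not thread-safe; for illustration only.
final class OneAtATimeSubscriber<Input>: Subscriber {
    typealias Failure = Never

    private var subscription: Subscription?
    private(set) var received: [Input] = []

    func receive(subscription: Subscription) {
        self.subscription = subscription
        // Initial demand of one value to start the process.
        subscription.request(.max(1))
    }

    func receive(_ input: Input) -> Subscribers.Demand {
        received.append(input)
        // Returning .none means upstream must wait for an explicit request.
        return .none
    }

    func receive(completion: Subscribers.Completion<Never>) {}

    /// Ask upstream for exactly one more value.
    func requestNext() {
        subscription?.request(.max(1))
    }
}

let subscriber = OneAtATimeSubscriber<Int>()
[1, 2, 3].publisher.subscribe(subscriber)
let afterSubscribe = subscriber.received   // only the first value so far

subscriber.requestNext()
let afterRequest = subscriber.received     // the second value arrives on demand
print(afterSubscribe, afterRequest)
```

The sequence publisher here honors demand, so the remaining values stay upstream until requestNext() is called. The step operator relies on the same mechanism, with the promise playing the role of requestNext().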
Okay, so we need to implement a type that takes a stepper function and conforms to Subscriber. It's a good idea to review the Reactive Streams JVM Specification, because Combine is based on that specification.
What makes the implementation difficult is that several things can call into our subscriber asynchronously:
The upstream can call into the subscriber from any thread (but is required to serialize its calls).
After we've given promise functions to the stepper, the stepper can call those promises on any thread.
We want the subscription to be cancellable, and that cancellation can happen on any thread.
All this asynchronicity means we have to protect our internal state with a lock.
We have to be careful not to call out while holding that lock, to avoid deadlock.
We'll also protect the subscriber from shenanigans involving calling a promise repeatedly, or calling outdated promises, by giving each promise a unique id.
So here's our basic subscriber definition:
import Combine
import Foundation
public class SteppingSubscriber<Input, Failure: Error> {
    public init(stepper: @escaping Stepper) {
        l_state = .subscribing(stepper)
    }
    public typealias Stepper = (Event) -> ()
    public enum Event {
        case input(Input, Promise)
        case completion(Completion)
    }
    public typealias Promise = (Request) -> ()
    public enum Request {
        case more
        case cancel
    }
    public typealias Completion = Subscribers.Completion<Failure>
    private let lock = NSLock()
    // The l_ prefix means it must only be accessed while holding the lock.
    private var l_state: State
    private var l_nextPromiseId: PromiseId = 1
    private typealias PromiseId = Int
    private var noPromiseId: PromiseId { 0 }
}
Notice that I moved the auxiliary types from earlier (StepEvent, StepPromise, and StepPromiseRequest) into SteppingSubscriber and shortened their names.
Now let's consider l_state's mysterious type, State. What are all the different states our subscriber could be in?
We could be waiting to receive the Subscription object from upstream.
We could have received the Subscription from upstream and be waiting for a signal (an input or completion from upstream, or the completion of a promise from the stepper).
We could be calling out to the stepper, in which case we need to be careful, because it might complete a promise while we're still calling out to it.
We could have been cancelled or have received completion from upstream.
So here is our definition of State:
extension SteppingSubscriber {
private enum State {
// Completed or cancelled.
case dead
// Waiting for Subscription from upstream.
case subscribing(Stepper)
// Waiting for a signal from upstream or for the latest promise to be completed.
case subscribed(Subscribed)
// Calling out to the stepper.
case stepping(Stepping)
var subscription: Subscription? {
switch self {
case .dead: return nil
case .subscribing(_): return nil
case .subscribed(let subscribed): return subscribed.subscription
case .stepping(let stepping): return stepping.subscribed.subscription
}
}
struct Subscribed {
var stepper: Stepper
var subscription: Subscription
var validPromiseId: PromiseId
}
struct Stepping {
var subscribed: Subscribed
// If the stepper completes the current promise synchronously with .more,
// I set this to true.
var shouldRequestMore: Bool
}
}
}
Since we're using NSLock (for simplicity), let's define an extension to ensure we always match locking with unlocking:
fileprivate extension NSLock {
    @inline(__always)
    func sync<Answer>(_ body: () -> Answer) -> Answer {
        lock()
        defer { unlock() }
        return body()
    }
}
Now we're ready to handle some events. The easiest event to handle is asynchronous cancellation, which is the Cancellable protocol's only requirement. If we're in any state except .dead, we want to become .dead and, if there's an upstream subscription, cancel it.
extension SteppingSubscriber: Cancellable {
    public func cancel() {
        let sub: Subscription? = lock.sync {
            defer { l_state = .dead }
            return l_state.subscription
        }
        sub?.cancel()
    }
}
Notice here that I don't want to call out to the upstream subscription's cancel function while lock is locked, because lock isn't a recursive lock and I don't want to risk deadlock. All use of lock.sync follows the pattern of deferring any call-outs until after the lock is unlocked.
Now let's implement the Subscriber protocol requirements. First, let's handle receiving the Subscription from upstream. The only time this should happen is when we're in the .subscribing state, but .dead is also possible in which case we want to just cancel the upstream subscription.
extension SteppingSubscriber: Subscriber {
public func receive(subscription: Subscription) {
let action: () -> () = lock.sync {
guard case .subscribing(let stepper) = l_state else {
return { subscription.cancel() }
}
l_state = .subscribed(.init(stepper: stepper, subscription: subscription, validPromiseId: noPromiseId))
return { subscription.request(.max(1)) }
}
action()
}
Notice that in this use of lock.sync (and in all later uses), I return an “action” closure so I can perform arbitrary call-outs after the lock has been unlocked.
The next Subscriber protocol requirement we'll tackle is receiving a completion:
public func receive(completion: Subscribers.Completion<Failure>) {
let action: (() -> ())? = lock.sync {
// The only state in which I have to handle this call is .subscribed:
// - If I'm .dead, either upstream already completed (and shouldn't call this again),
// or I've been cancelled.
// - If I'm .subscribing, upstream must send me a Subscription before sending me a completion.
// - If I'm .stepping, upstream is currently signalling me and isn't allowed to signal
// me again concurrently.
guard case .subscribed(let subscribed) = l_state else {
return nil
}
l_state = .dead
return { [stepper = subscribed.stepper] in
stepper(.completion(completion))
}
}
action?()
}
The most complex Subscriber protocol requirement for us is receiving an Input:
We have to create a promise.
We have to pass the promise to the stepper.
The stepper could complete the promise before returning.
After the stepper returns, we have to check whether it completed the promise with .more and, if so, return the appropriate demand upstream.
Since we have to call out to the stepper in the middle of this work, we have some ugly nesting of lock.sync calls.
public func receive(_ input: Input) -> Subscribers.Demand {
let action: (() -> Subscribers.Demand)? = lock.sync {
// The only state in which I have to handle this call is .subscribed:
// - If I'm .dead, either upstream completed and shouldn't call this,
// or I've been cancelled.
// - If I'm .subscribing, upstream must send me a Subscription before sending me Input.
// - If I'm .stepping, upstream is currently signalling me and isn't allowed to
// signal me again concurrently.
guard case .subscribed(var subscribed) = l_state else {
return nil
}
let promiseId = l_nextPromiseId
l_nextPromiseId += 1
let promise: Promise = { request in
self.completePromise(id: promiseId, request: request)
}
subscribed.validPromiseId = promiseId
l_state = .stepping(.init(subscribed: subscribed, shouldRequestMore: false))
return { [stepper = subscribed.stepper] in
stepper(.input(input, promise))
let demand: Subscribers.Demand = self.lock.sync {
// The only possible states now are .stepping and .dead.
guard case .stepping(let stepping) = self.l_state else {
return .none
}
self.l_state = .subscribed(stepping.subscribed)
return stepping.shouldRequestMore ? .max(1) : .none
}
return demand
}
}
return action?() ?? .none
}
} // end of extension SteppingSubscriber: Subscriber
The last thing our subscriber needs to handle is the completion of a promise. This is complicated for several reasons:
We want to protect against a promise being completed multiple times.
We want to protect against an older promise being completed.
We can be in any state when a promise is completed.
Thus:
extension SteppingSubscriber {
private func completePromise(id: PromiseId, request: Request) {
let action: (() -> ())? = lock.sync {
switch l_state {
case .dead, .subscribing(_): return nil
case .subscribed(var subscribed) where subscribed.validPromiseId == id && request == .more:
subscribed.validPromiseId = noPromiseId
l_state = .subscribed(subscribed)
return { [sub = subscribed.subscription] in
sub.request(.max(1))
}
case .subscribed(let subscribed) where subscribed.validPromiseId == id && request == .cancel:
l_state = .dead
return { [sub = subscribed.subscription] in
sub.cancel()
}
case .subscribed(_):
// Multiple completion or stale promise.
return nil
case .stepping(var stepping) where stepping.subscribed.validPromiseId == id && request == .more:
stepping.subscribed.validPromiseId = noPromiseId
stepping.shouldRequestMore = true
l_state = .stepping(stepping)
return nil
case .stepping(let stepping) where stepping.subscribed.validPromiseId == id && request == .cancel:
l_state = .dead
return { [sub = stepping.subscribed.subscription] in
sub.cancel()
}
case .stepping(_):
// Multiple completion or stale promise.
return nil
}
}
action?()
}
}
Whew!
With all that done, we can write the real step operator:
extension Publisher {
    func step(with stepper: @escaping (SteppingSubscriber<Output, Failure>.Event) -> ()) -> AnyCancellable {
        let subscriber = SteppingSubscriber<Output, Failure>(stepper: stepper)
        self.subscribe(subscriber)
        return .init(subscriber)
    }
}
And then we can try out that pace operator from above. Since we don't do any buffering in SteppingSubscriber, and the upstream in general isn't buffered, we'll stick a buffer in between the upstream and our pace operator.
var cans: [AnyCancellable] = []
func application(_ application: UIApplication, didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?) -> Bool {
let erratic = Just("A").delay(for: 0.0, tolerance: 0.001, scheduler: DispatchQueue.main).eraseToAnyPublisher()
.merge(with: Just("B").delay(for: 0.3, tolerance: 0.001, scheduler: DispatchQueue.main).eraseToAnyPublisher())
.merge(with: Just("C").delay(for: 0.6, tolerance: 0.001, scheduler: DispatchQueue.main).eraseToAnyPublisher())
.merge(with: Just("D").delay(for: 5.0, tolerance: 0.001, scheduler: DispatchQueue.main).eraseToAnyPublisher())
.merge(with: Just("E").delay(for: 5.3, tolerance: 0.001, scheduler: DispatchQueue.main).eraseToAnyPublisher())
.merge(with: Just("F").delay(for: 5.6, tolerance: 0.001, scheduler: DispatchQueue.main).eraseToAnyPublisher())
.handleEvents(
receiveOutput: { print("erratic: \(Double(DispatchTime.now().rawValue) / 1_000_000_000) \($0)") },
receiveCompletion: { print("erratic: \(Double(DispatchTime.now().rawValue) / 1_000_000_000) \($0)") }
)
.makeConnectable()
let subject = PassthroughSubject<String, Never>()
cans += [erratic
.buffer(size: 1000, prefetch: .byRequest, whenFull: .dropOldest)
.pace(.seconds(1), scheduler: DispatchQueue.main, subject: subject)]
cans += [subject.sink(
receiveCompletion: { print("paced: \(Double(DispatchTime.now().rawValue) / 1_000_000_000) \($0)") },
receiveValue: { print("paced: \(Double(DispatchTime.now().rawValue) / 1_000_000_000) \($0)") }
)]
let c = erratic.connect()
cans += [AnyCancellable { c.cancel() }]
return true
}
And here, at long last, is the output:
erratic: 223394.17115897 A
paced: 223394.171495405 A
erratic: 223394.408086369 B
erratic: 223394.739186984 C
paced: 223395.171615624 B
paced: 223396.27056174 C
erratic: 223399.536717127 D
paced: 223399.536782847 D
erratic: 223399.536834495 E
erratic: 223400.236808469 F
erratic: 223400.236886323 finished
paced: 223400.620542561 E
paced: 223401.703613078 F
paced: 223402.703828512 finished
Timestamps are in units of seconds.
The erratic publisher's timings are, indeed, erratic and sometimes close in time.
The paced timings are always at least one second apart even when the erratic events occur less than one second apart.
When an erratic event occurs more than one second after the prior event, the paced event is sent immediately following the erratic event without further delay.
The paced completion occurs one second after the last paced event, even though the erratic completion occurs immediately after the last erratic event. The buffer doesn't send the completion until it receives another demand after it sends the last event, and that demand is delayed by the pacing timer.
I've put the entire implementation of the step operator in this gist for easy copy/paste.

EDIT
There's an even simpler approach than the original one outlined below. It doesn't require a pacer, and instead uses the back-pressure created by flatMap(maxPublishers: .max(1)).
flatMap sends a demand of 1 upstream and doesn't ask for the next value until the publisher returned from its closure completes, and we can delay that publisher. We'd need a buffer operator upstream to hold on to the values in the meantime.
// for demo purposes, this subject sends a Date:
let subject = PassthroughSubject<Date, Never>()
let interval = 1.0
let pub = subject
    .buffer(size: .max, prefetch: .byRequest, whenFull: .dropNewest)
    .flatMap(maxPublishers: .max(1)) {
        Just($0)
            .delay(for: .seconds(interval), scheduler: DispatchQueue.main)
    }
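One thing to be aware of: because the Just itself is delayed, every value, including the first one after a quiet period, arrives one interval late. If that matters, a possible variant is to emit the value right away and delay only the completion of the inner publisher. The sketch below packages that idea as an operator; the name spaced(every:scheduler:bufferSize:) and the default buffer size of 1000 are assumptions for illustration, and it relies on delay also postponing the completion of an Empty publisher.

```swift
import Combine
import Foundation

extension Publisher {
    /// Hypothetical operator: republishes values at least `interval` apart.
    /// Each inner publisher emits its value immediately, then finishes only after
    /// `interval`, which holds back flatMap's next request to the buffer.
    func spaced<S: Scheduler>(
        every interval: S.SchedulerTimeType.Stride,
        scheduler: S,
        bufferSize: Int = 1000
    ) -> AnyPublisher<Output, Failure> {
        buffer(size: bufferSize, prefetch: .byRequest, whenFull: .dropNewest)
            .flatMap(maxPublishers: .max(1)) { value in
                Just(value)
                    .setFailureType(to: Failure.self)
                    .append(
                        Empty<Output, Failure>(completeImmediately: true)
                            .delay(for: interval, scheduler: scheduler)
                    )
            }
            .eraseToAnyPublisher()
    }
}

// Usage sketch: same demo subject as above, paced one second apart.
let demoSubject = PassthroughSubject<Date, Never>()
let demoCancellable = demoSubject
    .spaced(every: .seconds(1), scheduler: DispatchQueue.main)
    .sink { print("\($0): \(Date())") }
```

I haven't measured the timings of this variant, so treat it as a starting point rather than a drop-in replacement.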
ORIGINAL
I know this is an old question, but I think there's a much simpler way to implement this, so I thought I'd share.
The idea is similar to a .zip with a Timer, except instead of a Timer, you would .zip with a time-delayed "tick" from a previously sent value, which can be achieved with a CurrentValueSubject. CurrentValueSubject is needed instead of a PassthroughSubject in order to seed the first ever "tick".
// for demo purposes, this subject sends a Date:
let subject = PassthroughSubject<Date, Never>()
let pacer = CurrentValueSubject<Void, Never>(())
let interval = 1.0
let pub = subject.zip(pacer)
.flatMap { v in
Just(v.0) // extract the original value
.delay(for: .seconds(interval), scheduler: DispatchQueue.main)
.handleEvents(receiveOutput: { _ in
pacer.send() // send the pacer "tick" after the interval
})
}
What happens is that the .zip gates on the pacer, which only arrives after a delay from a previously sent value.
If the next value comes earlier than the allowed interval, it waits for the pacer.
If, however, the next value comes later, then the pacer already has a new value to provide instantly, so there would be no delay.
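To see the gating on its own, here is a small sketch that leaves out the delay and sends the pacer tick by hand. The variable names are made up for the illustration.

```swift
import Combine

let values = PassthroughSubject<String, Never>()
let pacer = CurrentValueSubject<Void, Never>(())   // seeded with the first "tick"
var delivered: [String] = []

let cancellable = values
    .zip(pacer)
    .map { $0.0 }                                  // keep only the original value
    .sink { delivered.append($0) }

values.send("a")                 // pairs with the seed tick, so it passes straight through
values.send("b")                 // no tick available: zip holds it
let beforeTick = delivered

pacer.send(())                   // the tick that the delayed handleEvents would normally send
let afterTick = delivered

print(beforeTick, afterTick)
```

In the real pipeline the manual pacer.send(()) is replaced by the one inside handleEvents, which fires one interval after each value goes out.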
If you used it like in your test case:
let c = pub.sink { print("\($0): \(Date())") }
subject.send(Date())
subject.send(Date())
subject.send(Date())
DispatchQueue.main.asyncAfter(deadline: .now() + 1.0) {
subject.send(Date())
subject.send(Date())
}
DispatchQueue.main.asyncAfter(deadline: .now() + 10.0) {
subject.send(Date())
subject.send(Date())
}
the result would be something like this:
2020-06-23 19:15:21 +0000: 2020-06-23 19:15:21 +0000
2020-06-23 19:15:21 +0000: 2020-06-23 19:15:22 +0000
2020-06-23 19:15:21 +0000: 2020-06-23 19:15:23 +0000
2020-06-23 19:15:22 +0000: 2020-06-23 19:15:24 +0000
2020-06-23 19:15:22 +0000: 2020-06-23 19:15:25 +0000
2020-06-23 19:15:32 +0000: 2020-06-23 19:15:32 +0000
2020-06-23 19:15:32 +0000: 2020-06-23 19:15:33 +0000

Could Publishers.CollectByTime be useful here somewhere?
Publishers.CollectByTime(upstream: upstreamPublisher.share(), strategy: Publishers.TimeGroupingStrategy.byTime(RunLoop.main, .seconds(1)), options: nil)

Just wanted to mention that I adapted Rob's answer from earlier and converted it to a custom Publisher, in order to allow for a single unbroken pipeline (see comments below his solution). My adaptation is below, but all the credit still goes to him. It also still makes use of Rob's step operator and SteppingSubscriber, as this custom Publisher uses those internally.
Edit: updated with buffer as part of the modulated operator, otherwise that would be required to be attached to buffer the upstream events.
public extension Publisher {
func modulated<Context: Scheduler>(_ pace: Context.SchedulerTimeType.Stride, scheduler: Context) -> AnyPublisher<Output, Failure> {
let upstream = buffer(size: 1000, prefetch: .byRequest, whenFull: .dropNewest).eraseToAnyPublisher()
return PacePublisher<Context, AnyPublisher>(pace: pace, scheduler: scheduler, source: upstream).eraseToAnyPublisher()
}
}
final class PacePublisher<Context: Scheduler, Source: Publisher>: Publisher {
typealias Output = Source.Output
typealias Failure = Source.Failure
let subject: PassthroughSubject<Output, Failure>
let scheduler: Context
let pace: Context.SchedulerTimeType.Stride
lazy var internalSubscriber: SteppingSubscriber<Output, Failure> = SteppingSubscriber<Output, Failure>(stepper: stepper)
lazy var stepper: ((SteppingSubscriber<Output, Failure>.Event) -> ()) = {
switch $0 {
case .input(let input, let promise):
// Send the input from upstream now.
self.subject.send(input)
// Wait for the pace interval to elapse before requesting the
// next input from upstream.
self.scheduler.schedule(after: self.scheduler.now.advanced(by: self.pace)) {
promise(.more)
}
case .completion(let completion):
self.subject.send(completion: completion)
}
}
init(pace: Context.SchedulerTimeType.Stride, scheduler: Context, source: Source) {
self.scheduler = scheduler
self.pace = pace
self.subject = PassthroughSubject<Source.Output, Source.Failure>()
source.subscribe(internalSubscriber)
}
public func receive<S>(subscriber: S) where S : Subscriber, Failure == S.Failure, Output == S.Input {
subject.subscribe(subscriber)
subject.send(subscription: PaceSubscription(subscriber: subscriber))
}
}
public class PaceSubscription<S: Subscriber>: Subscription {
private var subscriber: S?
init(subscriber: S) {
self.subscriber = subscriber
}
public func request(_ demand: Subscribers.Demand) {
}
public func cancel() {
subscriber = nil
}
}

Related

iOS Combine Start new request only if previous has finished

I have a network request that triggers every time the last cell in SwiftUI appears. Sometimes, if the user scrolls down and then up fast enough, the request will trigger again before the first one finishes. Without Combine or a reactive approach, I have a completion block and a Bool value to handle this:
public func load() {
guard !isLoadingPosts else { return }
isLoadingPosts = true
postsDataProvider.loadMorePosts { _ in
self.isLoadingPosts = false
}
}
I was wondering if this can be resolved more elegantly with Combine, without the need for a Bool value. For example, could the request execute only if the previous one has finished?
It looks like you want to skip making the call if it's already in progress.
Since you didn't share any of the Combine code you might have, I'll assume that you have a publisher-returning function like this:
func loadMorePosts() -> AnyPublisher<[Post], Error> {
//...
}
Then you can use a subject to initiate a load call, followed by a flatMap(maxPublishers:_:) downstream with the number of publishers limited to 1:
let loadSubject = PassthroughSubject<Void, Never>()
loadSubject
.flatMap(maxPublishers: .max(1)) {
loadMorePosts()
}
.sink(
receiveCompletion: { _ in },
receiveValue: { posts in
// update posts
})
.store(in: &cancellables)
The above pipeline subscribes to the subject, but if another value arrives before flatMap is ready to receive it, it would simply be dropped.
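Here is a small sketch of that dropping behavior. It replaces loadMorePosts() with hand-driven PassthroughSubjects (an assumption made so the "request" finishes exactly when we say so) and counts how many requests were actually started.

```swift
import Combine

let trigger = PassthroughSubject<Void, Never>()
var inFlight: [PassthroughSubject<String, Never>] = []  // stand-ins for network requests
var startedRequests = 0
var results: [String] = []

let cancellable = trigger
    .flatMap(maxPublishers: .max(1)) { _ -> PassthroughSubject<String, Never> in
        startedRequests += 1
        let request = PassthroughSubject<String, Never>()
        inFlight.append(request)
        return request
    }
    .sink { results.append($0) }

trigger.send(())   // starts request 1
trigger.send(())   // ignored: request 1 has not completed yet

// Request 1 delivers its result and finishes.
inFlight[0].send("page 1")
inFlight[0].send(completion: .finished)

trigger.send(())   // starts request 2

print(startedRequests, results)
```

The second send is lost because a PassthroughSubject does not store values while its subscriber has no outstanding demand.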
Then the load function becomes:
func load() {
loadSubject.send(())
}

How can I branch out multiple API calls from the result of one API call and collect them after all are finished with Combine?

So, I have this sequence of API calls, where I fetch an employee's details, then fetch the details of the company and project that the employee is associated with. After both fetches are complete, I combine both and publish a fetchCompleted event. I've isolated the relevant code below.
func getUserDetails() -> AnyPublisher<UserDetails, Error>
func getCompanyDetails(user: UserDetails) -> AnyPublisher<CompanyDetails, Error>
func getProjectDetails(user: UserDetails) -> AnyPublisher<ProjectDetails, Error>
If I do this,
func getCompleteUserDetails() -> AnyPublisher<UserFetchState, Never> {
let cvs = CurrentValueSubject<UserFetchState, Error>(.initial)
let companyPublisher = getUserDetails()
.flatMap { getCompanyDetails($0) }
let projectPublisher = getUserDetails()
.flatMap { getProjectDetails($0) }
companyPublisher.combineLatest(projectPublisher)
.sink { cvs.send(.fetchComplete) }
return cvs.eraseToAnyPublisher()
}
getUserDetails() will get called twice. What I need is to fetch the userDetails once and, with that, branch the stream into two, map it to fetch the company details and project details, and re-combine both.
Is there an elegant (flatter) way to do the following?
func getCompleteUserDetails() -> AnyPublisher<UserFetchState, Never> {
let cvs = CurrentValueSubject<UserFetchState, Error>(.initial)
getUserDetails()
.sink {
let companyPublisher = getCompanyDetails($0)
let projectPublisher = getProjectDetails($0)
companyPublisher.combineLatest(projectPublisher)
.sink { cvs.send(.fetchComplete) }
}
return cvs.eraseToAnyPublisher()
}
The whole idea of Combine is that you construct a pipeline down which data flows. Actually what flows down can be a value or a completion, where a completion could be a failure (error). So:
You do not need to make a signal that the pipeline has produced its value; the arrival of that value at the end of the pipeline is that signal.
Similarly, you do not need to make a signal that the pipeline's work has completed; a publisher that has produced all the values it is going to produce produces the completion signal automatically, so the arrival of that completion at the end of the pipeline is that signal.
After all, when you receive a letter, the post office doesn't call you up on the phone and say, "You've got mail." Rather, the postman hands you the letter. You don't need to be told you've received a letter; you simply receive it.
Okay, let's demonstrate. The key to understanding your own pipeline is simply to track what kind of value is traveling down it at any given juncture. So let's construct a model pipeline that does the sort of thing you need done. I will posit three types of value:
struct User {
}
struct Project {
}
struct Company {
}
And I will imagine that it is possible to go online and fetch all of that information: the User independently, and the Project and Company based on information contained in the User. I will simulate that by providing utility functions that return publishers for each type of information; in real life these would probably be deferred futures, but I will simply use Just to keep things simple:
func makeUserFetcherPublisher() -> AnyPublisher<User,Error> {
Just(User()).setFailureType(to: Error.self).eraseToAnyPublisher()
}
func makeProjectFetcherPublisher(user:User) -> AnyPublisher<Project,Error> {
Just(Project()).setFailureType(to: Error.self).eraseToAnyPublisher()
}
func makeCompanyFetcherPublisher(user:User) -> AnyPublisher<Company,Error> {
Just(Company()).setFailureType(to: Error.self).eraseToAnyPublisher()
}
Now then, let's construct our pipeline. I take it that our goal is to produce, as the final value in the pipeline, all the information we have collected: the User, the Project, and the Company. So our final output will be a tuple of those three things. (Tuples are important when you are doing Combine stuff. Passing a tuple down the pipeline is extremely common.)
Okay, let's get started. In the beginning there is nothing, so we need an initial publisher to kick off the process. That will be our user fetcher:
let myWonderfulPipeline = self.makeUserFetcherPublisher()
What's coming out the end of that pipeline is a User. We now want to feed that User into the next two publishers, fetching the corresponding Project and Company. The way to insert a publisher into the middle of a pipeline is with flatMap. And remember, our goal is to produce the tuple of all our info. So:
let myWonderfulPipeline = self.makeUserFetcherPublisher()
// at this point, the value is a User
.flatMap { (user:User) -> AnyPublisher<(User,Project,Company), Error> in
// ?
}
// at this point, the value is a tuple: (User,Project,Company)
So what goes into flatMap, where the question mark is? Well, we must produce a publisher that produces the tuple we have promised. The tuple-making publisher par excellence is Zip. We have three values in our tuple, so this is a Zip3:
let myWonderfulPipeline = self.makeUserFetcherPublisher()
.flatMap { (user:User) -> AnyPublisher<(User,Project,Company), Error> in
// ?
let result = Publishers.Zip3(/* ? */)
return result.eraseToAnyPublisher()
}
So what are we zipping? We must zip publishers. Well, we know two of those publishers — they are the publishers we have already defined!
let myWonderfulPipeline = self.makeUserFetcherPublisher()
.flatMap { (user:User) -> AnyPublisher<(User,Project,Company), Error> in
let pub1 = self.makeProjectFetcherPublisher(user: user)
let pub2 = self.makeCompanyFetcherPublisher(user: user)
// ?
let result = Publishers.Zip3(/* ? */, pub1, pub2)
return result.eraseToAnyPublisher()
}
We're almost done! What goes in the missing slot? Remember, it must be a publisher. And what's our goal? We want to pass on the very same User that arrived from upstream. And what's the publisher that does that? It's Just! So:
let myWonderfulPipeline = self.makeUserFetcherPublisher()
    .flatMap { (user:User) -> AnyPublisher<(User,Project,Company), Error> in
        let pub1 = self.makeProjectFetcherPublisher(user: user)
        let pub2 = self.makeCompanyFetcherPublisher(user: user)
        let just = Just(user).setFailureType(to:Error.self)
        let result = Publishers.Zip3(just, pub1, pub2)
        return result.eraseToAnyPublisher()
    }
And we're done. No muss no fuss. This is a pipeline that produces a (User,Project,Company) tuple. Whoever subscribes to this pipeline does not need some extra signal; the arrival of the tuple is the signal. And now the subscriber can do something with that info. Let's create the subscriber:
myWonderfulPipeline.sink { completion in
    if case .failure(let error) = completion {
        print("error:", error)
    }
} receiveValue: { user, project, company in
    print(user, project, company)
}.store(in: &self.storage)
We didn't do anything very interesting — we simply printed the tuple contents. But you see, in real life the subscriber would now do something useful with that data.
You can use the zip operator to get a Publisher that emits a value whenever both of its upstreams have emitted a value, and so zip together getCompanyDetails and getProjectDetails.
You also don't need a Subject to signal that the fetch has finished; you can just call map on the result of the flatMap.
func getCompleteUserDetails() -> AnyPublisher<UserFetchState, Error> {
getUserDetails()
.flatMap { getCompanyDetails(user: $0).zip(getProjectDetails(user: $0)) }
.map { _ in UserFetchState.fetchComplete }
.eraseToAnyPublisher()
}
However, you shouldn't need a UserFetchState to signal the state of your pipeline (and especially shouldn't throw away the fetched CompanyDetails and ProjectDetails objects in the middle of your pipeline). You should simply return the fetched CompanyDetails and ProjectDetails as the result of your flatMap.
func getCompleteUserDetails() -> AnyPublisher<(CompanyDetails, ProjectDetails), Error> {
getUserDetails()
.flatMap { getCompanyDetails(user: $0).zip(getProjectDetails(user: $0)) }
.eraseToAnyPublisher()
}
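If the subscriber also needs the user itself, the same zip-based pipeline can carry it along with a trailing map. The following is only a sketch under assumptions: the three model structs and the Just-based stub fetchers are hypothetical stand-ins for the real types and network calls.

```swift
import Combine

// Hypothetical stand-ins for the real model types.
struct UserDetails { let id: Int }
struct CompanyDetails { let name: String }
struct ProjectDetails { let title: String }

// Stub fetchers (assumed); real ones would perform network requests.
func getUserDetails() -> AnyPublisher<UserDetails, Error> {
    Just(UserDetails(id: 1)).setFailureType(to: Error.self).eraseToAnyPublisher()
}
func getCompanyDetails(user: UserDetails) -> AnyPublisher<CompanyDetails, Error> {
    Just(CompanyDetails(name: "Company \(user.id)"))
        .setFailureType(to: Error.self).eraseToAnyPublisher()
}
func getProjectDetails(user: UserDetails) -> AnyPublisher<ProjectDetails, Error> {
    Just(ProjectDetails(title: "Project \(user.id)"))
        .setFailureType(to: Error.self).eraseToAnyPublisher()
}

typealias CompleteDetails = (UserDetails, CompanyDetails, ProjectDetails)

func getCompleteUserDetails() -> AnyPublisher<CompleteDetails, Error> {
    getUserDetails()
        .flatMap { user -> AnyPublisher<CompleteDetails, Error> in
            // Zip the two dependent fetches, then put the user back into the tuple.
            getCompanyDetails(user: user)
                .zip(getProjectDetails(user: user))
                .map { company, project in (user, company, project) }
                .eraseToAnyPublisher()
        }
        .eraseToAnyPublisher()
}

// The stubs emit synchronously, so the value arrives during the sink call.
var received: [CompleteDetails] = []
let cancellable = getCompleteUserDetails()
    .sink(receiveCompletion: { _ in },
          receiveValue: { received.append($0) })
```

With real asynchronous fetchers the tuple would arrive later, but the shape of the pipeline stays the same.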

How do I stop RxSwift ble scanner once it has found a match?

I have a ble scanner that works and looks like this:
func scan(serviceId: String) -> Observable<[BleHandler.BlePeripheral]> {
knownDevices = []
return waitForBluetooth()
.flatMap { _ in self.scanForPeripheral(serviceId: serviceId) }
.map { _ in self.knownDevices }
}
private func waitForBluetooth() -> Observable<BluetoothState> {
return self.manager
.observeState()
.startWith(self.manager.state)
.filter { $0 == .poweredOn }
.take(1)
}
Then in the viewModel class it filters matches from core data:
func scanAndFilter() -> Observable<[LocalDoorCoreDataObject]> {
let persistingDoors: [LocalDoorCoreDataObject] = coreDataHandler.fetchAll(fetchRequest: NSFetchRequest<LocalDoorCoreDataObject>(entityName: "LocalDoorCoreDataObject"))
return communicationService
.scanForDevices(register: false)
.map{ peripherals in
print("🐶 THIS WILL GO ON FOR ETERNITY", peripherals.count)
self.knownDevices = peripherals
return persistingDoors
.filter { door in peripherals.contains(where: { $0.identifier.uuidString == door.dPeripheralId }) }
}
}
And in the view I want to connect when the scan is completed:
private func scanAndConnect(data: LocalDoorCoreDataObject) {
viewModel.scanRelay().subscribe(
onNext: {
print("🐶SCANNED NAME", $0.first?.dName)},
onCompleted: {
print("🐶COMPLETED SCAN")
self.connectToFilteredPeripheral(localDoor: data)
}).disposed(by: disposeBag)
}
It never reaches onCompleted, because it just keeps scanning forever, even after it has found and filtered the Core Data match. In Apple's CoreBluetooth framework I could simply call manager.stopScan() once it has found what I want, but that doesn't seem to be available on the Rx counterpart. How does this work in RxSwift?
You can create a new Observable that looks for devices and then completes as soon as it finds the device(s) you're looking for. This would be something like:
func scanAndFilter() -> Observable<[LocalDoorCoreDataObject]> {
    return Observable.deferred {
        let persistingDoors: [LocalDoorCoreDataObject] = coreDataHandler.fetchAll(fetchRequest: NSFetchRequest<LocalDoorCoreDataObject>(entityName: "LocalDoorCoreDataObject"))
        return communicationService
            .scanForDevices(register: false)
            .filter { /* verify if the device(s) you're looking for is/are in this list */ }
            .take(1)
    }
}
The filter operator will make sure that only lists that contain the device you're looking for are passed on and the take(1) operator will take the first emitted value and complete immediately.
The deferred call makes sure that the fetch request that is performed in the first line is not executed when you call scanAndFilter() but only when somebody actually subscribes to the resulting Observable.
If you only want one event to exit the filter operator, then just use .take(1). The Observable will shut down after it emits a single value. If the BLE function is written correctly, it will call stopScan() when the Disposable is disposed of.
I have no idea why the other answer says to "always make sure to wrap all function that return Observables into a .deferred". I've been using RxSwift since 2015 and I've only ever needed deferred once, and certainly not every time I called a function that returned an Observable.
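The point about disposal can be sketched as follows. This assumes a hypothetical FakeScanner class standing in for whatever wraps CBCentralManager; the only RxSwift calls used are Observable.create, Disposables.create, filter, take and subscribe. When take(1) completes, the subscription is disposed and the closure passed to Disposables.create runs, which is where stopScan() belongs.

```swift
import RxSwift

// Hypothetical stand-in for a wrapper around CBCentralManager.
final class FakeScanner {
    private(set) var isScanning = false
    var onDiscover: ((String) -> Void)?

    func startScan() { isScanning = true }
    func stopScan() {
        isScanning = false
        onDiscover = nil
    }
    // Test helper: pretend a peripheral with this name was discovered.
    func simulateDiscovery(of name: String) { onDiscover?(name) }
}

func scan(using scanner: FakeScanner) -> Observable<String> {
    Observable.create { observer in
        scanner.onDiscover = { observer.onNext($0) }
        scanner.startScan()
        // Runs on disposal, including the disposal triggered by take(1).
        return Disposables.create { scanner.stopScan() }
    }
}

let scanner = FakeScanner()
var found: [String] = []
let subscription = scan(using: scanner)
    .filter { $0 == "door-42" }   // assumed name of the device we want
    .take(1)
    .subscribe(onNext: { found.append($0) })

scanner.simulateDiscovery(of: "other")
scanner.simulateDiscovery(of: "door-42")
```

After the second simulated discovery the scanner should no longer be scanning, without any explicit call from the subscriber.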

Combine framework retry after delay?

I see how to use .retry directly, to resubscribe after an error, like this:
URLSession.shared.dataTaskPublisher(for:url)
.retry(3)
But that seems awfully simple-minded. What if I think that this error might go away if I wait awhile? I could insert a .delay operator, but then the delay operates even if there is no error. And there doesn't seem to be a way to apply an operator conditionally (i.e. only when there's an error).
I see how I could work around this by writing a RetryWithDelay operator from scratch, and indeed such an operator has been written by third parties. But is there a way to say "delay if there's an error", purely using the operators we're given?
My thought was that I could use .catch, because its function runs only if there is an error. But the function needs to return a publisher, and what publisher would we use? If we return somePublisher.delay(...) followed by .retry, we'd be applying .retry to the wrong publisher, wouldn't we?
It was a topic of conversation on the Using Combine project repo a while back - the whole thread: https://github.com/heckj/swiftui-notes/issues/164.
The long and short was we made an example that I think does what you want, although it does use catch:
let resultPublisher = upstreamPublisher.catch { error -> AnyPublisher<String, Error> in
    return Publishers.Delay(upstream: upstreamPublisher,
                            interval: 3,
                            tolerance: 1,
                            scheduler: DispatchQueue.global())
    // moving retry into this block reduces the number of duplicate requests
    // In effect, there's the original request, and the `retry(2)` here will operate
    // two additional retries on the otherwise one-shot publisher that is initiated with
    // the `Publishers.Delay()` just above. Just starting this publisher with delay makes
    // an additional request, so the total number of requests ends up being 4 (assuming all
    // fail). However, no delay is introduced in this sequence if the original request
    // is successful.
    .retry(2)
    .eraseToAnyPublisher()
}
This references a retry pattern I have in the book/online, which is basically what you describe (but wasn't what you asked about).
The person I was corresponding with on the issue provided a variant in that thread as an extension that might be interesting as well:
extension Publisher {
    func retryWithDelay<T, E>()
        -> Publishers.Catch<Self, AnyPublisher<T, E>> where T == Self.Output, E == Self.Failure
    {
        return self.catch { error -> AnyPublisher<T, E> in
            return Publishers.Delay(
                upstream: self,
                interval: 3,
                tolerance: 1,
                scheduler: DispatchQueue.global()).retry(2).eraseToAnyPublisher()
        }
    }
}
I found a few quirks with the implementations in the accepted answer.
Firstly, the first two attempts are fired off without a delay, since the first delay only takes effect after the second attempt.
Secondly, if any one of the retry attempts succeeds, the output value is also delayed, which seems unnecessary.
Thirdly, the extension is not flexible enough to let the caller decide which scheduler the retry attempts should be dispatched on.
After some tinkering back and forth I ended up with a solution like this:
public extension Publisher {
/**
Creates a new publisher which will upon failure retry the upstream publisher a provided number of times, with the provided delay between retry attempts.
If the upstream publisher succeeds the first time this is bypassed and proceeds as normal.
- Parameters:
- retries: The number of times to retry the upstream publisher.
- delay: Delay in seconds between retry attempts.
- scheduler: The scheduler to dispatch the delayed events.
- Returns: A new publisher which will retry the upstream publisher with a delay upon failure.
~~~
let url = URL(string: "https://api.myService.com")!
URLSession.shared.dataTaskPublisher(for: url)
.retryWithDelay(retries: 4, delay: 5, scheduler: DispatchQueue.global())
.sink { completion in
switch completion {
case .finished:
print("Success 😊")
case .failure(let error):
print("The last and final failure after retry attempts: \(error)")
}
} receiveValue: { output in
print("Received value: \(output)")
}
.store(in: &cancellables)
~~~
*/
func retryWithDelay<S>(
retries: Int,
delay: S.SchedulerTimeType.Stride,
scheduler: S
) -> AnyPublisher<Output, Failure> where S: Scheduler {
self
.delayIfFailure(for: delay, scheduler: scheduler)
.retry(retries)
.eraseToAnyPublisher()
}
private func delayIfFailure<S>(
for delay: S.SchedulerTimeType.Stride,
scheduler: S
) -> AnyPublisher<Output, Failure> where S: Scheduler {
self.catch { error in
Future { completion in
scheduler.schedule(after: scheduler.now.advanced(by: delay)) {
completion(.failure(error))
}
}
}
.eraseToAnyPublisher()
}
}
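When comparing retry variants like the ones above, it helps to count how many times the upstream is actually subscribed. Here is a minimal sketch of such a harness, assuming a hypothetical flaky publisher (the names FlakyError, attempts and flaky are made up for illustration) and using only the built-in retry operator; a delayed variant could be dropped in where .retry(2) appears.

```swift
import Combine

struct FlakyError: Error {}

var attempts = 0

// Deferred creates a fresh Future per subscription, so every retry
// re-runs the closure. It fails on the first two attempts, then succeeds.
let flaky = Deferred {
    Future<String, Error> { promise in
        attempts += 1
        if attempts < 3 {
            promise(.failure(FlakyError()))
        } else {
            promise(.success("ok"))
        }
    }
}

var values: [String] = []
var didFail = false

let cancellable = flaky
    .retry(2) // one original attempt plus up to two resubscriptions
    .sink(receiveCompletion: { completion in
        if case .failure = completion { didFail = true }
    }, receiveValue: { values.append($0) })
```

Because everything here runs synchronously, attempts, values and didFail can be inspected immediately after the sink call; with a delayed retry operator the check would have to wait for the scheduler.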
I remembered that the RxSwiftExt library had a really nice implementation of a custom retry + delay operator with many options (linear and exponential delay, plus an option to provide a custom closure) and I tried to recreate it in Combine. The original implementation is here.
/**
Provides the retry behavior that will be used - the number of retries and the delay between two subsequent retries.
- `.immediate`: It will immediately retry for the specified retry count
- `.delayed`: It will retry for the specified retry count, adding a fixed delay between each retry
- `.exponentialDelayed`: It will retry for the specified retry count.
The delay will be incremented by the provided multiplier after each iteration
(`multiplier = 0.5` corresponds to 50% increase in time between each retry)
- `.custom`: It will retry for the specified retry count. The delay will be calculated by the provided custom closure.
The closure's argument is the current retry
*/
enum RetryBehavior<S> where S: Scheduler {
case immediate(retries: UInt)
case delayed(retries: UInt, time: TimeInterval)
case exponentialDelayed(retries: UInt, initial: TimeInterval, multiplier: Double)
case custom(retries: UInt, delayCalculator: (UInt) -> TimeInterval)
}
fileprivate extension RetryBehavior {
func calculateConditions(_ currentRetry: UInt) -> (maxRetries: UInt, delay: S.SchedulerTimeType.Stride) {
switch self {
case let .immediate(retries):
// If immediate, returns 0.0 for delay
return (maxRetries: retries, delay: .zero)
case let .delayed(retries, time):
// Returns the fixed delay specified by the user
return (maxRetries: retries, delay: .seconds(time))
case let .exponentialDelayed(retries, initial, multiplier):
// If it is the first retry the initial delay is used, otherwise it is calculated
let delay = currentRetry == 1 ? initial : initial * pow(1 + multiplier, Double(currentRetry - 1))
return (maxRetries: retries, delay: .seconds(delay))
case let .custom(retries, delayCalculator):
// Calculates the delay with the custom calculator
return (maxRetries: retries, delay: .seconds(delayCalculator(currentRetry)))
}
}
}
public typealias RetryPredicate = (Error) -> Bool
extension Publisher {
/**
Retries the failed upstream publisher using the given retry behavior.
- parameter behavior: The retry behavior that will be used in case of an error.
- parameter shouldRetry: An optional custom closure which uses the downstream error to determine
if the publisher should retry.
- parameter tolerance: The allowed tolerance in firing delayed events.
- parameter scheduler: The scheduler that will be used for delaying the retry.
- parameter options: Options relevant to the scheduler’s behavior.
- returns: A publisher that attempts to recreate its subscription to a failed upstream publisher.
*/
func retry<S>(
_ behavior: RetryBehavior<S>,
shouldRetry: RetryPredicate? = nil,
tolerance: S.SchedulerTimeType.Stride? = nil,
scheduler: S,
options: S.SchedulerOptions? = nil
) -> AnyPublisher<Output, Failure> where S: Scheduler {
return retry(
1,
behavior: behavior,
shouldRetry: shouldRetry,
tolerance: tolerance,
scheduler: scheduler,
options: options
)
}
private func retry<S>(
_ currentAttempt: UInt,
behavior: RetryBehavior<S>,
shouldRetry: RetryPredicate? = nil,
tolerance: S.SchedulerTimeType.Stride? = nil,
scheduler: S,
options: S.SchedulerOptions? = nil
) -> AnyPublisher<Output, Failure> where S: Scheduler {
// This shouldn't happen, in case it does we finish immediately
guard currentAttempt > 0 else { return Empty<Output, Failure>().eraseToAnyPublisher() }
// Calculate the retry conditions
let conditions = behavior.calculateConditions(currentAttempt)
return self.catch { error -> AnyPublisher<Output, Failure> in
// If we exceed the maximum retries we return the error
guard currentAttempt <= conditions.maxRetries else {
return Fail(error: error).eraseToAnyPublisher()
}
if let shouldRetry = shouldRetry, shouldRetry(error) == false {
// If the shouldRetry predicate returns false we also return the error
return Fail(error: error).eraseToAnyPublisher()
}
guard conditions.delay != .zero else {
// If there is no delay, we retry immediately
return self.retry(
currentAttempt + 1,
behavior: behavior,
shouldRetry: shouldRetry,
tolerance: tolerance,
scheduler: scheduler,
options: options
)
.eraseToAnyPublisher()
}
// We retry after the specified delay
return Just(()).delay(for: conditions.delay, tolerance: tolerance, scheduler: scheduler, options: options).flatMap {
return self.retry(
currentAttempt + 1,
behavior: behavior,
shouldRetry: shouldRetry,
tolerance: tolerance,
scheduler: scheduler,
options: options
)
.eraseToAnyPublisher()
}
.eraseToAnyPublisher()
}
.eraseToAnyPublisher()
}
}
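To make the `.exponentialDelayed` arithmetic concrete, here is a small worked sketch of the same formula in isolation. The free function exponentialDelay is a hypothetical helper written for illustration, not part of the extension above.

```swift
import Foundation

// Mirrors the `.exponentialDelayed` calculation: retry 1 uses `initial`,
// and every later retry multiplies the previous delay by (1 + multiplier).
func exponentialDelay(initial: TimeInterval, multiplier: Double, retry: UInt) -> TimeInterval {
    precondition(retry >= 1, "retries are numbered from 1")
    return initial * pow(1 + multiplier, Double(retry - 1))
}

// With initial = 2 s and multiplier = 0.5 (a 50% increase per retry):
let delays = (1...4).map { exponentialDelay(initial: 2, multiplier: 0.5, retry: UInt($0)) }
print(delays) // [2.0, 3.0, 4.5, 6.75]
```

Note that `pow(1 + multiplier, 0)` is 1, so the special case for the first retry in the original code gives the same result as the general formula.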
Using .catch is indeed the answer. We simply make a reference to the data task publisher and use that reference as the head of both pipelines — the outer pipeline that does the initial networking, and the inner pipeline produced by the .catch function.
Let's start by creating the data task publisher and stop:
let pub = URLSession.shared.dataTaskPublisher(for: url).share()
Now I can form the head of the pipeline:
let head = pub.catch {_ in pub.delay(for: 3, scheduler: DispatchQueue.main)}
.retry(3)
That should do it! head is now a pipeline that inserts a delay operator only if there is an error. We can then proceed to form the rest of the pipeline, based on head.
Observe that we do indeed change publishers; if there is a failure and the catch function runs, the pub which is the upstream of the .delay becomes the publisher, replacing the pub we started out with. However, they are the same object (because I said share), so this is a distinction without a difference.

Emitting progress items while forwarding result in a single observable

I am initiating an operation via a REST API in two steps:
Start operation and return a task id
Poll task with the given id and complete the sequence when the operation returns complete.
Polling the task id returns a 202 while the operation is still in progress and a 200 when it completes. Any other code is an error.
I need to communicate to the subscribers the response of each step.
Previously, I would have the do operator push the response in between steps to a ReplaySubject.
startReboot()
    .do(onNext: { response in
        operationStatus.next(response)
    })
    .flatMap({ response in
        // If we could not get the task ID from the response we error
        guard let taskID = getTaskIDFromJSON(response) else { return Observable.error(API.serverError) }
        return Observable.just(taskID)
    })
    .flatMap({ taskID in
        return pollTask(withID: taskID) // internally, it uses retryWhen to check the api again with a five second delay
    })
    .do(onNext: { response in
        operationStatus.next(response)
    })
    .subscribe(onError: { _ in
        showOperationFailedIcon()
    }, onCompleted: {
        showOperationCompleteIcon()
    })
And somewhere else, a subscriber to the subject would do the following:
operationStatus.subscribe(onNext: { response in
    showResponse(response)
})
So essentially I am showing the progress of the operation and the response we get from each step at the same time.
At the time I was not familiar enough with Rx to come up with a cleaner solution. But now that I have familiarized myself with it, it seems to me that there should be a solution where we don't use side effects and contain this in a single final observable. Still, I cannot find a way to do it.
I was thinking about something like this:
let opObs = startReboot()
let pollingObs = pollTask(/* where does the id come from? */)
Observable.concat(opObs, pollingObs)
.subscribe(onNext : { response in
showResponse(response)
}, onError: { _ in
showOperationFailedIcon()
}, onCompleted: {
showOperationCompleteIcon()
})
But that would imply that once opObs is done I would need to save the task id in a variable outside of the monad and wrap pollingObs to fetch it when it starts, once again introducing side effects.
Is there an operator or combination of operators that I can use to emit the response of each step to a subscriber and also pass it to another observable?
Something like this should work. Note the use of share() to avoid triggering the startReboot sequence twice.
let opObs = startReboot().share()
let pollingObs = opObs.flatMap { response in
    guard let taskID = getTaskIDFromJSON(response) else { return Observable.error(API.serverError) }
    return pollTask(withID: taskID)
}
Observable.concat(opObs, pollingObs)
    .subscribe(onNext: { response in
        showResponse(response)
    }, onError: { _ in
        showOperationFailedIcon()
    }, onCompleted: {
        showOperationCompleteIcon()
    })

Resources