I have seen a lot of blogs about throttle and debounce. Most of them said they are the same thing. But I get the different result form my example? Here is the example:
let disposeBag = DisposeBag()
Observable.of(1,2,3,4,5)
.debounce(1, scheduler: MainScheduler.instance)
.subscribe(onNext: {print($0)})
.addDisposableTo(disposeBag)
the result was 5. But when I used throttle, the result was 1
let disposeBag = DisposeBag()
Observable.of(1,2,3,4,5)
.throttle(1, scheduler: MainScheduler.instance)
.subscribe(onNext: {print($0)})
.addDisposableTo(disposeBag)
So,I can't understand about the throttle operator?
In earlier versions of RxSwift, throttle and debounce did the same thing, which is why you will see articles stating this. In RxSwift 3.0 they do a similar but opposite thing.
Both debounce and throttle are used to filter items emitted by an observable over time.
throttle emits only the first item emitted by the source observable in the time window.
debounce only emits an item after the specified time period has passed without another item being emitted by the source observable.
Both can be used to reduce the number of items emitted by an observable; which one you use depends on whether you want the "first" or "last" value emitted in a time period.
The term "debounce" comes from electronics and refers to the tendency of switch contacts to "bounce" between on and off very quickly when a switching action occurs. You won't notice this when you turn on a lightbulb but a microprocessor looking at an input thousands of times a second will see a rapid sequence of "ons" and "offs" before the switch settles into its final state. This is why debounce gives you the value of 5; the final item that was emitted in your time frame (1 ms). If you put a time delay into your code so that the items were emitted more slowly (more than 1ms apart) you would see a number of items emitted by debounce.
In an app you could use debounce for performing a search that is expensive (say it requires a network operation). The user is going to type a number of characters characters for their search string but you don't want to initiate a search as they enter each character, since the search is expensive and the earlier results will be obsolete by the time they return. Using debounce you can ensure that the search string is only emitted once the user stops typing for a period (say 500ms).
You might use throttle where an operation takes some time and you want to ignore further input until that time has elapsed. Say you have a button that initiates an operation. If the user taps the button multiple times in quick succession you only want to initiate the operation once. You can use throttle to ignore the subsequent taps within a specified time window. debounce could also work but would introduce a delay before the action item was emitted, while throttle allows you to react to the first action and ignore the rest.
Simple explanation is here https://medium.com/fantageek/throttle-vs-debounce-in-rxswift-86f8b303d5d4
Throttle: the original function is called at most once per specified period.
Debounce: the original function is called after the caller stops calling the decorated function after a specified period.
Related
i read about flink`s window assigners over here: https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#window-assigners , but i cant find any solution for my problem.
as part of my project i need a windowing that the timer will start given the first element of the key and will be closed and set ready for processing after X minutes. for example:
first keyA comes at (hh:mm:ss) 00:00:02, i want all keyA will be windowing until 00:01:02, and then the timer of 1 minutes will start again only when keyA will be given as input.
Is it possible to do something like that in flink? is there a workaround?
hope i made it clear enough.
Implementing keyed windows that are aligned with the first event, rather than with the epoch, is quite difficult, in general, which I believe is why this isn't supported by Flink's window API. The problem is that with an out-of-order stream using event time processing, as earlier events arrive you may need to revise your notion of when the window began, and when it should end. For example, if the first keyA arrives at 00:00:02, but then some time later an event with keyA arrives with a timestamp of 00:00:01, now suddenly the window should end at 00:01:01, rather than 00:01:02. And if the out-of-orderness is large compared to the window length, handling this becomes quite complex -- imagine, for example, that the event from 00:00:01 arrives 2 minutes after the event from 00:00:02.
Rather than trying to implement this with the window API, I would use a KeyedProcessFunction. If you only need to support processing time windows, then these concerns about out-of-orderness do not apply, and the solution can be fairly simple. It suffices to keep one object in keyed state, which might be a list holding all of the events in the window, or a counter or other aggregator, depending on what you're trying to accomplish.
When an event arrives, if the state (for this key) is null, then there is no open window for this key. Initialize the state (i.e., create a new, empty list, or set the counter to zero), and create a Timer to fire at the appropriate time. Then regardless of whether the state had been null, add the incoming event to the state (i.e., append it to the list, or increment the counter).
When the timer fires, emit the window's result and reset the state to null.
If, on the other hand, you want to do this with event time windows, first sort the stream and then use the same approach. Note that you won't be able to handle late events, so plan your watermarking accordingly (reducing the likelihood of late events to a manageable level), or go for a more complex implementation.
I read that, "..The ordering operator has to buffer all elements it receives. Then, when it receives a watermark it can sort all elements that have a timestamp that is lower than the watermark and emit them in the sorted order. This is correct because the watermark signals that not more elements can arrive that would be intermixed with the sorted elements..." - https://cwiki.apache.org/confluence/display/FLINK/Time+and+Order+in+Streams
Hence, it seems that the watermark serves as a signal to the following operator, for beginning processing. I guess, that's what also a Trigger does. What's the difference between the two?
You can think of watermarks as special records that tell an operator what (event-) time it is. When an operator receives a watermark, it compares the watermark with its current time and other watermarks it received from different stream partitions. Depending on the comparison, the operator advances its own clock.
Some operators register timers (windows, time-based joins, custom implementations). An operator triggers a timer when the clock of the operator passes the time for which the timer was registered.
So, watermarks and timers are two different things. Watermarks tell an operator what time it is and the operator triggers a timer at the right point in time.
A Watermark can be thought of as an assertion that an event time stream is now complete up to a particular timestamp. When a Watermark is processed by an operator it will cause the firing of any relevant event time timers. The operators that use EventTimeTimers are EventTimeWindows and ProcessFunctions.
Triggers are part of the window API and define when Windows will produce results. An EventTimeTrigger wraps around an event time timer that is called when an suitably large Watermark is processed, indicating that the window is now complete.
I noticed that in RxJava we are not able to perform operations such as this method signature buffer(count, skip, timespan, TimeUnit)
Is there anyway to perform this operations without recreating another operation for buffer.
this method doesn't help a lot. Beside, in the doc there is no spec if it emits overlapping buffered items or not !??
UPDATE
What flatMapIterable adds here ? The desired behaviour is that if a have 10 items in the specified timespan then a buffered item is emitted. if the timespan has elapsed and no item is emitted i want to receive an empty buffer.
using the count and skip oblige to wait until the count is reached but returns nothing in between.
Let's say I have an unbounded pcollection of sentences keyed by userid, and I want a constantly updated value for whether the user is annoying, we can calculate whether a user is annoying by passing all of the sentences they've ever said into the funcion isAnnoying(). Forever.
I set the window to global with a trigger afterElement(1), accumulatingFiredPanes(), do GroupByKey, then have a ParDo that emits userid,isAnnoying
That works forever, keeps accumulating the state for each user etc. Except it turns out the vast majority of the time a new sentence does not change whether a user isAnnoying, and so most of the times the window fires and emits a userid,isAnnoying tuple it's a redundant update and the io was unnecessary. How do I catch these duplicate updates and drop while still getting an update every time a sentence comes in that does change the isAnnoying value?
Today there is no way to directly express "output only when the combined result has changed".
One approach that you may be able to apply to reduce data volume, depending on your pipeline: Use .discardingFiredPanes() and then follow the GroupByKey with an immediate filter that drops any zero values, where "zero" means the identity element of your CombineFn. I'm using the fact that associativity requirements of Combine mean you must be able to independently calculate the incremental "annoying-ness" of a sentence without reference to the history.
When BEAM-23 (cross-bundle mutable per-key-and-window state for ParDo) is implemented, you will be able to manually maintain the state and implement this sort of "only send output when the result changes" logic yourself.
However, I think this scenario likely deserves explicit consideration in the model. It blends the concepts embodied today by triggers and the accumulation mode.
I've created a music sequence using coreMIDI and AudioToolBox and would now like to display the current beat position of the looped sequence in the UI. If the loop is only four beats, I would like the display to toggle between 1,2,3,4 and return to 1 as the sequence loops (much like in the transport of a program such a Logic Pro). I've made a function that uses MusicSequenceGetBeatsForSeconds() to calculate the beats based on the amount of time (using CFAbsoluteTimeGetCurrent()) that has passed since play was pressed, but this keeps the value increasing linearly (on beat on of the next measure the display reads "5.0"). How do I read the current beat of the sequence?
Skip the end of track message; you can't rely on it.
Instead create a MusicSequenceUserCallback. Then add a track to your sequence that holds user events. The actual data in the event doesn't matter.
var event = MusicEventUserData(length: 1, data: (0xAA))
let status = MusicTrackNewUserEvent(markerTrack, endBeat, &event)
Then in the callback look for this arbitrary data value and do that thing you want to do.
let d = eventData.memory
if d.data == 0xAA {
In Swift 2.2, the callback cannot be a closure. So make it a global function.
Look at MuscPlayer's MusicPlayerGetTime function instead.
Depending on what you want to do, you can set up a virtual destination as an endpoint for your MusicSequence and in its read block respond to events as they happen.