Why does temporal workflow design numTaskqueueWritePartitions and numTaskqueueReadPartitions? - temporal

Can anyone explain why temporal need two paramters: numTaskqueueWritePartitions and numTaskqueueReadPartitions?
I think we use one parameters numTaskqueuePartitions for read and wirte which is enough.

It is needed to support shrinking the number of partitions without losing tasks. Set numTaskqueueWritePartitions < numTaskqueueReadPartitions to drain the backlog from the partitions to be removed. After no messages are left in those partitions set numTaskqueueReadPartitions to the value of numTaskqueueWritePartitions.

Related

max-series-per-database limit exceeded clarification needed / how to calculate number of series in use

We recently started to encounter this error:
{"error":"partial write: max-series-per-database limit exceeded: (1000000) dropped=1"}
When writing metric data like this:
resque_job,environment=beta,billing_status=active-current,billing_active=active,instance_id=1103,instance_testmode=0,instance_staging=0,server_addr=RESQUE,database_host=db11.msp1.our-domain.com,admin_sso_key=_EMPTY_,admin_is_internal=_EMPTY_,queue_priority=default seconds_spent_job=0.20966601371765,number_in_batch=1 1649203450783000002
I know that Influx recommends you keep your series cardinality low, and our impression was that series cardinality would mean keeping each tag individually to a small number of values. e.g. we felt comfortable sending instance_id=1103 as a tag, because we know that there will never be more than 2000 distinct instance_id tag values.
But after running into this error... I'm afraid maybe I was mistaken here. Do we actually need to keep the cardinality of all possible combinations of all tags low? e.g. do these two things count as two separate series towards the 1,000,000 default max, because the instance_id is different?
resque_job,environment=beta,billing_status=active-current,billing_active=active,instance_id=1111,instance_testmode=0,instance_staging=0,server_addr=RESQUE,database_host=db11.msp1.our-domain.com,admin_sso_key=_EMPTY_,admin_is_internal=_EMPTY_,queue_priority=default seconds_spent_job=0.20966601371765,number_in_batch=1 1649203450783000002
resque_job,environment=beta,billing_status=active-current,billing_active=active,instance_id=2222,instance_testmode=0,instance_staging=0,server_addr=RESQUE,database_host=db11.msp1.our-domain.com,admin_sso_key=_EMPTY_,admin_is_internal=_EMPTY_,queue_priority=default seconds_spent_job=0.20966601371765,number_in_batch=1 1649203450783000002
If those count as two separate series... then is there a better way to structure this data in Influx? 1,000,000 total seems like a tiny amount if each separate combination of tags is a separate series...
Does InfluxDB 2.x help with this?
Is there a better tool that can handle a large number of tags and not bump into limits like this?
There is no way to figure out what data was not recorded. Update the max-series-per-database configuration to be more than 1M in order to stop dropping data.
This can be an indication that you are creating a lot of series. i saw some documentation on why that isn't great.
Hope this helps!

InfluxDB WHERE clause on a 'High Cardinality' field (or a tag)

I'm playing with InfluxDB and trying to experiment it for a vehicle speed tracking usecase.
Every vehicle's speed at a given time is stored as a data point.
I'm modelling "vehicle_registration" as a tag and other values as fields. I'd want the where clause to be applied on the "vehicle_registration" and it got to be quick. Therefore I'm taking advantage of the indexing capabilities on a tag by default.
But the biggest stumbling block for me is that the tags need to have a lower cardinality.
What are the recommendations here? I want a high cardinal field to be applied in a "where" clause and the queries should be quick.
Any advice?
High cardinality means higher memory requirement. So it really depends what high cardinality means in your use case. 1k will be probably fine for 8GB memory, but 1M will be probably problem for 8GB. The best option is to try it. Simulate it and you will see real memory requirements. Then you will be able to configure proper sizing for InfluxDB based on that (and your budget of course).
Or you can try TSI https://docs.influxdata.com/influxdb/v1.8/concepts/tsi-details/

Wrapper for slow indicator for backtesting

If a technical indicator works very slow, and I wish to include it in an EA ( using iCustom() ), is there a some "wrapper" that could cache the indicator results to a file based on the particular indicator inputs?
This way I could get a better speed next time when I backtest it using the same set of parameters, since the "wrapper" could read the result from file rather than recalculate the result from the indicator.
I heard that some developers did that for their needs in order to speed up backtesting, but as far as i know, there's no publicly available solution.
If I had to solve this problem, I would create a class with two fields (datetime and indicator value, or N buffers of the indicator), and a collection class similar to CArrayObj.mqh but with an option to apply binary search, or to start looking for element from a specific index, not from the very beginning of the array.
Recent MT4 Builds added VERY restrictive conditions for Indicators
In early years of MT4, this was not so cruel as it is these days.
FACT#1: fileIO is 10.000x ~ 100.000x slower than memIO:
This means, there is no benefit from "pre-caching" values to disk.
FACT#2: Processing Performance has HARD CEILING:
All, yes ALL, Custom Indicators, that are being used in MetaTrader4 Terminal ( be it directly in GUI, or indirectly, via Template(s) or called via iCustom() calls & in Strategy Tester via .tpl + iCustom() ) ALL THESE SHARE A SINGLE THREAD ...
FACT#3: Strategy Tester has the most demanding needs for speed:
Thus - eliminate all, indeed ALL, non-core indicators from tester.tpl template and save it as "blank", to avoid any part of such non-core processing.
Next, re-design the Custom Indicator, where possible, so as to avoid any CPU-ops & MEM-allocation(s), that are not necessary.
I remember a Custom Indicatore designs with indeed deep-convolutions, which could have been re-engineered so as to keep just a triangular sparse-matrix with necessary updates, that has increased the speed of Indicator processing more than 10.000x, so code-revision is the way.
So, rather run a separate MetaTrader4 Terminal, just for BackTesting, than having to wait for many hours just due to un-compressible nature of numerical processing under a traffic-jam congestion in the shared use of the CustomIndicator-solo-Thread that none scheduling could improve.
FACT#4: O/S can increase a process priority:
Having got to the Devil's zone, it is a common practice to spin-up the PRIO for the StrategyTester MT4, up to the "RealTime PRIO" in the O/S tools.
One may even additionally "lock" this MT4-process onto a certain CPU-core(s) and setup all other processes with adjacent CPU-core-AFFINITY, so that these two distinct groups of processes do not jump one to the other group's CPU-core(s). Hard, but if squeezing the performance to the bleeding edge, this is a must.

Stream de-duplication on Dataflow | Running services on Dataflow services

I want to de-dupe a stream of data based on an ID in a windowed fashion. The stream we receive has and we want to remove data with matching within N-hour time windows. A straight-forward approach is to use an external key-store (BigTable or something similar) where we look-up for keys and write if required but our qps is extremely large making maintaining such a service pretty hard. The alternative approach I came up with was to groupBy within a timewindow so that all data for a user within a time-window falls within the same group and then, in each group, we use a separate key-store service where we look up for duplicates by the key. So, I have a few questions about this approach
[1] If I run a groupBy transform, is there any guarantee that each group will be processed in the same slave? If guaranteed, we can group by the userid and then within each group compare the sessionid for each user
[2] If it is feasible, my next question is to whether we can run such other services in each of the slave machines that run the job - in the example above, I would like to have a local Redis running which can then be used by each group to look up or write an ID too.
The idea seems off what Dataflow is supposed to do but I believe such use cases should be common - so if there is a better model to approach this problem, I am looking forward to that too. We essentially want to avoid external lookups as much as possible given the amount of data we have.
1) In the Dataflow model, there is no guarantee that the same machine will see all the groups across windows for the key. Imagine that a VM dies or new VMs are added and work is split across them for scaling.
2) Your welcome to run other services on the Dataflow VMs since they are general purpose but note that you will have to contend with resource requirements of the other applications on the host potentially causing out of memory issues.
Note that you may want to take a look at RemoveDuplicates and use that if it fits your usecase.
It also seems like you might want to be using session windows to dedupe elements. You would call:
PCollection<T> pc = ...;
PCollection<T> windowed_pc = pc.apply(
Window<T>into(Sessions.withGapDuration(Duration.standardMinutes(N hours))));
Each new element will keep extending the length of the window so it won't close until the gap closes. If you also apply an AfterCount speculative trigger of 1 with an AfterWatermark trigger on a downstream GroupByKey. The trigger would fire as soon as it could which would be once it has seen at least one element and then once more when the session closes. After the GroupByKey you would have a DoFn that filters out an element which isn't an early firing based upon the pane information ([3], [4]).
DoFn(T -> KV<session key, T>)
|
\|/
Window.into(Session window)
|
\|/
Group by key
|
\|/
DoFn(Filter based upon pane information)
It is sort of unclear from your description, can you provide more details?
Sorry for not being clear. I gave the setup you mentioned a try, except for the early and late firings part, and it is working on smaller samples. I have a couple of follow up questions, related to scaling this up. Also, I was hoping I could give you more information on what the exact scenario is.
So, we have incoming data stream, each item of which can be uniquely identified by their fields. We also know that duplicates occur pretty far apart and for now, we care about those within a 6 hour window. And regarding the volume of data, we have atleast 100K events every second, which span across a million different users - so within this 6 hour window, we could get a few billion events into the pipeline.
Given this background, my questions are
[1] For the sessioning to happen by key, I should run it on something like
PCollection<KV<key, T>> windowed_pc = pc.apply(
Window<KV<key,T>>into(Sessions.withGapDuration(Duration.standardMinutes(6 hours))));
where key is a combination of the 3 ids I had mentioned earlier. Based on the definition of Sessions, only if I run it on this KV would I be able to manage sessions per-key. This would mean that Dataflow would have too many open sessions at any given time waiting for them to close and I was worried if it would scale or I would run into any bottle-necks.
[2] Once I perform Sessioning as above, I have already removed the duplicates based on the firings since I will only care about the first firing in each session which already destroys duplicates. I no longer need the RemoveDuplicates transform which I found was a combination of (WithKeys, Combine.PerKey, Values) transforms in order, essentially performing the same operation. Is this the right assumption to make?
[3] If the solution in [1] going to be a problem, the alternative is to reduce the key for sessioning to be just user-id, session-id ignoring the sequence-id and then, running a RemoveDuplicates on top of each resulting window by sequence-id. This might reduce the number of open sessions but still would leave a lot of open sessions (#users * #sessions per user) which can easily run into millions. FWIW, I dont think we can session only by user-id since then the session might never close as different sessions for same user could keep coming in and also determining the session gap in this scenario becomes infeasible.
Hope my problem is a little more clear this time. Please let me know any of my approaches make the best use of Dataflow or if I am missing something.
Thanks
I tried out this solution at a larger scale and as long as I provide sufficient workers and disks, the pipeline scales well although I am seeing a different problem now.
After this sessionization, I run a Combine.perKey on the key and then perform a ParDo which looks into c.pane().getTiming() and only rejects anything other than an EARLY firing. I tried counting both EARLY and ONTIME firings in this ParDo and it looks like the ontime-panes are actually deduped more precisely than the early ones. I mean, the #early-firings still has some duplicates whereas the #ontime-firings is less than that and has more duplicates removed. Is there any reason this could happen? Also, is my approach towards deduping using a Combine+ParDo the right one or could I do something better?
events.apply(
WithKeys.<String, EventInfo>of(new SerializableFunction<EventInfo, String>() {
#Override
public java.lang.String apply(EventInfo input) {
return input.getUniqueKey();
}
})
)
.apply(
Window.named("sessioner").<KV<String, EventInfo>>into(
Sessions.withGapDuration(mSessionGap)
)
.triggering(
AfterWatermark.pastEndOfWindow()
.withEarlyFirings(AfterPane.elementCountAtLeast(1))
)
.withAllowedLateness(Duration.ZERO)
.accumulatingFiredPanes()
);

Total aggregate over an unbounded stream in Dataflow

A number of examples show aggregation over windows of an unbounded stream, but suppose we need to get a count-per-key of the entire stream seen up to some point in time. (Think word count that emits totals for everything seen so far rather than totals for each window.)
It seems like this could be a Combine.perKey and a trigger to emit panes at some interval. In this case the window is essentially global, and we emit panes for that same window throughout the life of the job. Is this safe/reasonable, or perhaps there is another way to compute a rolling, total aggregate?
Ryan your solution of using a global window and a periodic trigger is the recommended approach. Just make sure you use accumulation mode on the trigger and not discarding mode. The Triggers page should have more information.
Let us know if you need additional help.

Resources