Is there a way to configure a threshold violation trigger so that it only fires Monday-Friday and doesn't fire on weekends? I don't see any option in the Thresholds or Triggers menu.
As of perfino 4, such a feature is not available. I've added it to our issue tracker.
I'm looking for clear documentation and/or an example of how to set up a time-based trigger on a global window in Apache Beam.
The purpose is to perform a count of the events since the last trigger fired, even when 0 events have been added since.
If you need to use a global window and emit a result even when no event has arrived since the last firing, you can use timers and state. I don't think this is possible with the built-in triggers.
You can keep the count in state and use a timer to emit the results periodically.
These two blog posts explain the use of timers and state:
Stateful processing with Apache Beam
Timely (and Stateful) Processing with Apache Beam
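To make the "count in state, emit on a timer" idea concrete, here is a minimal sketch in plain Python. It is not Beam code and does not use the Beam API: the class PeriodicCounter, its fields, and the 10-second period are all hypothetical names and values chosen for illustration. It only shows the logic a stateful DoFn with a looping timer would need, including emitting a count of 0 when nothing arrived.

```python
from dataclasses import dataclass, field


@dataclass
class PeriodicCounter:
    """Plain-Python stand-in for per-key state plus a looping timer (not Beam API)."""
    period: int                 # seconds between firings (assumed value)
    next_fire: int              # time at which the timer fires next
    count: int = 0              # the "state": events seen since the last firing
    emitted: list = field(default_factory=list)

    def advance_to(self, now: int) -> None:
        # Fire every timer that is due, emitting the count even when it is 0,
        # then reset the count and re-arm the timer one period later.
        while self.next_fire <= now:
            self.emitted.append((self.next_fire, self.count))
            self.count = 0
            self.next_fire += self.period

    def on_event(self, timestamp: int) -> None:
        # Fire any timers that were due before this event, then count the event.
        self.advance_to(timestamp)
        self.count += 1


counter = PeriodicCounter(period=10, next_fire=10)
for t in (1, 4, 12, 35):
    counter.on_event(t)
counter.advance_to(40)
print(counter.emitted)  # [(10, 2), (20, 1), (30, 0), (40, 1)]
```

Note the (30, 0) entry: the timer fires for the interval in which no event arrived, which is the behaviour the question asks for.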
I am designing a basket abandoning system for an Ecommerce company. The system will send a message to a user based on the below rules:
There is no interaction by the user on the site for 30 minutes.
Has added more than $50 worth of products to the basket.
Has not yet completed a transaction.
I use Google Cloud Dataflow to process the data and decide whether a message should be sent. I have a couple of options:
Use a Sliding window with a duration of 30 minutes.
A global window with a time based trigger with a delay of 30 minutes.
I think a sliding window might work here. But my question is: can this use case be solved with a global window, a processing-time-based trigger, and a delay?
As far as I understand triggers from the Apache Beam documentation:
Triggers allow Beam to emit early results, before a given window is closed. For example, emitting after a certain amount of time elapses, or after a certain number of elements arrives.
Triggers allow processing late data by triggering after the event time watermark passes the end of the window.
So, for my use case and per the trigger concepts above, I don't think a trigger can fire after a set delay for each individual user. (The documentation above mentions emitting after a certain number of elements arrive, but I'm not sure whether that number could be 1.) Can you confirm?
Both options, 1 (sliding window) and 2 (global window), are incorrect.
A sliding window is not correct because, assuming one key per user, a message will be sent 30 minutes after the user first started browsing, even if they are still browsing.
A global window is not correct because it will cause messages to be sent every 30 minutes to all users, regardless of where they are in their current session.
Even fixed windows would be incorrect here because, assuming one key per user, a message would be sent every 30 minutes.
The correct answer is to use a session window with a gap duration of 30 minutes.
This is correct because it sends a message per user after that user has been inactive for 30 minutes.
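As a rough illustration of why a 30-minute gap matches "inactive for 30 minutes", here is a plain-Python sketch of gap-based sessionization for a single user. It is not Beam code: split_sessions, session_close_times, and the sample event times (minutes since some arbitrary start) are hypothetical. It assumes two events belong to the same session when they are less than the gap apart.

```python
GAP_MINUTES = 30  # inactivity gap taken from the question


def split_sessions(event_minutes, gap=GAP_MINUTES):
    """Group one user's event times into sessions separated by at least `gap`."""
    sessions = []
    for t in sorted(event_minutes):
        if sessions and t - sessions[-1][-1] < gap:
            sessions[-1].append(t)   # still within the gap: same session
        else:
            sessions.append([t])     # gap exceeded (or first event): new session
    return sessions


def session_close_times(event_minutes, gap=GAP_MINUTES):
    """A session closes `gap` minutes after its last event."""
    return [s[-1] + gap for s in split_sessions(event_minutes, gap)]


events = [0, 10, 25, 70, 80]
print(split_sessions(events))       # [[0, 10, 25], [70, 80]]
print(session_close_times(events))  # [55, 110]
```

Each close time is the earliest point at which the user has been inactive for the full gap, so that is when a per-user message could be considered.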
I think a sliding window is the correct approach from what you described, and I don't think you can solve this with a trigger plus a delay. If event-time sliding windowing makes sense from your business logic perspective, try it first; that's what it's for.
My understanding is that while you can use a trigger to produce early results, it is not guaranteed to fire at a specific (server/processing) time or with an exact number of elements (received so far for the window). The trigger condition enables or unblocks the runner to emit the window contents, but it doesn't force it to do so.
In the case of event time this makes sense, because it doesn't matter when the event arrives or when the trigger fires: if the element has a timestamp within a window, it will be assigned to the correct window no matter when it arrives. And when the trigger fires for the window, the element is guaranteed to be in that window if it has arrived.
With processing time you can't do this. If an event arrives late, it will be accounted for at that time and emitted the next time the trigger fires. And because the trigger doesn't guarantee the exact moment it fires, you can end up with unexpected data in unexpected emitted panes. It is useful for getting early results in general, but I am not sure you can reason about windowing based on that.
Also, a trigger delay only adds a firing delay (e.g. if it was supposed to fire at 12pm, now it will fire at 12:05pm), but it doesn't allow you to reliably stagger multiple trigger firings so that they go off at specific intervals.
You can look at the design doc for triggers here: https://s.apache.org/beam-triggers , and the lateness doc may be relevant as well: https://s.apache.org/beam-lateness
Other docs can be found here, if you are interested: https://beam.apache.org/contribute/design-documents/ .
Update:
Rui pointed out that this use case can be more complicated and is probably not easily solvable with sliding windows. It may be worth looking into session windows or manual logic on top of keys+state+timers.
I found the state[1] and timer[2] docs for Apache Beam, which should be able to handle this specific use case without using a processing-time trigger in a global window.
Assume the incoming data are events of users' actions, and each event (action) can be keyed by user_id.
The nice property of state and timers is that they are scoped per key and window. So you can accumulate state for each user_id; in this case the state is the amount of money in the cart. A timer can be set the first time the amount in the cart exceeds $50, and reset whenever the user has further shopping actions within 30 minutes in processing time.
Assume transaction completion is also an event keyed by user_id. When a transaction completion event is seen, the timer can be deleted[3].
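The per-user logic described above can be sketched in plain Python. This is a simulation, not Beam code: AbandonmentTracker, the event kinds "add", "view", and "purchase", and the sample users are all hypothetical, and `now` stands in for processing time in seconds. The sketch assumes events arrive in time order and that any activity re-arms the timer once the basket is above the threshold.

```python
TIMEOUT = 30 * 60   # 30 minutes of inactivity, in seconds
THRESHOLD = 50.0    # basket value above which the timer is armed


class AbandonmentTracker:
    """Plain-Python stand-in for per-user state and timers (not Beam API)."""

    def __init__(self):
        self.cart_total = {}  # per-user state: value of the basket
        self.deadline = {}    # per-user timer: when to send the message
        self.messages = []    # (fire_time, user_id) pairs that were "sent"

    def advance_to(self, now):
        # Fire every timer whose deadline has passed, earliest first.
        for user, due in sorted(self.deadline.items(), key=lambda kv: kv[1]):
            if due <= now:
                self.messages.append((due, user))
                del self.deadline[user]

    def on_event(self, now, user, kind, amount=0.0):
        self.advance_to(now)
        if kind == "purchase":
            # Transaction completed: clear the state and delete the timer.
            self.cart_total.pop(user, None)
            self.deadline.pop(user, None)
            return
        if kind == "add":
            self.cart_total[user] = self.cart_total.get(user, 0.0) + amount
        # Any activity (re)sets the timer while the basket is above the threshold.
        if self.cart_total.get(user, 0.0) > THRESHOLD:
            self.deadline[user] = now + TIMEOUT


tracker = AbandonmentTracker()
tracker.on_event(0, "alice", "add", 40.0)    # below threshold, no timer yet
tracker.on_event(100, "bob", "add", 80.0)    # bob's timer set for 1900
tracker.on_event(600, "alice", "add", 20.0)  # total 60, timer set for 2400
tracker.on_event(900, "alice", "view")       # activity resets timer to 2700
tracker.on_event(1200, "bob", "purchase")    # bob's timer deleted
tracker.advance_to(3000)
print(tracker.messages)  # [(2700, 'alice')]
```

Only alice gets a message: bob completed a transaction before his timer fired, and alice's timer was pushed back by her later activity.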
Update:
This idea works entirely in the processing-time domain, so it will produce false-alarm messages depending on lateness in the system. The question is how to move this idea to the event-time domain so that there are fewer false alarms. One possibility is an event-time-based timer[4]. It is not yet clear to me what an event-time-based timer means.
[1] https://beam.apache.org/blog/2017/02/13/stateful-processing.html
[2] https://docs.google.com/document/d/1zf9TxIOsZf_fz86TGaiAQqdNI5OO7Sc6qFsxZlBAMiA/edit#
[3] https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/state/Timers.java#L45
[4] https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/state/TimeDomain.java#L33
TL;DR: Is it possible to create a custom trigger that only fires if some flag is set? Is it possible to deploy the job with a trigger with a huge delay while we know a large data event is happening, and then deploy an update to the job with the trigger having a normal or no delay once that event is finished?
Following on from: Remove duplicates across window triggers/firings
The situation where this is most problematic (millions of duplicate firings) is when we're backfilling old data. Since we know when this is happening, I was wondering whether we could implement a custom trigger that doesn't fire while a flag is set. Would that be possible? Alternatively, could we deploy the job with a trigger that includes a huge delay while the backfill is going on, and then issue an update with the normal trigger when it's finished?
Dataflow does not yet support custom triggers, or triggers based on some separate piece of metadata. However, you can change the frequency of a processing time trigger with Update; just change the value of the plusDelay() builder function and run with --update as normal.
I'd like to track time spent on JIRA issues based on when I click Start Progress and then Stop Progress or Resolve.
Is it possible to get JIRA to automatically allocate time to the task, for example:
14:20: Clicked on Start Progress
14:45: Clicked on Stop Progress > Logs 25 minutes to the task
15:30: Clicked on Start Progress
15:45: Clicked on Resolve > Logs 15 minutes to the task.
Is this possible?
Yes, it is possible.
You might want to have a look at Listeners:
"A Listener is a class that implements one of the Listener interfaces. It is then called whenever events occur in JIRA. Using those events, you can then perform any action you want."
In your case you could implement the issueStarted, issueStopped and issueResolved methods.
On issueStarted you could somehow save the current timestamp (e.g. in an invisible customfield) and on issueStopped/issueResolved you could trigger the creation of a worklog-entry.
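The timestamp bookkeeping described above can be sketched in plain Python. This is not JIRA listener code and does not call any JIRA API: logged_minutes, the action names "start", "stop", and "resolve", and the event list are hypothetical, with the times taken from the example in the question. It only shows how the saved start time turns into the number of minutes to log.

```python
from datetime import datetime


def logged_minutes(events):
    """Return (action, minutes) for each stop/resolve, given (HH:MM, action) pairs."""
    started_at = None   # the saved timestamp (e.g. what a hidden field would hold)
    worklogs = []
    for clock, action in events:
        moment = datetime.strptime(clock, "%H:%M")
        if action == "start":
            started_at = moment
        elif action in ("stop", "resolve") and started_at is not None:
            minutes = int((moment - started_at).total_seconds() // 60)
            worklogs.append((action, minutes))
            started_at = None   # progress is no longer running
    return worklogs


events = [
    ("14:20", "start"),
    ("14:45", "stop"),
    ("15:30", "start"),
    ("15:45", "resolve"),
]
print(logged_minutes(events))  # [('stop', 25), ('resolve', 15)]
```

A stop or resolve without a preceding start is ignored here; a real listener would have to decide how to handle that case.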
There is an app on the Atlassian Marketplace, Clockwork Automated Timesheets, that does exactly that.
It fully integrates with Jira's workflow so that the time logged corresponds to how much time an issue was In Progress or had any other active status.
If necessary, the timers can always be started or stopped manually by using Start/Stop buttons.
There are also reports available.
At the moment the app is free.
Cheers,
Jacek
We are considering switching from FogBugz to YouTrack.
So far YouTrack ticks all our boxes except automatic time tracking. In FogBugz we just select that we have started working on a feature and it tracks time for us, while in YouTrack logging time is a manual process. Is it possible to automate time tracking with YouTrack, perhaps by using a third-party app?
For reference, here is how the "working on" automatic time tracking feature works in FogBugz: http://www.fogcreek.com/fogbugz/docs/70/topics/schedules/Workingon.html
Thanks in advance
There's a workflow called workTimer that you can attach to your project. It starts a timer each time you move an issue to the 'In progress' state and logs a work item each time you move it from 'In progress' to 'Fixed'. Hope this helps.
You may use https://github.com/kleder/timetracker. It lets you record the time you spend on your tasks and synchronize it with YouTrack.
My team uses YouTrack, but we've decided to go with TMetric for time tracking. Both systems connect with each other via a browser plugin, and then you can record time spent working in YouTrack by clicking the TMetric button inside each task.