I'm trying to build an architecture where a single Lambda is triggered on a schedule with multiple parameter sets.
For example, if I have three sets of parameters and set the schedule to ten minutes, I expect to get three executions every ten minutes.
Is there a way to trigger EventBridge scheduled events with custom properties so I can pass parameters to Lambda? I've noticed the detail property in the event schema but couldn't find any reference to its usage with scheduled events.
To trigger a single Lambda function with multiple parameter sets, you can create a separate schedule rule for each parameter set.
To provide input to the triggered Lambda function, use the "Configure input" option when you select the function as the rule's target; for example, you can supply a constant input in JSON format.
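As a rough sketch of the same idea done programmatically rather than in the console, the snippet below builds one schedule rule and one target per parameter set, with the parameter set serialized into the target's `Input` field. The rule-name prefix, the example ARN, and the helper names (`build_schedule_requests`, `apply_requests`, `handler`) are assumptions made for illustration; the only AWS calls used are `put_rule` and `put_targets` on the boto3 `events` client.

```python
import json


def build_schedule_requests(function_arn, parameter_sets,
                            schedule="rate(10 minutes)",
                            prefix="my-lambda-schedule"):
    """Build (put_rule kwargs, put_targets kwargs) pairs, one per parameter set.

    The prefix and rule naming scheme are hypothetical; pick whatever fits.
    """
    requests = []
    for index, params in enumerate(parameter_sets):
        rule_name = f"{prefix}-{index}"
        rule = {
            "Name": rule_name,
            "ScheduleExpression": schedule,
            "State": "ENABLED",
        }
        targets = {
            "Rule": rule_name,
            "Targets": [{
                "Id": f"{rule_name}-target",
                "Arn": function_arn,
                # Constant JSON input delivered to the Lambda as its event.
                "Input": json.dumps(params),
            }],
        }
        requests.append((rule, targets))
    return requests


def apply_requests(requests):
    """Send the prepared requests to EventBridge (needs boto3 and credentials)."""
    import boto3  # third-party; imported lazily so the builder works without it

    events = boto3.client("events")
    for rule, targets in requests:
        events.put_rule(**rule)
        events.put_targets(**targets)


def handler(event, context):
    """Lambda entry point: with a constant input, `event` is the configured JSON."""
    return {"received": event}
```

With three parameter sets this yields three rules on the same schedule, so the function runs three times per interval, once per set. When rules are created through the API rather than the console, the function also needs a resource-based permission that lets EventBridge invoke it.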
I have an alert in GCP which checks that total_streaming_data_processed produces values for all active dataflow jobs within some period. The query for the alert is defined as:
fetch dataflow_job
| metric 'dataflow.googleapis.com/job/total_streaming_data_processed'
| filter
(resource.job_name =~ '.*dataflow.*')
| group_by 30m,
[value_total_streaming_data_processed_mean:
mean(value.total_streaming_data_processed)]
| every 30m
| absent_for 1800s
This alert seems to fire even for Dataflow jobs that have been recently drained. I suppose the alert is working as intended, but we would like to tune it to fire only for jobs in a running state. I believe the metric to use here is dataflow.googleapis.com/job/status, but I'm having trouble merging these two metrics in the same alert. What's the best way to have an alert check against two different metrics and only fire when both conditions are met?
I tried to add the second metric, dataflow.googleapis.com/job/status, but the MQL editor returns "Line 5: Table operation 'metric' expects 'Resource' input, but input is 'Table'." when I try to pass a second metric.
I have a Spring Boot application that is under moderate load. I want to collect metric data for a few of my app's operations. I am mainly interested in counters and timers.
I want to count the number of times a method was invoked (number of invocations over a window, for example over the last 1 day, 1 week, or 1 month).
If the method produces an unexpected result, I want to increase a failure count and publish a few tags with that metric.
I want to time a couple of expensive methods, i.e. I want to see how much time each method took, and I also want to publish a few tags with the metrics to get more context.
I have tried StatsD-SignalFx and Micrometer-InfluxDB, but both of these solutions have some issues I could not solve:
StatsD aggregates the data over the flush window, and because of the aggregation the metric tags get messed up. For example, if I send 10 events in a flush window with different tag values, and the StatsD agent aggregates those events and publishes only one event with counter = 10, then I am not sure which tag values it sends with the aggregated data.
The Micrometer-InfluxDB setup has its own problems. One of them is that Micrometer sends 0 values for counters if no new metric is produced, and in that fake (0-value) counter it reuses the tag values from the last valid (non-zero) counter.
I believe Micrometer also does some sort of client-side aggregation in the MeterRegistry, although I am not sure how, because I was getting a few counters with a value of 0.5 in InfluxDB.
Next, I am planning to explore Micrometer/StatsD + Telegraf + Influx + Grafana to see if it suits my use case.
Questions:
How can I avoid metric aggregation until the data reaches the data store (InfluxDB)? I can do the required aggregation in Grafana.
Is there any standard solution to the problem that I am trying to solve?
Any other suggestion or direction for my use case?
I am trying to schedule a job to be triggered daily at 8 pm.
Currently, my "Build periodically with parameters" field looks somewhat like this:
0 20 * * * % name=somename; type=sometype
Now, I need to add another parameter called categories which can have multiple values, i.e., it is of the type Multi Select (Extended Choice Parameter plugin).
Can someone help me out with how this can be done? Thanks in advance.
Hi, after performing a GroupByKey on a KV PCollection, I need to:
1) Make every element in that PCollection a separate, individual PCollection.
2) Insert the records in those individual PCollections into a BigQuery table.
Basically my intention is to create a dynamic date partition in the BigQuery table.
How can I do this?
An example would really help.
For Google Dataflow to perform the massive parallelisation that makes it one of a kind (as a service on the public cloud), the job flow needs to be predefined before it is submitted to the Google Cloud console. Every time you execute the jar file that contains your pipeline code (which includes the pipeline options and the transforms), a JSON file with the description of the job is created and submitted to Google Cloud Platform. The managed service then uses this to execute your job.
The use case in the question demands that the input PCollection be split into as many PCollections as there are unique dates. For that split, the tuple tags would have to be created dynamically, which is not possible at this time. Creating tuple tags dynamically is not allowed because it doesn't help in creating the job description JSON file and defeats the whole design/purpose with which Dataflow was built.
I can think of a couple of solutions to this problem (each with its own pros and cons):
Solution 1 (a workaround for the exact use case in the question):
Write a Dataflow transform that takes the input PCollection and, for each element in the input:
1. Checks the date of the element.
2. Appends the date to a pre-defined BigQuery table name as a decorator (in the format yyyyMMdd).
3. Makes an HTTP request to the BQ API to insert the row into the table, using the table name with the decorator appended.
You will have to take cost into consideration with this approach, because there is a single HTTP request for every element rather than the BQ load job that would have been used had we gone through the BigQueryIO module of the Dataflow SDK.
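A minimal sketch of steps 1 and 2, in plain Python rather than the Dataflow SDK: it derives the decorated table name for each row from the row's date. The field name `event_date`, the ISO date format, the example table name, and both helper names are assumptions made for illustration; the `$YYYYMMDD` suffix is BigQuery's partition decorator syntax. The actual insert request (step 3) is left out.

```python
from collections import defaultdict
from datetime import date


def table_with_date_decorator(base_table, day):
    """Return the table name with a date partition decorator, e.g. tbl$20160115."""
    return f"{base_table}${day.strftime('%Y%m%d')}"


def group_rows_by_destination(rows, base_table, date_field="event_date"):
    """Map each decorated table name to the rows that belong in it.

    `date_field` is a hypothetical column holding an ISO date (YYYY-MM-DD).
    """
    destinations = defaultdict(list)
    for row in rows:
        day = date.fromisoformat(row[date_field])
        destinations[table_with_date_decorator(base_table, day)].append(row)
    return dict(destinations)
```

Grouping rows by destination is optional; it only illustrates that rows bound for the same partition could share one insert request instead of each costing a separate HTTP call.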
Solution 2 (best practice that should be followed in this type of use case):
1. Run the Dataflow pipeline in streaming mode instead of batch mode.
2. Define a time window of whatever size suits the scenario in which it is being used.
3. For the `PCollection` in each window, write it to a BQ table with the decorator being the date of the time window itself.
You will have to consider rearchitecting your data source to send data to Dataflow in real time, but you will end up with a dynamically date-partitioned BigQuery table, with the results of your data processing available in near real time.
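To illustrate how the window determines the destination, here is a small plain-Python sketch that is not Dataflow SDK code: it assigns a timestamp to a fixed window and derives the date decorator from the window's start. The one-day window size, the use of UTC, and the function names are assumptions chosen for the example.

```python
from datetime import datetime, timezone

WINDOW_SECONDS = 24 * 60 * 60  # assumed window size: one day


def window_start(epoch_seconds, size=WINDOW_SECONDS):
    """Return the start (in epoch seconds) of the fixed window containing the timestamp."""
    return epoch_seconds - (epoch_seconds % size)


def decorator_for_window(epoch_seconds, size=WINDOW_SECONDS):
    """Return the yyyyMMdd decorator for the window containing the timestamp (UTC)."""
    start = datetime.fromtimestamp(window_start(epoch_seconds, size), tz=timezone.utc)
    return start.strftime("%Y%m%d")
```

Every element that falls in the same window gets the same decorator, so the whole window can be written to a single partition.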
References -
Google Big Query Table Decorators
Google Big Query Table insert using HTTP POST request
How job description files work
Note: Please mention it in the comments and I will elaborate on the answer with code snippets if needed.
I have the following JMeter test plan.
+Test Plan
  +Login Thread Group
    HttpRequest1
    HttpRequest2
    HttpRequest3
Is there a way to automatically view/monitor the average of the sums of HttpRequest1, 2 and 3?
I couldn't find a way to do it in "Summary Report" or "Aggregate Report".
Is it possible, or do I have to do it manually?
Do you explicitly mean 'the average of sums', as in the average of the total sum for each request over the duration of the test run? If so, then I'm not aware of any JMeter listener that will show you the sum of elapsed time for a sampler; it's not something typically required. Instead, you could probably get what you need fairly easily by reading the jtl file at the command line.
But perhaps you meant something else, in which case you might find that a Transaction Controller serves your requirements. It records and shows the total elapsed time for multiple requests. So in your example, you might wrap HttpRequest1, 2 and 3 in a Transaction Controller, and this would give you the sum of all three requests. The Aggregate and Summary listeners would then show you the average for this transaction as a separate line.
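For the jtl route, a small sketch along these lines could compute the per-request sums and averages. It assumes the results were saved as CSV with a header row that includes the `label` and `elapsed` columns; the function names are made up for this example.

```python
import csv
from collections import defaultdict


def summarize_jtl(lines):
    """Return count, summed elapsed time and average elapsed time (ms) per label.

    `lines` is any iterable of CSV lines, e.g. an open file object.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for row in csv.DictReader(lines):
        totals[row["label"]] += int(row["elapsed"])
        counts[row["label"]] += 1
    return {
        label: {
            "count": counts[label],
            "sum_ms": totals[label],
            "avg_ms": totals[label] / counts[label],
        }
        for label in totals
    }


def sum_of_averages(stats):
    """Add up the per-label averages.

    If every request runs once per iteration, this equals the average of the
    per-iteration sums.
    """
    return sum(entry["avg_ms"] for entry in stats.values())
```

It can be run against a results file with something like `with open("results.jtl", newline="") as f: print(summarize_jtl(f))`, where `results.jtl` is a placeholder file name.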