How to write the JQ expression for dynamic partitioning from Kinesis console in CDK Kinesis configuration parameters? - aws-cdk

I have a JQ expression for dynamic partitioning in the Kinesis firehose as follows:
Key name: time
JQ expression: .time | strptime("%Y-%m-%dT%H:%M:%SZ") | mktime | strftime("%Y%m%d")
This was written for the below time format:
"time": "2020-01-29T17:26:50Z"
The above JQ expression outputs 20200129, which I use as the partition date when storing records from Kinesis.
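For reference, the same conversion can be sketched in plain Python. This is only an illustration of what the JQ pipeline computes, not part of the Firehose setup; the function name partition_date is a hypothetical helper.

```python
from datetime import datetime


def partition_date(time_value: str) -> str:
    """Turn an ISO-style timestamp into a YYYYMMDD partition key."""
    # Same format strings as the JQ expression: parse, then re-format.
    parsed = datetime.strptime(time_value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.strftime("%Y%m%d")


print(partition_date("2020-01-29T17:26:50Z"))  # 20200129
```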
How should I write it in the configuration parameters when defining the Kinesis Firehose delivery stream in AWS CDK?
s3_destination_conf = ds.ExtendedS3DestinationConfigurationProperty(
    bucket_arn=output_bucket.bucket_arn,
    dynamic_partitioning_configuration=ds.DynamicPartitioningConfigurationProperty(
        enabled=True
    ),
    processing_configuration=ds.ProcessingConfigurationProperty(
        enabled=True,
        processors=[
            ds.ProcessorProperty(
                type="MetadataExtraction",
                parameters=[
                    ds.ProcessorParameterProperty(
                        parameter_name="JsonParsingEngine",
                        parameter_value="JQ-1.6",
                    ),
                    ds.ProcessorParameterProperty(
                        parameter_name="MetadataExtractionQuery",
                        # Issue line: this is the value that does not work.
                        parameter_value="{time: .time| strptime('%Y-%m-%dT%H:%M:%SZ') | mktime | strftime('%Y%m%d')}",
                    ),
                ],
            )
        ],
    ),
    role_arn=role.role_arn,
    buffering_hints=ds.BufferingHintsProperty(
        interval_in_seconds=params.KinesisFirehose.BUFFER_INTERVAL_SEC,
        size_in_m_bs=params.KinesisFirehose.BUFFER_SIZE_MB,
    ),
)
I tried swapping the double and single quotes in the JQ string (parameter_value="{time: .time| strptime('%Y-%m-%dT%H:%M:%SZ') | mktime | strftime('%Y%m%d')}"), but it didn't help.

Try the following:
parameter_value="{time: .time| strptime(\'%Y-%m-%dT%H:%M:%SZ\') | mktime | strftime(\'%Y%m%d\')}"
You can also have a look at the string escape characters in Python here.
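One caveat worth checking before relying on the escaping above: inside a double-quoted Python literal, \' is the same character as ', so the escaped string is identical to the one in the question. jq string literals are written with double quotes, so another option, shown here as an untested sketch and not a confirmed fix for Firehose, is to wrap the Python literal in single quotes and keep the double quotes from the console expression. The variable names are illustrative only.

```python
# \' inside a double-quoted Python literal is just a single quote,
# so these two literals produce exactly the same string.
escaped = "{time: .time| strptime(\'%Y-%m-%dT%H:%M:%SZ\') | mktime | strftime(\'%Y%m%d\')}"
unescaped = "{time: .time| strptime('%Y-%m-%dT%H:%M:%SZ') | mktime | strftime('%Y%m%d')}"
print(escaped == unescaped)  # True

# Assumption: keeping jq's own double-quoted string literals is what the
# service expects. A single-quoted Python literal needs no escaping for that.
jq_query = '{time: .time | strptime("%Y-%m-%dT%H:%M:%SZ") | mktime | strftime("%Y%m%d")}'
print(jq_query)
```

The resulting jq_query value could then be passed as parameter_value for MetadataExtractionQuery.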

Related

query for passive hosts to be removed?

Can someone please help me remove passive hosts in Splunk? The query I am using is:
| metadata type=hosts
| sort recentTime
| convert ctime(recentTime) as Latest
Compare recentTime with the current time, work out the difference, and test that difference against a threshold to identify the passive hosts.
Example query:
| metadata type=hosts | eval diff=now()-recentTime | eval threshold=3600 | where diff>threshold
Note: the query is not tested, but you should get the idea.
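The threshold logic in that query can be sketched outside Splunk as well. The following is a minimal Python illustration, assuming each host record is a dict with hypothetical "host" and "recentTime" (epoch seconds) keys; it is not a Splunk API.

```python
import time


def passive_hosts(hosts, threshold_seconds=3600, now=None):
    """Return names of hosts whose last event is older than the threshold."""
    # Mirrors: eval diff=now()-recentTime | where diff>threshold
    now = time.time() if now is None else now
    return [h["host"] for h in hosts if now - h["recentTime"] > threshold_seconds]


# Made-up sample data for illustration only.
sample = [
    {"host": "web-1", "recentTime": 1_000_000},
    {"host": "db-1", "recentTime": 1_009_000},
]
print(passive_hosts(sample, now=1_010_000))  # ['web-1']
```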

cannot find "for loop" keyword in robot framework

I am connecting SQL Server to Robot Framework so that I can read my table names in Robot, and I want to use a for loop to check each table name. However, the ":FOR" loop keyword cannot be found, even though I have installed libraries such as OperatingSystem, Collections, String, BuiltIn, DiffLibrary and so on. Can anyone explain why I cannot use the for loop? Any help will be appreciated.
The Robot Framework User Guide has a whole section on how to use the for loop. From that section:
The syntax starts with :FOR, where colon is required to separate the
syntax from normal keywords. The next cell contains the loop variable,
the subsequent cell must have IN, and the final cells contain values
over which to iterate. These values can contain variables, including
list variables.
Here's an example from the user's guide, reformatted to use pipes (for clarity):
*** Test Cases ***
| Example 1
| | :FOR | ${animal} | IN | cat | dog
| | | log | ${animal}
| | | log | 2nd keyword
| | Log | Outside loop
Maybe you are not escaping the indented cells, as the Tip in the documentation says. Try writing loops like this:
:FOR    ${index}    IN RANGE    ${start}    ${stop}
\    log to console    index: ${index}
\    Call a Keyword

Jenkins matrix parameters with multiple values?

As part of automating a legacy deployment process, I'm trying to build a Jenkins pipeline that can, among other operations, execute parameterized builds.
One of those is the ability to execute several commands against a given list of services. Given a list of service names, e.g.
Mailer
Reporter
DbMigrator
...
etc, I'd like to run certain commands against some of those.
Using Extended Choice Parameter plugin, I was able to load this list from a properties file, and display it as a list of checkboxes, however, I am looking for a way I could build a "matrix" parameter with multiple values. My goal is to do something like this:
| Service    | Opt1 | Opt2 |
|------------|------|------|
| Mailer     | [x]  | [ ]  |
| Reporter   | [x]  | [x]  |
| DbMigrator | [ ]  | [x]  |
| ...        |      |      |
So that I can apply several values (Opt1 and/or Opt2) to one parameter (e.g. Mailer).
Is there a way in Jenkins to do this, or is there a better way of doing it?
You can try the Matrix Project plugin, which comes bundled with later versions of Jenkins.
You can add a couple of Groovy axes, each of which lets you build its values like this:
import groovy.json.JsonSlurper

def result = []
// Read the axis values from a JSON properties file
def inputFile = new File("/path/to/prop.json")
def inputJson = new JsonSlurper().parseText(inputFile.text)
// Use the key that matches the JSON below (parm1 for services, parm2 for options)
inputJson.parm1.each { result << it }
return result
with this sort of JSON:
{
    "parm1": [
        "Mailer",
        "Reporter",
        "DbMigrator"
    ],
    "parm2": [
        "opt1",
        "opt2"
    ],
    "filter": "parm1=='Mailer'"
}
There is also a filter option in the matrix to restrict the build to certain combinations. I was trying to make it evaluate the 'filter' property above (by creating an environment variable with EnvInject and another Groovy script), so far with no success, but you can use a literal string such as:
parm1=='Mailer' || parm2=='opt2'
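To see which combinations such a filter would keep, the matrix can be sketched outside Jenkins. The following Python snippet is only an illustration of the axis-and-filter idea, using the JSON keys from above; the function name and the lambda are hypothetical and this is not how the plugin evaluates its filter.

```python
import json
from itertools import product

# Same axis values as the JSON file shown above.
config = json.loads("""
{
    "parm1": ["Mailer", "Reporter", "DbMigrator"],
    "parm2": ["opt1", "opt2"]
}
""")


def matrix_combinations(cfg, keep):
    """Build every (service, option) pair and keep those passing the filter."""
    return [pair for pair in product(cfg["parm1"], cfg["parm2"]) if keep(*pair)]


# Equivalent of the filter string: parm1=='Mailer' || parm2=='opt2'
selected = matrix_combinations(
    config, lambda service, option: service == "Mailer" or option == "opt2"
)
for service, option in selected:
    print(service, option)
```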

How to parse SPARQL results?

I am using Twinkle (a SPARQL query tool). I ran a SPARQL query over an RDF file and got a results file like the one below. Since it doesn't seem to be a typical file format like CSV, do you know of a library that can parse it? Any programming language is fine.
---------------------------------------------------------------------
| name |
=====================================================================
| "Egypt"^^<http://www.w3.org/2001/XMLSchema#string> |
| "Iraq"^^<http://www.w3.org/2001/XMLSchema#string> |
| "Jordan"^^<http://www.w3.org/2001/XMLSchema#string> |
| "Kuwait"^^<http://www.w3.org/2001/XMLSchema#string> |
| "Libya"^^<http://www.w3.org/2001/XMLSchema#string> |
| "Mauritania"^^<http://www.w3.org/2001/XMLSchema#string> |
| "Somalia"^^<http://www.w3.org/2001/XMLSchema#string> |
| "Sudan"^^<http://www.w3.org/2001/XMLSchema#string> |
| "Syrian Arab Republic"^^<http://www.w3.org/2001/XMLSchema#string> |
| "Tunisia"^^<http://www.w3.org/2001/XMLSchema#string> |
| "United Arab Emirates"^^<http://www.w3.org/2001/XMLSchema#string> |
| "Yemen"^^<http://www.w3.org/2001/XMLSchema#string> |
---------------------------------------------------------------------
That's not a standard format, so you would have to write a parser for it by hand. It looks like the default CLI output of a database's query command (which one, I wonder?).
The CLI query command probably has an option to emit a standard SPARQL results format, such as SPARQL/XML or SPARQL/JSON. Any standard RDF library, such as Jena or Sesame if you are working in Java, can parse results in those formats. That is the best way to accomplish what you're attempting.
Generally, you should not interface programmatically with CLI output; use the APIs provided with the database instead.
That looks like it could be Jena output.
The ResultSetFormatter class contains ways to format results in all the standard formats (XML, JSON, TSV, CSV) as well as this display format in text.
ResultSetFormatter.outputAsXML
ResultSetFormatter.outputAsJSON
ResultSetFormatter.outputAsTSV
ResultSetFormatter.outputAsCSV
The text format is not for parsing - more for simple display and debugging.
The command-line tools take an argument to set the results format, e.g. --results json
And the query form in Fuseki allows you to choose the output format.
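Once the results are in a standard format, parsing is straightforward in most languages. As a minimal sketch, assuming the output follows the SPARQL 1.1 Query Results JSON layout (head/vars and results/bindings), the values of one variable can be read with Python's standard library alone. The sample document and the function name column_values are made up for illustration.

```python
import json

# Hand-written sample in the SPARQL JSON results layout (two rows only).
sample = """
{
  "head": {"vars": ["name"]},
  "results": {"bindings": [
    {"name": {"type": "literal",
              "datatype": "http://www.w3.org/2001/XMLSchema#string",
              "value": "Egypt"}},
    {"name": {"type": "literal",
              "datatype": "http://www.w3.org/2001/XMLSchema#string",
              "value": "Iraq"}}
  ]}
}
"""


def column_values(results_json, variable):
    """Return the plain values bound to one variable, skipping unbound rows."""
    data = json.loads(results_json)
    return [row[variable]["value"]
            for row in data["results"]["bindings"]
            if variable in row]


print(column_values(sample, "name"))  # ['Egypt', 'Iraq']
```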
What you see is a typed RDF literal. The URI http://www.w3.org/2001/XMLSchema#string is the XSD type "string", which says that your value is just a string (it could be an "int", etc.). If you just want the value, you can drop the "^^" and the URI after it, or use the STR function in your SPARQL query.

Use grep -A1 to return a value in the second line as long as a numeric value in the first line is met

I have log entries that come in pairs of two lines each. I have to parse the first line to extract a number and check whether it is greater than 5000. If it is, I need to return the second line, which will also be parsed to retrieve an ID.
I know how to grep all of the info and how to parse it. I don't know how to make grep ignore entries where the number is below a particular value. Note that I am not committed to using grep if some other tool like awk or sed can be substituted.
Raw data (two lines, separated here for clarity). The target of my grep is the number 5001 following "credits extracted = "; if it is over 5000, I want to return the number 12345 from the second line:
2012-03-16T23:26:12.082358 0x214d000 DEBUG ClientExtractAttachmentsPlayerMailTask for envelope 22334455 finished: credits extracted = 5001, items extracted count = 0, status = 0. [Mail.heomega.mail.Mail](PlayerMailTasks.cpp:OnExtractAttachmentsResponse:944)
2012-03-16T23:26:12.082384 0x214d000 DEBUG Mail Cache found cached mailbox for: 12345 [Mail.heomega.mail.Mail](MailCache.cpp:GetCachedMailbox:772)
Snippets:
-- Find the number of credits extracted, without the comma noise:
grep "credits extracted = " fileName.log | awk '{print $12}' | awk -F',' '{print $1}'
-- Find the second line's ID no matter what the value of credits extracted is:
grep -A1 "credits extracted = " fileName.log | grep "cached mailbox for" | awk -F, '{print $1}' | awk '{print $10}'
-- An 'if' statement symbolizing the logic I need to acquire:
v_CredExtr=5001; v_ID=12345; if [ "$v_CredExtr" -gt 5000 ]; then echo "$v_ID"; fi
You can do everything with a single AWK filter I believe:
#!/usr/bin/awk -f
/credits extracted =/ {
    credits = substr($12, 1, length($12) - 1) + 0
    if (credits > 5000)
        show_id = 1
    next
}
show_id == 1 {
    print $10
    show_id = 0
}
Obviously, you can stuff all the AWK script in a shell string inside a script, even multiline. I showed it here in its own script for clarity.
P.S: Please notify when it works ;-)
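If awk is not a requirement, the same pairing logic can be sketched in Python. This is a minimal illustration under the assumption that every "credits extracted" line is immediately followed by its "cached mailbox for" line, as in the sample; the function name and regular expressions are hypothetical, not taken from the original scripts.

```python
import re

CREDITS = re.compile(r"credits extracted = (\d+)")
MAILBOX = re.compile(r"cached mailbox for: (\d+)")


def ids_over_threshold(lines, threshold=5000):
    """Collect mailbox IDs whose preceding line extracted more than threshold credits."""
    ids = []
    pending = False  # True when the previous line exceeded the threshold
    for line in lines:
        credits = CREDITS.search(line)
        if credits:
            pending = int(credits.group(1)) > threshold
            continue
        mailbox = MAILBOX.search(line)
        if pending and mailbox:
            ids.append(mailbox.group(1))
        pending = False
    return ids


log = [
    "2012-03-16T23:26:12.082358 0x214d000 DEBUG ClientExtractAttachmentsPlayerMailTask "
    "for envelope 22334455 finished: credits extracted = 5001, items extracted count = 0, "
    "status = 0. [Mail.heomega.mail.Mail](PlayerMailTasks.cpp:OnExtractAttachmentsResponse:944)",
    "2012-03-16T23:26:12.082384 0x214d000 DEBUG Mail Cache found cached mailbox for: 12345 "
    "[Mail.heomega.mail.Mail](MailCache.cpp:GetCachedMailbox:772)",
]
print(ids_over_threshold(log))  # ['12345']
```

To run it against a file, pass an open file object instead of the list, since the function only iterates over lines.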
