How to grep a word, but exclude the word itself from the output - grep

I need to find all lines that contain "foo", but exclude the "foo" part itself from the output.
grep "foo" | grep -ve "foo" returns zero lines.
Example input:
"category": "aaaa",
"amount": 0.01208210,
"vout": 0,
"fee": 0.00007523,
"confirmations": 12345,
"blockhash": "12345",
"blockindex": 12345,
"blocktime": 12345,
On output I need just:
0.01208210
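A sketch of one way to do this with grep alone (assuming GNU or BSD grep with -o support, and that the input shown above is stored in a file named data.txt, a hypothetical name): first select the line containing the wanted key, then extract only the numeric value from that line.

```shell
# Reproduce the sample input from the question (data.txt is a
# hypothetical file name), then select the "amount" line and
# extract only the decimal number from it.
cat > data.txt <<'EOF'
"category": "aaaa",
"amount": 0.01208210,
"vout": 0,
"fee": 0.00007523,
EOF
grep '"amount"' data.txt | grep -oE '[0-9]+\.[0-9]+'
```

For the sample input this prints 0.01208210.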


Filtering by comparing two streams one-on-one in jq

I have streams
{
"key": "a",
"value": 1
}
{
"key": "b",
"value": 1
}
{
"key": "c",
"value": 1
}
{
"key": "d",
"value": 1
}
{
"key": "e",
"value": 1
}
And
(true,true,false,false,true)
I want to compare the two one-on-one and only print the object if the corresponding boolean is true.
So I want to output
{
"key": "a",
"value": 1
}
{
"key": "b",
"value": 1
}
{
"key": "e",
"value": 1
}
I tried (https://jqplay.org/s/GGTHEfQ9s3)
filter:
. as $input | foreach (true,true,false,false,true) as $dict ($input; select($dict))
input:
{
"key": "a",
"value": 1
}
{
"key": "b",
"value": 1
}
{
"key": "c",
"value": 1
}
{
"key": "d",
"value": 1
}
{
"key": "e",
"value": 1
}
But I get output:
{"key":"a","value":1}
{"key":"a","value":1}
null
{"key":"b","value":1}
{"key":"b","value":1}
null
{"key":"c","value":1}
{"key":"c","value":1}
null
{"key":"d","value":1}
{"key":"d","value":1}
null
{"key":"e","value":1}
{"key":"e","value":1}
null
Help will be appreciated.
One way would be to read in the streams as arrays, use transpose to match their items, and select by one and output the other:
jq -s '[.,[(true,true,false,false,true)]] | transpose[] | select(.[1])[0]' objects.json
Another approach would be to read in the streams as arrays, convert the booleans array into those indices where conditions match, and use them to reference into the objects array:
jq -s '.[[(true,true,false,false,true)] | indices(true)[]]' objects.json
The same approach but using nth to reference into the inputs stream requires more precaution, as the successive consumption of stream inputs demands the provision of relative distances, not absolute positions to nth. A conversion can be implemented by successively checking the position of the next true value using index and a while loop:
jq -n 'nth([true,true,false,false,true] | while(. != []; .[index(true) + 1:]) | index(true) | values; inputs)' objects.json
One could also use reduce to directly iterate over the boolean values, and just select any appropriate input:
jq -n 'reduce (true,true,false,false,true) as $dict ([]; . + [input | select($dict)]) | .[]' objects.json
A solution using foreach, like you intended, also would need the -n option to not miss the first item:
jq -n 'foreach (true,true,false,false,true) as $dict (null; input | select($dict))' objects.json
Unfortunately, each invocation of jq can currently handle at most one external JSON stream. This is not usually an issue unless both streams are very large, so in this answer I'll focus on a solution that scales. In fact, the amount of computer memory required is minuscule no matter how large the streams may be.
For simplicity, let's assume that:
demon.json is a file consisting of a stream of JSON boolean values (i.e., not comma-separated);
objects.json is your stream of JSON objects;
the streams have the same length;
we are working in a bash or bash-like environment.
Then we could go with:
paste -d '\t' demon.json <(jq -c . objects.json) | jq -n '
foreach inputs as $boolean (null; input; select($boolean))'
So apart from the startup costs of paste and jq, we basically only need enough memory to hold one of the objects in objects.json at a time. This solution is also very fast.
Of course, if objects.json were already in JSONL (JSON-lines) format, then the first call to jq above would not be necessary.
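Putting that together as a runnable sketch (the file names demon.json and objects.json follow the assumptions above, and since the objects here are written one per line, the inner jq -c call can be skipped):

```shell
# Build the two sample streams from the question, then join them
# line-by-line with paste and let jq keep each object whose paired
# boolean is true.
printf 'true\ntrue\nfalse\nfalse\ntrue\n' > demon.json
printf '{"key":"a","value":1}\n{"key":"b","value":1}\n{"key":"c","value":1}\n{"key":"d","value":1}\n{"key":"e","value":1}\n' > objects.json
paste -d '\t' demon.json objects.json | jq -n -c '
  foreach inputs as $boolean (null; input; select($boolean))'
```

This prints the objects with keys a, b, and e, matching the desired output.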

OPA masking a dynamic array field

I'm trying to apply masking to an input and result field that is part of an array, where the size of the array is dynamic. The documentation instructs providing an absolute array index, which is not possible in this use case. Do we have any alternative?
E.g., what if one needs to mask the age field of all the students in the input document?
Input:
"students" : [
{
"name": "Student 1",
"major": "Math",
"age": "18"
},
{
"name": "Student 2",
"major": "Science",
"age": "20"
},
{
"name": "Student 3",
"major": "Entrepreneurship",
"age": "25"
}
]
If you want to just generate a copy of input that has a field (or set of fields) removed from the input, you can use json.remove. The trick is to use a comprehension to compute the list of paths to remove. For example:
paths_to_remove := [sprintf("/students/%v/age", [x]) | some x; input.students[x]]
result := json.remove(input, paths_to_remove)
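For reference, with the three-student sample input above, the comprehension evaluates to zero-based JSON Pointer paths (a sketch of the intermediate value, not additional code to write):

```
paths_to_remove = [
    "/students/0/age",
    "/students/1/age",
    "/students/2/age"
]
```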
If you are trying to mask fields from the input document in the decision log using the Decision Log Masking feature then you would write something like:
package system.log
mask[x] {
some i
input.input.students[i]
x := sprintf("/input/students/%v/age", [i])
}

Parsing stops on null or []

I have this filter working well, but in a new use case the property "p" can be null or an empty array [], and then the parser stops evaluating the expression.
The issue is in ".p[]?.product.productId", when p is null or an empty array [].
When the p property looks like [{}] or [{"id":123}], it works well.
I'm breaking the filter in lines to make it easy to understand.
.p as $p
| .p[]?.product.productId as $pa
| .io[]
| select(.product.productId == ($pa) or .description == "product description x")
| .product.productId as $pid
| {"offerId": .offerId,
"description": .description,
"required":
"($p[] | select(.product.productId == $pid) | .required)",
"applied": false,
"amount": (if .prices | length == 0
then 0
elif .prices[0].amount != null
then .prices[0].amount
else .prices[0].amountPercentage
end)}
Input:
{
"p": null,
"io": [{
"offerId": 5593,
"description": "product description x",
"product": {
"productId": 393,
"description": "product description x 2",
"type": "Insurance"
},
"prices": [
{
"amount": null,
"amountPercentage": 4.13999987,
"status": "On"
}
]
}]
}
All I want is to be able to ignore p when it is null or [].
*I'm aware that "($p[] | select(.product.productId == $pid) | .required)" is a literal string in the output.
jqplay.org/s/wYwKUFM2XR
Regards
E? is like try E catch empty, whereas what you seem to want is either try E catch null or perhaps E? // null.
.p[]? is not the same as .p?[] or .p?[]?:
$ jq -n '[] | .p[]?'
jq: error (at <unknown>): Cannot index array with string "p"
$ jq -n '[] | .p?[]'
$
$ jq -n '[] | .p?[]?'
$
Specifically, .p[] is like .p | (try .[] catch empty), so there is nothing to stop the .p from raising an exception.
You might like to consider using try explicitly:
$ jq -n '[] | try .p[] catch null'
$
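Applied to the start of the original filter, the suggestion could look like this (a sketch using a trimmed-down version of the question's input; it assumes the top-level input is an object, so only p itself may be null or []):

```shell
# Bind $pa to null when p is null or empty, so the rest of the
# filter still runs and the select can match on description.
echo '{"p": null, "io": [{"description": "product description x"}]}' |
jq -c '(.p[]?.product.productId // null) as $pa
       | .io[]
       | select(.product.productId == $pa or .description == "product description x")'
```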

KSQL - Select Columns from Array of Struct as Arrays

Similar to KSQL streams - Get data from Array of Struct, my input JSON looks like:
{
"Obj1": {
"a": "abc",
"b": "def",
"c": "ghi"
},
"ArrayObj": [
{
"key1": "1",
"key2": "2",
"key3": "3"
},
{
"key1": "4",
"key2": "5",
"key3": "6"
},
{
"key1": "7",
"key2": "8",
"key3": "9"
}
]
}
I have created a stream with:
CREATE STREAM Example1(Obj1 STRUCT<a VARCHAR, b VARCHAR, c VARCHAR>, ArrayObj ARRAY<STRUCT<key1 VARCHAR, key2 VARCHAR, key3 VARCHAR>>) WITH (kafka_topic='sample_topic', value_format='JSON', partitions=1);
However, I would like only a single row of output from each input JSON document, with the data from each column in the array flattened into arrays, like:
a b key1 key2 key3
abc def [1, 4, 7] [2, 5, 8] [3, 6, 9]
Is this possible with KSQL?
At present you can only flatten ArrayObj in the way you want if you know up front how many elements it will have:
CREATE STREAM flatten AS
SELECT
Obj1.a AS a,
Obj1.b AS b,
ARRAY[ArrayObj[1]['key1'], ArrayObj[2]['key1'], ArrayObj[3]['key1']] as key1,
ARRAY[ArrayObj[1]['key2'], ArrayObj[2]['key2'], ArrayObj[3]['key2']] as key2,
ARRAY[ArrayObj[1]['key3'], ArrayObj[2]['key3'], ArrayObj[3]['key3']] as key3
FROM Example1;
I guess if you knew the array was going to be up to a certain size, you could use a CASE statement to selectively extract the elements, e.g.
-- handles arrays of size 2 or 3 elements, i.e. third element is optional.
CREATE STREAM flatten AS
SELECT
Obj1.a AS a,
Obj1.b AS b,
ARRAY[ArrayObj[1]['key1'], ArrayObj[2]['key1'], ArrayObj[3]['key1']] as key1,
ARRAY[ArrayObj[1]['key2'], ArrayObj[2]['key2'], ArrayObj[3]['key2']] as key2,
CASE
WHEN ARRAY_LENGTH(ArrayObj) >= 3
THEN ARRAY[ArrayObj[1]['key3'], ArrayObj[2]['key3'], ArrayObj[3]['key3']]
ELSE null
END as key3
FROM Example1;
If that doesn't suit your needs then the design discussion going on at the moment around lambda function support in ksqlDB may be of interest: https://github.com/confluentinc/ksql/pull/5661

Same key, different values: nested dicts of dicts

Borrowing an MWE from this question, I have a set of nested dicts of dicts:
{
"type": "A"
"a": "aaa",
"payload": {"another":{"dict":"value", "login":"user1"}},
"actor": {"dict":"value", "login":"user2"}
}
{
"type": "B"
"a": "aaa",
"payload": {"another":{"dict":"value", "login":"user3"}},
"actor": {"dict":"value", "login":"user4"}
}
{
"type": "A"
"a": "aaa",
"b": "bbb",
"payload": {"another":{"dict":"value", "login":"user5"}},
"actor": {"dict":"value", "login":"user6"}
}
{
"type": "A"
"a": "aaa",
"b": "bbb",
"payload": {"login":"user5"},
"actor": {"login":"user6"}
}
For dictionaries that have "type":"A", I want to get the username from the payload dict and the username from the actor dict. The same username can appear multiple times. I would like to store a txt file with a list of actor logins (ID1) and a list of payload logins (ID2) like this:
ID1 ID2
user2 user1
user6 user5
user6 user5
Right now, I have a start:
zgrep "A" | zgrep -o 'login":"[^"]*"' | zgrep -o 'payload":"[^"]*" > usernames_list.txt
But of course this won't work, because I need to find login within the payload dict and login within the actor dict for each dict of type A.
Any thoughts?
I am assuming you have the payload and actor dictionaries for all entries of type A.
1. Parse out the user name from the payload entries and redirect them to a file named payload.txt.
2. Parse out the user name from the actor entries and redirect them to a different file named actor.txt.
3. Use the paste command to join the entries and output them the way you want.
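Since the data is JSON, a different technique may be simpler than the grep/paste pipeline: a single jq filter (a hedged sketch; input.json is a hypothetical file holding the stream of objects, and the // fallback covers payloads that carry login at the top level instead of under "another"):

```shell
# Reproduce the question's stream as valid JSON (input.json is a
# hypothetical name), then print actor login (ID1) and payload
# login (ID2) as tab-separated columns for each type-A object.
cat > input.json <<'EOF'
{"type":"A","a":"aaa","payload":{"another":{"dict":"value","login":"user1"}},"actor":{"dict":"value","login":"user2"}}
{"type":"B","a":"aaa","payload":{"another":{"dict":"value","login":"user3"}},"actor":{"dict":"value","login":"user4"}}
{"type":"A","a":"aaa","b":"bbb","payload":{"another":{"dict":"value","login":"user5"}},"actor":{"dict":"value","login":"user6"}}
{"type":"A","a":"aaa","b":"bbb","payload":{"login":"user5"},"actor":{"login":"user6"}}
EOF
jq -r 'select(.type == "A")
       | [.actor.login, (.payload.another.login // .payload.login)]
       | @tsv' input.json > usernames_list.txt
cat usernames_list.txt
```

For the sample stream this writes user2/user1, user6/user5, user6/user5 as tab-separated rows.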
