How to extract particular data using grep?

Command:
grep -oP '(?<=\"name\":\")[^"]*|(?<=\"title\":\")[^"]*' *.json >newjson
The output I am getting is:
10XANY10G_1.json:chMax
10XANY10G_1.json:Max Frequency in GHz
10XANY10G_1.json:up
10XANY10G_1.json:UP
10XANY10G_1.json:down
10XANY10G_1.json:DOWN
10XANY10G_1.json:CapabilityList
10XANY10G_1.json:Capabilities
10XANY10G_1.json:encoding
10XANY10G_1.json:Encoding
Expected output:
chMax:"Max Frequency in GHz",
up:"UP",
down:"DOWN",
Contents of the file:
{"card":{"cardName":"10AN10G","portSignalRates":["10AN10G-1-OTU2","10AN10G-1-OTU2E","10AN10G-1-TENGIGE","10AN10G-1-STM64"],"listOfPort":{"10AN10G-1-OTU2":{"portAid":"10AN10G-1-OTU2","signalType":"OTU2","tabNames":["PortDetails"],"requestType":{"PortDetails":"PTP"},"paramDetailsMap":{"PortDetails":[{"type":"dijit.form.TextBox","name":"signalType","title":"Signal Rate","id":"","options":[],"label":"","value":"OTU2","checked":"","enabled":"false","selected":""},{"type":"dijit.form.TextBox","name":"userLabel","title":"Description","id":"","options":[],"label":"","value":"","checked":"","enabled":"true","selected":""},{"type":"dijit.form.Select","name":"Frequency","title":"Transmit Frequency",}}}}}}

I think you're looking for this,
$ grep -oP '(?<=\"name\":\")[^"]*|(?<=\"title\":)[^,]*' file
signalType
"Signal Rate"
userLabel
"Description"
Frequency
"Transmit Frequency"
To get the desired output, join each pair of lines with paste:
$ grep -oP '(?<=\"name\":\")[^"]*|(?<=\"title\":)[^,]*' file | paste -d: - -
signalType:"Signal Rate"
userLabel:"Description"
Frequency:"Transmit Frequency"

I think your problem is that you are ORing the two groups with |, so every name and every title is reported as a separate match. Try removing the | so that name and title are matched by a single pattern; that will get you closer to what you are looking for. You may have to add a term that skips any intervening keys for cases where name and title are not immediately next to each other, and you may have to get clever to deal with an entry that has a name but no title.
As said, grep is not the best tool for parsing JSON; there are numerous others. Personally, I would suggest using Python and its json library to load your file and then output the keys that you need.
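A minimal sketch of the Python approach suggested above, assuming the file holds valid JSON. The sample in the question is truncated, so this snippet uses a small hand-made document with a similar shape; the helper name iter_name_title is hypothetical. It walks the parsed structure and reports every object that has both a name and a title, which also covers the "name but no title" case.

```python
import json

# Hand-made sample shaped like the question's file (an assumption: the
# original content is truncated and is not valid JSON as shown).
SAMPLE = '''
{"card": {"listOfPort": {"P1": {"paramDetailsMap": {"PortDetails": [
  {"type": "dijit.form.TextBox", "name": "signalType", "title": "Signal Rate"},
  {"type": "dijit.form.TextBox", "name": "userLabel", "title": "Description"},
  {"type": "dijit.form.Select", "name": "Frequency", "title": "Transmit Frequency"},
  {"type": "dijit.form.Select", "name": "noTitleHere"}
]}}}}}
'''


def iter_name_title(node):
    """Yield (name, title) for every dict that has both keys, at any depth."""
    if isinstance(node, dict):
        if "name" in node and "title" in node:
            yield node["name"], node["title"]
        for value in node.values():
            yield from iter_name_title(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_name_title(item)


pairs = list(iter_name_title(json.loads(SAMPLE)))
for name, title in pairs:
    # Same layout as the expected output in the question.
    print(f'{name}:"{title}",')
```

For a real file, json.load(open(path)) would replace json.loads(SAMPLE), provided the file parses.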

Here is a GNU awk version to extract the data (GNU awk is needed because RS is set to a multi-character string):
awk -F\" '/title/ {print $3":"$7}' RS='name' file
signalType:Signal Rate
userLabel:Description
Frequency:Transmit Frequency

Related

Grep's word boundaries include spaces?

I tried to use grep to search for lines containing the word "bead" using "\b" but it doesn't find the lines containing the word "bead" separated by space. I tried this script:
cat in.txt | grep -i "\bbead\b" > out.txt
I get results like
BEAD-air.JPG
Bead, 3 sided MET DP110317.jpg
Bead. -2819 (FindID 10143).jpg
Bead(Gem), Artefacts of Phu Hoa site(Dong Nai province).jpg
Romano-British pendant amulet (bead) (FindID 241983).jpg
But I don't get the results like
Bead fun.jpg
Instead of getting some 2,000 lines, I'm only getting 92 lines.
My OS is Windows 10 (64-bit), but I'm using grep 2.5.4 from the GnuWin32 package.
I've also tried MSYS2, which includes grep 3.0, but it does the same thing.
So how can I search for words separated by a space?
LATER EDIT:
It looks like grep has problems with big files. My input file is 2.4 GB in size. With smaller files, it works - I reported the bug here: https://sourceforge.net/p/getgnuwin32/discussion/554300/thread/03a84e6b/
Try this:
cat in.txt | grep -wi "bead"
The -w option restricts matches to whole words.
What you are doing should normally work, but there are ways of setting what is and is not considered a word boundary. Rather than worry about that, please try this instead:
cat in.txt | grep -iP "\bbead(\b|\s)" > out.txt
The -P option enables Perl-compatible regular expressions, and \s matches any whitespace character. The | separates the alternatives inside the parentheses ( ).
While you are waiting for grep to be fixed you could use another tool if it is available to you. E.g.
perl -lane 'print if (m/\bbead\b/i);' in.txt > out.txt
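As a cross-check of the pattern itself, here is a small Python sketch (using Python's re module, not grep) that applies \bbead\b case-insensitively to some of the file names from the question. It suggests that a space does count as a word boundary, which is consistent with the large-file explanation in the question's later edit. The last file name is a hypothetical non-match added for contrast.

```python
import re

names = [
    "BEAD-air.JPG",
    "Bead, 3 sided MET DP110317.jpg",
    "Romano-British pendant amulet (bead) (FindID 241983).jpg",
    "Bead fun.jpg",
    "Beads on a string.jpg",  # hypothetical: "Beads" is a different word
]

# \b matches between a word character and a non-word character (or string
# edge), so "Bead" followed by a space, comma, hyphen or ")" all qualify.
pattern = re.compile(r"\bbead\b", re.IGNORECASE)
matches = [n for n in names if pattern.search(n)]

for n in matches:
    print(n)
```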

Grep Filenames from ls for specific part of them

I want to extract a specific part out of the filenames to work with them.
Example:
ls -1
REZ-Name1,Surname1-02-04-2012.png
REZ-Name2,Surname2-07-08-2013.png
....
So I want to get only the part with the name.
How can this be achieved?
There are several ways to do this. Here's a loop:
for file in REZ-*-??-??-????.png
do
    name=${file#*-}
    name=${name%-??-??-????.png}
    echo "($name)"
done
Given a variety of filenames with all sorts of edge cases from spacing, additional hyphens and line feeds:
REZ-Anna-Maria,de-la-Cruz-12-32-2015.png
REZ-Bjørn,Dæhlie-01-01-2015.png
REZ-First,Last-12-32-2015.png
REZ-John Quincy,Adams-11-12-2014.png
REZ-Ridiculous example # this is one filename
is ridiculous,but fun-22-11-2000.png # spanning two lines
it outputs:
(Anna-Maria,de-la-Cruz)
(Bjørn,Dæhlie)
(First,Last)
(John Quincy,Adams)
(Ridiculous example
is ridiculous,but fun)
If you're less concerned with correctness, you can simplify it further:
$ ls | grep -o '[^-]*,[^-]*'
Maria,de
Bjørn,Dæhlie
First,Last
John Quincy,Adams
is ridiculous,but fun
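For comparison, the prefix and suffix stripping done by the loop above can be sketched in Python with a single regular expression. This is only an illustration under the assumption that every name follows the REZ-<name>-??-??-????.png layout; extract_name is a hypothetical helper, and "." plays the role of the shell's "?".

```python
import re

# Mirrors the glob REZ-*-??-??-????.png. DOTALL lets "." match a newline,
# so file names that span two lines are handled like in the shell loop.
PATTERN = re.compile(r"REZ-(.*)-..-..-....\.png", re.DOTALL)


def extract_name(filename):
    """Return the name part, or None if the file name has another layout."""
    m = PATTERN.fullmatch(filename)
    return m.group(1) if m else None


for f in [
    "REZ-Anna-Maria,de-la-Cruz-12-32-2015.png",
    "REZ-John Quincy,Adams-11-12-2014.png",
    "REZ-Ridiculous example\nis ridiculous,but fun-22-11-2000.png",
]:
    print(f"({extract_name(f)})")
```

Because the date suffix has a fixed length, the greedy (.*) keeps any extra hyphens inside the name, which is the case the simpler grep -o pattern gets wrong.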
In this case, cut makes more sense than grep:
ls -1 | cut -f2 -d-
This cuts the second field from the input, using '-' as the field delimiter. (Use ls -1, one name per line, rather than ls -l: the long listing starts with a permission string that itself contains '-'.) That other guy's answer will correctly handle some cases mine will not, but for one-off uses I generally find the semantics of cut much easier to remember.

How can I grep a file for multiple unique values?

I have some firewall logs and I want to find multiple unique values. I need to find every unique combination of source IP and destination port, which are in this format in /var/log/iptables.
SRC=123.123.123.123
DPT=137
So, if source IP 123.123.123.123 makes multiple appearances on multiple ports, I want to see that, but just once for each SRC/DPT combo.
Thanks!
This awk solution might help. The first awk command combines each pair of successive SRC and DPT lines into a single line. Its output is then piped to the second awk command, which removes duplicates while retaining the original order:
awk '/^SRC|^DPT/{ORS=$0 ~ /^SRC/?" ":"\n"; print}' file.* | awk '!a[$0]++'
If SRC and DPT appear next to each other on the same line, possibly several times per line, the following should work:
grep -oE 'SRC=[[:digit:].]+[[:space:]]+DPT=[[:digit:].]+' file.txt | awk '!a[$0]++'
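The same de-duplication idea can be sketched in Python. This assumes one packet per log line with SRC= appearing before DPT= on that line; the sample lines are hypothetical, not taken from a real /var/log/iptables, and unique_pairs is an illustrative name.

```python
import re

# SRC and DPT may be separated by other fields (DST=, PROTO=, SPT=, ...).
PAIR = re.compile(r"\bSRC=(\S+).*?\bDPT=(\d+)")


def unique_pairs(lines):
    """Return (src, dpt) tuples in first-seen order, without repeats."""
    seen = set()
    ordered = []
    for line in lines:
        m = PAIR.search(line)
        if m and m.groups() not in seen:
            seen.add(m.groups())
            ordered.append(m.groups())
    return ordered


# Hypothetical log lines for illustration only.
log = [
    "IN=eth0 SRC=123.123.123.123 DST=10.0.0.1 PROTO=UDP SPT=137 DPT=137",
    "IN=eth0 SRC=123.123.123.123 DST=10.0.0.1 PROTO=TCP SPT=5000 DPT=22",
    "IN=eth0 SRC=123.123.123.123 DST=10.0.0.1 PROTO=UDP SPT=137 DPT=137",
    "IN=eth0 SRC=10.1.1.1 DST=10.0.0.1 PROTO=TCP SPT=5001 DPT=22",
]
result = unique_pairs(log)
for src, dpt in result:
    print(f"SRC={src} DPT={dpt}")
```

The seen set plays the same role as the a[$0]++ array in the awk commands above.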
You can try "grep AND", see examples from the link:
http://www.thegeekstuff.com/2011/10/grep-or-and-not-operators/

Recursively grep results and pipe back

I need to search a set of files for several terms in sequence, where each term is searched only in the files that matched the previous one. I have something like this:
input.txt
123
22
33
These terms have to be found in a set of files. The challenge is that if 123 is found in, say, 10 files, then 22 should be searched in those 10 files only, and so on.
The files are named like f1, f2, f3, f4 ... f1200,
so it is as if I need to grep -w "123" f* | grep -w "123" | .....
It's not possible to list them manually, so is there an easier way?
You can solve this with an awk script that builds the pipeline for you. I've encountered a similar problem, and this approach works:
awk 'NR==1 {printf("grep -lw %s f*", $1); next} {printf(" | xargs grep -lw %s", $1)} END {print ""}' input.txt | sh
What does it do?
It reads input.txt line by line.
For the first term it prints grep -lw TERM f*, which lists the files that contain that term.
For every later term it appends | xargs grep -lw TERM, so that term is searched only in the files that survived the previous step.
The generated command line is then sent to the shell for execution, and the names of the files that contain all the terms are printed.
Perhaps taking a meta-programming viewpoint would help. Have grep output a series of grep commands, or write a little Perl program. Maybe Ruby, if the mood suits.
You can use grep -lw to print the names of the files that matched (note that with -l, grep stops reading each file at its first match).
You capture the list of file names and use that for the next iteration in a loop.
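The narrowing loop described above can be sketched in Python as follows. It only approximates grep -w by using \b word boundaries, the helper name narrow is hypothetical, and the demo files are throwaway stand-ins for f1 ... f1200.

```python
import re
import tempfile
from pathlib import Path


def narrow(files, terms):
    """Keep only the files that contain every term as a whole word."""
    candidates = list(files)
    for term in terms:
        word = re.compile(rf"\b{re.escape(term)}\b")
        # Search this term only in the files that survived the previous one.
        candidates = [
            f for f in candidates
            if word.search(Path(f).read_text(errors="replace"))
        ]
        if not candidates:
            break
    return candidates


# Demo with temporary files (contents are made up for illustration).
with tempfile.TemporaryDirectory() as tmp:
    contents = {
        "f1": "123 22 33\n",
        "f2": "123 22\n",
        "f3": "22 33\n",
        "f4": "1234 22 33\n",  # 1234 is not the word 123
    }
    for name, text in contents.items():
        Path(tmp, name).write_text(text)
    terms = ["123", "22", "33"]  # would normally be read from input.txt
    survivors = [p.name for p in narrow(sorted(Path(tmp).glob("f*")), terms)]

print(survivors)
```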

grep for a string which has a specific number in the end

I want to grep for the string THREAD: 2, which has a space in it, but I can't figure out how.
I tried grep "THREAD:[ \2]", but it's not working.
Please let me know.
Try grep "THREAD: 2" <filename>? You just want a literal '2', right?
If you are using GNU grep, you could try egrep (equivalent to grep -E) with 'THREAD: 2$'
You might have to use '^.*THREAD: 2$'
grep reports back the entire line that has matched your pattern. If you wish to look at lines that contains THREAD: 2 then the following should work -
grep "THREAD: 2" filename
However, if you wish to fetch lines that could contain THREAD: and any number then you can use a character class. So in that case the answer would be -
grep "THREAD: [0-9]" filename
With grep -E you can add + after the character class, which means one or more digits, so that you match numbers like 1, 2, 3 as well as 11, 12, 13, etc. (In plain grep the + is taken literally; GNU grep also accepts \+ there.)
If you only want to print THREAD: 2 itself rather than the whole line, use grep's -o option, which prints only the part of each line that matched the pattern.
grep -o "THREAD: 2" filename
You can look up man page for grep and play around with all the options.
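To see how these patterns behave, here is a small Python sketch using the re module rather than grep. The log lines are hypothetical. It also shows one caveat: a bare "THREAD: 2" matches "THREAD: 21" as well, so a trailing \b (or $ if the number ends the line) is needed for an exact 2.

```python
import re

# Hypothetical log lines for illustration.
lines = [
    "INFO THREAD: 2 started",
    "INFO THREAD: 12 waiting",
    "INFO THREAD: 21 busy",
    "INFO THREADS: none",
]

# Like grep -oE "THREAD: [0-9]+": print only the matched part.
any_number = [m.group(0) for line in lines
              for m in re.finditer(r"THREAD: [0-9]+", line)]

# Like grep "THREAD: 2": whole lines, but 21 matches too.
loose = [line for line in lines if re.search(r"THREAD: 2", line)]

# Exact thread number 2: require a word boundary after the digit.
exact = [line for line in lines if re.search(r"THREAD: 2\b", line)]

print(any_number)
print(loose)
print(exact)
```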
