variable does not exist in sas data set - grep

I am trying to check whether variable GROUP exist in SAS data set file or not from the UNIX command but unfortunately it's showing that GROUP variable does not exist in the data set,However GROUP variable is present in SAS data set.
In my command for case sensitive and whole word match I am using i and w options of grep command respectively. But still UNIX command is not giving the expected result.I s there any way to fix this issue?
Below is the command which I am using:
sasfile="sasdata"
rwords="GROUP"
cat $sasfile | grep -iqw "$rwords"
Thank you

As mentioned in earlier comment
SAS data sets are stored in disk files using a proprietary format.
There may be encodings and storage methodologies that do not yield the
information you seek in a plain text examination of said disk file.
Running SAS code in a SAS session is the definitive way to glean information about a data set.
What will that code look like ?
Proc CONTENTS
Data step or macro code that uses VARNAME function
... many other ways ...
In UNIX SAS can use stdio.
From "SAS(R) 9.2 Companion for UNIX Environments", STDIO System Option: UNIX
Details
This option tells SAS to take its input from standard input (stdin),
to write its log to standard error (stderr), and to write its output
to standard output (stdout).
This option is designed for running SAS
in batch mode or from a shell script. If you specify this option
interactively, SAS starts a line mode session.
The STDIO option
overrides the DMS, DMSEXP, and EXPLORER system options. The STDIO
option does not affect the assignment of the Stdio, Stdin, and Stderr
filerefs. See Filerefs Assigned by SAS in UNIX Environments for more
information.
For example, in the following SAS command, the file
myinput is used as the source program, and files myoutput and mylog
are used for the procedure output and log respectively.
sas -stdio < myinput > myoutput 2> mylog
If you are using the C shell, you should
use parentheses:
(sas -stdio < myinput > myoutput ) >& output_log
With -stdio you want a short SAS program that can indicate if a variable is present in a data set, or perhaps emit a list of variables in a data set for further shell processing. A Proc CONTENTS step is short and sweet.
So looking for your proverbial needle in a haystack
sasfile=<path to data set file>/<dataset>.sas7bdat
needle=GROUP
echo "Proc CONTENTS data=""$sasfile""" | sas -stdio | grep $needle
The default CONTENTS output might contain yield some false matches. So you could also try
echo "Proc CONTENTS noprint data=""$sasfile"" out=list;data _null_;set list;file print;put name;"
| sas -stdio
| grep -i "GROUP"

You could try:
sasfile="sasdata"
rwords="GROUP"
grep -iw "$rwords" "$sasfile"
The only difference between these and your original commands is that I omitted cat and grep's quiet flag -q.
Sample input in sasdata:
fasd group
fdsfds fdsfdsa
fdsfd as GROUP afdsfdsa
Output:
fasd group
fdsfd as GROUP afdsfdsa
The -q flag of grep will suppress standard output but echo $? can retrieve the return value of grep. Using the same input file as before:
grep -iqw "$rwords" "$sasfile" # No stout
echo $? # Prints 0, means grep succeeded
grep -iqw "word" "$sasfile" # No stout
echo $? # Prints 1, means grep failed

Related

How to see the GNU debuglink value of an ELF file?

So I can add a link to a debug symbol file like this objcopy --add-gnu-debuglink=$name.dbg $name, but how can I later retrieve that value again?
I checked with readelf -a and grepped for \.dbg without any luck. Similarly I checked the with objdump -sj .gnu_debuglink (.gnu_debuglink is the section) and could see the value there:
$ objdump -sj .gnu_debuglink helloworld|grep \.dbg
0000 68656c6c 6f776f72 6c642e64 62670000 helloworld.dbg..
However, would there be a command that allows me to extract the retrieve the exact value again (i.e. helloworld.dbg in the above example)? That is the file name only ...
I realize I could use some shell foo here, but it seems odd that an option exists to set this value but none to retrieve it. So I probably just missed it.
You can use readelf directly:
$ readelf --string-dump=.gnu_debuglink helloworld
String dump of section '.gnu_debuglink':
[ 0] helloworld
[ 1b] 9
I do not know what the second entry means (it seems to always be different). To get rid of the header and the offsets, you can use sed:
$ readelf --string-dump=.gnu_debuglink helloworld | sed -n '/]/{s/.* //;p;q}'
helloworld
Something like this should work:
objcopy --output-target=binary --set-section-flags .gnu_debuglink=alloc \
--only-section=.gnu_debuglink helloworld helloworld.dbg
--output-target=binary avoids adding ELF headers. --set-section-flags .gnu_debuglink=alloc is needed because objcopy only writes allocated sections by default (with the binary emulation). And --only-section=.gnu_debuglink finally identifies the answer. See this earlier answer.
Note that the generated file may have a trailing NUL byte and four bytes of CRC, so some post-processing is needed to extract everything up to the first NUL byte (perhaps using head -z -n 1 helloworld.dbg | tr -d '\0' or something similar).

(bash) grep -i not making search case insensitive for input files

I am trying to search inside a folder containing several files. The name of the files is written in upper case with a .sub extension in lower case:
AAA.sub
BBB.sub
CCC.sub
DDD.sub
I am searching a pattern trough those file using grep, however i would like to only use lower case letter for the input files.
In the man page for grep it is written:
-i, --ignore-case
Ignore case distinctions in both the PATTERN and the input files. (-i is specified by POSIX.)
So, if i understood properly:
grep -i subckt /schematics/aaa
and
grep -i subckt /schematics/AAA
Are supposed to both be able to search a pattern "subckt" in the file "aaa" regardless of its case (AAA or aaa) and if two files named aaa and AAA are present at the same time in the foler, i expect grep to search trough both of them.
However when i try my search with the 1st instruction (lower case) it does not work, giving me "no such file or directory" message.
When i try to search with the 2nd instruction (upper case) it works properly.
I obviously understood something wrong about how the -i option with grep, can anyone give me an answer regarding this matter?
Is it possible to be case insensitive with the input files when using grep?
EDIT:
My question was lacking details, even tough i have found the answer to my problem i will add the details in case someone else stumbles upon this:
I have one file that contains a list of each file name i want to grep. My list looks like this:
aaa capacitor C_0
bbb capacitor C_0
ccc resistor R_in
...
The grep is done inside a perl script, the perl script parses the list file and gets the name of each individual file name (aaa bbb ccc) inside a while loop.
However the name inside the list file is written in lower case whereas the name of the files i want to grep is written in upper case.
This is why i wanted to have the input file search to be case insensitive so that i could directly do a grep -i subck aaa and it would search inside the file 'AAA'
However, since the grep is launched from a perl script, and since it is apparently not possible to have grep behave like that, i used the uc() function of perl to convert aaa to AAA and do my grep with it. (see my answer below)
-i affects how the contents are searched, not the name of the files.
When the man page says "Ignore case distinctions in both the PATTERN and the input files." that really means that case is ignored in the pattern ( searching for AAA and aaa are equivalent) and the contents of the input files (a line would match if it includes "AAA" or "aaa" or even "AaA")
I think you want to either list all the filenames on the command line, or find a glob (i.e. wildcard) that matches all the filenames:
grep -i subckt *.sub
In Unix/Linux shells (bash, zsh, and so on) "*" is processed by the shell (bash) not the command (grep). The command receives the list of files and actually can't tell the difference between whether a user typed "grep foo *" and "grep foo file1 file2 file3" (if the directory includes those 3 files)
Please try the following command
find . -iname aaa.sub | grep -rn subckt
find with -iname option will list out files ignoring their case. In the above case find . -iname will list out both aaa.sub & AAA.sub. The output is piped to the grep command.
I have found a way to circumvent my problem by using the uc (upper case) function of perl to convert the input files for the grep function into upper case.
The grep command was launched from a perl script in the first place:
grep -i subckt /schematics/aaa
So, i just did that in my perl script:
$tmp=aaa
$tmp=uc($tmp)
grep -i subckt /schematics/$tmp
Now, the "aaa" name is just an example. In the perl script it is recovered from another parsed file that is written in lower case.
Thanks for the answers tough.
grep uses the filenames as they are listed on the command line. The -i option affects the contents of the files, not the names of the files.
You can use find to select filenames to be searched. The -iname option lets you match files ignoring case.
grep subckt $(find /schematics -iname aaa.sub -print)
If you have many filenames, or those filenames include spaces or other characters that would confuse the shell, the safe and secure way to do this is using the -print0 and -0 options:
find /schematics -iname aaa.sub -print0 | xargs -r -0 grep -i subckt

How to convert any GPX file to Xcode acceptable GPX file

I am trying to simulate a path in Xcode which has speed, latitude and longitude information.
There is a site which produces the same: http://www.bikehike.co.uk/mapview.php
I found one awk script which can convert this file to Xcode acceptable format: https://gist.github.com/scotbond/8a61cf1f4a43973e570b
Tried running this command in the terminal: awk -F script.awk bikehike_course >output.gpx
Where script.awk has the script, bikehike_course has the GPX file and output.gpx is the output file name
UPDATE
Tried: awk -f script.awk bikehike_course > output.gpx
Error: awk: syntax error at source line 1 source file adjust_gpx_to_apple_format.awk
context is
awk >>> ' <<<
awk: bailing out at source line 24
I think the syntax of the GPS file is broken.
The script on adjust_gpx_to_apple_format.awk on github is a call of awk with the awk script provided as parameter (in shell syntax).
Thus, the name adjust_gpx_to_apple_format.awk is somehow misleading.
Either the awk ' at the beginning and ' $1' at the end has to be removed. In this case,
awk -f adjust_gpx_to_apple_format.awk
should work (as the script looks like a correct awk script otherwise).
If left as is, the script might be called directly in the shell:
> ./adjust_gpx_to_apple_format.awk input.txt >output.txt
In the latter case, I would suggest two additions:
Insert a "hut" in the first line e.g. #!/bin/bash which makes it more obviously.
Rename the script to adjust_gpx_to_apple_format.sh.
Note:
Remember, that the file suffix does not have the strict meaning in Unix like shells as they have for example in MSDOS. Actually, the suffix could be anything (including nothing). It's more valuable for the user than the shell and should be chosen respectively.

search a pattern from a unix directory

I want to find whether any script in one unix directory is using one particular table as "select *" or not. Means I want to find "select * from tablename". Now after select there can be any number of space or newline like "select <any number of space or newline> * <any number of space or newline> from <any number of space or newline> tablename"
Let's take this as the test file:
$ cat file
select
*
from
tablename
Using grep:
$ grep -z 'select\s*[*]\s*from\s*tablename' -- file
select
*
from
tablename
-z tells to treat the input as NUL-separated. Since no sensible text file contains NUL characters, this has the effect of reading in the whole file at once. That allows us to search over multiple lines. (If the file is too big for memory, we would want to think about another approach.) To protect against file names which begin with -, the -- is used to tell grep to stop option processing.
To obtain just the names of the matching files in the current directory:
grep -lz 'select\s*[*]\s*from\s*tablename' -- *
* tells grep to look at all files in the directory. -l tells grep to just print the names of matching files and not the matching text.
More on the need for --
Let's consider a directory with one file:
$ ls
-l-
Now, lets run a grep command:
$ grep 'regex' *
grep: invalid option -- '-'
Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.
The problem here is that grep interprets the file name as two options: l and -. Since the second is not a legal option, it reports an error. To protect against this, we need to use --. The following will run without error:
grep 'regex' -- *

grepping a not matching pattern with a pattern file and data from a pipe

I have an ignore.txt file:
cat ignore.txt
clint
when I do:
pip freeze | grep -v -f ignore.txt
I get:
GitPython==0.3.2.RC1
Markdown==2.2.1
async==0.6.1
clint==0.3.1
gitdb==0.5.4
legit==0.1.1
push-to-wordpress==0.1
python-wordpress-xmlrpc==2.2
smmap==0.8.2
but when I do:
pip freeze | grep -v clint
I do get the correct output:
GitPython==0.3.2.RC1
Markdown==2.2.1
async==0.6.1
gitdb==0.5.4
legit==0.1.1
push-to-wordpress==0.1
python-wordpress-xmlrpc==2.2
smmap==0.8.2
How can I achieve that with grep and command line tools?
Clarfication Edit: I use windows with cygwin so I believe this is GNU grep 2.6.3 (from grep --version)
Your syntax looks correct and works on my system.
There may be a problem with your ignore.txt file.
In particular, check that:
there are no leading or trailing spaces, tabs and the like around the word you are trying to filter (as suggested by Kent above)
the file has Unix line endings
the file is terminated by a single newline
About the latter, the Single Unix Specification says:
Patterns in pattern_file shall be terminated by a <newline>.
Which means that a file with no terminator, or with a different terminator (e.g. CR LF), might behave unexpectedly (though that might be system-dependent).

Resources