I am on Ubuntu 12.04, and I ran a find command to add something to all of my Python files:
find . -iname "*.py" -exec echo "import os" >> {} \;
The command runs without error and I want to validate the results so I egrep all of the files:
egrep -in "import os" *
And I get results looking like this:
{}:35:import os
{}:36:import os
{}:37:import os
{}:38:import os
{}:39:import os
...and the line numbers run up to 51 for some reason. What does this mean?
Thank you.
Your first command:
find . -iname "*.py" -exec echo "import os" >> {} \;
is looking for files ending in .py, and for each one found it appends the string "import os" to a file called {}. Presumably there are 51 matches.
So when you run egrep, the * matches all files, including your file called {}. With {}:35:import os it's telling you that "in the file {}, at line 35, there's the string you're looking for".
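You can confirm this, and clean up the stray file, like so (a quick sketch; the file really is named {}, so it's clearest to quote it):
$ cat '{}'        # one "import os" line per echo that ran
$ rm './{}'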
This command:
find . -iname "*.py" -exec echo "import os" >> {} \;
...creates a file named {} (in bash and other shells that honor redirections in positions other than the head and tail of a command; this is an extension the POSIX sh standard does not require). It does not modify the files found by find. The reason is that >> acts as an instruction to the shell that starts find; it does not modify the behavior of -exec. And even if it were part of the -exec block, it wouldn't work: -exec invokes the given command directly with execve(), not through a shell, so shell constructs such as redirections are not honored there. On any shell not implementing the extension above, you'd simply be passing a literal >> as an argument to echo, still not performing a redirection on the individual files found.
Now, if you did want to modify the files found by find, you might do so like this:
find . -iname '*.py' -exec sh -c 'for f; do echo "import os" >>"$f"; done' _ {} +
Noteworthy differences:
The redirection is invoked inside a shell started with -exec sh -c; thus, there's a shell present to honor it after the individual filenames have been resolved. (The _ fills the shell's $0 slot, so that all of the filenames land in "$@", which is what for f; do iterates over.)
-exec ... {} + is used, which is much more efficient than -exec ... {} \; (the former runs as few subcommands as possible; the latter runs one per file found).
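A related sketch avoids the loop entirely by letting tee -a append the line to every file at once (same _ placeholder convention as above):
find . -iname '*.py' -exec sh -c 'echo "import os" | tee -a "$@" >/dev/null' _ {} +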
{} is a placeholder that find replaces with each filename matching the given conditions; in this case {} would be replaced with the filenames matching the pattern "*.py".
However, your find command isn't actually doing that: the >> {} is not part of the -exec block, but is interpreted by the shell as a redirect for the whole find command, so the {} never gets replaced by find with the proper filename; instead you are redirecting into a file called {}. To make things clearer, the command you are actually executing is this:
find . -iname "*.py" -exec echo "import os" \; >> {}
Meaning: for every *.py file you add a line containing "import os" to a file called {}. The output format of grep is just filename:linenumber:matched_line, so you get a {} in there because that is the filename.
If you are wondering how the \; survives and why you are not getting a:
find: missing argument to `-exec'
The shell doesn't actually care where in the command line the redirect occurs:
echo 1 2 3 4 5 6 7 > foo
is the same as:
echo 1 2 > foo 3 4 5 6 7
and gives you this each time:
$ cat foo
1 2 3 4 5 6 7
Also worth mentioning: >> is an append operator, so even once you fix your command you will be adding to the end of the Python files, while import os should probably go at the top of the file.
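If you do want the line at the top instead, here is a sketch using GNU sed's in-place insert (the single-line 1i form is a GNU extension; note it skips empty files, which have no line 1):
find . -iname '*.py' -exec sed -i '1i import os' {} +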
Related
Say I have a directory /home/ and within it I have 3 subdirectories /home/red/ /home/blue/ /home/green/
And each subdirectory contains a file each like
/home/red/file1 /home/blue/file2 /home/green/file3
Now I want to find how many times file1,file2, file3 contains the word "hello" within them.
For example,
/home/red/file1 - 23
/home/blue/file2 - 6
/home/green/file3 - 0
Now, going to the location of each file and running the grep command there is very inefficient once this problem scales.
I have tried using this grep command from the /home/ directory
grep -rnw '/path/to/somewhere/' -e 'pattern'
But this is just giving the occurrences rather than the count.
Is there any command through which I can get what I am looking for?
If the search term occurs at maximum once per line, you can use grep's -c option to report the count instead of the matching lines. So, the command will be grep -rc 'search' (add other options as needed).
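Applied to the layout in the question, that would look something like this (the counts are hypothetical, taken from the question's example):
$ grep -rc 'hello' /home
/home/red/file1:23
/home/blue/file2:6
/home/green/file3:0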
If there can be more than one occurrence per line, I'd recommend using ripgrep. Note that rg recursively searches by default, so you can use something like rg -co 'search' from within the home directory (add other options as needed). Add --hidden if you need to search hidden files as well. Add --include-zero if you want to show files even if they didn't have any match.
Instead of grep you can use this find | gnu-awk solution:
cd /home
find {red/file1,blue/file2,green/file3} -type f -exec awk '
{c += gsub(/pattern/, "&")} ENDFILE {print FILENAME, "-", c; c=0}' {} +
I haven't worked with this stuff in years, so please be patient!
I'm having some really weird issues with Mac Excel greying out some .csv files but not others. From what I've read so far, this could have something to do with some of the more hidden file parameters.
Anyways, I'd like to find the files with a certain name in the directory, do a getfileinfo on them and spit out the result, i.e. something like:
for each i in (ls \*_xyz*.csv) do getfileinfo $i | echo
(or whatever more intelligent way this can be accomplished these days...)
I tried a few combinations but keep getting "-bash syntax error", so I've decided it's time to get help...
Thanks!!
Create dummy test files:
$ touch file{1..10}_xyz.csv
$ ls
file10_xyz.csv file1_xyz.csv file2_xyz.csv file3_xyz.csv file4_xyz.csv file5_xyz.csv file6_xyz.csv file7_xyz.csv file8_xyz.csv file9_xyz.csv
There are many ways to do this. My favorite is method 1.
Method 1)
$ find . -name "*xyz*.csv" -exec someCommand {} \;
Method 2)
$ for x in $( find . -name "*xyz*.csv") ; do someCommand $x ; done
Method 3)
$ find . -name "*xyz*.csv" | xargs someCommand
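Note that methods 2 and 3 split filenames on whitespace, so they misbehave on names containing spaces. A null-delimited sketch is safer (someCommand again stands in for whatever you want to run, e.g. GetFileInfo):
$ find . -name "*xyz*.csv" -print0 | xargs -0 someCommand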
I would like to find .plist files recursively in a folder and copy that files into new folder by a single terminal command.
find /Users/admin/Desktop/Norton/StaticAnalysis -iname "*.plist" -exec cp {} /Users/admin/Desktop/Test \;
This is the command which is working fine in terminal.
But I have to use this command in Ruby code.
When I use it in Ruby code like
CODE 1:
system ("find /Users/admin/Desktop/Norton/StaticAnalysis -iname \"*.plist\" -exec cp {} /Users/admin/Desktop/Test \;")
puts $?.success?
OUTPUT IS:
find: -exec: no terminating ";" or "+"
false
CODE 2:
system ("find /Users/admin/Desktop/Norton/StaticAnalysis -iname \"*.plist\" -exec cp {} /Users/admin/Desktop/Test \;");
end
puts $?.success?
OUTPUT IS:
siva.rb:2: syntax error, unexpected keyword_end, expecting end-of-input
So please help me with how to use this in Ruby code.
Have you tried with FileUtils module (fileutils.rb)?
It provides a namespace for several file utility methods for copying, moving, removing, etc.
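A minimal sketch of that approach (assuming a reasonably recent Ruby; the File::FNM_CASEFOLD flag is passed to Dir.glob to mimic find's -iname):
require 'fileutils'

src  = '/Users/admin/Desktop/Norton/StaticAnalysis'
dest = '/Users/admin/Desktop/Test'

# Recursively collect .plist files, case-insensitively, and copy each one
Dir.glob(File.join(src, '**', '*.plist'), File::FNM_CASEFOLD).each do |f|
  FileUtils.cp(f, dest)
end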
system ("find ... -exec ... \;")
Ruby is interpreting the \; within double quotes as just ;. You need to double the backslash:
system ("find ... -iname \"*.plist\" -exec ... \\;")
Or use different outer quotes, which means you don't have to escape the inner quotes:
system %q{find ... -iname "*.plist" -exec ... \;}
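Another option (a sketch) is the multi-argument form of Kernel#system, which bypasses the shell entirely, so nothing needs escaping; the -exec terminator is passed as a literal ';' argument:
system('find', '/Users/admin/Desktop/Norton/StaticAnalysis',
       '-iname', '*.plist',
       '-exec', 'cp', '{}', '/Users/admin/Desktop/Test', ';')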
I am trying to use a shell script (well a "one liner") to find any common lines between around 50 files.
Edit: Note I am looking for a line (lines) that appears in all the files
So far I've tried grep -v -x -f file1.sp *, which just matches that file's contents across ALL the other files.
I've also tried grep -v -x -f file1.sp file2.sp | grep -v -x -f - file3.sp | grep -v -x -f - file4.sp | grep -v -x -f - file5.sp etc... but I believe that searches using the files to be searched as stdin, not as the pattern to match on.
Does anyone know how to do this with grep or another tool?
I don't mind if it takes a while to run. I've got to add a few lines of code to around 500 files and wanted to find a common line in each of them to insert 'after' (they were originally just copied and pasted from one file, so hopefully there are some common lines!).
Thanks for your time,
When I first read this I thought you were trying to find 'any common lines'. I took this as meaning "find duplicate lines". If this is the case, the following should suffice:
sort *.sp | uniq -d
Upon re-reading your question, it seems that you are actually trying to find lines that 'appear in all the files'. If this is the case, you will need to know the number of files in your directory:
find . -type f -name "*.sp" | wc -l
If this returns the number 50, you can then use awk like this (WHINY_USERS=1 is an undocumented gawk feature that makes the for (i in array) loop iterate in sorted order, so the output comes out sorted):
WHINY_USERS=1 awk '{ array[$0]++ } END { for (i in array) if (array[i] == 50) print i }' *.sp
You can consolidate this process and write a one-liner like this:
WHINY_USERS=1 awk -v find="$(find . -type f -name "*.sp" | wc -l)" '{ array[$0]++ } END { for (i in array) if (array[i] == find) print i }' *.sp
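A variant sketch that avoids hard-coding the count (in awk, ARGC - 1 is the number of file arguments) and counts each line at most once per file, in case a line is duplicated within a single file:
awk '!seen[FILENAME, $0]++ { count[$0]++ } END { for (i in count) if (count[i] == ARGC - 1) print i }' *.sp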
old, bash answer (O(n); opens 2 * n files)
From @mjgpy3's answer, you just have to make a for loop and use comm, like this:
#!/bin/bash
tmp1="/tmp/tmp1$RANDOM"
tmp2="/tmp/tmp2$RANDOM"

cp "$1" "$tmp1"
shift

for file in "$@"
do
    comm -1 -2 "$tmp1" "$file" > "$tmp2"
    mv "$tmp2" "$tmp1"
done

cat "$tmp1"
rm "$tmp1"
Save in a comm.sh, make it executable, and call
./comm.sh *.sp
assuming all your filenames end with .sp.
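Note that comm expects its inputs to be sorted; for unsorted files, a process-substitution sketch like this handles a pair of files:
comm -12 <(sort file1.sp) <(sort file2.sp)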
Updated answer, python, opens each file only once
Looking at the other answers, I wanted to give one that opens each file only once, doesn't use any temporary file, and supports duplicated lines. Additionally, it processes the files in parallel.
Here you go (in python3):
#!/usr/bin/env python3
import argparse
import sys
import multiprocessing
import os

EOLS = {'native': os.linesep.encode('ascii'), 'unix': b'\n', 'windows': b'\r\n'}

def extract_set(filename):
    with open(filename, 'rb') as f:
        return set(line.rstrip(b'\r\n') for line in f)

def find_common_lines(filenames):
    pool = multiprocessing.Pool()
    line_sets = pool.map(extract_set, filenames)
    return set.intersection(*line_sets)

if __name__ == '__main__':
    # usage info and argument parsing
    parser = argparse.ArgumentParser()
    parser.add_argument("in_files", nargs='+',
                        help="find common lines in these files")
    parser.add_argument('--out', type=argparse.FileType('wb'),
                        help="the output file (default stdout)")
    parser.add_argument('--eol-style', choices=EOLS.keys(), default='native',
                        help="(default: native)")
    args = parser.parse_args()

    # actual stuff
    common_lines = find_common_lines(args.in_files)

    # write results to output
    to_print = EOLS[args.eol_style].join(common_lines)
    if args.out is None:
        # find out stdout's encoding, utf-8 if absent
        encoding = sys.stdout.encoding or 'utf-8'
        sys.stdout.write(to_print.decode(encoding))
    else:
        args.out.write(to_print)
Save it as find_common_lines.py, and call
python ./find_common_lines.py *.sp
More usage info with the --help option.
Combining these two answers (ans1 and ans2), I think you can get the result you need without sorting the files:
#!/bin/bash
ans="matching_lines"
for file1 in *
do
    for file2 in *
    do
        if [ "$file1" != "$ans" ] && [ "$file2" != "$ans" ] && [ "$file1" != "$file2" ] ; then
            echo "Comparing: $file1 $file2 ..." >> "$ans"
            perl -ne 'print if ($seen{$_} .= @ARGV) =~ /10$/' "$file1" "$file2" >> "$ans"
        fi
    done
done
Simply save it, give it execution rights (chmod +x compareFiles.sh) and run it. It will take all the files present in the current working directory and make an all-vs-all comparison, leaving the result in the "matching_lines" file. (The perl one-liner works because @ARGV evaluates to the number of files not yet read: a line seen while reading the first file and seen again in the second accumulates the string "10", which the /10$/ test catches.)
Things to be improved:
Skip directories
Avoid comparing all the files twice (file1 vs file2 and then file2 vs file1).
Maybe add the line number next to the matching string
Hope this helps.
Best,
Alan Karpovsky
See this answer. I originally thought a diff sounded like what you were asking for, but this answer seems much more appropriate.
I'm grepping through a large pile of code managed by git, and whenever I do a grep, I see piles and piles of messages of the form:
> grep pattern * -R -n
whatever/.git/svn: No such file or directory
Is there any way I can make those lines go away?
You can use the -s or --no-messages flag to suppress errors.
-s, --no-messages suppress error messages
grep pattern * -s -R -n
If you are grepping through a git repository, I'd recommend you use git grep. You don't need to pass in -R or the path.
git grep pattern
That will show all matches from your current directory down.
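git grep takes the familiar flags, too; for line numbers like the -n in your original command:
git grep -n pattern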
Errors like that are usually sent to the "standard error" stream, which you can redirect to a file or simply discard on most commands:
grep pattern * -R -n 2>/dev/null
I have seen that happen several times with broken links (symlinks that point to files that do not exist): grep tries to search the target file, which does not exist (hence the correct and accurate error message).
I normally don't bother while doing sysadmin tasks over the console, but from within scripts I do look for text files with "find", and then grep each one:
find /etc -type f -exec grep -nHi -e "widehat" {} \;
Instead of:
grep -nRHi -e "widehat" /etc
I usually don't let grep do the recursion itself. There are usually a few directories you want to skip (.git, .svn...)
You can build clever aliases with one-liners like that one:
find . \( -name .svn -o -name .git \) -prune -o -type f -exec grep -Hn pattern {} \;
It may seem overkill at first glance, but when you need to filter out some patterns it is quite handy.
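Recent GNU greps can also skip directories themselves, which covers the common cases without find:
grep -rn --exclude-dir=.git --exclude-dir=.svn pattern .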
Have you tried the -0 option in xargs? It needs NUL-delimited input, which find -print0 provides (plain ls does not). Something like this:
find . -type f -print0 | xargs -0 grep 'some text'
Use -I in grep to skip binary files (the -r recurses and -s suppresses the error messages).
Example: grep SEARCH_ME -Irs ~/logs.
I redirect stderr to stdout and then use grep's invert-match (-v) to exclude the warning/error string that I want to hide:
grep -r <pattern> * 2>&1 | grep -v "No such file or directory"
I was getting lots of these errors running "M-x rgrep" from Emacs on Windows with /Git/usr/bin in my PATH. Apparently in that case, M-x rgrep uses "NUL" (the Windows null device) rather than "/dev/null". I fixed the issue by adding this to .emacs:
;; Prevent issues with the Windows null device (NUL)
;; when using cygwin find with rgrep.
(defadvice grep-compute-defaults (around grep-compute-defaults-advice-null-device)
  "Use cygwin's /dev/null as the null-device."
  (let ((null-device "/dev/null"))
    ad-do-it))
(ad-activate 'grep-compute-defaults)
One easy way to make grep return zero status all the time is to use || true:
→ echo "Hello" | grep "This won't be found" || true
→ echo $?
0
As you can see, the exit status here is 0 (success). Note that grep itself exits with 1 when nothing matches and 2 (or more) on errors; || true masks both.