Interesting issues related to how 'cp --parents' works - path

I've written a short csh script that reads a file containing paths to files to be copied, then copies those files to a directory:
#!/bin/csh
#
# This script copies source and executable files modified to solve issues
# brought up by Veracode.
#

set tempdir = '~/updatedfiles2'

foreach line ( "`cat modifiedFiles`" )
    # here is the cp line
    `cp -a $line $tempdir`
end
This previously worked fine. I've since decided that I want to preserve the paths to these files as a directory tree under that same tempdir directory, because collisions occur when files with different paths have the same name
(e.g. /vobs/emv/integratedClient/jniWrapper/OEMIMAKEFILE and /vobs/mv_components/utilities/general/OEMIMAKEFILE).
So, I tried to use the --parents option, like so:
#!/bin/csh
#
# This script copies source and executable files modified to solve issues
# brought up by Veracode.
#

set tempdir = '~/updatedfiles2'

foreach line ( "`cat modifiedFiles`" )
    # here is the cp line
    `cp -a --parents $line $tempdir`
end
When I test it, it starts trying to copy the entirety of my system, starting in the root directory, which is not the effect I want. I'm just trying to copy over specific files, maintaining their directory structure as they are copied.
I've found some explanations of --parents, but none describe anything like what I'm seeing happening. Is it because I'm using --parents wrong? Is it my input file? I'm not sure.
The content of modifiedFiles (the input file read by the script) looks like this:
...
/vobs/emv/C_API/APIPrivate.cpp
/vobs/mv_components/utilities/class/Array.c
/vobs/mv_components/utilities/class/String1.c
/vobs/mv_components/export_functions/code/write_nastran_ortho3_none.c
...
/vobs is a root directory, so this may be affecting something with --parents. Has anyone heard of unrestricted recursive copying, despite specific file paths and no -r argument? Am I misunderstanding --parents?

Wow, I feel dumb.
After looking through it again and again, I've come to find what I've done wrong.
The actual command above is in a csh script. When a command is enclosed in backticks (``) in a csh script, that command is executed, and its output is then used by the shell. I was therefore running the cp, then executing its output as a command. I'm not sure why it was recursively copying upward, but removing those backticks fixed everything. There was a previous error that I ignored in my original "working" script, and when I added the --parents option, the already broken script broke even more.
Moral of the story: be careful with backticks!
For anyone who is interested, before:
...
foreach line ( "`cat modifiedFiles`" )
    # here is the cp line
    `cp -a --parents $line $tempdir`
end
...
And after:
...
foreach line ( "`cat modifiedFiles`" )
    # here is the cp line
    cp -a --parents $line $tempdir
end
...
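As a rough illustration of why the backticks matter, here is a minimal sketch. It is written for sh rather than csh (an assumption, made so that it runs in most environments), and the echoed text is made up. The point is only that a line consisting of a backquoted command runs that command and then runs whatever it printed. A command that prints nothing, as cp normally does on success, leaves nothing further to run, which may be why the original script appeared to work.

```shell
#!/bin/sh
# Sketch in sh, not csh; the echoed text is invented for illustration.

# The inner command prints "echo ran the output"; the shell then runs that text.
result=$(`echo echo ran the output`)

# A command that prints nothing leaves an empty command line, so nothing extra runs.
quiet=$(`true`)

echo "result=[$result] quiet=[$quiet]"
```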
Also, two of the entries in the input file were commented out in C style
/* comment */
That was causing the recursive copying from the root directory. Haha....eh. Stupid me.
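One plausible reading of the comment problem: an unquoted /* is a glob that matches every entry in the root directory, and -a implies a recursive copy, so a line beginning with /* hands cp the whole filesystem. The sketch below, in sh rather than csh, shows one way to read such a list while skipping comment lines. It assumes GNU cp (for --parents); the directory and file names are invented, and relative paths are used so the copied tree is easy to inspect.

```shell
#!/bin/sh
# Sketch only: assumes GNU cp; every path below is made up for the demo.
work=$(mktemp -d)
mkdir -p "$work/src/emv/jni" "$work/src/utils/general" "$work/dest"
echo one > "$work/src/emv/jni/OEMIMAKEFILE"
echo two > "$work/src/utils/general/OEMIMAKEFILE"

# Hypothetical list file, including a C-style comment line.
cat > "$work/list" <<'EOF'
emv/jni/OEMIMAKEFILE
/* utils/skipped.c */
utils/general/OEMIMAKEFILE
EOF

cd "$work/src" || exit 1
while IFS= read -r path; do
    case $path in
        ''|'/*'*) continue ;;   # skip blank lines and C-style comment lines
    esac
    # Quoting "$path" prevents globbing; --parents recreates the directories.
    cp -a --parents -- "$path" "$work/dest"
done < "$work/list"
```

Both files named OEMIMAKEFILE end up under dest in their own subdirectories, so they no longer collide.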

Related

Find the count of a specific keyword in multiple files in a directory

Say I have a directory /home/ and within it I have 3 subdirectories /home/red/ /home/blue/ /home/green/
Each subdirectory contains one file:
/home/red/file1 /home/blue/file2 /home/green/file3
Now I want to find how many times the word "hello" appears in each of file1, file2 and file3.
For example,
/home/red/file1 - 23
/home/blue/file2 - 6
/home/green/file3 - 0
Going to each file's location and running grep there does not scale.
I have tried using this grep command from the /home/ directory
grep -rnw '/path/to/somewhere/' -e 'pattern'
But this is just giving the occurrences rather than the count.
Is there any command through which I can get what I am looking for?
If the search term occurs at most once per line, you can use grep's -c option, which reports the number of matching lines per file instead of the lines themselves. So, the command will be grep -rc 'search' (add other options as needed).
If there can be more than one occurrence per line, I'd recommend using ripgrep. Note that rg recursively searches by default, so you can use something like rg -co 'search' from within the home directory (add other options as needed). Add --hidden if you need to search hidden files as well. Add --include-zero if you want to show files even if they didn't have any match.
Instead of grep you can use this find | gnu-awk solution:
cd /home
find {red/file1,blue/file2,green/file3} -type f -exec awk '
{c += gsub(/pattern/, "&")} ENDFILE {print FILENAME, "-", c; c=0}' {} +
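For comparison, here is a minimal sh sketch of the same count using only grep and wc. It assumes a grep that supports -o (GNU and BSD grep both do); the sample files and their contents are invented.

```shell
#!/bin/sh
# Sketch: per-file occurrence counts; the sample files are invented.
work=$(mktemp -d)
mkdir -p "$work/red" "$work/blue" "$work/green"
printf 'hello hello\nhello\n' > "$work/red/file1"
printf 'say hello\n' > "$work/blue/file2"
printf 'nothing here\n' > "$work/green/file3"

counts=$(
    cd "$work" || exit 1
    for f in red/file1 blue/file2 green/file3; do
        # grep -o prints each match on its own line, so wc -l counts occurrences.
        n=$(grep -o 'hello' "$f" | wc -l | tr -d ' ')
        echo "$f - $n"
    done
)
echo "$counts"

# For comparison: grep -c counts matching lines, not occurrences.
lines_in_file1=$(grep -c 'hello' "$work/red/file1")
```

Here red/file1 holds three occurrences on two lines, which is where grep -c and the occurrence count diverge.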

How to count the occurrence of a string in a file, for all files in a directory and output into a new file with shell

I have hundreds of files in a directory, and I would like to count the occurrences of a string in each file.
I would like the output to be a summary file that contains the original file name plus the count (ideally on the same line)
for example
file1 6
file2 3
file3 4
etc
Thanks for your consideration
CAUTION: I am pretty much an enthusiastic amateur, so take everything with a grain of salt.
Several questions for you - depending on your answers, the solution below may need some adjustments.
Are all your files in the same directory, or do you also need to look through subdirectories and sub-subdirectories, etc.? Below I make the simplest assumption - that all your files are in a single directory.
Are all your files text files? In the example below, the directory will contain text files, executable files, symbolic links, and directories; the count will only be given for text files. (What Linux believes to be text files, anyway.)
There may be files that do not contain the searched-for string at all. Those are not included in the output below. Do you need to show them too, with a count of 0?
I assume by "count occurrences" you mean all of them - even if the string appears more than once on the same line. (Which is why a simple grep -c won't cut it, as that only counts lines that contain the substring, no matter how many times each.)
Do you need to include hidden files (whose names begin with a period)? The code below does include them.
Do you care that the count appears first, and then the file name?
OK, so here goes.
[oracle@localhost test]$ ls -al
total 20
drwxr-xr-x. 3 oracle oinstall 81 Apr 3 18:42 .
drwx------. 39 oracle oinstall 4096 Apr 3 18:42 ..
-rw-r--r--. 1 oracle oinstall 40 Apr 3 17:44 aa
lrwxrwxrwx. 1 oracle oinstall 2 Apr 3 18:04 bb -> aa
drwxr-xr-x. 2 oracle oinstall 6 Apr 3 17:40 d1
-rw-r--r--. 1 oracle oinstall 38 Apr 3 17:56 f1
-rw-r--r--. 1 oracle oinstall 0 Apr 3 17:56 f2
-rwxr-xr-x. 1 oracle oinstall 123 Apr 3 18:15 zfgrep
-rw-r--r--. 1 oracle oinstall 15 Apr 3 18:42 .zz
Here's the command to count 'waca' in the text files in this directory (not recursive). I define a variable substr to hold the desired string. (Note that it could also be a regular expression, more generally - but I didn't test that so you will have to, if that's your use case.)
[oracle@localhost test]$ substr=waca
[oracle@localhost test]$ find . -maxdepth 1 -type f \
> -exec grep -osHI "$substr" {} \; | sed "s/^\.\/\(.*\):$substr$/\1/" | uniq -c
8 aa
2 f1
1 .zz
Explanation: I use find to find just the regular files in the current directory (excluding directories, links, and whatever other trash I may have in the directory). This includes hidden files, and it includes binary files as well as text. In this example I search the current directory, but you can use any path instead of . and I limit the depth to 1, so the search is not recursive. Then I pass the results to grep. -o means print every match (even if there are multiple matches per line of text), each on a separate line. -s suppresses error messages about nonexistent or unreadable files, -H prints the file name with each match (even when only one file is searched), and -I skips binary files.
Then I pass this to sed so that from each row output by grep I keep just the file name, without the leading ./ and without the trailing :waca. This step may not be necessary - if you don't mind the output like this:
8 ./aa:waca
2 ./f1:waca
1 ./.zz:waca
Then I pass the output to uniq -c to get the counts.
You can then redirect the output to a file, if that's what you need. (Left as a trivial exercise - since I forgot that was part of the requirement, sorry.)
Thanks for the detailed answer; it gives me ideas for future projects.
In my case the files were all the same format (output from another script) and were the only files in the directory.
I found the answer in another thread:
grep -c -R 'xxx'
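The redirect that the longer answer left as an exercise could look roughly like the sketch below. It reuses that answer's find, grep, sed and uniq pipeline, adds sort so uniq always sees identical names together, and uses awk to put the file name first. The sample files are invented, the summary file is written outside the searched directory so it is not picked up by find, and file names containing spaces or colons are assumed not to occur.

```shell
#!/bin/sh
# Sketch: write "name count" lines to a summary file; sample data is invented.
work=$(mktemp -d)
summary=$(mktemp)          # kept outside $work so find does not see it
printf 'waca waca\nwaca\n' > "$work/aa"
printf 'x waca\n' > "$work/f1"
: > "$work/f2"             # empty file, no matches

substr=waca
cd "$work" || exit 1
find . -maxdepth 1 -type f -exec grep -osHI "$substr" {} + |
    sed "s/^\.\/\(.*\):$substr\$/\1/" |   # keep only the file name
    sort | uniq -c |                      # count one line per occurrence
    awk '{print $2, $1}' > "$summary"     # file name first, then the count

cat "$summary"
```

Files without any match, such as f2 here, do not appear in the summary.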

How to copy multiple files in directory and move each into their correct directory

Unix shell ksh
I created a file list and am currently trying to copy each file to their correct path.
(mylist)
-1111
-2222
-3333
-4444
-5555
current directory
/sample/dir/unknown/
-1111fileneeded.txt
-2222fileneeded.txt
-3333fileneeded.txt
-4444fileneeded.txt
-5555fileneeded.txt
-6666dontneed.txt
-7777dontneed.txt
-8888dontneed.txt
...etc
The first 4 characters of each file name match the directory where the file needs to go.
/sample/dir/1111/
/sample/dir/2222/
/sample/dir/3333/
/sample/dir/4444/
So here is what I currently have..
for i in `cat mylist`
do echo "$i"
find /sample/dir/unknown/mylist*
This is where I am kinda stuck, trying to figure out what needs to be done to have each file moved into its correct directory.
This should work
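#!/bin/ksh
# Read one file name per line from filelist.txt.
while IFS=\| read -r line; do
    # Characters 2-5 of the name give the target directory (character 1 is the dash).
    dir=`printf '%s\n' "$line" | cut -c 2-5`
    # "--" stops mv from treating the leading dash as an option.
    mv -- "$line" "/sample/dir/$dir/$line"
done < filelist.txt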
IFS=\| sets the field separator to | (a character that should not appear in the file names), so read keeps each line intact, just in case.
cut -c 2-5 takes characters 2 through 5 (because there is a dash at the start of your file names).
Let me know if there is something else you don't understand.
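If the dashes in the question are only list bullets and the real names start with the four digits, a glob per prefix may be simpler. The following is a minimal sh sketch under that assumption; the temporary directory stands in for /sample/dir and the file names are invented.

```shell
#!/bin/sh
# Sketch: assumes list entries and file names have no leading dash.
work=$(mktemp -d)                      # stands in for /sample/dir
mkdir -p "$work/unknown" "$work/1111" "$work/2222"
touch "$work/unknown/1111fileneeded.txt" \
      "$work/unknown/2222fileneeded.txt" \
      "$work/unknown/6666dontneed.txt"
printf '1111\n2222\n' > "$work/mylist"

while IFS= read -r prefix; do
    for f in "$work/unknown/$prefix"*; do
        [ -e "$f" ] || continue        # the pattern matched nothing
        mv -- "$f" "$work/$prefix/"    # move into the directory named by the prefix
    done
done < "$work/mylist"
```

Files whose prefix is not in mylist stay where they are.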

Grep --exclude-dir (root directory only)

I'm trying to set up a grep command that searches my current directory but excludes a directory only if it sits at the root of the search.
So for the following directories, I want #1 to be excluded, and #2 to be included
1) vendor/phpunit
2) app/views/vendor
I originally started with the below command
grep -Ir --exclude-dir=vendor keywords *
I tried using ^vendor, ^vendor/, ^vendor/, ^vendor, but nothing seems to work.
Is there a way to do this with grep? I was looking to try to do it with one grep call, but if I have to, I can pipe the results to a second grep.
With pipes:
grep -Ir keywords * | grep -v '^vendor/'
The problem with exclude-dir is, it tests the name of the directory and not the path before going into it, so it is not possible to distinguish between two vendor directories based on their depths.
Here is a better solution, which will actually ignore the specified directory:
function grepex(){
    excludedir="$1"
    shift
    for i in *; do
        if [ "$i" != "$excludedir" ]; then
            grep "$@" "$i"
        fi
    done
}
You use it as a drop-in replacement to grep, just have the excluded dir as the first argument and leave the * off the end. So, your command would look like:
grepex vendor -Ir keywords
It's not perfect, but as long as you don't have any really weird folders (e.g. with names like -- or something), it will cover most use cases. Feel free to refine it if you want something more elaborate.
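Another option, swapped in here as an alternative and not part of the answers above, is to let find prune the top-level directory and hand the remaining files to grep. This is a sketch only: the sample tree is invented, and grep -l is used so the output is just the matching file names.

```shell
#!/bin/sh
# Sketch: prune only ./vendor at the top level; the sample tree is invented.
work=$(mktemp -d)
mkdir -p "$work/vendor/phpunit" "$work/app/views/vendor"
echo 'keywords here' > "$work/vendor/phpunit/a.php"
echo 'keywords here' > "$work/app/views/vendor/b.php"

cd "$work" || exit 1
# -path ./vendor matches only the top-level directory; -prune skips its contents.
matches=$(find . -path ./vendor -prune -o -type f -exec grep -Il 'keywords' {} +)
echo "$matches"
```

Because -path compares the whole path, app/views/vendor is still searched.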

vim nerdtree files show up with * appended [duplicate]

Possible Duplicate:
gVim displays every file with an asterisk on the right (and bold)?
I'm using vim with nerdtree plugin for my rails projects and some of the files show up with a * appended to the filename. They are also a different color from the other files.
edit.html.erb*
index.html.erb
show.html.erb*
What does the * mean?
The key is the executable bit. For example, if you do this:
$ touch no_exec_file exec_file
$ chmod -v u+x exec_file
$ ls -lF
total 0
-rwxr--r-- 1 reoo reoo 0 2012-09-19 19:14 exec_file*
-rw-r--r-- 1 reoo reoo 0 2012-09-19 19:14 no_exec_file
You can see the '*' after exec_file. If you now open Vim, you will see the same '*' symbol next to exec_file.
So, the NERDTree plugin shows the '*' symbol for files that can be executed by the user.
It means that your files are executable, meaning they have been given execute permission. Or they are files like .exe, for example.
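If the marker is unwanted, clearing the executable bit should make it go away. A small sketch, using ls -F as a stand-in for what NERDTree displays (that equivalence is an assumption here) and invented file names:

```shell
#!/bin/sh
# Sketch: ls -F marks executables with '*'; file names are invented.
work=$(mktemp -d)
cd "$work" || exit 1
touch edit.html.erb index.html.erb
chmod u+x edit.html.erb

before=$(ls -F)          # edit.html.erb is listed with a trailing '*'
chmod a-x edit.html.erb  # clear the executable bit for everyone
after=$(ls -F)           # the '*' is gone

printf 'before:\n%s\nafter:\n%s\n' "$before" "$after"
```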
