How to keep from duplicating path variable in csh - path

It is typical to have something like this in your cshrc file for setting the path:
set path = ( . $otherpath $path )
But the path gets duplicated when you source your cshrc file multiple times. How do you prevent the duplication?
EDIT: This is one unclean way of doing it:
set localpaths = ( . $otherpaths )
echo ${path} | egrep -i "$localpaths" >& /dev/null
if ($status != 0) then
set path = ( . $otherpaths $path )
endif

I'm surprised no one used the tr ":" "\n" | grep -x technique to check whether a given folder already exists in $PATH. Any reason not to?
In one line:
if ! echo "$PATH" | tr ":" "\n" | grep -qx "$dir" ; then PATH=$PATH:$dir ; fi
Here is a function I wrote to add several folders at once to $PATH (use "aaa:bbb:ccc" notation as argument), checking each one for duplicates before adding:
append_path()
{
local SAVED_IFS="$IFS"
local dir
IFS=:
for dir in $1 ; do
if ! echo "$PATH" | tr ":" "\n" | grep -qx "$dir" ; then
PATH=$PATH:$dir
fi
done
IFS="$SAVED_IFS"
}
It can be called in a script like this:
append_path "/test:$HOME/bin:/example/my dir/space is not an issue"
It has the following advantages:
No bashisms or other shell-specific syntax. It runs perfectly with #!/bin/sh (I've tested with dash)
Multiple folders can be added at once
No sorting, preserves folder order
Deals perfectly with spaces in folder names
A single test works no matter if $folder is at the beginning, end, middle, or is the only folder in $PATH (thus avoiding testing x:*, *:x, :x:, x, as many of the solutions here implicitly do)
Works (and preserves them) if $PATH begins or ends with ":", or has "::" in it (meaning current folder)
No awk or sed needed.
EPA friendly ;) Original IFS value is preserved, and all other variables are local to the function scope.
Hope that helps!
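The membership test at the heart of this answer can be pulled out into a small helper. This is only a minimal sketch: the function name path_contains is hypothetical, and it adds grep's -F flag (an assumption beyond the original one-liner) so that directory names containing regex characters such as "." are compared literally.

```shell
# path_contains LIST DIR  (hypothetical helper name)
# Succeeds if DIR is exactly one of the colon-separated entries of LIST.
path_contains() {
    # One entry per line; -F literal match, -x whole line, -q no output.
    printf '%s\n' "$1" | tr ':' '\n' | grep -Fqx -- "$2"
}

demo_path="/usr/bin:/opt/my dir/bin:/bin"
if path_contains "$demo_path" "/opt/my dir/bin"; then
    echo "already present"
fi
if ! path_contains "$demo_path" "/usr"; then
    echo "not present"
fi
```

Because the comparison is per line, a partial match such as /usr against /usr/bin is rejected without any extra pattern cases.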

OK, not in csh, but this is how I append $HOME/bin to my path in bash...
case $PATH in
$HOME/bin | $HOME/bin:* | *:$HOME/bin | *:$HOME/bin:* ) ;;
*) export PATH=$PATH:$HOME/bin ;;
esac
season to taste...
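One way to generalize the case idiom is to wrap PATH in sentinel colons so that a single pattern covers the first, middle, last, and only-entry positions. The sketch below assumes a hypothetical helper name, add_to_path_once, and runs its demo in a subshell so the real PATH is left alone.

```shell
# add_to_path_once DIR  (hypothetical helper name)
# Appends DIR to PATH unless it is already an entry.
add_to_path_once() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present somewhere in the list
        *) PATH="$PATH:$1" ;;
    esac
}

(
    # Demo in a subshell so the caller's PATH is untouched.
    PATH="/usr/bin:/bin"
    add_to_path_once "/opt/tools/bin"
    add_to_path_once "/opt/tools/bin"   # second call is a no-op
    add_to_path_once "/usr/bin"         # first entry is detected too
    echo "$PATH"
)
```

The demo prints /usr/bin:/bin:/opt/tools/bin.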

You can use the following Perl script to prune paths of duplicates.
#!/usr/bin/perl
#
# ^^ ensure this is pointing to the correct location.
#
# Title: SLimPath
# Author: David "Shoe Lace" Pyke <eselle@users.sourceforge.net>
# : Tim Nelson
# Purpose: To create a slim version of my environment path so as to eliminate
# duplicate entries and ensure that the "." path was last.
# Date Created: April 1st 1999
# Revision History:
# 01/04/99: initial tests.. didn't work very well at all
# : retrieved path through '$ENV' call
# 07/04/99: After an email from Tim Nelson <wayland@ne.com.au> got it to
# work.
# : used 'push' to add to array
# : used 'join' to create a delimited string from a list/array.
# 16/02/00: fixed cmd-line options to look/work better
# 25/02/00: made verbosity level-oriented
#
#
use Getopt::Std;
sub printlevel;
$initial_str = "";
$debug_mode = "";
$delim_chr = ":";
$opt_v = 1;
getopts("v:hd:l:e:s:");
OPTS: {
$opt_h && do {
print "\n$0 [-v level] [-d level] [-l delim] ( -e varname | -s strname | -h )";
print "\nWhere:";
print "\n -h This help";
print "\n -d Debug level";
print "\n -l Delimiter (between path vars)";
print "\n -e Specify environment variable (NB: don't include \$ sign)";
print "\n -s String (ie. $0 -s \$PATH:/looser/bin/)";
print "\n -v Verbosity (0 = quiet, 1 = normal, 2 = verbose)";
print "\n";
exit;
};
$opt_d && do {
printlevel 1, "You selected debug level $opt_d\n";
$debug_mode = $opt_d;
};
$opt_l && do {
printlevel 1, "You are going to delimit the string with \"$opt_l\"\n";
$delim_chr = $opt_l;
};
$opt_e && do {
if($opt_s) { die "Cannot specify BOTH env var and string\n"; }
printlevel 1, "Using Environment variable \"$opt_e\"\n";
$initial_str = $ENV{$opt_e};
};
$opt_s && do {
printlevel 1, "Using String \"$opt_s\"\n";
$initial_str = $opt_s;
};
}
if( ($#ARGV != 1) and !$opt_e and !$opt_s){
die "Nothing to work with -- try $0 -h\n";
}
$what = shift @ARGV;
# Split path using the delimiter
@dirs = split(/$delim_chr/, $initial_str);
$dest;
@newpath = ();
LOOP: foreach (@dirs){
# Ensure the directory exists and is a directory
if(! -e ) { printlevel 1, "$_ does not exist\n"; next; }
# If the directory is ., set $dot and go around again
if($_ eq '.') { $dot = 1; next; }
# if ($_ ne `realpath $_`){
# printlevel 2, "$_ becomes ".`realpath $_`."\n";
# }
undef $dest;
#$_=Stdlib::realpath($_,$dest);
# Check for duplicates and dot path
foreach $adir (@newpath) { if($_ eq $adir) {
printlevel 2, "Duplicate: $_\n";
next LOOP;
}}
push @newpath, $_;
}
# Join creates a string from a list/array delimited by the first expression
print join($delim_chr, @newpath) . ($dot ? $delim_chr.".\n" : "\n");
printlevel 1, "Thank you for using $0\n";
exit;
sub printlevel {
my($level, $string) = @_;
if($opt_v >= $level) {
print STDERR $string;
}
}
I hope that's useful.

I've been using the following (Bourne/Korn/POSIX/Bash) script for most of a decade:
: "@(#)$Id: clnpath.sh,v 1.6 1999/06/08 23:34:07 jleffler Exp $"
#
# Print minimal version of $PATH, possibly removing some items
case $# in
0) chop=""; path=${PATH:?};;
1) chop=""; path=$1;;
2) chop=$2; path=$1;;
*) echo "Usage: `basename $0 .sh` [\$PATH [remove:list]]" >&2
exit 1;;
esac
# Beware of the quotes in the assignment to chop!
echo "$path" |
${AWK:-awk} -F: '#
BEGIN { # Sort out which path components to omit
chop="'"$chop"'";
if (chop != "") nr = split(chop, remove); else nr = 0;
for (i = 1; i <= nr; i++)
omit[remove[i]] = 1;
}
{
for (i = 1; i <= NF; i++)
{
x=$i;
if (x == "") x = ".";
if (omit[x] == 0 && path[x]++ == 0)
{
output = output pad x;
pad = ":";
}
}
print output;
}'
In Korn shell, I use:
export PATH=$(clnpath /new/bin:/other/bin:$PATH /old/bin:/extra/bin)
This leaves me with PATH containing the new and other bin directories at the front, plus one copy of each directory name in the main path value, except that the old and extra bin directories have been removed.
You would have to adapt this to C shell (sorry - but I'm a great believer in the truths enunciated at C Shell Programming Considered Harmful). Primarily, you won't have to fiddle with the colon separator, so life is actually easier.
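For readers who only need the de-duplication half of the script above, the same first-occurrence-wins idea can be sketched in a few lines of awk. This is an illustrative reduction rather than the author's script: the function name dedupe_path is hypothetical, and unlike clnpath it neither rewrites empty entries to "." nor supports a removal list.

```shell
# dedupe_path LIST  (hypothetical helper name)
# Prints LIST with later duplicates removed, keeping first-seen order.
dedupe_path() {
    printf '%s\n' "$1" | awk -F: '{
        out = ""; sep = ""
        for (i = 1; i <= NF; i++) {
            if (!seen[$i]++) {      # true only the first time an entry appears
                out = out sep $i
                sep = ":"
            }
        }
        print out
    }'
}

dedupe_path "/new/bin:/usr/bin:/bin:/usr/bin:/new/bin"
```

The call above prints /new/bin:/usr/bin:/bin.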

Well, if you don't care what order your paths are in, you could do something like:
set path=(`echo $path | tr ' ' '\n' | sort | uniq | tr '\n' ' '`)
That will sort your paths and remove any extra paths that are the same. If you have . in your path, you may want to remove it with a grep -v and re-add it at the end.

Here is a long one-liner without sorting:
set path = ( `echo $path | tr ' ' '\n' | perl -e 'while (<>) { print $_ unless $s{$_}++; }' | tr '\n' ' '` )

dr_peper,
I usually prefer to stick to the scripting capabilities of the shell I am working in; it makes things more portable. So I liked your solution using csh scripting. I just extended it to check each directory in localdirs separately to make it work for me.
foreach dir ( $localdirs )
echo ${path} | egrep -i "$dir" >& /dev/null
if ($status != 0) then
set path = ( $dir $path )
endif
end

Using sed(1) to remove duplicates.
$ PATH=$(echo $PATH | sed -e 's/$/:/;s/^/:/;s/:/::/g;:a;s#\(:[^:]\{1,\}:\)\(.*\)\1#\1\2#g;ta;s/::*/:/g;s/^://;s/:$//;')
This will remove the duplicates after the first instance, which may or may not be what you want, e.g.:
$ NEWPATH=/bin:/usr/bin:/bin:/usr/local/bin:/usr/local/bin:/bin
$ echo $NEWPATH | sed -e 's/$/:/; s/^/:/; s/:/::/g; :a; s#\(:[^:]\{1,\}:\)\(.*\)\1#\1\2#g; t a; s/::*/:/g; s/^://; s/:$//;'
/bin:/usr/bin:/usr/local/bin
$
Enjoy!

Here's what I use - perhaps someone else will find it useful:
#!/bin/csh
# ABSTRACT
# /bin/csh function-like aliases for manipulating environment
# variables containing paths.
#
# BUGS
# - These *MUST* be single line aliases to avoid parsing problems apparently related
# to if-then-else
# - Aliases currently perform tests in inefficient in order to avoid parsing problems
# - Extremely fragile - use bash instead!!
#
# AUTHOR
# J. P. Abelanet - 11/11/10
# Function-like alias to add a path to the front of an environment variable
# containing colon (':') delimited paths, without path duplication
#
# Usage: prepend_path ENVVARIABLE /path/to/prepend
alias prepend_path \
'set arg2="\!:2"; if ($?\!:1 == 0) setenv \!:1 "$arg2"; if ($?\!:1 && $\!:1 !~ {,*:}"$arg2"{:*,}) setenv \!:1 "$arg2":"$\!:1";'
# Function-like alias to add a path to the back of any environment variable
# containing colon (':') delimited paths, without path duplication
#
# Usage: append_path ENVVARIABLE /path/to/append
alias append_path \
'set arg2="\!:2"; if ($?\!:1 == 0) setenv \!:1 "$arg2"; if ($?\!:1 && $\!:1 !~ {,*:}"$arg2"{:*,}) setenv \!:1 "$\!:1":"$arg2";'

When setting path (lowercase, the csh variable) rather than PATH (the environment variable) in csh, you can use set -f and set -l, which will only keep one occurrence of each list element (preferring to keep either the first or last, respectively).
https://nature.berkeley.edu/~casterln/tcsh/Builtin_commands.html#set
So something like this
cat foo.csh # or .tcshrc or whatever:
set -f path = (/bin /usr/bin . ) # initial value
set -f path = ($path /mycode /hercode /usr/bin ) # add things, both new and duplicates
Will not keep extending PATH with duplicates every time you source it:
% source foo.csh
% echo $PATH
/bin:/usr/bin:.:/mycode:/hercode
% source foo.csh
% echo $PATH
/bin:/usr/bin:.:/mycode:/hercode
set -f there ensures that only the first occurrence of each PATH element is kept.

I always set my path from scratch in .cshrc.
That is I start off with a basic path, something like:
set path = (. ~/bin /bin /usr/bin /usr/ucb /usr/bin/X11)
(depending on the system).
And then do:
set path = ($otherPath $path)
to add more stuff

I have the same need as the original question.
Building on your previous answers, I have used in Korn/POSIX/Bash:
export PATH=$(perl -e 'print join ":", grep {!$h{$_}++} split ":", $ARGV[0]' "$otherpath:$PATH")
I had difficulty translating it directly to csh (csh escape rules are insane). I have used (as suggested by dr_pepper):
set path = ( `echo $otherpath $path | tr ' ' '\n' | perl -ne 'print $_ unless $h{$_}++' | tr '\n' ' '`)
Do you have any ideas for simplifying it further (reducing the number of pipes)?

Extract bin name from Cargo.toml using Bash

I am trying to extract the bin names from Cargo.toml using Bash. I enabled Perl regular expressions like this:
First attempt
grep -Pzo '(?<=(^\[\[bin\]\]))\s*name\s*=\s*"(.*)"' ./Cargo.toml
The regular expression is tested at regex101
But got nothing
the Pzo options usage can be found here
Second attempt
grep -P (?<=(^[[bin]]))\n*\sname\s=\s*"(.*)" ./Cargo.toml
Still nothing
grep -Pzo '(?<=(^\[\[bin\]\]))\s*name\s*=\s*"(.*)"' ./Cargo.toml
Cargo.toml
[[bin]]
name = "acme1"
path = "bin/acme1.rs"
[[bin]]
name = "acme2"
path = "src/acme1.rs"
grep:
grep -A1 '^\[\[bin\]\]$' Cargo.toml |
grep -Po '(?<=^name = ")[^"]*(?=".*)'
or if you can use awk, this is more robust
awk '
$1 ~ /^\[\[?[[:alnum:]]*\]\]?$/{
if ($1=="[[bin]]" || $1=="[bin]") {bin=1}
else {bin=0}
}
bin==1 &&
sub(/^[[:space:]]*name[[:space:]]*=[[:space:]]*/, "") {
sub(/^"/, ""); sub(/".*$/, "")
print
}' cargo.toml
Example:
$ cat cargo.toml
[[bin]]
name = "acme1"
path = "bin/acme1.rs"
[bin]
name="acme2"
[[foo]]
name = "nobin"
[bin]
not_name = "hello"
name="acme3"
path = "src/acme3.rs"
[[bin]]
path = "bin/acme4.rs"
name = "acme4" # a comment
$ sh solution
acme1
acme2
acme3
acme4
Obviously, these are no substitute for a real toml parser.
With your shown samples and attempts, please try the following tac + awk combination. It is easy to maintain and does a job that would be difficult with grep alone.
tac Input_file |
awk '
/^name =/{
gsub(/"/,"",$NF)
value=$NF
next
}
/^path[[:space:]]+=[[:space:]]+"bin\//{
print value
value=""
}
' |
tac
Explanation: a detailed, commented version of the code above.
tac Input_file | ##Using tac command on Input_file to print it in bottom to top order.
awk ' ##passing tac output to awk as standard input.
/^name =/{ ##Checking if line starts from name = then do following.
gsub(/"/,"",$NF) ##Globally substituting " with NULL in last field.
value=$NF ##Setting value to last field value here.
next ##next will skip all further statements from here.
}
/^path[[:space:]]+=[[:space:]]+"bin\//{ ##Checking if line starts from path followed by space = followed by spaces followed by "bin/ here.
print value ##printing value here.
value="" ##Nullifying value here.
}
' | ##Passing awk program output as input to tac here.
tac ##Printing values in their actual order.

Nullify fields in pipe delimited file

I am not able to get the desired output when a data field has a pipe in it.
The input is as follows
(the sample file is named tst):
hdr1|"hdr2|tst"|"hdr3|tst|tst"|hdr4|"hdr5|tst|tst"
lbl1|"lbl2|tst"|"lbl3|tst|tst"|lbl4|"lbl5|tst|tst"
I tried this command but don't get the expected output: cut -f2,3 -d"|" tst
The expected output is
"hdr2|tst"|"hdr3|tst|tst"
"lbl2|tst"|"lbl3|tst|tst"
Is there an easy way to get this output? I don't want to use sed because the tool I am embedding this command in does not allow the backslash character (\).
I am also using an old version of gawk,
so this command doesn't give the desired output:
gawk -v FPAT='[^|]*|("[^"]*")+' '{print $2, $3}' OFS="|"
Output of gawk --version
GNU Awk 3.1.7
Output of cat -vet tst
hdr1|"hdr2|tst"|"hdr3|tst|tst"|hdr4|"hdr5|tst|tst"$
lbl1|"lbl2|tst"|"lbl3|tst|tst"|lbl4|"lbl5|tst|tst"$
Upgrading your gawk version is by far the best approach, as you're missing a few bug fixes and a ton of extremely useful functionality introduced since gawk 3.1.7 came out 10+ years ago (we're currently on gawk version 5.1!). If you can't do that for some reason, here's what you can do without FPAT, using any awk in any shell on every UNIX box:
$ cat tst.awk
BEGIN { OFS="|" }
{
orig = $0
$0 = i = ""
while ( (orig != "") && match(orig,/[^|]*|("[^"]*")+/) ) {
$(++i) = substr(orig,RSTART,RLENGTH)
orig = substr(orig,RSTART+RLENGTH+1)
}
print $2, $3
}
$ awk -f tst.awk file
"hdr2|tst"|"hdr3|tst|tst"
"lbl2|tst"|"lbl3|tst|tst"
Just to verify that it's identifying all of the fields correctly:
$ cat tst.awk
BEGIN { OFS="|" }
{
orig = $0
$0 = i = ""
while ( (orig != "") && match(orig,/[^|]*|("[^"]*")+/) ) {
$(++i) = substr(orig,RSTART,RLENGTH)
orig = substr(orig,RSTART+RLENGTH+1)
}
print NF " <" $0 ">"
for (i=1; i<=NF; i++) {
print "\t" i " <" $i ">"
}
}
$ awk -f tst.awk file
5 <hdr1|"hdr2|tst"|"hdr3|tst|tst"|hdr4|"hdr5|tst|tst">
1 <hdr1>
2 <"hdr2|tst">
3 <"hdr3|tst|tst">
4 <hdr4>
5 <"hdr5|tst|tst">
5 <lbl1|"lbl2|tst"|"lbl3|tst|tst"|lbl4|"lbl5|tst|tst">
1 <lbl1>
2 <"lbl2|tst">
3 <"lbl3|tst|tst">
4 <lbl4>
5 <"lbl5|tst|tst">
If you don't have embedded double quotes, you can replace the delimiters inside quoted values with another unused character (I used ~) and switch back to the original character after extraction. Obviously this requires that the new character is not used within the text.
$ awk 'BEGIN{OFS=FS="\""} {for(i=2;i<NF;i+=2) gsub("\\|","~",$i)}1' file |
awk 'BEGIN{OFS=FS="|"} {print $2,$3}' |
sed 's/~/|/g'
"hdr2|tst"|"hdr3|tst|tst"
"lbl2|tst"|"lbl3|tst|tst"
Not sure it's simpler than the single awk script though.
The main problem here is the design of the document format: it requires another patch if there are embedded double quotes, escaped pipes, etc.

Parsing and Printing $PATH Using Unix

I've placed my PATH in a text file and would like to print each path on a newline using a simple command in UNIX.
I've found a long way to do it that goes like this...
cat Path.txt | awk -F\; '{print $1"\n", $2"\n", ... }'
This however seems inefficient so I know there must be a way to quickly print out my results on new lines each time without having to manually call each field separated by the delimiter.
Yet another way:
echo $PATH | tr : '\n'
or:
tr : '\n' <Path.txt
The tr solution is the right one but if you were going to use awk then there'd be no need for a loop:
$ echo "$PATH"
/usr/local/bin:/usr/bin:/cygdrive/c/winnt/system32:/cygdrive/c/winnt
$ echo "$PATH" | awk -F: -v OFS="\n" '$1=$1'
/usr/local/bin
/usr/bin
/cygdrive/c/winnt/system32
/cygdrive/c/winnt
I have a Perl script that I use for this:
#!/usr/bin/env perl
#
# "@(#)$Id: echopath.pl,v 1.8 2011/08/22 22:15:53 jleffler Exp $"
#
# Print the components of a PATH variable one per line.
# If there are no colons in the arguments, assume that they are
# the names of environment variables.
use strict;
use warnings;
@ARGV = $ENV{PATH} unless @ARGV;
foreach my $arg (@ARGV)
{
my $var = $arg;
$var = $ENV{$arg} if $arg =~ /^[A-Za-z_][A-Za-z_0-9]*$/;
$var = $arg unless $var;
my @lst = split /:/, $var;
foreach my $val (@lst)
{
print "$val\n";
}
}
I invoke it like:
echopath $PATH
echopath PATH
echopath LD_LIBRARY_PATH
echopath CDPATH
echopath MANPATH
echopath $CLASSPATH
etc. You can specify the variable name, or the value of the variable; it works both ways.
With Perl for UNIX/UNIX-likes :
echo $PATH | perl -F: -ane '{print join "\n", @F}'
With any OSes (tested on Windows XP, Linux, Minix, Solaris):
my $sep;
my $path;
if ($^O =~ /^MS/) {
$sep = ";";
$path = "Path";
}
else {
$sep = ":";
$path = "PATH";
}
print join "\n", split $sep, $ENV{$path} . "\n";
If using bash for Unix, try the following code :
printf '%s\n' ${PATH//:/ }
This uses bash parameter expansion.
awk:
echo $PATH|awk -F: '{gsub(/:/,"\n");print}'
perl:
echo $PATH|perl -F: -lane 'foreach(@F){print $_}'
for AWK, in addition to:
echo $PATH | awk -vFS=':' -vOFS='\n' '$1=$1'
You can:
echo $PATH | awk -vRS=':' '1'
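The same split can also be done in plain POSIX shell with no external command, by letting the shell's own field splitting work on ":". A minimal sketch, assuming a hypothetical function name print_path_entries. The subshell body and set -f are there so that IFS stays local and entries containing glob characters are not expanded.

```shell
# print_path_entries LIST  (hypothetical helper name)
# Prints each colon-separated entry of LIST on its own line.
print_path_entries() (
    # Subshell body: the IFS and globbing changes do not leak to the caller.
    IFS=:
    set -f                      # no filename expansion of the unquoted $1
    for entry in $1; do
        printf '%s\n' "$entry"
    done
)

print_path_entries "/usr/local/bin:/usr/bin:/opt/my dir/bin"
```

Entries containing spaces, such as "/opt/my dir/bin", stay on one line because only ":" is used as a separator.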

Inserting a matched string from previous line to the current line using sed or awk

I have a CSV file that shows the statistics for links at half-hour intervals. The link name only appears on the 00:00 line.
link1,0:00,0,0,0,0
,00:30,0,0,0,0
,01:00,0,0,0,0
,01:30,0,0,0,0
,02:00,0,0,0,0
,02:30,0,0,0,0
,03:00,0,0,0,0
,03:30,0,0,0,0
,23:30,0,0,0,0
....
....
link2,00:00,0,0,0,0
How do I copy the link name to every other line until the link name is different, using sed or awk?
With awk, just keep track of the last seen non-empty link name, and always use that.
awk -F, -v OFS=, '$1 != "" { link=$1 } { $1 = link; print $0 }'
Omitting the ellipses, this gives:
link1,0:00,0,0,0,0
link1,00:30,0,0,0,0
link1,01:00,0,0,0,0
link1,01:30,0,0,0,0
link1,02:00,0,0,0,0
link1,02:30,0,0,0,0
link1,03:00,0,0,0,0
link1,03:30,0,0,0,0
link1,23:30,0,0,0,0
link2,00:00,0,0,0,0
This is a simpler job with awk, but if you want to use sed:
sed -e '/^[^,]/{h;s/,.*//;x};/^,/{G;s/^\(.*\)\n\(.*\)/\2\1/}'
Below is a commented version in sed script file format that can be run with sed -f script:
# For lines not beginning with a ',', save what precedes the first ',' in the hold space and print the original line.
/^[^,]/{
h
s/,.*//
x}
# For lines beginning with a ',', put what has been saved in the hold space at the beginning of the pattern space and print.
/^,/{
G
s/^\(.*\)\n\(.*\)/\2\1/}
You can do that in pure bash shell without needing to start a new process, which should be faster than using awk or sed:
IFS=","
while read v1 v2; do
if [[ $v1 != "" ]]; then
link=$v1;
fi
printf "%s,%s\n" "$link" "$v2"
done < file
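A variant of the loop above, sketched under a few assumptions: it sets IFS only for the read command so the rest of the script keeps its normal field splitting, uses read -r so backslashes in the data are not interpreted, and reads from standard input. The function name fill_link_names is hypothetical.

```shell
# fill_link_names  (hypothetical helper name)
# Reads CSV on stdin and copies the last non-empty first field
# onto every following row whose first field is empty.
fill_link_names() {
    link=""
    # IFS=, applies only to this read; -r keeps backslashes literal.
    while IFS=, read -r first rest; do
        if [ -n "$first" ]; then
            link=$first
        fi
        printf '%s,%s\n' "$link" "$rest"
    done
}

printf '%s\n' 'link1,0:00,0,0' ',00:30,0,0' 'link2,00:00,0,0' | fill_link_names
```

It would be called on a real file as fill_link_names < file.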

how can I strip the filename from a path in tcsh?

Given this variable in tcsh:
set i = ~/foo/bar.c
how can I get just the directory part of $i?
~/foo
If your system provides a 'dirname' command you could:
set i = `dirname ~/foo/bar.c`
echo $i
Note the missing $ in front of the variable name. This solution is shell agnostic though.
Here is something different from above:
Available in tcsh but few other shells AFAIK
> set i = ~/foo/bar.c
> echo ${i:t}
bar.c
> echo ${i:h}
/home/erflungued/foo
The way I found to do it while waiting for answers here:
set i = ~/foo/bar.c
echo $i:h
result:
~/foo
For completeness, getting the file name is accomplished with the basename command:
set j = `basename ~/foo/bar.c`
echo $j
echo $i | awk -F"/" '{$NF="";print}' OFS="/"
Use dirname command, for example:
set i = `dirname "~/foo/bar.c"`
Notice the quotation marks around path. It's important to include them. If you skip the quotation marks, dirname will fail for paths which contain spaces. Mind that ~/ expression evaluates before dirname is executed, thus even such simple example may fail if quotation marks are not used and home path includes spaces.
Of course the same problem applies also to all other commands, it's good practice to always surround argument to a command with quotation marks.
Use dirname "$i" indeed, and not ${i:h}.
The latter does not produce the intended result if $i contains only a file name (no path), while dirname correctly returns the current directory . in that case.
> set i = bar.c
> echo ${i:h}
bar.c
> dirname "$i"
.
