Escaping regex in a Ruby awk system command - ruby-on-rails

The following works directly in my Mac OS X terminal, creating a file with a few lines:
awk '!/^1499\||^1598\||^1599\||^1999\||^2298\||^2299\||^2403\|/' "#{working_path}" > "#{filtered_file_path}"
However, when I attempt to use it in Ruby on Rails using backticks, the resulting file is empty:
`awk '!/^1499\||^1598\||^1599\||^1999\||^2298\||^2299\||^2403\|/' "#{working_path}" > "#{filtered_file_path}"`
An awk with a simple regex works. For example:
`awk '!/SMITH/' "#{working_path}" > "#{filtered_file_path}"`
So, the issue appears to be with the escaped pipe characters.
Ideas?
Some background I should have provided:
The file I am processing is pipe-delimited. I am filtering out lines with certain codes that are in the first value on the line. So, the regex I am using is something like ^2298\|.
The other pipes in the expression in single quotes are regex OR operators.
"working_path" and "filtered_file_path" are Ruby variables.

I just figured it out. The backslash that is escaping the pipe characters also needs to be escaped. Not sure why there is a difference between the regular Terminal and Ruby, but there it is. The working version:
`awk '!/^1499\\||^1598\\||^1599\\||^1999\\||^2298\\||^2299\\||^2403\\|/' "#{working_path}" > "#{filtered_file_path}"`
After challenging my assumption that the problem was Ruby on Rails, the accepted answer here is what explained it:
Pipe symbol | in AWK field delimiter
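A likely explanation for the difference: Ruby backticks process backslash escapes the same way double-quoted strings do, and \| is not an escape Ruby recognises, so the backslash is dropped before the shell ever sees the command. awk then receives bare | characters (empty alternatives), which would plausibly explain the empty output. A minimal sketch, using ordinary string literals to show what the shell would receive (the variable names are made up for illustration):

```ruby
# Backticks handle backslash escapes like double-quoted strings,
# so a plain string literal shows what the shell would receive.
unescaped = "awk '!/^1499\|/'"    # \| is not a known escape; the backslash is dropped
escaped   = "awk '!/^1499\\|/'"   # \\ yields one literal backslash

puts unescaped  # awk '!/^1499|/'
puts escaped    # awk '!/^1499\|/'

# One alternative: keep the awk program in a single-quoted string, where
# \| is left untouched, and interpolate it into the command string.
program = '!/^1499\||^1598\|/'
command = "awk '#{program}'"
puts command    # awk '!/^1499\||^1598\|/'
```

Interpolated text is inserted verbatim, so the single-quoted variant avoids doubling every backslash by hand.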

Related

Linux search string in a file with space

I'm trying to figure out how to search for a string in Linux. I hope someone can help me out.
grep "Test\|Account" test.txt
The above command works if I only want to search for one word.
But when I try to search for "Create Test 'account'", I'm not sure how to use grep, since I'm a newbie in Linux.
With GNU grep, you can use
grep "Create Test 'account'\|Create Test \`account\`" test.txt
Here, the backticks are escaped since they are used inside a double quoted string where they are evaluated. The | regex alternation operator is escaped because it is considered a literal pipe char otherwise.
Details:
Create Test 'account' - a literal text
\| - or
Create Test `account` - a literal text

flex default rule can be matched

I am working on a flex parser using flex 2.6.4 with the -s option specified. A particular start condition has the following patterns (I am trying to read everything up to the next unescaped newline):
\\(.|\n)
[^\\\n]+
\n
Yet I get the warning: "-s option given but default rule can be matched"
I don't see any holes in the above pattern set. Am I missing something, or is this a flex error?
Your set of rules does not match a backslash at the end of the file.
Your first rule requires the backslash to be followed by something and the other ones don't match backslashes at all.
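The gap can be reproduced outside flex. The following is only a rough sketch that approximates the three patterns with Ruby regexes (it is not flex and ignores flex's longest-match rule); RULES and first_unmatched_offset are hypothetical names chosen for illustration. It reports the first position where none of the patterns can match.

```ruby
require 'strscan'

# Ruby approximations of the three flex patterns (illustration only).
RULES = [
  /\\(.|\n)/,   # backslash followed by any single character, newline included
  /[^\\\n]+/,   # run of characters that are neither backslash nor newline
  /\n/          # a newline
].freeze

# Returns the offset of the first character that no rule matches, or nil.
def first_unmatched_offset(input)
  scanner = StringScanner.new(input)
  until scanner.eos?
    return scanner.pos unless RULES.any? { |rule| scanner.scan(rule) }
  end
  nil
end

p first_unmatched_offset("abc\\\ndef\n")  # nil: every character is covered
p first_unmatched_offset("abc\\")         # 3: the trailing backslash is not
```

In flex itself, a rule that matches a lone backslash would close the hole.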

tr on mac: misplaced sequence asterisk

I'm very new to all of this so please excuse any mistakes.
I'm working on a Mac.
I'm trying to follow this tutorial here
When I type in tr "[ -%,;\(\):=\.\\\*[]\"\']" "_" < hug_tol.fasta > hug_tol.clean.fasta
I get the error message tr:misplaced sequence asterisk
I'm guessing that something in the file must be wrong, but since I'm trying to remove those characters the error message doesn't make sense.
I haven't found anything on Google so maybe someone can help me.
The author of the tutorial appears to be using quasi-regex character class syntax for tr. tr is much more limited in its scope than that. It only accepts a few escape sequences and special characters. Simplify your command to
tr "%,;():=.*[]\"\' \\\\\-" "_" < hug_tol.fasta > hug_tol.clean.fasta
The - character does have special meaning, so put it at the end: at the beginning it will be interpreted as a command-line option, while in the middle it specifies a character range. In bash, * won't be expanded in double quotes. For tr, to specify a plain \, you need a double \ (since it's the escape character). To get that through bash inside double quotes, you need \\\\.
You may also want to consider using the -c option to specify the complement set (the characters you want to keep), since it is probably much smaller. Note that the complement also covers newlines and the > that starts each FASTA header, so add those to the set if they must survive:
tr -c "A-Za-z0-9_" "_" < hug_tol.fasta > hug_tol.clean.fasta
or more tersely
tr -c "[:alnum:]" "_" < hug_tol.fasta > hug_tol.clean.fasta
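For comparison, Ruby's String#tr follows a similar model, with a leading ^ playing the role of -c. A small sketch on a made-up header line (the sample string is an assumption, not taken from the tutorial's file):

```ruby
# A leading ^ negates the set: every character that is NOT a letter,
# digit, or underscore is replaced with an underscore.
header  = '>seq1|abc (x)'
cleaned = header.tr('^A-Za-z0-9_', '_')
puts cleaned  # _seq1_abc__x_
```

The > is replaced here as well, because it is outside the set of characters being kept.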

How to grep to find all instances of a Java method call using a reference?

I am trying the following query, but without success
grep -nr "[[:alnum:]]+\.[[:alnum:]]+\(\)" .
So, according to my logic, a method call would be one or more alphanumeric characters
[[:alnum:]]+
followed by a dot
\.
followed by one or more alphanumeric characters
[[:alnum:]]+
followed by parentheses (for void return type only)
\(\)
But this query isn't working. How to write such a query?
grep provides several types of regex syntax.
Your pattern is written in the extended syntax and works with -E.
The extended-regexp syntax is easier to write, and perl-regexp is, well, quite powerful.
-E, --extended-regexp
-F, --fixed-strings
-G, --basic-regexp (the default)
-P, --perl-regexp
grep -nrE "[[:alnum:]]+\.[[:alnum:]]+\(\)" .
Alternatively, in the default basic syntax, you need to use "\+" instead of "+"; otherwise it will match the literal character "+".
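Ruby's regex syntax treats + as a quantifier, much like grep -E, so the same pattern can be tried out there unchanged. A minimal sketch on a few made-up Java lines (the sample lines and the METHOD_CALL name are assumptions for illustration):

```ruby
# Same pattern as the grep -E command: identifier, dot, identifier, ().
METHOD_CALL = /[[:alnum:]]+\.[[:alnum:]]+\(\)/

lines = [
  'list.clear();',
  'int total = a + b;',
  'System.out.println(name);',
  'reader.close()'
]

# Array#grep keeps the elements that match the pattern.
matches = lines.grep(METHOD_CALL)
p matches  # ["list.clear();", "reader.close()"]
```

Calls with arguments, such as println(name), are skipped because the pattern requires an empty pair of parentheses.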

Escape double and single backslashes in a string in Ruby

I'm trying to access a network path in my Ruby script on a Windows platform, in a format like this:
\\servername\some windows share\folder 1\folder2\
Now if I try to use this as a path, it won't work. Single backslashes are not properly escaped for this script.
path = "\\servername\some windows share\folder 1\folder2\"
d = Dir.new(path)
I tried everything I could think of to properly escape the backslashes in the path. However, I can't escape that single backslash because of its special meaning. I tried single quotes, double quotes, escaping the backslash itself, using alternate quotes such as %Q{} or %q{}, and using ASCII to char conversion. Nothing works, in the sense that I'm not doing it right. :-) Right now the temporary solution is to map a network drive N:\ pointing to that path and access it that way, but that's not a solution.
Does anyone have any idea how to properly escape single backslashes?
Thank you
Just double-up every backslash, like so:
"\\\\servername\\some windows share\\folder 1\\folder2\\"
Try this
puts '\\\\servername\some windows share\folder 1\folder2\\'
#=> \\servername\some windows share\folder 1\folder2\
So long as you're using single quotes to define your string (e.g., 'foo'), a single \ does not need to be escaped, except in the following two cases:
\\ works itself out to a single \, so \\\\ will give you the starting \\ you need.
The trailing \ at the end of your path would escape the closing quote, so you need a \\ there as well.
Alternatively,
You could define an elegant helper for yourself. Instead of using the clunky \ path separators, you could use / in conjunction with a method like this:
def windows_path(foo)
  foo.gsub('/', '\\')
end
puts windows_path '//servername/some windows share/folder 1/folder2/'
#=> \\servername\some windows share\folder 1\folder2\
Sweet!
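The two quoting styles from the answers above can be checked against each other. A short sketch (the variable names are only for illustration) showing that both literals produce the same path, and what happens in double quotes when the backslashes are not doubled:

```ruby
# Single quotes: only \\ (and \') are escapes, so just the leading pair
# and the trailing backslash need doubling.
single = '\\\\servername\some windows share\folder 1\folder2\\'

# Double quotes: every backslash must be doubled.
double = "\\\\servername\\some windows share\\folder 1\\folder2\\"

puts single            # \\servername\some windows share\folder 1\folder2\
puts single == double  # true

# Without doubling, double quotes read \s as a space and \f as a form feed,
# so no backslash survives in the resulting string.
broken = "\some\folder"
p broken.include?('\\')  # false
```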
