Lua gsub fails on the characters '(' and ')'

For some reason only the opening and closing parentheses won't work; all the others are fine.
RequestEncoded = string.gsub(RequestEncoded, '<', ' ')
RequestEncoded = string.gsub(RequestEncoded, '>', ' ')
RequestEncoded = string.gsub(RequestEncoded, '"', ' ')
RequestEncoded = string.gsub(RequestEncoded, '\'', ' ')
RequestEncoded = string.gsub(RequestEncoded, '\\', ' ')
-- RequestEncoded = string.gsub(RequestEncoded, '(', ' ') keeps failing
-- RequestEncoded = string.gsub(RequestEncoded, ')', ' ')
-- RequestEncoded = string.gsub(RequestEncoded, "\x28", " ") --keeps failing
-- RequestEncoded = string.gsub(RequestEncoded, "\x29", ' ')
-- RequestEncoded = string.gsub(RequestEncoded, '\050', ' ') --keeps failing
-- RequestEncoded = string.gsub(RequestEncoded, '\051', ' ')

( and ) are special characters that delimit a capture in a Lua pattern.
To match literal parentheses outside square brackets, [...], you need to escape them with %.
RequestEncoded = string.gsub(RequestEncoded, '%(', ' ')
RequestEncoded = string.gsub(RequestEncoded, '%)', ' ')
However, since you are using the same replacement string in every gsub call, you can simplify your code to
RequestEncoded = string.gsub(RequestEncoded, '[<>"\'\\()]', ' ')
Note that here, () are inside a bracket expression and do not need escaping.
See Lua patterns docs:
Some characters, called magic characters, have special meanings when used in a pattern. The magic characters are
( ) . % + - * ? [ ^ $
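The escaping rule can also be applied mechanically when a Lua pattern has to be built from arbitrary literal text. The following is a minimal sketch, written in Ruby only to keep the examples in this collection in one language. `lua_escape` and `LUA_MAGIC` are hypothetical names, not part of Lua or Ruby. The character list is the one quoted above plus `]`, on the assumption that over-escaping is harmless, since Lua lets any non-alphanumeric character be preceded by `%`.

```ruby
# Hypothetical helper: turn literal text into a Lua pattern that matches it
# verbatim by prefixing each magic character with '%'.
# Magic characters as quoted above, plus ']' (an assumption: escaping it is
# harmless because Lua accepts '%' before any non-alphanumeric character).
LUA_MAGIC = /[().%+\-*?\[\]^$]/

def lua_escape(text)
  text.gsub(LUA_MAGIC) { |ch| "%#{ch}" }
end

puts lua_escape('(')          # %(
puts lua_escape('f(x) = 50%') # f%(x%) = 50%%
```

The resulting string is what would be passed as the pattern argument of string.gsub on the Lua side.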

Related

Ruby: stringA.gsub(/\s+/, '') versus stringA.strip

Say
string = "Johnny be good! And smile :-) "
Is there a difference between
string.gsub(/\s+/, '')
and
string.strip
?
If so, what is it?
strip only removes leading and trailing whitespace; using gsub the way you outline in your question removes all whitespace from the string.
irb(main):004:0* " hello ".strip
=> "hello"
irb(main):005:0> " h e l l o ".strip
=> "h e l l o"
irb(main):006:0> " hello ".gsub(/\s+/, '')
=> "hello"
irb(main):007:0> " h e l l o ".gsub(/\s+/, '')
=> "hello"
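Applied to the string from the question, the difference looks like this. This is a small sketch; the variable names `text`, `stripped` and `squeezed` are chosen here for illustration (the question's `string` is renamed to avoid confusion with the class name).

```ruby
text = "Johnny be good! And smile :-) "

stripped = text.strip            # trims whitespace at both ends only
squeezed = text.gsub(/\s+/, '')  # deletes every run of whitespace

p stripped  # "Johnny be good! And smile :-)"
p squeezed  # "Johnnybegood!Andsmile:-)"
```

Neither call modifies `text` itself; both return a new string.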

Neo4j Cypher : replace multiple characters

I need to replace multiple consecutive characters with a single character:
RETURN LOWER(REPLACE("ranchod-das-chanchad-240190---Funshuk--Wangdu",'--', '-'))
Is there a regex to do this in Neo4j 2.2.2?
There's no function similar to REPLACE taking a regex as a parameter.
Since you're using Neo4j 2.2, you can't implement it as a procedure either.
The only way to do it is by splitting and joining (using a combination of reduce and substring):
RETURN substring(reduce(s = '', e IN filter(e IN split('ranchod-das-chanchad-240190---Funshuk--Wangdu', '-') WHERE e <> '') | s + '-' + e), 1);
It can be easier to read if you decompose it:
WITH split('ranchod-das-chanchad-240190---Funshuk--Wangdu', '-') AS elems
WITH filter(e IN elems WHERE e <> '') AS elems
RETURN substring(reduce(s = '', e IN elems | s + '-' + e), 1);
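For readers who find the reduce expression hard to follow, the same split, filter and join logic can be traced step by step outside Cypher. This is only an illustrative sketch in Ruby, not something Neo4j can run; the variable names are chosen here for the example.

```ruby
input = 'ranchod-das-chanchad-240190---Funshuk--Wangdu'

# split('-') leaves an empty string wherever two hyphens were adjacent.
elems = input.split('-')
# Drop those empty strings (the filter(... WHERE e <> '') step).
elems = elems.reject(&:empty?)
# Prefix every element with '-' (the reduce step), then drop the
# leading '-' (the substring(..., 1) step).
result = elems.reduce('') { |s, e| s + '-' + e }[1..-1]

puts result          # ranchod-das-chanchad-240190-Funshuk-Wangdu
puts result.downcase # ranchod-das-chanchad-240190-funshuk-wangdu
```

Because empty elements are discarded, this approach would also drop a hyphen at the very start or end of the input, which a plain "collapse repeats" replacement would keep.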

EBNF to JavaCC difference between [] and {}

I have the following 2 production rules in EBNF:
<CharLiteral> ::= ' " ' [ <Printable> ] ' " '
and
<StringLiteral> ::= ' " ' { <Printable> } ' " '
What is the difference between the two? Does [] imply 1 or more repetitions and {} imply 0 or more repetitions?
In EBNF, [X] means 0 or 1 X and {X} means 0 or more X.
In JavaCC, [X] means 0 or 1 X for grammar productions; in regular expression productions, you should use (X)? instead. To express 0 or more X in JavaCC use (X)*.
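The (X)? and (X)* forms behave the same way as the ? and * quantifiers in ordinary regular expressions, which makes the difference easy to try out. The sketch below uses Ruby regexes rather than JavaCC; `PRINTABLE` is a hypothetical stand-in for <Printable> (assumed here to be any character except the double quote), and the constant names are invented for the example.

```ruby
# Hypothetical stand-in for <Printable>: any character except the quote.
PRINTABLE = /[^"]/

# [X] in EBNF: 0 or 1 X, i.e. (X)? in a regular expression.
CHAR_LITERAL = /\A"#{PRINTABLE}?"\z/
# {X} in EBNF: 0 or more X, i.e. (X)* in a regular expression.
STRING_LITERAL = /\A"#{PRINTABLE}*"\z/

['""', '"a"', '"ab"'].each do |src|
  puts "#{src}: char=#{CHAR_LITERAL.match?(src)} string=#{STRING_LITERAL.match?(src)}"
end
```

Only the two-character body "ab" separates the rules: it is rejected by the [ ] form and accepted by the { } form.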

Haskell Parser Fails on "|" Read

I am working on a parser in Haskell using Parsec. The issue lies in reading in the string "| ". When I attempt to read in the following,
parseExpr = parseAtom
        -- | ...
        <|> do string "{|"
               args <- try parseList <|> parseDottedList
               string "| "
               body <- try parseExpr
               string " }"
               return $ List [Atom "lambda", args, body]
I get the following parse error.
Lampas >> {|a b| "a" }
Parse error at "lisp" (line 1, column 12):
unexpected "}"
expecting letter, "\"", digit, "'", "(", "[", "{|" or "."
Another failing case is ^, which yields the following.
Lampas >> {|a b^ "a" }
Parse error at "lisp" (line 1, column 12):
unexpected "}"
expecting letter, "\"", digit, "'", "(", "[", "{|" or "."
However, it works as expected when the string "| " is replaced with "} ".
parseExpr = parseAtom
        -- | ...
        <|> do string "{|"
               args <- try parseList <|> parseDottedList
               string "} "
               body <- try parseExpr
               string " }"
               return $ List [Atom "lambda", args, body]
The following is the REPL behavior with the above modification.
Lampas >> {|a b} "a" }
(lambda ("a" "b") ...)
So the question is: (a) does the pipe character have special behavior in Haskell strings, perhaps only in <|> chains, and (b) how can this behavior be avoided?
The character | may be in a set of reserved characters. Test with other characters, like ^, and I assume it will fail just as well. The only way around this would probably be to change the set of reserved characters, or the structure of your interpreter.

Regular expression to avoid a set of characters

I am using Ruby on Rails 3.1.0 and I would like to validate a class attribute so that a string containing any of these characters is never stored in the database: (blank space), <, >, ", #, %, {, }, |, \, ^, ~, [, ] and `.
What is the regex?
Assuming it should also be non-empty:
^[^\] ><"#%{}|\\^~\[`]+$
Since someone is downvoting this, here is some test code:
ary = [' ', '<', '>', '"', '#', '%', '{', '}', '|', '\\', '^', '~', '[', ']', '`', 'a']
ary.each do |i|
  # p (rather than puts) prints nil explicitly on every Ruby version
  p i =~ /^[^\] ><"#%{}|\\^~\[`]+$/
end
Output:
nil
nil
nil
nil
nil
nil
nil
nil
nil
nil
nil
nil
nil
nil
nil
0
bad_chars = [' ', '<', '>', '"', '#', '%', '{', '}', '|', '\\', '^', '~', '[', ']', '`']
re = Regexp.union(bad_chars)
p 'hoh`oho' =~ re #=> 3
Regexp.union takes care of escaping.
a = "foobar"
b = "foo ` bar"
re = /[ \^<>"#%\{\}\|\\~\[\]\`]/
a =~ re # => nil
b =~ re # => 3
The inverse expression is:
/\A[^ \^<>"#%\{\}\|\\~\[\]\`]+\Z/
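The answers above differ in how they anchor the expression, and that matters for validation: in Ruby, ^ and $ match at line boundaries, while \A and \z match only at the start and end of the whole string. The sketch below shows the effect on a multi-line value. `SAFE_RE`, `LINE_RE` and `acceptable?` are hypothetical names for this example, and wiring the check into a Rails validator is left out.

```ruby
# Whole-string version: every character must be outside the forbidden set.
SAFE_RE = /\A[^ <>"#%{}|\\^~\[\]`]+\z/
# Line-anchored version of the same character class, for comparison.
LINE_RE = /^[^ <>"#%{}|\\^~\[\]`]+$/

# Hypothetical helper: true when value is non-empty and contains
# none of the forbidden characters.
def acceptable?(value)
  SAFE_RE.match?(value)
end

p acceptable?('foobar')       # true
p acceptable?('foo ` bar')    # false
p acceptable?("foo\n<bar>")   # false
p LINE_RE.match?("foo\n<bar>") # true: the first line alone satisfies ^...$
```

With the line anchors, a value whose first line is clean passes even though a later line contains a forbidden character, so the \A...\z form is the safer choice for this kind of check.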
