regular expression for removing empty lines produces wrong results - rascal

Can someone help me solve the problem I'm having with a regular expression? I have a file containing the following code:
I'm using a visit to find matches and replace them so that I can remove the empty lines. The result is, however, not what I'm expecting. The code is as follows:
str content = readFile(location);
// Remove empty lines
content = visit (content) {
case /^[ \t\f\v]*?$(?:\r?\n)*/sm => ""
}
This regular expression also removes non-empty lines, resulting in an output equal to:
Can someone explain what I'm doing wrong with the regular expression as well as the one shown below? I can't seem to figure out why it's not working.
str content = readFile(location);
// Remove empty lines
content = visit (content) {
case /^\s+^/m => ""
}
Kind regards,
Bob

I think the big issue here is that in the context of visit, the ^ anchor does not mean what you think it does. See this example:
rascal>visit ("aaa") { case /^a/ : println("yes!"); }
yes!
yes!
yes!
visit tries the regex at every suffix of the string, so ^ is anchored relative to each suffix.
It first starts at "aaa", then at "aa", and then at "a".
In your visit, the empty suffixes of lines also match your regex and are replaced by empty strings. I think an additional effect is that the carriage return is not consumed eagerly.
To fix this, don't use a visit; use a for or while loop with a := match as the condition.
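The suffix behaviour can be illustrated outside Rascal. This is a minimal sketch that uses Python's re module as a stand-in; it does not reproduce Rascal's visit or := semantics, and both helper names are hypothetical. The first function mimics retrying an anchored pattern on every suffix, and the second removes empty lines with a single pass over the whole string.

```python
import re


def matches_at_every_suffix(pattern: str, text: str) -> int:
    """Count the suffixes of text where pattern matches at the suffix start.

    This mimics how visit retries the regex on "aaa", "aa" and "a".
    """
    compiled = re.compile(pattern)
    return sum(1 for i in range(len(text)) if compiled.match(text[i:]))


def remove_empty_lines(text: str) -> str:
    """Drop lines that are empty or contain only blanks.

    The pattern is applied once to the whole string, so ^ really means
    "start of a line" (MULTILINE) and not "start of some suffix".
    """
    return re.sub(r"^[ \t\f\v]*\r?\n", "", text, flags=re.MULTILINE)


print(matches_at_every_suffix(r"^a", "aaa"))
print(repr(remove_empty_lines("a\n\n  \nb\n")))
```

Because the substitution runs over the complete string, a line start is only recognised at the true beginning or directly after a newline.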

Related

Remove all indents and spaces from JSON string except inside its value in Ruby

My problematic string is like this:
'{\n"test":"AAAA",\n"test2":"BBB\n\n\nBBB"\n}'
I want to parse it as JSON object(Hash) by JSON.parse(jsonstring)
The expected result is:
{ "test": "AAAA", "test2": "BBB\nB"}
However, I get the error:
JSON::ParserError: 809
I happen to know that newline codes inside the JSON string need to be escaped, so I tried this:
escaped_jsonstring = '{\n"test":"AAAA",\n"test2":"BBB\n\n\nBBB"\n}'.gsub(/\R/, '\\n')
JSON.parse(escaped_jsonstring)
I still have JSON::ParserError.
Newline codes outside the keys and values may be causing this error.
How can I remove \n (or \r, or any other newline code) only outside the keys and values in Ruby?
That is,
'{\n"test":"AAAA",\n"test2":"BBB\n\n\nBBB"\n}'
↓
'{"test":"AAAA","test2":"BBB\n\n\nBBB"}'
Try this:
'{\n"test":"AAAA",\n"test2":"BBB\n\n\nBBB"\n}'.gsub(/\B(\\n)+/, "")
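A \n that follows a non-word character such as {, a comma or " sits at a non-word boundary, so \B matches there. A \n that follows a word character inside a value (as in BBB\n) sits at a word boundary (\b) and is left alone. (\\n)+ handles runs such as ...,\n\n\n"test2":...
Update
It turns out that a \n preceded by whitespace inside a value (\s\n) is also at a non-word boundary. I'm not sure whether there are other such cases.
For now, the updated version: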
'{\n"test":"AAAA",\n"test2":"BBB \n\n\n BBB"\n}'
.gsub(/([{,\"]\s*)\B(\\n)+/) { $1 }
Better way
I found another way to solve your problem, also using a regexp: scan the input text (valid or invalid JSON), keep only the pairs that match the pattern "<key>":"<value>", ignore everything outside those pairs, and output the result as a hash.
def format(json)
  matches = json.scan(/\"(?<key>[^\"]+)\":\"(?<val>[^\"]+)\",*/)
  matches&.to_h
end
format('{\n "test\n parser":"AA\nAA", \n\n"test2":"BBB ? ;\n\n\n BBB" \n}')
# {"test\n parser"=>"AA\nAA", "test2"=>"BBB ? ;\n\n\n BBB"}
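The "only outside keys and values" idea can also be written as a small tokenizer. This is a minimal Python sketch, not the Ruby answer above. It assumes the input contains literal backslash-n sequences (as in the single-quoted Ruby string) and that quoted strings are well formed. TOKEN and strip_outside_strings are hypothetical names.

```python
import json
import re

# One alternative per token kind:
#   1. a complete double-quoted string (escape sequences allowed inside),
#   2. a literal backslash followed by n or r,
#   3. a run of real whitespace.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|\\[nr]|\s+')


def strip_outside_strings(raw: str) -> str:
    """Remove newline codes and whitespace that are not inside a quoted string."""
    return TOKEN.sub(
        lambda m: m.group(0) if m.group(0).startswith('"') else "",
        raw,
    )


raw = r'{\n"test":"AAAA",\n"test2":"BBB\n\n\nBBB"\n}'
cleaned = strip_outside_strings(raw)
print(cleaned)
print(json.loads(cleaned))
```

Quoted strings are consumed whole by the first alternative, so a \n inside a value is never seen by the second one. This avoids relying on word boundaries.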

php str_replace produces strange results

I am trying to replace some characters in a text block. All of the replacements are working except the one at the beginning of the string variable.
The text block contains:
[FIRST_NAME] [LAST_NAME], This message is to inform you that...
The variables are defined as:
$fname = "John";
$lname = "Doe";
$messagebody = str_replace('[FIRST_NAME]',$fname,$messagebody);
$messagebody = str_replace('[LAST_NAME]',$lname,$messagebody);
The result I get is:
[FIRST_NAME] Doe, This message is to inform you that...
Regardless of which tag I put first or which syntax I use ({TAG}, $$TAG, or [TAG]), the first one never gets replaced.
Can anyone tell me why and how to fix this?
Thanks
Until someone can provide me with an explanation for why this is happening, the workaround is to put a string in front and then remove it afterward:
$messagebody = 'START:'.$messagebody;
// ... do what you need to do ...
$messagebody = substr($messagebody,6);
I believe it must have something to do with the fact that a string starts at position 0 and that maybe the str_replace function starts to look at position 1.
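A plain replace call does scan from position 0, so that hypothesis is worth testing in isolation. This is a Python sketch, not the asker's PHP code. It illustrates one classic way to get this exact symptom: guarding the replacement with a position check and treating a result of 0 as "not found". Whether the original code has such a guard is purely an assumption, and both function names are hypothetical.

```python
def replace_if_found_buggy(body: str, tag: str, value: str) -> str:
    # Bug: find() returns 0 when the tag is at the very start, and 0 is falsy,
    # so a tag at position 0 is skipped. (In Python, "not found" is -1, which
    # is truthy; the call to replace() is then simply a no-op.)
    if body.find(tag):
        return body.replace(tag, value)
    return body


def replace_if_found(body: str, tag: str, value: str) -> str:
    # Compare against the "not found" value explicitly.
    if body.find(tag) != -1:
        return body.replace(tag, value)
    return body


message = "[FIRST_NAME] [LAST_NAME], This message is to inform you that..."
print(replace_if_found_buggy(message, "[FIRST_NAME]", "John"))
print(replace_if_found(message, "[FIRST_NAME]", "John"))
```

If a guard like this is the cause, it would also explain why prepending a string makes the problem disappear: the tag is no longer at position 0.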

Parse a string with specific condition

How can I trim my string which is in this form:
https://xxx.kflslfsk.com/kjjfkskfjksf/v1/files/media/93939393hhs8.jpeg
to this?
media/93939393hhs8.jpeg
I want to remove all the characters before the second to last slash /.
I can use stringByTrimmingCharactersInSet but I don't know how to specify the condition that I want:
let trimmedString = myString.stringByTrimmingCharactersInSet(
    NSCharacterSet.whitespaceAndNewlineCharacterSet() // what here in my case ??
)
The above is for removing the white spaces, but that is not the case here.
Since the string is a URL, get the path components, keep only the last two items, and join them with the slash separator.
if let url = NSURL(string:"https://xxx.kflslfsk.com/kjjfkskfjksf/v1/files/media/93939393hhs8.jpeg"), pathComponents = url.pathComponents {
    let trimmedString = pathComponents.suffix(2).joinWithSeparator("/")
    print(trimmedString)
}
You're not trimming, you're parsing.
There's no single call that will do what you want. I suggest writing a block of code that uses componentsSeparatedByString("\n") to break it into lines (one URL per line), then parse each line separately.
You could use componentsSeparatedByString("/") on each line to break it into the fragments between your slashes, and then assemble the last 2 fragments together.
(I'm deliberately not writing out the code for you. You should do that for yourself. I'm just pointing you in the right direction.)
You might also be able to use NSURLComponents to treat each line as a URL, but I'm not sure how you'd get the last part of the URL before the filename (e.g. "media" or "lego") with that method.
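The "parse, then keep the last two components" approach is language independent. This is a minimal Python sketch using urllib.parse from the standard library instead of NSURL; last_two_path_components is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def last_two_path_components(url: str) -> str:
    """Return the last two non-empty path segments of url, joined by '/'."""
    # urlsplit isolates the path, so the scheme and host never count as segments.
    segments = [part for part in urlsplit(url).path.split("/") if part]
    return "/".join(segments[-2:])


print(last_two_path_components(
    "https://xxx.kflslfsk.com/kjjfkskfjksf/v1/files/media/93939393hhs8.jpeg"
))
```

Parsing the URL first means a query string or fragment, if present, does not end up in the result.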

Line breaks are being lost when sending sms from mvc3

For some reason, line breaks are not working when I send an SMS from MVC.
I am using code like this:
Constants.cs
public struct SmsBody
{
public const string SMSPostResume=
"[ORG_NAME]"+
"[CONTACT_NUMBER]"+
"[ORG_NAME]"+
"[CONTACT_PERSON]"+
"[EMAIL]"+
"[MOBILE_NUMBER]";
}
Then I use these constants in the controller like this:
SmsHelper.Sendsms(
Constants.SmsSender.UserId,
Constants.SmsSender.Password,
Constants.SmsBody.SMSPostResume
.Replace("[NAME],",candidate.Name)
.Replace("[EMAIL],",candidate.Email) etc......
My issue is that when I receive the SMS, everything is on the same line with no spacing.
My output:
Dearxxxxyyy@gmail.com0000000000[QUALIFICATION][FUNCTION][DESIGNATION][PRESENT_SALARY][LOCATION][DOB][TOTAL_EXPERIENCE][GENDER] and so on.
How can I put spacing between these? Can anyone help me?
Putting the string parts on separate lines and concatenating them does not create a line break; the parts end up directly after one another. Try putting a \n (the line break escape sequence) at each place where you want a line break:
public const string SMSPostResume=
"[ORG_NAME]\n"+
"[CONTACT_NUMBER]\n"+
"[ORG_NAME]\n"+
"[CONTACT_PERSON]\n"+
"[EMAIL]\n"+
"[MOBILE_NUMBER]\n";
Also, a note based on @finman's comment:
Depending on the service, it might be \r\n instead of \n.
So you should look up in the API docs which one works.
There is also another error: you try to replace string constants with a , at their ends, and the original ones don't have that:
SmsHelper.Sendsms(
Constants.SmsSender.UserId,
Constants.SmsSender.Password,
Constants.SmsBody.SMSPostResume
.Replace("[NAME],",candidate.Name) // <- this line!
.Replace("[EMAIL],",candidate.Email) // <- this line!
You should either rewrite the format string to include the , or rewrite the replaces to exclude it:
SmsHelper.Sendsms(
Constants.SmsSender.UserId,
Constants.SmsSender.Password,
Constants.SmsBody.SMSPostResume
.Replace("[NAME]",candidate.Name) // <- no "," this time
.Replace("[EMAIL]",candidate.Email) // <- no "," this time
//...etc
public const string SMSPostResume=
"[ORG_NAME]"+
"\r[CONTACT_NUMBER]"+
"\r[ORG_NAME]"+
"\r[CONTACT_PERSON]"+
"\r[EMAIL]"+
"\r[MOBILE_NUMBER]";
Also, in
Replace("[NAME],",candidate.Name)
are you sure you want the comma after [NAME] ? If it's not in the string, don't try to replace it.
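Both points (explicit line breaks in the template, and placeholders that must match exactly) can be checked in isolation. This is a small Python sketch, not the C# code from the question. SMS_TEMPLATE and render_sms are hypothetical names, and the line_break parameter reflects the advice to check which sequence the SMS service expects.

```python
# Placeholders joined with an explicit newline; adjacent string literals alone
# would concatenate with nothing in between.
SMS_TEMPLATE = "\n".join([
    "[ORG_NAME]",
    "[CONTACT_NUMBER]",
    "[CONTACT_PERSON]",
    "[EMAIL]",
    "[MOBILE_NUMBER]",
])


def render_sms(template: str, values: dict[str, str], line_break: str = "\n") -> str:
    """Fill in the placeholders, then switch to the line break the service expects."""
    body = template
    for placeholder, value in values.items():
        # The key must match the template text exactly: "[EMAIL]," (with a
        # trailing comma) would not match "[EMAIL]".
        body = body.replace(placeholder, value)
    return body.replace("\n", line_break)


print(render_sms(SMS_TEMPLATE, {"[ORG_NAME]": "Acme", "[EMAIL]": "jobs@example.com"}))
```

Placeholders without a matching key are left as they are, which makes a mistyped key easy to spot in the output.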

How do I know whether I'm looking at a newline or carriage return etc.?

For example, say I wanted to determine whether this form was storing newlines as carriage returns or newlines or whatever characters. I'm often in situations where I'm writing code and am not sure what type of new-line character a file/form/whatever I'm parsing is using.
How could I determine this? Is there a way to determine this without actually doing a check inside of code? (It seems like I should be able to right-click and "show all characters" or something like that).
Note: I realize I could write code saying
if (c == '\r') cout << "Carriage";
etc
but I have a feeling there's a simpler solution.
Maybe :list is what you are looking for (from the Vim help):
:[range]l[ist] [count] [flags]
Same as :print, but display unprintable characters
with '^' and put $ after the line. This can be
changed with the 'listchars' option.
See ex-flags for [flags].
You can switch modes with:
:set list
and
:set nolist
Additionally you can use "listchars" as shown in this example:
You could, for example, check your document for occurrences of "Carriage Return" or "New Line"/"Line Feed".
e.g. (PHP):
if (strstr($yourstring, "\r") !== false) { // You have a carriage return
    // Do something
}
elseif (strstr($yourstring, "\n") !== false) { // You have a new line/line feed
    // Do something
}
else {
    // You cannot determine which one is used, because the string is a single line
}
I hope this is the thing you're looking for
Note: on Windows, "\r\n" is used to mark new lines.
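Since "\r\n" contains both characters, a check for "\r" alone cannot tell Windows line endings from bare carriage returns. This is a minimal Python sketch that counts the three conventions separately; count_line_endings is a hypothetical helper, and it works on raw bytes so that nothing gets translated on the way in.

```python
def count_line_endings(data: bytes) -> dict[str, int]:
    """Count CRLF pairs, bare CRs and bare LFs in data."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        # Every CRLF pair contains one CR and one LF, so subtract the pairs.
        "CR": data.count(b"\r") - crlf,
        "LF": data.count(b"\n") - crlf,
    }


print(count_line_endings(b"one\r\ntwo\nthree\rfour"))
```

For a file, read it in binary mode (for example with open(path, "rb")) before passing the contents in, because text mode may normalise line endings.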
