I've got a text/string that contains multiple newlines. Like in the example below :
"This is a test message : \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
"
I can gsub all \n with space and remove them all. How can I do the following :
If I see that there is more than two \n, leave only two newlines in the text?
If you want to remove all but the first two newlines you can use the block passed to gsub:
hits = 0
text.gsub(/\n/) { (hits = hits + 1) > 2 ? '' : "\n" }
# => "This is a test message : \n \n "
You can replace any sequence of two or more consecutive newlines (with nothing in between) with exactly two, using the following regex (assuming s contains your string):
s.gsub /\n\n+/, "\n\n"
If you want to allow any amount of interleaving space characters between the newlines and remove that as well, better use:
s.gsub /\n *(\n *)+/, "\n\n"
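As a quick check of the two patterns above, here is a minimal sketch. The method names collapse_newlines and collapse_newlines_and_spaces are made up for illustration; only String#gsub is real Ruby API.

```ruby
# Hypothetical helper names, not part of Ruby itself.

# Collapse any run of two or more consecutive newlines into exactly two.
def collapse_newlines(s)
  s.gsub(/\n\n+/, "\n\n")
end

# Same idea, but also swallow plain spaces between (and right after) the newlines.
def collapse_newlines_and_spaces(s)
  s.gsub(/\n *(\n *)+/, "\n\n")
end

collapse_newlines("a\n\n\n\nb")                   # => "a\n\nb"
collapse_newlines("a\nb")                         # => "a\nb" (single newlines are left alone)
collapse_newlines_and_spaces("a \n \n\n \n b")    # => "a \n\nb"
```

Both versions return a new string; use gsub! if the original should be modified in place.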
I have an input text field where the user can copy and paste data. I want to replace \r \n \t, but when the data is posted these characters are escaped.
So a string entered by the user, for example hello \r\n\t world, is posted as hello \\r\\n\\t world
I want to replace these characters, but because they are escaped I am not able to use something like gsub(/\s+/, ' ')
Can anyone suggest an ideal way to replace the escaped characters?
Thanks.
If you're getting literally backslash-r you'll need to de-map these:
CONVERT = {
'\r' => "\r",
'\t' => "\t",
'\n' => "\n"
}
CONVERT_RX = Regexp.union(CONVERT.keys)
'this\nis\tinput\r\n'.gsub(CONVERT_RX, CONVERT)
# => "this\nis\tinput\r\n"
You can add more entries to that table as necessary.
From there if you want to strip or convert spaces you can do that as you would normally.
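Putting the two steps together, a minimal sketch might look like this. The names ESCAPES, ESCAPES_RX and normalize_posted_text are hypothetical, and the sketch assumes the posted value really contains a backslash followed by a letter.

```ruby
# Map each two-character escape (backslash + letter) to the real control character.
# Single-quoted keys keep the backslash literal.
ESCAPES = {
  '\r' => "\r",
  '\t' => "\t",
  '\n' => "\n"
}.freeze
ESCAPES_RX = Regexp.union(ESCAPES.keys)

# Hypothetical helper: unescape first, then squash each whitespace run to one space.
def normalize_posted_text(raw)
  raw.gsub(ESCAPES_RX, ESCAPES).gsub(/\s+/, ' ').strip
end

normalize_posted_text('hello \r\n\t world') # => "hello world"
```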
I have the following input txt file:
"Hamlet \r William Shakespeare"
"Romeo and Juliet \r William Shakespeare"
"For the whom the bell tolls \r Earnest Hemingway"
I load it into an array and when I output it I get:
Hamlet \r William Shakespeare.
Why is it not reading the carriage return character?
Thanks
If you have \r in a file, it won't be read as the carriage return character (a single special character); it will be read as two separate characters, \ and r.
You can fix this by replacing the two-character string "\r" with the actual carriage return character.
content = content.replacingOccurrences(of: "\\r", with: "\r")
I have a string
Java \n\n c# \n\n c/c++
I need to replace it so that it becomes
Java \n c# \n c/c++
using a regular expression on a Ruby String
Thanks
Use the squeeze method of the String class
[18] pry(main)> "Java \n\n c# \n\n c/c++".squeeze("\n")
=> "Java \n c# \n c/c++"
However, squeeze returns a new string in which every run of the same character from this set is replaced by a single character, so longer runs collapse as well:
[18] pry(main)> "Java \n\n\n\n\n c# \n\n\n\n\n c/c++".squeeze("\n")
=> "Java \n c# \n c/c++"
You probably want to eliminate triple and longer runs as well. In that case the best option is to use a repetition count. Note that the quantifier has to apply to the whole two-character sequence \n, not just to the n, so the sequence needs a group:
# ⇓⇓⇓⇓
'Java \n\n c# \n\n c/c++'.gsub /(?:\\n){1,}/, '\n'
In this particular case, “one or more” has syntactic sugar:
# ⇓
'Java \n\n c# \n\n c/c++'.gsub /(?:\\n)+/, '\n'
If you are using Ruby 2 and the string contains real line breaks, there is also \R, which matches any single line break (\n, \r, or \r\n).
To replace exactly two occurrences, one might use:
# ⇓⇓⇓
'Java \n\n c# \n\n c/c++'.gsub /(?:\\n){2}/, '\n'
And, finally, here is a function that removes repeated occurrences of \n from the string using named groups and backreferences:
def singlify s
s.gsub /(?<sym>\\n)\g<sym>+/, '\k<sym>'
end
singlify 'Java \n\n c# \n\n c/c++'
# => 'Java \n c# \n c/c++'
'Java \n\n c# \n\n c/c++'.gsub! /\\n\\n/, '\n'
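One thing worth spelling out about the answers above: they use single-quoted strings, in which \n is a literal backslash followed by the letter n, which is why the patterns are written with \\n. With double-quoted strings the \n is a real newline character and the pattern is simply /\n+/. The sketch below contrasts the two cases; the variable names are only for illustration.

```ruby
literal = 'Java \n\n\n c# \n\n c/c++'  # single quotes: each \n is a backslash plus the letter n
real    = "Java \n\n\n c# \n\n c/c++"  # double quotes: each \n is one newline character

# The group makes + repeat the whole two-character sequence, not just the "n".
# The block form avoids any backslash processing of the replacement string.
literal_single = literal.gsub(/(?:\\n)+/) { '\n' }

# For real newlines a plain /\n+/ is enough (String#squeeze("\n") gives the same result).
real_single = real.gsub(/\n+/, "\n")

p literal_single  # => "Java \\n c# \\n c/c++"
p real_single     # => "Java \n c# \n c/c++"
```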
ruby 2.1.3
rails 4.1.7
I want to generate a unordered list from textarea. So I have to preserve all line breaks for each item and remove leading and trailing spaces.
Well, I'm trying to remove all leading and trailing spaces from each line of textarea with no success.
I'm using a regex:
string_from_textarea.gsub(/^[ \t]+|[ \t]+$/, '')
I've tried strip and rstrip rails methods with no luck too (they are working with the same result as regex):
Leading spaces for each line are removed perfectly.
But with trailing spaces, only the last space of the whole string is removed. I want them removed from each line.
What am I missing here? What is the deal with textarea and trailing spaces for each line?
UPDATE
Some code example:
I'm using a callback to save formatted data.
after_validation :format_ingredients
def format_ingredients
self.ingredients = @ingredients.gsub(/^[ \t]+|[ \t]+$/, "")
end
Form view:
= f.text_area :ingredients, class: 'fieldW-600 striped', rows: '10'
You can use String#strip
' test text with multiple spaces '.strip
#=> "test text with multiple spaces"
To apply this to each line:
str = " test \ntext with multiple \nspaces "
str = str.lines.map(&:strip).join("\n")
#=> "test\ntext with multiple\nspaces"
This isn't a good use for a regexp. Instead use standard String processing methods.
If you have text that contains embedded LF ("\n") line-ends and spaces at the beginning and ends of the lines, then try this:
foo = "
line 1
line 2
line 3
"
foo # => "\n line 1 \n line 2\nline 3\n"
Here's how to clean the lines of leading/trailing white-space and re-add the line-ends:
bar = foo.each_line.map(&:strip).join("\n")
bar # => "\nline 1\nline 2\nline 3"
If you're dealing with CRLF line-ends, as a Windows system would generate text:
foo = "\r\n line 1 \r\n line 2\r\nline 3\r\n"
bar = foo.each_line.map(&:strip).join("\r\n")
bar # => "\r\nline 1\r\nline 2\r\nline 3"
If the white-space may include other forms, such as non-breaking spaces, then switch to a regexp that uses the POSIX [[:space:]] character class, which in Ruby also matches Unicode white-space that strip does not remove. I'd do something like:
s.sub(/^[[:space:]]+/, '').sub(/[[:space:]]+$/, '')
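To illustrate the difference, here is a small sketch that applies the [[:space:]] idea line by line. The helper name strip_unicode_space is hypothetical, and the example assumes UTF-8 strings, where "\u00A0" is a non-breaking space.

```ruby
# Hypothetical helper: strip leading/trailing white-space, including
# non-breaking spaces, from a single line. \A and \z anchor to the
# start and end of the string rather than of each embedded line.
def strip_unicode_space(line)
  line.sub(/\A[[:space:]]+/, '').sub(/[[:space:]]+\z/, '')
end

text = "\u00A0 line 1 \u00A0\n\tline 2\u00A0\n"

# String#strip only knows ASCII white-space, so a string wrapped in
# non-breaking spaces comes back unchanged.
nbsp_wrapped = "\u00A0x\u00A0"
nbsp_wrapped.strip == nbsp_wrapped   # => true

cleaned = text.each_line.map { |l| strip_unicode_space(l) }.join("\n")
cleaned                              # => "line 1\nline 2"
```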
I think @sin probably intimated the problem in his/her first comment. Your file was probably produced on a Windows machine that puts a carriage return/line feed pair ("\r\n") at the end of each line other than (presumably) the last, where it just writes \n. (Check line[-2] on any line other than the last.) That would account for the result you are getting:
r = /^[ \t]+|[ \t]+$/
str = " testing 123 \r\n testing again \n"
str.gsub(r, '')
#=> "testing 123 \r\ntesting again\n"
If this theory is correct the fix should be just a slight tweak to your regex:
r = /^[ \t]+|[ \t\r]+$/
str.gsub(r, '')
#=> "testing 123\ntesting again\n"
You might be able to do this with your regex by changing the value of the global variable $/, which is the input record separator, a newline by default. That could be a problem for the end of the last line, however, if that only has a newline.
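If the CRLF theory holds, another option is to normalize the line endings first and keep the original regex. The sketch below compares that with the \r tweak; the variable names are only illustrative.

```ruby
str = " testing 123 \r\n testing again \n"

# Option 1: treat \r as trailing white-space, as in the tweaked regex.
via_regex = str.gsub(/^[ \t]+|[ \t\r]+$/, '')

# Option 2: convert CRLF to LF first, then the original regex works unchanged.
via_normalize = str.gsub("\r\n", "\n").gsub(/^[ \t]+|[ \t]+$/, '')

via_regex      # => "testing 123\ntesting again\n"
via_normalize  # => "testing 123\ntesting again\n"
```

Both approaches drop the carriage returns, so they are only appropriate if the stored text does not need to keep Windows line endings.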
I think you might be looking for String#lstrip and String#rstrip methods:
str = %Q^this is a line
and so is this
all of the lines have two spaces at the beginning
and also at the end ^
> new_string = ""
> ""
str.each_line do |line|
new_string += line.rstrip.lstrip + "\n"
end
> "this is a line\n and so is this \n all of the lines have two spaces at the beginning \n and also at the end "
2.1.2 :034 > puts new_string
this is a line
and so is this
all of the lines have two spaces at the beginning
and also at the end
> new_string
> "this is a line\nand so is this\nall of the lines have two spaces at the beginning\nand also at the end\n"
Data stored in the database is like this:
This is a line
This is another line
How about this line
When I output it to the view, I want to convert that to:
This is a line\n\nThis is another line\n\nHow about this line
with no new lines and the actual \n characters printed out. How can I do that?
> s = "hi\nthere"
> puts s
hi
there
> puts s.gsub(/\n/, "\\n")
hi\nthere
I would personally use gsub if you only want newlines specifically converted. However, if you want to generally inspect the contents of the string, do this:
str = "This is a line\n\nThis is another line\n\nHow about this line"
puts str.inspect[1..-2]
#=> This is a line\n\nThis is another line\n\nHow about this line
The String#inspect method escapes various 'control' characters in your string. It also wraps the string with ", which I've stripped off above. Note that this may produce undesirable results, e.g. the string My name is "Phrogz" will come out as My name is \"Phrogz\".
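A small side-by-side sketch of the two approaches, using a made-up sample string that contains quotes and a tab as well as newlines:

```ruby
str = "My name is \"Phrogz\"\n\nNext line\tend"

# Escapes newlines only; quotes and tabs are left as they are.
only_newlines = str.gsub("\n", "\\n")

# Escapes every control character and the double quotes;
# [1..-2] drops the surrounding quotes that inspect adds.
all_escaped = str.inspect[1..-2]

puts only_newlines  # the two newlines print as \n\n, the tab is still a real tab
puts all_escaped    # quotes print as \" and the tab prints as \t
```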
> s = "line 1\n\nline2"
=> "line 1\n\nline2"
> puts s
line 1
line2
> puts s.gsub("\n", "\\n")
line 1\n\nline2
The key is to escape the single backslash.