How to remove mysterious whitespace from line breaks in Ruby on Rails?

In my Rails application I have a field address which is a varchar(255) in my SQLite database.
Yet whenever I save an address consisting of more than one line through a textarea form field, one mysterious whitespace character gets added to the right.
This becomes visible only when the address is right-aligned (e.g. on a letterhead).
Can anybody tell me why this is happening and how it can be prevented?
I am not doing anything special with those addresses in my model.
I already added this attribute writer to my model but it won't remove the whitespace unfortunately:
def address=(a)
write_attribute(:address, a.strip)
end
This is a screenshot:
As you can see only the last line is right aligned. All others contain one character of whitespace at the end.
Edit:
This would be the HTML output from my (Safari) console:
<p>
"John Doe "<br>
"123 Main Street "<br>
"Eggham "<br>
"United Kingdom"<br>
</p>
I don't even know why it's putting the quotes around each line... Maybe that's part of the solution?

I believe the textarea is returning CR/LF (\r\n) as line separators, and you're seeing one of these characters rendered at the end of each line. See "PHP displays \r\n characters when echoed in Textarea" for some discussion of this. There are probably better questions out there as well.
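To see why the attribute writer in the question has no effect on the inner lines, here is a minimal sketch in plain Ruby. The sample string and the helper names are assumptions for illustration: the string imitates what a browser typically submits from a textarea, with "\r\n" between lines. strip only trims the two ends of the whole string, so the carriage returns in the middle survive.

```ruby
# Hypothetical sample: a textarea value as a browser would typically submit it.
def sample_address
  "John Doe\r\n123 Main Street\r\nEggham\r\nUnited Kingdom\r\n"
end

# strip only touches the start and the end of the whole string.
def stripped_once(text)
  text.strip
end

p stripped_once(sample_address)
p stripped_once(sample_address).count("\r")  # carriage returns still inside
```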

You can strip out the whitespace at the start and end of each line. Here are two simple techniques to do that:
# Using simple ruby
def address=(a)
a = a.lines.map(&:strip).join("\n")
write_attribute(:address, a)
end
# Using a regular expression (also strips a trailing \r left by CRLF line endings)
def address=(a)
a = a.gsub(/^[ \t]+|[ \t\r]+$/, "")
write_attribute(:address, a)
end

I solved a very similar problem when I ran into something like this (I used squish):
think@think:~/CrawlFish$ irb
1.9.3-p385 :001 > "Im calling squish on a string, in irb".squish
NoMethodError: undefined method `squish' for "Im calling squish on a string, in irb":String
from (irb):1
from /home/think/.rvm/rubies/ruby-1.9.3-p385/bin/irb:16:in `<main>'
That shows there is no squish in plain Ruby (irb).
But Rails has squish and squish! (the bang version modifies the string in place):
think@think:~/CrawlFish$ rails console
Loading development environment (Rails 3.2.12)
1.9.3-p385 :001 > str = "Here i am\n \t \n \n, its a new world \t \t \n, its a \n \t new plan\n \r \r,do you like \r \t it?\r"
=> "Here i am\n \t \n \n, its a new world \t \t \n, its a \n \t new plan\n \r \r,do you like \r \t it?\r"
1.9.3-p385 :002 > out = str.squish
=> "Here i am , its a new world , its a new plan ,do you like it?"
1.9.3-p385 :003 > puts out
Here i am , its a new world , its a new plan ,do you like it?
=> nil
1.9.3-p385 :004 >
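As the console session shows, squish also turns line breaks into single spaces, so calling it on a whole multi-line address would merge it into one line. A possible workaround is to clean each line separately. The sketch below is a plain-Ruby approximation, not the Rails implementation, and squish_each_line is a hypothetical helper name.

```ruby
# Plain-Ruby approximation of squish, applied per line so that the
# line breaks of a multi-line address are kept.
# (Hypothetical helper; not part of Rails or ActiveSupport.)
def squish_each_line(text)
  text.to_s.lines.map { |line| line.strip.gsub(/\s+/, " ") }.join("\n")
end

puts squish_each_line("John   Doe \r\n 123  Main Street\r\nEggham\r\n")
```

Inside a Rails app the block body could presumably call line.squish instead; that variant is not shown here because it needs ActiveSupport loaded.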

Take a look at the strip! method. Note that it returns nil when there is nothing to strip:
>> @title = "abc"
=> "abc"
>> @title.strip!
=> nil
>> @title
=> "abc"
>> @title = " abc "
=> " abc "
>> @title.strip!
=> "abc"
>> @title
=> "abc"
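Because strip! returns nil when nothing was removed, chaining on its return value is fragile. The following sketch contrasts the two styles; clean_title and clean_title_in_place are hypothetical names used only for illustration.

```ruby
# strip always returns a string, so it is safe to chain on.
def clean_title(title)
  title.strip.upcase
end

# strip! mutates the receiver and may return nil, so its return value
# is ignored here and the (now stripped) receiver is used instead.
def clean_title_in_place(title)
  title.strip!
  title.upcase
end

p "abc".dup.strip!                      # nil: nothing was stripped
p clean_title(" abc ")
p clean_title_in_place(" abc ".dup)
# "abc".dup.strip!.upcase would raise NoMethodError (upcase called on nil)
```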

What does the screenshot look like when you do:
def address=(a)
write_attribute(:address, a.strip.unpack("C*").join('-') )
end
Update based on comment answers. Another way to get rid of the \r's at the end of each line:
def address=(a)
a = a.strip.split(/\r\n/).join("\n")
write_attribute(:address, a)
end
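The unpack("C*") trick above prints the byte values of the stored string, which makes invisible characters easy to spot: 13 is "\r" (CR) and 10 is "\n" (LF). A small sketch of both steps outside of a model, with hypothetical helper names:

```ruby
# Show the byte values of a string, joined with dashes.
def byte_dump(text)
  text.unpack("C*").join("-")
end

# Convert CRLF line endings to plain LF, as in the writer above.
def normalize_newlines(text)
  text.strip.split(/\r\n/).join("\n")
end

puts byte_dump("a\r\nb")                      # 97-13-10-98
puts byte_dump(normalize_newlines("a\r\nb"))  # 97-10-98
```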

Related

Send a json with a multiline string instead of \n

I have a string with \n as a join result:
my_var = ['name','phone','age'].join("\n")
I will send this string in the body of a post as a Json:
def body
{ "str": my_var }
end
I want to send the var in multiple lines instead of sending the \n like this:
"name\nphone\nage"
EDIT:
I am sending this string joined with \n to another dev team as JSON in the body of my POST request to their URI. But they don't want to receive a string with \n and replace or translate it on their side. They want to receive it in multiple lines.
You can use a different separator and then, on the receiving side, split the whole string into lines on that delimiter:
my_delimiter = "\n" # or my_delimiter = ';'
lines = my_var.split(my_delimiter)
Or you could iterate over ['name','phone','age'] and add them as separate json elements.
If this doesn't help, then please clarify your question.
They are receiving multiple lines, depending on their OS: https://en.wikipedia.org/wiki/Newline
2.1.10 :001 > my_var = ['name','phone','age'].join("\n")
=> "name\nphone\nage"
2.1.10 :002 > puts my_var
name
phone
age
=> nil
2.1.10 :003 > my_var.to_json
=> "\"name\\nphone\\nage\""
2.1.10 :004 > puts my_var.to_json
"name\nphone\nage"
=> nil
2.1.10 :005 >
The text you're sending them contains instructions on where a line break is (i.e. the \n character). The graphical display of it is up to their end.
That's the expected format of the JSON. A JSON string value must not contain a literal line break; the line break is written as the escape sequence \n.
require 'json'
json = { text: ['name','phone','age'].join("\n") }.to_json
puts JSON.parse( json )['text']
You can see that the parser restores the line breaks correctly.
pry(main)> puts JSON.parse( json )['text']
name
phone
age
=> nil
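The two suggestions in this thread (one string with escaped line breaks, or one JSON element per line) can be compared side by side. This is only a sketch: the keys text and lines and the helper names are assumptions, and the receiving team would have to agree on whichever shape is used.

```ruby
require "json"

# One string value: the line breaks travel as the two-character escape \n.
def as_single_string(fields)
  { text: fields.join("\n") }.to_json
end

# Alternative: one JSON array element per line, no line breaks inside values.
def as_array(fields)
  { lines: fields }.to_json
end

fields = ["name", "phone", "age"]
puts as_single_string(fields)                      # {"text":"name\nphone\nage"}
puts as_array(fields)                              # {"lines":["name","phone","age"]}
puts JSON.parse(as_single_string(fields))["text"]  # three lines again
```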

Include apostrophe with .split()

I'm trying to display an array of words from a user's post. However the method I'm using treats an apostrophe like whitespace.
<%= var = Post.pluck(:body) %>
<%= var.join.downcase.split(/\W+/) %>
So if the input text was: The baby's foot
it would output the baby s foot,
but it should be the baby's foot.
How do I accomplish that?
Accepted answer is too naïve:
▶ "It’s naïve approach".split(/[^'\w]+/)
#⇒ [
# [0] "It",
# [1] "s",
# [2] "nai",
# [3] "ve",
# [4] "approach"
# ]
This matters because it is almost 2016 and many users want to use their real names, like, you know, José Østergaard. And the apostrophe is not the only punctuation character to handle, as you might notice.
▶ "It’s naïve approach".split(/[^'’\p{L}\p{M}]+/)
#⇒ [
# [0] "It’s",
# [1] "naïve",
# [2] "approach"
# ]
Further reading: Character Properties.
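An alternative to splitting on separators is to collect the words directly with scan. The sketch below reuses the same character properties; the method name words is an assumption, and the typographic apostrophe is written as \u2019 so the pattern does not depend on how the file is typed.

```ruby
# Collect words instead of splitting on separators. Letters (\p{L}),
# combining marks (\p{M}), the ASCII apostrophe and the typographic
# apostrophe (U+2019) are treated as word characters here.
def words(text)
  text.downcase.scan(/[\p{L}\p{M}'\u2019]+/)
end

p words("The baby's foot")
p words("It\u2019s na\u00efve approach")
```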
Along the lines of mudasobwa's answer, here's what \w and \W bring to the party:
chars = [*' ' .. "\x7e"].join
# => " !\"\#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~"
That's the usual visible lower-ASCII characters we'd see in code. See the Regexp documentation for more information.
Grabbing the characters that match \w returns:
chars.scan(/\w+/)
# => ["0123456789",
# "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
# "_",
# "abcdefghijklmnopqrstuvwxyz"]
Conversely, grabbing the characters that don't match \w, or that match \W:
chars.scan(/\W+/)
# => [" !\"\#$%&'()*+,-./", ":;<=>?@", "[\\]^", "`", "{|}~"]
\w is defined as [a-zA-Z0-9_], which is not what you would normally call "word" characters. They are rather the characters typically used in variable names.
If you're dealing with only lower-ASCII characters, use the character-class
[a-zA-Z]
For instance:
chars = [*' ' .. "\x7e"].join
lower_ascii_chars = '[a-zA-Z]'
not_lower_ascii_chars = '[^a-zA-Z]'
chars.scan(/#{lower_ascii_chars}+/)
# => ["ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz"]
chars.scan(/#{not_lower_ascii_chars}+/)
# => [" !\"\#$%&'()*+,-./0123456789:;<=>?@", "[\\]^_`", "{|}~"]
Instead of defining your own, you can take advantage of the POSIX definitions and character properties:
chars.scan(/[[:alpha:]]+/)
# => ["ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz"]
chars.scan(/\p{Alpha}+/)
# => ["ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz"]
Regular expressions always seem like a wonderful new wand to wave when extracting information from a string, but, like the Sorcerer's Apprentice found out, they can create havoc when misused or not understood.
Knowing this should help you write a bit more intelligent patterns. Apply that to what the documentation shows and you should be able to easily figure out a pattern that does what you want.
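One practical difference between the hand-written class and the POSIX bracket shows up with non-ASCII input. A short sketch, assuming UTF-8 strings (the method names are made up for illustration):

```ruby
# Hand-written class: ASCII letters only.
def ascii_letters(text)
  text.scan(/[a-zA-Z]+/)
end

# POSIX bracket: in Ruby this also matches non-ASCII letters in UTF-8 strings.
def unicode_letters(text)
  text.scan(/[[:alpha:]]+/)
end

p ascii_letters("Jos\u00e9")    # => ["Jos"]
p unicode_letters("Jos\u00e9")  # => ["José"]
```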
You can use the regex below instead of /\W+/:
var.join.downcase.split(/[^'\w]+/)
/\W/ matches every non-word character, and the apostrophe is one such non-word character.
To stay as close as possible to the original intent, we can use /[^'\w]/, which matches any character that is neither an apostrophe nor a word character.
Running that string through irb with the same split call that you wrote in your comment gets this:
irb(main):008:0> "The baby's foot".split(/\W+/)
=> ["The", "baby", "s", "foot"]
However, if you use split without an explicit delimiter, you get the split you're looking for:
irb(main):009:0> "The baby's foot".split
=> ["The", "baby's", "foot"]
Does that get you what you're looking for?
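One caveat with the plain split: it only breaks on whitespace, so any punctuation stays attached to the neighbouring word. A tiny sketch (whitespace_words is a hypothetical name):

```ruby
# split without arguments breaks on runs of whitespace only.
def whitespace_words(text)
  text.split
end

p whitespace_words("The baby's foot.")  # => ["The", "baby's", "foot."]
```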

How to display "\s" as string in ruby

I want to turn the string " - " in Ruby into something that can be used as a regexp. I need something like this:
my_regexp => "\s?-\s?"
However, I have a problem with special characters: the "\s" sequence isn't shown correctly. I tried a few ways, without success.
INPUT => OUTPUT
"\s?" => " ?"
"\\s?" => "\\s?"
Have you any idea how to solve that?
\\ is just an escaped \.
If you print it with puts, you will see the actual string.
>> '\s' # == "\\s"
=> "\\s"
>> puts '\s'
\s
=> nil
BTW, "\s" (not '\s') is another representation of whitespace " ":
>> "\s" == " "
=> true
Most likely, what you're seeing is the result of how IRB displays values. Your second example is correct (the actual string contains only a single backslash, which you can confirm by printing it or by creating a new Regexp object from it):
>> "\\s?"
"\\s?"
>> puts "\\s?"
\s?
>> Regexp.new "\\s?"
/\s?/
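Putting this together for the original goal, a separator such as " - " can be turned into a pattern by replacing each space with the text \s?. This is only a sketch: loose_separator is a hypothetical helper, and it assumes the separator contains no other regexp metacharacters (those would need Regexp.escape first).

```ruby
# Turn each literal space in a separator into an optional-whitespace token.
# The block form of gsub inserts its result verbatim, so the single-quoted
# '\s?' (backslash, s, question mark) needs no extra escaping.
def loose_separator(separator)
  source = separator.gsub(" ") { '\s?' }
  Regexp.new(source)
end

pattern = loose_separator(" - ")
p pattern                  # /\s?-\s?/
p "a - b".split(pattern)   # ["a", "b"]
p "a-b".split(pattern)     # ["a", "b"]
```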

how to replace an apostrophe using gsub

I want to replace the apostrophe (') in a name with "backslash apostrophe" (\'), but unfortunately I can't get such a simple thing to work.
So in irb I tried the following:
x = "stack's"
x.gsub(/[\']/,"\'")
Somehow it is not working: I am getting the same result, stack's, instead of stack\'s.
Try this:
x = "anupam's"; puts x.gsub("'", "\\\\'")
Try this out:
x.gsub(/[']/,"\\\\\'")
Result:
1.9.3p0 :014 > puts x.gsub(/[']/,"\\\\\'")
anupam\'s
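The doubled backslashes in the answers above are needed because, inside a gsub replacement string, the sequence \' is itself special: it stands for the text after the match. A way to sidestep the counting is the block form of gsub, whose return value is inserted literally. A minimal sketch (escape_apostrophes is a hypothetical name):

```ruby
# Block form: the block's return value is inserted as-is, so "\\'"
# (one backslash plus an apostrophe) needs no further doubling.
def escape_apostrophes(text)
  text.gsub("'") { "\\'" }
end

puts escape_apostrophes("stack's")   # stack\'s

# For comparison, the string form treats \' as "text after the match":
puts "stack's".gsub("'", "\\'")      # stackss
```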
Here's a Ruby variant of PHP's addslashes function (from http://www.ruby-forum.com/topic/113067#263640). This method also escapes \ in the string, by doubling it:
class String
def addslashes
self.gsub(/['"\\\x0]/,'\\\\\0')
end
end
Which would correctly escape anupam's:
"anupam's".addslashes # => "anupam\\'s"

\n to <br> and multiple \n to <p> question

I use the bbruby gem to replace BBCode in text with HTML.
It replaces a single \r\n or \n with <br>, and multiple consecutive \r\n or \n with <p>.
# https://github.com/cpjolicoeur/bb-ruby/blob/master/lib/bb-ruby.rb
def simple_format(text)
start_tag = '<p>'
text = text.to_s.dup
text.gsub!(/\r\n?/, "\n") # \r\n and \r => \n
text.gsub!(/\n\n+/, "</p>\n\n#{start_tag}") # 2+ newline => paragraph
text.gsub!(/([^\n]\n)(?=[^\n])/, '\1<br />') # 1 newline => br
text.insert 0, start_tag
text << "</p>"
end
It looks fine!
But when the text contains a <table>, the output becomes a mess! I want to avoid replacing \n when it is inside a table tag. I tried to replace the \n in the table before bbruby processes the text, but it doesn't work.
text.gsub!(/\r\n?/, "\n")
Should be
text.gsub!(/\r?\n/, "\n")
You could get into look-ahead and look-behind in your regex to see if you're within a table tag (depending on the version of Ruby you're using, this may not be available to you). You may want to instead just start your method by splitting your string on table tags, giving you an odd number of strings. Run the regexes above only on even-indexed strings. Then join the strings together with table tags. This would allow you to properly terminate and start paragraph tags and let you ignore the line breaks in the tables.
def simple_format( text )
strings = text.split(/<\/?table>/)
strings.each_with_index do |string, i|
if i % 2 == 0 # even index == outside of table tags
string.gsub!(/\r?\n/, "\n") # \r\n and \n => \n
# ...
strings[i] = "<p>" + string + "</p>"
else # odd index == inside of table tags
strings[i] = "<table>" + string + "</table>"
end
end
strings.join
end
That said, you may want to run away from regex entirely for this as the solution I described assumes that there are no table tags within table tags or unterminated table tags.
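A runnable variation on the splitting idea is sketched below. It is not the bb-ruby implementation: br_outside_tables is a hypothetical helper that only inserts <br /> tags, and it carries the same assumption as above (no nested or unterminated tables). It uses a capture group in split so that each table block is kept as its own element.

```ruby
def br_outside_tables(text)
  # The capture group makes split keep each <table>...</table> block
  # as its own array element instead of discarding it.
  parts = text.to_s.split(%r{(<table.*?</table>)}m)
  parts.map do |part|
    if part.start_with?("<table")
      part                                   # leave table markup untouched
    else
      part.gsub(/\r\n?/, "\n").gsub("\n", "<br />\n")
    end
  end.join
end

puts br_outside_tables("a\nb<table>\n<tr></tr>\n</table>c\r\nd")
```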
well, somewhat like this?
def simple_format( text )
return text if ( text =~ /(<table.*>)/ ) # return text unchanged
start_tag = '<p>'
text = text.to_s.dup
text.gsub!(/\r\n?/, "\n") # \r\n and \r => \n
text.gsub!(/\n\n+/, "</p>\n\n#{start_tag}") # 2+ newline => paragraph
text.gsub!(/([^\n]\n)(?=[^\n])/, '\1<br />') # 1 newline => br
text.insert 0, start_tag
text << "</p>"
end
bbcode and HTML do not mix. In fact bbcode was designed especially to NOT allow html tags. Since this is the design of bbcode, I don't see ways to hack around it. If you wish to continue using bbcode, you should consider it's not going to work with HTML input.
As far as I know, bbcode does not have syntax for html tables. If you absolutely need tables, consider switching to a different parser, or allow a full HTML editor like tinymce.

Resources