How to avoid downcasing acronyms in Ruby / Rails [closed]

I have a field that contains titles like "IT Professional" or "DB Administrator".
I want to display this in the middle of a sentence, so I need to downcase it. Unfortunately, this also downcases the acronyms and I end up with "Thanks for joining a community of it professionals".
A good start would be the solution mentioned by Grantovich below, i.e. specifying my acronyms in config/initializers/inflections.rb:
ActiveSupport::Inflector.inflections do |inflect|
  inflect.acronym "IT"
  inflect.acronym "DB"
end
The problem with going this route is that firstly, I don't want to store them in lower case as suggested as part of the solution because they are titles and should be stored with capitals. Secondly, they are already defined in uppercase and it would be a bad idea to suddenly make them lower case.
Solution Found: Since I want the title to appear in the middle of a sentence, hence the need for lower case, I solved it by downcasing the title, constructing the sentence and then calling #humanize on that. Humanize will capitalize the first letter of the sentence and any defined acronyms.

If possible, I would store the strings as "IT professional", "DB administrator", etc. with all letters except for the acronyms already downcased. Then you can add your acronyms to the inflector and use #titleize to convert to title case when needed. In terms of edge cases and code maintenance burden, this is a better solution than writing your own code to do "selective downcasing".

If we assume that by acronym, you mean any word in your string that is made of 2 or more capitals in a row, then you could do something like this:
def smart_case(field)
  field.to_s.split(' ').map { |word|
    /[A-Z][A-Z]+/.match(word) ? word : word.downcase
  }.join(' ')
end
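As a usage sketch of the approach above (the sentence template and the trailing "s" are my own assumptions for illustration, not something from the question):

```ruby
# Downcase every word except those containing two or more consecutive capitals.
def smart_case(field)
  field.to_s.split(' ').map { |word|
    word =~ /[A-Z]{2,}/ ? word : word.downcase
  }.join(' ')
end

title = "IT Professional"
puts "Thanks for joining a community of #{smart_case(title)}s"
# prints: Thanks for joining a community of IT professionals
```

Note that any word written fully in capitals is treated as an acronym, so a title stored as "SENIOR Manager" would keep "SENIOR" as-is.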

This is an ugly way to do it but:
def format_me(str)
  str = str.downcase # work on a copy instead of mutating the caller's string
  acronym_words = ["IT Professional", "DB Administrator"]
  acronym_words.each do |a|
    # restore the original casing of each known title
    str.gsub!(a.downcase, a)
  end
  capitalize_next = true
  str.split.map do |word|
    # upcase only the first letter so acronyms such as "IT" stay intact
    word[0] = word[0].upcase if capitalize_next
    capitalize_next = word.end_with?(".", "!", "?")
    word
  end.join(" ")
end
This would be difficult to maintain unless you know the exact strings you are looking for but it will put out a correctly formatted sentence with the items you requested.

I would do it like this:
do_not_downcase = ["IT", "DB"] # Complete the list with your favourite words
str = "Welcome IT Professionals"
res = str.split(" ").map do |word|
  do_not_downcase.include?(word) ? word : word.downcase
end.join(" ")
puts res
# => welcome IT professionals

Related

Started learning Ruby from scratch, why is user_input equivalent to gets.chomp? [closed]

I have started learning Ruby from scratch; my only prior background is some knowledge of HTML and CSS. For training I use Code Academy. I have questions and can't always find an answer I can understand. I need help understanding the following:
user_input = gets.chomp
user_input.downcase!
Explain why user_input is equivalent to gets.chomp and what that means, thanks in advance!
In Ruby = is used to assign values to variables, as in:
x = 1
y = x
Where y assumes the value of x at the moment that line is executed. This is not to be confused with "equivalence" as in x=y in a mathematical sense where you're establishing some kind of permanent relationship.
In Ruby methods return a value, even if that value is "nothing", or nil. In the case of gets, it returns a String. You can call chomp on that, or any other thing you need to achieve your objective, like chaining on downcase.
On its own gets.chomp will read a line of input, strip off the trailing linefeed character, and then throw the result in the trash. Assigning this to a variable preserves that output.
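A minimal sketch of that difference. StringIO is used here only as a stand-in for the keyboard so the snippet runs without typing anything; that substitution is my assumption, the original code reads from standard input:

```ruby
require 'stringio'

# StringIO stands in for the keyboard; it holds two identical lines of "input".
input = StringIO.new("Hello World\nHello World\n")

input.gets.chomp              # reads a line, strips "\n", result is discarded
user_input = input.gets.chomp # same call, but the result is kept in a variable

puts user_input.inspect       # prints "Hello World"
user_input.downcase!          # modifies the string in place
puts user_input.inspect       # prints "hello world"
```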
To understand it, first break it down into steps:
1. Accept user input
2. Clean the user input (using chomp https://apidock.com/ruby/String/chomp)
3. Downcase it
user_input = gets # will return the value entered by the user
user_input = user_input.chomp # will remove the trailing \n
# A more idiomatic way to achieve the above steps in a single line
user_input = gets.chomp
# Finally downcase
user_input.downcase!
# By that same principle the entire code can be written in a single line
user_input = gets.chomp.downcase
user_input is equivalent to gets.chomp
Remember, everything in Ruby is an object. So gets returns a String object, so does chomp and so does downcase. Hence with this logic you are essentially calling instance methods on the String class
String.new("hello") == "hello" # true
# "hello".chomp is same as String.new("hello").chomp
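To make the chaining concrete, here is a small sketch (the variable names are made up for illustration). Each call returns a new String, while the bang variant changes its receiver:

```ruby
line = "Hello\n"

chomped = line.chomp        # new String without the trailing newline
lowered = chomped.downcase  # another new String

puts line.inspect     # "Hello\n" (the original is untouched)
puts chomped.inspect  # "Hello"
puts lowered.inspect  # "hello"
puts [line.class, chomped.class, lowered.class].inspect  # [String, String, String]

# The bang variant modifies the receiver and returns nil when nothing changed.
puts lowered.downcase!.inspect  # nil
```

That nil return value is one reason to avoid chaining further calls onto downcase!.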

Only allow English characters in a string in Rails [closed]

I've been using this:
if value.chars.count < value.bytes.count
  puts "Some non english characters found."
end
But this incorrectly marks the following as non-English.
React and You: A Designer’s Point of View
How can I easily check if a string has no Asian/French/Russian characters?
I can probably iterate through each char in the string and if .bytes == 1 add it to a temp var. Then if that temp var is not nil it means it's an English character. But this seems rather convoluted.
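For reference, a small sketch of why that title trips the check, assuming the string is UTF-8 encoded (Ruby's default for source files): the curly apostrophe is one character but three bytes.

```ruby
apostrophe = "\u2019"   # the curly apostrophe used in the title

puts apostrophe.chars.count   # 1
puts apostrophe.bytes.count   # 3 (UTF-8 encodes it as three bytes)

title = "React and You: A Designer\u2019s Point of View"
puts title.ascii_only?        # false

# Collect the characters that are responsible for the extra bytes.
offending = title.chars.reject(&:ascii_only?)
puts offending.length         # 1
```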
As pointed out in the comments, this solution will reject some English words containing letters that may be considered "non-English" characters.
Using the answer provided in "How to read only English characters", you could adjust it to remove any punctuation character or space and compare the result with that same regex, something like this:
str = "React and You: A Designer’s Point of View"
str.gsub(/[[:punct:]]|\s/, "") =~ /^[a-zA-Z]+$/
#=> 0
.gsub(/[[:punct:]]|\s/, "") will remove any punctuation character or space, so you can compare that with the /^[a-zA-Z]+$/ regexp.
Here are step by step examples:
str = "React and You: A Designer’s Point of View"
str.gsub!(/[[:punct:]]|\s/, "") #=> "ReactandYouADesignersPointofView"
str =~ /^[a-zA-Z]+$/ #=> 0
str = "Comment ça va?"
str.gsub!(/[[:punct:]]|\s/, "") #=> "Commentçava"
str =~ /^[a-zA-Z]+$/ #=> nil
If you are expecting numbers too, then change the regexp to: /^[a-zA-Z0-9]+$/.
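The two variants could be wrapped in a helper along these lines. This is only a sketch: english_only? and allow_digits are names I made up, and it uses \A and \z, which anchor to the whole string rather than to a single line.

```ruby
# Hypothetical helper: true when, after dropping punctuation and whitespace,
# only ASCII letters (and optionally digits) remain.
def english_only?(str, allow_digits: false)
  stripped = str.gsub(/[[:punct:]]|\s/, "")
  pattern = allow_digits ? /\A[a-zA-Z0-9]+\z/ : /\A[a-zA-Z]+\z/
  pattern.match?(stripped)
end

puts english_only?("React and You: A Designer\u2019s Point of View")  # true
puts english_only?("Comment \u00E7a va?")                             # false
puts english_only?("Version 2.0", allow_digits: true)                 # true
```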
As pointed out in a comment, note that [[:punct:]] will also allow non-English punctuation characters such as ¿ or ¡. If those characters should cause the sentence to be rejected, it may be better to avoid gsub and compare against a custom regex listing every allowed character, for example:
str =~ /^[a-zA-Z0-9\[\]{}\\*:;#$%&#?!|’'"-\.\/_\s]+$/
Note: this is just an example with the most common characters I could think of; it needs to be customized with whichever characters you consider valid.

Ruby regex to find words starting with #

I'm trying to write a very simple regex to find all words in a string that start with the symbol #, and then change each word to a link, like you would see on Twitter, where you can mention other usernames.
So far I have written this
def username_link(s)
  s.gsub(/\#\w+/, "<a href='/username'>username</a>").html_safe
end
I know it's very basic and not much, but I'd rather write it on my own right now, to fully understand it, before searching GitHub to find a more complex one.
What I'm trying to find out is how I can reference the matched word and include it in place of username. Once I can do that, I can easily strip the first character, #, out of it.
Thanks.
You can capture using parentheses and backreference with \1 (and \2, and so on):
def username_link(s)
  s.gsub(/#(\w+)/, "<a href='/\\1'>\\1</a>").html_safe
end
You should use gsub with back references:
str = "I know it's very basic and not much, but #tim I'd rather write it on my own."
def username_to_link(str)
  str.gsub(/\#(\w+)/, '<a href="/\1">#\1</a>')
end
puts username_to_link(str)
#=> I know it's very basic and not much, but <a href="/tim">#tim</a> I'd rather write it on my own.
The following regex should handle corner cases that the other answers ignore:
def auto_username_link(s)
  s.gsub(/(^|\s)\#(\w+)($|\s)/, "\\1<a href='/\\2'>\\2</a>\\3").html_safe
end
It should ignore strings like "someone#company" or "#username-1" while converting everything like "Hello #username rest of message"
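One case the pattern above may miss is two mentions separated by a single space: the trailing (\s) consumes the space, so the next mention no longer has whitespace in front of it to match. A possible variant, sketched with lookarounds so the neighbouring characters are checked but not consumed (link_usernames is a made-up name, and html_safe is left out so it runs outside Rails):

```ruby
# Hypothetical variant: the lookarounds require "no non-space character"
# on either side of the mention without including it in the match.
def link_usernames(s)
  s.gsub(/(?<!\S)\#(\w+)(?!\S)/) { "<a href='/#{$1}'>#{$1}</a>" }
end

puts link_usernames("Hello #username rest of message")
# prints: Hello <a href='/username'>username</a> rest of message
puts link_usernames("someone#company")   # prints: someone#company
puts link_usernames("#username-1")       # prints: #username-1
```

In Rails you would still call html_safe on the result, and only after escaping the rest of the user-supplied text.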
How about this:
def convert_names_to_links(str)
  str = " " + str
  result = str.gsub(
    /
      (?<=\W)   # Look for a non-word character (space, punctuation, etc.) preceding
      \#        # a "#" character, followed by
      (\w+)     # a word character, one or more times
    /xm,        # Standard normalizing flags
    '<a href="/\1">#\1</a>'
  )
  result[1..-1]
end
my_str = "#tim #tim #tim, ##tim,#tim t#mmy?"
puts convert_names_to_links(my_str)
--output:--
<a href="/tim">#tim</a> <a href="/tim">#tim</a> <a href="/tim">#tim</a>, #<a href="/tim">#tim</a>,<a href="/tim">#tim</a> t#mmy?

How to sort .edu email domains? [closed]

I am using Ruby on Rails to make a university-exclusive website that categorizes all registered users into their specific universities via their ".edu" email. Nearly all US-based universities have an "xyz.edu" email domain. In essence, everyone that signs up with their ".edu" email would all be categorized with a similar "domain.edu".
I've searched for a regex to look for like-domains.edu and assign them into a variable or specific indexes, but I must be looking in the wrong place because I cannot find how to do this.
Would I use regex for this? Or maybe a method after their email has been verified?
I would appreciate any help or feedback I can get.
You could use a regex to extract domain names:
"gates@harvard.edu" =~ /.*@(.*)$/
This simple regexp will capture everything after the @ symbol.
However, what you have to think about is how to handle cases like gates@harvard.edu vs gates@seas.harvard.edu.
My example will parse them out as different entities: harvard.edu vs seas.harvard.edu.
I would probably go ahead and create an institution/university/group model that would hold those users. It would be easier now than later down the line. But, in an effort to answer your question, you could do something like:
array_of_emails = ['d@xyz.edu', 'a@abc.edu', 'c@xyz.edu', 'b@abc.edu']
array_of_emails.sort_by! { |email| "#{email[email.index('@')..-1]}#{email[0..email.index('@')]}" }
EDIT: Changed sort! to sort_by!
Dealing with domains is going to get a lot more complex in the future, with new TLDs coming on line. Assuming that .edu is the only educational TLD will be wrong.
A simple way to grab just the domain for now is:
"gates@harvard.edu"[/(@.+)$/, 1] # => "@harvard.edu"
That will handle things like:
"gates@mail.harvard.edu"[/(@.+)$/, 1] # => "@mail.harvard.edu"
If you don't want the @, simply shift the opening parenthesis right one character:
pattern = /@(.+)$/
"gates@harvard.edu"[pattern, 1] # => "harvard.edu"
"gates@mail.harvard.edu"[pattern, 1] # => "mail.harvard.edu"
If you want to normalize the domain to strip off sub-domains, you can do something like:
pattern = /(\w+\.\w+)$/
"harvard.edu"[pattern, 1] # => "harvard.edu"
"mail.harvard.edu"[pattern, 1] # => "harvard.edu"
which only grabs the last two "words" that are separated by a single "." character.
That's somewhat naive, as non-US domains can have a country code, so if you need to handle those you can do something like:
pattern = /(\w+\.edu(?:\.\w+)?)$/
"harvard.edu"[pattern, 1] # => "harvard.edu"
"harvard.edu.cc"[pattern, 1] # => "harvard.edu.cc"
"mail.harvard.edu.cc"[pattern, 1] # => "harvard.edu.cc"
And, as to whether you should do this before or after you've verified their address? Do it AFTER. Why waste your CPU time and disk space processing invalid addresses?
array_of_emails = ['d@xyz.edu', 'a@abc.edu', 'c@xyz.edu', 'b@abc.edu']
# sort by everything from the "@" onwards, i.e. by domain
x = array_of_emails.sort_by { |a| a.match(/@.*/)[0] }
x.each do |a|
  puts a
end
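If the goal is to categorize users rather than just sort them, the domain extraction above can be combined with group_by. This is only a sketch: university_domain is a made-up helper, it assumes addresses use the usual @ separator, and it reuses the ".edu with optional country code" pattern from the earlier answer.

```ruby
# Hypothetical helper: "gates@seas.harvard.edu" -> "harvard.edu"
# Returns nil when the address has no recognizable .edu domain.
def university_domain(email)
  host = email[/@(.+)\z/, 1]
  host && host.downcase[/(\w+\.edu(?:\.\w+)?)\z/, 1]
end

emails = ['d@xyz.edu', 'a@abc.edu', 'gates@seas.harvard.edu', 'c@xyz.edu', 'gates@harvard.edu']

# Hash of normalized domain => addresses that belong to it.
by_university = emails.group_by { |email| university_domain(email) }

by_university.each do |domain, members|
  puts "#{domain}: #{members.join(', ')}"
end
```

In a Rails app the same key would more likely be stored on an institution model, as suggested above, instead of being recomputed each time.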

Interpret newlines as <br>s in markdown (Github Markdown-style) in Ruby

I'm using markdown for comments on my site and I want users to be able to create line breaks by pressing enter instead of space space enter.
How can I do this in Ruby? You'd think Github Flavored Markdown would be exactly what I need, but (surprisingly), it's quite buggy.
Here's their implementation:
# in very clear cases, let newlines become <br /> tags
text.gsub!(/^[\w\<][^\n]*\n+/) do |x|
  x =~ /\n{2}/ ? x : (x.strip!; x << "  \n")
end
This logic requires that the line start with a \w for a linebreak at the end to create a <br>. The reason for this requirement is that you don't want to mess with lists: (But see the edit below; I'm not even sure this makes sense)
* we don't want a <br>
* between these two list items
However, the logic breaks in these cases:
[some](http://google.com)
[links](http://google.com)
*this line is in italics*
another line
> the start of a blockquote!
another line
I.e., in all of these cases there should be a <br> at the end of the first line, and yet GFM doesn't add one
Oddly, this works correctly in the javascript version of GFM.
Does anyone have a working implementation of "new lines to <br>s" in Ruby?
Edit: It gets even more confusing!
If you check out Github's official Github Flavored Markdown repository, you'll find yet another newline to <br> regex!:
# in very clear cases, let newlines become <br /> tags
text.gsub!(/(\A|^$\n)(^\w[^\n]*\n)(^\w[^\n]*$)+/m) do |x|
  x.gsub(/^(.+)$/, "\\1  ")
end
I have no clue what this regex means, but it doesn't do any better on the above test cases.
Also, it doesn't look like the "don't mess with lists" justification for requiring that lines start with word characters is valid to begin with. I.e., standard markdown list semantics don't change regardless of whether you add 2 trailing spaces. Here:
* item 1
* item 2
* item 3
In the source of this question there are 2 trailing spaces after "item 1", and yet if you look at the HTML, there is no superfluous <br>
This leads me to think the best regex for converting newlines to <br>s is just:
text.gsub!(/^[^\n]+\n+/) do |x|
  x =~ /\n{2}/ ? x : (x.strip!; x << "  \n")
end
Thoughts?
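For anyone who wants to try that last regex against the test cases above, here is a self-contained sketch. hard_breaks is a made-up name, and unlike the snippet above it uses rstrip and returns a new string instead of modifying text in place, so leading indentation is preserved.

```ruby
# Append two spaces (a markdown hard break) to every line that is followed
# by exactly one newline; blank-line paragraph breaks are left alone.
def hard_breaks(text)
  text.gsub(/^[^\n]+\n+/) do |chunk|
    chunk =~ /\n{2}/ ? chunk : chunk.rstrip + "  \n"
  end
end

puts hard_breaks("*this line is in italics*\nanother line\n").inspect
# prints: "*this line is in italics*  \nanother line  \n"
puts hard_breaks("paragraph one\n\nparagraph two").inspect
# prints: "paragraph one\n\nparagraph two"
```

List items get the two trailing spaces as well, which, per the observation above, should not change how standard markdown renders them.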
I'm not sure if this will help, but I just use simple_format() from ActionView::Helpers::TextHelper:
my_text = "Here is some basic text...\n...with a line break."
simple_format(my_text)
output => "<p>Here is some basic text...\n<br />...with a line break.</p>"
Even if it doesn't meet your specs, looking at the simple_format() source code .gsub! methods might help you out writing your own version of required markdown.
A little too late, but perhaps useful for other people. I've gotten it to work (but not thoroughly tested) by preprocessing the text using regular expressions, like so. It's hideous as a result of the lack of zero-width lookbehinds, but oh well.
# Append two spaces to a simple line, if it ends in newline, to render the
# markdown properly. Note: do not do this for lists, instead insert two newlines. Also, leave double newlines
# alone.
# blank? and present? come from ActiveSupport, so this needs Rails loaded
text.gsub!(/^ ([\*\+\-]\s+|\d+\s+)? (.+?) (\ \ )? \r?\n (\r?\n|[\*\+\-]\s+|\d+\s+)? /xi) do
  full, pre, line, spaces, post = $~.to_a
  if post != "\n" && pre.blank? && post.blank? && spaces.blank?
    "#{pre}#{line}  \n#{post}"
  elsif pre.present? || post.present?
    "#{pre}#{line}\n\n#{post}"
  else
    full
  end
end
