Why does Rails titlecase add a space to a name? - ruby-on-rails

Why does the titlecase mess up the name? I have:
John Mark McMillan
and it turns it into:
>> "john mark McMillan".titlecase
=> "John Mark Mc Millan"
Why is there a space added to the last name?
Basically I have this in my model:
before_save :capitalize_name
def capitalize_name
self.artist = self.artist.titlecase
end
I am trying to make sure that all the names are titlecase in the DB, but in situations with a camelcase name it fails. Any ideas how to fix this?

You can always do it yourself if Rails isn't good enough:
class String
    def another_titlecase
        self.split(" ").collect{|word| word[0] = word[0].upcase; word}.join(" ")
    end
end
"john mark McMillan".another_titlecase
=> "John Mark McMillan"
This method is a small fraction of a second faster than the regex solution:
My solution:
ruby-1.9.2-p136 :034 > Benchmark.ms do
ruby-1.9.2-p136 :035 > "john mark McMillan".split(" ").collect{|word|word[0] = word[0].upcase; word}.join(" ")
ruby-1.9.2-p136 :036?> end
=> 0.019311904907226562
Regex solution:
ruby-1.9.2-p136 :042 > Benchmark.ms do
ruby-1.9.2-p136 :043 > "john mark McMillan".gsub(/\b\w/) { |w| w.upcase }
ruby-1.9.2-p136 :044?> end
=> 0.04482269287109375
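A single timing of a sub-millisecond call is mostly noise, so the difference above should be read with caution. Below is a minimal sketch of a more repeatable comparison, assuming Ruby's standard Benchmark.realtime (Benchmark.ms comes from ActiveSupport). The method names, constants and iteration count are illustrative and not from the original answer:

```ruby
require "benchmark"

NAME = "john mark McMillan"
ITERATIONS = 100_000 # arbitrary; raise it for steadier numbers

# Upcase only the first character of each space-separated word.
def split_titlecase(name)
  name.split(" ").map { |word| word[0].upcase + word[1..-1] }.join(" ")
end

# Upcase the first word character after every word boundary.
def regex_titlecase(name)
  name.gsub(/\b\w/) { |char| char.upcase }
end

split_seconds = Benchmark.realtime { ITERATIONS.times { split_titlecase(NAME) } }
regex_seconds = Benchmark.realtime { ITERATIONS.times { regex_titlecase(NAME) } }

puts format("split: %.4fs, regex: %.4fs", split_seconds, regex_seconds)
```

The two approaches are not interchangeable: the regex version also capitalizes after a hyphen ("name-name" becomes "Name-Name"), while the split version only touches the first letter after a space.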

Hmm, that's odd.. but you could write a quick custom regex to avoid using that method.
class String
def custom_titlecase
self.gsub(/\b\w/) { |w| w.upcase }
end
end
"John Mark McMillan".custom_titlecase # => "John Mark McMillan"

If all you want is to ensure that each word starts with a capital:
class String
def titlecase2
self.split(' ').map { |w| w[0] = w[0].upcase; w }.join(' ')
end
end
irb(main):016:0> "john mark McMillan".titlecase2
=> "John Mark McMillan"

Edited (inspired by The Tin Man's suggestion)
A hack will be:
class String
def titlecase
gsub(/(?:_|\b)(.)/){$1.upcase}
end
end
p "john mark McMillan".titlecase
# => "John Mark McMillan"
Note that the string 'john mark McMillan' is inconsistent in capitalization, and is somewhat unexpected as a human input, or if it is not from a human input, you probably should not have the strings stored in that way. A string like 'john mark mc_millan' is more consistent, and would more likely appear as a human input if you define such convention. My answer will handle these cases as well:
p "john mark mc_millan".titlecase
# => "John Mark McMillan"

If you want to handle the case where someone has entered JOHN CAPSLOCK JOE as well as the others, I combined this one:
class String
def proper_titlecase
if self.titleize.split.length == self.split.length
self.titleize
else
self.split(" ").collect{|word| word[0] = word[0].upcase; word}.join(" ")
end
end
end
Depends if you want that kinda logic on a String method ;)

The documentation for titlecase says ([emphasis added]):
Capitalizes all the words and replaces some characters in the string to create a nicer looking title. titleize is meant for creating pretty output. It is not used in the Rails internals.
I'm only guessing here, but perhaps it regards PascalCase as a problem - maybe it thinks it's the name of an ActiveRecordModelClass.
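As far as I can tell, the space appears because titleize runs the string through underscore and humanize before capitalizing, and underscore splits a word at each lowercase-to-uppercase transition. The following is a rough plain-Ruby imitation of that chain, not ActiveSupport's actual code; rough_titleize is a made-up name used only for illustration:

```ruby
# Simplified stand-in for the underscore -> humanize -> capitalize chain.
def rough_titleize(text)
  # "McMillan" -> "Mc_Millan" -> "mc_millan"
  underscored = text.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  # "mc_millan" -> "mc millan" -> "Mc Millan"
  underscored.tr("_", " ").gsub(/\b[a-z]/) { |char| char.upcase }
end

puts rough_titleize("john mark McMillan")
```

Under these assumptions "McMillan" is treated like a camelcased identifier, which is why it comes back as two words.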

We have just added this which supports a few different cases that we face.
class String
# Default titlecase converts McKay to Mc Kay, which is not great
# May even need to remove titlecase completely in the future to leave
# strings unchanged
def self.custom_title_case(string = "")
return "" if !string.is_a?(String) || string.empty?
split = string.split(" ").collect do |word|
word = word.titlecase
# If we titlecase and it turns in to 2 words, then we need to merge back
word = word.match?(/\w/) ? word.split(" ").join("") : word
word
end
return split.join(" ")
end
end
And the rspec test
# spec/lib/modules/string_spec.rb
require 'rails_helper'
require 'modules/string'
describe "String" do
describe "self.custom_title_case" do
it "returns empty string if incorrect params" do
result_one = String.custom_title_case({ test: 'object' })
result_two = String.custom_title_case([1, 2])
result_three = String.custom_title_case()
expect(result_one).to eq("")
expect(result_two).to eq("")
expect(result_three).to eq("")
end
it "returns string in title case" do
result = String.custom_title_case("smiths hill")
expect(result).to eq("Smiths Hill")
end
it "caters for 'Mc' i.e. 'john mark McMillan' edge cases" do
result_one = String.custom_title_case("burger king McDonalds")
result_two = String.custom_title_case("john mark McMillan")
result_three = String.custom_title_case("McKay bay")
expect(result_one).to eq("Burger King McDonalds")
expect(result_two).to eq("John Mark McMillan")
expect(result_three).to eq("McKay Bay")
end
it "correctly cases uppercase words" do
result = String.custom_title_case("NORTH NARRABEEN")
expect(result).to eq("North Narrabeen")
end
end
end

You're trying to use a generic method for converting Rails' internal strings into more human-readable names. It's not designed to handle "Mc" and "Mac" and "Van Der" and any number of other compound spellings.
You can use it as a starting point, then special case the results looking for the places it breaks and do some fix-ups, or you can write your own method that includes special-casing those edge cases. I've had to do that several times in different apps over the years.
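A minimal sketch of the "write your own method with special cases" route, in plain Ruby. The method name fix_up_name and the single "Mc" rule are assumptions for illustration; a real application would need its own list of exceptions:

```ruby
# Capitalize each word, but leave mixed-case words exactly as typed,
# on the assumption that the user capitalized them deliberately.
def fix_up_name(name)
  name.split(" ").map do |word|
    if word.match?(/[a-z]/) && word.match?(/[A-Z]/)
      word
    else
      # "MCMILLAN" / "mcmillan" -> "Mcmillan" -> "McMillan"
      word.capitalize.sub(/\AMc([a-z])/) { "Mc#{$1.upcase}" }
    end
  end.join(" ")
end

puts fix_up_name("john mark McMillan")
puts fix_up_name("JOHN MCMILLAN")
```

Only all-lowercase or all-uppercase words are rewritten, so a name entered as "Ian Mcdonald" is left alone.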

You may also encounter names with two capital letters, such as McLaren, McDonald etc.
Have not spent time trying to improve it, but you could always do
Code
# Rails.root/config/initializers/string.rb
class String
def titleize_name
self.split(" ")
.collect{|word| word[0] = word[0].upcase; word}
.join(" ").gsub(/\b('?[a-z])/) { $1.capitalize }
end
end
Examples
[2] pry(main)> "test name".titleize_name
=> "Test Name"
[3] pry(main)> "test name-name".titleize_name
=> "Test Name-Name"
[4] pry(main)> "test McName-name".titleize_name
=> "Test McName-Name"

The "Why" question has already been answered...but as evidenced by the selected answer and upvotes, I think what most of us are ACTUALLY wanting is a silver bullet to deal with the hell that is name-formatting...While multiple capitals trigger that behavior, I've found that hyphenated names do the same.
These cases and many more have already been handled in the gem, NameCase.
In version 2.0 it only converts a string if the string is all uppercase or all lowercase, based on a defined ruleset as a best guess. I like this, because I'm sure the ruleset can never be 100% correct. For example, Ian McDonald (from Scotland) has a different capitalization from Ian Mcdonald (from Ireland). However, those names will be handled correctly at the time of input if the user is particular, and if not, the name can be corrected later and will retain its formatting.
My Solution:
# If desired, add string method once NameCase gem is added
class String
def namecase
NameCase(self)
end
end
Tests: (name.namecase)
test_names = ["john mark McMillan", "JOHN CAPSLOCK JOE", "test name", "test name-name", "test McName-name", "John w McHENRY", "ian mcdonald", "Ian McDonald", "Ian Mcdonald"]
test_names.each { |name| puts '# "' + name + '" => "' + name.namecase + '"' }
# "john mark McMillan" => "John Mark McMillan"
# "JOHN CAPSLOCK JOE" => "John Capslock Joe"
# "test name" => "Test Name"
# "test name-name" => "Test Name-Name"
# "test McName-name" => "Test McName-Name"
# "John w McHENRY" => "John w McHENRY" -FAIL
# "ian mcdonald" => "Ian McDonald"
# "Ian McDonald" => "Ian McDonald"
# "Ian Mcdonald" => "Ian Mcdonald"
If you feel you need to handle all of the corner cases on this page and don't care about losing names that may have been formatted at the start, eg. Ian Mcdonald (from Ireland)...you could use upcase first:
Tests: (name.upcase.namecase)
test_names.each { |name| puts '# "' + name + '" => "' + name.upcase.namecase + '"' }
# "john mark McMillan" => "John Mark McMillan"
# "JOHN CAPSLOCK JOE" => "John Capslock Joe"
# "test name" => "Test Name"
# "test name-name" => "Test Name-Name"
# "test McName-name" => "Test McName-Name"
# "John w McHENRY" => "John W McHenry"
# "ian mcdonald" => "Ian McDonald"
# "Ian McDonald" => "Ian McDonald"
# "Ian Mcdonald" => "Ian McDonald"
The only silver bullet is to go old school...ALL CAPS. But who wants that eyesore in their modern web app?

Related

Ruby: Can you determine if an object is having one of its methods called?

I'm not sure if I'm even asking the right question. I may be approaching the problem incorrectly, but basically I have this situation here:
obj = get_user(params)
obj.profile => {:name => "John D", :age => 40, :sex => "male"} #Has to be of class Hash
obj.profile.name => "John D"
obj.profile[:name] => "John D"
obj.profile.job => nil
So basically, I have to satisfy all of these conditions and I'm not sure exactly how to even approach this (I just learned Ruby today).
Note the dot notation for accessing the inner variables, otherwise I would have just had profile be a hash of symbols. So I've tried two methods, which only sort of get me there
Method 1: Make profile an OpenStruct
So this allows me to access name, age and sex using the dot notation, and it automatically returns nil if a key doesn't exist, however obj.profile is of the type OpenStruct instead of Hash
Method 2: Make profile its own class
With this I set them as instance variables, and I can use method_missing to return nil if they don't exist. But, I again run into the issue of obj.profile not being the correct type/class
Is there something I'm missing? Is there a way to maybe differentiate between
obj.profile
obj.profile.name
in the getter function and return either a hash or otherwise?
Can I change what is returned by my custom class for profile, so it returns a Hash instead?
I've even tried checking the args and **kwargs in the get function for obj.profile and neither of them seem to help, or populate if I call obj.profile.something
If it absolutely has to be a Hash:
require 'pp'
module JSHash
refine Hash do
def method_missing(name, *args, &block)
if !args.empty? || block
super(name, *args, &block)
else
self[name]
end
end
end
end
using JSHash
profile = {:name => "John D", :age => 40, :sex => "male"}
pp profile.name # "John D"
pp profile[:name] # "John D"
pp profile.job # nil
pp profile.class # Hash
But still better not to be a Hash, unless it absolutely needs to:
require 'pp'
class Profile < Hash
def initialize(hash)
self.merge!(hash)
end
def method_missing(name, *args, &block)
if !args.empty? || block
super(name, *args, &block)
else
self[name]
end
end
end
profile = Profile.new({:name => "John D", :age => 40, :sex => "male"})
pp profile.name
pp profile[:name]
pp profile.job
For only a few hash keys, you can easily define singleton methods like so:
def define_getters(hash)
hash.instance_eval do
def name
get_val(__method__)
end
def job
get_val(__method__)
end
def get_val(key)
self[key.to_sym]
end
end
end
profile = person.profile #=> {name: "John Doe", age: 40, gender: "M"}
define_getters(profile)
person.profile.name #=> "John Doe"
person.profile.job #=> nil
Reflects changed values as well (in case you were wondering):
person.profile[:name] = "Ralph Lauren"
person.profile.name #=> "Ralph Lauren"
With this approach, you won't have to override method_missing, create new classes inheriting from Hash, or monkey-patch the Hash class.
However, to be able to access unknown keys through method-calls and return nil instead of errors, you'll have to involve method_missing.
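One way to get that behavior without reopening the Hash class is to extend only the hashes that need it. This is a sketch under that assumption; the module name DotAccess is hypothetical:

```ruby
# Adds dot access for keys to a single Hash instance via method_missing.
module DotAccess
  def method_missing(name, *args, &block)
    # Calls with arguments or a block (e.g. setters) still raise NoMethodError.
    return super unless args.empty? && block.nil?
    self[name]
  end

  def respond_to_missing?(name, include_private = false)
    key?(name) || super
  end
end

profile = { name: "John D", age: 40, sex: "male" }.extend(DotAccess)

p profile.name   # value stored under :name
p profile.job    # nil, because the key is absent
p profile.class  # still Hash
```

Keys that collide with existing Hash methods (for example :size or :count) are not reachable this way, because the real method wins.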
This Hash override will accomplish what you're trying to do. All you need to do is include it with one of your class files that you're already loading.
class Hash
def method_missing(*args)
if args.size == 1
self[args[0].to_sym]
else
self[args[0][0..-2].to_sym] = args[1] # last char is chopped because the equal sign is included in the string, print out args[0] to see for yourself
end
end
end
See the following IRB output to confirm:
1.9.3-p194 :001 > test_hash = {test: "testing"}
=> {:test=>"testing"}
1.9.3-p194 :002 > test_hash.test
=> "testing"
1.9.3-p194 :003 > test_hash[:test]
=> "testing"
1.9.3-p194 :004 > test_hash.should_return_nil
=> nil
1.9.3-p194 :005 > test_hash.test = "hello"
=> "hello"
1.9.3-p194 :006 > test_hash[:test]
=> "hello"
1.9.3-p194 :007 > test_hash[:test] = "success"
=> "success"
1.9.3-p194 :008 > test_hash.test
=> "success"
1.9.3-p194 :009 > test_hash.some_new_key = "some value"
=> "some value"
1.9.3-p194 :011 > test_hash[:some_new_key]
=> "some value"

creating name helper, to split first and last names apart

I'm looking for some help on how to take an attribute and process it through a method to return something different. But I've never done this before and I'm not sure where to start. I thought I'd try changing a name:string attribute from "George Washington" or "John Quincy Adams" into first names only, "George" and "John".
I thought maybe a helper method would be best, such as
users_helper.rb
def first_name
end
and then call @user.name.first_name. Is this how it would work initially? Can someone explain where I'd go next to be able to pass @user.name into the method? I've seen things like this but don't quite understand the parentheses...
def first_name(name)
puts name
end
Could someone break down how Rails/Ruby does this type of thing? Thanks a lot!
Some people have more than two names, such as "John Clark Smith". You can choose to treat them as:
(1) first_name: "John", last_name: "Smith"
def first_name
if name.split.count > 1
name.split.first
else
name
end
end
def last_name
if name.split.count > 1
name.split.last
end
end
(2) first_name: "John Clark", last_name: "Smith"
def first_name
if name.split.count > 1
name.split[0..-2].join(' ')
else
name
end
end
def last_name
if name.split.count > 1
name.split.last
end
end
(3) first_name: "John", last_name: "Clark Smith"
def first_name
name.split.first
end
def last_name
if name.split.count > 1
name.split[1..-1].join(' ')
end
end
The above examples assume that if the name contains fewer than 2 words then it is a first name.
The parentheses (which are optional) enclose the parameter list.
def first_name(full_name)
full_name.split(" ")[0]
end
This assumes the parameter is not nil.
> puts first_name "Jimmy McKickems"
Jimmy
> puts first_name "Jeezy"
Jeezy
But this is not a String method, as you currently assume:
@user.full_name.first_name # Bzzt.
Instead:
first_name @user.name
This could be wrapped up in the model class itself:
class User < ActiveRecord::Base
# Extra stuff elided
def first_name
self.full_name.blank? ? "" : self.full_name.split(" ")[0]
end
end
The extra code checks to see if the name is nil or whitespace (blank? comes from Rails). If it is, it returns an empty string. If it isn't, it splits it on spaces and returns the first item in the resulting array.
In case you are looking to split only once and provide both parts this one liner will work:
last_name, first_name = *name.reverse.split(/\s+/, 2).collect(&:reverse)
Makes the last word the last name and everything else the first name. So if there is a prefix, "Dr.", or a middle name that will be included with the first name. Obviously for last names that have separate words, "Charles de Gaulle" it won't work but handling that is much harder (if not impossible).
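The same "last word is the last name" split can also be sketched with String#rpartition, which avoids the double reverse. The method name split_name is hypothetical, and the whitespace normalization is an addition not present in the one-liner above:

```ruby
# Returns [first_names, last_name]; the last word is treated as the last name.
def split_name(full_name)
  normalized = full_name.to_s.split.join(" ") # trim and collapse whitespace
  first, _separator, last = normalized.rpartition(" ")
  [first, last]
end

p split_name("John Clark Smith")
p split_name("Jeezy")
```

With a single word, rpartition finds no separator, so the first element is an empty string and the whole word ends up as the last name.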
Use Ruby's Array#pop
For my needs I needed to take full names that had 1, 2, 3 or more "names" in them, like "AUSTIN" or "AUSTIN J GILLEY".
The Helper Method
def split_full_name_into_first_name_and_last_name( full_name )
name_array = full_name.split(' ') # ["AUSTIN", "J", "GILLEY"]
if name_array.count > 1
last_name = name_array.pop # "GILLEY"
first_name = name_array.join(' ') # "AUSTIN J"
else
first_name = name_array.first
last_name = nil
end
return [ first_name, last_name ] # ["AUSTIN J", "GILLEY"]
end
Using It
split_full_name_into_first_name_and_last_name( "AUSTIN J GILLEY" )
# => ["AUSTIN J", "GILLEY"]
split_full_name_into_first_name_and_last_name( "AUSTIN" )
# => ["AUSTIN", nil]
And you can easily assign the first_name and last_name with:
first_name, last_name = split_full_name_into_first_name_and_last_name( "AUSTIN J GILLEY" )
first_name
# => "AUSTIN J"
last_name
# => "GILLEY"
You can modify from there based on what you need or want to do with it.
For the syntax you're asking for (@user.name.first_name): Rails does a lot of this sort of extension by adding methods to base types. In your example you could do this by defining methods on the String class.
class String
def given; self.split(' ').first end
def surname; self.split(' ').last end
end
"Phileas Fog".surname # 'Fog'
Another way to do something like this is to wrap the type you wish to extend. That way you can add all the crazy syntax you wish without polluting base types like String.
class ProperName < String
def given; self.split(' ').first end
def surname; self.split(' ').last end
end
class User
def name
ProperName.new(self.read_attribute(:name))
end
end
u = User.new(:name => 'Phileas Fog')
u.name # 'Phileas Fog'
u.name.given # 'Phileas'
u.name.surname # 'Fog'
As a complement to Dave Newton's great answer, here is what the "last name" version would look like:
def last_name
self.full_name.blank? ? "" : self.full_name.split(" ")[-1]
end
making it simple
class User < ActiveRecord::Base
def first_name
self.name.split(" ")[0..-2].join(" ")
end
def last_name
self.name.split(" ").last
end
end
User.create name: "John M. Smith"
User.first.first_name
# => "John M."
User.first.last_name
# => "Smith"
Thanks
def test_one(name)
puts "#{name.inspect} => #{yield(name).inspect}"
end
def tests(&block)
test_one nil, &block
test_one "", &block
test_one "First", &block
test_one "First Last", &block
test_one "First Middle Last", &block
test_one "First Middle Middle2 Last", &block
end
puts "First name tests"
tests do |name|
name.blank? ? "" : name.split(" ").tap{|a| a.pop if a.length > 1 }.join(" ")
end
puts "Last name tests"
tests do |name|
name.blank? ? "" : (name.split(" ").tap{|a| a.shift }.last || "")
end
Output:
First name tests
nil => ""
"" => ""
"First" => "First"
"First Last" => "First"
"First Middle Last" => "First Middle"
"First Middle Middle2 Last" => "First Middle Middle2"
Last name tests
nil => ""
"" => ""
"First" => ""
"First Last" => "Last"
"First Middle Last" => "Last"
"First Middle Middle2 Last" => "Last"

How to match and replace templating tags in Ruby / Rails?

Trying to add a very rudimentary description template to one of my Rails models. What I want to do is take a template string like this:
template = "{{ name }} is the best {{ occupation }} in {{ city }}."
and a hash like this:
vals = {:name => "Joe Smith", :occupation => "birthday clown", :city => "Las Vegas"}
and get a description generated. I thought I could do this with a simple gsub but Ruby 1.8.7 doesn't accept hashes as the second argument. When I do a gsub as a block like this:
> template.gsub(/\{\{\s*(\w+)\s*\}\}/) {|m| vals[m]}
=> " is the best in ."
You can see it replaces it with the entire string (with curly braces), not the match captures.
How do I get it to replace "{{ something }}" with vals["something"] (or vals["something".to_sym])?
TIA
Using Ruby 1.9.2
The string formatting operator % will format a string with a hash as the arg
>> template = "%{name} is the best %{occupation} in %{city}."
>> vals = {:name => "Joe Smith", :occupation => "birthday clown", :city => "Las Vegas"}
>> template % vals
=> "Joe Smith is the best birthday clown in Las Vegas."
Using Ruby 1.8.7
The string formatting operator in Ruby 1.8.7 doesn't support hashes. Instead, you can use the same arguments as the Ruby 1.9.2 solution and patch the String object so when you upgrade Ruby you won't have to edit your strings.
if RUBY_VERSION < '1.9.2'
class String
old_format = instance_method(:%)
define_method(:%) do |arg|
if arg.is_a?(Hash)
self.gsub(/%\{(.*?)\}/) { arg[$1.to_sym] }
else
old_format.bind(self).call(arg)
end
end
end
end
>> "%05d" % 123
=> "00123"
>> "%-5s: %08x" % [ "ID", 123 ]
=> "ID   : 0000007b"
>> template = "%{name} is the best %{occupation} in %{city}."
>> vals = {:name => "Joe Smith", :occupation => "birthday clown", :city => "Las Vegas"}
>> template % vals
=> "Joe Smith is the best birthday clown in Las Vegas."
The easiest thing is probably to use $1.to_sym in your block:
>> template.gsub(/\{\{\s*(\w+)\s*\}\}/) { vals[$1.to_sym] }
=> "Joe Smith is the best birthday clown in Las Vegas."
From the fine manual:
In the block form, the current match string is passed in as a parameter, and variables such as $1, $2, $`, $&, and $' will be set appropriately. The value returned by the block will be substituted for the match on each call.
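Putting the block form to work, here is a small sketch of a render helper. The name render and the choice to leave unknown tags untouched are assumptions for illustration, not part of the answers above:

```ruby
# Replace each {{ key }} tag with vals[:key]; unknown tags are left as-is.
def render(template, vals)
  template.gsub(/\{\{\s*(\w+)\s*\}\}/) do |tag|
    key = Regexp.last_match(1).to_sym # same capture as $1
    vals.fetch(key, tag).to_s
  end
end

template = "{{ name }} is the best {{ occupation }} in {{ city }}."
vals = { :name => "Joe Smith", :occupation => "birthday clown", :city => "Las Vegas" }
puts render(template, vals)
```

The block parameter tag holds the whole match including the braces, which is why looking it up directly in the hash returns nil; the capture group is what carries the key.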

Regex to pull out postal code from string

I have a search string that a user inputs text into.
If it contains any part of a postal code like: 1N1 or 1N11N1 or 1N1 1N1 then I want to pull that out of the text.
example:
John Doe 1n11n1
or
1n1 John Doe
or
John 1n11n1 Doe
I want to capture this:
postal_code: 1n11n1
other: John Doe
Can this be done using regex?
Try matching the regular expression /((?:\d[A-Za-z]\d)+)/ and returning $1:
def get_postal_code(s)
r = /((?:\d[A-Za-z]\d)+)/
return (s =~ r) ? [$1, s.sub(r,'')] : nil
end
# Example usage...
get_postal_code('John Doe 1n11n1') # => ['1n11n1', 'John Doe ']
get_postal_code('1n1 John Doe') # => ['1n1', ' John Doe']
get_postal_code('John Doe 1n1') # => ['1n1', 'John Doe ']
You could also cleanup the "other" string as follows.
...
return (s =~ r) ? [$1, s.sub(r,'').gsub(/\s+/,' ').strip] : nil
end
get_postal_code('John Doe 1n11n1') # => ['1n11n1', 'John Doe']
get_postal_code('1n1 John Doe') # => ['1n1', 'John Doe']
get_postal_code('John Doe 1n1') # => ['1n1', 'John Doe']
Not sure what is the format of the postal codes where you are, but I'd definitely resort to regexlib:
http://regexlib.com/Search.aspx?k=postal%20code
You'll find many regular expressions that you can use to match the postal code in your string.
To get the rest of the string, you can simply do a regex remove on the postal code and get the resulting string. There is probably a more efficient way to do this, but I'm going for simplicity :)
Hope this helps!
Yes, this can be done using a regex. Depending on the type of data in the rows you may be at risk for false positives, because anything that matches the pattern will be seen as a postal code (in your example though that does not seem likely).
Assuming that in your patterns N is an alpha character and 1 a numeric character you'd do something like the below:
strings = ["John Doe 1n11n1", "1n1 John Doe", "John 1n1 1n1 Doe"]
regex = /([0-9]{1}[A-Za-z]{1}[0-9]{2}[A-Za-z]{1}[0-9]{1}|[0-9]{1}[A-Za-z]{1}[0-9]{1}\s[0-9]{1}[A-Za-z]{1}[0-9]{1}|[0-9]{1}[A-Za-z]{1}[0-9]{1})/
strings.each do |s|
if regex.match(s)
puts "postal_code: #{regex.match(s)[1]}"
puts "rest: #{s.gsub(regex, "")}"
puts
end
end
This outputs:
postal_code: 1n11n1
rest: John Doe
postal_code: 1n1
rest: John Doe
postal_code: 1n1 1n1
rest: John Doe
If you want to get rid of excess spaces you can use String#squeeze(" ") to make it so :)
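The pieces above can be combined into one helper. This is a sketch that assumes the digit-letter-digit shape from the question; POSTAL_CODE and extract_postal_code are made-up names:

```ruby
# One or two digit-letter-digit groups, optionally separated by a single space.
POSTAL_CODE = /\b\d[a-z]\d(?:\s?\d[a-z]\d)?\b/i

def extract_postal_code(text)
  code = text[POSTAL_CODE]
  return { postal_code: nil, other: text.strip } if code.nil?

  {
    postal_code: code.delete(" "),                          # "1n1 1n1" -> "1n11n1"
    other: text.sub(POSTAL_CODE, "").squeeze(" ").strip     # tidy leftover spaces
  }
end

p extract_postal_code("John Doe 1n11n1")
p extract_postal_code("John 1n1 1n1 Doe")
```

The word boundaries reduce, but do not eliminate, false positives: any standalone token shaped like 1n1 will still be treated as a postal code.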

Ruby/Rails Parsing Emails

I'm currently using the following to parse emails:
def parse_emails(emails)
valid_emails, invalid_emails = [], []
unless emails.nil?
emails.split(/, ?/).each do |full_email|
unless full_email.blank?
if full_email.index(/\<.+\>/)
email = full_email.match(/\<.*\>/)[0].gsub(/[\<\>]/, "").strip
else
email = full_email.strip
end
email = email.delete("<").delete(">")
email_address = EmailVeracity::Address.new(email)
if email_address.valid?
valid_emails << email
else
invalid_emails << email
end
end
end
end
return valid_emails, invalid_emails
end
The problem I'm having is given an email like:
Bob Smith <bob@smith.com>
The code above deletes Bob Smith and returns only bob@smith.com.
But what I want is a hash of FNAME, LNAME, EMAIL, where fname and lname are optional but email is not.
What type of ruby object would I use for that and how would I create such a record in the code above?
Thanks
I've coded it so that it will work even if you have an entry like: John Bob Smith Doe <bob@smith.com>
It would retrieve:
{:email => "bob@smith.com", :fname => "John", :lname => "Bob Smith Doe" }
def parse_emails(emails)
valid_emails, invalid_emails = [], []
unless emails.nil?
emails.split(/, ?/).each do |full_email|
unless full_email.blank?
if index = full_email.index(/\<.+\>/)
email = full_email.match(/\<.*\>/)[0].gsub(/[\<\>]/, "").strip
name = full_email[0...index].split(" ") # exclusive range, so index 0 yields an empty name
fname = name.first
lname = name[1..name.size] * " "
else
email = full_email.strip
#your choice, what the string could be... only mail, only name?
end
email = email.delete("<").delete(">")
email_address = EmailVeracity::Address.new(email)
if email_address.valid?
valid_emails << { :email => email, :lname => lname, :fname => fname}
else
invalid_emails << { :email => email, :lname => lname, :fname => fname}
end
end
end
end
return valid_emails, invalid_emails
end
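The name/email split for a single entry can be isolated into a small helper, sketched here in plain Ruby without the validation step. parse_entry is a hypothetical name, and treating the first word as fname and the rest as lname follows the answer above:

```ruby
# Turn "First Last <user@example.com>" or a bare address into a hash.
def parse_entry(entry)
  if (match = entry.match(/\A\s*(.*?)\s*<([^<>]+)>\s*\z/))
    fname, lname = match[1].split(" ", 2) # both nil when no name is given
    { email: match[2].strip, fname: fname, lname: lname }
  else
    { email: entry.strip, fname: nil, lname: nil }
  end
end

p parse_entry("Bob Smith <bob@smith.com>")
p parse_entry("bob@smith.com")
```

Each resulting hash could then be pushed onto valid_emails or invalid_emails depending on the validity check.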
Here's a slightly different approach that works better for me. It grabs the name whether it is before or after the email address and whether or not the email address is in angle brackets.
I don't try to parse the first name out from the last name -- too problematic (e.g. "Mary Ann Smith" or "Dr. Mary Smith"), but I do eliminate duplicate email addresses.
def parse_list(list)
r = Regexp.new('[a-z0-9\.\_\%\+\-]+@[a-z0-9\.\-]+\.[a-z]{2,4}', true)
valid_items, invalid_items = {}, []
## split the list on commas and/or newlines
list_items = list.split(/[,\n]+/)
list_items.each do |item|
if m = r.match(item)
## get the email address
email = m[0]
## get everything before the email address
before_str = item[0, m.begin(0)]
## get everything after the email address
after_str = item[m.end(0), item.length]
## enter the email as a valid_items hash key (eliminating dups)
## make the value of that key anything before the email if it contains
## any alphanumerics, stripping out any angle brackets
## and leading/trailing space
if /\w/ =~ before_str
valid_items[email] = before_str.gsub(/[\<\>\"]+/, '').strip
## if nothing before the email, make the value of that key anything after
##the email, stripping out any angle brackets and leading/trailing space
elsif /\w/ =~ after_str
valid_items[email] = after_str.gsub(/[\<\>\"]+/, '').strip
## if nothing after the email either,
## make the value of that key an empty string
else
valid_items[email] = ''
end
else
invalid_items << item.strip if item.strip.length > 0
end
end
[valid_items, invalid_items]
end
It returns a hash with valid email addresses as keys and the associated names as values. Any invalid items are returned in the invalid_items array.
See http://www.regular-expressions.info/email.html for an interesting discussion of email regexes.
I made a little gem out of this in case it might be useful to someone at https://github.com/victorgrey/email_addresses_parser
You can use the rfc822 gem. It contains a regular expression for finding email addresses that conform to the RFC. You can easily extend it with parts for finding the first and last name.
Along the lines of mspanc's answer, you can use the mail gem to do the basic email address parsing work for you, as answered here: https://stackoverflow.com/a/12187502/1019504
