Is there currently a gem that's capable of taking strings, all in USD for this purpose, and converting them to a number? Some examples would be:
"$7,600" would turn into 7600
"5500" would turn into 5500
I know that for the "5500" example I can just call "5500".to_i, but the spreadsheets being imported aren't consistent: some include commas and dollar signs while others do not. Is there a decent way of handling this across the board in Ruby?
I've tried something like money_string.scan(/\d/).join, which seems fine so far, but I'm worried about edge cases I haven't hit yet, such as decimal places.
Why not remove all non-digit characters before calling .to_i?
Example:
"$7,600".gsub(/\D/,'').to_i
And for a floating point number:
"$7,600.90".gsub(/[^\d\.]/, '').to_f
You can do:
"$100.00".scan(/[.0-9]/).join().to_f
or to_i if they're only dollars
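You could use the Money gem. Note that string parsing was later extracted into the separate Monetize gem, so Money.parse may not exist in recent Money versions: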
Money.parse("$100") == Money.new(10000, "USD")
You should be able to strip any non-numeric characters with a Ruby Regexp. Sanitize the input so that only digits (and, if you need cents, the decimal point) remain, and then parse the cleaned string as a number.
(And note that, if you're getting actual spreadsheets instead of CSV bunk, there's likely a value property you can read that ignores the text displayed on screen.)
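The sanitize-then-parse approach described above could be sketched roughly as follows. The helper name parse_usd is hypothetical, and the sketch assumes US-style input where a comma groups thousands and a period is the decimal point. BigDecimal is used here so that cents are not subject to float rounding; that is a design choice, not something the answers above require.

```ruby
require 'bigdecimal'

# Hypothetical helper (not from any gem): turns "$7,600.90" into a BigDecimal.
# Assumes US formatting: "," groups thousands, "." marks the decimal point.
def parse_usd(text)
  cleaned = text.to_s.gsub(/[^\d.]/, '') # keep only digits and the decimal point
  return nil if cleaned.empty?

  BigDecimal(cleaned)
rescue ArgumentError
  nil # the cleaned string was not a valid number
end
```

Call .to_i on the result if only whole dollars are needed.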
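# Strips the dollar sign and thousands separators, then converts to an integer.
# Returns nil when given nil.
def dollar_to_number(dollar_price)
  return nil unless dollar_price

  dollar_price.delete('$,').to_i
end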
You can use the Monetize gem:
pry(main)> Monetize.parse("$7,600").to_i
=> 7600
https://github.com/RubyMoney/monetize
pry(main)> Monetize.parse("$7,600").class
=> Money
If you are using money-rails gem, you can do the following
irb(main):004:0> "$7,600".to_money("USD")
=> #<Money fractional:760000 currency:USD>
irb(main):005:0> "5500".to_money("USD")
=> #<Money fractional:550000 currency:USD>
I'm using Rails and Nokogiri and I'm trying to parse some website.
This is where I'm stuck:
doc.css('#example > li:nth-child(1)').each do |node|
money = node.xpath('//*ul/li/div/span').text
end
It returns something like:
$100,000£230,000$40,000$9,000€600$800,000
I want to split those items, save them to the database and finally hand them to the view.
So, in the view, I want it to appear like:
(1)$100,000
(2)£230,000
(3)$40,000
(4)$9,000
(5)€600
(6)$800,000
I tried to split the items with the code below,
money = node.xpath('//*ul/li/div/span').text.split(/[$€£]/)
but the result looks like this:
["", "100,000", "230,000", "40,000", "9,000", "600", "800,000"]
And I don't know which item is in dollars, euros, or pounds.
Is there any good way to solve this problem?
You're almost there. Just use a positive lookahead, so that the split happens before each currency symbol without consuming it:
irb(main):005:0> "$100,000£230,000$40,000$9,000€600$800,000".split(/(?=[$£€])/)
=> ["$100,000", "£230,000", "$40,000", "$9,000", "€600", "$800,000"]
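Since the goal is to save each amount along with its currency, one possible next step is to capture the symbol and the digits separately. This is only a sketch: the SYMBOL_TO_CODE table and the split_prices name are hypothetical, and it assumes the text contains only the three symbols from the question.

```ruby
# Hypothetical lookup table; extend it for any other symbols you encounter.
SYMBOL_TO_CODE = { '$' => 'USD', '£' => 'GBP', '€' => 'EUR' }.freeze

# Returns one hash per price found in the text, keeping the currency
# separate from the numeric amount.
def split_prices(text)
  text.scan(/([$£€])([\d,]+)/).map do |symbol, digits|
    { currency: SYMBOL_TO_CODE[symbol], amount: digits.delete(',').to_i }
  end
end
```

Each hash can then be stored as its own database row, with currency and amount in separate columns.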
It needs a regular expression. This works:
"$100,000£230,000$40,000$9,000$600$800,000".scan(/([^\d][0-9,]+)/)
=> [["$100,000"],
["£230,000"],
["$40,000"],
["$9,000"],
["$600"],
["$800,000"]]
The regex contains these parts:
[^\d]: A character class matching a single non-digit. This will match the currency symbol.
[0-9,]+: Another character class, this time repeated (the '+'). It matches the numeric part (0-9) plus the thousands separator.
Hey, I am using the Ruby on Rails framework and I have a price variable that is a decimal. Values like $39.99 display fine, but when the price is $39.90 my app shows it as $39.9. How can I change that?
My view
%b price
= @product.price
Rails includes the number_to_currency helper: number_to_currency(@product.price). It's a little simpler and easier to remember.
The standard answer here is to use sprintf.
sprintf("$%2.2f", @product.price)
This will format your number with a leading dollar sign, then the number to two decimal places.
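Outside of a Rails view, the same formatting can be tried in plain Ruby. A minimal sketch, where format_usd is a hypothetical helper name and format is Kernel's alias for sprintf:

```ruby
# Hypothetical helper: always renders exactly two decimal places.
def format_usd(price)
  format('$%.2f', price)
end

puts format_usd(39.9)  # prints $39.90
puts format_usd(39.99) # prints $39.99
```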
If you want, you can write your custom helper method for this.
# Formats a numeric price as dollars with exactly two decimal places.
def num_to_currency(price)
  format('$%.2f', price)
end
1.9.3 (main):0 > num_to_currency 6.90
=> "$6.90"
I was using phony to format phone numbers (meaning that if I put in xxx-xxx-xxxx it would convert it to a string, and also detect a leading (1) and remove it).
But it really doesn't work for US phone numbers; it's designed for international numbers.
Is there an equivalent?
Thanks.
http://rubygems.org/gems/phony
Earlier this year, I reviewed a bunch of ruby gems that parse and format phone numbers. They fall into a number of groups (see below). TLDR: I used 'phone'. It might work for you because you can specify a default country code that it uses if your phone number doesn't include one.
1) US-centric:
big-phoney (0.1.4)
phone_wrangler (0.1.3)
simple_phone_number (0.1.9)
2) depends on rails or active_record:
phone_number (1.2.0)
validates_and_formats_phones (0.0.7)
3) forks of 'phone' that have been merged back into the trunk:
elskwid-phone (0.9.9.4)
tfe-phone (0.9.9.1)
4) relies on you to know the region ahead of time
phoney (0.1.0)
5) Kind of almost works for me
phone (0.9.9.3)
6) does not contain the substring 'phone' in the gem name (edit: I see you tried this one)
phony (1.6.1)
These groupings may be somewhat unfair or out of date so feel free to comment. I must admit I was a little frustrated at the time at how many people had partially re-invented this particular wheel.
I've never seen much in the way of a reliable telephone number formatter because it's just so hard to get it right. Just when you think you've seen everything, some other format comes along and wrecks it.
Ten-digit North American numbers are perhaps the easiest to format, and you can use a regular expression for them, but as soon as you encounter extensions you're in trouble. Still, you can kind of hack it yourself if you want:
def formatted_number(number)
  digits = number.gsub(/\D/, '').split(//)

  if digits.length == 11 && digits[0] == '1'
    # Strip leading 1
    digits.shift
  end

  if digits.length == 10
    # Rejoin for latest Ruby, remove next line if old Ruby
    digits = digits.join
    '(%s) %s-%s' % [digits[0, 3], digits[3, 3], digits[6, 4]]
  end
end
This will just wrangle eleven and ten digit numbers into the format you want.
Some examples:
formatted_number("1 (703) 451-5115")
# => "(703) 451-5115"
formatted_number("555-555-1212")
# => "(555) 555-1212"
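Extensions are where this kind of hack usually breaks down. One way to cope, sketched here under the assumption that an extension is always introduced by "ext", "ext." or "x", is to split it off before counting digits. The format_nanp name and the " x42" output style are hypothetical choices, not part of the answer above.

```ruby
# Hypothetical variant that tolerates a trailing extension.
# Returns nil when the main part is not a 10-digit (or 1 + 10-digit) number.
def format_nanp(number)
  # Split off anything after "ext", "ext." or "x" (case-insensitive).
  main, ext = number.to_s.split(/\s*(?:ext\.?|x)\s*/i, 2)

  digits = main.to_s.gsub(/\D/, '')
  digits = digits[1..-1] if digits.length == 11 && digits.start_with?('1')
  return nil unless digits.length == 10

  formatted = "(#{digits[0, 3]}) #{digits[3, 3]}-#{digits[6, 4]}"
  ext_digits = ext.to_s.gsub(/\D/, '')
  ext_digits.empty? ? formatted : "#{formatted} x#{ext_digits}"
end
```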
I wrote this regex to match NANPA phone numbers with some conventions (e.g. for extensions) for PHP (thank god those days are over) and converted it over to a Rails validator a few months ago for a project. It works great for me, but it is more pragmatic than strictly to spec.
# app/validators/phone_number_validator.rb
class PhoneNumberValidator < ActiveModel::EachValidator
  @@regex = %r{\A(?:1(?:[. -])?)?(?:\((?=\d{3}\)))?([2-9]\d{2})(?:(?<=\(\d{3})\))? ?(?:(?<=\d{3})[.-])?([2-9]\d{2})[. -]?(\d{4})(?: (?:ext|x)\.? ?(\d{1,5}))?\Z}

  def validate_each(object, attribute, value)
    if m = value.match(@@regex)
      # format the phone number consistently
      object.send("#{attribute}=", "(#{m[1]}) #{m[2]}-#{m[3]}")
    else
      object.errors[attribute] << (options[:message] || "is not an appropriately formatted phone number")
    end
  end
end
# app/models/foobar.rb
class Foobar < ActiveRecord::Base
  validates :phone, phone_number: true
end
The saved/outputted format is like this: (888) 888-8888. Currently the output strips off the extension because I didn't need it. You can add it back in and change the format pretty easily (see the object.send line).
# RAILS_ROOT/lib/String.rb
class String
  def convert_to_phone
    number = self.gsub(/\D/, '').split(//)

    # US 11-digit numbers: drop the leading country code "1"
    number = number.drop(1) if number.count == 11 && number[0] == '1'

    # US 10-digit numbers: rejoin the digits into a string
    number.join if number.count == 10
  end

  def format_phone
    "#{self[0, 3]}-#{self[3, 3]}-#{self[6, 4]}"
  end
end
"585-343-2070".convert_to_phone
=> "5853432070"
"5853432070".convert_to_phone
=> "5853432070"
"1(585)343-2070".convert_to_phone.format_phone
=> "585-343-2070"
##Everything formatted as requested in Asker's various comments
You could use Rails' number_to_phone helper.
See here:
http://api.rubyonrails.org/classes/ActionView/Helpers/NumberHelper.html#method-i-number_to_phone
def format_phone_numbers(n)
  "(#{n[-10..-8]}) #{n[-7..-5]}-#{n[-4..-1]}"
end
format_phone_numbers('555555555555')
=> "(555) 555-5555"
I am wondering whether there is a convenient function in Rails to convert a string with a negative sign, e.g. -1005.32, into a number.
When I use the .to_f method, the number becomes 1005, with both the negative sign and the decimal part ignored.
.to_f is the right way.
Example:
irb(main):001:0> "-10".to_f
=> -10.0
irb(main):002:0> "-10.33".to_f
=> -10.33
Maybe your string does not include a regular "-" (dash)? Or is there a space between the dash and the first numeral?
Added:
If you know that your input string is a string version of a floating number, eg, "10.2", then .to_f is the best/simplest way to do the conversion.
If you're not sure of the string's content, then using .to_f will give 0 when the string doesn't start with a number. It will give various other values depending on your input string too. For example:
irb(main):001:0> "".to_f
=> 0.0
irb(main):002:0> "hi!".to_f
=> 0.0
irb(main):003:0> "4 you!".to_f
=> 4.0
The above .to_f behavior may be just what you want; it depends on your use case.
Depending on what you want to do in various error cases, you can use Kernel::Float as Mark Rushakoff suggests, since it raises an error when it is not perfectly happy with converting the input string.
You should be using Kernel::Float to convert the number; on invalid input, this will raise an error instead of just "trying" to convert it.
>> "10.5".to_f
=> 10.5
>> "asdf".to_f # do you *really* want a zero for this?
=> 0.0
>> Float("asdf")
ArgumentError: invalid value for Float(): "asdf"
from (irb):11:in `Float'
from (irb):11
>> Float("10.5")
=> 10.5
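Combining the two answers above, a strict parser might look something like this. It is only a sketch: parse_float is a hypothetical name, and treating the Unicode minus sign (U+2212) and en dash (U+2013) as a regular "-" is an assumption about what a "non-regular dash" in the input might be.

```ruby
# Hypothetical strict parser: returns nil instead of silently yielding 0.0.
def parse_float(text)
  # Assumption: look-alike dashes should count as a minus sign.
  normalized = text.to_s.strip.gsub(/[\u2212\u2013]/, '-')
  Float(normalized)
rescue ArgumentError
  nil
end
```

Returning nil keeps the decision about bad input with the caller; re-raising would be just as reasonable if invalid data should stop the import.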
I have an ActiveRecord model, Foo, which has a name field. I'd like users to be able to search by name, but I'd like the search to ignore case and any accents. Thus, I'm also storing a canonical_name field against which to search:
class Foo < ActiveRecord::Base
  validates_presence_of :name
  before_validation :set_canonical_name

  private

  def set_canonical_name
    self.canonical_name ||= canonicalize(self.name) if self.name
  end

  def canonicalize(x)
    x.downcase. # something here
  end
end
I need to fill in the "something here" to replace the accented characters. Is there anything better than
x.downcase.gsub(/[àáâãäå]/,'a').gsub(/æ/,'ae').gsub(/ç/, 'c').gsub(/[èéêë]/,'e')....
And, for that matter, since I'm not on Ruby 1.9, I can't put those Unicode literals in my code. The actual regular expressions will look much uglier.
ActiveSupport::Inflector.transliterate (requires Rails 2.2.1+ and Ruby 1.9 or 1.8.7)
example:
>> ActiveSupport::Inflector.transliterate("àáâãäå").to_s
=> "aaaaaa"
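Rails already has a built-in for normalizing: normalize your string to form KD and then remove the remaining non-ASCII characters (i.e. the accent marks), like this. (mb_chars.normalize was removed in Rails 6.1; on newer versions, Ruby's own String#unicode_normalize(:nfkd) plays the same role.)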
>> "àáâãäå".mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]/n,'').downcase.to_s
=> "aaaaaa"
Better yet is to use I18n:
1.9.3-p392 :001 > require "i18n"
=> false
1.9.3-p392 :002 > I18n.transliterate("Olá Mundo!")
=> "Ola Mundo!"
I have tried a lot of these approaches, but each one failed at least one of these requirements:
Respect spaces
Respect the 'ñ' character
Respect case (I know it is not a requirement of the original question, but it is not difficult to downcase a string afterwards)
What finally worked for me was this:
# coding: utf-8
string.tr(
"ÀÁÂÃÄÅàáâãäåĀāĂăĄąÇçĆćĈĉĊċČčÐðĎďĐđÈÉÊËèéêëĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħÌÍÎÏìíîïĨĩĪīĬĭĮįİıĴĵĶķĸĹĺĻļĽľĿŀŁłÑñŃńŅņŇňŉŊŋÒÓÔÕÖØòóôõöøŌōŎŏŐőŔŕŖŗŘřŚśŜŝŞşŠšſŢţŤťŦŧÙÚÛÜùúûüŨũŪūŬŭŮůŰűŲųŴŵÝýÿŶŷŸŹźŻżŽž",
"AAAAAAaaaaaaAaAaAaCcCcCcCcCcDdDdDdEEEEeeeeEeEeEeEeEeGgGgGgGgHhHhIIIIiiiiIiIiIiIiIiJjKkkLlLlLlLlLlNnNnNnNnnNnOOOOOOooooooOoOoOoRrRrRrSsSsSsSssTtTtTtUUUUuuuuUuUuUuUuUuUuWwYyyYyYZzZzZz"
)
– http://blog.slashpoundbang.com/post/12938588984/remove-all-accents-and-diacritics-from-string-in-ruby
You have to modify the character list slightly to respect the 'ñ' character, but that is an easy job.
My answer: the String#parameterize method:
"Le cœur de la crémiére".parameterize
=> "le-coeur-de-la-cremiere"
For non-Rails programs:
Install activesupport: gem install activesupport then:
require 'active_support/inflector'
"a&]'s--3\014\xC2àáâã3D".parameterize
# => "a-s-3-3d"
Decompose the string and remove non-spacing marks from it.
irb -ractive_support/all
> "àáâãäå".mb_chars.normalize(:kd).gsub(/\p{Mn}/, '')
aaaaaa
You may also need this if used in a .rb file.
# coding: utf-8
The normalize(:kd) part here splits out diacritics where possible (e.g. the single "n with tilde" character is split into an n followed by a combining tilde character), and the gsub part then removes all the combining marks.
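The same decompose-then-strip idea also works without ActiveSupport, because Ruby 2.2 and later ship String#unicode_normalize in the standard library. A minimal sketch, where strip_diacritics is a hypothetical helper name:

```ruby
# Hypothetical helper: decomposes each character (NFKD), then removes the
# combining marks (Unicode category Mn) that the decomposition split out.
def strip_diacritics(text)
  text.unicode_normalize(:nfkd).gsub(/\p{Mn}/, '')
end

puts strip_diacritics("àáâãäå")     # prints aaaaaa
puts strip_diacritics("Olá Mundo!") # prints Ola Mundo!
```

Spaces and case are left untouched; letters that have no decomposition (such as æ or ø) pass through unchanged.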
I think that maybe you don't really want to go down that path. If you are developing for a market that uses these kinds of letters, your users will probably think you are a sort of ...pip.
Because to such a user, 'å' isn't even close to 'a' in any sense.
Take a different road and read up about searching in a non-ASCII way. This is exactly the kind of case for which Unicode and collation were invented.
A very late PS:
http://www.w3.org/International/wiki/Case_folding
http://www.w3.org/TR/charmod-norm/#sec-WhyNormalization
Besides that, I have no idea why the link to collation goes to an MSDN page, but I'll leave it there. It should have been http://www.unicode.org/reports/tr10/
This assumes you use Rails.
"anything".parameterize.underscore.humanize.downcase
Given your requirements, this is probably what I'd do... I think it's neat, simple and will stay up to date in future versions of Rails and Ruby.
Update: dgilperez pointed out that parameterize takes a separator argument, so "anything".parameterize(" ") (deprecated) or "anything".parameterize(separator: " ") is shorter and cleaner.
Convert the text to normalization form D, remove all code points with the Unicode category Nonspacing Mark (Mn), and convert it back to normalization form C. This will strip all diacritics, and your problem is reduced to a case-insensitive search.
See http://www.siao2.com/2005/02/19/376617.aspx and http://www.siao2.com/2007/05/14/2629747.aspx for details.
The key is to use two columns in your database: canonical_text and original_text. Use original_text for display and canonical_text for searches. That way, if a user searches for "Visual Cafe," she sees the "Visual Café" result. If she really wants a different item called "Visual Cafe," it can be saved separately.
To get the canonical_text characters in a Ruby 1.8 source file, do something like this:
register_replacement([0x008A].pack('U'), 'S')
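On a modern Ruby (2.4 or later, an assumption this sketch depends on for Unicode-aware downcase), the NFD, strip Mn, NFC pipeline described above can be written with the standard library alone when filling the canonical_text column. The canonical_text method name mirrors the column name from the answer; the EXTRA_FOLDS table is a hypothetical addition for letters that have no Unicode decomposition and is far from complete.

```ruby
# Hypothetical fold table for letters that NFD leaves untouched.
EXTRA_FOLDS = {
  'æ' => 'ae', 'œ' => 'oe', 'ø' => 'o', 'ß' => 'ss', 'đ' => 'd', 'ł' => 'l'
}.freeze

# Builds the search key; the original text is stored separately for display.
def canonical_text(original_text)
  original_text
    .unicode_normalize(:nfd)  # split base letters from combining marks
    .gsub(/\p{Mn}/, '')       # drop the combining marks
    .unicode_normalize(:nfc)  # recompose whatever is left
    .downcase
    .gsub(/[æœøßđł]/, EXTRA_FOLDS)
end
```

Running the user's query through the same method before comparing makes a search for "Visual Cafe" match a stored "Visual Café".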
You probably want Unicode decomposition ("NFD"). After decomposing the string, just filter out anything not in [A-Za-z]. ã will decompose to "a~" (approximately: the diacritic becomes a separate combining character), so the filtering leaves a reasonable approximation. Note that æ has no Unicode decomposition, so it would be filtered out entirely unless you map it to "ae" yourself.
iconv:
http://groups.google.com/group/ruby-talk-google/browse_frm/thread/8064dcac15d688ce?
=============
A Perl module which I can't understand:
http://www.ahinea.com/en/tech/accented-translate.html
============
Brute force (there's a lot of those critters!):
http://projects.jkraemer.net/acts_as_ferret/wiki#UTF-8support
http://snippets.dzone.com/posts/show/2384
I had problems getting the foo.mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]/n,'').downcase.to_s solution to work. I'm not using Rails and there was some conflict with my activesupport/ruby versions that I couldn't get to the bottom of.
Using the ruby-unf gem seems to be a good substitute:
require 'unf'
foo.to_nfd.gsub(/[^\x00-\x7F]/n,'').downcase
As far as I can tell this does the same thing as .mb_chars.normalize(:kd). Is this correct? Thanks!
If you are using PostgreSQL >= 9.4 as your database, you could enable its "unaccent" extension in a migration, which I think does what you want, like this:
def self.up
  enable_extension "unaccent" # Does not fail if it already exists
end
In order to test, in the console:
2.3.1 :045 > ActiveRecord::Base.connection.execute("SELECT unaccent('unaccent', 'àáâãäåÁÄ')").first
=> {"unaccent"=>"aaaaaaAA"}
Notice that the result is still case sensitive at this point.
Then, maybe use it in a scope, like:
scope :with_canonical_name, -> (name) {
  # Bind the value with a placeholder instead of interpolating it into the SQL
  where("unaccent(foos.name) ILIKE unaccent(?)", name)
}
The ILIKE operator makes the search case insensitive. There is another approach, using the citext data type. Here is a discussion about these two approaches. Notice also that use of PostgreSQL's lower() function is not recommended.
This will save you some DB space, since you will no longer need the canonical_name field, and will perhaps make your model simpler, at the cost of some extra processing in each query. How much extra processing depends on whether you are using ILIKE or citext, and on your dataset.
If you are using MySQL maybe you can use this simple solution, but I have not tested it.
lol.. I just tried this and it is working. I'm still not quite sure why, but when I use these 4 lines of code:
str = str.gsub(/[^a-zA-Z0-9 ]/,"")
str = str.gsub(/[ ]+/," ")
str = str.gsub(/ /,"-")
str = str.downcase
it automatically removes any accents from filenames, which is what I was trying to do (remove the accents from filenames and then rename the files). Hope it helps :)