Encoding in Ruby 1.8.7 or 1.9.2 - ruby-on-rails

I have been trying to use the gem 'character-encodings' which doesn't build in 1.9.2 however it does in 1.8.7 but even when I require 'encoding/character/utf-8' I still cant do the simplest of encoding.
require 'encoding/character/utf-8'
str = u"hëllö"
str.length
#=> 5
str.reverse.length
#=> 5
str[/ël/]
#=> "ël"
I get
ruby-1.8.7-p302 > # encoding: utf-8
ruby-1.8.7-p302 > require 'encoding/character/utf-8'
=> nil
ruby-1.8.7-p302 > str = u"hll"
=> u"hll"
ruby-1.8.7-p302 > str.length
=> 3
ruby-1.8.7-p302 > #=> 5
ruby-1.8.7-p302 > str.reverse.length
=> 3
ruby-1.8.7-p302 > #=> 5
ruby-1.8.7-p302 > str[/l/]
=> "l"
My question is, is there a really nice encoding library that can accept allot or possibly all the different characters out there. Or maybe use utf-16? I have tried the magic code "# encoding: utf-8" which didn't seem to do it either.
Thank you

I'm afraid I don't understand your question. Are you having issues with the source code file? I've tried it both in console and a ruby script (1.8.7), and it does work.
require 'rubygems'
require 'encoding/character/utf-8'
str = u'hëllö'
puts str.length
puts str.reverse.length
puts str[/ël/]
and the output works as expected
5
5
ël
In Ruby 1.9+ (I tested in 1.9.2 preview) you don't need a library, as encoding is supported by the standard library. See this post for more information about it.
http://yehudakatz.com/2010/05/05/ruby-1-9-encodings-a-primer-and-the-solution-for-rails/

this works without c extensions and on 1.8/1.9, not all string methods work(but they are easy to add)
https://github.com/grosser/string19
require 'rubygems'
require 'string19'
String19('hëllö').length == 5

Related

Ruby pack - TypeError: can't convert String into Integer

I am getting TypeError: can't convert String into Integer, I found duplicate answer too, but I am facing this error for 'pack'.
Other confusion is, it is working fine with ruby 1.8.7, not with ruby 1.9.3, here is the code, I am using jruby1.7.2
irb(main):003:0> length=nil
=> nil
irb(main):004:0> token_string ||= ["A".."Z","a".."z","0".."9"].collect { |r| r.to_a }.join + %q(!$:*)
=> "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!$:*"
irb(main):005:0> token = (0..(length ? length : 60)).collect { token_string[rand( token_string.size)]}.pack("c*")
TypeError: can't convert String into Integer
from (irb):5:in `pack'
from (irb):5
from /home/appandya/.rvm/rubies/ruby-1.9.3-p374/bin/irb:13:in `<main>'
irb(main):006:0>
Any Idea?
string[x] in Ruby 1.8 gives you a Fixnum (the character code) and in 1.9 it gives you a single character string.
Array#pack turns an array into a binary sequence. The "c*" template to pack converts an array of Ruby integers into a stream of 8-bit signed words.
Here are the solutions, those comes from the Google Groups
1.
Background
>> %w(a b c).pack('*c')
TypeError: can't convert String into Integer
from (irb):1:in `pack'
from (irb):1
from /usr/bin/irb:12:in `<main>'
>> [1, 2, 3].pack('*c')
=> "\x01"
>> %w(a b c).map(&:ord).pack('*c')
=> "a"
Solution
irb(main):001:0> length=nil
=> nil
irb(main):002:0> token_string ||= ["A".."Z","a".."z","0".."9"].collect { |r| r.to_a }.join + %q(!$:*)
=> "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!$:*"
irb(main):003:0> (0..(length ? length : 60)).collect { token_string[rand( token_string.size)]}.map(&:ord).pack("c*")
=> "rd!:!LcxU3ON57*t2s520v*zvvdflSNAgU6uq14SiD00VUDlm9:4:tJz5Ri5o"
irb(main):004:0>
2.
The return type of String's [] function was Fixnum in 1.8 but is String in 1.9:
>JRUBY_OPTS=--1.9 ruby -e "puts 'a'[0].class"
String
>JRUBY_OPTS=--1.8 ruby -e "puts 'a'[0].class"
Fixnum
JRuby 1.7.x defaults to acting like Ruby 1.9.3. You need to set JRUBY_OPTS.
3.
try join instead of pack
irb(main):004:0> length=nil
=> nil
irb(main):005:0> token_string ||= ["A".."Z","a".."z","0".."9"].collect { |r| r.to_a }.join + %q(!$:*)
=> "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!$:*"
irb(main):006:0> token = (0..(length ? length : 60)).collect { token_string[rand( token_string.size)]}.join("c*")
=> "Fc*Dc*1c*6c*ac*Kc*Tc*Qc*Hc*jc*Ec*Kc*kc*zc*sc*3c*ic*hc*kc*wc**c*Wc*$c*Kc*Ic*Uc*Cc*bc*Pc*1c*!c*mc*Bc*lc*dc*ic*Dc*sc*Ac*Bc*nc*Kc*mc*Lc*oc*Zc*Xc*jc*6c*2c*Uc*ec*Yc*Dc*vc*Ic*Uc*5c*Zc*3c*o"
irb(main):007:0>
4.
if you're just trying to make a string of random characters that are 8-bit clean you may want to look at Random#bytes and something like Base64.encode64.
5.
active_support/secure_random also has a nice API for these things
I would recommend you take a look at: Array Method Pack, essentially the .pack("c") expects the elements of the array to be integers. You could try .pack("a*")
Use token = (0..(length ? length : 60)).collect { token_string[rand( token_string.size)]}.pack("a"*61)
You get the same result. I am using 61 because, going from 0..60 has 61 elements.

Nokogiri parsing different on server versus localhost

I'm getting some weird differences when running Nokogiri locally versus running it on my server. On my local machine the entire document seems to parse and be available but on the server I seem to get the doctype tab and some random comment tags.
To start off, to make sure it wasn't a problem with open-uri I checked it - the results are not exact but do contain the correct markup.
Local:
ruby-1.8.7-p352 :005 > s = open('http://www.pennstateind.com/store/PK2WAY.html')
=> #<File:/var/folders/G8/G8bsAGBk1o82Eyks3ZmFtq-+3Y6/-Tmp-/open-uri20120626-5891-10y2ncr-0>
ruby-1.8.7-p352 :006 > s.length
=> 88408
Server:
rb(main):008:0> s = open('http://www.pennstateind.com/store/PK2WAY.html')
=> #<File:/tmp/open-uri20120626-22167-1td2l72-0>
irb(main):009:0> s.length
=> 98184
When I run this on my local machine I get this:
ruby-1.8.7-p352 :003 > d = Nokogiri::HTML(open('http://www.pennstateind.com/store/PK2WAY.html'))
=> [ OUTPUT OMITTED FOR BREVITY - CAN SUPPLY ON REQUEST ]
ruby-1.8.7-p352 :004 > d.to_s.length
=> 85212
But when I run this on the server I get this:
rb(main):006:0> d = Nokogiri::HTML(open('http://www.pennstateind.com/store/PK2WAY.html'))
=> #<Nokogiri::HTML::Document:0x36620e14b580 name="document" children= [#<Nokogiri::XML::DTD:0x36620e14b1c0 name="html">, #<Nokogiri::XML::Comment:0x36620e14b170 " Open Graph Tags ">, #<Nokogiri::XML::Comment:0x36620e14a98c " Customer_Session_Verified: 0 ">]>
irb(main):007:0> d.to_s.length
=> 172
The only apparent gem difference is for the JS compiler - all other gems are the exact version between local and server:
Local => libv8 (3.3.10.4 x86-darwin-10)
Server => libv8 (3.3.10.4 x86_64-linux)
Any ideas how to figure out what is going on and/or fix this?
Update - to isolate where the problem actually was I pulled a file from the server and from localhost then ran them on each. The results below show that the problem definitely lies in Nokogiri - what the problem is I am still perplexed by...
Running locally:
# FILE ORIGINALLY PULLED FROM SERVER
ruby-1.8.7-p352 :015 > server_file = File.open("/Users/jmcdonald/Desktop/files/SERVER.txt", "r")
=> #<File:/Users/jmcdonald/Desktop/files/SERVER.txt>
ruby-1.8.7-p352 :016 > server_file.read.length
=> 93071
ruby-1.8.7-p352 :022 > Nokogiri::HTML(server_file).to_s.length
=> 98793
# FILE ORIGINALLY PULLED FROM LOCALHOST
=> #<File:/Users/jmcdonald/Desktop/files/LOCAL.txt>
ruby-1.8.7-p352 :018 > local_file.read.length
=> 89622
ruby-1.8.7-p352 :026 > Nokogiri::HTML(local_file).to_html.length
=> 94632
Running on server:
# FILE ORIGINALLY PULLED FROM SERVER
irb(main):001:0> sf = File.open('/home/charlest/public_html/files/nokogiri_issue/SERVER.txt', 'r')
=> #<File:/home/charlest/public_html/files/nokogiri_issue/SERVER.txt>
irb(main):002:0> sf.read.length
=> 93071
irb(main):004:0> Nokogiri::HTML(sf).to_s.length
=> 896 # <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< WRONG
# FILE ORIGINALLY PULLED FROM LOCALHOST
irb(main):008:0> lf = File.open('/home/charlest/public_html/files/nokogiri_issue/LOCAL.txt', 'r')
=> #<File:/home/charlest/public_html/files/nokogiri_issue/LOCAL.txt>
irb(main):009:0> lf.read.length
=> 89622
irb(main):011:0> Nokogiri::HTML(lf).to_s.length
=> 896 # <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< WRONG
It looks like your server and local environment are using different versions of libxml2. Older versions are known to have strange parsing bugs, so updating your server to the latest version you possibly can (or at least to the same version you're using for development) should fix you up.
There was also a bug with a shipped version of Nokogiri (I believe it affected 1.5.1) which affected the parsing in a some limited situations. I would suggest making sure your gems are updated. (gem update)
Try using File#read rather than File#open or make sure you're running lf.rewind before you try to parse w/ Nokogiri. The behavior you're seeing is most likely the result of your lf file handle being at the end of the file, which means Nokogiri is parsing an empty document.
> remote = File.open('./PK2WAY.html')
# => #<File:./PK2WAY.html>
> remote.read.length
# => 92978
> remote.read.length
# => 0
> Nokogiri::HTML(remote).to_s.length
# => 108
> remote.rewind
# => 0
> Nokogiri::HTML(remote).to_s.length
# => 93847

Rails / Ruby not following Rublar on regular expression

I have the following expression that I have tested in Rubular and that successfully matches against a snippet of HTML:
Official Website<\/h3>\s*<p><a href="([^"]*)"
However, when I run the expression in Ruby, using the following code, it returns no matches. I've reduced it down to "Official\s*Website" and it matches that, but nothing further.
Are there any additional options I need to set, or anything else that I need to do to configure Ruby/Rails to start tracking Rubular?
matches = sidebar.match(/Official\s*Website<\/h3>\s*<p><a href="([^"]*)"/)
if matches.nil?
puts "no matches"
else
puts "matches"
end
This is the relevant part of the snippet I'm matching against:
<h3>Official Website</h3><p>website.com</p>
your regular expression is correct. rubular should be working the same way your code does.
i tested it against ruby 1.8.7 and 1.9.3
irb(main):006:0> sidebar = ' <h3>Official Website</h3><p>website.com</p>'
=> " <h3>Official Website</h3><p>website.com</p>"
irb(main):007:0> sidebar.match(/Official\s*Website<\/h3>\s*<p><a href="([^"]*)"/)
=> #<MatchData "Official Website</h3><p><a href=\"http://website.com\"" 1:"http://website.com">
-
1.9.3p0 :005 > sidebar = ' <h3>Official Website</h3><p>website.com</p>'
=> " <h3>Official Website</h3><p>website.com</p>"
1.9.3p0 :006 > sidebar.match(/Official\s*Website<\/h3>\s*<p><a href="([^"]*)"/)
=> #<MatchData "Official Website</h3><p><a href=\"http://website.com\"" 1:"http://website.com">
if you want to quickly check why stuff is not working, you should try it in IRB or in your rails console. most of the times it's typo or bad encoding.

opening Array in ruby on rails console v. irb

I wish to make my code a little more readable by calling #rando on any array and retrieve a random element (rando because a rand() method already exists and I don't want there to be any confusion).
So I opened up the class and wrote a method:
class Array
def rando
self[ rand(length) ]
end
end
This seems far too straightforward.
When I open up irb, and type arr = %w(hi bye) and then arr.rando I get either hi or bye back. That's expected. However, in my rails console, when I do the same thing, I get ArgumentError: wrong number of arguments (1 for 0)
I've been tracing Array up the rails chain and can't figure it out. Any idea?
FWIW, I'm using rails 2.3.11 and ruby 1.8.7
Works fine in my case :
Loading development environment (Rails 3.0.3)
ruby-1.9.2-p180 :001 > class Array
ruby-1.9.2-p180 :002?> def rando
ruby-1.9.2-p180 :003?> self[ rand(length) ]
ruby-1.9.2-p180 :004?> end
ruby-1.9.2-p180 :005?> end
=> nil
ruby-1.9.2-p180 :006 > arr = %w(hi bye)
=> ["hi", "bye"]
ruby-1.9.2-p180 :007 > arr.rando
=> "bye"

Full url for an image-path in Rails 3

I have an Image, which contains carrierwave uploads:
Image.find(:first).image.url #=> "/uploads/image/4d90/display_foo.jpg"
In my view, I want to find the absolute url for this. Appending the root_url results in a double /.
root_url + image.url #=> http://localhost:3000//uploads/image/4d90/display_foo.jpg
I cannot use url_for (that I know of), because that either allows passing a path, or a list of options to identify the resource and the :only_path option. Since I do't have a resource that can be identified trough "controller"+"action" I cannot use the :only_path option.
url_for(image.url, :only_path => true) #=> wrong amount of parameters, 2 for 1
What would be the cleanest and best way to create a path into a full url in Rails3?
You can also set CarrierWave's asset_host config setting like this:
# config/initializers/carrierwave.rb
CarrierWave.configure do |config|
config.storage = :file
config.asset_host = ActionController::Base.asset_host
end
This ^ tells CarrierWave to use your app's config.action_controller.asset_host setting, which can be defined in one of your config/envrionments/[environment].rb files. See here for more info.
Or set it explicitly:
config.asset_host = 'http://example.com'
Restart your app, and you're good to go - no helper methods required.
* I'm using Rails 3.2 and CarrierWave 0.7.1
try path method
Image.find(:first).image.path
UPD
request.host + Image.find(:first).image.url
and you can wrap it as a helper to DRY it forever
request.protocol + request.host_with_port + Image.find(:first).image.url
Another simple method to use is URI.parse, in your case would be
require 'uri'
(URI.parse(root_url) + image.url).to_s
and some examples:
1.9.2p320 :001 > require 'uri'
=> true
1.9.2p320 :002 > a = "http://asdf.com/hello"
=> "http://asdf.com/hello"
1.9.2p320 :003 > b = "/world/hello"
=> "/world/hello"
1.9.2p320 :004 > c = "world"
=> "world"
1.9.2p320 :005 > d = "http://asdf.com/ccc/bbb"
=> "http://asdf.com/ccc/bbb"
1.9.2p320 :006 > e = "http://newurl.com"
=> "http://newurl.com"
1.9.2p320 :007 > (URI.parse(a)+b).to_s
=> "http://asdf.com/world/hello"
1.9.2p320 :008 > (URI.parse(a)+c).to_s
=> "http://asdf.com/world"
1.9.2p320 :009 > (URI.parse(a)+d).to_s
=> "http://asdf.com/ccc/bbb"
1.9.2p320 :010 > (URI.parse(a)+e).to_s
=> "http://newurl.com"
Just taking floor's answer and providing the helper:
# Use with the same arguments as image_tag. Returns the same, except including
# a full path in the src URL. Useful for templates that will be rendered into
# emails etc.
def absolute_image_tag(*args)
raw(image_tag(*args).sub /src="(.*?)"/, "src=\"#{request.protocol}#{request.host_with_port}" + '\1"')
end
There's quite a bunch of answers here. However, I didn't like any of them since all of them rely on me to remember to explicitly add the port, protocol etc. I find this to be the most elegant way of doing this:
full_url = URI( root_url )
full_url.path = Image.first.image.url
# Or maybe you want a link to some asset, like I did:
# full_url.path = image_path("whatevar.jpg")
full_url.to_s
And what is the best thing about it is that we can easily change just one thing and no matter what thing that might be you always do it the same way. Say if you wanted to drop the protocol and and use the The Protocol-relative URL, do this before the final conversion to string.
full_url.scheme = nil
Yay, now I have a way of converting my asset image urls to protocol relative urls that I can use on a code snippet that others might want to add on their site and they'll work regardless of the protocol they use on their site (providing that your site supports either protocol).
I used default_url_options, because request is not available in mailer and avoided duplicating hostname in config.action_controller.asset_host if haven't specified it before.
config.asset_host = ActionDispatch::Http::URL.url_for(ActionMailer::Base.default_url_options)
You can't refer to request object in an email, so how about:
def image_url(*args)
raw(image_tag(*args).sub /src="(.*?)"/, "src=\"//#{ActionMailer::Base.default_url_options[:protocol]}#{ActionMailer::Base.default_url_options[:host]}" + '\1"')
end
You can actually easily get this done by
root_url[0..-2] + image.url
I agree it doesn't look too good, but gets the job done.. :)
I found this trick to avoid double slash:
URI.join(root_url, image.url)

Resources