Mongoid and UTF-8 issues in a JRuby on Rails app - ruby-on-rails

I'm taking a JSON string that's the result from polling the Foursquare venue API:
{
"id"=>"4e404742c65b4ec27606deb4",
"name"=>"Sarah's Cheesecake & Cafe",
"contact"=>{
"phone"=>"4134436678",
"formattedPhone"=>"(413) 443-6678"
},
"location"=>{
"address"=>"180 Elm St",
"lat"=>42.44345873,
"lng"=>-73.23804678,
"distance"=>1063,
"postalCode"=>"01201",
"city"=>"Pittsfield",
"state"=>"MA"
},
"categories"=>[
{
"id"=>"4bf58dd8d48988d16d941735",
"name"=>"Café",
"pluralName"=>"Cafés",
"shortName"=>"Café",
"icon"=>{
"prefix"=>"https://foursquare.com/img/categories/food/cafe_",
"sizes"=>[
32,
44,
64,
88,
256
],
"name"=>".png"
},
"primary"=>true
}
],
"verified"=>false,
"stats"=>{
"checkinsCount"=>7,
"usersCount"=>5,
"tipCount"=>0
},
"hereNow"=>{
"count"=>0
}
}
As you can tell, there are some non-ASCII characters in there, such as Cafés, and that's breaking my Mongoid-based model in this JRuby on Rails app. When trying to create an instance with MyModel.create, here's what I get:
jruby-1.6.5 :012 > FoursquareVenue.create(hash)
Java::JavaLang::NullPointerException:
from org.jruby.exceptions.RaiseException.<init>(RaiseException.java:101)
from org.jruby.Ruby.newRaiseException(Ruby.java:3348)
from org.jruby.Ruby.newEncodingCompatibilityError(Ruby.java:3323)
from org.jruby.RubyString.cat(RubyString.java:1285)
from org.jruby.RubyString.cat19(RubyString.java:1221)
from org.jruby.RubyHash$5.visit(RubyHash.java:727)
from org.jruby.RubyHash.visitAll(RubyHash.java:594)
from org.jruby.RubyHash.inspectHash(RubyHash.java:721)
from org.jruby.RubyHash.inspect(RubyHash.java:745)
from org.jruby.RubyHash$i$0$0$inspect.call(RubyHash$i$0$0$inspect.gen:65535)
from org.jruby.RubyClass.finvoke(RubyClass.java:632)
from org.jruby.javasupport.util.RuntimeHelpers.invoke(RuntimeHelpers.java:545)
from org.jruby.RubyBasicObject.callMethod(RubyBasicObject.java:353)
from org.jruby.RubyObject.inspect(RubyObject.java:408)
from org.jruby.RubyArray.inspectAry(RubyArray.java:1483)
from org.jruby.RubyArray.inspect(RubyArray.java:1509)
... 420 levels...
from org.jruby.evaluator.ASTInterpreter.INTERPRET_METHOD(ASTInterpreter.java:75)
from org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:190)
from org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:179)
from org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:312)
from org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:169)
from usr.local.rvm.rubies.jruby_minus_1_dot_6_dot_5.bin.jirb.__file__(/usr/local/rvm/rubies/jruby-1.6.5/bin/jirb:17)
from usr.local.rvm.rubies.jruby_minus_1_dot_6_dot_5.bin.jirb.load(/usr/local/rvm/rubies/jruby-1.6.5/bin/jirb)
from org.jruby.Ruby.runScript(Ruby.java:693)
from org.jruby.Ruby.runScript(Ruby.java:686)
from org.jruby.Ruby.runNormally(Ruby.java:593)
from org.jruby.Ruby.runFromMain(Ruby.java:442)
from org.jruby.Main.doRunFromMain(Main.java:321)
from org.jruby.Main.internalRun(Main.java:241)
from org.jruby.Main.run(Main.java:207)
from org.jruby.Main.run(Main.java:191)
from org.jruby.Main.main(Main.java:171)
If I strip out all the odd characters, everything works as expected and no exception is thrown. What's the proper way of handling this? Can I enable my Mongoid/MongoDB documents to work with UTF-8? Do I need to "asciify" them somehow first if that's not possible?

Could be an encoding bug in JRuby's 1.9 mode. Does the same thing happen when you run it in 1.8 mode? Either way, a stacktrace should be filed as a bug at http://bugs.jruby.org. Thanks!

gem install bson_ext might help.
Source: MongoDB, Ruby and UTF-8
If you are using Ubuntu, then you need to do some extra steps with the SpiderMonkey/MongoDB installation:
Most pre-built Javascript SpiderMonkey libraries do not have UTF-8
support compiled in; MongoDB requires this.
Source: Building for Linux

MongoDB and mongoid handle utf-8 properly. I was doing the same thing with the Foursquare API not long ago via the Quimby wrapper.
As a result, I would suspect the bug is closely related to the use of JRuby.
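One way to test that suspicion is to look at how the strings in the hash are tagged. The EncodingCompatibilityError in the stack trace is raised while inspect concatenates strings, which typically happens when some strings carry a different encoding tag than others. The following is a minimal sketch, assuming the bytes really are UTF-8 and only the tag is wrong; retag_utf8 is a hypothetical helper, not part of Mongoid or JRuby.

```ruby
# Hypothetical helper: returns a deep copy of a nested structure in which
# every string is tagged as UTF-8. The bytes themselves are left unchanged.
def retag_utf8(value)
  case value
  when String then value.dup.force_encoding(Encoding::UTF_8)
  when Hash   then value.each_with_object({}) { |(k, v), out| out[retag_utf8(k)] = retag_utf8(v) }
  when Array  then value.map { |v| retag_utf8(v) }
  else value
  end
end

# Simulate a string whose UTF-8 bytes were tagged as binary (an assumption
# about what the API client might hand back, not something the post confirms).
raw = "Caf\u00E9".dup.force_encoding(Encoding::BINARY)
venue = { "name" => "Sarah's Cheesecake & Cafe", "categories" => [{ "name" => raw }] }

fixed = retag_utf8(venue)
fixed_name = fixed["categories"][0]["name"]
```

If the retagged hash saves without the exception, the problem lies in how the strings were tagged before they reached Mongoid rather than in MongoDB itself.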

Have you set up JRuby to use UTF-8? In 1.8 mode that looks like this ($KCODE has no effect in 1.9 mode):
require 'jcode'
$KCODE = 'u'

Related

ArgumentError when running Capybara tests on Ruby 3.0

I am really stuck. I am upgrading my Rails app to Ruby 3 (from 2.7). When running tests, I always run into this issue when I visit a path:
state = "new"
visit status_path(state: "state")
I receive the following error when running rspec:
Capybara starting Puma...
* Version 5.6.4 , codename: Birdie's Version
* Min threads: 0, max threads: 4
* Listening on http://127.0.0.1:58568
ArgumentError: wrong number of arguments (given 2, expected 1)
from ~/.rbenv/versions/3.0.5/lib/ruby/3.0.0/net/protocol.rb:116:in `initialize'
My Gemfile is as such:
gem "capybara" # 3.38.0
gem "selenium-webdriver" # 4.8.0
gem "webdrivers" # 5.2.0
(They're all on the latest version)
My setup doesn't look wrong:
require "webdrivers/chromedriver"
Webdrivers.cache_time = 86_400 # 1 day
Capybara.register_driver :headless_chrome do |app|
Capybara::Selenium::Driver.load_selenium
browser_options = ::Selenium::WebDriver::Chrome::Options.new.tap do |options|
options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--disable-gpu")
end
Capybara::Selenium::Driver.new(app, browser: :chrome, options: browser_options)
end
Capybara.javascript_driver = :headless_chrome
Troubleshooting:
I tried Puma 6 - same issue.
The controller at status_path is not even hit. This error occurs right after Puma loads up.
I do not think it's the Capybara setup, and I just cannot find where it is calling the Ruby 3 library wrong (net/protocol).
I downgraded capybara to 3.37.1, and same issue.
Thank you
FYI: when upgrading from 2.7 to 3, you will see this error more often than not. It's highly likely that some code you were using in 2.7 is not passing hash and keyword arguments correctly.
The first hit when googling this will take you back to SO (I won't share the link, as SO doesn't like links). To paraphrase the official Ruby article on upgrading:
Separation of positional and keyword arguments in Ruby 3.0:
In most cases, you can avoid incompatibility by adding the double splat operator. It explicitly specifies passing keyword arguments instead of a Hash object. Likewise, you may add braces {} to explicitly pass a Hash object, instead of keyword arguments.
TL;DR - Try splatting your hash args collection: kwargs -> **kwargs. Your Rails path argument likely isn't a kwarg but a hash --> { key: value }
EDIT: Reasoning (if you're interested): prior to Ruby 3, Ruby would try to guess what you meant. From Ruby 3 onwards it will forcibly use what you give it. (A lot of people used to pass kwargs but wanted them treated as a single hash; now you need to stipulate this!)
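The difference can be reproduced in a few lines. This is a minimal sketch; build_path is a hypothetical method made up for illustration and is not the Rails route helper or the code in net/protocol.rb.

```ruby
# Hypothetical method with one positional and one keyword parameter.
def build_path(name, state: "default")
  "/#{name}?state=#{state}"
end

opts = { state: "new" }

# The double splat passes the hash as keyword arguments.
with_splat = build_path("status", **opts)

# On Ruby 3.0+, a bare hash is treated as a second positional argument,
# so this call raises ArgumentError instead of being converted to keywords.
error = begin
  build_path("status", opts)
rescue ArgumentError => e
  e
end
```

On Ruby 2.7 the second call still works and only prints a deprecation warning, which is why code like this tends to surface during the upgrade.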

CableReady (Rails) Basic case giving mystifying error message

cable_ready 4.5.0
rails 6.1.4.1
ruby 3.0.2p107
This is a simple example from the basic tutorial (https://www.youtube.com/watch?v=F5hA79vKE_E) I suspect the error I am getting is because either cable_ready or rails evolved a little and created a tiny incompatibility.
I get this error in the JS console:
It is triggered when in my controller I ask cable ready to:
cable_ready["timeline"].console_log(message: "***** cable ready post created")
Which leads to my timeline_channel to:
received(data) {
console.log("******** Received data:", data.operations)
if (data.cableReady) CableReady.perform(data.operations)
}
My interpretation is that perform runs this code at line 13 of cable_ready.js:
operations.forEach(function (operation) {
if (!!operation.batch) batches[operation.batch] = batches[operation.batch] ? ++batches[operation.batch] : 1;
});
and finds something in the received data that it doesn't like.
That's where my trail ends. Can someone see what I am doing wrong, or tell me what other code you'd like me to include?
Solution: downgrade the version of the cable_ready javascript library.
I previously (maybe a year ago) did this tutorial using CableReady 4.5, Ruby 2.6.5 and Rails 6.0.4 and it worked like a charm back then as well as today.
But today I tried this tutorial again on a duplicate project, with the same versions of CR, Ruby, and Rails, and now I get JavaScript console errors similar to yours.
TypeError: undefined is not a function (near '...operations.forEach...')
perform -- cable_ready.js:13
received -- progress_bar_channel.js:8
I looked at the output of yarn list and saw that cable_ready was version 5.0.0-pre8 on the bad project and 5.0.0-pre1 on the good project. The downgrade could be accomplished with yarn add cable_ready@^5.0.0-pre1 in the bad project folder, and now both projects work.
FYI for other newbies like me trying to understand how CableReady works: This tutorial gives another example of CableReady, and was also fixed the same way.

Rails 4.2 syntax error, unexpected ':', expecting =>

I have two computers that I mainly use to develop my Rails application. While working on Computer 1, I added some bootstrap elements to some inputs. For example:
= f.select :transport_from_state, options_for_select(state_populator, @invoice_ambulance.transport_from_state), { include_blank: true}, { class: 'chosen-select', 'data-placeholder': 'State' }
I added the 'data-placeholder': 'State' and used the 'newer' syntax instead of the old 'data-placeholder' => 'State', which works fine. The page works with no errors on Computer 1.
I pulled down on computer 2, and now I am getting an error for every instance of 'data-placeholder'. Here is my error:
syntax error, unexpected ':', expecting =>
...en-select', 'data-placeholder': 'State' }
I can replace it with the old syntax and it works fine. However, I shouldn't have to switch 100 instances of this to a deprecated syntax. I have since bundle installed, bundle updated, and rebuilt the db with no luck.
Computer 1 (works)
ruby 2.2.0p0
Rails 4.2.0
Computer 2 (doesn't work)
ruby 2.2.0preview1
Rails 4.2.0
You need to upgrade Computer 2 to the real Ruby 2.2.0 rather than the beta-ish "preview" version you have. Using quoted symbols with the JavaScript-style trailing colon syntax:
{ 'some string': value }
wasn't valid before Ruby 2.2, and the 2.2.0preview1 version you have on Computer 2 apparently doesn't support it.
BTW, there is no old and new syntax; there is an alternate JavaScript-style notation that can be used when the keys in a Hash literal are symbols. Whoever told you that the hashrocket is deprecated is, at best, confused.
The "newer" syntax is only for symbols.
{hello: 'world'} is equivalent to {:hello => 'world'} but if your key is a string then you still have to use the "hash rocket" syntax: {'hello' => 'world'}
http://ruby-doc.org/core-2.2.0/Hash.html
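To see how the three spellings relate, here is a small sketch that should run on Ruby 2.2.0 or later (the variable names are made up for illustration). Note that the quoted-colon form produces a Symbol key, not a String key, so it is not interchangeable with 'data-placeholder' => 'State'.

```ruby
# Quoted key with a trailing colon (Ruby 2.2+): the key is a Symbol.
quoted_colon = { 'data-placeholder': 'State' }

# Hashrocket with an explicit quoted symbol: the same hash, valid on older Rubies too.
symbol_rocket = { :'data-placeholder' => 'State' }

# Hashrocket with a string: the key stays a String.
string_rocket = { 'data-placeholder' => 'State' }

same_hash       = quoted_colon == symbol_rocket
colon_key_class = quoted_colon.keys.first.class
string_key_class = string_rocket.keys.first.class
```

So a codebase that has to run on a pre-2.2 interpreter can use the :'data-placeholder' => form and get exactly the same hash.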

JSON encoding/decoding with unicode in rails

I upgraded (or rather downgraded) to Rails 2.3.17 due to the security bugs, but now I can't decode JSON strings that I have saved to a DB if they have Unicode in them :(. Is there a way to process the string such that it decodes properly?
e = ActiveSupport::JSON.encode({'a' => "Hello Unicode \u2019"})
ActiveSupport::JSON.decode(e)
gives me
RangeError: 8217 out of char range
from /app/vendor/bundle/ruby/1.9.1/gems/activesupport-2.3.17/lib/active_support/json/backends/okjson.rb:314:in `unquote'
from /app/vendor/bundle/ruby/1.9.1/gems/activesupport-2.3.17/lib/active_support/json/backends/okjson.rb:251:in `strtok'
from /app/vendor/bundle/ruby/1.9.1/gems/activesupport-2.3.17/lib/active_support/json/backends/okjson.rb:215:in `tok'
from /app/vendor/bundle/ruby/1.9.1/gems/activesupport-2.3.17/lib/active_support/json/backends/okjson.rb:178:in `lex'
from /app/vendor/bundle/ruby/1.9.1/gems/activesupport-2.3.17/lib/active_support/json/backends/okjson.rb:46:in `decode'
from /app/vendor/bundle/ruby/1.9.1/gems/activesupport-2.3.17/lib/active_support/json/backends/okjson.rb:612:in `decode'
from /app/vendor/bundle/ruby/1.9.1/gems/activesupport-2.3.17/lib/active_support/json/decoding.rb:14:in `decode'
from (irb):30
from /usr/local/bin/irb:12:in `<main>'
I can't change the first line since it's coming from the DB like that.
This used to work.
You can change the backend JSON provider in ActiveSupport.
Add ActiveSupport::JSON.backend = "JSONGem" into an application initialiser (I added it to application.rb). This fixed the unicode parsing issues I had after I upgraded activesupport to 3.0.20.
See the vulnerability notice which caused this update - It mentions that this workaround should apply to 2.3.16 as well.
From rails console:
> ActiveSupport::VERSION::STRING
=> "3.0.20"
> ActiveSupport::JSON.decode('{"test":"string\u2019"}')
RangeError: 8217 out of char range
> ActiveSupport::JSON.backend = "JSONGem"
> ActiveSupport::JSON.decode('{"test":"string\u2019"}')
 => {"test"=>"string’"}
The JSON gem will handle this correctly.
As a note, the gem is much more strict than the other JSON parsers out there. For example:
{ 'test' : 'value' }
This is not valid JSON even though it looks okay.
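Both points can be checked directly against the json library that ships with Ruby, without going through ActiveSupport. This is a minimal sketch; the variable names are made up for illustration.

```ruby
require 'json'

# The Ruby literal is single-quoted, so the parser receives the six
# characters \u2019 and decodes the escape itself.
decoded = JSON.parse('{"test":"string\u2019"}')

# JSON requires double quotes around keys and string values,
# so the single-quoted document is rejected.
strict_error = begin
  JSON.parse("{ 'test' : 'value' }")
rescue JSON::ParserError => e
  e
end
```

The first call returns a hash whose value ends in the right single quotation mark (U+2019); the second raises JSON::ParserError.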
For whatever reason, a JSON parser that isn't UTF-8 savvy shipped as part of the 2.3.16 patch, which is really sloppy on the part of the maintainer.
Switch to 2.3.15 which should be fine because that's when the fixes landed.
Curse the developer who started this project in rails
Begin work on porting to python post haste

XML parsing in Ruby

I am using a REXML Ruby parser to parse an XML file. But on a 64 bit AIX box with 64 bit Ruby, I am getting the following error:
REXML::ParseException: #<REXML::ParseException: #<RegexpError: Stack overflow in
regexp matcher:
/^<((?>(?:[\w:][\-\w\d.]*:)?[\w:][\-\w\d.]*))\s*((?>\s+(?:[\w:][\-\w\d.]*:)?[\w:][\-\w\d.]*\s*=\s*(["']).*?\3)*)\s*(\/)?>/mu>
The call for the same is something like this:
REXML::Document.new(File.open(actual_file_name, "r"))
Does anyone have an idea regarding how to solve this issue?
I've had several issues with REXML; it doesn't seem to be the most mature library. I usually use Nokogiri for XML parsing in Ruby, which should be faster and more stable than REXML. After installing it with sudo gem install nokogiri, you can use something like this to get a DOM instance:
doc = Nokogiri.XML(File.open(actual_file_name, 'rb'))
# => #<Nokogiri::XML::Document:0xf1de34 name="document" [...] >
The documentation on the official webpage is also much better than that of REXML, IMHO.
I almost immediately found the answer.
The first thing I did was to search in the ruby source code for the error being thrown.
I found that regex.h was responsible for this.
In regex.h, the code flow is something like this:
/* Maximum number of duplicates an interval can allow. */
#ifndef RE_DUP_MAX
#define RE_DUP_MAX ((1 << 15) - 1)
#endif
Now the problem here is RE_DUP_MAX. On the AIX box, the same constant is already defined somewhere in /usr/include, so the #ifndef guard skips Ruby's own definition.
I searched for it and found in
/usr/include/NLregexp.h
/usr/include/sys/limits.h
/usr/include/unistd.h
I am not sure which of the three is being used (most probably NLregexp.h).
In these headers, the value of RE_DUP_MAX has been set to 255! So there is a cap placed on the number of repetitions of a regex!
In short, the cause is that the compilation picks up the system-defined value rather than the one defined in regex.h!
This also answers the question I asked recently:
Regex limit in ruby 64 bit aix compilation
I was not able to answer it immediately, as I need a minimum of 100 reputation :D :D
Cheers!
