Mongoid can not query non-latin attributes - ruby-on-rails

My Mongoid document has two attributes: :en_name and :ru_name. I have created one model:
MyModel.create(en_name: 'sport', ru_name: 'спорт')
Then I query it:
MyModel.where(en_name: 'sport').first
It returns me my model.
When I try to query this:
MyModel.where(ru_name: 'спорт').first
It returns me nil
How to make Mongoid able to query attributes which are non-latin?

MongoDB uses UTF-8. However, you may experience problems if the application is running on Windows, because a Russian-locale Windows typically defaults to CP1251 rather than UTF-8.
Use Robomongo (a cross-platform graphical client) to check that the data was written to the database in the correct encoding.
BSON strings can only be encoded in UTF-8. If the data is not displayed correctly, you are probably not converting it to UTF-8 before writing it to MongoDB.
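Check the encoding:
encoding_name = str.encoding.name
Convert to UTF-8 (Iconv.conv takes the target encoding first and the source second; Iconv was removed from the standard library in Ruby 2.0, so String#encode is used here):
utf_str = str.encode('UTF-8', 'Windows-1251')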
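As a minimal sketch of the check-then-convert step above: the helper name to_utf8 and the assumption that any non-UTF-8 input is really Windows-1251 are illustrative only, not something the original answer specifies.

```ruby
# Hypothetical helper: returns a UTF-8 version of +str+, ASSUMING that any
# input which is not already valid UTF-8 is really Windows-1251.
def to_utf8(str, assumed: Encoding::Windows_1251)
  # Already valid UTF-8: nothing to convert.
  return str if str.encoding == Encoding::UTF_8 && str.valid_encoding?

  # Re-tag the raw bytes with the assumed encoding, then transcode to UTF-8.
  str.dup.force_encoding(assumed).encode(Encoding::UTF_8)
end

cp1251_bytes = "\xF1\xEF\xEE\xF0\xF2".b  # the word from the question as Windows-1251 bytes
puts cp1251_bytes.encoding.name          # how Ruby currently tags the bytes
puts to_utf8(cp1251_bytes)               # the same text as UTF-8
```

Running every string through such a helper before it reaches Mongoid means the stored value and the later query literal use the same byte sequence, which is what the where(ru_name: ...) lookup depends on.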

Related

Create a text representation of an ActiveRecord object?

Suppose we have some arbitrary Active Record object
obj = User.first
Is there a way to convert this into a text representation?
That is, is there a way to convert the object into some code that can be dropped into a completely different rails console to regenerate that same object?
The closest example I can give of this functionality is the dput() function from the R programming language. Is there an equivalent in ruby / rails, preferably one that works with Active Record objects?
Ruby has the Marshal module:
The marshaling library converts collections of Ruby objects into a
byte stream, allowing them to be stored outside the currently active
script. This data may subsequently be read and the original objects
reconstituted.
str = Marshal.dump(obj)
# => "\x04\bo:\nThing\x1A:\x10@new_recordF:\x10@attributeso:\x1EActiveModel::AttributeSet\x06;\a{\tI\"\aid\x06:\x06ETo:)ActiveModel::Attribute::FromDatabase\n:\n@name@\b:\x1C@value_before_type_casti\x06:\n@typeo:EActiveRecord::ConnectionAdapters::SQLite3Adapter::SQLite3Integer\t:\x0F@precision0:
You can then load the object back into memory:
restored_obj = Marshal.load(
StringIO.new(str) # usually this would be from a IO stream like a file
)
It has some pretty serious security implications if you're accepting user input, though, so other serialization formats like JSON or YAML should be considered. There are also issues if you use it for caching and then change Ruby versions.
Rails models in recent versions also support Global ID - which doesn't give you the exact same object but it gives you a URI which can be used to load the same record from the database.
gid = User.first.to_global_id
obj = GlobalID::Locator.locate(gid)
This is how ActiveJob passes around references to models.
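To make the trade-off concrete, here is a small sketch that uses a plain Struct as a stand-in for a model (Person and the helper names are hypothetical). With a real Active Record object you would presumably start from obj.attributes and rebuild with User.new(attrs), but that part is not shown or tested here.

```ruby
require 'json'

# Stand-in for a model; NOT an ActiveRecord class.
Person = Struct.new(:id, :name, :email)

# Marshal: an exact binary copy, but only safe to load from trusted sources.
def marshal_round_trip(obj)
  Marshal.load(Marshal.dump(obj))
end

# JSON: plain text that can be pasted into another console.
def to_pasteable(person)
  JSON.generate(person.to_h)
end

# Rebuild the object from the pasted JSON text.
def from_pasteable(text)
  attrs = JSON.parse(text, symbolize_names: true)
  Person.new(attrs[:id], attrs[:name], attrs[:email])
end

person = Person.new(1, 'Ada', 'ada@example.com')
puts to_pasteable(person)                            # readable, copyable text
puts from_pasteable(to_pasteable(person)) == person  # rebuilt from text
puts marshal_round_trip(person) == person            # rebuilt from bytes
```

The JSON route is closer to what dput() gives you in R: the output is readable, survives Ruby upgrades, and never executes anything on load.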

Rails JSON store value gets quoted upon saving

I have this problem with my Rails (5.1.6) application that is running on a PostgreSQL instance.
I have a model with a JSON type column (t.json :meta). The model has a store accessor like
store :meta, accessors: [:title], coder: JSON
The problem is now when I set this value that it shows up in the database as
"{\"title\":\"I am a title\"}"
making it text rather than a JSON value, which means I cannot use the JSON query operator (->>) to query my JSON fields. I already tried without the coder option, but this results in it being saved as YAML.
The serialize function also did not change anything for me (adding serialize :meta, JSON)
Any and all help is appreciated!
serialize and store are not intended to be used for native JSON columns. Their purpose is to marshal and un-marshal data into string columns.
This was a "poor man's" JSON storage before native JSON support existed (and was supported by ActiveRecord). Using it on a JSON column will result in a double-encoded string, as you have noticed.
You don't actually have to do anything to use a JSON column. It's handled by the adapter.
See:
http://api.rubyonrails.org/classes/ActiveRecord/AttributeMethods/Serialization/ClassMethods.html#method-i-serialize
http://guides.rubyonrails.org/active_record_postgresql.html#json-and-jsonb
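The double encoding can be reproduced outside Rails. The sketch below only imitates the two encoding passes with the json library; encode_once and encode_twice are hypothetical names, not Rails internals. On a native json column, dropping the coder and declaring the accessor with store_accessor :meta, :title is the usual way to keep the accessor without the second pass, but that is not exercised here.

```ruby
require 'json'

# What the adapter does on its own for a json column: encode the Hash once.
def encode_once(hash)
  JSON.generate(hash)
end

# What happens when a JSON coder runs first and the resulting String is then
# encoded again: a JSON *string* whose content happens to be JSON text.
def encode_twice(hash)
  JSON.generate(hash).to_json
end

meta = { 'title' => 'I am a title' }
puts encode_once(meta)   # a JSON object, queryable with ->>
puts encode_twice(meta)  # a quoted string, as seen in the question
```

Parsing the double-encoded value yields a String rather than a Hash, which is why PostgreSQL treats it as a JSON string and ->> finds no title key.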

Ruby On Rails: Convert PostgreSQL JSON to usual JSON

I have a JSON field in my PostgreSQL database. If I do @profile.json, then I will get something like:
{ {"name"=>"jhon", "degree"=>"12312"}, "1480103144467"=>{"name"=>"", "degree"=>""}}
It has all the => and other symbols, which I can not parse. How can I convert to normal format?
If you've declared your column of type json that's a signal to Rails to automatically serialize and decode your column on-demand, transparently. What you're seeing here is a traditional Ruby Hash structure, which is to be expected.
Inside the database itself it's stored as JSON.
If you need to re-emit this as JSON for whatever reason, like for an API, try this:
@profile.json.to_json
Calling your column something other than json is probably advisable, too.
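A short sketch of the difference between the Hash that Rails hands back and its JSON text. The data is made up to resemble the question, and the key 'primary' is purely hypothetical, since the original output does not show the first key.

```ruby
require 'json'

# Hypothetical helper: turn the decoded Hash back into JSON text.
def hash_to_json(hash)
  JSON.generate(hash)
end

# Made-up data shaped like the question's column value.
profile = {
  'primary'       => { 'name' => 'jhon', 'degree' => '12312' },
  '1480103144467' => { 'name' => '', 'degree' => '' }
}

puts profile.inspect        # Ruby's Hash notation, with =>
puts hash_to_json(profile)  # JSON text, with : separators
```

In other words, the => output is only how Ruby prints a Hash; nothing needs parsing until the data leaves the application.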

Ruby 2.1.5 - ArgumentError: invalid byte sequence in UTF-8

I'm having trouble with UTF8 chars in Ruby 2.1.5 and Rails 4.
The problem is, the data which come from an external service are like that:
"first_name"=>"ezgi \xE7enberci"
"last_name" => "\xFC\xFE\xE7\xF0i\xFE\xFE\xF6\xE7"
These characters mostly include Turkish alphabet characters like "üğşiçö". When the application tries to save these data, the errors below occur:
ArgumentError: invalid byte sequence in UTF-8
Mysql2::Error: Incorrect string value
How can I fix this?
What's Wrong
Ruby thinks you have invalid byte sequences because your strings aren't UTF-8. For example, using the rchardet gem:
require 'chardet'
["ezgi \xE7enberci", "\xFC\xFE\xE7\xF0i\xFE\xFE\xF6\xE7"].map do |str|
  CharDet.detect str
end
#=> [{"encoding"=>"ISO-8859-2", "confidence"=>0.8600826867857209},
{"encoding"=>"windows-1255", "confidence"=>0.5807177322740268}]
How to Fix It
You need to use String#scrub or one of the encoding methods like String#encode! to clean up your strings first. For example:
hash = {"first_name"=>"ezgi \xE7enberci",
"last_name"=>"\xFC\xFE\xE7\xF0i\xFE\xFE\xF6\xE7"}
hash.each_pair { |k, v| hash[k] = v.encode("UTF-8", "ISO-8859-2") }
#=> {"first_name"=>"ezgi çenberci", "last_name"=>"üţçđiţţöç"}
Obviously, you may need to experiment a bit to figure out what the proper encoding is (e.g. ISO-8859-2, windows-1255, or something else entirely) but ensuring that you have a consistent encoding of your data set is going to be critical for you.
Character encoding detection is imperfect. Your best bet will be to try to find out what encoding your external data source is using, and use that in your string encoding rather than trying to detect it automatically. Otherwise, your mileage may vary.
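As a rough sketch of how the two options fit together: transcode when the source encoding is known, and fall back to String#scrub only when it is not. The helper name clean_utf8 is hypothetical, and ISO-8859-9 in the usage lines is just one plausible candidate for Turkish text.

```ruby
# Hypothetical helper: make a string safe to store as UTF-8.
# If the real source encoding is known, transcode from it (lossless).
# Otherwise replace invalid bytes with U+FFFD (lossy, but never raises).
def clean_utf8(str, source_encoding = nil)
  return str if str.encoding == Encoding::UTF_8 && str.valid_encoding?
  return str.encode('UTF-8', source_encoding) if source_encoding

  str.dup.force_encoding('UTF-8').scrub
end

raw = "ezgi \xE7enberci"
puts raw.valid_encoding?            # whether the bytes are valid UTF-8
puts clean_utf8(raw, 'ISO-8859-9')  # transcoded, characters preserved
puts clean_utf8(raw)                # scrubbed, bad byte replaced
```

Scrubbing gets rid of the exception but throws information away, so it is best kept as a last resort for data whose origin is unknown.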
That doesn't look like UTF-8 data, so this exception is normal. It sounds like you need to tell Ruby what encoding the string is actually in:
some_string.force_encoding("windows-1254")
You can then convert to UTF-8 with the encode method. There are gems (e.g. charlock_holmes) that use heuristics to auto-detect encodings if you're getting a mix of encodings.
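Put together, the two steps might look like the sketch below. That the external service really sends Windows-1254 is an assumption, and turkish_to_utf8 is a hypothetical helper name.

```ruby
# Hypothetical helper: tag the raw bytes as Windows-1254 (ASSUMED to be what
# the external service sends), then transcode them to UTF-8.
def turkish_to_utf8(raw)
  raw.dup.force_encoding('Windows-1254').encode('UTF-8')
end

puts turkish_to_utf8("ezgi \xE7enberci")
puts turkish_to_utf8("\xFC\xFE\xE7\xF0i\xFE\xFE\xF6\xE7")
```

force_encoding only changes the label on the bytes; encode is the step that actually rewrites them, which is why both calls are needed.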

Retrieving and parsing a MIME email from a database

Task given: An email is stored, byte for byte, in one or more chunks (of fixed length) in a database. This mail is to be retrieved from that database and its contents shall be displayed to the user.
I have no problem wrapping the legacy database in an ActiveRecord model, concatenating the stored chunks and so on. What I don't really know is where to start on the MIME parsing part. I thought about having a dedicated EMail class that I can initialize with the data stored in the database; that class would let me see which MIME parts the mail consists of and display, e.g., the text/* parts of it.
Now it seems that ActionMailer is able to parse incoming mails, but the documentation on receiving mails seems to be rather, erm, "sparse" and it just mentions receiving mails from STDIN.
How can I parse and display a MIME mail (or parts of it) in Rails, given that I can provide its contents as a single string, variable, query result or something like that?
Take a look at MMS2R.
I've been using it lately for parsing emails and it does a jolly good job.
I did it wrong. Rails comes with the TMail library, which is perfectly capable of parsing MIME emails. The basic workflow is as easy as concatenating the chunks from one stored message and passing them to TMail::Mail.parse like this:
email = TMail::Mail.parse(StoredMessage.find(:all,
:conditions => ["mail_id = ?", "oyByGqacG73b"],
:order => "chunk_ind").collect(&:mail_text).join)
email.body #=> this is your test body
email.subject # => test subject
email.has_attachment? #=> true
email.attachments.first.original_filename # => bulkfile
I do really apologize for having missed a whole library in Rails.
Note: has_attachments? (plural) is the correct name for the method.

Resources