Storing Postgres Array of Jsonb in Rails 5 Escapes Strings Unexpectedly - ruby-on-rails

Perhaps my understanding of how this is supposed to work is wrong, but I am seeing strings stored in my DB when I would expect a jsonb array. Here is how I have things set up:
Migration
t.jsonb :variables, array: true
Model
attribute :variables, :variable, array: true
Custom ActiveRecord::Type
ActiveRecord::Type.register(:variable, Variable::Type)
Custom Variable Type
class Variable::Type < ActiveRecord::Type::Json
include ActiveModel::Type::Helpers::Mutable
# Type casts a value from user input (e.g. from a setter). This value may be a string from the form builder, or a ruby object passed to a setter. There is currently no way to differentiate between which source it came from.
# - value: The raw input, as provided to the attribute setter.
def cast(value)
unless value.nil?
value = Variable.new(value) if !value.kind_of?(Variable)
value
end
end
# Converts a value from database input to the appropriate ruby type. The return value of this method will be returned from ActiveRecord::AttributeMethods::Read#read_attribute. The default implementation just calls #cast.
# - value: The raw input, as provided from the database.
def deserialize(value)
unless value.nil?
value = super if value.kind_of?(String)
value = Variable.new(value) if value.kind_of?(Hash)
value
end
end
end
So this method does work from the application's perspective. I can set the value as variables = [Variable.new, Variable.new] and it correctly stores in the DB, and retrieves back as an array of [Variable, Variable].
What concerns me, and the root of this question, is that in the database, the variable is stored using double escaped strings rather than json objects:
{
"{\"token\": \"a\", \"value\": 1, \"default_value\": 1}",
"{\"token\": \"b\", \"value\": 2, \"default_value\": 2}"
}
I would expect them to be stored something more resembling a json object like this:
{
{"token": "a", "value": 1, "default_value": 1},
{"token": "b", "value": 2, "default_value": 2}
}
The reason for this is that, from my understanding, future querying on this column directly from the DB will be faster/easier if in a json format, rather than a string format. Querying through rails would remain unaffected.
How can I get my Postgres DB to store the array of jsonb properly through rails?

So it turns out that the Rails 5 attribute api is not perfect yet (and not well documented), and the Postgres array support was causing some problems, at least with the way I wanted to use it. I used the same approach to the problem for the solution, but rather than telling rails to use an array of my custom type, I am using a custom type array. Code speaks louder than words:
Migration
t.jsonb :variables, default: []
Model
attribute :variables, :variable_array, default: []
Custom ActiveRecord::Type
ActiveRecord::Type.register(:variable_array, VariableArrayType)
Custom Variable Type
class VariableArrayType < ActiveRecord::ConnectionAdapters::PostgreSQL::OID::Jsonb
def deserialize(value)
value = super # turns raw json string into array of hashes
if value.kind_of? Array
value.map {|h| Variable.new(h)} # turns array of hashes into array of Variables
else
value
end
end
end
And now, as expected, the db entry is no longer stored as a string, but rather as searchable/indexable jsonb. The whole reason for this song and dance is that I can set the variables attribute using plain old ruby objects...
template.variables = [Variable.new(token: "a", default_value: 1), Variable.new(token: "b", default_value: 2)]
...then have it serialized as its jsonb representation in the DB...
[
{"token": "a", "default_value": 1},
{"token": "b", "default_value": 2}
]
...but more importantly, automatically deserialized and rehydrated back into the plain old ruby object, ready for me to interact with it.
Template.find(123).variables # => [#<Variable:0x87654321 token: "a", default_value: 1>, #<Variable:0x12345678 token: "b", default_value: 2>]
Using the old serialize api causes a write with every save (intentionally by Rails architectural design), regardless of whether or not the serialized attribute had changed. Doing this all manually by overriding setters/getters is an unnecessary complication due to the numerous ways attributes can be assigned, and is partly the reason for the newer attributes api.
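The post never shows the Variable class itself, so the following is only a minimal plain-Ruby sketch of the round trip the custom type performs. It assumes a hypothetical Variable with just token and default_value attributes, and it uses the standard json library in place of the Rails type machinery, so it illustrates the idea rather than the actual ActiveRecord code path.

```ruby
require 'json'

# Hypothetical stand-in for the Variable class, which the original post does not show.
class Variable
  attr_reader :token, :default_value

  def initialize(attrs = {})
    attrs = attrs.transform_keys(&:to_s) # accept symbol or string keys
    @token = attrs['token']
    @default_value = attrs['default_value']
  end

  def to_h
    { 'token' => token, 'default_value' => default_value }
  end

  # Lets Array#to_json emit each Variable as a JSON object, not a quoted string.
  def to_json(*args)
    to_h.to_json(*args)
  end
end

variables = [
  Variable.new(token: 'a', default_value: 1),
  Variable.new(token: 'b', default_value: 2)
]

# Roughly what ends up in the jsonb column: one JSON array of objects.
stored = variables.to_json

# Roughly what deserialize does: parse once, then wrap each hash in a Variable.
rehydrated = JSON.parse(stored).map { |h| Variable.new(h) }
```

The point of the design is that the column holds a single JSON document and the conversion to and from Variable happens only at the edges.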

If it helps anyone else, Rails wants you to provide the possible keys to permit in the controller as well if you're using strong params:
def controller_params
params.require(:parent_key)
.permit(
jsonb_field: [:allowed_key1, :allowed_key2, :allowed_key3]
)
end

One solution could be to just parse the variable via JSON.parse, push it inside an empty array, then assign it to the attribute.
variables = []
variable = "{\"token\": \"a\", \"value\": 1, \"default_value\": 1}"
variable.class #String
parsed_variable = JSON.parse(variable) #{"token"=>"a", "value"=>1, "default_value"=>1}
parsed_variable.class #Hash
variables.push parsed_variable

Related

Rails find_or_create_by add cast to hash value json type attribute?

I have a model with an :extra_fields column that is :jsonb datatype, I want to add in the attr hashes to the column, something like this below but I am unsure of the syntax to cast the hash values' datatypes here, and if not here what is the best practice for casting hash value data ?
instance = Model.find_or_create_by(ref_id: hash[:ref_id]) do |a|
a.extra_fields = {
'attr1' : hash[:attr1], <-- //possible to cast type here ie ::type ?
'attr2' : hash[:attr2] <--
}
instance.save!
end
Bonus: how would I cast the hash values as type :decimal, :string, :boolean, :date for example?
All incoming parameters in Rails/Rack are strings, except for array/hash parameters, which still have strings as values. Rails does the actual casting when you pass parameters to models.
You can cast strings to any other type in Ruby with the .to_x methods:
irb(main):006:0> "1.23".to_f
=> 1.23
irb(main):007:0> "1.23".to_d
=> #<BigDecimal:7ff7dea40b68,'0.123E1',18(18)>
irb(main):008:0> 1.23.to_s
=> "1.23"
irb(main):008:0> 1.23.to_i
=> 1
Boolean casting is a Rails feature. You can do it by:
# Rails 5
ActiveModel::Type::Boolean.new.cast(value)
ActiveModel::Type::Boolean.new.cast("true") # true
ActiveModel::Type::Boolean.new.cast("t") # true
ActiveModel::Type::Boolean.new.cast("false") # false
ActiveModel::Type::Boolean.new.cast("f") # false
# This is somewhat surprising
ActiveModel::Type::Boolean.new.cast("any arbitrary string") # true
# Rails 4.2
ActiveRecord::Type::Boolean.new.type_cast_from_database(value)
# Rails 4.1 and below
ActiveRecord::ConnectionAdapters::Column.value_to_boolean(value)
Note that this is very different from the Ruby boolean coercion done by ! and !!.
irb(main):008:0> !!"false"
(irb):8: warning: string literal in condition
=> true
In Ruby, everything except nil and false is truthy.
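For the original question of casting each hash value before assigning it to the jsonb column, the conversions above could be combined roughly as follows. This is a plain-Ruby sketch: cast_fields, CASTERS, FALSE_STRINGS and the sample keys are made up for illustration, and the boolean rule only approximates what ActiveModel::Type::Boolean does.

```ruby
require 'bigdecimal'
require 'bigdecimal/util' # adds String#to_d

# Assumption: a small set of strings treated as false; everything else is true.
FALSE_STRINGS = %w[0 f false off].freeze

# Hypothetical lookup table from a type name to a casting lambda.
CASTERS = {
  string: ->(v) { v.to_s },
  integer: ->(v) { Integer(v.to_s, 10) },
  decimal: ->(v) { v.to_s.to_d },
  boolean: ->(v) { !FALSE_STRINGS.include?(v.to_s.downcase) }
}.freeze

# Builds a new hash containing only the keys named in schema, each cast to its type.
def cast_fields(raw, schema)
  schema.each_with_object({}) do |(key, type), out|
    out[key] = CASTERS.fetch(type).call(raw[key])
  end
end

raw = { 'price' => '1.23', 'label' => 42, 'active' => 'f', 'count' => '7' }
schema = { 'price' => :decimal, 'label' => :string, 'active' => :boolean, 'count' => :integer }

fields = cast_fields(raw, schema)
```

The resulting hash could then be assigned to extra_fields inside the find_or_create_by block. Keeping the schema in one place makes it obvious which keys are expected and how each is interpreted.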
Dates are somewhat more complex. The default Rails date inputs use multi-parameters to send each part of the date (year, month, day) and a special setter that constructs a date from these inputs.
Processing by PeopleController#create as HTML
Parameters: { "person"=>{"birthday(1i)"=>"2019", "birthday(2i)"=>"2", "birthday(3i)"=>"16"}, ...}
You can construct a date from these parameters by:
date_params = params.fetch(:person).permit("birthday")
Date.new(*date_params.values.map(&:to_i))
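Relying on the order of the hash values is a little fragile. A slightly more defensive sketch, in plain Ruby with a literal hash standing in for the permitted parameters, sorts the parts by the index embedded in each key before building the date:

```ruby
require 'date'

# Stand-in for the permitted multi-parameter attributes, deliberately out of order.
date_params = {
  'birthday(3i)' => '16',
  'birthday(1i)' => '2019',
  'birthday(2i)' => '2'
}

# Order the parts by the number inside "(Ni)", then convert each value to an integer.
parts = date_params
  .sort_by { |key, _| key[/\((\d+)i\)/, 1].to_i }
  .map { |_, value| value.to_i }

birthday = Date.new(*parts)
```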
"What is the best practice for casting hash value data?"
There is no best practice here. What you should be pondering instead is the use of a JSON column. Since you seem to want to apply some sort of schema to the data, it might be a good idea to actually create a separate table and model. You are, after all, using a relational database.
JSON columns are great for solving some complex issues like key/value tables or storing raw JSON data but they should not be your first choice when modelling your data.
See PostgreSQL anti-patterns: Unnecessary json/hstore dynamic columns for a good write up on the topic.

Ruby on rails array - no implicit conversion of symbol into integer

I made a table component using react js, which uses columns to display data (which works fine with other data). For example, column 1 would show the title, column 2 the year, and column 3 the format.
Here is an example of my JSON:
{"movies": [{"title": "Iron Man", "year": "2008", "format": "DVD"}, {"title": "Iron Man 2", "year": "2010", "format": "DVD"}, {"title": "Iron Man 3", "year": "2013", "format": "DVD"}]}
Here is my code to populate the table, but it does not seem to work:
@movieList = # Makes a call to my mock API to get list of movies
@movies = Array.new
@movieList.each do |item|
@movie = Hash.new
@movie[:column1] = item[:title]
@movie[:column2] = item[:year]
@movie[:column3] = item[:format]
@movies << @movie
end
I need some advice to overcome a "no implicit conversion of symbol into integer error" I get. Could anyone offer some advice and point out where I am going wrong?
tl;dr
use @movieList["movies"].each
explanation
The issue here is that you treat @movieList as an array when it is actually a hash (assuming @movieList is the JSON you showed).
each works on both arrays and hashes. However, when you use it on a hash, the block is passed |key, val|. Assigning both block variables is optional, so when you write @movieList.each do |item|, item receives the whole pair as a two-element array: ["movies", [...]].
Arrays respond to [] indexing with integers. That's why you get the error no implicit conversion of Symbol into Integer: you pass a symbol to Array#[] and it expects an integer.
Another way to write this code, that is more idiomatic, would be like so:
@movies = @movieList["movies"].map do |movie|
{
column1: movie["title"],
column2: movie["year"],
column3: movie["format"]
}
end
Try reassigning:
@movieList = @movieList[:movies] (or @movieList["movies"] if the keys are strings, as they are after JSON.parse)
This will solve your problem. You're trying to iterate over an object instead of an array.
Let me know if it solves your problem.
You need to loop over the movies using @movieList["movies"], as your JSON is a hash that has a key 'movies' and an array of movies as its value => {'movies': [{...},{...},...]}
As @max pleaner explained, assigning block variables is optional, but when you use each on a hash (your JSON in this case) and provide only one block variable (instead of two referring to the key and the value), each key-value pair is converted to a two-element array inside the block, where the first element is the key and the second is the value.
Your item looks like this inside your each block -
['movies', [{movie1}, {movie2},..]], hence:
item[0] # 'movies'
item[1] # [{movie1}, {movie2},...]
As arrays expect indexing with integers and you supply a symbol (item[:title]), you receive:
TypeError (no implicit conversion of Symbol into Integer)
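The behaviour described above can be reproduced in plain Ruby. The sketch below parses a shortened copy of the JSON from the question; the local variable names are only for illustration.

```ruby
require 'json'

movie_list = JSON.parse(<<~JSON)
  {"movies": [{"title": "Iron Man", "year": "2008", "format": "DVD"},
              {"title": "Iron Man 2", "year": "2010", "format": "DVD"}]}
JSON

# What a single block variable receives when iterating a hash: a [key, value] pair.
item = movie_list.first

# Indexing that array with a symbol raises the TypeError from the question.
error_message =
  begin
    item[:title]
  rescue TypeError => e
    e.message
  end

# Iterating the array stored under "movies" works as intended.
movies = movie_list['movies'].map do |movie|
  { column1: movie['title'], column2: movie['year'], column3: movie['format'] }
end
```

Note that JSON.parse produces string keys by default, which is why the lookups use movie['title'] rather than movie[:title].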

How to access Chewy results with the dot notation?

I'm using Toptal's Chewy gem to connect and query my Elasticsearch, just like an ODM.
I'm using Chewy along with Elasticsearch 6, Ruby on Rails 5.2 and Active Record.
I've defined my index just like this:
class OrdersIndex < Chewy::Index
define_type Order.includes(:customer) do
field :id, type: "keyword"
field :customer do
field :id, type: "keyword"
field :name, type: "text"
field :email, type: "keyword"
end
end
end
And my model:
class Order < ApplicationRecord
belongs_to :customer
end
The problem here is that when I perform any query using Chewy, the customer data gets deserialized as a hash instead of an Object, and I can't use the dot notation to access the nested data.
results = OrdersIndex.query(query_string: { query: "test" })
results.first.id
# => "594d8e8b2cc640bb78bd115ae644637a1cc84dd460be6f69"
results.first.customer.name
# => NoMethodError: undefined method `name' for #<Hash:0x000000000931d928>
results.first.customer["name"]
# => "Frederique Schaefer"
How can I access the nested association using the dot notation (result.customer.name)? Or to deserialize the nested data inside an Object such as a Struct, that allows me to use the dot notation?
try to use
results = OrdersIndex.query(query_string: { query: "test" }).objects
It converts the query result into Active Record objects, so dot notation should work. If you want to load any extra associations with the above result, you can use the .load method on the index.
If you want to make an existing ES nested object accessible with dot notation, try referring to this answer. OpenStruct is the best way to get things done in Ruby.
Unable to use dot syntax for ruby hash
also, this one can help too
see this link if you need openStruct to work for nested object
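The OpenStruct approach usually amounts to parsing JSON with object_class set to OpenStruct. A minimal sketch, assuming a plain hash shaped like the customer field above (the id value is made up, and the address key is hypothetical, added only to show nesting):

```ruby
require 'json'
require 'ostruct'

# Stand-in for the hash a search result returns for the customer field.
customer_hash = {
  'id' => '42',
  'name' => 'Frederique Schaefer',
  'address' => { 'city' => 'Springfield' } # hypothetical nested data
}

# Round-trip through JSON so every nested object becomes an OpenStruct.
customer = JSON.parse(customer_hash.to_json, object_class: OpenStruct)

name = customer.name
city = customer.address.city
```

This works at any nesting depth, at the price of serializing and parsing every result a second time.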
Converting the just-deserialized results to JSON string and deserializing it again with OpenStruct as an object_class can be a bad idea and has a great CPU cost.
I've solved it differently, using recursion and Ruby's native Struct, preserving the laziness of the Chewy gem.
def convert_to_object(keys, values)
schema = Struct.new(*keys.map(&:to_sym))
object = schema.new(*values)
object.each_pair do |key, value|
if value.is_a?(Hash)
object.send("#{key}=", convert_to_object(value.keys, value.values))
end
end
object
end
OrdersIndex.query(query_string: { query: "test" }).lazy.map do |item|
convert_to_object(item.attributes.keys, item.attributes.values)
end
convert_to_object takes an array of keys and an array of values and creates a struct from them. Whenever one of the values is a Hash, it is converted to a struct recursively, using that hash's keys and values.
To preserve the laziness, which is the coolest part of Chewy, I've used Enumerator::Lazy and its map method. Mapping every value returned by the ES query through convert_to_object turns every entry into a complete struct.
The code is very generic and works for every index I've got.

value is updated in DB with ActiveSupport::HashWithIndifferentAccess after serializing

I am trying to update a column's value to a hash in the database. The column in the database is text.
In model i have,
serialize :order_info
In controller i have update action
def update
Order.update_order_details(update_params, params[:order_info])
head :no_content
end
I am not using strong parameters for order_info because order_info is an arbitrary hash and, after doing some research, it seems strong params doesn't support an arbitrary hash.
The value that i am trying to pass is like below
"order_info": {
"orders": [
{
"test": "AAAA"
}
],
"detail": "BBBB",
"type": "CCCC"
}
But when i try to update the value it gets updated in database like
--- !ruby/object:ActionController::Parameters parameters: !ruby/hash:ActiveSupport::HashWithIndifferentAccess comments: - !ruby/hash:ActiveSupport::HashWithIndifferentAccess test: AAAA detail: BBBB type: CCCC permitted: false
serialize is an instance of ActiveSupport::HashWithIndifferentAccess, so I am guessing that's why it's in the value. How can I get rid of the extra stuff and just update the hash?
If you want to unwrap all the ActionController::Parameters stuff from params[:order_info] without any filtering then the easiest thing to do is call to_unsafe_h (or its alias to_unsafe_hash):
hash = params[:order_info].to_unsafe_h
In Rails4 that should give you a plain old Hash in hash but AFAIK Rails5 will give you an ActiveSupport::HashWithIndifferentAccess so you might want to add a to_h call:
hash = params[:order_info].to_unsafe_h.to_h
The to_h call won't do anything in Rails4 but will give you one less thing to worry about when you upgrade to Rails5.
Then your update call:
Order.update_order_details(
update_params,
params[:order_info].to_unsafe_h.to_h # <-------- Extra DWIM method calls added
)
should give you the YAML in the database that you're looking for:
"---\n:order_info:\n :orders:\n - :test: AAAA\n :detail: BBBB\n :type: CCCC\n"
You might want to throw in a deep_stringify_keys call too:
params[:order_info].to_unsafe_h.to_h.deep_stringify_keys
depending on what sort of keys you want in your YAMLizied Hash.
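Why the extra class names show up at all can be reproduced outside Rails: YAML tags any Hash subclass with its class name, whereas a plain Hash is dumped cleanly. In the sketch below, TaggedHash is a made-up stand-in for ActiveSupport::HashWithIndifferentAccess, and deep_plain is a hypothetical helper doing roughly what to_h plus deep_stringify_keys achieve.

```ruby
require 'yaml'

# Made-up Hash subclass standing in for HashWithIndifferentAccess.
class TaggedHash < Hash; end

# Recursively rebuilds the structure from plain Hash and Array objects with string keys.
def deep_plain(value)
  case value
  when Hash
    value.each_with_object({}) { |(k, v), out| out[k.to_s] = deep_plain(v) }
  when Array
    value.map { |v| deep_plain(v) }
  else
    value
  end
end

order_info = TaggedHash.new
order_info['orders'] = [TaggedHash['test', 'AAAA']]
order_info['detail'] = 'BBBB'
order_info['type'] = 'CCCC'

tagged_yaml = YAML.dump(order_info)            # carries !ruby/hash:TaggedHash tags
plain_yaml = YAML.dump(deep_plain(order_info)) # plain mapping, no class tags
```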

How to get rid of surrounding quotes in Rails?

I'm having problems with weird behaviour in RoR. I'm having a Hash that i'm converting to json using to_json() like so:
data = Hash.new
# ...
data = data.to_json()
This code appears inside a model class. Basically, I'm converting the hash to JSON when saving to database. The problem is, the string gets saved to database with its surrounding quotes. For example, saving an empty hash results in: "{}". This quoted string fails to parse when loading from the database.
How do I get rid of the quotes?
The code is:
def do_before_save
@_data = self.data
self.data = self.data.to_json()
end
EDIT:
Due to confusions, I'm showing my entire model class
require 'json'
class User::User < ActiveRecord::Base
after_find { |user|
user.data = JSON.parse(user.data)
}
after_initialize { |user|
self.data = Hash.new unless self.data
}
before_save :do_before_save
after_save :do_after_save
private
def do_before_save
@_data = self.data
self.data = self.data.to_json()
end
def do_after_save
self.data = @_data
end
end
The data field is TEXT in mysql.
I'm willing to bet money that this is the result of you calling .to_json on the same data twice (without parsing it in between). I've had a fair share of these problems before I devised a rule: "don't mutate data in a lossy way like this".
If your original data was {}, then first .to_json would produce "{}". But if you were to jsonify it again, you'd get "\"{}\"" because a string is a valid json data type.
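The double encoding is easy to reproduce in plain Ruby with the standard json library; the variable names are only for illustration:

```ruby
require 'json'

data = {}
once = data.to_json   # the hash encoded as a JSON object
twice = once.to_json  # that JSON text encoded again, now a JSON string

parsed_once = JSON.parse(once)   # back to a Hash
parsed_twice = JSON.parse(twice) # only one layer removed: still a String
```

Parsing the doubly encoded value gives back a String rather than a Hash, which matches the symptom in the question.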
I suggest that you put a breakpoint in your before_save filter and see who's calling it the second time and why.
Update
"call .to_json twice" is a general description and can also mean that there are two subsequent saves on the same object, and since self.data is reassigned, this leads to data corruption. (thanks, @mudasobwa)
It depends on your model's database field type.
If the field is a string type (like VARCHAR or TEXT), it should be stored as a string (no need to get rid of the quotes, they are fine). Make sure to call to_json only once.
If the field is Postgres JSON type, then you can just assign a hash to the model's field, no need to call to_json at all.
If you are saving hash as a JSON string in a varchar column you can use serialize to handle marshalling/unmarshaling the data:
class Thing < ActiveRecord::Base
serialize :foo, JSON
end
Knowing exactly when to convert the data in the lifecycle of a record is actually quite a bit harder than your naive implementation. So don't reinvent the wheel.
However, a huge drawback is that the data cannot be queried in the DB. If you are using Postgres or MySQL, you can instead use a JSON or JSONB (Postgres only) column type, which allows querying. This example is from the Rails guides:
# db/migrate/20131220144913_create_events.rb
create_table :events do |t|
t.json 'payload'
end
# app/models/event.rb
class Event < ApplicationRecord
end
# Usage
Event.create(payload: { kind: "user_renamed", change: ["jack", "john"]})
event = Event.first
event.payload # => {"kind"=>"user_renamed", "change"=>["jack", "john"]}
## Query based on JSON document
# The -> operator returns the original JSON type (which might be an object), whereas ->> returns text
Event.where("payload->>'kind' = ?", "user_renamed")
use {}.as_json instead of {}.to_json
ex:
a = {}
a.as_json # => {}
a.to_json # => "{}"
http://api.rubyonrails.org/classes/ActiveModel/Serializers/JSON.html#method-i-as_json

Resources