Using a frozen constant as a hash key? - ruby-on-rails

I built a small API that makes a few calls that have similar payloads, so I have a base payload that I merge in the call-specific elements, like so:
def foo_call
  base_payload.merge({"request.id" => request_id})
end
def biz_call
  base_payload.merge({"request.id" => some_other_thing})
end
def base_payload
  {
    bar: bar,
    baz: baz,
    "request.id" => default_id
  }
end
My coworker suggested that I make "request.id" a frozen constant, arguing that a frozen constant means we won't allocate a new string object on each call, saving a bit of memory. That would look like this:
REQUESTID = "request.id".freeze
def foo_call
  base_payload.merge({REQUESTID => request_id})
end
def biz_call
  base_payload.merge({REQUESTID => some_other_thing})
end
def base_payload
  {
    bar: bar,
    baz: baz,
    REQUESTID => default_id
  }
end
I'm a little apprehensive, but I can't quite pin down any reason why not to do this (other than my latent resistance at having a new commit :>). I feel it might cause weirdness to have the same object be the key in multiple hashes -- that merging in an object "request.id" might not overwrite the original string from the base_payload -- or that we won't actually see any memory saved since the hash would have to clone the string for its hash key anyway, but I'm not really sure.
Am I just being overly paranoid/resistant?
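Both worries can be checked in plain Ruby. The sketch below is only an illustration (the names REQUEST_ID, base, and mutable are made up). It shows that string keys are matched by content, so merging an equal string overwrites the original entry, and that a Hash stores an already-frozen string as-is while it duplicates and freezes an unfrozen one.

```ruby
REQUEST_ID = "request.id".freeze

base = { "request.id" => :default }

# String keys are matched by content (eql?/hash), not by object identity,
# so the frozen constant overwrites the literal key from the base hash.
merged = base.merge({ REQUEST_ID => :override })
merged.size            # 1
merged["request.id"]   # :override

# An already-frozen string is stored as the key without being copied.
frozen_hash = {}
frozen_hash[REQUEST_ID] = 1
frozen_hash.keys.first.equal?(REQUEST_ID)   # true

# An unfrozen string is duplicated and frozen when it is used as a key.
mutable = String.new("request.id")
mutable_hash = {}
mutable_hash[mutable] = 1
stored = mutable_hash.keys.first
stored.equal?(mutable)   # false
stored.frozen?           # true
```

So sharing one frozen key object across several hashes should not cause any weirdness, and the hash does not need to clone a key that is already frozen.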

Related

Log changes from both attribute= and Array.push

That ActiveModel::Dirty doesn't cover Array.push (or any other modify-in-place methods, as I've read extremely recently) for attributes pertaining to, say, postgres arrays is pretty well-established. For example, if an Apple model has an array Apple.seeds, you'll see the following in a Rails console.
johnny = Apple.new()
# => <Apple #blahblahblah>
johnny.seeds
# => [] (assuming [] default)
johnny.seeds << "Oblong"
# => ["Oblong"]
johnny.changed?
# => false
johnny.seeds = []
johnny.seeds += ["Oblong"]
# => ["Oblong"]
johnny.changed?
# => true
So you can use two different ways of changing the array attribute, but Rails only recognizes the one that uses a setter. My question is, is there a way (that won't mangle the Array class) to get push to behave like a setter in the context of an ActiveRecord object, so that johnny.seeds << (x) will reflect in johnny.changes?
(On my end, this is to prevent future developers from using push on array attributes, unwittingly failing to record changes because they were not aware of this limitation.)
This is a problem with any column with a mutable object, not just Array objects.
seeder = Apple.first
seeder.name
# => "Johnny "
seeder.name << " Appleseed"
seeder.changed?
# => false
You're better off leaving a note for future developers, but otherwise you can consider replacing the changed? method:
class Apple
  # Keep a handle on the original implementation.
  alias_method 'old_changed?', 'changed?'
  def changed?
    return old_changed? if old_changed?
    # A new record has nothing stored yet, so any seeds count as a change.
    return (seeds.count > 0) if new_record?
    # Otherwise compare against the stored value (this costs an extra query).
    return seeds != Apple.find(id).seeds
  end
end
However, note that just because changed? comes back true does not assure you that fields with unchanged object_ids will be updated in update_attributes; you may find that they're not. You might need to hire competent Rails developers who understand these pitfalls.
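To see why only the setter is noticed, here is a minimal plain-Ruby sketch. TrackedApple and its methods are hypothetical and are not how ActiveModel::Dirty is implemented; the point is only that `<<` mutates the existing object without ever calling `seeds=`, whereas comparing against a saved snapshot catches the change.

```ruby
# Hypothetical stand-in for a model; this is not ActiveRecord.
class TrackedApple
  attr_reader :seeds

  def initialize(seeds = [])
    @seeds = seeds
    @saved_seeds = Marshal.load(Marshal.dump(seeds)) # snapshot of the "persisted" state
    @setter_called = false
  end

  # Only assignment goes through here; seeds << "x" never does.
  def seeds=(value)
    @setter_called = true
    @seeds = value
  end

  # Setter-based tracking: blind to in-place mutation.
  def changed_by_setter?
    @setter_called
  end

  # Snapshot-based tracking: compares the current value with the saved copy.
  def changed_by_snapshot?
    @seeds != @saved_seeds
  end
end

apple = TrackedApple.new
apple.seeds << "Oblong"
apple.changed_by_setter?    # false
apple.changed_by_snapshot?  # true
```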

Editing params nested hash

Assume we have a Rails params hash full of nested hashes and arrays. Is there a way to alter every string value (whether in nested hashes or arrays) that matches a certain criterion (e.g. a regex) and still keep the output as a params hash (still containing nested hashes and arrays)?
I want to do some sort of string manipulation on some attributes before even assigning them to a model. Is there any better way to achieve this?
[UPDATE]
Let's say we want to select the strings that have an h in the beginning and replace it with a 'b'. so we have:
before:
{ a: "h343", b: { c: ["h2", "s21"] } }
after:
{ a: "b343", b: { c: ["b2", "s21"] } }
For various reasons I can't do this with model callbacks and the like, so it has to be done before assigning to the respective attributes.
"still keep the output as a params hash (still containing nested hashes arrays)"
Sure.
You'll have to manipulate the params hash, which is done in the controller.
I don't have lots of experience with this, but I spent a while testing: you can combine the ActionController::Parameters class with gsub!, like this:
# app/controllers/your_controller.rb
class YourController < ApplicationController
  def create
    # Params are passed from the browser request
    @model = Model.new params_hash
  end
  private
  def params_hash
    # /[regex]/ and 'string' are placeholders for the real pattern and replacement
    params.require(:x).permit(:y).each do |k, v|
      v.gsub!(/[regex]/, 'string')
    end
  end
end
I tested this on one of our test apps, and it worked perfectly.
--
There are several important points.
Firstly, when you call a strong_params hash, params.permit creates a new hash out of the passed params. This means you can't just modify the passed params with params[:description] = etc. You have to do it to the permitted params.
Secondly, I could only get the .each block working with a bang-operator (gsub!), as this changes the value directly. I'd have to spend more time to work out how to do more elaborate changes.
--
Update
If you wanted to include nested hashes, you'd have to add another loop:
def params_hash
  params.require(:x).permit(:y).each do |k, v|
    if /_attributes/ =~ k
      # v is the nested hash here, so walk its values
      v.each do |deep_k, deep_v|
        deep_v.gsub!(/[regex]/, 'string')
      end
    else
      v.gsub!(/[regex]/, 'string')
    end
  end
end
In general you should not alter the original params hash. When you use strong parameters to whitelist the params you are actually creating a copy of the params - which can be modified if you really need to.
def whitelist_params
params.require(:foo).permit(:bar, :baz)
end
But if mapping the input to a model is too complex or you don't want to do it on the model layer you should consider using a service object.
Assuming you have a hash like this:
hash = { "hello" => { "hello" => "hello", "world" => { "hello" => "world", "world" => { "hello" => "world" } } }, "world" => "hello" }
Then add a function that transforms the "ello" part of all keys and values into "i" (meaning that "hello" and "yellow" will become "hi" and "yiw")
def transform_hash(hash, &block)
  hash.inject({}){ |result, (key,value)|
    value = value.is_a?(Hash) ? transform_hash(value, &block) : value.gsub(/ello/, 'i')
    block.call(result, key.gsub(/ello/, 'i'), value)
    result
  }
end
Use the function like:
new_hash = transform_hash(hash) {|hash, key, value| hash[key] = value }
This will transform your hash and its values regardless of the nesting level. However, the values must be strings (or another Hash), otherwise you'll get an error. To solve this problem, just change the value.is_a?(Hash) conditional a bit.
NOTE that I strongly recommend you NOT to change the keys of the hash!
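As a sketch of how the conditional could be widened to cover arrays and non-string values, the helper below (deep_transform_strings is a made-up name, not a Rails method) walks hashes and arrays, leaves keys alone, and passes only String leaves to the block. It works on plain Ruby hashes; with real params you would presumably call it on the permitted parameters after converting them with to_h.

```ruby
# Recursively walks hashes and arrays and passes every String value to the block.
# Keys are left untouched; non-string leaves are returned as-is.
def deep_transform_strings(value, &block)
  case value
  when Hash
    value.each_with_object({}) { |(k, v), out| out[k] = deep_transform_strings(v, &block) }
  when Array
    value.map { |v| deep_transform_strings(v, &block) }
  when String
    block.call(value)
  else
    value
  end
end

before = { a: "h343", b: { c: ["h2", "s21"] } }
# Replace a leading "h" with "b", as in the question's example.
after = deep_transform_strings(before) { |s| s.sub(/\Ah/, "b") }
# after == { a: "b343", b: { c: ["b2", "s21"] } }
```

Because it builds a new structure with a non-destructive sub, the input is left unmodified.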

How to stream large xml in Rails 3.2?

I'm migrating our app from 3.0 to 3.2.x. Earlier, the streaming was done by assigning a proc to response_body, like so:
self.response_body = proc do |response, output|
  target_obj = StreamingOutputWrapper.new(output)
  lib_obj.xml_generator(target_obj)
end
As you can imagine, the StreamingOutputWrapper responds to <<.
This way is deprecated in Rails 3.2.x. The suggested way is to assign an object that responds to each.
The problem I'm facing now is in making the lib_obj.xml_generator each-aware.
The current version of it looks like this:
def xml_generator(target, conditions = [])
  builder = Builder::XmlMarkup.new(:target => target)
  builder.root do
    builder.elementA do
      Model1.find_each(:conditions => conditions) { |model1| target << model1.xml_chunk_string }
    end
  end
end
where target is a StreamingOutputWrapper object.
The question is, how do I modify the code - the xml_generator, and the controller code, to make the response xml stream properly.
Important stuff: Building the xml in memory is not an option as the model records are huge. The typical size of the xml response is around 150MB.
What you are looking for is SAX parsing. SAX reads a file in chunks instead of loading the whole document into a DOM. This is convenient, and fortunately plenty of people before you have wanted to do the same thing. Nokogiri offers XML::SAX methods, but the documentation is hard to follow and the syntax is messy. I would suggest looking into something that sits on top of Nokogiri and makes the job a lot simpler.
Here are a few options -
SAX_stream:
Mapping out objects in sax_stream is super simple:
require 'sax_stream/mapper'
class Product
  include SaxStream::Mapper
  node 'product'
  map :id, :to => '#id'
  map :status, :to => '#status'
  map :name_confirmed, :to => 'name/#confirmed'
  map :name, :to => 'name'
end
and calling the parser is also simple:
require 'sax_stream/parser'
require 'sax_stream/collectors/naive_collector'
collector = SaxStream::Collectors::NaiveCollector.new
parser = SaxStream::Parser.new(collector, [Product])
parser.parse_stream(File.open('products.xml'))
However, working with the collectors (or writing your own) can end up being slightly confusing, so I would actually go with:
Saxerator:
Saxerator gets the job done and has some really handy methods for traversing into nodes that can be a little less complex than sax_stream. Saxerator also has a few really great configuration options that are well documented. A simple Saxerator example:
parser = Saxerator.parser(File.new("rss.xml"))
parser.for_tag(:item).each do |item|
  # where the xml contains <item><title>...</title><author>...</author></item>
  # item will look like {'title' => '...', 'author' => '...'}
  puts "#{item['title']}: #{item['author']}"
end
# a String is returned here since the given element contains only character data
puts "First title: #{parser.for_tag(:title).first}"
If you end up having to pull the XML from an external source (or it is updated frequently and you don't want to update the copy on your server manually), check out THIS QUESTION and the accepted answer; it works great.
You could always monkey-patch the response object:
response.stream.instance_eval do
alias :<< :write
end
builder = Builder::XmlMarkup.new(:target => response.stream)
...
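Another possible route, sketched here in plain Ruby, is to wrap the existing `<<`-style generator in an Enumerator. Enumerator::Yielder responds to `<<`, so it can stand in for the StreamingOutputWrapper, and the Enumerator itself responds to each. The xml_generator below is a simplified, hypothetical stand-in for lib_obj.xml_generator; whether Builder::XmlMarkup accepts a yielder as its :target is an assumption that would need checking.

```ruby
# Hypothetical stand-in for lib_obj.xml_generator: it only needs a target
# that responds to <<, like the original StreamingOutputWrapper.
def xml_generator(target, records)
  target << "<root><elementA>"
  records.each { |record| target << "<item>#{record}</item>" }
  target << "</elementA></root>"
end

# Enumerator::Yielder responds to <<, so it can be handed to the generator.
# Nothing runs until a consumer calls each; chunks are then produced one by one.
body = Enumerator.new do |yielder|
  xml_generator(yielder, %w[a b c])
end

# In the controller this would presumably be: self.response_body = body
chunks = []
body.each { |chunk| chunks << chunk }
```

Each chunk is handed to the consumer as soon as it is generated, so the full document never has to be built in memory by this code.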

Adding data of an unknown type to a hash

Ruby seems like a language that would be especially well suited to solving this problem, but I'm not finding an elegant way to do it. What I want is a method that will accept a value and add it to a hash like so, with specific requirements for how it is added if the key already exists:
Adding 'foo' to :key1
{:key1 => 'foo'}
Adding 'bar' to :key1
{:key1=> 'foobar'}
Adding ['foo'] to :key2
{:key2 => ['foo']}
Adding ['bar'] to :key2
{:key2 => [['foo'], ['bar']]}
Adding {:k1 => 'foo'} to :key3
{:key3 => {:k1 => 'foo'}}
Adding {:k2 => 'bar'} to :key3
{:key3 => {:k1 => 'foo', :k2 => 'bar'}}
Right now I can do this but it looks sloppy and not like idiomatic Ruby. What is a good way to do this?
To make it more Ruby-like you might want to extend the Hash class to provide this kind of functionality across the board, or make your own subclass for this specific purpose. For instance:
class FancyHash < Hash
  def add(key, value)
    case (self[key])
    when nil
      self[key] = value
    when String
      # 'foo' + 'bar' => 'foobar', as in the question's first example
      self[key] += value
    when Array
      self[key] = [ self[key], value ]
    when Hash
      self[key].merge!(value)
    else
      raise "Adding value to unsupported #{self[key].class} structure"
    end
  end
end
This will depend on your exact interpretation of what "adding" means, as your examples do seem somewhat simplistic and don't address what happens when you add a hash to a pre-existing array, among other things.
The idea is that you define a handler that accommodates as many possibilities as reasonable and throw an exception if you can't manage.
If you want to make use of polymorphism, you might want to do:
class Object; def add_value v; v end end
class String; def add_value v; self+v end end # or concat(v) for destructive
class Array; def add_value v; [self, v] end end # or replace([self.dup, v]) for destructive
class Hash; def add_value v; merge(v) end end # or merge!(v) for destructive
class Hash
def add k, v; self[k] = self[k].add_value(v) end
end

What's the most efficient way to deep copy an object in Ruby?

I know that serializing an object is (to my knowledge) the only way to effectively deep-copy an object (as long as it isn't stateful like IO and whatnot), but is one way particularly more efficient than another?
For example, since I'm using Rails, I could always use ActiveSupport::JSON, to_xml - and from what I can tell marshalling the object is one of the most accepted ways to do this. I'd expect that marshalling is probably the most efficient of these since it's a Ruby internal, but am I missing anything?
Edit: note that its implementation is something I already have covered - I don't want to replace existing shallow copy methods (like dup and clone), so I'll just end up likely adding Object::deep_copy, the result of which being whichever of the above methods (or any suggestions you have :) that has the least overhead.
I was wondering the same thing, so I benchmarked a few different techniques against each other. I was primarily concerned with Arrays and Hashes - I didn't test any complex objects. Perhaps unsurprisingly, a custom deep-clone implementation proved to be the fastest. If you are looking for quick and easy implementation, Marshal appears to be the way to go.
I also benchmarked an XML solution with Rails 3.0.7, not shown below. It was much, much slower, ~10 seconds for only 1000 iterations (the solutions below all ran 10,000 times for the benchmark).
Two notes regarding my JSON solution. First, I used the C variant, version 1.4.3. Second, it doesn't actually work 100%, as symbols will be converted to Strings.
This was all run with ruby 1.9.2p180.
#!/usr/bin/env ruby
require 'benchmark'
require 'yaml'
require 'json/ext'
require 'msgpack'
def dc1(value)
  Marshal.load(Marshal.dump(value))
end
def dc2(value)
  YAML.load(YAML.dump(value))
end
def dc3(value)
  JSON.load(JSON.dump(value))
end
def dc4(value)
  if value.is_a?(Hash)
    result = value.clone
    value.each{|k, v| result[k] = dc4(v)}
    result
  elsif value.is_a?(Array)
    result = value.clone
    result.clear
    value.each{|v| result << dc4(v)}
    result
  else
    value
  end
end
def dc5(value)
  MessagePack.unpack(value.to_msgpack)
end
value = {'a' => {:x => [1, [nil, 'b'], {'a' => 1}]}, 'b' => ['z']}
Benchmark.bm do |x|
  iterations = 10000
  x.report {iterations.times {dc1(value)}}
  x.report {iterations.times {dc2(value)}}
  x.report {iterations.times {dc3(value)}}
  x.report {iterations.times {dc4(value)}}
  x.report {iterations.times {dc5(value)}}
end
results in:
                 user     system      total        real
Marshal      0.230000   0.000000   0.230000 (  0.239257)
YAML         3.240000   0.030000   3.270000 (  3.262255)
JSON         0.590000   0.010000   0.600000 (  0.601693)
Custom       0.060000   0.000000   0.060000 (  0.067661)
MessagePack  0.090000   0.010000   0.100000 (  0.097705)
I think you need to add an initialize_copy method to the class you are copying. Then put the logic for the deep copy in there. Then when you call clone it will fire that method. I haven't done it but that's my understanding.
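A minimal sketch of that idea follows. The Basket class and its items attribute are invented for illustration. Ruby calls initialize_copy on the new object for both dup and clone, passing the original as the argument, so the nested state can be copied there.

```ruby
class Basket
  attr_reader :items

  def initialize(items = [])
    @items = items
  end

  # Called by both dup and clone on the new object, with the original as argument.
  def initialize_copy(source)
    super
    # Build a new array holding copies of the original strings.
    @items = source.items.map(&:dup)
  end
end

original = Basket.new(["apple", "pear"])
copy = original.clone
copy.items << "plum"      # grows only the copy's array
copy.items.first << "s"   # mutates only the copy's string
original.items            # ["apple", "pear"]
copy.items                # ["apples", "pear", "plum"]
```

This only goes one level deep; a structure with further nesting would need the same treatment at each level.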
I think plan B would be just overriding the clone method:
class CopyMe
  attr_accessor :var
  def initialize var=''
    @var = var
  end
  def clone deep= false
    deep ? CopyMe.new(@var.clone) : CopyMe.new()
  end
end
a = CopyMe.new("test")
puts "A: #{a.var}"
b = a.clone
puts "B: #{b.var}"
c = a.clone(true)
puts "C: #{c.var}"
Output
mike@sleepycat:~/projects$ ruby ~/Desktop/clone.rb
A: test
B:
C: test
I'm sure you could make that cooler with a little tinkering but for better or for worse that is probably how I would do it.
Probably the reason Ruby doesn't contain a deep clone has to do with the complexity of the problem. See the notes at the end.
To make a clone that deep copies Hashes, Arrays, and elemental values, i.e., makes a copy of each element in the original so that the copy has the same values but new objects, you can use this:
class Object
  def deepclone
    case
    when self.class==Hash
      hash = {}
      self.each { |k,v| hash[k] = v.deepclone }
      hash
    when self.class==Array
      array = []
      self.each { |v| array << v.deepclone }
      array
    else
      if defined?(self.class.new)
        self.class.new(self)
      else
        self
      end
    end
  end
end
If you want to redefine the behavior of Ruby's clone method, you can name it clone instead of deepclone (in 3 places), but I have no idea how redefining clone will affect Ruby libraries or Ruby on Rails, so caveat emptor. Personally, I can't recommend doing that.
For example:
a = {'a'=>'x','b'=>'y'} => {"a"=>"x", "b"=>"y"}
b = a.deepclone => {"a"=>"x", "b"=>"y"}
puts "#{a['a'].object_id} / #{b['a'].object_id}" => 15227640 / 15209520
If you want your classes to deepclone properly, their new method (initialize) must be able to deepclone an object of that class in the standard way, i.e., if the first parameter is given, it's assumed to be an object to be deepcloned.
Suppose we want a class M, for example. The first parameter must be an optional object of class M. Here we have a second optional argument z to pre-set the value of z in the new object.
class M
  attr_accessor :z
  def initialize(m=nil, z=nil)
    if m
      # deepclone all the variables in m to the new object
      @z = m.z.deepclone
    else
      # default all the variables in M
      @z = z # default is nil if not specified
    end
  end
end
The z pre-set is ignored during cloning here, but your method may have a different behavior. Objects of this class would be created like this:
# a new 'plain vanilla' object of M
m=M.new => #<M:0x0000000213fd88 @z=nil>
# a new object of M with m.z pre-set to 'g'
m=M.new(nil,'g') => #<M:0x00000002134ca8 @z="g">
# a deepclone of m in which the strings are the same value, but different objects
n=m.deepclone => #<M:0x00000002131d00 @z="g">
puts "#{m.z.object_id} / #{n.z.object_id}" => 17409660 / 17403500
Where objects of class M are part of a Hash:
a = {'a'=>M.new(nil,'g'),'b'=>'y'} => {"a"=>#<M:0x00000001f8bf78 @z="g">, "b"=>"y"}
b = a.deepclone => {"a"=>#<M:0x00000001766f28 @z="g">, "b"=>"y"}
puts "#{a['a'].object_id} / #{b['a'].object_id}" => 12303600 / 12269460
puts "#{a['b'].object_id} / #{b['b'].object_id}" => 16811400 / 17802280
Notes:
If deepclone tries to clone an object which doesn't clone itself in the standard way, it may fail.
If deepclone tries to clone an object which can clone itself in the standard way, and if it is a complex structure, it may (and probably will) make a shallow clone of itself.
deepclone doesn't deep copy the keys in the Hashes. The reason is that they are not usually treated as data, but if you change hash[k] to hash[k.deepclone] they will be deep copied as well.
Certain elemental values have no new method, such as Fixnum. These objects always have the same object ID, and are copied, not cloned.
Be careful because when you deep copy, two parts of your Hash or Array that contained the same object in the original will contain different objects in the deepclone.
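That last note can be made concrete with a short comparison. The recursive_copy helper below is a simplified, hypothetical stand-in written in the spirit of deepclone (it is not the deepclone method above). It is set against a Marshal round trip, which keeps references that were shared inside the dumped structure.

```ruby
shared = "seed"
original = [shared, shared] # both slots hold the very same String

# Simplified recursive copy: every String it meets becomes a new object.
def recursive_copy(value)
  case value
  when Hash   then value.each_with_object({}) { |(k, v), h| h[k] = recursive_copy(v) }
  when Array  then value.map { |v| recursive_copy(v) }
  when String then value.dup
  else value
  end
end

by_recursion = recursive_copy(original)
by_marshal   = Marshal.load(Marshal.dump(original))

by_recursion[0].equal?(by_recursion[1]) # false: the sharing is lost
by_marshal[0].equal?(by_marshal[1])     # true: the sharing is preserved
by_marshal[0].equal?(shared)            # false: still a fresh copy
```

If code later mutates one element and expects the other slot to change too, only the Marshal copy behaves like the original here.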
