I am learning in Ruby on Rails 4.0 that Rails has the ability to reference a hash's values, via a key that can either be a symbol or string, using the class HashWithIndifferentAccess. For example, the params hash can be referenced through both a symbol or a string, because it uses the class HashWithIndifferentAccess.
i.e.: params["id"] and params[:id] --> both access the id in the params hash
Although both are accepted by Rails, is one clearly preferred over the other, for either best-practice or performance reasons? My initial thought was that symbols would be better, because once a symbol is stored it keeps reusing that same piece of memory. Strings, by contrast, need a new piece of memory for every string.
Is this correct? Or does it truly not matter whether strings or symbols are used?
Ruby Strings are mutable, which can bring some unpredictability and reduced performance. For these reasons Ruby also offers Symbols. The big difference is that Symbols are immutable: a mutable object can be changed in place, while an immutable one can only be replaced by another object. Each symbol is stored only once, which saves memory and makes comparisons fast, but if you create symbols carelessly (for example, from arbitrary input) the memory footprint of your app will increase.
Choosing between Strings and Symbols comes down to understanding both and how each serves the overall health and performance of the app. This is probably why there is no strict rule for where to use a String and where to use a Symbol.
As rough guidance on what is recommended where:
Symbol - internal identifiers, variables not meant to change
String - variables that are changing or printed out
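The difference between the two can be seen in a short experiment. This is only a sketch in plain Ruby; the variable names are made up for illustration, and String.new is used so the result does not depend on any frozen-string-literal setting.

```ruby
# Two separately built strings with the same content are distinct objects.
a = String.new('open')
b = String.new('open')
same_content = (a == b)      # compares content
same_object  = a.equal?(b)   # compares identity

# A symbol with a given name always refers to the same object.
same_symbol = :open.equal?(:open)

# Strings can be changed in place; symbols are frozen and have no append method.
a << 'ed'
symbol_is_frozen  = :open.frozen?
symbol_has_append = :open.respond_to?(:<<)

puts same_content, same_object, same_symbol, a, symbol_is_frozen, symbol_has_append
```

The strings compare equal by content but are two objects, while the symbol is a single shared, immutable object.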
Internally, HashWithIndifferentAccess converts all symbol keys to strings.
h = ActiveSupport::HashWithIndifferentAccess.new(test: 'test')
If you retrieve the keys of the created hash (h), you get them back as strings:
h.keys # => ["test"]
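To make the key conversion concrete without loading Rails, here is a minimal plain-Ruby sketch of the idea. IndifferentHash is a hypothetical class written for this illustration; it is not ActiveSupport's implementation and covers only a few methods.

```ruby
# Hypothetical stand-in for illustration; NOT ActiveSupport's implementation.
class IndifferentHash < Hash
  def []=(key, value)
    super(normalize(key), value)
  end

  def [](key)
    super(normalize(key))
  end

  def key?(key)
    super(normalize(key))
  end

  private

  # Symbols are stored as strings; every other key is left untouched.
  def normalize(key)
    key.is_a?(Symbol) ? key.to_s : key
  end
end

h = IndifferentHash.new
h[:test] = 'test'
h['id'] = 42

p h.keys     # keys come back as strings
p h[:id]     # symbol lookup finds the string key
p h['test']  # string lookup finds the key that was set with a symbol
```

Because every symbol key is normalized on the way in and on lookup, both forms of access reach the same entry.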
A couple highlights:
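First, as of Ruby 2.3, frozen (immutable) string literals are available as an opt-in via a magic comment at the top of the file; identical frozen literals can then share a single object in memory. Making this the default was discussed for Ruby 3.0, but Ruby 3.0 shipped without that change, so the comment is still needed there. Here's an example: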
# frozen_string_literal: true
a = 'foo'
b = 'foo'
puts a.object_id
puts b.object_id
puts a.equal?(b)
Run the file:
➜ ruby test.rb
70186103229600
70186103229600
true
Second, symbols are generally preferred as hash keys (see this popular style guide).
If you're interested in learning more about how HashWithIndifferentAccess works, here's a good blog post. The short version is that all keys are converted to strings behind the scenes.
Related
How do you use a slash / in a Ruby symbol? I'm trying to use symbols to express file names instead of strings but I can't figure out how to reference two levels of a file's path using a symbol. For example, how would you express articles/show as a Ruby symbol?
:'articles/show'
Lots more info at http://www.troubleshooters.com/codecorn/ruby/symbols.htm
You can convert a string to a symbol using to_sym:
'a/b'.to_sym # => :"a/b"
However, just because you can do that doesn't mean you should. There are reasons we use symbols, such as memory savings and slightly faster lookups, but there are times when the savings don't outweigh the problems they introduce, such as when trying to work with them as filenames.
The question really seems like an "XY problem", which means you're asking about "Y" but really need to work on "X".
In normal scripts, we might need to open a handful of files, meaning only a handful of strings are required, and symbols will hardly save any space over the string versions. If you're reading a lot of files, you shouldn't define their names in your code; instead, store the names in a separate file and iterate over it, retrieving each file name and processing that file one by one.
The IO class doesn't expect symbols. Running:
puts File.foreach('test.txt'.to_sym).to_a
results in:
`foreach': no implicit conversion of Symbol into String (TypeError)
That's not a good sign, and means that, to use symbols instead of strings, you'd have to either reimplement all the IO methods or convert to strings on the fly.
It also means that convenience methods, such as join, won't work. Where normally we can do:
File.join('a', 'b') # => "a/b"
Passing in symbols results in:
File.join(:a, :b) # =>
# ~> -:2:in `join': no implicit conversion of Symbol into String (TypeError)
# ~> from -:2:in `<main>'
and using something like:
File.join(:a.to_s, :b.to_s).to_sym # => :"a/b"
seems like a real waste of typing and CPU time that will only compound the problem the further that it is used.
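If symbols do end up reaching file-handling code, one option is to convert them once at the boundary and keep the result as a string, rather than converting back and forth. The helper below is a hypothetical sketch; path_from is not a Ruby or Rails method.

```ruby
# Hypothetical helper: accepts symbols or strings, always returns a String path.
def path_from(*segments)
  File.join(*segments.map(&:to_s))
end

template = path_from(:articles, :show)
mixed    = path_from(:articles, 'show.html')

puts template
puts mixed
```

The result stays a String, so it can be passed straight to File and IO methods.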
I have a Door object that has a state attribute of type string. It can only be one of these elements: %w[open shut locked].
Are there any implications or reasons for using strings over symbols?
door.update_attributes(state: :open)
door.update_attributes(state: 'open')
In Rails 4 we can do this:
Door.order(created_at: :desc)
So why shouldn't I do this?
Door.where(state: :open) # vs state: 'open'
Are they equivalent for all intents and purposes? I prefer to use a symbol because it looks cleaner, and in the DB, a symbol will be a string anyway.
Your instincts are right, IMHO.
Symbols are more appropriate than strings to represent the elements of an enumerated type because they are immutable. While it's true that, unlike strings, symbols were never garbage collected before Ruby 2.2, there is only ever one instance of any given symbol, so the impact is minimal for most state transition applications. And although the performance difference is also minimal for most applications, symbol comparison is much quicker than string comparison.
See also Enums in Ruby
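As a rough illustration of treating the state as an enumerated value, the sketch below accepts either a symbol or a string and stores a string, on the assumption stated in the question that the value ends up as a string in the database. This Door is a hypothetical plain-Ruby class, not an ActiveRecord model.

```ruby
# Hypothetical plain-Ruby stand-in for the Door model (no ActiveRecord involved).
class Door
  STATES = %w[open shut locked].freeze

  attr_reader :state

  def initialize(state)
    self.state = state
  end

  # Accepts :open or 'open'; stores a string, as a string column would.
  def state=(value)
    normalized = value.to_s
    unless STATES.include?(normalized)
      raise ArgumentError, "unknown state: #{value.inspect}"
    end
    @state = normalized
  end
end

door = Door.new(:open)
puts door.state
door.state = 'locked'
puts door.state
```

Validating against a fixed list means a typo such as :opne fails immediately instead of being stored.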
The difference between using a symbol and a string is that a string is garbage collected once that specific object is no longer referred to by a variable or held in some collection (such as a Hash or an Array).
So strings that are no longer referenced will eventually be garbage collected, but symbols last for the life of the program (at least before Ruby 2.2, which began collecting dynamically created symbols).
If your key no longer references the 'open' string, that string becomes eligible for garbage collection. If the value was the symbol :open instead, it lingers in memory even after the key stops referencing it.
This can be a Very Bad Thing™
I'm working in a ruby app in which symbols are used in various places where one would usually use strings or enums in other languages (to specify configurations mostly).
So my question is, why should I not add a to_str method to symbol?
It seems sensible, as it allows implicit conversion between symbol and string. So I can do stuff like this without having to worry about calling :symbol.to_s:
File.join(:something, "something_else") # => "something/something_else"
The negative is the same as the positive: it implicitly converts symbols to strings, which can be REALLY confusing if it causes an obscure bug. But given how symbols are generally used, I'm not sure whether this is a valid concern.
Any thoughts?
When an object does respond_to? :to_str, you expect it to really act like a String. This means it should implement all of String's methods, so you could potentially break some code relying on this.
to_s means that you get a string representation of your object, which is why so many objects implement it, but the string you get is far from being 'semantically' equivalent to your object (a_hash.to_s is far from being a Hash). The absence of :symbol.to_str reflects this: a symbol is NOT and MUST NOT be confused with a string in Ruby, because they serve totally different purposes.
You wouldn't think about adding to_str to an Integer, right? Yet an Integer has a lot in common with a symbol: each one of them is unique. When you have a symbol, you expect it to be unique and immutable as well.
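The distinction between to_s and to_str can be tried out on a throwaway class instead of patching Symbol. PathSegment below is a hypothetical class invented for this sketch; the point is only that methods requiring a String call to_str implicitly, and that Symbol deliberately does not define it.

```ruby
# Hypothetical class used only to show how to_str differs from to_s.
class PathSegment
  def initialize(name)
    @name = name
  end

  # Explicit conversion: "give me a string representation".
  def to_s
    @name.to_s
  end

  # Implicit conversion: "treat me as a String wherever one is required".
  def to_str
    to_s
  end
end

segment = PathSegment.new(:something)
joined  = File.join(segment, 'something_else')  # to_str is called for us
concat  = 'prefix-' + segment                   # String#+ also uses to_str

# Symbol defines to_s but not to_str, so the same kind of call raises TypeError.
symbol_error =
  begin
    'prefix-' + :something
  rescue TypeError => e
    e.message
  end

puts joined, concat, symbol_error
```

Defining to_str is therefore a promise that the object can stand in for a String everywhere, which is a much stronger claim than providing to_s.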
You don't have to implicitly convert it right? Because doing something like this will automatically coerce it to a string.
"#{:something}/something_else" # "something/something_else"
The negative is what you say--at one point, anyway, some core Ruby had different behavior based on symbol/string. I don't know if that's still the case. The threat alone makes me a little twitchy, but I don't have a solid technical reason at this point. I guess the thought of making a symbol more string-like just makes me nervous.
I need to be able to add days or hours to a previously created Item.
The system can add or subtract a set number of hours or days based on attributes stored in the db, :operator (add/subtract), :unit_of_time(hours/days), and :number.
I'd like to be able to do something like:
Date.today+2.days
where "+" is the :operator, "2" is the :number, and "days" is the :unit_of_time but I'm unsure of how to get the interpolated string of attributes to become the actual operator "+2.days". Any ideas?
(I've pored over the Ruby documentation, but to no avail. Currently, I'm just manually handling each of the possible options (4) in nested if/else blocks... yeah, it's gross.)
You could use eval, e.g.:
eval("Date.today+2.days")
...and then simply use string interpolation to put in the variables. Note, however, that you should only do this if you can be very certain that the values in your database are always what you want them to be; under no circumstances should users be able to change them, otherwise you'll have a major security issue which compromises your entire system.
Using lengthier approaches like the if statement you suggested (or a case statement) requires you to write more code, but they are much more secure.
All Ruby objects have a send method which can be used to run a method by name with parameters. Using send, together with converting strings to integers (.to_i), should suffice. Note that methods such as months and days on integers come from Rails' ActiveSupport, not from plain Ruby.
Date.today.send '+', '1'.to_i.send('months') # This works
Date.today.send operator, number.to_i.send(unit) # Generalized form
Ruby is beautiful!
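Since send will call whatever method name it is given, a middle ground between eval and nested if/else is to look the stored values up in a fixed table before dispatching. The sketch below is plain Ruby without ActiveSupport, so it works in seconds on a Time; the table names, the stored values 'add'/'subtract'/'hours'/'days', and shift_time are all assumptions made for illustration. Treating a day as 86,400 seconds ignores daylight saving changes, which is why the example uses UTC.

```ruby
# Hypothetical lookup tables: only these operators and units are accepted.
OPERATORS = { 'add' => :+, 'subtract' => :- }.freeze
SECONDS_PER_UNIT = { 'hours' => 3600, 'days' => 86_400 }.freeze

# Shifts a Time by number * unit using plain Time arithmetic (seconds).
def shift_time(time, operator, number, unit_of_time)
  method_name = OPERATORS.fetch(operator) do
    raise ArgumentError, "unknown operator: #{operator.inspect}"
  end
  seconds = SECONDS_PER_UNIT.fetch(unit_of_time) do
    raise ArgumentError, "unknown unit: #{unit_of_time.inspect}"
  end
  # public_send only ever receives :+ or :- here, never raw database input.
  time.public_send(method_name, Integer(number) * seconds)
end

start = Time.utc(2020, 1, 1, 12, 0, 0)
puts shift_time(start, 'add', 2, 'days')
puts shift_time(start, 'subtract', '3', 'hours')
```

Anything outside the two tables raises ArgumentError, so an unexpected value in the database cannot turn into an arbitrary method call.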
I am able to understand immutability in Python (it's surprisingly simple too). Let's say I assign a number to x:
x = 42
print(id(x))
print(id(42))
On both counts, the value I get is
505494448
My question is, does the Python interpreter allot ids to all the numbers, letters, and True/False in memory before the environment loads? If it doesn't, how are the ids kept track of? Or am I looking at this in the wrong way? Can someone explain it please?
What you're seeing is an implementation detail (an internal optimization) called interning. This is a technique (used by implementations of a number of languages including Java and Lua) which aliases names or variables to be references to single object instances where that's possible or feasible.
You should not depend on this behavior. It's not part of the language's formal specification and there are no guarantees that separate literal references to a string or integer will be interned nor that a given set of operations (string or numeric) yielding a given object will be interned against otherwise identical objects.
The CPython implementation does pre-allocate a range of small integers (-5 through 256 in current versions) as shared immutable objects. I suspect that other very high level language run-time libraries are likely to include similar optimizations: small integers are used very frequently by most non-trivial fragments of code.
In terms of how such things are implemented ... for strings and larger integers it would make sense for Python to maintain these as dictionaries. Thus any expression yielding an integer (and perhaps even floats) and strings (at least sufficiently short strings) would be hashed, looked up in the appropriate (internal) object dictionary, added if necessary and then returned as references to the resulting object.
You can do your own similar interning of any sort of custom object you like by wrapping the instantiation in calls to your own class-level static dictionary.
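The same idea, sketched in Ruby: a class keeps a class-level Hash of the instances it has already built and hands back the existing one when asked again. Color and its intern method are hypothetical names chosen for this example, and the sketch is not thread-safe.

```ruby
# Hypothetical value class that interns its instances in a class-level Hash.
class Color
  @instances = {}

  class << self
    # Returns the one shared, frozen instance for a given name.
    def intern(name)
      key = name.to_s
      @instances[key] ||= new(key).freeze
    end
  end

  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Force callers through Color.intern so duplicates cannot be created.
  private_class_method :new
end

red_a = Color.intern('red')
red_b = Color.intern(:red)
blue  = Color.intern('blue')

puts red_a.equal?(red_b)  # same object for the same name
puts red_a.equal?(blue)   # different names give different objects
```

Freezing the shared instances matters here: once several callers hold the same object, a mutation by one would be visible to all of them.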