Difference between string and text in rails? - ruby-on-rails

I'm making a new web app using Rails, and was wondering, what's the difference between string and text? And when should each be used?

The difference lies in how the symbol is converted into its respective column type in the database's query language.
With MySQL, :string is mapped to VARCHAR(255).
https://edgeguides.rubyonrails.org/active_record_migrations.html
:string | VARCHAR | :limit => 1 to 255 (default = 255)
:text | TINYTEXT, TEXT, MEDIUMTEXT, or LONGTEXT | :limit => 1 to 4294967296 (default = 65536)
Reference:
https://hub.packtpub.com/working-rails-activerecord-migrations-models-scaffolding-and-database-completion/
When should each be used?
As a general rule of thumb, use :string for short text input (username, email, password, titles, etc.) and use :text for longer expected input such as descriptions, comment content, etc.

If you are using Postgres, use text wherever you can, unless you have a size constraint, since there is no performance penalty for text vs. varchar:
There is no performance difference among these three types, apart from increased storage space when using the blank-padded type, and a few extra CPU cycles to check the length when storing into a length-constrained column. While character(n) has performance advantages in some other database systems, there is no such advantage in PostgreSQL; in fact character(n) is usually the slowest of the three because of its additional storage costs. In most situations text or character varying should be used instead
PostgreSQL manual

String translates to "varchar" in your database, while text translates to "text". A varchar can hold far fewer characters; a text can be of (almost) any length.
For an in-depth analysis with good references check http://www.pythian.com/news/7129/text-vs-varchar/
Edit: Some database engines can load varchar in one go, but store text (and blob) outside of the table. A SELECT name, amount FROM products could be a lot slower when using text for name than when using varchar. And since Rails by default loads records with SELECT * FROM..., your text columns will be loaded too. This will probably never be a real problem in your app or mine, though (premature optimization is ...). But it is good to know that text is not always "free".

Use string if the value is short and bounded in length, and text if it is variable and potentially large.
This matters because a text column can hold far more data than a string column (see the size limits below).
So for small fields, always use string (varchar): fields like first_name, login, email, subject (of an article or post).
Examples of text fields: the content/body of a post or article, fields for paragraphs, etc.
String size: 1 to 255 (default = 255)
Text size: 1 to 4294967296 (default = 65536)

As explained above, the choice affects not just the database datatype but also the view that is generated if you are scaffolding:
string will generate a text_field; text will generate a text_area.

Use string for shorter fields, like names, address, phone, company.
Use text for larger content: comments, content, paragraphs.
My general rule: if it's something that is more than one line, I typically go for text; if it's a short 2-6 words, I go for string.
The default limit for a string is 255 characters. So, if your string is more than 255 characters, go for text.

The accepted answer is awesome; it properly explains the difference between string and text (mostly the size limit in the database, but there are a few other gotchas). I wanted to point out a small issue I ran into, since that answer didn't completely work for me.
The max size :limit => 1 to 4294967296 didn't work exactly as stated; I needed to go one below that maximum. I'm storing large JSON blobs and they can occasionally be huge.
Here's my migration with the largest value that MySQL doesn't complain about.
Note the 5 at the end of the limit instead of 6.
class ChangeUserSyncRecordDetailsToText < ActiveRecord::Migration[5.1]
  def up
    change_column :user_sync_records, :details, :text, :limit => 4294967295
  end

  def down
    change_column :user_sync_records, :details, :string, :limit => 1000
  end
end
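The off-by-one is consistent with MySQL's documented LONGTEXT maximum of 4,294,967,295 bytes (2^32 - 1). As a rough plain-Ruby sketch of how a byte limit could map onto the TEXT family (the helper name mysql_text_type_for is hypothetical, and the boundaries are MySQL's documented maximum sizes rather than anything copied from the adapter source):

```ruby
# Hypothetical helper: pick the smallest MySQL TEXT type that can hold
# `limit` bytes. Boundaries are MySQL's documented maximum sizes.
def mysql_text_type_for(limit)
  case limit
  when 1..255                    then "TINYTEXT"   # 2**8  - 1 bytes
  when 256..65_535               then "TEXT"       # 2**16 - 1 bytes
  when 65_536..16_777_215        then "MEDIUMTEXT" # 2**24 - 1 bytes
  when 16_777_216..4_294_967_295 then "LONGTEXT"   # 2**32 - 1 bytes
  else
    raise ArgumentError, "no TEXT type can hold #{limit} bytes"
  end
end

puts mysql_text_type_for(65_535)
puts mysql_text_type_for(4_294_967_295)
```

Under this mapping, 4294967296 falls outside every range, which matches the behaviour described above.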

If you are using Oracle, STRING will be created as a VARCHAR2(255) column and TEXT as a CLOB.
NATIVE_DATABASE_TYPES = {
  primary_key: "NUMBER(38) NOT NULL PRIMARY KEY",
  string: { name: "VARCHAR2", limit: 255 },
  text: { name: "CLOB" },
  ntext: { name: "NCLOB" },
  integer: { name: "NUMBER", limit: 38 },
  float: { name: "BINARY_FLOAT" },
  decimal: { name: "DECIMAL" },
  datetime: { name: "TIMESTAMP" },
  timestamp: { name: "TIMESTAMP" },
  timestamptz: { name: "TIMESTAMP WITH TIME ZONE" },
  timestampltz: { name: "TIMESTAMP WITH LOCAL TIME ZONE" },
  time: { name: "TIMESTAMP" },
  date: { name: "DATE" },
  binary: { name: "BLOB" },
  boolean: { name: "NUMBER", limit: 1 },
  raw: { name: "RAW", limit: 2000 },
  bigint: { name: "NUMBER", limit: 19 }
}
https://github.com/rsim/oracle-enhanced/blob/master/lib/active_record/connection_adapters/oracle_enhanced_adapter.rb

If the attribute corresponds to f.text_field in the form, use string; if it corresponds to f.text_area, use text.

Related

Rails- Mongoid query to filter objects based on length of a field

I want to filter a collection based on the length of a field.
Example: for the collection Band, I would want the objects where the length of the band's name is equal to 10.
There are two ways I can think to do this. In these examples, let's pretend I have the following model:
class Band
  include Mongoid::Document
  field :name, type: String
end
Aggregation
If you're using MongoDB server version 3.6 or newer, you can use the $expr operator to include aggregation operations in your query. In this example, I'm using the $strLenCP operator to find any documents where the name field has 5 Unicode code points:
Band.where("$expr": { "$eq": [ { "$strLenCP": "$name" }, 5 ] })
Regular Expressions
You could also use a Ruby regular expression that matches any five-character string:
Band.where(name: /\A.{5}\z/)
I suspect aggregation will be more performant, but it can't hurt to know a few ways of doing something.
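To adapt the regular-expression approach to the length of 10 asked about in the question, the pattern can be built from a number. This is a plain-Ruby sketch on an in-memory list (the helper name exact_length_pattern and the sample names are made up for illustration); with Mongoid the resulting Regexp would presumably be passed as in the Band.where(name: ...) call above.

```ruby
# Build an anchored pattern matching strings of exactly `length` characters.
# Note: `.` does not match a newline unless the /m flag is added.
def exact_length_pattern(length)
  /\A.{#{Integer(length)}}\z/
end

names = ["Pink Floyd", "Queen", "The Beatles", "Radiohead!"]

# Array#grep keeps the elements that match the pattern.
ten_chars = names.grep(exact_length_pattern(10))
p ten_chars
```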

Is it possible to use a custom equality operator with Ruby's Set?

I need to diff collections of child objects between 2 parents. Each collection has about 30,000 objects, and each object has about a dozen attributes. Ruby's Set class provides a fast method to subtract one collection from the other and get the difference. I had been doing this with JSON data, and the whole thing only took a couple of seconds.
Now I'm using ActiveRecord to get the datasets. Of course, once the children are unmarshalled from the database, they include attributes :id, :created_at, and :updated_at. Unfortunately, this automatically ruins the comparisons in the diff, because these fields will always be different, and cause the comparison to fail.
Out of the set of attributes, I really only care about :label and :data. That is, I want to compare the objects with the same label between the 2 sets, and see if their data is different.
I can add a custom equivalency operator in my class:
def ==(other)
self.label == other.label && self.data == other.data
end
This works between comparisons of single objects. They are considered equal if (just) their labels and data match. However, this override does not seem to be getting used in this operation, for purposes of determining equivalency:
@diff = (@left.to_set - @right.to_set)
I was hoping that Set would use the object's class' overridden == operator, but this doesn't seem to be the case. My diffs are just all of the one side or the other, depending on the order of the difference. Is there any way to make this happen? (I already also tried overriding .eql?.)
Since this is too long for a comment, here's the SQL implementation of the idea.
WITH
t1 AS (SELECT * FROM tunings WHERE calibration_id = 7960),
t2 AS (SELECT * FROM tunings WHERE calibration_id = 7965)
SELECT t1.label, t1."data", t2."data" FROM t1 FULL OUTER JOIN t2 ON t1.label = t2.label
WHERE t1."data" != t2."data" OR t1."data" IS NULL OR t2."data" IS NULL
Another speed problem I hadn't even brought up yet was that I have to LOOK UP the "right" value, from the corresponding set, when I display the differences in the view, and THAT takes ANOTHER 10 seconds. This is all done in one step.
Because of the CTEs, I'm guessing that I won't be able to put this into ActiveRecord semantics, and I'll just have to pass the raw SQL with seeded values, but I would love to be proven wrong.
Also, I'm still academically interested in the original question.
According to Ruby Set class: equality of sets, you need to override both Object#eql? and Object#hash
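A minimal plain-Ruby sketch of that advice, using a hypothetical Tuning class as a stand-in for the ActiveRecord model (the class name is borrowed from the tunings table in the SQL above; the constructor and attributes are assumptions). Set is backed by a Hash, so it consults hash and eql? rather than ==:

```ruby
require "set"

# Stand-in for the model; only label and data take part in equality.
class Tuning
  attr_reader :id, :label, :data

  def initialize(id, label, data)
    @id = id
    @label = label
    @data = data
  end

  # Set and Hash call eql? and hash, not ==.
  def eql?(other)
    other.is_a?(Tuning) && label == other.label && data == other.data
  end
  alias == eql?

  # Objects that are eql? must return the same hash value.
  def hash
    [label, data].hash
  end
end

left  = [Tuning.new(1, "foo", "1"), Tuning.new(2, "bar", "2")]
right = [Tuning.new(7, "foo", "1"), Tuning.new(8, "bar", "3")]

# ids differ everywhere, but only the "bar" entries differ in data.
diff = left.to_set - right.to_set
p diff.map(&:id)
```

On a real model this changes how records behave as Hash keys everywhere else too, so it may be safer to apply it to a small wrapper object than to the model itself.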
Here's how you can do it in general Ruby, without having to redefine your classes' identity.
first = [{ id: 1, label: "foo", data: "foo"},
{ id: 2, label: "bar", data: "bar"},
{ id: 3, label: "baz", data: "baz"}]
second = [{ id: 1, label: "foo", data: "foo"},
{ id: 2, label: "baz", data: "baz"},
{ id: 3, label: "quux", data: "quux"}]
first_groups = first.group_by { |e| e.values_at(:label, :data) }
second_groups = second.group_by { |e| e.values_at(:label, :data) }
first_minus_second_keys = first_groups.keys.to_set - second_groups.keys.to_set
first_minus_second = first_minus_second_keys.flat_map { |k| first_groups[k] }
(This is for lists of hashes; for AR classes you'd replace e.values_at(:label, :data) with [e.label, e.data])
That said, I agree with the Tin Man: it would be way more performant to do this at the database level.

Rails Amounts in Thousands Are Truncated

In my Rails 5 app, I read in a feed for products. In the JSON, when the price is over $1,000, the value contains a comma, like 1,000.
My code seems to be truncating it, so it's storing as 1 instead of 1,000.
All other fields are storing correctly. Can someone please tell me what I'm doing wrong?
In this example, the reg_price saves as 2, instead of 2590.
json sample (for reg_price field):
[
{
"reg_price": "2,590"
}
]
schema
create_table "products", force: :cascade do |t|
t.decimal "reg_price", precision: 10, scale: 2
end
model
response = open_url(url_string).to_s
products = JSON.parse(response)
products.each do |item|
  product = Product.new(
    reg_price: item['reg_price']
  )
  product.save
end
You are not doing anything wrong. Decimal columns don't accept a comma as a thousands separator. I'm not sure there is a nice way to fix this, but as an option you could override the attribute writer:
def reg_price=(reg_price)
  self[:reg_price] = reg_price.gsub(',', '')
end
The reason this is happening has nothing to do with Rails.
JSON is a pretty simple document structure and doesn't have any support for number separators. The values in your JSON document are strings.
When you receive a String as input and you want to store it as an Integer, you need to cast it to the appropriate type.
Ruby has built-in support for this, and Rails is using it: "1".to_i #=> 1
The particular heuristic Ruby uses to convert a string to an integer is to take any number up to a non-numerical character and cast it as an integer. Commas are non-numeric, at least by default, in Ruby.
The solution is to convert the string value in your JSON to an integer using another method. You can do this in any of these ways:
1. Cast the string to an integer before sending it to your ActiveRecord model.
2. Alter the string in such a way that the default Ruby casting will cast the string into the expected value.
3. Use a custom caster to handle the casting for this particular attribute (inside of ActiveRecord and ActiveModel).
The solution proposed by @Danil follows #2 above, and it has some shortcomings (as @tadman pointed out).
A more robust way of handling this without getting down in the mud is to use a library like Delocalize, which will automatically handle numeric string parsing and casting with consideration for separators used by the active locale. See this excellent answer by Benoit Garret for more information.
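For the first option in the list above (casting before the value reaches the model), here is a plain-Ruby sketch. The helper name parse_price is hypothetical, and it assumes commas are always thousands separators, which holds for the feed shown but not for every locale:

```ruby
require "bigdecimal"

# Hypothetical helper: strip thousands separators (assumed to be commas)
# and parse the remainder strictly; BigDecimal() raises ArgumentError on
# anything that is not a clean number.
def parse_price(raw)
  BigDecimal(raw.to_s.delete(",").strip)
end

# Default casting stops at the first non-numeric character.
puts "2,590".to_i

# Cleaning first preserves the whole amount.
puts parse_price("2,590").to_s("F")
```

The result could then be passed as reg_price when building the Product, so the model never sees the comma.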

How can I return the highest "valued" element -- per "name" -- in an Array?

I've read a lot of posts about finding the highest-valued objects in arrays using max and max_by, but my situation is another level deeper, and I can't find any references on how to do it.
I have an experimental Rails app in which I am attempting to convert a legacy .NET/SQL application. The (simplified) model looks like Overlay -> Calibration <- Parameter. In a single data set, I will have, say, 20K Calibrations, but about 3,000-4,000 of these are versioned duplicates by Parameter name, and I need only the highest-versioned Parameter by each name. Further complicating matters is that the version lives on the Overlay. (I know this seems crazy, but this models our reality.)
In pure SQL, we add the following to a query to create a virtual table:
n = ROW_NUMBER() OVER (PARTITION BY Parameters.Designation ORDER BY Overlays.Version DESC)
And then select the entries where n = 1.
I can order the array like this:
ordered_calibrations = mainline_calibrations.sort do |e, f|
  [f.parameter.Designation, f.overlay.Version] <=> [e.parameter.Designation, e.overlay.Version] || 1
end
I get this kind of result:
C_SCR_trc_NH3SensCln_SCRT1_Thd 160
C_SCR_trc_NH3SensCln_SCRT1_Thd 87
C_SCR_trc_NH3Sen_DewPtHiThd_Tbl 310
C_SCR_trc_NH3Sen_DewPtHiThd_Tbl 160
C_SCR_trc_NH3Sen_DewPtHiThd_Tbl 87
So I'm wondering if there is a way, using Ruby's Enumerable built-in methods, to loop over the sorted array, and only return the highest-versioned elements per name. HUGE bonus points if I could feed an integer to this method's block, and only return the highest-versioned elements UP TO that version number ("160" would return just the second and fourth entries, above).
The alternative to this is that I could somehow implement the ROW_NUMBER() OVER in ActiveRecord, but that seems much more difficult to try. And, of course, I could write code to deal with this, but I'm quite certain it would be orders of magnitude slower than figuring out the right Enumerable function, if it exists.
(Also, to be clear, it's trivial to do .find_by_sql() and create the same result set as in the legacy application -- it's even fast -- but I'm trying to drag all the related objects along for the ride, which you really can't do with that method.)
I'm not convinced that doing this in the database isn't a better option, but since I'm unfamiliar with SQL Server I'll give you a Ruby answer.
I'm assuming that when you say "Parameter name" you're talking about the Parameters.Designation column, since that's the one in your examples.
One straightforward way you can do this is with Enumerable#slice_when, which is available in Ruby 2.2+. slice_when is good when you want to slice an array "between" values that are different in some way. For example:
[ { id: 1, name: "foo" }, { id: 2, name: "foo" }, { id: 3, name: "bar" } ]
.slice_when {|a,b| a[:name] != b[:name] }
# => [ [ { id: 1, name: "foo" }, { id: 2, name: "foo" } ],
# [ { id: 3, name: "bar" } ]
# ]
You've already sorted your collection, so to slice it you just need to do this:
calibrations_by_designation = ordered_calibrations.slice_when do |a, b|
  a.parameter.Designation != b.parameter.Designation
end
Now calibrations_by_designation is an array of arrays, each of which is sorted from greatest Overlay.Version to least. The final step, then, is to get the first element in each of those arrays:
highest_version_calibrations = calibrations_by_designation.map(&:first)
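The same result, including the "up to version N" bonus from the question, can also be sketched with group_by and max_by. This is a different technique from the sort plus slice_when approach above, so no prior sorting is needed. It is shown on plain Struct stand-ins; Row, designation, version, and highest_per_designation are hypothetical names, not part of the original models.

```ruby
# Stand-in for a Calibration joined to its Parameter and Overlay.
Row = Struct.new(:designation, :version)

rows = [
  Row.new("C_SCR_trc_NH3SensCln_SCRT1_Thd", 160),
  Row.new("C_SCR_trc_NH3SensCln_SCRT1_Thd", 87),
  Row.new("C_SCR_trc_NH3Sen_DewPtHiThd_Tbl", 310),
  Row.new("C_SCR_trc_NH3Sen_DewPtHiThd_Tbl", 160),
  Row.new("C_SCR_trc_NH3Sen_DewPtHiThd_Tbl", 87)
]

# Highest-versioned row per designation, optionally capped at max_version.
def highest_per_designation(rows, max_version: nil)
  eligible = max_version ? rows.select { |r| r.version <= max_version } : rows
  eligible.group_by(&:designation).map { |_name, group| group.max_by(&:version) }
end

p highest_per_designation(rows).map(&:version)
p highest_per_designation(rows, max_version: 160).map(&:version)
```

With real records, the block arguments would presumably read parameter.Designation and overlay.Version instead of the Struct fields.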

Rails 4 - How to store an array on postgres

I need to save this array:
params[:products]
The array contains this information:
[{"'name'"=>"Product Name 1 ", "'id'"=>"2", "'quantity'"=>"2", "'accessory'"=>{"'id'"=>"8", "'name'"=>"Accessory Name 1"}}, {"'name'"=>"Product Name 2 ", "'id'"=>"5", "'quantity'"=>"1", "'accessory'"=>{"'id'"=>"40", "'name'"=>"Accessory Name 2"}}]
As you can see, accessory is another nested structure (a hash) inside each product.
The process is this: a front-end developer is giving me that array, and I want to store all the data on the order.rb model.
So, I have a couple of questions:
Do I need an array-type field in the database?
Which fields do I need?
I was looking for some examples and I've been trying this on my order model:
serialize :product
order = Order.new
order.product = [:product]
order.save
order.product
I read about this method too: http://api.rubyonrails.org/classes/ActiveRecord/Store.html
Maybe this is a basic question but I really don't know how to solve it. As you can see, I don't have code in any controller because I really don't know what I need to write.
Thank you for your help.
Besides hstore, another solution would be JSON; specifically, I suggest you use the PostgreSQL type "jsonb" because it's more efficient. See the documentation:
There are two JSON data types: json and jsonb. They accept almost identical sets of values as input. The major practical difference is one of efficiency. The json data type stores an exact copy of the input text, which processing functions must reparse on each execution; while jsonb data is stored in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing is needed. jsonb also supports indexing, which can be a significant advantage.
(more info here https://www.postgresql.org/docs/current/static/datatype-json.html )
So you have, similarly to hstore, a data structure that you can execute queries against, and these queries are quite fast, as you can read above. This is a significant advantage over other strategies, e.g. serializing a Ruby hash and saving it directly in the DB.
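Independent of Rails, the nested array-of-hashes shape from the question maps directly onto JSON. The sketch below uses only the standard json library and a shortened copy of the payload; it assumes the stray quotes inside the key names ("'name'") are cleaned up before storage. It only illustrates the round trip, not the column setup.

```ruby
require "json"

# Shortened payload from the question, with plain keys (assumption: the
# quotes embedded in the key names are removed upstream).
products = [
  { "name" => "Product Name 1", "id" => "2", "quantity" => "2",
    "accessory" => { "id" => "8", "name" => "Accessory Name 1" } }
]

json = JSON.generate(products) # text that a json/jsonb column would receive
restored = JSON.parse(json)    # back to Ruby arrays and hashes

puts restored.first["accessory"]["name"]
```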
Charles,
I suggest you consider using PostgreSQL's hstore type. There are a few benefits to using it (performance, indexing of objects, etc.).
enable_extension 'hstore'
This enables hstore support in your PostgreSQL database.
Your migration is going to look like this:
class AddRecommendationsToPages < ActiveRecord::Migration
  def change
    add_column :pages, :recommendations, :hstore
  end
end
After that you can store whatever key/value pairs you want in your hstore (keys and values are stored as strings):
irb(main):020:0> Page.last.recommendations
Page Load (0.8ms) SELECT "pages".* FROM "pages" ORDER BY "pages"."id" DESC LIMIT 1
=> {"products"=>"34,32"}
irb(main):021:0> Page
=> Page(id: integer, block: string, name: string, slug: string, title: string, h1: string, annotation: text, article: text, created_at: datetime, updated_at: datetime, position: integer, parent: integer, show: boolean, recommendations: hstore)
irb(main):022:0> Page.last.recommendations["products"]
Page Load (0.6ms) SELECT "pages".* FROM "pages" ORDER BY "pages"."id" DESC LIMIT 1
=> "34,32"
irb(main):023:0>
