Validate uniqueness in ActiveRecord on part of a string - ruby-on-rails

Let's say I have a registration_id attribute on a Dummy model.
Its data type is string.
Its length can be anywhere from 10 to 14 characters, but I want a uniqueness validation on the last 10 characters only (weird but true).
How can I achieve this?
What I have thought of so far:
Create another attribute, last_ten_chars_registration_id, in the Dummy table to hold the last 10 characters, and put the uniqueness validation on that attribute.
(Computed attributes apparently don't work for uniqueness validations.)
Create a custom validator and write a query.
I am not sure which query to use (maybe a LIKE query).
Can anyone suggest a better way to achieve this?

You can use a custom validator like this:
class Dummy < ActiveRecord::Base
  validates_with RegistrationValidator, :fields => [:registration_id]
  # Whatever else...
end
class RegistrationValidator < ActiveModel::Validator
  def validate(record)
    # Compare only the last 10 characters (String#last comes from ActiveSupport).
    reg_id = record.registration_id.to_s.last(10)
    # Exclude the record itself so that updates do not collide with their own row.
    taken = Dummy.where('registration_id LIKE ?', "%#{reg_id}").where.not(id: record.id).exists?
    record.errors.add(:registration_id, "Registration ID taken!") if taken
  end
end

Like GoGoCarl, I think it largely depends on the performance you require. A custom query (using not LIKE but the RIGHT(registration_id, 10) function, at least in MySQL) will probably do fine unless the Dummy table is huge or you need the query to be very fast. In that case, I too would add the special column holding the last 10 characters, with an accompanying database index.
A pragmatic solution might be to go the custom-query route first, as it seems simpler to implement, and switch to the special column later if performance starts to suffer.
You could also run your own benchmarks, e.g. populate a test Dummy table with the number of records you expect and see whether the performance is acceptable.
See also this related SO question on the performance of SQL string functions, where the solutions are similar.
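Both routes rely on the same derived key: the last 10 characters of registration_id. A minimal plain-Ruby sketch of that key, with an in-memory stand-in for the uniqueness check, might look like the following. The names registration_suffix and RegistrationRegistry are hypothetical; in a Rails app the key would more likely be written to the extra column in a before_validation callback and enforced with a unique index.

```ruby
require 'set'

SUFFIX_LENGTH = 10

# Hypothetical helper: returns the last 10 characters of an ID
# (or the whole string when it is shorter than that).
def registration_suffix(registration_id)
  registration_id.to_s.chars.last(SUFFIX_LENGTH).join
end

# Hypothetical in-memory stand-in for a uniquely indexed suffix column.
class RegistrationRegistry
  def initialize
    @suffixes = Set.new
  end

  # Returns true and records the suffix if it is new, false if it is taken.
  def register(registration_id)
    !@suffixes.add?(registration_suffix(registration_id)).nil?
  end
end

registry = RegistrationRegistry.new
registry.register('AB1234567890')   # accepted, suffix "1234567890"
registry.register('XYZ1234567890')  # rejected, same last 10 characters
```

Keeping the suffix logic in one helper means the validator, the callback and any backfill script all agree on what "last 10 characters" means.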

Related

More efficient, rails way to check for any of three fields being unique?

So, I need to check three fields for uniqueness before creating an object (from a form), but I will create the object as long as any of the three fields is unique.
My first thought was to just pass the params from the controller to the model, and then run a query to check whether a query with those three fields returns > 0 documents. However, I've since learned that this is a dangerous approach and should not be used.
So I checked the docs, and based on this snippet:
Or even multiple scope parameters. For example, making sure that a teacher can only be on the schedule once per semester for a particular class.
class TeacherSchedule < ActiveRecord::Base
  validates_uniqueness_of :teacher_id, scope: [:semester_id, :class_id]
end
I thought I had found my answer, and implemented:
validates_uniqueness_of :link_to_event, :scope => [:name_of_event, :date_of_event]
which works! But this dataset is going to get very large (not from this form alone, lol), and I'm under the impression that with this implementation, Rails is going to query for all records with a matching link_to_event, then all records with a matching name_of_event, and then all records with a matching date_of_event. So, my questions are:
A) Am I wrong about how rails will implement this? Is it going to be more efficient out of the box?
B) If this will not be efficient for a table with a couple million entries, is there a better (and still railsy) way to do this?
You can define a method that queries for records matching all the fields that you want to be unique as a group:
validate :uniqueness_of_teacher_semester_and_class
def uniqueness_of_teacher_semester_and_class
  matches = self.class.where(teacher_id: teacher_id, semester_id: semester_id, class_id: class_id)
  errors.add :base, 'Record not unique.' if matches.exists?
end
To answer your questions:
A) Am I wrong about how rails will implement this? Is it going to be more efficient out of the box?
I think Rails will run a single query that matches on all 3 fields at once; check the Mongo (or Rails) log to be sure.
B) If this will not be efficient for a table with a couple million entries, is there a better (and still railsy) way to do this?
This is the Rails way. There are 2 things you can do to make it efficient:
Add indexes on all 3 fields, or a compound index on the 3 fields. The compound index *might* be faster, but you can benchmark to find out.
Add a new field holding the 3 fields concatenated, with an index on it. But this will take up extra space and may not be faster than the compound index.
These days a couple million documents is not that much, but it depends on document size and hardware.
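To make the answer to A concrete: a scoped uniqueness check treats the three fields as one combined key, so a record is rejected only when all three values match an existing record together. The plain-Ruby sketch below imitates that, with an in-memory Set standing in for a compound index. The Event struct and EventIndex class are hypothetical names used only for illustration.

```ruby
require 'set'

# Hypothetical record with the three fields from the question.
Event = Struct.new(:link_to_event, :name_of_event, :date_of_event)

# Hypothetical in-memory stand-in for a compound index on all three fields.
class EventIndex
  def initialize
    @keys = Set.new
  end

  # The combined key is the tuple of all three values.
  def key_for(event)
    [event.link_to_event, event.name_of_event, event.date_of_event]
  end

  # Returns true if the event was added, false if the exact tuple already exists.
  def insert(event)
    !@keys.add?(key_for(event)).nil?
  end
end

index = EventIndex.new
index.insert(Event.new('http://a.example', 'Launch', '2015-01-01'))  # accepted
index.insert(Event.new('http://a.example', 'Launch', '2015-02-01'))  # accepted, date differs
index.insert(Event.new('http://a.example', 'Launch', '2015-01-01'))  # rejected, full duplicate
```

One lookup on the combined key is enough; there is no need for three separate per-field queries.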

How to prevent SQL injection in ActiveRecord Where clause with both field and value dynamic

I'm creating a dynamic search for users where both the field and value are dynamic. I have it working now with the code below, but wanted to prevent possible SQL injection and wondered how to do this.
search_clause = "#{search_criteria.field} LIKE '%#{search_criteria.value}%'"
organizational_users.where(search_clause)
Can I parameterize search_clause even with a dynamic field? How can I do this?
The key to the question is handling the dynamic field; the suggested duplicate question does not really address that part.
Any help/suggestions would be appreciated!
With value escaping, the best you can do is to limit SQL injection to referencing any single identifier visible in the query (e.g. every column in the table). That would look like this:
organizational_users.where("#{connection.quote_column_name search_criteria.field} LIKE ?", "%#{search_criteria.value}%")
Unless you're absolutely sure you need to expose every single column to this filtering, you should really apply an allowlist first:
raise "nope" unless search_criteria.field.in? %w(first_name last_name)
organizational_users.where("#{connection.quote_column_name search_criteria.field} LIKE ?", "%#{search_criteria.value}%")
(The above assumes this is happening in a model method. As it involves using the connection to quote the column name, it really should be... but if not, you'll need to use SomeModel.connection.quote_column_name instead.)
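The allowlist step can be combined with escaping of the LIKE wildcards in the value, so that a user-supplied % or _ is matched literally. The sketch below is plain Ruby and only builds the arguments you would pass to where. The names SEARCHABLE_FIELDS, escape_like and search_condition are hypothetical, and it assumes backslash is the LIKE escape character (the default in MySQL and PostgreSQL). Rails also provides sanitize_sql_like for the escaping part.

```ruby
# Hypothetical allowlist of columns that may be searched.
SEARCHABLE_FIELDS = %w[first_name last_name].freeze

# Escape the LIKE wildcards (% and _) and the escape character itself.
def escape_like(value)
  value.to_s.gsub(/[\\%_]/) { |char| "\\#{char}" }
end

# Returns [sql_fragment, bind_value]; raises for fields outside the allowlist.
def search_condition(field, value)
  unless SEARCHABLE_FIELDS.include?(field.to_s)
    raise ArgumentError, "unsupported search field: #{field.inspect}"
  end

  # Safe to interpolate: the field is now known to be one of the fixed names.
  ["#{field} LIKE ?", "%#{escape_like(value)}%"]
end

fragment, bind = search_condition('first_name', 'ann')
```

The result could then be used as organizational_users.where(fragment, bind), keeping the value parameterized as in the answer above.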

Data type and approach for Rails attributes with occasional second value

I have a use case where a user clicks on a word and sees information about its base word. Very occasionally there are words with two different but equally accepted spellings with identical meanings, and I'd like to display the alternate spelling when this happens (examples for Spanish: 'video' and 'vídeo', or 'quizá' and 'quizás').
Similarly, there are times where a base_word will need two inflection_forms.
The BaseWord model has attributes for base_word:string, inflection_form:string (i.e., 'ND3XX'), and the language_id. A base_word has_many :inflections to handle related words, but it seems silly to create a new inflection for a differently spelled word with an identical grammatical role.
I tried serializing both of the fields into an Array, and then later as a Set, but in both cases I had trouble querying the database for base_words where base_word was equal to one of the set/array members.
What is the most logical way to handle this case?
Since you are using Postgres, you can use its built-in Array type, which can be queried with SQL. cf. http://edgeguides.rubyonrails.org/active_record_postgresql.html#array
Update 1: Query example
Yes, it's not as simple, but assuming you add an inflections column of type array in your migration, you should be able to hide the complexity with something like:
class BaseWord < ActiveRecord::Base
  ...
  scope :has_inflection, -> (word) { where('? = ANY(inflections)', word) }
end
And then use it like any other scope. For example:
BaseWord.has_inflection('vídeo').order('base_word').limit(5)

Managing the default order of a table in Rails 3

I have a model which has over 40,000 entries in it. I want to be able to have this table permanently sorted by one of its attributes. The tricky part of this is that some of the elements have a nil value for the attribute I want to sort by.
Some poking around has led me to default_scope, but it appears this is being deprecated and everyone warns against it. It seems like putting default_scope order('director_id DESC') or something like this would fix things, but this doesn't take into account nil values. What is the better alternative?
Thanks!
EDIT
I'm also using Tire with ElasticSearch for managing searches.
Yes, it's best to be explicit with model scopes. You can just do:
class MyModel < ActiveRecord::Base
  def self.default_order
    order('director_id DESC NULLS LAST')
  end
end
Your database will have its own ORDER BY syntax for the placement of NULL values (NULLS LAST, used above, works in PostgreSQL but not in MySQL). If you don't want NULL values in the output at all, you can add a where call, in which case the method should be renamed.
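For a quick feel of what DESC NULLS LAST produces, the same ordering can be reproduced in memory with plain Ruby. This only illustrates the expected result and is not a replacement for sorting in the database; the Movie struct and the sample data are made up.

```ruby
# Hypothetical record with the attribute being sorted on.
Movie = Struct.new(:title, :director_id)

# Sort by director_id descending, placing nil values at the end.
def director_desc_nulls_last(movies)
  movies.sort_by do |movie|
    # The first element pushes nils to the end; the second sorts descending.
    movie.director_id.nil? ? [1, 0] : [0, -movie.director_id]
  end
end

movies = [
  Movie.new('A', 3),
  Movie.new('B', nil),
  Movie.new('C', 7),
  Movie.new('D', 1)
]

sorted_titles = director_desc_nulls_last(movies).map(&:title)
# sorted_titles is ["C", "A", "D", "B"]
```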

In Rails, what is the best way to store multiple boolean attributes in a model?

I have a model House that has many boolean attributes, like has_fireplace, has_basement, has_garage, and so on. House has around 30 such boolean attributes. What is the best way to structure this model for efficient database storage and search?
I would like to eventually search for all Houses that have a fireplace and a garage, for example.
The naive way, I suppose, would be to simply add 30 boolean attributes in the model that each corresponds to a column in the database, but I'm curious if there's a Rails best practice I'm unaware of.
Your 'naive' assumption is correct - the most efficient way from a query speed and productivity perspective is to add a column for each flag.
You could get fancy as some others have described, but unless you're solving some very specific performance problems, it's not worth the effort. You'd end up with a system that's harder to maintain, less flexible and that takes longer to develop.
For that many booleans in a single model you might consider using a single integer and bitwise operations to represent, store and retrieve values. For example:
class Model < ActiveRecord::Base
  HAS_FIREPLACE = (1 << 0)
  HAS_BASEMENT = (1 << 1)
  HAS_GARAGE = (1 << 2)
  ...
end
Then some model attribute called flags would be set like this (self. is needed inside the model, otherwise Ruby creates a local variable):
self.flags |= HAS_FIREPLACE
self.flags |= (HAS_BASEMENT | HAS_GARAGE)
And tested like this (compare against 0, because 0 is truthy in Ruby):
(flags & HAS_FIREPLACE) != 0
(flags & (HAS_BASEMENT | HAS_GARAGE)) != 0 # true if either flag is set
which you could abstract into methods. This should be pretty efficient in both time and space.
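Abstracted into methods, the bit operations could look roughly like this. It is a plain-Ruby sketch: HouseFlags is a hypothetical stand-in for a model with an integer flags column, and the method names are illustrative.

```ruby
# Hypothetical plain-Ruby stand-in for a model with an integer `flags` column.
class HouseFlags
  HAS_FIREPLACE = 1 << 0
  HAS_BASEMENT  = 1 << 1
  HAS_GARAGE    = 1 << 2

  attr_reader :flags

  def initialize(flags = 0)
    @flags = flags
  end

  # Turn on every bit in the mask.
  def set(mask)
    @flags |= mask
    self
  end

  # Turn off every bit in the mask.
  def clear(mask)
    @flags &= ~mask
    self
  end

  # True only if every bit in the mask is set.
  def all?(mask)
    (@flags & mask) == mask
  end

  # True if at least one bit in the mask is set.
  def any?(mask)
    (@flags & mask) != 0
  end
end

house = HouseFlags.new
house.set(HouseFlags::HAS_FIREPLACE | HouseFlags::HAS_GARAGE)
house.all?(HouseFlags::HAS_FIREPLACE | HouseFlags::HAS_GARAGE)   # both present
house.any?(HouseFlags::HAS_BASEMENT)                             # not present
```

Separating all? from any? avoids the ambiguity of testing a combined mask with a bare &, which only tells you that at least one of the bits is set.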
I suggest the flag_shih_tzu gem. It helps you store many boolean attributes in one integer column. It gives you named scopes for each attribute and a way to chain them together as active record relations.
Here's another solution.
You could make a HouseAttribute model and set up a two-way has_and_belongs_to_many association:
# house.rb
class House < ActiveRecord::Base
  has_and_belongs_to_many :house_attributes
end
# house_attribute.rb
class HouseAttribute < ActiveRecord::Base
  has_and_belongs_to_many :houses
end
Then each attribute for a house would be a database entry.
Don't forget to set up the join table in your database.
If you're wanting to query on those attributes, then you're unfortunately probably stuck with first-class fields, if performance is a consideration. Bitfields and flag strings are an easy way to solve the problem, but they don't scale well against production data sets.
If you aren't going to worry about performance, then I'd use an implementation where each property is represented by a character ("a" = "garage", "b" = "fireplace", etc), and you just build a string that represents all the flags a record has. The primary advantage this has over a bitfield is that a) it's easier for a human to debug, and b) you don't need to worry about the size of your data types.
If performance is a concern, then you will likely need to promote them to first-class fields.
Normally I'd agree that your naive assumption is correct.
If the number of boolean fields keeps growing and growing (has_fusion_reactor?), you may also consider serializing an array of flags:
# house.rb
class House < ActiveRecord::Base
  serialize :flags
  …
end
# Setting flags
@house.flags = [:fireplace, :pool, :doghouse]
# Appending
@house.flags << :sauna
# Querying (flags is an Array, so use include?)
@house.flags.include? :porch
# Searching (wildcards are needed for a substring match)
House.where "flags LIKE ?", "%pool%"
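One caveat with searching a serialized column through LIKE is that it matches substrings of the stored text, not whole flags. The plain-Ruby sketch below shows the effect, assuming the column is stored as YAML (long the default coder for serialize); the sample flags are made up.

```ruby
require 'yaml'

# What a YAML-serialized flags column might contain for one house.
stored = [:fireplace, :whirlpool].to_yaml

# Substring matching, which is what LIKE '%pool%' amounts to.
substring_match = stored.include?('pool')

# Deserialize and compare whole flags instead.
flags = YAML.load(stored)
exact_match = flags.include?(:pool)

# substring_match is true because "whirlpool" contains "pool",
# while exact_match is false because :pool itself is not in the list.
```

So a LIKE search is best treated as a coarse prefilter, with the exact check done in Ruby on the loaded records.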
I'm thinking about something like this:
You have a Houses table (for details of the house).
You have another master table called Features (which holds features like 'fireplace', 'basement', etc.),
and you have a join table like Houses_Features,
which has house_id and feature_id.
That way you can assign features to a given house. I don't know whether this matches your needs, but just think about it :D
thanks and regards
sameera
You could always have a TEXT column that you hold JSON in (say, data), and then your queries could use SQL's LIKE.
Eg: house.data #=> '{"has_fireplace":true,"has_basement":false,"has_garage":true}'
Thus, doing a find using LIKE '%"has_fireplace":true%' would return anything with a fireplace.
Using model relationships (eg, a model for Fireplace, Basement, and Garage in addition to just House) would be extremely cumbersome in this case, since you have so many models.
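Note that a LIKE pattern on JSON text depends on the exact way the JSON was written out. A small plain-Ruby sketch of the difference, using only the standard json library (the fireplace? helper is a hypothetical name):

```ruby
require 'json'

features = { 'has_fireplace' => true, 'has_basement' => false }

compact = JSON.generate(features)         # no whitespace after the colons
pretty  = JSON.pretty_generate(features)  # adds newlines and spaces

# The text a LIKE '%"has_fireplace":true%' search would look for.
needle = '"has_fireplace":true'

found_in_compact = compact.include?(needle)
found_in_pretty  = pretty.include?(needle)

# Hypothetical helper: parsing gives the same answer for either format.
def fireplace?(json_text)
  JSON.parse(json_text)['has_fireplace'] == true
end
```

Here the substring is found in the compact form but not in the pretty-printed one, so the approach only works if every writer produces the same compact format.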
