Rails merge duplicates if value is matching or nil - ruby-on-rails

I'm trying to perform a clean up of some data.
I have details in various forms with various duplicates.
models/object.rb
attr_accessible :name, :email, :assoc_id
I want to merge duplicates where the name is matching and the email is either matching or nil, and the assoc_id is either matching or nil.
Not sure how I write the query to bring back groups of objects that are either matching or nil..
i.e.
grouped_objects = Object.group_by{|o| [o.name]}
brings me grouped just on the name
grouped_objects = Object.group_by{|o| [o.name, o.email]}
brings me grouped on name and email.
the issue is that many of the objects have missing data.
Just want a quick and dirty so that, in the absence of other information, I'll merge the records together.
However, if there's someone with a different email, or a different assoc_id I won't merge that. Appreciate that there'll be some false records, but what we'll end up with will be an improvement
How do I write that activerecord query?
grouped_objects = Object.group_by{|o| [o.name, o.email || o.email == nil]}
Hope that makes sense,

I think a better way is to make your model intolerant of duplication. You can prevent duplicates directly in the model, so that when your controller tries to create an object, the model first checks whether one already exists, based on the attributes you choose.
So if you want your objects to be unique by some attribute, do something like this (assuming you want uniqueness on the name and email fields) in my_model.rb:
class MyModel < ActiveRecord::Base
  attr_accessible :name, :email, :assoc_id
  validates_uniqueness_of :name
  validates_uniqueness_of :email, :allow_nil => true # or :allow_blank => true
  # Your code...
end
You can also pass :case_sensitive => false if you don't want upper case to be differentiated from lower case.
Hope this is what you are looking for !
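For the one-off clean-up of existing records that the question describes, here is a minimal plain-Ruby sketch of the "matching or nil" rule. It is not an ActiveRecord query: Record, compatible? and merge_duplicates are hypothetical names made up for illustration, and the records are assumed to be loaded into memory already. Because "nil matches anything" is not transitive, the sketch merges greedily into the first compatible group, which is one possible quick-and-dirty policy rather than the only one.

```ruby
# Hypothetical in-memory stand-in for the model rows (not ActiveRecord).
Record = Struct.new(:name, :email, :assoc_id)

# Two values are compatible when they are equal or at least one is nil.
def compatible?(a, b)
  a.nil? || b.nil? || a == b
end

# Greedily fold each record into the first already-merged record with the
# same name and compatible email / assoc_id, filling in missing values.
def merge_duplicates(records)
  records.each_with_object([]) do |rec, merged|
    target = merged.find do |m|
      m.name == rec.name &&
        compatible?(m.email, rec.email) &&
        compatible?(m.assoc_id, rec.assoc_id)
    end

    if target
      target.email ||= rec.email
      target.assoc_id ||= rec.assoc_id
    else
      merged << rec.dup # copy so the input records are left untouched
    end
  end
end

records = [
  Record.new("Ann", "ann@example.com", nil),
  Record.new("Ann", nil, 7),
  Record.new("Ann", "other@example.com", nil),
  Record.new("Bob", nil, nil)
]

merge_duplicates(records).each { |m| p m.to_a }
```

The first two "Ann" rows collapse into one, while the "Ann" with a different email stays separate. In a real clean-up you would still have to decide which database row survives and delete the others.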

Related

Adding presence validation on existing records

We have user records that have an attribute called first_name. Many of these records do not have the first_name attribute filled out and thus it equals nil. We want to introduce a presence validation on this attribute. However we've come across a huge problem. If a user updates their record during any request, that request will fail. This leads to a rather abrasive error that we don't know how to handle.
One solution is to only call the validation when the user is creating a record. This works great but we want to enforce this validation when they are on the profile page and they are attempting to update their profile.
Is there a better way to handle this where we can enforce first name requirements on the update page yet still allow users to update their record without ?
Introducing validations on existing data that does not satisfy the new requirements can be problematic. The concept you're after is fundamentally migration-on-write: you've introduced a data migration that happens over time as records are written to, because the migration cannot occur without individual user input. This is one technique for migrating very large data sets in zero-downtime environments, or for forcing password resets on users.
Fundamentally, you need to define the conditions in which validation must happen and find a way to test records (on create or update) for that condition. Your condition should select all new records, plus the records being updated in the context where migration is possible.
Once you've defined the condition, you can modify your validation thusly:
validates :first_name, presence: true, if: -> { condition_for_migration }
Ideally the condition should be some field or combination of fields already present in your table that correctly identifies records as ready to be migrated, but this isn't always possible.
Failing that, you could introduce a field specifically for this purpose. You might call it version_number, set all existing records to 1, and then make the default for all new records 2. Your migration might look like this:
# All existing records will have their `version_number` set to the default of 1
add_column :users, :version_number, :integer, null: false, default: 1
# Change the default to 2 for any records created after this point
change_column_default :users, :version_number, 2
You can then use version_number to tell whether validation should take place:
validates :first_name, presence: true, if: -> { version_number >= 2 }
The key is to make sure that, in the context of your profile form, you also update version_number to enable the validation of first_name:
# app/views/users/edit.html.haml
= form_for @user do |f|
  = f.hidden_field :version_number, value: 2
  = f.input :first_name
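To make the version_number idea concrete without a Rails app, here is a minimal plain-Ruby sketch of the same conditional check. UserRecord is a hypothetical stand-in for the model, and its errors method only imitates what the validates line with the if: condition would do.

```ruby
# Plain-Ruby stand-in for the User model (hypothetical, no Rails needed).
class UserRecord
  attr_accessor :first_name, :version_number

  # New records default to version 2, mirroring the changed column default.
  def initialize(first_name: nil, version_number: 2)
    @first_name = first_name
    @version_number = version_number
  end

  # first_name is only required once the record is at version 2 or later.
  def errors
    errs = []
    if version_number >= 2 && first_name.to_s.strip.empty?
      errs << "First name can't be blank"
    end
    errs
  end

  def valid?
    errors.empty?
  end
end

legacy = UserRecord.new(version_number: 1)
puts legacy.valid?        # legacy row without a first name still passes

legacy.version_number = 2 # what the hidden field on the profile form does
puts legacy.valid?        # now the missing first name is rejected
```

The point of the design is that any code path that does not bump version_number keeps working for old records, while the profile form opts the record in to the stricter rule.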
In the absence of a real database field for this purpose, you can add a temporary one to your model, which maintains the context only for the lifetime of a particular model instance:
Add an accessor to your model, e.g. update_from_profile_page
Include that field in the contexts in which you want to require validation
Validate first_name during the creation of any new record
Validate first_name during any update where update_from_profile_page is true
For example:
app/models/user.rb
class User < ActiveRecord::Base
  attr_accessor :update_from_profile_page
  validates :first_name, presence: true, on: :create
  validates :first_name, presence: true, on: :update, if: -> { update_from_profile_page }
end
app/views/users/edit.html.haml (your profile page)
= form_for @user do |f|
  = f.input :first_name
app/controllers/users_controller.rb
def update
  @user = User.find(params[:id])
  # Flag this update as coming from the profile page so first_name is validated
  @user.update_from_profile_page = true
  @user.update(params.require(:user).permit(:first_name))
end
This is less desirable than finding a concrete business-logic-based reason for conditional validation as it involves introducing a virtual field to your model that has no functional value outside of a single specific case of a form submission.
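The create/update split with a virtual flag can be sketched in plain Ruby as follows. ProfileRecord and first_name_required? are hypothetical names, and the persisted flag merely imitates the difference between the create and update contexts; this is an illustration of the rule, not the Rails implementation.

```ruby
# Hypothetical plain-Ruby stand-in showing when first_name is required.
class ProfileRecord
  attr_accessor :first_name, :update_from_profile_page

  def initialize(first_name: nil, persisted: false)
    @first_name = first_name
    @persisted = persisted
  end

  def persisted?
    @persisted
  end

  # Required on create, and on update only when the profile page set the flag.
  def first_name_required?
    !persisted? || !!update_from_profile_page
  end

  def valid?
    return true unless first_name_required?
    !first_name.to_s.strip.empty?
  end
end

existing = ProfileRecord.new(persisted: true)
puts existing.valid?                     # background update: no name needed

existing.update_from_profile_page = true
puts existing.valid?                     # profile page update: name needed
```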

ActiveRecord validates inclusion in list - list isn't updated after new associated model created

I have a Company model and an Employer model. Employer belongs_to :company and Company has_many :employers. Within my Employer model I have the following validation:
validates :company_id, inclusion: {in: Company.pluck(:id).prepend(nil)}
I'm running into a problem where the above validation fails. Here is an example setup in a controller action that will cause the validation to fail:
company = Company.new(company_params)
# company_params contains nested attributes for employers
company.employers.each do |employer|
employer.password = SecureRandom.hex
end
company.employers.first.role = 'Admin' if client.employers.count == 1
company.save!
admin = company.employers.where(role: 'Admin').order(created_at: :asc).last
admin.update(some_attr: 'some_val')
On the last line in the example code snippet, admin.update will fail because the validation is checking to see if company_id is included in the list, which it is not, since the list was generated before company was saved.
Obviously there are ways around this such as grabbing the value of company.id and then using it to define admin later, but that seems like a roundabout solution. What I'd like to know is if there is a better way to solve this problem.
Update
Apparently the possible workaround I suggested doesn't even work.
new_company = Company.find(company.id)
admin = new_company.employers.where(role: 'Admin').order(created_at: :asc).last
admin.update
# Fails validation as before
I'm not sure I understand your question completely, but there is an issue in this part of the code:
validates :company_id, inclusion: {in: Company.pluck(:id).prepend(nil)}
The validation is configured at the class level, so Company.pluck(:id) runs only once, when the class is loaded. The list is not re-evaluated on subsequent validations, so companies created later are never in it.
The docs state that you can pass a proc or lambda to the in option of inclusion (it receives the record being validated), so you could try that instead:
validates :company_id, inclusion: {in: ->(_employer) { Company.pluck(:id).prepend(nil) }}
Some people would recommend that you not even do this validation, but instead, have a database constraint on that column.
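The difference between a list captured when the class is loaded and one computed on every check can be shown in plain Ruby. COMPANY_IDS, EagerCheck and LazyCheck are hypothetical names; the array stands in for the companies table.

```ruby
# Stand-in for the ids currently in the companies table (hypothetical).
COMPANY_IDS = [1, 2]

class EagerCheck
  # Evaluated once, while the class body runs: a snapshot of the ids.
  ALLOWED = COMPANY_IDS.dup

  def self.allowed?(id)
    ALLOWED.include?(id)
  end
end

class LazyCheck
  # The lambda is stored now but only evaluated on each call.
  ALLOWED = -> { COMPANY_IDS.dup }

  def self.allowed?(id)
    ALLOWED.call.include?(id)
  end
end

COMPANY_IDS << 3 # a new company is "saved" after the classes were loaded

puts EagerCheck.allowed?(3) # the snapshot never saw id 3
puts LazyCheck.allowed?(3)  # the lambda looks up the current ids
```

This is the same reason the original validation rejects a company_id that was created after the Employer class was loaded.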
I believe you are misusing the inclusion validator here. If you want to validate that the associated model actually exists, rather than just that its id column has a value, you can do this in two ways. In ActiveRecord, you can use a presence validator.
validates :company, presence: true
You should also use a foreign key constraint on the database level. This prevents a model from being saved if there is no corresponding record in the associated table.
add_foreign_key :employers, :companies
If it gets past ActiveRecord, the database will throw an error if there is no company record with the given company_id.

Ruby on Rails - Paperclip :allow_blank with if statement

Some use case background: I am not using Paperclip for avatars or things like that. I want people to make "submissions" that can contain either a file OR a link, depending on the category of the submission they previously chose (the actual file is being uploaded to a table called Submission Details which has a foreign key to Submissions). Some categories have a "link" type, and some categories have an "image" or a "PDF" type. If they select the link category, that subsequent URL info is stored in a separate column than the image/attachment column.
Here is the code in my model to determine which is which:
def nonlink?
  if submission.category.submission_file_type == "Mixed (PDF and Images)" || submission.category.submission_file_type == "Images Only"
    return true
  else
    return false
  end
end
So ideally I would want :allow_blank if :nonlink? is true or false in the column validation area of the model.
My first question is, does validates_attachment_presence allow for :allow_blank? If it doesn't, what is the best alternative.
Secondly, what is the syntax for creating an if statement with :allow_blank in a model? Right now I have this but not sure it's right:
validates_attachment_presence :attachment, allow_blank: true, if: [:nonlink? == false]
Would appreciate any thoughts, thanks!
I solved this by adding a separate validation line that uses validates_presence_of instead of just validates. I made two separate predicate methods, nonlink? and link?, and one of the validations looks like this:
validates_presence_of :attachment, if: :nonlink?
Using this I was able to get rid of :allow_blank entirely.
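As a side note, the two predicates can be written more compactly. The following is a minimal plain-Ruby sketch: Category, Submission and SubmissionDetail are simplified stand-ins for the real models, NON_LINK_FILE_TYPES is a made-up constant, and link? is assumed to be simply the opposite of nonlink?, which the post does not spell out.

```ruby
# File types that require an attachment rather than a link (from the question).
NON_LINK_FILE_TYPES = ["Mixed (PDF and Images)", "Images Only"].freeze

# Simplified stand-ins for the real models (hypothetical).
Category = Struct.new(:submission_file_type)
Submission = Struct.new(:category)

class SubmissionDetail
  attr_reader :submission

  def initialize(submission)
    @submission = submission
  end

  # True when the category expects an uploaded file.
  def nonlink?
    NON_LINK_FILE_TYPES.include?(submission.category.submission_file_type)
  end

  # Assumed here to be the exact opposite of nonlink?.
  def link?
    !nonlink?
  end
end

images = SubmissionDetail.new(Submission.new(Category.new("Images Only")))
puts images.nonlink?
puts images.link?
```

Since include? already returns true or false, the explicit if/else with return true and return false is not needed.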

Rails: Validate unique combination of 3 columns

Hi, I want to validate the unique combination of 3 columns in my table.
Let's say I have a table called cars with the values :brand, :model_name and :fuel_type.
What I then want is to validate if a record is unique based on the combination of those 3. An example:
brand model_name fuel_type
Audi A4 Gas
Audi A4 Diesel
Audi A6 Gas
Should all be valid. But another record with 'Audi, A6, Gas' should NOT be valid.
I know of this validation, but I doubt that it actually does what I want.
validates_uniqueness_of :brand, :scope => {:model_name, :fuel_type}
There is a syntax error in your code snippet. The correct validation is :
validates_uniqueness_of :car_model_name, :scope => [:brand_id, :fuel_type_id]
or even shorter in ruby 1.9.x:
validates_uniqueness_of :car_model_name, scope: [:brand_id, :fuel_type_id]
with rails 4 you can use:
validates :car_model_name, uniqueness: { scope: [:brand_id, :fuel_type_id] }
with rails 5 you can use
validates_uniqueness_of :car_model_name, scope: %i[brand_id fuel_type_id]
Depending on your needs, you could also add a unique constraint (as part of the table creation migration or as a separate one) instead of the model validation:
add_index :the_table_name, [:brand, :model_name, :fuel_type], :unique => true
Adding the unique constraint on the database level makes sense, in case multiple database connections are performing write operations at the same time.
For Rails 4, the correct code with the new hash syntax is:
validates :column_name, uniqueness: {scope: [:brand_id, :fuel_type_id]}
I would make it this way:
validates_uniqueness_of :model_name, :scope => [:brand_id, :fuel_type_id]
because it makes more sense for me:
there should not be duplicated "model names" for combination of "brand" and "fuel type", vs
there should not be duplicated "brands" for combination of "model name" and "fuel type"
but it's subjective opinion.
Of course, this assumes brand and fuel_type are associations to other models (if not, just drop the "_id" part). The uniqueness validation can't check non-database columns, so you have to validate the foreign keys in the model.
You need to choose which attribute is validated, because you don't validate all of them at once. If you want to, you can create a separate validation for every attribute, so that when a user makes a mistake and tries to create a duplicate record, you can show the errors in the form next to the invalid field.
Using this validation method in conjunction with ActiveRecord::Validations#save does not guarantee the absence of duplicate record insertions, because uniqueness checks on the application level are inherently prone to race conditions.
This could even happen if you use transactions with the 'serializable' isolation level. The best way to work around this problem is to add a unique index to the database table using ActiveRecord::ConnectionAdapters::SchemaStatements#add_index. In the rare case that a race condition occurs, the database will guarantee the field's uniqueness.
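One practical consequence: a unique index can only be created if the existing rows are already unique on those columns. Here is a minimal plain-Ruby sketch for spotting rows that repeat an earlier combination. Car and duplicate_combinations are hypothetical names, and the rows are assumed to be loaded into memory.

```ruby
require "set"

# Hypothetical in-memory stand-in for rows of the cars table.
Car = Struct.new(:brand, :model_name, :fuel_type)

# Returns every car whose (brand, model_name, fuel_type) combination
# was already seen earlier in the list.
def duplicate_combinations(cars)
  seen = Set.new
  # Set#add? returns nil when the key is already present.
  cars.reject { |car| seen.add?([car.brand, car.model_name, car.fuel_type]) }
end

cars = [
  Car.new("Audi", "A4", "Gas"),
  Car.new("Audi", "A4", "Diesel"),
  Car.new("Audi", "A6", "Gas"),
  Car.new("Audi", "A6", "Gas") # repeats the previous combination
]

duplicate_combinations(cars).each { |car| p car.to_a }
```

With the example data from the question, only the second 'Audi, A6, Gas' row is reported, which matches the rule that the first three rows are valid and the fourth is not.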
Piecing together the other answers and trying it myself, this is the syntax you're looking for:
validates :brand, uniqueness: { scope: [:model_name, :fuel_type] }
I'm not sure why the other answers are adding _id to the fields in the scope. That would only be needed if these fields are representing other models, but I didn't see an indication of that in the question. Additionally, these fields can be in any order. This will accomplish the same thing, only the error will be on the :model_name attribute instead of :brand:
validates :model_name, uniqueness: { scope: [:fuel_type, :brand] }

Validates uniqueness of :link

I have a url field named link in my model with the following validation
validates_uniqueness_of :link, :case_sensitive => false
When I put "http://stackoverflow.com", it goes well.
Now when I put "https://stackoverflow.com/" (with the trailing slash), this is also accepted as unique.
But I want it to be rejected as a duplicate even though there is a "/" at the end. How can I do that?
I'd suggest that you normalize your URLs (add/strip trailing slash, etc. see http://en.wikipedia.org/wiki/URL_normalization) before storing them in the DB and even before validation.
validates_uniqueness_of :link, :case_sensitive => false
before_validation :normalize_urls
def normalize_urls
  return if link.nil?
  # Assign a new string rather than mutating in place, so the change is tracked
  self.link = link.strip.gsub(/\/$/, '')
end
This isn't quite what you were asking for but if you don't store normalized URLs, you'll have to query your DB for all possible variations during validation and that could quickly get expensive.
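The normalization step on its own can be tried out in plain Ruby. This is a minimal sketch: normalize_link and link_taken? are hypothetical helper names, and the rules (trim whitespace, drop trailing slashes, compare case-insensitively) are only the ones discussed here, not a full URL normalizer.

```ruby
# Trim surrounding whitespace and drop any trailing slashes.
# \z anchors at the very end of the string (unlike $, which matches per line).
def normalize_link(raw)
  return nil if raw.nil?
  raw.strip.sub(%r{/+\z}, "")
end

# Case-insensitive check against links that were stored in normalized form.
def link_taken?(existing_links, candidate)
  normalized = normalize_link(candidate).to_s.downcase
  existing_links.any? { |link| link.downcase == normalized }
end

stored = [normalize_link("http://stackoverflow.com")]

puts normalize_link(" http://stackoverflow.com/ ")
puts link_taken?(stored, "http://StackOverflow.com/")
```

Note that this treats http:// and https:// versions of the same address as different links; whether they should count as duplicates is a separate decision.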
You could always do a custom validator (by using the validate method, for example).
It might look something like this:
class MyModel < ActiveRecord::Base
  validate :link_is_unique
  def link_is_unique
    # Clean up the current link (removing trailing slashes, etc.)
    link_to_validate = link.to_s.strip.gsub(/\/$/, '')
    # Look for other records that already have this link
    # (where.not needs Rails 4+; the current record is excluded so updates still pass)
    duplicate = MyModel.where(link: link_to_validate).where.not(id: id).exists?
    # Add an error to the model if a duplicate exists
    errors.add(:base, "Link must be unique") if duplicate
  end
end
You could then add other logic to clean up the link (i.e. check for http://, www, etc.)
You can customize validations. See this railscast.
