Mongoid indexes and validations - ruby-on-rails

I have a Rails 5 / Mongoid 7 app that parses files and adds the content to the DB. The parsing takes more and more time after each file is processed, and I think this is because I have a validates_uniqueness_of on one of the fields: as the collection grows, that validation has to check a larger collection, which makes sense.
So I figured creating an index on that field would help that, but I was wondering if I should still leave the validates_uniqueness_of there anyway or should I remove it?
Can't really seem to find information about that anywhere.
Model:
class SomeModel
include Mongoid::Document
include Mongoid::Timestamps
field :some_field, type: String
index({ some_field: 1 }, { unique: true, name: "some_field_index" })
validates_uniqueness_of :some_field, { case_sensitive: false }
end
Note: I've run rake db:mongoid:create_indexes but I haven't tried a new parse yet, wanted to know how to handle this first.

So I've run several tests, and adding the index made a huge difference in processing time. I'll leave my answer here for posterity.
The validates_uniqueness_of can be removed. The uniqueness of the field is now enforced by the index, though, so instead of getting a validation error when trying to save a duplicate document, you get an exception thrown. I had to change some of the code that handles document creation, so keep that in mind if you have to deal with a similar situation.
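For anyone adapting their creation code, here is a minimal sketch of turning the duplicate-key exception back into a simple true/false result. It assumes the error raised on a unique-index violation carries MongoDB's E11000 code in its message. The helper name save_unless_duplicate is hypothetical, and in a real Mongoid app you would likely rescue Mongo::Error::OperationFailure rather than StandardError.

```ruby
# Hypothetical helper: runs the block that saves a document and returns
# true on success, or false when the failure is a duplicate-key error.
# Assumption: the error message includes MongoDB's "E11000" code.
def save_unless_duplicate
  yield
  true
rescue StandardError => e
  # Anything that is not a duplicate-key violation is re-raised untouched.
  raise unless e.message.include?("E11000")
  false
end

# Stand-ins for real saves, so the sketch runs without a database.
saved   = save_unless_duplicate { :ok }
skipped = save_unless_duplicate { raise "E11000 duplicate key error" }
puts saved    # true
puts skipped  # false
```

The parsing loop can then skip or log duplicates on a false result, much as it previously did when save returned false because of the validation.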

Related

Rails validate uniqueness for the ASCII approximation

I hope the title is not too unclear.
I am making a Rails app and I have a question about Rails validation. Consider this code in the user.rb model file:
validates :name,
presence: true,
length: { maximum: 50 },
uniqueness: { case_sensitive: false }
I am using the friendly_id gem to generate slugs for the users. I won't allow users to change their name. What I now need is to ensure that names are unique in such a way that no UUIDs are appended to slugs when two people's names convert to the same ASCII approximation.
Current behaviour is:
User 1 signs up with a name and gets a slug like this:
name: "Jaiel" slug: "jaiel"
User 2 now does the same name but a bit different:
name: "Jàìèl" slug: "jaiel-6558c3f1-e6a1-4199-a53e-4ccc565657d4"
The problem, as you can see, is that I want a uniqueness validation that would have rejected User 2, because both names generate the slug "jaiel" for their friendly_id.
I would appreciate your help on that matter
Thanks
Take a look into ActiveSupport::Inflector.transliterate:
ActiveSupport::Inflector.transliterate('Ærøskøbing')
#=> "AEroskobing"
Also, with this type of validation you might want to go with custom validation (alongside the one you already have):
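class User
  validate :unique_slug

  private

  # Adds a validation error (rather than raising) when another user's name
  # transliterates to the same ASCII string.
  def unique_slug
    names = self.class.where.not(id: id).pluck(:name).map { |n| ascii_name(n) }
    errors.add(:name, "has already been taken") if names.include?(ascii_name(name))
  end

  def ascii_name(value)
    ActiveSupport::Inflector.transliterate(value.to_s).downcase
  end
end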
Obviously, querying the whole table on each validation is super inefficient, but this is just to point you in one of the possible directions.
Another option would be going for a callback. Transliterating the name upon creation:
before_validation :transliterate_name

def transliterate_name
  # Skip nil names so the presence validation can report them.
  self.name = ActiveSupport::Inflector.transliterate(name) if name
end
It will first transliterate the name, then validate the uniqueness of the already transliterated name with the validation you have. Note that the stored name itself loses its accents this way. It looks like a solution, and it is definitely not as heavy as the first one.
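To avoid loading every row, another option is to compute a normalized ASCII key once and compare (or index) that instead of the name. The sketch below is plain Ruby and uses Unicode decomposition rather than ActiveSupport::Inflector.transliterate, so it only strips accents and will not expand characters such as Æ. The ascii_key name and the in-memory Set standing in for a uniquely indexed column are assumptions for illustration.

```ruby
require "set"

# Hypothetical helper: decompose accented characters (NFD), drop the
# combining marks, and downcase so "Jàìèl" and "Jaiel" share one key.
def ascii_key(name)
  name.unicode_normalize(:nfd).gsub(/\p{Mn}/, "").downcase
end

# The Set stands in for a column with a unique index on the stored key.
taken_keys = Set.new

["Jaiel", "Jàìèl", "Maria"].each do |name|
  # Set#add? returns nil when the key is already present.
  if taken_keys.add?(ascii_key(name))
    puts "#{name}: accepted"
  else
    puts "#{name}: rejected, clashes with an existing name"
  end
end
```

In a Rails app the key would typically live in its own column, filled in before validation, with the usual uniqueness validation pointed at that column so the original name is stored untouched.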

Is there a better way of validating a non model field in rails

I have a form field in a RoR 4 app called 'measure'. It is not a database column, but its values will help the model create child entries of its own via acts_as_tree : https://github.com/rails/acts_as_tree
I have to throw a validation when 'measure' is invalid. So I have created a virtual attribute known as measure and check for its validations only on a certain condition.
class SomeModel < ActiveRecord::Base
  attr_accessor :measure
  validates_presence_of :measure, :if => :condition?
end
The problem: when I save the record through the form, the validation fires, which is fine. But the same validation also fires when I try to update the record in some other method of the model. The only way to bypass that is by writing this code:
# I do not want to do this, is there a better way?
self.measure = "someRandomvalue"
self.save
I am making this a virtual attribute only so that I can raise validation errors. Is there a better way of doing that? The form has other validations, and I do not want the error for this one to be shown differently just because it is not a database attribute.
I want it to be validated only when the Active Record object is saved via the create and update actions of the controller, not when it is updated by some other method of the model.
I have seen other developers in my team doing similar thing and was always curious about one thing - "What are you trying to achieve doing things the way you are doing?". You see, I am not sure if validators should be used for values that will not be serialized.
Anyways, you may try using format validator instead of presence, which worked in my team's case:
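# Rails 3/4 (Rails 4 rejects ^ and $ in a format validator unless multiline: true is set, so use \A and \z)
validates :measure, format: { with: /\A.+\z/, allow_nil: true }
# Rails 2
validates_format_of :measure, :with => /\A.+\z/, :allow_nil => true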
You may also try using allow_blank instead of allow_nil.
I would rather create a custom validator along the lines of validates_accessor_of for values that I know will never be serialized.
HTH

validates :name, uniqueness: {scope: :user_id}

I have added a validation like this one to my model:
validates :name, uniqueness: {scope: :user_id}
And added an add_index like this on my migration:
add_index(:posts, :name)
But I just read the part about data integrity on the Rails API page.
And I was wondering if I will have any integrity errors on my models, so my question is: should I rewrite my index as follows?
add_index(:posts, [:name, :user_id], unique: true)
Thanks all,
The data integrity you're talking about can be enforced at 2 levels as you probably already know: at the application level and at the database level.
At the application level: the validation you added.
At the database level: the index you suggested
You already set up the first one. So, as long as everything goes through your Rails model to be saved in the db, you are mostly covered. The one gap is two concurrent requests that both pass the validation before either one is written, which is the case the Rails API docs warn about.
However, if third-party applications may write to your db, it is not a bad idea to enforce the uniqueness at the db level as well.
And even if the first one is sufficient, it is not a bad idea to set up the second either.
In addition, if you often query by name and user_id together, it is actually better to use add_index(:posts, [:name, :user_id]), which makes those queries a bit faster.
Yes, that would be a good idea. Your model validation implies a composite unique key on name and user_id.
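The difference between the two levels shows up when two requests validate at the same moment. Below is a rough plain-Ruby simulation of that interleaving, not real Rails code: FakePostsTable, its DuplicateKey error, and simulate_race are all hypothetical stand-ins for the posts table, the database's unique-index violation, and two concurrent saves.

```ruby
# Hypothetical stand-in for the posts table, with or without a unique
# index on [name, user_id].
class FakePostsTable
  class DuplicateKey < StandardError; end

  def initialize(unique_index:)
    @unique_index = unique_index
    @rows = []
  end

  # What the model-level uniqueness validation effectively asks.
  def exists?(name, user_id)
    @rows.include?([name, user_id])
  end

  # The index, when present, rejects a second identical row.
  def insert(name, user_id)
    if @unique_index && exists?(name, user_id)
      raise DuplicateKey, "duplicate (#{name}, #{user_id})"
    end
    @rows << [name, user_id]
  end

  def count
    @rows.size
  end
end

# Two requests run their validation before either one inserts,
# so both see "no existing row" and both try to write.
def simulate_race(table)
  a_valid = !table.exists?("hello", 1)
  b_valid = !table.exists?("hello", 1)
  table.insert("hello", 1) if a_valid
  begin
    table.insert("hello", 1) if b_valid
  rescue FakePostsTable::DuplicateKey
    # Only the database-level index stops the second write.
  end
  table.count
end

puts simulate_race(FakePostsTable.new(unique_index: false))  # 2
puts simulate_race(FakePostsTable.new(unique_index: true))   # 1
```

With only the validation, both rows get in. With the unique index, the second write fails and the application has to handle that error.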

Rails 3 and Mongoid: Embedded documents validation

So, I am having some issues with user authentication in embedded documents. I have two documents, one embedded in the other. A business has many members. The models look like this:
class Member
include Mongoid::Document
field :username, type: String
field :password, type: String
embedded_in :business
validates :username, :presence => true, :uniqueness => true, :length => 5..60
end
class Business
include Mongoid::Document
field :name, type: String
embeds_many :members
end
The problem is that it isn't validating the username's uniqueness in each model. When I save a member within a business, I can save a thousand of the same name. This of course is not going to work for a good authentication system. I am using Mongoid 2, Rails 3, and Ruby 1.9
This is a normal behavior when using embedded documents as explained here: MongoID validation
validates_uniqueness_of
Validate that the field is unique in the database: Note that for
embedded documents, this will only check that the field is unique
within the context of the parent document, not the entire database.
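I think you want to create a unique index on the username field, which would enforce uniqueness across all the documents of that collection. Since the members are embedded, the index goes on the parent collection using the dotted field path (the collection name businesses below assumes Mongoid's default naming). A unique index like this only compares values across different parent documents, so keep the validation to catch duplicates inside one business. Something like this:
db.businesses.ensureIndex({"members.username": 1}, {unique: true});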
EDIT: If you want Mongo to throw an exception when a document with the same index value exists, you must keep Mongo from using the "fire and forget" pattern, in which the driver does not wait for a response from the database when you perform an update/write operation on a document.
To do that, pass this parameter: safe:true. With it, Mongo should raise an exception if for any reason the document can't be inserted.

Ruby on Rails and NoSQL, adding fields

I'm just diving into MongoDB and Mongoid with Rails and I find it awesome. One thing NoSQL helps with is that I can add extra fields to my model without any extra effort whenever I want:
class Page
include Mongoid::Document
include Mongoid::MultiParameterAttributes
field :title, :type => String
field :body, :type => String
field :excerpt, :type => String #Added later
field :location, :type => String #Added later
field :published_at, :type => Time
validates :title, :presence => true
validates :body, :presence => true
validates :excerpt, :presence => true
end
And this works perfectly as it should. But my question is, (sorry if this is trivial) the existing entries are blank and have no defined value for the newly added field. For example, in a sample blog application, after I've published two posts, I decide to add an excerpt and a location field to my database (refer code above). Any blog post that is published after the addition of these new fields can be made sure to have a value filled in for the excerpt field. But the posts published prior to the addition of these two new fields have null values (which is understandable why) which I cannot validate. Is there an elegant solution for this?
Thank you.
There are three basic options:
1. Update everything inside MongoDB to include the excerpt.
2. Use an after_initialize hook to add a default excerpt to existing objects when you pull them out of MongoDB.
3. Kludge your validation logic to only check for the existence of excerpt on new objects.
(1) requires a (possibly large) time hit when you make the change, but it is just a one-time thing and you don't have to worry about it after that. You'd pull every Page out of MongoDB, do page.excerpt = 'some default excerpt', and then save it back to MongoDB. If you have a lot of Pages you'll want to process them in chunks of, say, 100 at a time. If you do this, you'll be able to search on the excerpt without worrying about what you should do with nulls. You can also do this inside MongoDB by sending a JavaScript fragment into MongoDB:
connection.eval(%q{
db.pages.find({}, { _id: true }).forEach(function(p) {
db.pages.update(
{ _id: p._id },
{ $set: { excerpt: 'some default excerpt' } }
);
});
})
(2) would go something like this:
after_initialize :add_default_excerpt, :unless => :new_record?
#...
private
def add_default_excerpt
self.excerpt = 'some default excerpt' unless self.excerpt.present?
end
You could move the unless self.excerpt up to the :unless if you didn't mind using a lambda:
after_initialize :add_default_excerpt, :unless => ->(o) { o.new_record? || o.excerpt.present? }
#...
private
def add_default_excerpt
self.excerpt = 'some default excerpt'
end
This should be pretty quick and easy to set up but there are downsides. First of all, you'd have a bunch of nulls in your MongoDB that you might have to treat specially during searches. Also, you'd be carrying around a bunch of code and logic to deal with old data but this baggage will be used less and less over time. Furthermore, the after_initialize calls do not come for free.
(3) requires you to skip validating the presence of the excerpt for non-new Pages (:unless => :new_record?) or you'd have to find some way to differentiate new objects from old ones while also properly handling edits of both new and old Pages. You could also force people to supply an excerpt when they change a Page and leave your validation as-is; including a :default => '' on your field :excerpt would take care of any nil issues in views and such.
I'd go with (1) if possible. If the update would take too long and you wanted the site up and running while you were fixing up MongoDB, you could add a :default => '' while updating and then remove the :default option, restart, and manually patch up any strays that got through.
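The chunked backfill from (1) can be sketched in plain Ruby as follows. Hashes stand in for Page documents so the example runs without MongoDB, and the backfill_excerpts helper and its keyword arguments are hypothetical names for illustration.

```ruby
# Plain hashes stand in for Page documents pulled out of MongoDB.
# Even-numbered pages are "old" ones with no excerpt.
pages = Array.new(250) do |i|
  { id: i, excerpt: (i.even? ? nil : "written by hand") }
end

# Hypothetical helper: fills in missing excerpts in batches and returns
# how many records were changed.
def backfill_excerpts(pages, default: "some default excerpt", batch_size: 100)
  updated = 0
  pages.each_slice(batch_size) do |batch|
    batch.each do |page|
      # Leave pages that already have an excerpt alone.
      next unless page[:excerpt].nil? || page[:excerpt].empty?
      page[:excerpt] = default
      updated += 1
    end
    # With a real database, each batch would be saved here before moving on.
  end
  updated
end

puts backfill_excerpts(pages)  # 125
```

Because pages that already have an excerpt are skipped, running the backfill a second time changes nothing, which makes it safe to re-run for any strays.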
