Automatically re-indexing Sunspot/Solr on edit - ruby-on-rails

I am using Sunspot, which is a Solr-based search engine, in my Rails app.
According to this RailsCasts episode on Search with Sunspot:
Sunspot automatically indexes any new records but not existing ones. We can tell Sunspot to reindex the existing records by running: rake sunspot:reindex
That should basically reindex the whole model.
However, I have a model (TicketSubject) with an attribute, category, and I would like to reindex a ticket_subject each time its category is edited.
How can I go about this?

Why not use an ActiveRecord callback for this, such as after_update?
after_update()
Is called after Base.save on existing objects that have a record. Note that this callback is still wrapped in the transaction around save. For example, if you invoke an external indexer at this point it won't see the changes in the database.
http://api.rubyonrails.org/v2.3.8/classes/ActiveRecord/Callbacks.html#M001377
Define: Callbacks are hooks into the life cycle of an Active Record object that allow you to trigger logic before or after an alteration of the object state.
http://api.rubyonrails.org/classes/ActiveRecord/Callbacks.html
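A minimal sketch of that idea in plain Ruby, so it runs without Rails or Solr. FakeIndex and the hand-written save method are stand-ins invented for illustration; in a real app the hook would presumably be declared with after_update on the ActiveRecord model and its body would ask Sunspot to index the record. The hook reindexes only when category actually changed.

```ruby
# Hypothetical stand-in for the search index (the role Sunspot plays in a real app).
class FakeIndex
  attr_reader :indexed

  def initialize
    @indexed = []
  end

  def index(record)
    @indexed << [record.id, record.category]
  end
end

# Plain-Ruby stand-in for an ActiveRecord model with an after_update hook.
class TicketSubject
  attr_reader :id
  attr_accessor :category

  def initialize(id, category, index)
    @id = id
    @category = category
    @saved_category = category
    @index = index
  end

  # Mimics saving an existing record: persist the value, then run the hook.
  def save
    changed = @category != @saved_category
    @saved_category = @category
    after_update(changed)
    true
  end

  private

  # Reindex only when the category was edited.
  def after_update(category_changed)
    @index.index(self) if category_changed
  end
end

TICKET_INDEX = FakeIndex.new
subject = TicketSubject.new(1, "billing", TICKET_INDEX)
subject.save                  # nothing changed, nothing is indexed
subject.category = "support"
subject.save                  # category changed, record is reindexed
```

Given the note above about the callback running inside the save transaction, an after_commit hook may be the safer choice when the indexer reads from the database.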

Related

Rails - Destroy all records

Is there a way to destroy all records in my database in one line, without specifying my models?
Say I have three models: User, Picture, and Post. I can call User.all.destroy_all and so on, but can I collect all records without specifying the models themselves?
As Sebastian Palma says, you can run the rake task rake db:reset, which drops your database and sets it up again.
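Alternatively, you can get all the descendants of ActiveRecord::Base. If you're in development mode, you'll need to eager load first:
Rails.application.eager_load!
Then you could do (skipping abstract classes such as ApplicationRecord, which have no table):
ActiveRecord::Base.descendants.reject(&:abstract_class?).each(&:destroy_all)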
PLEASE BE EXTREMELY CAREFUL! THE ABOVE WILL DELETE ALL RECORDS IN YOUR DATABASE TABLES!

Using the Sunspot gem with an STI table

I have an Account model, and Asset, Capital, and Revenue all inherit from it. The Account model has three attributes: name, code, and type. When I create an account, two inserts happen, one for the account and one for its type. For example:
Account.create(name: "test123", code:"test123", type:"Asset")
SQL runs two inserts, one for the Account model and one for the Asset table,
and Sunspot works well: it reindexes my database and I can search with my params.
But when I update my Account model, SQL runs one insert and one update.
My question is: how can I reindex just the particular record when I update it? I can run Sunspot.reindex, but that loads all the data from my database, which is slow.
"SQL runs two inserts, one for the Account model and one for the Asset table"
FYI, you use STI when you want to share the same database table between multiple models because they are similar in attributes and behavior. For example, an AdminUser model is likely to have almost the same attributes/columns as PublisherUser or ReaderUser. Therefore you might want a common table called users (or a model User) and share this table among those models.
The point is: ActiveRecord will run a single SQL query, not two, like:
INSERT INTO "accounts" ("name", "code", "type") VALUES ('test123', 'test123', 'Asset')
"How can I reindex just the particular record when I update it? I can run Sunspot.reindex, but that loads all the data from my database, which is slow."
Actually, sunspot_rails is designed to reindex automatically whenever you make changes to your model/record. It listens to the save callbacks.
But you need to make sure that you are not using methods like update_column(s). See the list of silent create/update methods, which do not trigger callbacks or validations at all.
In addition, you need to understand the concept of batch size in Solr. For performance reasons, new index entries are not committed immediately. Committing means writing the index changes out, much like a commit in an RDBMS.
By default the batch_size for commits is 50, meaning the index entries are committed, and the records become searchable, only after 50 index calls. To change it, use the following:
# in config/initializers/sunspot_config.rb
Sunspot.config.indexing.default_batch_size = 1 # or any number
or
# in models; it's not considered good practice, though
after_commit do
  Sunspot.commit
end
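The batching behaviour described above can be illustrated with a toy index in plain Ruby. This is only a sketch of the answer's description, not Sunspot's or Solr's actual internals; BatchingIndex and its methods are hypothetical names. Records sit in a pending list and become searchable only once a commit happens, either automatically when the batch fills up or through an explicit commit.

```ruby
# Toy index (hypothetical): records are searchable only after a commit.
class BatchingIndex
  def initialize(batch_size)
    @batch_size = batch_size
    @pending = []
    @committed = []
  end

  # Queue a record; commit automatically once the batch is full.
  def index(record)
    @pending << record
    commit if @pending.size >= @batch_size
  end

  # Make all pending records searchable.
  def commit
    @committed.concat(@pending)
    @pending.clear
  end

  def search(term)
    @committed.select { |record| record.include?(term) }
  end
end

BATCHED = BatchingIndex.new(3)
BATCHED.index("test123")
BEFORE_COMMIT = BATCHED.search("test") # record is still pending
BATCHED.commit
AFTER_COMMIT = BATCHED.search("test")  # record is now searchable
```

With a batch size of 1, every index call would commit immediately, which is the trade-off the configuration option above exposes: fresher search results at the cost of more commits.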
For manual re-indexing, you can do as @Kathryn suggested.
But I don't think you need to intervene in the automatic operation. I suspect you were worried because you were not seeing immediate results.
According to the documentation, objects will be indexed automatically if you are on Rails. But it also mentions you can reindex a class manually:
Account.reindex
Sunspot.commit
It also suggests using Sunspot.index on individual objects.
I put this in my model:
after_update do
  Sunspot.index Account.where(id: self.id)
end

Rails Elastic Search reindex data after update_all

I am using Searchkick to interact with the Elasticsearch API in my Rails app, and it works fine in almost all cases. The problem is that my model has a status field, and through a select-all checkbox a user can change the status of all the records at once. I update the data with update_all, which doesn't fire any callbacks, while Searchkick reindexes data through an after_commit callback. Since the data is not reindexed in Elasticsearch this way, searches return the old results. What am I supposed to do? Is calling Model.reindex manually a good option?
I actually solved it without reindexing the whole data set, which would have been a really naive solution. Instead, we can reindex a single record, like below:
product = Product.find 10
product.reindex
# or to reindex in the background
product.reindex_async
You have to call Model.reindex manually. update_all is built to make changes directly at the DB level. Find more here.
You can create an after_action filter to reindex the data.
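To see why the index goes stale and how reindexing only the affected records fixes it, here is a plain-Ruby sketch that runs without Rails or Elasticsearch. The Product class, RECORDS, and SEARCH_INDEX are stand-ins invented for illustration; the update_all here only imitates the real method's habit of skipping per-record hooks.

```ruby
# Hypothetical stand-in for a searchable model.
class Product
  RECORDS = {}      # stand-in for the database table
  SEARCH_INDEX = {} # stand-in for the Elasticsearch index

  attr_reader :id
  attr_accessor :status

  def initialize(id, status)
    @id = id
    @status = status
    RECORDS[id] = self
    reindex
  end

  # Copies the record's current state into the search index.
  def reindex
    SEARCH_INDEX[id] = status
  end

  # Bulk write that skips per-record hooks, imitating update_all.
  def self.update_all(ids, status)
    ids.each { |id| RECORDS[id].status = status }
  end
end

Product.new(1, "draft")
Product.new(2, "draft")

selected_ids = [1, 2]
Product.update_all(selected_ids, "published")
STALE = Product::SEARCH_INDEX.dup # index still reports "draft"

# Reindex only the records touched by the bulk update.
selected_ids.each { |id| Product::RECORDS[id].reindex }
FRESH = Product::SEARCH_INDEX.dup
```

In a real app the last step would correspond to calling reindex on each affected record, as in the product.reindex example above, instead of rebuilding the whole index.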

How to reindex only some objects in Sunspot Solr

We use Sunspot Solr for indexing and searching in our Ruby on Rails application.
We wanted to reindex some objects and someone accidentally ran the Product.reindex command from the Rails Console. The result was that indexing of all products started from scratch and our catalogue appeared empty while indexing was taking place.
Since we have a vast amount of data, the reindexing has taken three days so far. This morning, when I checked on the progress of the reindexing, it seemed that one corrupt data entry had caused the reindexing to stop without completing.
I cannot restart the entire Product.reindex operation, as it takes way too long. Is there a way to run reindexing only on selected products? I want to select a range of products that aren't indexed and then run indexing on just those. How can I add a single product to the index without having to run a complete reindex of the entire data set?
Sunspot does index an object in the save callback, so you could save each object, but that might trigger other callbacks too. A more precise way to do it would be:
Sunspot.index [post1, post2]
Sunspot.commit
or with autocommit
Sunspot.index! [post1, post2]
You could even pass in an association, since it behaves like an array too:
Sunspot.index! post1.comments
I have found the answer on https://github.com/sunspot/sunspot#reindexing-objects
Whenever an object is saved, it is automatically reindexed as part of the save callbacks. So all that was needed was to add all the objects that needed reindexing to an array and then loop through the array, calling save on each object. This successfully updated the required objects in the index.
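For a large catalogue, one way to approach this is to index only the missing records in small batches and isolate any corrupt entry instead of letting it stop the whole run. The sketch below is plain Ruby with invented stand-ins: index_batch plays the role of the real indexing call, the hashes play the role of products, and a nil name marks a corrupt record.

```ruby
# Hypothetical indexing call: rejects the whole batch if any record is corrupt.
def index_batch(records, index)
  records.each do |record|
    raise ArgumentError, "corrupt record #{record[:id]}" if record[:name].nil?
  end
  records.each { |record| index[record[:id]] = record[:name] }
end

# Index only records missing from the index, in batches.
# Returns the ids of records that could not be indexed.
def index_missing(records, index, batch_size)
  failed = []
  missing = records.reject { |record| index.key?(record[:id]) }
  missing.each_slice(batch_size) do |batch|
    begin
      index_batch(batch, index)
    rescue ArgumentError
      # Retry one by one so a single bad record does not block its batch.
      batch.each do |record|
        begin
          index_batch([record], index)
        rescue ArgumentError
          failed << record[:id]
        end
      end
    end
  end
  failed
end

PRODUCTS = [
  { id: 1, name: "lamp" },
  { id: 2, name: nil },    # corrupt entry
  { id: 3, name: "desk" },
  { id: 4, name: "chair" }
]
INDEXED = { 1 => "lamp" }  # product 1 is already in the index
FAILED_IDS = index_missing(PRODUCTS, INDEXED, 2)
```

The returned list of failed ids gives a starting point for fixing the corrupt data separately, while everything else ends up in the index.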

What does User.destroy_all or User.delete_all do?

I am working on a project that has the following cucumber step:
Given /^no registered users$/ do
  User.delete_all
end
As a new RoR user this looks a little dangerous even though I'd be testing on our development database because our User table has actual data. What is the line of code doing?
Thanks!
delete_all is from the ActiveRecord library, not from FactoryGirl.
The difference between the two is:
delete_all(conditions = nil) public
Deletes the records matching conditions without instantiating the records first, and hence not calling the destroy method nor invoking callbacks.
This is a single SQL DELETE statement that goes straight to the database, much more efficient than destroy_all.
Be careful with relations though, in particular :dependent rules defined on associations are not honored.
Returns the number of rows affected.
destroy_all(conditions = nil) public
Destroys the records matching conditions by instantiating each record and calling its destroy method.
Each object’s callbacks are executed (including :dependent association options and before_destroy/after_destroy Observer methods).
Returns the collection of objects that were destroyed; each will be frozen, to reflect that no changes should be made (since they can’t be persisted).
Note
Instantiation, callback execution, and deletion of each record can be time consuming when you’re removing many records at once. It generates at least one SQL DELETE query per record. If you want to delete many rows quickly, without concern for their associations or callbacks, use delete_all instead.
delete_all is not from FactoryGirl; it is an ActiveRecord method, and it deletes the users from your database. If you are running this from Cucumber, it should run against your test database, not development.
A better alternative is destroy_all, since that version will run any associated callbacks. For example, if users have posts and you have a before_destroy callback that removes a user's posts, destroy_all runs it and delete_all does not.
Here's a link to more info about delete_all
delete_all will forcibly remove records from the corresponding table without activating any Rails callbacks.
destroy_all will remove the records but also call the model callbacks.
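The difference can be imitated in plain Ruby. This is a sketch only: the User class below is a hypothetical stand-in, not ActiveRecord, with an array in place of the table and a log in place of real callbacks. It shows delete_all wiping rows in one step without touching any record, and destroy_all going through each record's destroy.

```ruby
# Hypothetical stand-in for a model; TABLE plays the database table.
class User
  TABLE = []
  LOG = []

  attr_reader :name

  def initialize(name)
    @name = name
    TABLE << self
  end

  # Per-record removal that runs a callback first.
  def destroy
    LOG << "before_destroy #{name}"
    TABLE.delete(self)
  end

  # Removes every row in one step; no record's destroy is called.
  def self.delete_all
    count = TABLE.size
    TABLE.clear
    count
  end

  # Calls destroy on each record, so callbacks run.
  def self.destroy_all
    TABLE.dup.each(&:destroy)
  end
end

User.new("ann")
User.new("bob")
DELETED_COUNT = User.delete_all    # number of rows removed
LOG_AFTER_DELETE = User::LOG.dup   # no callbacks ran

User.new("cat")
DESTROYED = User.destroy_all       # the records that were destroyed
LOG_AFTER_DESTROY = User::LOG.dup  # one callback ran
```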
Based on your example, it's probably deleting all users in order to allow the next Cucumber step to register new users. The ActiveRecord::Base#delete_all method says, in part:
Deletes the records matching conditions without instantiating the
records first, and hence not calling the destroy method nor invoking
callbacks. This is a single SQL DELETE statement that goes straight to
the database, much more efficient than destroy_all.
There are probably better ways to write that test, but the intent is clearly to remove the user records as efficiently as possible.
As for it being dangerous, your tests should be running against the test database, not the development or production databases. Since it's possible to misconfigure your testing framework to use the wrong database, you could certainly add a step or conditional that tests if Rails.env.test? is true. That's a fairly small price to pay for peace of mind.
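A guard along those lines might look like the following sketch. The helper name guard_destructive_step! is invented for illustration, and the environment is passed in as a plain string so the example runs without Rails; inside a Rails project the check would more likely be Rails.env.test?.

```ruby
# Hypothetical helper: refuse to run destructive steps outside the test environment.
def guard_destructive_step!(env_name)
  return if env_name == "test"

  raise "Refusing to delete records in the #{env_name} environment"
end

# In a step definition this would run before User.delete_all.
guard_destructive_step!("test") # passes silently
```

Calling the helper with "development" or "production" raises before any records are touched, so a misconfigured test run fails loudly instead of deleting real data.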

Resources