How to reindex only some objects in Sunspot Solr - ruby-on-rails

We use Sunspot Solr for indexing and searching in our Ruby on Rails application.
We wanted to reindex some objects and someone accidentally ran the Product.reindex command from the Rails Console. The result was that indexing of all products started from scratch and our catalogue appeared empty while indexing was taking place.
Since we have a vast amount of data, the reindexing has taken three days so far. This morning when I checked on the progress, it seemed that one corrupt data entry had caused the reindexing to stop without completing.
I cannot restart the entire Product.reindex operation as it takes way too long. Is there a way to run reindexing only on selected products? I want to select a range of products that aren't indexed and then run indexing on just those. How can I add a single product to the index without having to run a complete reindex of the entire data set?

Sunspot does index an object in the save callback, so you could save each object, but that might trigger other callbacks too. A more precise way to do it would be:
Sunspot.index [post1, post2]
Sunspot.commit
or, to index and commit in one call:
Sunspot.index! [post1, post2]
You could even pass in an association, since it behaves like an array too:
Sunspot.index! post1.comments
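Since the original run was stopped by one corrupt entry, it may help to index the selected products in slices and fall back to one-by-one indexing when a slice fails. The following is only a sketch under assumptions: index_selected, its parameters, and the failures list are hypothetical names, and the indexer argument is assumed to be the Sunspot module (or anything else that responds to index and commit).

```ruby
# Index records in slices. If a slice raises, retry its records one at a time
# so that a single corrupt record cannot abort the whole run.
# `indexer` is assumed to respond to `index` and `commit` (e.g. Sunspot).
# Returns an array of [record, error_message] pairs for records that failed.
def index_selected(records, indexer:, batch_size: 500)
  failures = []
  records.each_slice(batch_size) do |batch|
    begin
      indexer.index(batch)
    rescue StandardError
      # Something in this slice is broken: isolate it record by record.
      batch.each do |record|
        begin
          indexer.index(record)
        rescue StandardError => e
          failures << [record, e.message]
        end
      end
    end
  end
  indexer.commit # commit once at the end instead of after every slice
  failures
end
```

In a Rails console this might be called as index_selected(Product.where(id: missing_ids).to_a, indexer: Sunspot), where missing_ids is whatever list of unindexed product ids you have collected. The returned failures then point at the corrupt entries without stopping the rest of the run.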

I have found the answer on https://github.com/sunspot/sunspot#reindexing-objects
Whenever an object is saved, it is automatically reindexed as part of the save callbacks. So all that was needed was to add all the objects that needed reindexing to an array and then loop through the array, calling save on each object. This successfully updated the required objects in the index.

Related

ElasticSearch should_index? not checking the correct condition in Model.reindex

I'm using the searchkick gem for Elasticsearch. I've implemented the "should_index?" method in the model to reindex selected records.
For example, at first the "should_index?" method checked 2 boolean columns with an AND condition to decide whether a record should be reindexed. Afterwards I changed the number of columns from 2 to 4 and the condition from AND to OR in "should_index?", but it is still reading the old conditions.
The should_index? method works correctly on model objects. When I reindex this way, "Model.all.map(&:reindex)", it reads the condition properly from "should_index?" in the model and updates the records accordingly.
But when I reindex this way, "Model.reindex", it reads the old conditions of "should_index?" rather than the updated ones in the model.
I reset Elasticsearch and read the gem documentation but could not find any solution.

Dealing with ThinkingSphinx realtime indices when destroying records

I'm trying to convert a delta-based index in ThinkingSphinx into a realtime one. Per the docs, I've added this callback:
after_save ThinkingSphinx::RealTime.callback_for(:location)
That works just fine for adding and updating records, woo. My problem is in deleting records, which according to the Rails docs, don't trigger after_save callbacks. I've confirmed this by deleting a record, which is not then deleted from my sphinx index.
I tried
after_destroy ThinkingSphinx::RealTime.callback_for(:location)
But this raises an error (as the realtime callbacks do not support after_destroy).
How can I remove an entry from my index when using a real time index?
(thinking-sphinx 3.3.0, rails 5.0.4, if that helps)
Thinking Sphinx automatically adds its own after_destroy callback to all indexed models, so removal of these records from real-time indices should happen without you needing to add any code.
In case you still need to do this manually, for example when soft-deleting objects, you can call:
ThinkingSphinx::ActiveRecord::Callbacks::DeleteCallbacks.after_destroy(instance)
Reference: https://github.com/pat/thinking-sphinx/issues/1057
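For the soft-delete case, the manual call could be wrapped in a small helper. This is a rough sketch, not Thinking Sphinx's own API: soft_delete, the deleted_at column, and the remove_from_index parameter are all hypothetical, and the default lambda simply makes the call quoted above.

```ruby
# Mark a record as deleted and remove it from the real-time index.
# `deleted_at` is a hypothetical soft-delete column on the model.
# `remove_from_index` is injectable so the helper can be exercised
# without Thinking Sphinx loaded; by default it makes the manual call.
def soft_delete(record, remove_from_index: nil)
  remove_from_index ||= lambda do |r|
    ThinkingSphinx::ActiveRecord::Callbacks::DeleteCallbacks.after_destroy(r)
  end
  # Only touch the index if the database update succeeded.
  return false unless record.update(deleted_at: Time.now)
  remove_from_index.call(record)
  true
end
```

Calling soft_delete(location) keeps the row in the database while dropping it from search results; a restore path would need to index the record again.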

Rails Elastic Search reindex data after update_all

I am using Searchkick to interact with the Elasticsearch API in my Rails app, and it works fine in almost all cases. The problem is that my model has a status field, and through a select-all checkbox a user can change the status of all the records. I update the data with update_all, which doesn't fire any callbacks, and Searchkick reindexes data through an after_commit callback. Since my data is not reindexed in Elasticsearch this way, searches return the same stale results. What am I supposed to do? Is calling Model.reindex manually a good option?
I actually solved it without reindexing the whole data set, which would have been a really naive solution. Instead, we can reindex a single record, like below:
product = Product.find 10
product.reindex
# or to reindex in the background
product.reindex_async
You have to call Model.reindex manually. update_all is built to make changes directly at the DB level.
You can create an after_action filter to reindex the data.
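A middle ground between the two answers is to reindex only the rows that update_all touched. The sketch below assumes a model that responds to where, relations that respond to update_all and find_each, and records that respond to reindex (as in the product.reindex example above); update_status_and_reindex is a hypothetical helper name.

```ruby
# Bulk-update the status of the given ids, then reindex just those records.
# update_all skips callbacks, so the reindex has to be triggered by hand.
# Returns the number of rows reported as updated by update_all.
def update_status_and_reindex(model, ids, status)
  updated = model.where(id: ids).update_all(status: status)
  # Reload the affected rows so the indexed data reflects the new status.
  model.where(id: ids).find_each { |record| record.reindex }
  updated
end
```

From a controller this might look like update_status_and_reindex(Product, params[:ids], "active"). For very large selections, a background job would probably be a better place for the reindex loop.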

Automatically re-indexing Sunspot/Solr on edit

I am using Sunspot, which is a Solr-based search engine, in my Rails app.
According to this Rails Cast episode on Search with Sunspot:
Sunspot automatically indexes any new records but not existing ones. We can tell Sunspot to reindex the existing records by running: rake sunspot:reindex
Which should basically reindex the whole Model.
However, I have a Model (TicketSubject) with an attribute, category, and I would like to reindex a ticket_subject each time its category is edited.
How can I go about this?
Why don't you use an ActiveRecord callback for this i.e. after_update?
after_update()
Is called after Base.save on existing objects that have a record. Note that this callback is still wrapped in the transaction around save. For example, if you invoke an external indexer at this point it won't see the changes in the database.
http://api.rubyonrails.org/v2.3.8/classes/ActiveRecord/Callbacks.html#M001377
Define: Callbacks are hooks into the life cycle of an Active Record object that allow you to trigger logic before or after an alteration of the object state.
http://api.rubyonrails.org/classes/ActiveRecord/Callbacks.html

Scaffolding user ID resetting

In the application I am currently creating in Ruby on Rails, I am trying to do some tests in the Rails console where I have to destroy data in the database, and the database is connected to a server. I am importing an XML file, parsing it, and putting it into a database with scaffolding.
What I need: basically, I am attempting to destroy the data and replace it with new data every week. The problem is that the user ID has gone up to 700+ even though there are only 50 records, because it doesn't reset.
To delete all records I am currently using "whatever.destroy_all", which does the trick.
Any help?
By the way, I am using SQLite.
The ID column created in the table usually is set as unique and to increment by 1 for each new record, which is why each time you destroy and add new data the ID keeps getting higher.
The fact that the ID # is getting larger and larger is not an issue at all.
If you really want to start back at zero, I would think you could drop the table and recreate it, but that seems like overkill for a trivial issue.
Regarding the connection to the other scaffold, how are you connecting the two and what do they both represent?
Ideally the data population for testing should be done through fixtures (or easy tools such as factorygirl etc..)
The main advantage of having a fixed data set is that you can run your tests in any environment. But as per your requirement, you can do something like this:
When you populate the data through Active Record, pass the id parameter as well.
Ex: User.create(:id => 1, :name => "sameera")
This way you can have constant ids, but make sure you increment the id accordingly.
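If the goal is instead to make SQLite start numbering again after destroy_all, one option is to clear the table's row in SQLite's sqlite_sequence bookkeeping table. This is a sketch under assumptions: it only applies when the table was created with AUTOINCREMENT (assumed here, since the ids keep climbing after deletes), reset_sqlite_sequence is a hypothetical helper, and the connection is assumed to respond to quote and execute the way ActiveRecord::Base.connection does.

```ruby
# Forget the last-used id for one table so the next insert starts at 1 again.
# Intended to run right after the table has been emptied.
# `connection` is assumed to respond to `quote` and `execute`.
def reset_sqlite_sequence(connection, table_name)
  quoted = connection.quote(table_name) # quote as a string literal
  connection.execute("DELETE FROM sqlite_sequence WHERE name = #{quoted}")
end
```

In the console this could follow the weekly cleanup, for example User.destroy_all and then reset_sqlite_sequence(ActiveRecord::Base.connection, "users"). Running it while rows still exist in the table risks primary key collisions, so it only makes sense on an empty table.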
