I'm trying to convert a delta-based index in ThinkingSphinx into a realtime one. Per the docs, I've added this callback:
after_save ThinkingSphinx::RealTime.callback_for(:location)
That works just fine for adding and updating records, woo. My problem is with deleting records, which, according to the Rails docs, don't trigger after_save callbacks. I've confirmed this by deleting a record, which is then not removed from my Sphinx index.
I tried:
after_destroy ThinkingSphinx::RealTime.callback_for(:location)
But this raises an error (as the realtime callbacks do not support after_destroy).
How can I remove an entry from my index when using a real time index?
(thinking-sphinx 3.3.0, rails 5.0.4, if that helps)
Thinking Sphinx automatically adds its own after_destroy callback to all indexed models, so removal of these records from real-time indices should happen without you needing to add any code.
In case you still need to do this manually, for example when soft-deleting objects, you can invoke the delete callback directly:
ThinkingSphinx::ActiveRecord::Callbacks::DeleteCallbacks.after_destroy(instance)
Reference: https://github.com/pat/thinking-sphinx/issues/1057
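For instance, a minimal soft-delete sketch (the deleted flag and soft_delete! method are illustrative, not part of Thinking Sphinx):
class Location < ApplicationRecord
  after_save ThinkingSphinx::RealTime.callback_for(:location)

  # Soft delete: the row stays in the database, so Thinking Sphinx's
  # built-in after_destroy never fires; remove the document manually.
  def soft_delete!
    update(deleted: true)
    ThinkingSphinx::ActiveRecord::Callbacks::DeleteCallbacks.after_destroy(self)
  end
end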
Related
I am using Searchkick to interact with the Elasticsearch API in my Rails app, and it's working fine for almost all cases. The problem I'm facing: my model has a status field, and through a select-all checkbox a user can change the status of all the records. I'm updating the data using update_all, which doesn't fire any callbacks, and Searchkick reindexes data through an after_commit callback. Since my data never gets reindexed in Elasticsearch this way, searches keep returning the same results. What am I supposed to do? Is calling Model.reindex manually a good option?
I actually solved it without reindexing the whole data set, which would have been a really naive solution. Instead, we can reindex a single record, like below:
product = Product.find 10
product.reindex
# or to reindex in the background
product.reindex_async
You have to call Model.reindex manually. update_all is built to make changes directly at the DB level, so no callbacks fire.
You can create an after_action filter in the controller to reindex the data.
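Putting those together, a minimal sketch (model and attribute names are illustrative):
ids = params[:product_ids]
Product.where(id: ids).update_all(status: "inactive") # single SQL UPDATE, no callbacks
Product.where(id: ids).find_each(&:reindex)           # push each change to Elasticsearch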
We use Sunspot Solr for indexing and searching in our Ruby on Rails application.
We wanted to reindex some objects and someone accidentally ran the Product.reindex command from the Rails Console. The result was that indexing of all products started from scratch and our catalogue appeared empty while indexing was taking place.
Since we have a vast amount of data, the reindexing has taken three days so far. This morning when I checked on its progress, it seems one corrupt data entry caused the reindexing to stop without completing.
I cannot restart the entire Product.reindex operation again as it takes way too long. Is there a way to run reindexing on selected products only? I want to select a range of products that aren't indexed and then run indexing on just those. How can I add a single product to the index without having to run a complete reindex of the entire data set?
Sunspot does index an object in its save callback, so you could save each object, but that might trigger other callbacks too. A more precise way to do it would be:
Sunspot.index [post1, post2]
Sunspot.commit
or with autocommit
Sunspot.index! [post1, post2]
You could even pass in object relations as they are just an array too
Sunspot.index! post1.comments
I have found the answer on https://github.com/sunspot/sunspot#reindexing-objects
Whenever an object is saved, it is automatically reindexed as part of the save callbacks. So all that was needed was to collect the objects that needed reindexing into an array, then loop through the array calling save on each object. This successfully updated the required objects in the index.
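A batched version of the same idea, assuming last_indexed_id marks roughly where the failed run stopped (variable name is illustrative):
Product.where("id > ?", last_indexed_id).find_in_batches(batch_size: 500) do |batch|
  Sunspot.index(batch)  # index this slice without touching the rest of the catalogue
end
Sunspot.commit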
I am working on a project that has the following cucumber step:
Given /^no registered users$/ do
User.delete_all
end
As a new RoR user this looks a little dangerous even though I'd be testing on our development database because our User table has actual data. What is the line of code doing?
Thanks!
delete_all is from the ActiveRecord library, not from FactoryGirl.
The difference between delete_all and destroy_all is:
delete_all(conditions = nil) public
Deletes the records matching conditions without instantiating the records first, and hence not calling the destroy method nor invoking callbacks.
This is a single SQL DELETE statement that goes straight to the database, much more efficient than destroy_all.
Be careful with relations though, in particular :dependent rules defined on associations are not honored.
Returns the number of rows affected.
destroy_all(conditions = nil) public
Destroys the records matching conditions by instantiating each record and calling its destroy method.
Each object’s callbacks are executed (including :dependent association options and before_destroy/after_destroy Observer methods).
Returns the collection of objects that were destroyed; each will be frozen, to reflect that no changes should be made (since they can’t be persisted).
Note
Instantiation, callback execution, and deletion of each record can be time-consuming when you're removing many records at once. It generates at least one SQL DELETE query per record. If you want to delete many rows quickly, without concern for their associations or callbacks, use delete_all instead.
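In code, the contrast looks like this (the scope is illustrative):
User.where(admin: false).delete_all   # one SQL DELETE; returns the number of rows removed
User.where(admin: false).destroy_all  # loads each user, runs callbacks, one DELETE per record;
                                      # returns the (frozen) destroyed objects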
delete_all is not from FactoryGirl; it is an ActiveRecord command, and it deletes the users from your database. If you are running this from Cucumber, it should run against your test database, not development.
A better alternative is destroy_all, since that version will run any associated callbacks: for example, if users have posts, a before_destroy callback could remove a user's posts when the user is deleted.
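To make that concrete (models assumed):
class User < ActiveRecord::Base
  has_many :posts, dependent: :destroy  # honored by destroy_all, silently skipped by delete_all
end

User.destroy_all  # users and their posts are removed
User.delete_all   # users vanish; their posts are left orphaned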
delete_all will forcibly remove records from the corresponding table without invoking any Rails callbacks.
destroy_all will also remove the records, but it calls the model callbacks as well.
Based on your example, it's probably deleting all users in order to allow the next Cucumber step to register new users. The ActiveRecord::Base#delete_all method says, in part:
Deletes the records matching conditions without instantiating the
records first, and hence not calling the destroy method nor invoking
callbacks. This is a single SQL DELETE statement that goes straight to
the database, much more efficient than destroy_all.
There are probably better ways to write that test, but the intent is clearly to remove the user records as efficiently as possible.
As for it being dangerous, your tests should be running against the test database, not the development or production databases. Since it's possible to misconfigure your testing framework to use the wrong database, you could certainly add a step or conditional that tests if Rails.env.test? is true. That's a fairly small price to pay for peace of mind.
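For instance, the step itself could refuse to run outside the test environment (a sketch):
Given /^no registered users$/ do
  raise "refusing to wipe users outside the test environment" unless Rails.env.test?
  User.delete_all
end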
I'm just wondering whether any of you know when ActiveRecord uses its "magic" to record the timestamps (e.g. created_at, updated_at).
What I mean by when is: at which callback? (if AR uses a callback at all)
I'm asking because I want to create an auto-updating column (one that records a sequential number for each object), and I want to replicate AR's way of doing this as closely as possible.
EDITED:
It seems that AR does it between after_validation and before_create/before_update. You can test this by adding a presence validation on the created_at column and inserting a new record with a blank created_at: the validation fails, showing the timestamp has not yet been set at validation time.
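A quick way to reproduce that experiment (the Post model is illustrative):
class Post < ActiveRecord::Base
  validates :created_at, presence: true  # timestamps aren't set yet when validations run
end

post = Post.new
post.save                 # => false
post.errors[:created_at]  # => ["can't be blank"]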
I don't know where AR does it, but the proper place for what you describe sounds like before_create.
In Rails 3.2.12, this code is located in lib/active_record/timestamp.rb.
As you mention in your question and DGM suggests, Rails will update the timestamps when creating or updating, so sticking your code in before_create and before_update should work.
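A minimal sketch of that approach (the Ticket model and sequence_number column are assumptions, and the naive max-plus-one lookup is racy under concurrent creates):
class Ticket < ActiveRecord::Base
  before_create :assign_sequence_number

  private

  # Mimic AR's timestamping: fill the column just before the row is inserted.
  def assign_sequence_number
    self.sequence_number = (self.class.maximum(:sequence_number) || 0) + 1
  end
end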
You may also want to take a look at the ActiveRecord counter_cache functionality. ActiveRecord supports creation of a column that can automatically be incremented/decremented. Additionally, you can perform more complicated logic.
I am using AR-Extensions to import a large number of objects to the DB, but synchronizing them back from the DB just isn't working.
My code:
posts = [Post.new(:name=>"kuku1"), Post.new(:name=>"kuku2"), ...]
Post.import posts, :synchronize=>posts
The posts are submitted to the DB, and each one is automatically allocated a primary key (id). But when I afterwards check the objects in the posts array, I see that they have no id set, and the new_record flag is still true.
I also tried adding :reload=>true, but that doesn't help either.
Any idea why synch doesn't work?
This is not possible right now with new records. As of ar-extensions 0.9.3, this will not work when synchronizing new records, because synchronizing expects the records you're sync'ing to already exist: it uses the primary key under the covers to determine what to load, and with new records the primary key is nil. This limitation* also exists in activerecord-import 0.2.5. If you can synchronize on other conditions, I'd be happy to release a new gem allowing conditions to be passed in. Note that for Rails 3.x you need to use activerecord-import (it replaces ar-extensions). Please create a ticket/issue on GitHub: https://github.com/zdennis/activerecord-import/issues
For Rails 2.x you still want to use ar-extensions, and I'd likely backport the activerecord-import update and push out a new gem there as well. If you'd like this functionality, please create a ticket/issue on GitHub: https://github.com/zdennis/ar-extensions/
Patches are welcome as well.
*The limitation here is a database constraint: it's impossible to get the ids of all newly created records after a single insert/import without doing something strange like table locking, which I don't think is a good solution to that problem. If anyone has ideas, I'm all ears.
UPDATE
activerecord-import 0.2.6 and ar-extensions 0.9.4 have been released; they include support for specifying the fields you want to synchronize on. Those fields should be unique. See http://www.continuousthinking.com/2011/4/6/activerecord-import-0-2-6-and-ar-extensions-0-9-4
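Usage presumably looks something like the following; the :synchronize_keys option name is an assumption based on activerecord-import, so check the linked release notes for the exact API:
posts = [Post.new(:name => "kuku1"), Post.new(:name => "kuku2")]
# :name must be unique for synchronization to find the right rows
Post.import posts, :synchronize => posts, :synchronize_keys => [:name]
posts.first.id           # should now be populated from the database
posts.first.new_record?  # should now be false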