Spring Elasticsearch reindex method - spring-data-elasticsearch

I've written a reindex method that does the following:
public void reindex() {
    IndexOperations indexOperations = elasticsearchOperations.indexOps(Song.class);
    List<Song> songs = songRepository.findAll();
    songSearchRepository.deleteAll();
    indexOperations.delete();
    indexOperations.create();
    songSearchRepository.saveAll(songs);
}
It does the job, but I'm not sure whether it makes sense to just delete and then recreate the index. How can I improve this method?

Actually I don't see the point in reindexing to the same index.
If you want to reindex to a different index, you should use the reindex API of Elasticsearch. This is not yet supported directly in Spring Data Elasticsearch.
To reindex to a different index using Spring Data Elasticsearch you should use paged queries in a loop to read from one index and write the data to the second index.
Edit 16.10.2020:
Don't copy in a loop as I suggested; as @joeyave commented, the indices can get out of sync that way.
I created an issue in Spring Data Elasticsearch Jira to have reindex support implemented.
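For reference, a request to Elasticsearch's reindex API is a POST to _reindex with a source and a destination index. Below is a minimal sketch of building that request, written in Ruby like the other examples on this page. The base URL and the index names songs_v1 and songs_v2 are assumptions for illustration, and the request is only constructed here, not sent:

```ruby
require "json"
require "net/http"
require "uri"

# Build (but do not send) a POST /_reindex request that asks Elasticsearch
# to copy every document from source_index into dest_index.
# The default base URL is an assumption; adjust it to your cluster.
def build_reindex_request(source_index, dest_index, base_url = "http://localhost:9200")
  uri = URI.join(base_url, "/_reindex")
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(
    source: { index: source_index },
    dest:   { index: dest_index }
  )
  request
end

# Hypothetical index names, for illustration only.
request = build_reindex_request("songs_v1", "songs_v2")
puts request.path
puts request.body
```

To actually run it, the request could be passed to Net::HTTP.start(host, port) { |http| http.request(request) }. The destination index should exist with the desired mapping first, otherwise Elasticsearch creates it with dynamic mappings.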

Related

Rails Sunspot/Solr Index Out of Sync

I'm using the sunspot-rails gem for searching and filtering but am running into issues where Solr queries are not returning the correct results. For example, say I want to retrieve all the records of a model with state == Processing, and I do the following Sunspot search:
MyModel.search do
  # pagination and ordering stuff
  ...
  with('state', 'Processing')
  ...
end
Most of the time, this returns the correct results. Sometimes (and so far only on production, I can't duplicate the issue locally) the query will return records with state == In Review. If I do a regular ActiveRecord MyModel.where(state: 'Processing') I always get the correct results.
I thought it might have to do with my solrconfig.xml file, but changing those params hasn't seemed to change anything. Relevant portion of that file:
<autoCommit>
  <maxTime>15000</maxTime>
  <maxDocs>1000</maxDocs>
  <openSearcher>true</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>5000</maxTime>
</autoSoftCommit>
Does anyone have any pointers for why changes in my db aren't reflected in the Solr index, or how I can debug/log what's going on? This is on a small internal app with maybe 100 users or so. I shouldn't have to reindex Solr daily to keep the results up to date.
Thanks.

How to fetch redis data using id in rails app

I am trying to implement a Redis cache in a Rails application. So far I am able to cache the ActiveRecord data in Redis, and I can fetch all the records at once using the get method. But I am having a difficult time figuring out how to fetch a single record by id, since the data stored in Redis is a string.
Following is the data cached by redis:
"set" "bookstore:authors" "[{\"id\":1,\"name\":\"Stephenie Meyer\",\"created_at\":\"2018-05-03T10:58:20.326Z\",\"updated_at\":\"2018-05-03T10:58:20.326Z\"},{\"id\":2,\"name\":\"V.C. Andrews\",\"created_at\":\"2018-05-03T10:58:20.569Z\",\"updated_at\":\"2018-05-03T10:58:20.569Z\"}]
Now I am calling
authors = $redis.get('authors')
to display all the authors.
How can I fetch a single author using his id?
Helper method to fetch authors
def fetch_authors
  authors = $redis.get('authors')
  if authors.nil?
    # Cache miss: load from the database and cache the JSON for 5 hours
    authors = Author.all.to_json
    $redis.set("authors", authors)
    $redis.expire("authors", 5.hours.to_i)
  end
  JSON.load authors
end
For your use case, using a hash is probably better. You could use the following commands to achieve that:
HSET bookstore:authors 1 "<author json>"
HGET bookstore:authors 1 // Returns that author's json data
Or you can store each author on its own:
SET bookstore:authors:1 <author_json>
GET bookstore:authors:1 // Returns that author's json data
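A minimal Ruby sketch of the hash approach. It assumes a client that responds to hset and hget the way the redis gem's client does. The FakeRedis class and the helper names cache_authors and fetch_author are hypothetical, and FakeRedis is only an in-memory stand-in so the example runs without a server:

```ruby
require "json"

# In-memory stand-in for a Redis client, used only so this sketch runs
# without a server. A real client from the redis gem would replace it.
class FakeRedis
  def initialize
    @hashes = Hash.new { |store, key| store[key] = {} }
  end

  def hset(key, field, value)
    @hashes[key][field.to_s] = value.to_s
  end

  def hget(key, field)
    @hashes[key][field.to_s]
  end
end

AUTHORS_KEY = "bookstore:authors"

# Store each author as a JSON string under its id in a single Redis hash.
def cache_authors(redis, authors)
  authors.each do |author|
    redis.hset(AUTHORS_KEY, author["id"], JSON.generate(author))
  end
end

# Return one author as a Hash, or nil if that id is not cached.
def fetch_author(redis, id)
  json = redis.hget(AUTHORS_KEY, id)
  json && JSON.parse(json)
end

redis = FakeRedis.new
cache_authors(redis, [
  { "id" => 1, "name" => "Stephenie Meyer" },
  { "id" => 2, "name" => "V.C. Andrews" }
])
puts fetch_author(redis, 2)["name"]
```

With a hash, a single author can be read or updated without parsing the whole list. The trade-off is that the expiry applies to the hash key as a whole, not to individual fields.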

Wrong data returned from elasticsearch after recreate the index

For testing and development reasons, we reindex our data from a Rails app by deleting an index, recreating it with its mapping, and importing the existing documents.
But after recreating the index, Elasticsearch returns results that differ both from what we expect and from what it returned before the recreation. If we restart the Elasticsearch instance, the results are as expected.
This is how we recreate the index.
Tire.index indexname do
  delete
  create _mappings
  import _objects
  refresh
end
We also ran the search query directly against Elasticsearch via curl, but did not get the expected result. After restarting the Elasticsearch daemon, the same query returns the expected data.
What has to be done so that an Elasticsearch instance returns correct data after an index is recreated under the same name, without a restart? We also tried creating new indices with timestamped names and aliasing the index name to them, but with the same results.
Thanks in advance.

Rails, Soulmate, Redis remove record

I use Soulmate to autocomplete search results, but I want to be able to delete records after a while so they don't show up in the search field again. Reloading the whole list with Soulmate seems a bit hacky and unnecessary.
I loaded the data as JSON, and each record has a unique "id":
{"id":1547,"term":"Foo Baar, Baaz","score":85}
How can I delete that record from Redis so it won't show up in the search results again?
It is not trivial to do it directly from Redis, using redis-cli commands.
Looking at soulmate code, the data structure is as follows:
a soulmate-index:[type] set containing all the prefixes
a soulmate-data:[type] hash object containing the association between the id and the json object.
per prefix, a soulmate-index:[type]:[prefix] sorted set (with score and id)
So to delete an item, you need to:
Retrieve the json object from its id (you already did it) -> id 1547
HDEL soulmate-data:[type] 1547
Generate all the possible prefixes from "Foo Baar, Baaz"
For each prefix:
SREM soulmate-index:[type] [prefix]
ZREM soulmate-index:[type]:[prefix] 1547
It would probably be easier to call the remove method provided by the Soulmate::Loader class from a Ruby script, which automates all of this for you.
https://github.com/seatgeek/soulmate/blob/master/lib/soulmate/loader.rb
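The prefix-generation step can be sketched in Ruby as follows. This is a reimplementation under assumptions, not the library's code. As far as I can tell from the Soulmate source, the phrase is lowercased, stripped of everything except letters, digits and spaces, and each word contributes prefixes from two characters up to the full word. Stop-word filtering is left out here, and the type "venue" is a hypothetical example. Check the linked loader before relying on it:

```ruby
MIN_PREFIX_LENGTH = 2 # assumed minimum prefix length

# Lowercase the phrase and drop anything that is not a letter, digit or space.
def normalize(phrase)
  phrase.downcase.gsub(/[^a-z0-9 ]/, "").strip
end

# All unique prefixes of every word in the phrase, in order of appearance.
def prefixes_for_phrase(phrase)
  normalize(phrase).split(" ").flat_map do |word|
    (MIN_PREFIX_LENGTH..word.length).map { |len| word[0, len] }
  end.uniq
end

# The Redis commands needed to remove one item. They are returned as
# strings so they can be reviewed; nothing is executed here.
def removal_commands(type, id, term)
  commands = ["HDEL soulmate-data:#{type} #{id}"]
  prefixes_for_phrase(term).each do |prefix|
    commands << "SREM soulmate-index:#{type} #{prefix}"
    commands << "ZREM soulmate-index:#{type}:#{prefix} #{id}"
  end
  commands
end

puts removal_commands("venue", 1547, "Foo Baar, Baaz")
```

Note that removing a prefix from the index set affects every item sharing that prefix, which is one more reason to prefer the library's own remove method.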

Find the list of indexed keys in mongomapper

I am working on a Rails app with MongoDB and MongoMapper. I would like to index a few keys in the database. This is my first project with Mongo.
I can specify the keys I want to index like this:
User.ensure_index(:email)
as described here
My question is: do I need to call this manually (maybe by wrapping it in a method) for the indexing to actually happen?
And how can I find the list of keys that have indices?
Here are the answers to my questions.
Do I need to call this manually (maybe by wrapping it in a method) for the indexing to actually happen?
Yes, we have to call the ensure_index method on the model manually. We can wrap that in a method and call it from the console or even from a rake task.
def self.create_index
  self.ensure_index(:email)
  self.ensure_index(:first_name)
  self.ensure_index(:last_name)
  true
end
Then, from the console:
User.create_index
You can check which keys are indexed using Mongo's getIndexes() method, like this:
mongo #=> enter the mongo console
show dbs #=> see the list of available dbs
use my_database #=> switch to your database
db.table_name.getIndexes() #=> replace table_name with yours
And that's it: you can see the list of indices on your table.
Thanks!

Resources