Searchkick without stemming - ruby-on-rails

I am indexing some of my data with searchkick (https://github.com/ankane/searchkick) as an array and it works almost fine :)
def search_data
  {
    isbn: isbn,
    title: title,
    abstract_long: abstract_long,
    authors: authors.map(&:name)
  }
end
The field I'm interested in is authors.
What I'd like to accomplish is to search for "Marias" and find only the authors whose surname contains that exact string (such as Javier Marias), not every Maria/Mario/Marais that Searchkick returns, and to give those exact matches a much higher priority.
This is how I search right now:
Book.search(@search_key,
  fields: [:authors, :title, {isbn: :exact}],
  boost_by: {authors: 10, title: 5, isbn: 1})
Is this possible at all? TIA

Elasticsearch has a regular match query that deals with this case, but Searchkick does not expose it.
You could use a workaround instead: index the author ids and filter on them.
def search_data
  {
    isbn: isbn,
    title: title,
    abstract_long: abstract_long,
    author_ids: authors.map(&:id)
  }
end
For search:
author_ids = Author.where(surname: 'Marias').map(&:id)
Book.search(
  @search_key,
  where: {author_ids: {in: author_ids}},
  fields: [:title, {isbn: :exact}],
  boost_by: {title: 5, isbn: 1}
)

Related

Mongo return in one query

I have the following n-to-n relation (done with the Mongoid gem) between two collections, books and publishers:
{
name: "O'Reilly Media",
founded: 1980,
location: "CA",
books: [123456789, 234567890, ...]
}
{
_id: 123456789,
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English"
}
{
_id: 234567890,
title: "50 Tips and Tricks for MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English"
}
I need to return in one query the publishers, but separated in documents, like this:
{
name: "O'Reilly Media",
founded: 1980,
location: "CA",
book: 123456789 # or books:[123456789]
}
{
name: "O'Reilly Media",
founded: 1980,
location: "CA",
book: 234567890 # or books:[234567890]
}
I want to do this inside Mongo, in a query. At the moment I do it in the rabl file by modifying the collection, but that is not good for pagination or for reuse in other representations. So I want to do this transformation in Mongo, not in Ruby. Or maybe I should change the query to call for books instead of publishers.
This is the code in ruby:
# @publishers is a Mongoid::Criteria
@publishers = @publishers.collect do |s|
  s.books.count > 1 ? s.publisher_separate_by_books : s
end.flatten
class Publisher
  has_and_belongs_to_many :books, inverse_of: :books, dependent: :nullify
  def publisher_separate_by_books
    books.map { |i| Publisher.new(name: name, founded: founded, location: location, books: [i]) }
  end
end
How can I achieve this in a Mongo query?
There is no advantage to expanding the query result like that in the database server (any database server). If you wanted to perform additional operations per book in the query (which in the case of MongoDB would involve the aggregation pipeline, and for relational databases JOIN operations), then it would make sense. But simply expanding a field like that in the database is wasteful.
MongoDB does support this operation via $unwind (https://docs.mongodb.com/manual/reference/operator/aggregation/unwind/#pipe._S_unwind), but then you lose the ability to query the models using the DSL provided by Mongoid and have to construct aggregation pipelines instead.
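For reference, a rough sketch of what the $unwind route could look like. It assumes a Mongoid model named Publisher whose documents keep the book ids in a books array, as in the question. The unwind_books helper is hypothetical: it only imitates the two pipeline stages in plain Ruby so the example runs without a database.

```ruby
# Aggregation pipeline: one output document per element of the `books` array.
# Field names follow the documents shown in the question.
UNWIND_PIPELINE = [
  { '$unwind' => '$books' },
  { '$project' => { '_id' => 0, 'name' => 1, 'founded' => 1, 'location' => 1, 'book' => '$books' } }
].freeze

# With Mongoid this would be run roughly as (not executed here):
#   Publisher.collection.aggregate(UNWIND_PIPELINE).to_a

# Hypothetical plain-Ruby imitation of the two stages above, for illustration only.
def unwind_books(publishers)
  publishers.flat_map do |publisher|
    publisher.fetch(:books, []).map do |book_id|
      # Drop the array and emit a copy of the publisher carrying a single book id.
      publisher.reject { |key, _| key == :books }.merge(book: book_id)
    end
  end
end

publishers = [
  { name: "O'Reilly Media", founded: 1980, location: 'CA', books: [123456789, 234567890] }
]

unwind_books(publishers).each { |doc| p doc }
```

The result of aggregate is a list of plain documents, not Publisher instances, which is the loss of the Mongoid DSL mentioned above.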

How to search integer fields using searchkick?

How to make searchkick search integer fields?
Let's say I have a Book model with three properties: name:string, author:string and pages:integer.
I want to search by the pages field. Right now, a query like the one below works for the string fields (name and author), but it doesn't work for the pages field, which is an integer.
Book.search(q,
  misspellings: { below: 5 },
  fields: [:name, :author, :pages],
  order: { name: 'asc' },
  page: params[:page],
  per_page: 20)
I went to the console and ran Book.search(120, fields: [:pages]), and it returned an empty result even though there are records with 120 pages. Why is Searchkick not searching integer fields? I appreciate any help with this. Thanks!
I fixed it by indexing pages as a string in the Book model:
def search_data
  {
    name: name,
    author: author,
    pages: pages.to_s
  }
end
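An alternative worth considering is to keep pages numeric and use Searchkick's where option when the query is a whole number. The sketch below is only an illustration: book_search_args is a hypothetical helper, not part of Searchkick, and it builds the arguments without contacting Elasticsearch.

```ruby
# Hypothetical helper: decide whether a query should be a full-text search
# or a numeric filter on `pages`. Returns the arguments for Book.search.
def book_search_args(query)
  text = query.to_s.strip
  if text.match?(/\A\d+\z/)
    # All digits: match every document, then filter on the integer field.
    ['*', { where: { pages: text.to_i } }]
  else
    # Anything else: ordinary full-text search on the string fields.
    [text, { fields: [:name, :author] }]
  end
end

term, options = book_search_args(120)
p term, options

# In the app this would be used as (not executed here):
#   Book.search(term, **options)
```

With this approach pages stays an integer in the index, so range filters and numeric ordering on it remain possible.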

elasticsearch sort by score - same field searchable but score different?

I'm trying to sort my ES results by 2 fields: searchable and year.
The mapping in my Rails app:
# mapping
def as_indexed_json(options={})
  as_json(only: [:id, :searchable, :year])
end
settings index: { number_of_shards: 5, number_of_replicas: 1 } do
  mapping do
    indexes :id, index: :not_analyzed
    indexes :searchable
    indexes :year
  end
end
The query:
@records = Wine.search(
  query: { match: { searchable: { query: params[:search], fuzziness: 2, prefix_length: 1 } } },
  sort: { _score: { order: :desc }, year: { order: :desc } },
  size: 100
)
The interesting thing in the query:
sort: {_score: {order: :desc}, year: {order: :desc}}
I think the query is working well with the 2 sort params.
My problem is the score is not the same for 2 documents with the same name (searchable field).
For example, I'm searching for "winery":
You can see a very different score, even though the searchable field is the same. I think the issue is due to the ID field (it is in fact a UUID). It looks like this ID field influences the score.
But in my schema mapping I declared that the ID should not be analyzed, and in my ES query I ask to search ONLY in the "searchable" field, not in the ID.
What did I miss to get the same score for identical fields? (As it stands, sorting by year after score is not useful because the scores differ for identical fields.)
Scores are different because they are calculated independently for each shard, using that shard's own term statistics.
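To make the effect concrete, here is a small arithmetic sketch. It assumes the classic Lucene TF/IDF similarity used by older Elasticsearch versions, where idf = 1 + ln(numDocs / (docFreq + 1)). The per-shard numbers are made up for illustration. Two documents with identical text get different idf values, and so different scores, when they live on shards with different statistics.

```ruby
# Classic Lucene inverse document frequency: 1 + ln(num_docs / (doc_freq + 1)).
def idf(num_docs, doc_freq)
  1 + Math.log(num_docs.to_f / (doc_freq + 1))
end

# Made-up per-shard statistics for the term "winery".
shards = {
  shard_0: { num_docs: 1000, doc_freq: 10 },
  shard_1: { num_docs: 1000, doc_freq: 200 }
}

# Each shard scores with its own counts, so the same term weighs differently.
shards.each do |name, stats|
  puts format('%s idf=%.3f', name, idf(stats[:num_docs], stats[:doc_freq]))
end
```

Elasticsearch has a dfs_query_then_fetch search type that collects term statistics across shards before scoring. Whether its extra round trip is worth it depends on the index, and with enough evenly distributed documents the per-shard differences tend to shrink on their own.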

Rails Sort/Group by dynamic field

I'm working on a rails app that is passing serialized values in a JSON dump to the client. I need the requirements to be sorted in a specific order, with objects that contain equal "deadline_dates" to be grouped together. That field is dynamic and many of my objects don't contain that field at all.
Here is what my model looks like:
#------ app/models/program_requirement.rb ------#
class ProgramRequirement < ActiveRecord::Base
  include HasFields
  belongs_to :program
  has_many :responses
  has_many :applications, through: :responses
  store_accessor :fields, :deadline_date
end
Here is the method I'm using to pass the objects into my serializer:
#------ app/models/program.rb ------#
def sorted_program_requirements
  program_requirements.sort_by { |a| [a.deadline, a.position] }.map do |requirement|
    ProgramRequirementSerializer.new(requirement, root: false)
  end
end
The sorting by the "deadline" and "position" values works fine, but for some reason I'm unable to sort when I add the dynamic field to my sort_by block, such as:
a.fields[:deadline_date]
I've attempted using group_by, but that also doesn't seem to do what I'd expect. I just want to group requirements by equal deadline_dates if they exist, and then sort the rest by the other two static fields.
Any help is appreciated!
UPDATE
Output from: program_requirements.each { |pr| p pr.fields[:deadline_date] }
[#<ProgramRequirement:0x007fb198a12d58
id: 126,
program_id: 1159,
title: "Req",
deadline: "manual",
position: 1,
fields: {"deadline_date"=>"2015-09-16"}
#<ProgramRequirement:0x007fb198a12bf0
id: 127,
program_id: 1159,
title: "Req",
deadline: "initial",
position: 2,
fields: {}
#<ProgramRequirement:0x007fb198a12a60
id: 132,
program_id: 1159,
title: "Req",
deadline: "precampaign",
position: 3,
fields: {}
#<ProgramRequirement:0x007fb198a128d0
id: 133,
program_id: 1159,
title: "t444",
description: nil,
deadline: "manual",
position: 4,
fields: {"deadline_date"=>"2015-09-16"}
]
Ok, so if you just need to sort by the deadline_date, well, sort by it:
program_requirements.sort_by { |a| a.deadline_date.to_s }
Since not every program_requirement contains this field, you'll need to convert the nils to strings so they can be compared to each other. This way all program_requirements without a deadline_date are grouped at the top, and the rest are sorted below them.
Sorting in Ruby is not a good idea, though. Doing it in SQL is much faster:
program_requirements.order(deadline_date: :asc) # or :desc
Note that order(deadline_date: ...) needs deadline_date to be a real column; with store_accessor the value lives inside the fields column, so the ORDER BY would have to reference that key instead.
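For the in-Ruby variant, a minimal sketch of sorting and grouping with the nil-to-string trick. The Requirement struct is a stand-in for the ActiveRecord model (an assumption so the example runs on its own), filled with the four rows from the question's output.

```ruby
# Stand-in for the ActiveRecord model, using the rows from the question.
Requirement = Struct.new(:id, :deadline, :position, :deadline_date)

requirements = [
  Requirement.new(126, 'manual', 1, '2015-09-16'),
  Requirement.new(127, 'initial', 2, nil),
  Requirement.new(132, 'precampaign', 3, nil),
  Requirement.new(133, 'manual', 4, '2015-09-16')
]

# nil.to_s is "", so records without a date compare cleanly and sort first;
# deadline and position break ties inside each date group.
sorted = requirements.sort_by { |r| [r.deadline_date.to_s, r.deadline, r.position] }

# group_by keeps first-seen key order, so the groups follow the sorted order.
grouped = sorted.group_by(&:deadline_date)

p sorted.map(&:id)
p grouped.transform_values { |rs| rs.map(&:id) }
```

Putting deadline_date first in the sort key keeps records with equal dates adjacent, which is what makes the later group_by (or a simple iteration) see them together.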

Searchkick boost exact matches

I have 15,000 courses and I would like to boost the title of each class so exact matches of a class are boosted above everything else.
When I do Course.search_kick('theory of interest', 1)
the correct result is returned, with the course 'theory of interest' first.
However, when I do Course.search_kick('theory of interest 3618', 1)
(3618 being the catalog_number), no results are returned. I expected the theory of interest course to be returned, and returned first. It seems the search requires the complete string 'theory of interest 3618' to be included in the title of the course.
I understand 'and' is the default operator. Is it a requirement that I use the 'or' operator? I am hesitant to use 'or' because of the unexpected results it can bring.
Thanks, I really enjoy using the gem.
search method:
def self.search_kick(query, page)
  search(query,
    fields: ["title^10", "description", "crse_id", "subject", "catalog_number"],
    facets: [:subject],
    misspellings: false,
    page: page,
    per_page: 20
  )
end
def search_data
  {
    title: title,
    description: description,
    crse_id: crse_id,
    subject: subject,
    catalog_number: catalog_nbr
  }
end
Why not filter catalog_number in the where clause?
search(query,
  fields: ["title^10", "description", "crse_id", "subject"],
  facets: [:subject],
  misspellings: false,
  where: {catalog_number: 3618},
  page: page,
  per_page: 20
)
In most cases, the where clause is built conditionally:
conditions = {}
if params[:catalog_number].present?
  conditions[:catalog_number] = params[:catalog_number].to_i
end
search(query,
  fields: ["title^10", "description", "crse_id", "subject"],
  facets: [:subject],
  misspellings: false,
  where: conditions,
  page: page,
  per_page: 20
)
You can add as many filters as you need to the where clause, much like ActiveRecord's where.
docs ref: https://github.com/ankane/searchkick#queries
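If the catalog number arrives inside a single search box rather than as its own parameter, one option is to peel it off the query before searching. The sketch below is only an illustration: split_catalog_number is a hypothetical helper, and it assumes the catalog number is a trailing run of digits separated from the rest of the query.

```ruby
# Hypothetical helper: split "theory of interest 3618" into the text part
# and a where-hash, assuming the catalog number is a trailing run of digits.
def split_catalog_number(raw_query)
  conditions = {}
  text = raw_query.to_s.strip
  if (match = text.match(/\A(.*?)\s*\b(\d+)\z/))
    text = match[1]
    conditions[:catalog_number] = match[2].to_i
  end
  # With nothing left to search for, fall back to matching every document.
  text = '*' if text.empty?
  [text, conditions]
end

text, conditions = split_catalog_number('theory of interest 3618')
p text, conditions

# In the model this could feed the search call (not executed here):
#   search(text, fields: ["title^10", "description"], where: conditions)
```

Queries without a trailing number pass through unchanged with an empty conditions hash, so the same search call covers both cases.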
