I'm using Searchkick, which is great. I'm migrating from a really bad search implementation, and the product team didn't trust the new search much, so before migrating to Searchkick and overhauling our search they made me add hardcoded query results: for a given search input, they can say which product should come up first. Right now I take the hardcoded results that match the requested query from the DB and add them at the top, ignoring the exact wanted position (if a result is wanted at position 48 and there are 4 hardcoded results, it will be the 4th). If possible, it would be nice to put them in the middle.
What is the cleanest way to do this with Searchkick, so that the querying happens inside Elasticsearch (for example, by indexing the hardcoded results in the product)?
I have 2 models, Product and QueryResult. A QueryResult contains a product, a query string and a wanted_rank.
In my Product model I have a method to get search results that looks something like this:
def get_search_results(query_string)
# get search results from elastic using searchkick
search_results = Product.search(query_string)
# get hardcoded results matching this query
hardcoded_results = QueryResult.where(query: query_string).order(:wanted_rank).map(&:product)
# remove hardcoded results from search_results
search_results = search_results.to_a - hardcoded_results
# return results where hardcoded results are first
hardcoded_results + search_results
end
In the end I want all the search logic to happen inside Elasticsearch, including inserting the hardcoded search results.
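Until the logic lives inside Elasticsearch, the "put them in the middle" behaviour can be sketched in plain Ruby. This is only an illustration: merge_pinned is a hypothetical helper, organic stands in for the ordered Searchkick results, and pinned is assumed to map a 1-based wanted_rank to a product.

```ruby
# Merge pinned items into organic results at their wanted (1-based) ranks.
# Both argument names are illustrative, not part of Searchkick.
def merge_pinned(organic, pinned)
  # Drop organic hits that are also pinned, so nothing appears twice.
  merged = organic - pinned.values
  # Insert the lowest ranks first so later inserts do not shift earlier ones.
  pinned.sort.each do |rank, item|
    index = [rank - 1, merged.length].min # a rank past the end is appended
    merged.insert(index, item)
  end
  merged
end

organic = [:a, :b, :c, :d, :e]
pinned  = { 3 => :x, 1 => :d }
p merge_pinned(organic, pinned)
```

With this input, :d ends up first and :x third, while a wanted rank of 48 on a short list simply lands at the end.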
So after some very helpful comments and more searching, I found a solution.
My first mistake was trying to solve this with boosts instead of order. To use order, we index all the QueryResults of a specific product under a field called query_results.
For a product x with these query results:
{
query: 'foo',
wanted_rank: 1,
},
{
query: 'bar',
wanted_rank: 2,
}
I will index:
{
name: 'x',
query_results: {
'foo': 1,
'bar': 2
}
}
Then, when searching with a variable named query, I do the following:
Product.search(
query,
order: [
{ "query_results.#{query}": { unmapped_type: :long } },
{ _score: :desc }
]
)
Two important things to notice:
Use unmapped_type. It tells Elasticsearch which type to assume when the sort field has no mapping. Most queries have no hardcoded result, so "query_results.#{query}" is never indexed for them and has no mapping; unmapped_type tells Elasticsearch to treat the missing field as a long instead of failing.
Both when searching and when saving to the DB, I downcase and strip the query so the two match.
I also index the queries under another field and search over it with a low weight, to make sure the product comes up for that query at all.
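The indexing and ordering steps described above can be sketched as plain Ruby helpers. The helper names are hypothetical, rows stands in for a product's QueryResult records, and the sort key is written as a string rather than a symbol, which should serialize to the same JSON.

```ruby
# Normalize a query the same way at index time and at search time.
def normalize_query(query)
  query.to_s.strip.downcase
end

# Build the `query_results` sub-document for one product.
# `rows` is assumed to be a list of hashes with :query and :wanted_rank.
def query_results_field(rows)
  rows.each_with_object({}) do |row, field|
    field[normalize_query(row[:query])] = row[:wanted_rank]
  end
end

# Build the `order:` option: pinned rank first, relevance score second.
def pinned_order(query)
  [
    { "query_results.#{normalize_query(query)}" => { unmapped_type: :long } },
    { _score: :desc }
  ]
end

rows = [{ query: " Foo ", wanted_rank: 1 }, { query: "bar", wanted_rank: 2 }]
p query_results_field(rows)
p pinned_order("FOO")
```

The first hash would go into the product's indexed data, and the second value would be passed as order: to Product.search.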
We have a Ruby on Rails backend using Searchkick to interact with Elasticsearch.
Here is what our query looks like:
results = Runnable.search(search_term,
fields: ["application_name^10", "name^5"],
match: :word_middle,
order: {application_name: :asc, name: :asc},
operator: "or",
misspellings: false)
The problem is that when users search for a keyword, for example "Word1 word2", the results that contain only "word1" will be ranked better than the results that have both "word1" and "word2". Since we are also using pagination, users are not able to find what they are looking for on page 1, which is not what we desire.
I'm wondering if there is a way to rank the results by the number of words matched in a search term (i.e. 3 matches is better than 1 match) in ElasticSearch? If so, how to implement that with Searchkick?
I'm working on some search queries using Searchkick and having trouble figuring out how to set up the search to get the results I want. Users can save recipes, and recipes have tags associated with them. I want to get all the tags from a user's saved recipes, search for recipes with the same tags, and weight the tags based on how many times they appear.
So to get the tags I can do this:
tags = Tagging.where(taggable_id: current_user.saved_recipes.pluck(:recipe_id)).group(:tag_id).count
This returns something like this:
{23=>1, 56=>1, 27=>2, 30=>1, 28=>1, 36=>1, 39=>1, 16=>1}
I'm just not sure how to pass that hash so that the counts act as weights; in this case tag 27 would have a higher weight than the rest. The field I'm searching against in my Elasticsearch index is :tags.
Recipe.search "*", where: [tags: tag]
This is what I was thinking, but the hash of tags includes those counts and I'm not sure how to make them boost the results. I hope this is clear.
You could try using the boost_where option, which allows you to boost results by a different factor for different values.
Specifically, in your case, you could try:
tags = Tagging.where(taggable_id: current_user.saved_recipes.pluck(:recipe_id)).group(:tag_id).count
# Boost any tags with count greater than 1
boost_values = tags.select { |k, v| v > 1 }.map { |k,v| {value: k, factor: v * 10} }
Recipe.search "*", where: {tags: tags.keys}, boost_where: {tags: boost_values}
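To see what the boost_values expression produces, here is the same transformation as a standalone sketch using the counts from the question. The method name and the multiplier argument are illustrative only.

```ruby
# Turn a {tag_id => count} hash into the list of {value:, factor:} hashes
# that is passed to boost_where. Tags seen only once get no boost.
def boost_values_for(tag_counts, multiplier: 10)
  tag_counts
    .select { |_tag_id, count| count > 1 }
    .map { |tag_id, count| { value: tag_id, factor: count * multiplier } }
end

tag_counts = { 23 => 1, 56 => 1, 27 => 2, 30 => 1, 28 => 1, 36 => 1, 39 => 1, 16 => 1 }
p boost_values_for(tag_counts)
```

For the sample data only tag 27 qualifies, with a factor of 20.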
In a Rails 4 app, one model has a column containing multiple ids as a string of comma-separated values.
"123,4568,12"
I have a "search" engine that I use to retrieve the records matching one or many values. Using PostgreSQL's full-text search, I can do something like this, which is very useful:
records = MyModel.where("my_models.col_name @@ ?", ["12","234"])
This returns all the records that have both 12 and 234 in the targeted column. The array comes from a form with a multiple select.
Now I'm trying to make a query that will find all the records that have either 12 or 234 in their string.
I was hoping to be able to do something like:
records = MyModel.where("my_models.col_name IN (?)", ["12","234"])
But it's not working.
Should I iterate through all the values in the array to build a query with multiple OR ? Is there something more appropriate to do this?
EDIT / TL;DR
@BoraMa's answer is a good way to achieve this.
To find all the records containing one or more of the ids referenced in the request, use:
records = MyModel.where("my_models.col_name @@ to_tsquery(?)", ["12","234"].join('|'))
You need to_tsquery(?) and a join with a single pipe | to get an OR-like query.
To find all the records containing all the ids in the query, use:
records = MyModel.where("my_models.col_name @@ ?", ["12","234"])
And of course replace ["12","234"] with something like params[:params_from_my_form]
Postgres documentation for full text search
If you already started using full-text search in Postgres in the first place, I'd try to leverage it again. I think you can use a full-text OR query, which can be constructed like this:
records = MyModel.where("my_models.col_name @@ to_tsquery(?)", ["12","234"].join(" | "));
This uses the | operator for ORing full-text queries in Postgres. I have not tested this, and you may need to use to_tsvector('my_models.col_name') for it to work.
See the documentation for more info.
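Since the ids come from a form, it may be worth building the to_tsquery argument defensively. The following is a minimal sketch, assuming the ids are always plain integers; or_tsquery is a hypothetical helper name.

```ruby
# Build the string handed to to_tsquery for an OR search.
# Only purely numeric values are kept, so tsquery operators such as & or !
# in user input cannot change the meaning of the query.
def or_tsquery(ids)
  Array(ids).map { |id| id.to_s.strip }.grep(/\A\d+\z/).uniq.join(" | ")
end

p or_tsquery(["12", "234"])
p or_tsquery(["12", "1 & 2", " 7 "])
```

If the filtered list comes back empty, it is probably best to skip the query rather than pass an empty string to to_tsquery.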
Suppose your ids are:
a = "1,2,3,4"
You can simply use:
ModelName.find(a.split(","))
This will give you all the records of that model whose id is present in a.
I just thought of a super simple solution: sort the ids in a before_save callback of MyModel, and then the query becomes easier:
class MyModel < ActiveRecord::Base
before_save :sort_ids_in_col_name, if: :col_name_changed?
private
def sort_ids_in_col_name
self.col_name = self.col_name.to_s.split(',').sort.join(',')
end
end
Then the query will be easy:
ids = ["12","234"]
records = MyModel.where(col_name: ids.sort.join(','))
Rails: 4.1.2
Database: PostgreSQL
For one of my queries, I am using methods from both the textacular gem and Active Record. How can I chain some of the following queries with an "OR" instead of an "AND":
people = People.where(status: status_approved).fuzzy_search(first_name: "Test").where("last_name LIKE ?", "Test")
I want to chain the last two scopes (fuzzy_search and the where after it) together with an "OR" instead of an "AND." So I want to retrieve all People who are approved AND (whose first name is similar to "Test" OR whose last name contains "Test"). I've been struggling with this for quite a while, so any help would be greatly appreciated!
I dug into fuzzy_search and saw that it is translated to something like:
SELECT "people".*, COALESCE(similarity("people"."first_name", 'Test'), 0) AS "rankxxx"
FROM "people"
WHERE (("people"."first_name" % 'Test'))
ORDER BY "rankxxx" DESC
This means that if you don't care about preserving the order, it just filters the results by WHERE (("people"."first_name" % 'Test'))
Knowing that, you can write a query with similar functionality:
People.where(status: status_approved)
.where('(first_name % :key) OR (last_name LIKE :key)', key: 'Test')
If you want the results ordered, please specify what order you would like after joining the 2 conditions.
After a few days, I came up with the solution! Here's what I did:
This is the query I wanted to chain together with an OR:
people = People.where(status: status_approved).fuzzy_search(first_name: "Test").where("last_name LIKE ?", "Test")
As Hoang Phan suggested, when you look in the console, this produces the following SQL:
SELECT "people".*, COALESCE(similarity("people"."first_name", 'test'), 0) AS "rank69146689305952314"
FROM "people"
WHERE "people"."status" = 1 AND (("people"."first_name" % 'Test')) AND (last_name LIKE 'Test') ORDER BY "rank69146689305952314" DESC
I then dug into the textacular gem and found out how the rank is generated. I found it in the textacular.rb file and then crafted the SQL query using it. I also replaced the "AND" that connected the last two conditions with an "OR":
# Generate a random number for the ordering
rank = rand(100000000000000000).to_s
# Create the SQL query
sql_query = "SELECT people.*, COALESCE(similarity(people.first_name, :query), 0)" +
" AS rank#{rank} FROM people" +
" WHERE (people.status = :status AND" +
" ((people.first_name % :query) OR (last_name LIKE :query_like)))" +
" ORDER BY rank#{rank} DESC"
I took out all of the quotation marks around table and field names in the SQL query because I got error messages when I kept them, even when I used single quotes.
Then, I used the find_by_sql method to retrieve the People object IDs in an array. The symbols (:status, :query, :query_like) are used to protect against SQL injections, so I set their values accordingly:
# Retrieve all the IDs of People who are approved and whose first name and last name match the search query.
# The IDs are sorted in order of most relevant to the search query.
people_ids = People.find_by_sql([sql_query, query: "Test", query_like: "%Test%", status: 1]).map(&:id)
I get the IDs and not the People objects in an array because find_by_sql returns an Array object and not a CollectionProxy object, as would normally be returned, so I cannot use ActiveRecord query methods such as where on this array. Using the IDs, we can execute another query to get a CollectionProxy object. However, there's one problem: If we were to simply run People.where(id: people_ids), the order of the IDs would not be preserved, so all the relevance ranking we did was for nothing.
Fortunately, there's a nice gem called order_as_specified that will allow us to retrieve all People objects in the specific order of the IDs. Although the gem would work, I didn't use it and instead wrote a short line of code to craft conditions that would preserve the order.
order_by = people_ids.map { |id| "people.id='#{id}' DESC" }.join(", ")
If our people_ids array is [1, 12, 3], it would create the following ORDER statement:
"people.id='1' DESC, people.id='12' DESC, people.id='3' DESC"
I learned from this comment that writing an ORDER statement in this way would preserve the order.
Now, all that's left is to retrieve the People objects from ActiveRecord, making sure to specify the order.
people = People.where(id: people_ids).order(order_by)
And that did it! I didn't worry about removing any duplicate IDs because ActiveRecord does that automatically when you run the where command.
I understand that this code is not very portable and would require some changes if any of the people table's columns are modified, but it works perfectly and seems to execute only one query according to the console.
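The ORDER clause works because each comparison such as people.id=1 evaluates to a boolean, and in PostgreSQL true sorts before false under DESC, so every id in turn is pulled to the front. A slightly more defensive sketch of the builder follows; order_clause_for is a hypothetical name, and coercing with Integer() is an added precaution that was not in the original code.

```ruby
# Build an ORDER BY fragment that keeps rows in the order of `ids`.
# Integer() raises ArgumentError on anything that is not a whole number,
# so no arbitrary text can end up interpolated into the SQL string.
def order_clause_for(ids, column: "people.id")
  ids.map { |id| "#{column}=#{Integer(id)} DESC" }.join(", ")
end

puts order_clause_for([1, 12, 3])
```

The result can be passed to order exactly like the order_by string above.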
I'm having trouble figuring out how to loop over the results of a ThinkingSphinx search that has been set to group_by. I currently have the following:
search = Event.search(
{
group_by: 'category_id',
group_function: :attr
}
)
search.each_with_groupby_and_count do |event, group, count|
puts [event, group, count].join(' - ')
end
This, however, only returns one record per category. It seems like the group and count values are correct, but I only get the first Event of each category, which I would have expected to be all the events in the group. Is it possible to get an array of Hashes or similar? Furthermore, if this is possible, would the per_page option be per group?
I would expect each_with_groupby_and_count to iterate over something like this:
[
{group: 1, hits: [Event1, Event2], count: 2},
{group: 2, hits: [Event3], count: 1}
]
I'm afraid Sphinx's grouping functionality doesn't behave in that manner - it only returns one document (in this situation, one event) per group value.
It may be more appropriate to just sort by category_id instead, and track when it changes as you iterate over it (or use Enumerable#group_by to group all events by category_id) - keep in mind that Sphinx paginates results, so you may want to increase the default page size (with :per_page) depending on how you're using these results.
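The Enumerable#group_by suggestion can be sketched as follows. SampleEvent is a stand-in Struct rather than the real Event model, and events stands in for one page of search results.

```ruby
# Stand-in for Event records returned by a search sorted by category_id.
SampleEvent = Struct.new(:id, :category_id)
events = [SampleEvent.new(1, 1), SampleEvent.new(2, 1), SampleEvent.new(3, 2)]

# Group the events on this page by category and build the wanted structure.
grouped = events.group_by(&:category_id).map do |category_id, hits|
  { group: category_id, hits: hits, count: hits.size }
end

grouped.each do |entry|
  puts "#{entry[:group]} - #{entry[:hits].map(&:id).join(',')} - #{entry[:count]}"
end
```

Note that the counts only cover the events on the current page, which is why a larger :per_page may be needed.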