Solr (Sunspot) query time boost on non keyword searches - ruby-on-rails

Given that I want to find 20 relevant results, how would I go about boosting the first criterion inside any_of (with(:id).any_of(co_author_ids)), so that if 20 results match that criterion they are returned, rather than results matched on the second criterion?
@solr_search = User.solr_search do
  paginate(:per_page => 20)
  with(:has_email, true)
  any_of do
    with(:id).any_of(co_author_ids)
    with(:hospitals_id).any_of(hopital_ids)
  end
end
Initially I didn't think boosting was necessary, as I figured any_of would have a cascading effect, but it does not appear to work like that. I know how to do query-time boosting on keyword and fulltext searches, but I have been unable to get it working with with() methods.

Since co_author_ids is a multivalued key, I have enough reason to believe that there is no way to achieve that. With single-valued keys, though, it is possible to achieve this cascading effect by using a Solr sort on a function query (http://wiki.apache.org/solr/FunctionQuery#Sort_By_Function) along with adjust_solr_params (http://sunspot.github.io/docs/Sunspot/DSL/Adjustable.html).
Example: suppose you have a query like this:
@solr_search = User.solr_search do
  paginate(:per_page => 20)
  with(:has_email, true)
  any_of do
    with(:id, author_id) # assuming author id is a Solr index
    with(:hospitals_id).any_of(hopital_ids)
  end
end
Now suppose you want a cascading effect here, giving preference to exact matches on author_id. You can do it this way:
@solr_search = User.solr_search do
  paginate(:per_page => 20)
  with(:has_email, true)
  any_of do
    with(:id, author_id) # assuming author id is a Solr index
    with(:hospitals_id).any_of(hopital_ids)
  end
  adjust_solr_params do |p|
    p["sort"] = "if(author_id_i = #{id},1,0) desc" # note: author_id_i is the Solr equivalent of author_id
  end
end
This sorts on the value of if(author_id_i = #{id},1,0) and so puts all the records whose author_id is the same as the user's on top.
I was somehow having problems using the if function, so I used this instead (practically, both do the same thing):
@solr_search = User.solr_search do
  paginate(:per_page => 20)
  with(:has_email, true)
  any_of do
    with(:id, author_id) # assuming author id is a Solr index
    with(:hospitals_id).any_of(hopital_ids)
  end
  adjust_solr_params do |p|
    p[:sort] = "min(abs(sub(author_id_i,#{id})),1) asc"
  end
end
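To see why the min(abs(sub(...)),1) expression gives the cascading order, here is a minimal plain-Ruby sketch of the same arithmetic. It does not talk to Solr; the sort_key helper and the sample documents are made up for illustration only.

```ruby
# Plain-Ruby stand-in for the Solr function min(abs(sub(author_id_i, id)), 1).
# Returns 0 when the document's author id equals the target id, otherwise 1.
def sort_key(author_id, target_id)
  [(author_id - target_id).abs, 1].min
end

# Hypothetical documents; only author_id matters for the sort.
docs = [
  { name: "a", author_id: 7 },
  { name: "b", author_id: 42 },
  { name: "c", author_id: 3 },
  { name: "d", author_id: 42 }
]

target_id = 42

# An ascending sort on the key puts exact matches first. The index is used as
# a tie-breaker so the original order is kept within each group.
sorted = docs.each_with_index
             .sort_by { |doc, i| [sort_key(doc[:author_id], target_id), i] }
             .map(&:first)

puts sorted.map { |d| d[:name] }.join(", ")
```

Every exact match gets key 0 and everything else gets key 1, so "asc" plays the same role as "desc" does in the if(...) version.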
I also stumbled upon http://wiki.apache.org/solr/SpatialSearch while looking for a solution to this. If you want to sort by distance, you can do something like:
@solr_search = User.solr_search do
  paginate(:per_page => 20)
  with(:has_email, true)
  any_of do
    with(:id, author_id) # assuming author id is a Solr index
    with(:hospitals_id).any_of(hopital_ids)
  end
  adjust_solr_params do |p|
    p[:pt] = "#{latitude_of_your_interest},#{longitude_of_your_interest}"
    p[:sfield] = :author_location # your Solr field that stores the location of the author
    p[:sort] = "geodist() asc"
  end
end
Overall, I would say you can do a lot of cool things with p["sort"], but in this particular case it can't be done (IMHO) because the field is multivalued.
For example, see:
Using multivalued field in map function
Solr function query that operates on count of multivalued field
I wish they would just provide an include function for multivalued fields, so that we could write
p["sort"] = "if(include(co_authors_ids,#{id}), 1, 0) desc"
but as of now it's not possible (again, IMHO).

Related

tire elasticsearch kaminari pagination with overriden search in active record model

How to paginate results optimally when using a custom search method in model. I've also pushed ordering of the results to elasticsearch and after fetching from db the results are again sorted based on elasticsearch order.
My model search method looks like this:
def self.search query
  model_objs = Model.tire.search do
    query do
      boolean do
        should { string "field:#{query}", boost: 10 }
        should { string "#{query}" }
        # other boolean queries
      end
    end
    sort do
      by "fieldname"
    end
  end
  ids = model_objs.results.map { |x| x.id.to_i }
  model_objs = Model.find(ids)
  ids.collect { |id| model_objs.detect { |x| x.id == id } }
end
And in my controller I just have an action to get the results.
def search
  search_term = params[:search_term].strip
  @model_objs = Model.search search_term
end
I have two goals here: first, I want to minimize the number of calls going to elasticsearch or to my database; second, I want to paginate the results.
The default pagination mentioned in the tire documentation does not work because I've overridden my search method.
@articles = Article.search params[:q], :page => (params[:page] || 1)
Also, getting paginated results from elasticsearch using from and size would mean calling elasticsearch over and over to fetch results, so I don't want to do something like this.
def self.search query, page_num
  model_objs = Model.tire.search do
    query do
      boolean do
        should { string "field:#{query}", boost: 10 }
        should { string "#{query}" }
        # other boolean queries
      end
    end
    sort do
      by "fieldname"
    end
    size 10
    from((page_num - 1) * 10)
  end
  ids = model_objs.results.map { |x| x.id.to_i }
  model_objs = Model.find(ids)
  ids.collect { |id| model_objs.detect { |x| x.id == id } }
end
How can I achieve this with limited network calls?
You can pass the page parameter to elasticsearch:
model_objs = Model.tire.search({page: page_num}) do
  query do
    boolean do
      should { string "field:#{query}", boost: 10 }
      should { string "#{query}" }
      # other boolean queries
    end
  end
end
There is no need to add 'from'.
Disclaimer: I work with the OP, and this is just to post the solution we took.
Instead of fetching some of the attributes from ES and the rest from the database, the elasticsearch index can be made to hold all the fields that are required for displaying the search results.
This way, a single query to elasticsearch is enough to display the search results page, without making any extra network calls.
The detailed view of an entity can then fetch everything it needs, but that is only one fetch per entity, which works for us.
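If the records do still have to be loaded from the database, the reordering step from the question (one detect call per id) can be replaced by a hash lookup. The following is a minimal sketch in plain Ruby; Record is a hypothetical stand-in for the ActiveRecord model, and order_by_ids is an illustrative helper, not part of tire.

```ruby
# Stand-in for an ActiveRecord model; only the id is needed here.
Record = Struct.new(:id, :title)

# Reorders records to follow ranked_ids, using a hash lookup instead of a
# detect call per id. Ids with no matching record are skipped.
def order_by_ids(records, ranked_ids)
  by_id = records.each_with_object({}) { |record, h| h[record.id] = record }
  ranked_ids.map { |id| by_id[id] }.compact
end

ranked_ids = [3, 1, 2, 9] # order as returned by the search engine
records = [Record.new(1, "one"), Record.new(2, "two"), Record.new(3, "three")]

ordered = order_by_ids(records, ranked_ids)
puts ordered.map(&:title).join(", ")
```

The compact call drops ids that are in the index but no longer in the database, so a stale index entry does not leave a nil in the result list.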

rails tire elasticsearch weird error

I have indexed a Car model with one car record mercedes benz in the database. If I search for the word benz I get an error:
ActiveRecord::RecordNotFound in CarsController#index
Couldn't find all Cars with IDs (1, 3) (found 1 results, but was looking for 2)
If I search for hello I get:
Couldn't find Car with id=2
Other random search terms work returning accurate results.
So it's basically random errors generated by random search terms. What could be the cause of this?
Controller:
def index
  if params[:query].present?
    @cars = Car.search(params)
  else
    @cars = Car.paginate(:page => params[:page], :per_page => 10)
  end
end
Model:
def self.search(params)
  tire.search(load: true, page: params[:page], per_page: 10) do |s|
    s.query { string params[:query] } if params[:query].present?
  end
end
This happens because you are using the load: true option to load the search results from the database. The ActiveRecord record seems to be missing from the DB, but the elasticsearch index still contains the corresponding document.
Reindexing is not always the solution, IMHO.
The best solution is to delete the document from the index when the record is deleted in the DB. You can use the after_destroy callback for this.
Tire's remove API is used to remove a single document from the index.
Tire.index('my-index-name').remove('my-document-type', 'my-document-id')
Reference: https://github.com/karmi/tire/issues/43

How do I combine ActiveRecord results from multiple has_many :through queries?

Basically, I have an app with a tagging system and when someone searches for tag 'badger', I want it to return records tagged "badger", "Badger" and "Badgers".
With a single tag I can do this to get the records:
@notes = Tag.find_by_name(params[:tag_name]).notes.order("created_at DESC")
and it works fine. However if I get multiple tags (this is just for upper and lower case - I haven't figured out the 's' bit either yet):
Tag.find(:all, :conditions => [ "lower(name) = ?", 'badger'])
I can't use .notes.order("created_at DESC") because there are multiple results.
So, the question is.... 1) Am I going about this the right way? 2) If so, how do I get all my records back in order?
Any help much appreciated!
One implementation would be to do:
@notes = []
Tag.find(:all, :conditions => [ "lower(name) = ?", 'badger']).each do |tag|
  @notes.concat(tag.notes) # concat keeps @notes a flat list of notes
end
@notes = @notes.sort_by { |note| note.created_at }.reverse # newest first
However, you should be aware that this is what is known as an N + 1 query: it makes one query in the outer section and then one query per result. This can be optimized by changing the first query to be:
Tag.find(:all, :conditions => [ "lower(name) = ?", 'badger'], :include => :notes).each do |tag|
If you are using Rails 3 or above, it can be rewritten slightly:
Tag.where("lower(name) = ?", "badger").includes(:notes).each do |tag|
Edited
First, get an array of all possible tag names, plural, singular, lower, and upper
tag_name = params[:tag_name].to_s.downcase
possible_tag_names = [tag_name, tag_name.pluralize, tag_name.singularize].uniq
# It's probably faster to search for both lower and capitalized tags than to use the db's `lower` function
possible_tag_names += possible_tag_names.map(&:capitalize)
Are you using a tagging library? I know that some provide a method for querying multiple tags. If you aren't using one of those, you'll need to do some manual SQL joins in your query (assuming you're using a relational db like MySQL, Postgres or SQLite). I'd be happy to assist with that, but I don't know your schema.

Rails 2.3.5 Problem Building Conditions Array dynamically when using in (?)

Rails 2.3.5
I've looked at a number of other questions relating to building conditions dynamically for an ActiveRecord find.
I'm aware there are some great gems out there like Searchlogic and that this is better in Rails 3. However, I'm using geokit for geospatial search and I'm trying to build just a standard conditions set that will allow me to combine a slew of different filters.
I have 12 different filters that I'm trying to combine dynamically for an advanced search. I need to be able to mix equality, greater than, less than, in (?) and IS NULL conditions.
Here's an example of what I'm trying to get working:
conditions = []
conditions << ["sites.site_type in (?)", params[:site_categories]] if params[:site_categories]
conditions << ["sites.operational_status = ?", 'operational'] if params[:oponly] == 1
condition_set = [conditions.map{|c| c[0] }.join(" AND "), *conditions.map{|c| c[1..-1] }.flatten]
@sites = Site.find :all,
  :origin => [lat,lng],
  :units => distance_unit,
  :limit => limit,
  :within => range,
  :include => [:chargers, :site_reports, :networks],
  :conditions => condition_set,
  :order => 'distance asc'
I seem to be able to get this working fine when there are only single variables for the conditions expression but when I have something that is a (?) and has an array of values I'm getting an error for the wrong number of bind conditions. The way I'm joining and flattening the conditions (based on the answer from Combine arrays of conditions in Rails) seems not to handle an array properly and I don't understand the flattening logic enough to track down the issue.
So let's say I have 3 values in params[:site_categories]; the above code leaves me with the following:
Conditions is
[["sites.operational_status = ?", "operational"], ["sites.site_type in (?)", ["shopping", "food", "lodging"]]]
The flattened attempt is:
["sites.operational_status = ? AND sites.site_type in (?)", ["operational"], [["shopping", "food", "lodging"]]]
Which gives me:
wrong number of bind variables (4 for 2)
I'm going to step back and work on converting all of this to named scopes but I'd really like to understand how to get this working this way.
Rails 4
users = User.all
users = users.where(id: params[:id]) if params[:id].present?
users = users.where(state: states) if states.present?
users.each do |u|
  puts u.name
end
Old answer
Monkey patch the Array class. Create a file called monkey_patch.rb in config/initializers directory.
class Array
  def where(*args)
    sql = args.first
    unless (sql.is_a?(String) and sql.present?)
      return self
    end
    self[0] = self.first.present? ? " #{self.first} AND #{sql} " : sql
    self.concat(args[1..-1])
  end
end
Now you can do this:
cond = []
cond.where("id = ?", params[:id]) if params[:id].present?
cond.where("state IN (?)", states) unless states.empty?
User.all(:conditions => cond)
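As for the approach in the question itself, the bind-count error appears to come from calling flatten with no argument, which also unpacks the array meant for in (?). A minimal sketch in plain Ruby, using the values from the question and assuming nothing beyond Array#flatten, shows the difference when only one level is flattened:

```ruby
conditions = [
  ["sites.operational_status = ?", "operational"],
  ["sites.site_type in (?)", ["shopping", "food", "lodging"]]
]

sql = conditions.map { |c| c[0] }.join(" AND ")

# Full flatten: the IN list is spread into three separate bind values,
# giving four values for two placeholders.
too_flat = conditions.map { |c| c[1..-1] }.flatten

# One-level flatten: each condition's bind values stay as they were given,
# so the IN list remains a single array.
binds = conditions.map { |c| c[1..-1] }.flatten(1)

condition_set = [sql, *binds]

puts too_flat.length
puts binds.length
p condition_set
```

With flatten(1) the condition set has one bind value per placeholder, and the array passed for in (?) is left for ActiveRecord to expand.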

Using rails gem geokit sort by distance and pagination?

I've come across a small issue in my app. I'm currently using geokit to find objects near a given location, and I use sort_by_distance_from on the found set.
See below:
@find = Item.find(:all, :origin =>[self.geocode.lat.to_f,self.geocode.lng.to_f], :within=>50, :include=>[:programs], :conditions=>["programs.name = ?", self.name])
@find.sort_by_distance_from([self.geocode.lat.to_f,self.geocode.lng.to_f])
Is there any way with geokit to paginate from the DB when sorting by distance?
That is, without loading the full found set?
The distance column isn't working anymore:
"In the current version of geokit-rails, it is not possible to add a where clause using the distance column. I've tried many different ways to do this and didn't get it working."
It behaves the same way for where and order clauses.
One would expect to build a query like this :
scoped = Location.geo_scope(:origin => @somewhere)
scoped = scoped.where('distance <= 5')
results = scoped.all
This is not possible right now, it must be done in a single step like this:
scoped = Location.within(5, :origin => @somewhere)
results = scoped.all
github.com/geokit/geokit-rails
My approach to solving this would be to use the :offset and :limit parameters of find().
There is also a distance field for geokit models, so you can use :order => 'distance asc'.
E.g.:
page = params[:page].to_i # zero-based; 0 when params[:page] is missing
items_per_page = 20
offset = page * items_per_page
@find = Item.find(:all, :origin =>[self.geocode.lat.to_f,self.geocode.lng.to_f], :within=>50, :include=>[:programs], :conditions=>["programs.name = ?", self.name], :order => 'distance asc', :limit => items_per_page, :offset => offset)
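The offset arithmetic can be pulled into a small helper. This is only a sketch: page_window is a hypothetical name, and it assumes a 1-based page parameter (page 1 is the first page), which differs from the zero-based page used above.

```ruby
# Converts a 1-based page number (as usually sent in params[:page]) into
# :limit and :offset values. Missing or invalid pages fall back to page 1.
def page_window(page_param, per_page = 20)
  page = [page_param.to_i, 1].max
  { :limit => per_page, :offset => (page - 1) * per_page }
end

p page_window(nil)
p page_window("3")
p page_window("2", 50)
```

The resulting hash could be merged into the options passed to find, alongside :order => 'distance asc'.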
