Indexing fields + custom text with Thinking Sphinx - ruby-on-rails

I've got indexes on a few different models, and sometimes the user might search for a value which exists in multiple models. Now, if the user is really only interested in data from one of the models I'd like the user to be able to pre/postfix the query with something to limit the scope.
For instance, if I only want to find a match in my Municipality model, I've set up an index in that model so that the user now can query "xyz municipality" (in quotes):
define_index do
indexes :name, :sortable => true
indexes "name || ' municipality' name", :as => :extended_name, :type => :string
end
This works just fine. Now I also have a Person model, with a relation to Municipality. I'd like, when searching only on the Person model, to have the same functionality available, so that I can say Person.search("xyz municipality") and get all people connected to that municipality. This is my current definition in the Person model:
has_many :municipalities, :through => :people_municipalities
define_index do
indexes [lastname, firstname], :as => :name, :sortable => true
indexes municipalities.name, :as => :municipality_name, :sortable => true
end
But is there any way I can create an index on this model, referencing municipalities, like the one I have on the Municipality model itself?

If you look at the generated SQL in the sql_query setting of config/development.sphinx.conf for source person_core_0, you'll see how municipalities.name is being concatenated together (I'd post an example, but it depends on your database - MySQL and PostgreSQL handle this completely differently).
I would recommend duplicating the field, and insert something like this (SQL is pseudo-code):
indexes "GROUP_CONCAT(' municipality ' + municipalities.name)",
:as => :extended_municipality_names
Also: there's not much point adding :sortable => true to either this field or the original field from the association - are you going to sort by all of the municipality names concat'd together? I'm guessing not :)
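To make the intent of the duplicated field concrete, here is a plain-Ruby sketch of the text such a field would hold for one person. It does not call Thinking Sphinx at all; extended_municipality_names is a hypothetical helper, the names are made up, and it assumes the suffix form ("xyz municipality") used in the Municipality index from the question.

```ruby
# Hypothetical helper: emulates the text a concatenated SQL field would
# produce for one person, with " municipality" appended to every name.
def extended_municipality_names(names)
  names.map { |name| "#{name} municipality" }.join(" ")
end

field = extended_municipality_names(["Oslo", "Bergen"])
puts field

# A quoted phrase search can only match when the phrase appears
# verbatim in the field text, which is what this check imitates.
puts field.include?("Bergen municipality")
```

Whatever SQL you end up with for your database, the concatenated result should look like the string built here.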

Related

Rails 5, thinking sphinx, indexing and searching has many through relationships

I have a Rails app in which I want to use Thinking Sphinx for search. I have a has many through relationship between the following models: Product has many Types through ProductType.
# Product.rb
has_many :product_types
has_many :types, through: :product_types
# Type.rb
has_many :product_types
has_many :products, through: :product_types
# ProductType.rb
belongs_to :product
belongs_to :type
In my ProductsController index action I want to be able to filter which products are shown in the view based on given Type ids.
My relevant indexes currently look like this (note, I haven't used ThinkingSphinx in a long time):
# product_index.rb
ThinkingSphinx::Index.define :product, :with => :active_record do
indexes name, :sortable => true
indexes description
indexes brand.name, as: :brand, sortable: true
indexes product_types.type.id, as: :product_types
has created_at, updated_at
end
# type_index.rb
ThinkingSphinx::Index.define :type, :with => :active_record do
indexes name, :sortable => true
end
# product_type_index.rb
ThinkingSphinx::Index.define :product_type, :with => :active_record do
has product_id, type: :integer
has type_id, type: :integer
end
I currently pass an array of :product_types ids in a link_to, like this (let me know if there is a better way to do it):
= link_to "Web shop", products_path(product_types: Type.all.map(&:id), brand: Brand.all.map(&:id)), class: "nav-link"
In my ProductsController I try to filter the result based on the given Type ids like this:
product_types = params[:product_types]
@products = Product.search with_all: { product_types: product_types.collect(&:to_i) }
When I run rake ts:rebuild I get the following error:
indexing index 'product_type_core'...
ERROR: index 'product_type_core': No fields in schema - will not index
And when I try to view the page in the browser I get the following error:
index product_core: no such filter attribute 'product_types'
- SELECT * FROM `product_core` WHERE `sphinx_deleted` = 0 AND
`product_types` = 1 AND `product_types` = 2 AND `product_types` = 3
LIMIT 0, 20; SHOW META
Any ideas in how to properly set up my indexes (and query) for this case?
There are a few issues to note here:
Firstly, the error you're seeing during rake ts:rebuild is pointing out that you've not set any fields in your ProductType Sphinx index - no indexes calls for text data you wish to search on. Are you actually searching on ProductType at all? If so, what text are you expecting people to match by?
If you're not searching on that model, there's no need to have a Sphinx index for it.
Secondly, the issue with your search - you're filtering on product_types with integers, which makes sense. However, in your index, you've defined product_types as a field (using indexes) rather than an attribute (using has). Given it's integer values and you're likely not expecting someone to type in an ID into a search input, you'll almost certainly want this to be an attribute instead - so change the indexes to a has for that line in your Product index definition, and run ts:rebuild.
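The query in the error message also shows how :with_all behaves: every listed id must be present on the product. The plain-Ruby sketch below (no Sphinx involved; matches_with? and matches_with_all? are hypothetical helpers, and the ids are made up) contrasts that with :with, which accepts a product carrying any of the listed ids. It also shows one way to turn string params into integers.

```ruby
# Hypothetical helpers emulating attribute filter semantics on a
# multi-value attribute; the real filtering happens inside Sphinx.
def matches_with?(attribute_values, wanted)
  (attribute_values & wanted).any?      # at least one id in common
end

def matches_with_all?(attribute_values, wanted)
  (wanted - attribute_values).empty?    # every wanted id is present
end

# Request params arrive as strings, so convert them before filtering.
raw_params = ["1", "2", "3"]
wanted     = Array(raw_params).map(&:to_i)

product_type_ids = [1, 3]               # ids attached to one product
puts matches_with?(product_type_ids, wanted)
puts matches_with_all?(product_type_ids, wanted)
```

If the link passes every Type id, as in the question, :with_all would only return products that have all of those types, so :with may be closer to what is wanted.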

Thinking sphinx results based on model preference

I have two models: 'A' and 'B', and want to search objects from both of them using Thinking sphinx, but I want all results of model 'A' first and then 'B'. How can I do that?
I pass the following options to sphinx query
{:match_mode=>:extended, :sort_mode=>:extended, :star=>true, :order=>"@relevance DESC", :ignore_errors=>true, :populate=>true, :per_page=>10, :retry_stale=>true, :classes => [A,B]}
And then get search results using:
ThinkingSphinx.search "*xy*", options
But it gives results in mixed ordering, whereas I need all 'A' objects first. How can I do that?
The easiest way is to add an attribute to both models' indices:
has "1", :as => :sort_order, :type => :integer
The number within the string should be different per model. And then your :order argument becomes:
:order => 'sort_order ASC, @relevance DESC'
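As a rough illustration of what that ordering does, here is a plain-Ruby sketch that sorts in-memory records the same way. Result and order_results are hypothetical names and the relevance scores are made up; in a real search Sphinx does the sorting.

```ruby
# Hypothetical stand-in for a search result row.
Result = Struct.new(:model, :sort_order, :relevance)

# Ascending sort_order first, then descending relevance within each model.
def order_results(results)
  results.sort_by { |r| [r.sort_order, -r.relevance] }
end

results = [
  Result.new("B", 2, 900),
  Result.new("A", 1, 300),
  Result.new("B", 2, 500),
  Result.new("A", 1, 700)
]

order_results(results).each { |r| puts "#{r.model} #{r.relevance}" }
```

Every 'A' record comes out before any 'B' record, even when a 'B' record has the higher relevance.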

Ordering by count with Thinking Sphinx

I want my search engine to be able to order Lawyers by the count of cases of a certain case type. The more finalized cases of a certain type a lawyer has, the higher they will be ranked.
lawyer.rb
has_many :cases
has_many :case_types, :through => :cases
define_index do
indexes case_types.name, :as => :case_types
has case_types(:id), :as => :case_types_id
has "SUM(case_types)", :as => :case_type_count #this line gives an error, as my lawyer table doesn't have a case_type column; also, I need to count DISTINCT case_types
end
In my search_controller.rb, I would like to do something like that, suggestion being the name of a case type
@lawyers = Lawyer.search params[:suggestion], :order => "@case_type_count DESC"
Am I going the wrong way? Should I think of a less Sphinx-oriented method? The problem is I need to do an each_with_geodist on @lawyers, so I would need to get my lawyers through a Sphinx search.
Add the following to your define_index:
has "COUNT(case_types.id)", :as => :case_type_count, :type => :integer
join case_types
Then order by case_type_count:
Lawyer.search("", :order => "case_type_count desc")
I have found it useful to read the sql code in development.sphinx.conf which allows me to see the column names being generated.
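One detail from the question: the asker wanted distinct case types. With the join through cases, COUNT(case_types.id) sees one row per case, whereas COUNT(DISTINCT case_types.id) (standard SQL, not verified here against the generated Sphinx query) would count each type once. A plain-Ruby sketch of the difference, with made-up ids and hypothetical helper names:

```ruby
# Made-up data: the case type id of each case handled by one lawyer.
case_type_ids_per_case = [3, 3, 5, 3, 8]

# Equivalent of COUNT(case_types.id): one row per case.
def total_count(ids)
  ids.size
end

# Equivalent of COUNT(DISTINCT case_types.id): each type counted once.
def distinct_count(ids)
  ids.uniq.size
end

puts total_count(case_type_ids_per_case)
puts distinct_count(case_type_ids_per_case)
```

Which of the two you want depends on whether the ranking should reward volume of cases or breadth of case types.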

Searching with thinking_sphinx and filtering results

I have a scenario that I thought would be pretty basic, but I found out that I can't really achieve what I need. That's why I have this question for a thinking_sphinx expert.
The scenario is this: I need to do a search within a list of companies and only return those that have an address (there can be many addresses per company) which belongs to a particular city, or that have no address at all (this I can do).
I have the following models :
class Company < ActiveRecord::Base
has_many :company_addresses
define_index do
indexes :name
indexes :description
indexes :keywords
end
end
and
class CompanyAddress < ActiveRecord::Base
end
The CompanyAddress has a city_id property. Without looping through all returned records from a sphinx search, is there a way to achieve the same thing more easily?
I'm using Rails 3.0.3 and thinking_sphinx.
You'll want to add an attribute pointing to the city_id values for the company:
has company_addresses.city_id, :as => :city_ids
And then you can filter on Companies belonging to a specific city:
Company.search 'foo', :with => {:city_ids => @city.id}
If you want to match companies that either belong to a specific city or have no cities at all, that's a little trickier, as OR logic for attribute filters is awkward at best. Ideally what you want is a single attribute that contains either 0, or all city ids. Doing this depends on your database, as MySQL and Postgres functions vary.
As a rough idea, though - this might work in MySQL:
has "IF(COUNT(city_id) = 0, '0', GROUP_CONCAT(city_id SEPARATOR ','))",
:as => :city_ids, :type => :multi
Postgres is reasonably similar, though you may need to use a CASE statement instead of IF, and you'll definitely want to use a couple of functions for the group concatenation:
array_to_string(array_accum(city_id), ',')
(array_accum is provided by Thinking Sphinx, as there was no direct equivalent of GROUP_CONCAT in PostgreSQL).
Anyway, if you need this approach, and get the SQL all figured out, then your query looks something like:
Company.search 'foo', :with => {:city_ids => [0, @city.id]}
This will match on either 0 (representing no cities), or the specific city.
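The sentinel idea can be checked in plain Ruby before fighting with the SQL. This is only a sketch of the intended behaviour: city_ids_attribute and in_city_or_nowhere? are hypothetical helpers, and the city ids are made up.

```ruby
# Hypothetical helper: the value the SQL expression is meant to produce,
# i.e. [0] for a company without addresses, otherwise its city ids.
def city_ids_attribute(city_ids)
  city_ids.empty? ? [0] : city_ids
end

# Emulates :with => {:city_ids => [0, city_id]}, which accepts a company
# when its attribute contains any of the listed values.
def in_city_or_nowhere?(company_city_ids, city_id)
  (city_ids_attribute(company_city_ids) & [0, city_id]).any?
end

puts in_city_or_nowhere?([4, 7], 7)   # company with an address in city 7
puts in_city_or_nowhere?([], 7)       # company with no addresses
puts in_city_or_nowhere?([4], 7)      # company only in another city
```

This relies on 0 never being a real city id, which is the same assumption the attribute itself makes.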
Finally: if you don't reference the company_addresses association anywhere in your normal fields and attributes, you'll need to force the join in your define_index:
join company_addresses
Hopefully that provides enough clues - feel free to continue the discussion here or on the Google Group.

Rails find conditions... where attribute is not a database column

I think it's safe to say everyone loves doing something like this in Rails:
Product.find(:all, :conditions => {:featured => true})
This will return all products where the attribute "featured" (which is a database column) is true. But let's say I have a method on Product like this:
def display_ready?
(self.photos.length > 0) && (File.exist?(self.file.path))
end
...and I want to find all products where that method returns true. I can think of several messy ways of doing it, but I think it's also safe to say we love Rails because most things are not messy.
I'd say it's a pretty common problem for me... I'd have to imagine that a good answer will help many people. Any non-messy ideas?
The only reliable way to filter these is the somewhat ugly method of retrieving all records and running them through a select:
display_ready_products = Product.all.select(&:display_ready?)
This is inefficient in the extreme, especially if you have a large number of products, most of which are probably not going to qualify.
The better way to do this is to have a counter cache for your photos, plus a flag set when your file is uploaded:
class Product < ActiveRecord::Base
has_many :photos
end
class Photo < ActiveRecord::Base
belongs_to :product, :counter_cache => true
end
You'll need to add a column to the Product table:
add_column :products, :photos_count, :integer, :default => 0
This will give you a column with the number of photos. There's a way to pre-populate these counters with the correct numbers at the start instead of zero, but there's no need to get into that here.
Add a column to record your file flag:
add_column :products, :file_exists, :boolean, :null => false, :default => false
Now trigger this when saving:
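class Product < ActiveRecord::Base
before_save :assign_file_exists_flag
protected
def assign_file_exists_flag
self.file_exists = File.exist?(self.file.path)
# Return true explicitly: in Rails versions before 5, a before_save
# callback that returns false halts the save.
true
end
end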
Since these two attributes are rendered into database columns, you can now query on them directly:
Product.find(:all, :conditions => 'file_exists=1 AND photos_count>0')
You can clean that up by writing two named scopes that will encapsulate that behavior.
You need to do a two level select:
1) Select all possible rows from the database. This happens in the db.
2) Within Ruby, select the valid rows from all of the rows. Eg
possible_products = Product.find(:all, :conditions => {:featured => true})
products = possible_products.select{|p| p.display_ready?}
Or, as a single expression:
products = Product.find(:all, :conditions => {:featured => true}).select {|p|
p.display_ready?}
The second select is the select method of the Array object. Select is a very handy method, along with detect. (Detect comes from Enumerable and is mixed in with Array.)
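For anyone unsure of the difference between the two methods, here is a small self-contained sketch. It uses a Struct as a stand-in for the ActiveRecord model, so the field names (photo_count, file_present) are assumptions made for illustration only.

```ruby
# Stand-in for the Product model; the real one would check its photos
# association and the file on disk.
Product = Struct.new(:name, :photo_count, :file_present) do
  def display_ready?
    photo_count > 0 && file_present
  end
end

products = [
  Product.new("lamp",  0, true),
  Product.new("chair", 2, true),
  Product.new("desk",  1, false),
  Product.new("shelf", 3, true)
]

# select returns every element for which the block is true.
ready = products.select(&:display_ready?)
puts ready.map(&:name).inspect

# detect returns only the first such element, or nil if there is none.
first_ready = products.detect(&:display_ready?)
puts first_ready.name
```

Both methods walk the array in Ruby, so the filtering still happens after the rows have been loaded from the database.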
