Rails 5, thinking sphinx, indexing and searching has meny through relationships - ruby-on-rails

I have a Rails App in which I want to use Thinking Sphinx for search. I have a has many though relationship between the following models, Product has many Types through ProductType.
# Product.rb
has_many :product_types
has_many :types, through: :product_types
# Type.rb
has_many :product_types
has_many :products, through: :product_types
# ProductType.rb
belongs_to :product
belongs_to :type
In my ProductsController index action I want to be able to filter which products are shown in the view based on given Variant ids.
My relevant indexes currently looks like this (note, I haven't used ThinkingSphinx in a long time):
# product_index.rb
ThinkingSphinx::Index.define :product, :with => :active_record do
indexes name, :sortable => true
indexes description
indexes brand.name, as: :brand, sortable: true
indexes product_types.type.id, as: :product_types
has created_at, updated_at
end
# type_index.rb
ThinkingSphinx::Index.define :type, :with => :active_record do
indexes name, :sortable => true
end
# product_type_index.rb
ThinkingSphinx::Index.define :product_type, :with => :active_record do
has product_id, type: :integer
has type_id, type: :integer
end
I currently pass an array of :product_types ids in a link_to, like this (let me know if there is a better way to do it):
= link_to "Web shop", products_path(product_types: Type.all.map(&:id), brand: Brand.all.map(&:id)), class: "nav-link"
In my ProductsController I try to filter the result based on the given Type ids like this:
product_types = params[:product_types]
#products = Product.search with_all: { product_types: product_types.collect(&:to_i) }
When I run rake ts:rebuild I get the following error:
indexing index 'product_type_core'...
ERROR: index 'product_type_core': No fields in schema - will not index
And when I tries to view the view in the browser I get the following error:
index product_core: no such filter attribute 'product_types'
- SELECT * FROM `product_core` WHERE `sphinx_deleted` = 0 AND
`product_types` = 1 AND `product_types` = 2 AND `product_types` = 3
LIMIT 0, 20; SHOW META
Any ideas in how to properly set up my indexes (and query) for this case?

There's a few issues to note here:
Firstly, the error you're seeing during rake ts:rebuild is pointing out that you've not set any fields in your ProductType Sphinx index - no indexes calls for text data you wish to search on. Are you actually searching on ProductType at all? If so, what text are you expecting people to match by?
If you're not searching on that model, there's no need to have a Sphinx index for it.
Secondly, the issue with your search - you're filtering on product_types with integers, which makes sense. However, in your index, you've defined product_types as a field (using indexes) rather than an attribute (using has). Given it's integer values and you're likely not expecting someone to type in an ID into a search input, you'll almost certainly want this to be an attribute instead - so change the indexes to a has for that line in your Product index definition, and run ts:rebuild.

Related

MongoDB conditional aggregate query on a HABTM relationship (Mongoid, RoR)?

Rails 4.2.5, Mongoid 5.1.0
I have three models - Mailbox, Communication, and Message.
mailbox.rb
class Mailbox
include Mongoid::Document
belongs_to :user
has_many :communications
end
communication.rb
class Communication
include Mongoid::Document
include Mongoid::Timestamps
include AASM
belongs_to :mailbox
has_and_belongs_to_many :messages, autosave: true
field :read_at, type: DateTime
field :box, type: String
field :touched_at, type: DateTime
field :import_thread_id, type: Integer
scope :inbox, -> { where(:box => 'inbox') }
end
message.rb
class Message
include Mongoid::Document
include Mongoid::Timestamps
attr_accessor :communication_id
has_and_belongs_to_many :communications, autosave: true
belongs_to :from_user, class_name: 'User'
belongs_to :to_user, class_name: 'User'
field :subject, type: String
field :body, type: String
field :sent_at, type: DateTime
end
I'm using the authentication gem devise, which gives access to the current_user helper, which points at the current user logged in.
I have built a query for a controller that satisfied the following conditions:
Get the current_user's mailbox, whose communication's are filtered by the box field, where box == 'inbox'.
It was constructed like this (and is working):
current_user.mailbox.communications.where(:box => 'inbox')
My issue arrises when I try to build upon this query. I wish to chain queries so that I only obtain messages whose last message is not from the current_user. I am aware of the .last method, which returns the most recent record. I have come up with the following query but cannot understand what would need to be adjusted in order to make it work:
current_user.mailbox.communications.where(:box => 'inbox').where(:messages.last.from_user => {'$ne' => current_user})
This query produces the following result:
undefined method 'from_user' for #<Origin::Key:0x007fd2295ff6d8>
I am currently able to accomplish this by doing the following, which I know is very inefficient and want to change immediately:
mb = current_user.mailbox.communications.inbox
comms = mb.reject {|c| c.messages.last.from_user == current_user}
I wish to move this logic from ruby to the actual database query. Thank you in advance to anyone who assists me with this, and please let me know if anymore information is helpful here.
Ok, so what's happening here is kind of messy, and has to do with how smart Mongoid is actually able to be when doing associations.
Specifically how queries are constructed when 'crossing' between two associations.
In the case of your first query:
current_user.mailbox.communications.where(:box => 'inbox')
That's cool with mongoid, because that actually just desugars into really 2 db calls:
Get the current mailbox for the user
Mongoid builds a criteria directly against the communication collection, with a where statement saying: use the mailbox id from item 1, and filter to box = inbox.
Now when we get to your next query,
current_user.mailbox.communications.where(:box => 'inbox').where(:messages.last.from_user => {'$ne' => current_user})
Is when Mongoid starts to be confused.
Here's the main issue: When you use 'where' you are querying the collection you are on. You won't cross associations.
What the where(:messages.last.from_user => {'$ne' => current_user}) is actually doing is not checking the messages association. What Mongoid is actually doing is searching the communication document for a property that would have a JSON path similar to: communication['messages']['last']['from_user'].
Now that you know why, you can get at what you want, but it's going to require a little more sweat than the equivalent ActiveRecord work.
Here's more of the way you can get at what you want:
user_id = current_user.id
communication_ids = current_user.mailbox.communications.where(:box => 'inbox').pluck(:_id)
# We're going to need to work around the fact there is no 'group by' in
# Mongoid, so there's really no way to get the 'last' entry in a set
messages_for_communications = Messages.where(:communications_ids => {"$in" => communications_ids}).pluck(
[:_id, :communications_ids, :from_user_id, :sent_at]
)
# Now that we've got a hash, we need to expand it per-communication,
# And we will throw out communications that don't involve the user
messages_with_communication_ids = messages_for_communications.flat_map do |mesg|
message_set = []
mesg["communications_ids"].each do |c_id|
if communication_ids.include?(c_id)
message_set << ({:id => mesg["_id"],
:communication_id => c_id,
:from_user => mesg["from_user_id"],
:sent_at => mesg["sent_at"]})
end
message_set
end
# Group by communication_id
grouped_messages = messages_with_communication_ids.group_by { |msg| mesg[:communication_id] }
communications_and_message_ids = {}
grouped_messages.each_pair do |k,v|
sorted_messages = v.sort_by { |msg| msg[:sent_at] }
if sorted_messages.last[:from_user] != user_id
communications_and_message_ids[k] = sorted_messages.last[:id]
end
end
# This is now a hash of {:communication_id => :last_message_id}
communications_and_message_ids
I'm not sure my code is 100% (you probably need to check the field names in the documents to make sure I'm searching through the right ones), but I think you get the general pattern.

Thinking-Sphinx with ActsAsTaggableOn tags search

I have a Rails app in which I use Thinking-Sphinx for search and ActsAsTaggableOn for tagging. I want to be able to include the currently selected tag in my search query. I have tried the following but not got it to work.
In my controller:
def show
#team_tags = #team.ideas.tag_counts_on(:tags)
if params[:tag]
#ideas = #team.ideas.search(params[:search], :conditions => { tags: "tag" })
else
#ideas = #team.ideas.search(params[:search])
end
end
My index for my Idea model:
ThinkingSphinx::Index.define :idea, :with => :real_time do
[...]
indexes tags.name, :as => :tags
has user_id, type: :integer
has team_id, type: :integer
[...]
end
This gives me he following error:
ActionView::Template::Error (index idea_core: query error: no field 'tags' found in schema
When I have a tag selected my URLs looks like this:
/teams/1/tags/tag
So, what should I do to get Thinking-Sphinx and ActsAsTaggableOn to work together?
What you've got for your field will only work with SQL-backed indices, not real-time indices.
In your case, what you want to do is create a method in your model that returns all the tag names as a single string:
def tag_names
tags.collect(&:name).join(' ')
end
And then you can refer to that in your index definition:
indexes tag_names, :as => :tags
Once you've done that, you'll need to regenerate your Sphinx indices, as you've changed the structure: rake ts:regenerate.

Extract Mongoid documents based on the DateTime of their last has_many relations?

I have a bunch of orders, and some of them have order_confirmations.
1: I wish to extract a list of orders based on the DateTime of its last order_confirmation. This is my failed attempt (returns 0 records):
Order.where(:order_confirmations.exists => true).desc("order_confirmations.last.datetime")
2: I wish to extract a list of orders where the last order_confirmation is between 5 and 10 days old. This is my failed attempt (returns 0 results):
Order.lte("order_confirmations.last.datetime" => 5.days.ago).gte("order_confirmations.last.datetime" => 10.days.ago)
My relations:
class Order
include Mongoid::Document
has_many :order_confirmations
end
class OrderConfirmation
include Mongoid::Document
field :datetime, type: DateTime
belongs_to :order
end
With referenced relationships, you cannot directly query referenced documents.
That said, you would probably want to query order confirmations first, and then select the orders like this:
OrderConfirmation.between(datetime: 10.days.ago..5.days.ago)
.distinct(:order_id).map { |id| Order.find(id) }
If you had confirmations embedded into the order, like this
class Order
include Mongoid::Document
embeds_many :order_confirmations
end
class OrderConfirmation
include Mongoid::Document
field :datetime, type: DateTime
embedded_in :order
end
Then you could query order confirmation inside order query with $elemMatch:
Order.elem_match(order_confirmations:
{ :datetime.gte => 10.days.ago, :datetime.lte => 5.days.ago })
Regarding your first question, I don't think it's possible to do that with just MongoDB queries, so you could do something like
# if you go embedded rels
Order.all.map { |o| o.order_confirmations.desc(:datetime).first }
.sort_by(&:datetime).map(&:order)
# if you stay on referenced rels
OrderConfirmation.desc(:datetime).group_by(&:order)
.map { |k, v| v.first }.map(&:order)
Check out the elemMatch function.
where('$elemMatch' => [{...}]
I do believe there is a bug in mongoid though related to elemMatch and comparing dates, not sure if its been fixed.

Indexing fields + custom text in with Thinking Sphinx

I've got indexes on a few different models, and sometimes the user might search for a value which exists in multiple models. Now, if the user is really only interested in data from one of the models I'd like the user to be able to pre/postfix the query with something to limit the scope.
For instance, if I only want to find a match in my Municipality model, I've set up an index in that model so that the user now can query "xyz municipality" (in quotes):
define_index do
indexes :name, :sortable => true
indexes "name || ' municipality' name", :as => :extended_name, :type => :string
end
This works just fine. Now I also have a Person model, with a relation to Municipality. I'd like, when searching only on the Person model, to have the same functionality available, so that I can say Person.search("xyz municipality") and get all people connected to that municipality. This is my current definition in the Person model:
has_many :municipalities, :through => :people_municipalities
define_index do
indexes [lastname, firstname], :as => :name, :sortable => true
indexes municipalities.name, :as => :municipality_name, :sortable => true
end
But is there any way I can create an index on this model, referencing municipalities, like the one I have on the Municipality model itself?
If you look at the generated SQL in the sql_query setting of config/development.sphinx.conf for source person_core_0, you'll see how municipalities.name is being concatenated together (I'd post an example, but it depends on your database - MySQL and PostgreSQL handle this completely differently).
I would recommend duplicating the field, and insert something like this (SQL is pseudo-code):
indexes "GROUP_CONCAT(' municipality ' + municipalities.name)",
:as => :extended_municipality_names
Also: there's not much point adding :sortable true to either this nor the original field from the association - are you going to sort by all of the municipality names concat'd together? I'm guessing not :)

Rails find conditions... where attribute is not a database column

I think it's safe to say everyone loves doing something like this in Rails:
Product.find(:all, :conditions => {:featured => true})
This will return all products where the attribute "featured" (which is a database column) is true. But let's say I have a method on Product like this:
def display_ready?
(self.photos.length > 0) && (File.exist?(self.file.path))
end
...and I want to find all products where that method returns true. I can think of several messy ways of doing it, but I think it's also safe to say we love Rails because most things are not messy.
I'd say it's a pretty common problem for me... I'd have to imagine that a good answer will help many people. Any non-messy ideas?
The only reliable way to filter these is the somewhat ugly method of retrieving all records and running them through a select:
display_ready_products = Product.all.select(&:display_ready?)
This is inefficient to the extreme especially if you have a large number of products which are probably not going to qualify.
The better way to do this is to have a counter cache for your photos, plus a flag set when your file is uploaded:
class Product < ActiveRecord::Base
has_many :photos
end
class Photo < ActiveRecord::Base
belongs_to :product, :counter_cache => true
end
You'll need to add a column to the Product table:
add_column :products, :photos_count, :default => 0
This will give you a column with the number of photos. There's a way to pre-populate these counters with the correct numbers at the start instead of zero, but there's no need to get into that here.
Add a column to record your file flag:
add_column :products, :file_exists, :boolean, :null => false, :default => false
Now trigger this when saving:
class Product < ActiveRecord::Base
before_save :assign_file_exists_flag
protected
def assign_file_exists_flag
self.file_exists = File.exist?(self.file.path)
end
end
Since these two attributes are rendered into database columns, you can now query on them directly:
Product.find(:all, :conditions => 'file_exists=1 AND photos_count>0')
You can clean that up by writing two named scopes that will encapsulate that behavior.
You need to do a two level select:
1) Select all possible rows from the database. This happens in the db.
2) Within Ruby, select the valid rows from all of the rows. Eg
possible_products = Product.find(:all, :conditions => {:featured => true})
products = possible_products.select{|p| p.display_ready?}
Added:
Or:
products = Product.find(:all, :conditions => {:featured => true}).select {|p|
p.display_ready?}
The second select is the select method of the Array object. Select is a very handy method, along with detect. (Detect comes from Enumerable and is mixed in with Array.)

Resources