Tire gem: index a Rails parent/child relationship in Elasticsearch - ruby-on-rails

Having a tough time wrapping my head around Tire's syntax, that of Elasticsearch and how they map together.
I have successfully indexed PDFs in a Rails app via Tire, but I need to break the full reports down into individual pages so queries can be more granular. It's easy enough to split the PDFs into individual pages and add them to a Page model that belongs_to the full Report model. What I'm struggling with is how to set up the mapping, and where it should go. I'd like to take advantage of Elasticsearch's parent field mapping (_parent) so I can realize this ultimate goal.
Hoping someone can set me straight.
Report model (this is working for me if I index an entire PDF as the :attachment):
class Report < ActiveRecord::Base
include Tire::Model::Search
include Tire::Model::Callbacks
has_many :pages, :dependent => :destroy
attr_accessible :filename, :title
tire.mapping do
indexes :id, :type =>'integer'
indexes :title
indexes :attachment, :type => 'attachment',
:fields => {
:content_type => { :store => 'yes' },
:author => { :store => 'yes' },
:title => { :store => 'yes' },
:attachment => { :term_vector => 'with_positions_offsets', :store => 'yes' },
:date => { :store => 'yes' }
}
end
...
end
Page model:
class Page < ActiveRecord::Base
include Tire::Model::Search
include Tire::Model::Callbacks
belongs_to :report
attr_accessible :filename, :page_num
tire.mapping do
indexes :id, :type => 'integer'
indexes :page_num, :type => 'integer'
indexes :report_id, :type => 'integer' ###<== how is this associated with _parent?
indexes :attachment, :type => 'attachment',
:fields => {
...
...
end
end
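A minimal sketch of what the parent/child mapping looks like on the Elasticsearch side, written as plain Ruby hashes so it runs without Tire or a cluster. It assumes an Elasticsearch version that still has the `_parent` mapping field (it was removed in favour of the join field in 6.0), and the index name `reports` and the helper names `page_mapping` and `page_index_path` are made up for illustration. Two things are worth noting: `_parent` names the parent type and sits next to `properties`, not on the `report_id` field, and the parent's id is passed as a `parent` parameter when each page is indexed.

```ruby
require 'json'

# Mapping for the child type. The `_parent` entry names the parent *type*;
# it sits beside `properties`, not inside the `report_id` field.
def page_mapping(parent_type = 'report')
  {
    'page' => {
      '_parent'    => { 'type' => parent_type },
      'properties' => {
        'id'         => { 'type' => 'integer' },
        'page_num'   => { 'type' => 'integer' },
        'report_id'  => { 'type' => 'integer' },
        'attachment' => { 'type' => 'attachment' }
      }
    }
  }
end

# Path for indexing one child document. The `parent` parameter carries the
# report's id so the page is routed to the same shard as its report.
# (Hypothetical helper; the index name is an assumption.)
def page_index_path(index, page_id, report_id)
  "/#{index}/page/#{page_id}?parent=#{report_id}"
end

puts JSON.pretty_generate(page_mapping)
puts page_index_path('reports', 7, 3)
```

Parent and child documents have to live in the same index, so in a Tire setup the Page model would probably need to point at the Report index rather than its default one. How Tire passes `_parent` through its `mapping` call is not shown here and should be checked against the Tire version in use.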

Related

Elasticsearch - No value specified for terms query

Calling all elasticsearch experts:
Environment:
elasticsearch (6.8.3)
rails 5.1.7
Linux
We changed the association from our user model to the person model from a has_one to a has_many. We did not, however, remove the person_id column, as this can cause other issues (e.g. running code from another branch that still tries to use it).
It seems no matter what we do to the index definitions for ElasticSearch, if any null values exist in person_id (which they do now with all newly added records) we get
"No value specified for terms query"
I have confirmed this by populating that column with an integer value resulting in no error occurring.
I have removed person_id from every index. I even removed any reference to the peoples table in the ES index declaration, but it appears that as soon as ES sees there is an association to a Person model and it sees the person_id column, it tries to use it.
Besides writing an arbitrary value in the person_id column, does anyone else have a better approach? We DO NEED to search fields in the associated Person model (peoples table) but since it's a has_many, it shouldn't be using the person_id column which is no longer relevant.
We also considered reversing the search from Person to User but that would require a LOT of recoding.
Is it possible a newer version of ES will fix it? We'd probably have to upgrade the ES server as well if we update the gem right?
Here is the association:
has_many :people, :class_name => Person.to_s # I believe it's this class name that's causing it to try to use the person_id column
Here is the index definition
def as_indexed_json(_options = {})
as_json(
:only => %i[id email portal_id],
:include => {
:portal => { :only => %i[name slug] },
:people => { :methods => %i[first last full_name], :only => %i[first last full_name] },
:events => { :only => [:id] },
:account_brokerage_firms => { :only => [:id] },
:brokerage_firms => { :only => [:id] },
:roles => { :methods => [:role_name], :only => [:role_name] }
}
)
end
settings AUTOCOMPLETE_SETTINGS do
mapping :dynamic => 'false' do
indexes :id, :type => 'long'
indexes :email, :type => 'text', :analyzer => 'email'
indexes :portal_id, :type => 'long'
indexes :people do
indexes :first, :type => 'text', :analyzer => 'autocomplete'
indexes :last, :type => 'text', :analyzer => 'autocomplete'
indexes :full_name, :type => 'text', :analyzer => 'autocomplete'
end
indexes :portal do
indexes :name, :type => 'text', :analyzer => 'autocomplete'
indexes :slug, :type => 'text', :analyzer => 'autocomplete'
end
indexes :events do
indexes :id, :type => 'long'
end
indexes :account_brokerage_firms do
indexes :id, :type => 'long'
end
indexes :brokerage_firms do
indexes :id, :type => 'long'
end
indexes :roles do
indexes :role_name, :type => 'text', :analyzer => 'autocomplete'
end
end
end
# end of elastic search settings
# this function assumes there is only one user role between a user and an account
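The question does not show the query that is actually sent, so this is only a guess at the cause: Elasticsearch raises "No value specified for terms query" when a `terms` clause arrives with a null value, which would fit a clause built from a `person_id` that is now nil. A minimal sketch of guarding against that in plain Ruby; `terms_clause` and `bool_filter` are hypothetical helpers, not part of any gem.

```ruby
# Build a `terms` clause, or return nil when there is nothing to match on,
# so a nil column value never reaches Elasticsearch as `terms: { field: nil }`.
def terms_clause(field, values)
  present = Array(values).compact
  return nil if present.empty?

  { terms: { field => present } }
end

# Assemble a bool filter from whichever clauses are actually present.
def bool_filter(*clauses)
  { bool: { filter: clauses.compact } }
end

query = bool_filter(
  terms_clause(:person_id, nil),      # dropped: no value to search for
  terms_clause(:portal_id, [4, 9])    # kept as-is
)
p query
```

If the clause is generated somewhere other than the application's own query code, logging the request body that is sent to Elasticsearch should show where the null comes from.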

globalize with search_cop - unknown attribute

I'm trying to use the globalize gem and search_cop together. In my model I have:
class Museum < ApplicationRecord
include SearchCop
has_one_attached :hero_image
translates :name, :address, :description, :facilities, :hours, :tickets
search_scope :search do
attributes :name, :address
options :name, :type => :fulltext
options :address, :type => :fulltext
end
end
But when I go to search I get:
irb(main):006:0> Museum.search("art")
SearchCop::UnknownAttribute: Unknown attribute museums.name
Is it possible to use Globalize and SearchCop together? If so, how do I specify the translated fields to search on?
To use Globalize with SearchCop you need to define the translated attributes through their association. So something like:
search_scope :search do
attributes name: "translations.name", address: "translations.address"
options :name, :type => :fulltext
options :address, :type => :fulltext
end

Elasticsearch / Tire - Flattening Nested Objects

In reference to the Query DSL Explained Tutorial Slides 14-15
How do I flatten Nested Objects?
I have a Model named Entry and another named Category and they share a HABTM association.
Everything is currently working and the search results seem to be correct, but I don't know if my mapping is correct. The tutorial says that when you flatten objects the Document will look like this :
{
tweet => "Perl is GREAT!",
posted => "2011-08-15",
user.name => "Clinton Gormley",
user.email => "drtech@cpan.org",
tags => ["perl","opinion"],
posts => 2,
}
with the Object user being flattened. When I look at the source of my JSON document it looks like this:
{
"title":"First",
"description":"first test",
"categories":
{"categories_name":"CAP and Using the CAP website"},
"attachment":"VEVTVCE=\n",
"published":true
}
So, I'm assuming that it's supposed to say categories.categories_name, but I don't know how to specify that or if that's even necessary. Here's some Model code:
class Entry < ActiveRecord::Base
include Tire::Model::Search
include Tire::Model::Callbacks
has_and_belongs_to_many :categories
mount_uploader :doc, EntryDocUploader
tire.mapping do
indexes :title
indexes :description
indexes :categories do
indexes :categories_name, type: 'string', index: 'not_analyzed'
end
indexes :attachment, :type => 'attachment',
:fields => {
:title => { :store => 'yes' },
:attachment => { :term_vector => 'with_positions_offsets', :store => 'yes' }
}
end
def to_indexed_json
{
:title => title,
:description => description,
:categories => {:categories_name => cats}, #categories.map { |c| { :categories_name => c.name}}.to_sentence,
:attachment => attachment,
}.to_json
end
def self.search(params)
tire.search(load: true) do
query { string params[:query], default_operator: "AND" } if params[:query].present?
filter :term, :published => "true"
end
end
def cats
categories.map(&:name).to_sentence
end
end
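For what it's worth, the flattening in the slides describes how Elasticsearch indexes object fields internally. The stored JSON source stays nested, so seeing `"categories": {"categories_name": ...}` in the document is expected, and the field is still addressed as `categories.categories_name` in queries. A rough sketch of that flattening in plain Ruby, purely as an illustration (the `flatten_document` helper is made up and is not how Elasticsearch implements it):

```ruby
# Flatten nested hashes into dotted keys. For arrays of hashes, every element
# contributes to the same dotted key, so the values end up in one list.
def flatten_document(doc, prefix = nil, out = {})
  doc.each do |key, value|
    path = prefix ? "#{prefix}.#{key}" : key.to_s
    if value.is_a?(Hash)
      flatten_document(value, path, out)
    elsif value.is_a?(Array) && !value.empty? && value.all? { |v| v.is_a?(Hash) }
      value.each { |element| flatten_document(element, path, out) }
    elsif out.key?(path)
      # Same dotted key seen again: collect the values together.
      out[path] = Array(out[path]) + Array(value)
    else
      out[path] = value
    end
  end
  out
end

entry = {
  'title'      => 'First',
  'categories' => [
    { 'categories_name' => 'CAP' },
    { 'categories_name' => 'Website' }
  ]
}
p flatten_document(entry)
```

The second half of the sketch also shows why flattened arrays of objects lose the association between fields of the same element, which is the case where a `nested` type mapping becomes relevant.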

Paperclip isn't saving file in time

I don't know what I did or what's changed, because this was working before.
I have a Model Entry and I use Paperclip to attach a file document to it. Now, for some weird reason I keep getting a
Errno::ENOENT in EntriesController#create
No such file or directory - /var/www/capsf-web/public/assets/entries/test.pdf
I'm guessing that I'm already trying to encode the file before Paperclip has saved it to the directory. Here's what Entry looks like. I'm using Elasticsearch's attachment mapper, which is why I Base64-encode it.
class Entry < ActiveRecord::Base
include Tire::Model::Search
include Tire::Model::Callbacks
has_and_belongs_to_many :categories
has_and_belongs_to_many :subcategories
belongs_to :entry_type
has_attached_file :document,
:url => "/assets/entries/:basename.:extension",
:path => ":rails_root/public/assets/entries/:basename.:extension"
before_post_process :image?
validates_presence_of :entry_type
attr_accessible :description, :title, :url, :category_ids, :subcategory_ids, :entry_type_id, :document
mapping do
indexes :title
indexes :description
indexes :categories do
indexes :name
end
indexes :subcategories do
indexes :name
end
indexes :entry_type
indexes :document, :type => 'attachment'
end
def to_indexed_json
#to_json( methods: [:category_name, :subcategory_name, :entry_type_name])
{
:title => title,
:description => description,
:categories => categories.map { |c| { :name => c.name}},
:subcategories => subcategories.map { |s| { :name => s.name}},
:entry_type => entry_type_name,
:document => attachment
}.to_json
end
def image?
!(document_content_type =~ /^image.*/).nil?
end
def attachment
if document.present?
path_to_document = Rails.public_path+"/assets/entries/#{document_file_name}"
Base64.encode64(open(path_to_document) { |pdf| pdf.read})
#If I comment out the line above everything works just fine.
end
end
end
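If the guess above is right and the file is simply not on disk yet when the document is serialized, one defensive option is to encode only when the file exists. A minimal sketch in plain Ruby, with `encode_attachment` as a hypothetical stand-in for the model's `attachment` method; it avoids the Errno::ENOENT but does not by itself get the content indexed, so the record would still need to be reindexed once the file has been written.

```ruby
require 'base64'
require 'tempfile'

# Return the Base64-encoded contents of the file at `path`, or nil when the
# file has not been written to disk yet.
def encode_attachment(path)
  return nil unless path && File.file?(path)

  Base64.encode64(File.binread(path))
end

# Usage: a file that exists is encoded, a missing one yields nil.
Tempfile.create(['test', '.pdf']) do |file|
  file.write('TEST!')
  file.flush
  puts encode_attachment(file.path)
end
p encode_attachment('/no/such/file.pdf')
```

Reading in binary mode (`File.binread`) also keeps PDF bytes intact regardless of the default external encoding.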

Using factory_girl with mongoid to test referenced_in/references_many

I am trying to test an associated document for a subscription service. Each subscription is embedded in an account and references a plan. Below are the various bits of code:
The account:
Factory.define :account, :class => Account do |a|
a.subdomain 'test'
a.agents { [ Factory.build(:user) ] }
a.subscription { Factory.build(:free_subscription) }
end
The subscription:
Factory.define :free_subscription, :class => Subscription do |s|
s.started_at Time.now
s.plan { Factory.build(:free_plan) }
end
The plan:
Factory.define :free_plan, :class => Plan do |p|
p.plan_name 'Free'
p.cost 0
end
The error:
Mongoid::Errors::InvalidCollection: Access to the collection for Subscription is not allowed since it is an embedded document, please access a collection from the root document.
If I comment out the line that links the plan to the subscription then the tests work, but obviously I can't test that the subscription has a plan.
Any suggestions would be greatly appreciated.
UPDATE:
Here are the models:
class Account
include Mongoid::Document
field :company_name, :type => String
field :subdomain, :type => String
field :joined_at, :type => DateTime
embeds_one :subscription
accepts_nested_attributes_for :subscription
before_create :set_joined_at_date
private
def set_joined_at_date
self.joined_at = Time.now
end
end
class Subscription
include Mongoid::Document
field :coupon_verified, :type => Boolean
field :started_at, :type => DateTime
referenced_in :plan
embedded_in :account, :inverse_of => :subscription
end
class Plan
include Mongoid::Document
field :plan_name, :type => String
field :cost, :type => Integer
field :active_ticket_limit, :type => Integer
field :agent_limit, :type => Integer
field :company_limit, :type => Integer
field :client_limit, :type => Integer
field :sla_support, :type => Boolean
field :report_support, :type => Boolean
references_many :subscriptions
end
You need to create an account with the subscription for it to be valid.
Factory.define :free_subscription, :class => Subscription do |s|
s.started_at Time.now
s.plan { Factory.build(:free_plan) }
s.account { Factory(:account) }
end

Resources