Ruby class attributes not querying correctly - ruby-on-rails

I have a class in my Ruby on Rails project for Institutions that I recently had to add an attribute to, an :identifier. The class has a custom metadata field that accompanies it for searching and indexing purposes. The problem is that the new attribute I added isn't helping me find objects the way I wanted. If I try to query for an object using the :identifier, I consistently get an empty array. And yes, I have checked multiple times to ensure that the test object actually exists.
This is the model:
class Institution < ActiveFedora::Base
include Hydra::AccessControls::Permissions
# NOTE with rdf datastreams must query like so ins = Institution.where(desc_metadata__name_tesim: "APTrust")
has_metadata "rightsMetadata", type: Hydra::Datastream::RightsMetadata
has_metadata 'descMetadata', type: InstitutionMetadata
has_many :intellectual_objects, property: :is_part_of
has_attributes :name, :brief_name, :identifier, datastream: 'descMetadata', multiple: false
validates :name, :identifier, presence: true
validate :name_is_unique
validate :identifier_is_unique
def users
User.where(institution_pid: self.pid).to_a.sort_by(&:name)
end
private
def name_is_unique
errors.add(:name, "has already been taken") if Institution.where(desc_metadata__name_ssim: self.name).reject{|r| r == self}.any?
end
def identifier_is_unique
count = 0;
Institution.all.each do |inst|
count += 1 if inst.identifier == self.identifier
end
if(count > 0)
errors.add(:identifier, "has already been taken")
end
#errors.add(:identifier, "has already been taken") if Institution.where(desc_metadata__identifier_ssim: self.identifier).reject{|r| r.identifier == self.identifier}.any?
end
end
As you can see, I had to write a very different method to check for the uniqueness of an identifier because the .where method wasn't returning anything. I didn't realize that was the problem, though, until I started working on the show action in the controller (below):
def show
identifier = params[:identifier] << "." << params[:format]
@institution = Institution.where(desc_metadata__identifier_ssim: identifier)
end
This never returns anything even though I have several Institution objects in my database and have double and triple checked that the URL parameters are correct. And part of double checking that was searching for objects in the console. Here's the output:
Loading development environment (Rails 4.0.3)
2.0.0-p353 :001 > ap = Institution.where(desc_metadata__name_ssim: "APTrust")
ActiveFedora: loading fedora config from /Users/kec6en/HydraApp/fluctus/config/fedora.yml
ActiveFedora: loading solr config from /Users/kec6en/HydraApp/fluctus/config/solr.yml
Loaded datastream list for aptrust-dev:379 (3.2ms)
Loaded datastream profile aptrust-dev:379/RELS-EXT (2.7ms)
Loaded datastream content aptrust-dev:379/RELS-EXT (2.4ms)
Loaded datastream profile aptrust-dev:379/descMetadata (2.6ms)
Loaded datastream profile aptrust-dev:379/descMetadata (3.5ms)
Loaded datastream content aptrust-dev:379/descMetadata (3.1ms)
=> [#<Institution pid: "aptrust-dev:379", name: "APTrust", brief_name: "apt", identifier: "aptrust.org">]
2.0.0-p353 :002 > apt = Institution.where(desc_metadata__identifier_ssim: "aptrust.org")
=> []
2.0.0-p353 :003 >
As you can see, I'm querying for an identifier that does exist but it's not finding anything. For reference, here is the metadata datastream that I'm working off of. Note that the identifier is indexed as :stored_searchable so I should be able to query for it.
class InstitutionMetadata < ActiveFedora::RdfxmlRDFDatastream
map_predicates do |map|
map.name(in: RDF::DC, to: 'title') { |index| index.as :symbol, :stored_searchable }
map.brief_name(in: RDF::DC, to: 'alternative')
map.identifier(in: RDF::DC, to: 'identifier') { |index| index.as :symbol, :stored_searchable }
end
end
I modeled it after the name attribute because that one appears to be working. Any ideas why the identifier isn't?
Thanks!

Why don't you just search the identifier directly instead of using the desc_metadata?
Institution.where identifier: "aptrust.org"

Using desc_metadata__identifier_tesim instead of desc_metadata__identifier_ssim seems to work for finding objects, although it still doesn't work for the uniqueness checking method I wrote.
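To compare the two suffixes side by side, here is a minimal sketch of how the field names in the question are put together. solr_field is a hypothetical helper written for illustration, not part of ActiveFedora or Solrizer; it only mirrors the naming pattern visible in the queries above (underscored datastream name, double underscore, attribute, suffix).

```ruby
# Hypothetical helper (not an ActiveFedora API): rebuilds the Solr field names
# seen in the question from a datastream id, an attribute, and a suffix.
def solr_field(datastream_id, attribute, suffix)
  # "descMetadata" becomes "desc_metadata"
  prefix = datastream_id.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  "#{prefix}__#{attribute}_#{suffix}"
end

# The query that returned [] and the one that found the object differ only in the suffix.
puts solr_field("descMetadata", :identifier, "ssim")
puts solr_field("descMetadata", :identifier, "tesim")
```

Printing both names makes it easy to check which of them actually appears in the indexed Solr document for the object.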

Related

attr_accessor not updating value from rails model

I have the following model
class Job < ActiveRecord::Base
attr_accessor :incentive
end
I want to be able to store a temporary column in my model via attr_accessor.
I want to be able to do something like this
job = Job.last
job.incentive = {id: 1}
and I expect job.incentive to then return {id: 1}.
I also tried defining the accessors myself:
def incentive=(val)
@incentive = val
end
def incentive
@incentive
end
But that also didn't work. How can I store temporary column values in Rails 4?
Your script is fine; you'll find that the script below works perfectly in your Rails console:
job = Job.last
job.incentive = { id: 1 }
p job.incentive # returns { id: 1 }
If you restart or refresh your console (or webpage) this information is gone, since it is only set in memory and not stored to the database.
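The in-memory behaviour can be reproduced without Rails. The sketch below uses a plain Ruby class named Job as a stand-in for the model (an assumption for illustration; it has no ActiveRecord and no database), and a second Job.new stands in for loading the record again.

```ruby
# Plain-Ruby stand-in for the Job model (no ActiveRecord, no database).
class Job
  attr_accessor :incentive # defines incentive and incentive=, backed by @incentive
end

job = Job.new
job.incentive = { id: 1 }
p job.incentive      # the value set above, held in memory on this object

reloaded = Job.new   # stands in for fetching the record again
p reloaded.incentive # nil, because nothing was persisted
```

The value lives only on the object it was assigned to, which is why a fresh instance (or a fresh request) does not see it.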

ruby mongoDB insert_many - success message but no inserts

I'm trying to do insert_many using Ruby Driver of MongoDB but it's not working. Any help would be appreciated.
Here's my sample model:
class User
include Mongoid::Document
include Mongoid::Timestamps
field :message
end
MongoDB Rails code:
client = Mongo::Client.new('mongodb://127.0.0.1:27017/development')
collection = client[:user]
u = Hash.new
u['message'] = 'hi'
documents = []
documents << u
result = collection.insert_many(documents)
#<Mongo::BulkWrite::Result:0x00007fa6ed4e99b8 @results={"n_inserted"=>1, "n"=>1, "inserted_ids"=>[BSON::ObjectId('5e9ac4c6c40dc6a955465a8f')]}>
When I verify the insert, it seems to work, but when I query the model, there's no data:
result
#<Mongo::BulkWrite::Result:0x00007fa6ed4e99b8 @results={"n_inserted"=>1, "n"=>1, "inserted_ids"=>[BSON::ObjectId('5e9ac4c6c40dc6a955465a8f')]}>
User.count
0
Any suggestions?
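I finally figured out the issue: it's the collection itself. client[:user] writes to a collection named user, while Mongoid by default stores User documents in a collection named users, so User.count never sees the inserted document. All I need to do is get the collection from the model, and then I can apply the MongoDB Ruby Driver methods to it: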
user_collection = User.collection

ActiveRecord #becomes! record not saving

I've got some STI in my data model. There are two types of Task records: PrimaryTask and SecondaryTask. So my ActiveRecord models look like this:
class Task < ActiveRecord::Base
end
class PrimaryTask < Task
has_many :secondary_tasks
end
class SecondaryTask < Task
belongs_to :primary_task
end
I want to provide a way to "promote" a SecondaryTask to a PrimaryTask permanently (as in, persisted in the database). From perusing the docs, looks like the #becomes! method is what I want, but I can't get it to save the changes in the database.
id = 1
secondary_task = SecondaryTask.find(id)
primary_task = secondary_task.becomes!(PrimaryTask)
primary_task.id # => 1
primary_task.class # => PrimaryTask
primary_task.type # => "PrimaryTask"
primary_task.new_record? # => false
primary_task.changes # => { "type"=>[nil,"PrimaryTask"] }
primary_task.save! # => true
primary_task.reload # => raises ActiveRecord::RecordNotFound: Couldn't find PrimaryTask with id=1 [WHERE "tasks"."type" IN ('PrimaryTask')]
# Note: secondary_task.reload works fine, because the task's type did not change in the DB
Any idea what's up? I tried the following things, to no avail. Am I misunderstanding becomes!?
Force the record to be 'dirty' in case the save! call was a no-op because none of the attributes were marked dirty (primary_task.update_attributes(updated_at: Time.current) -- didn't help)
Destroy secondary_task in case the fact that they both have the same id was a problem. Didn't help. The SecondaryTask record was deleted but no PrimaryTask was created (despite the call to save! returning true)
UPDATE 1
The logs show the probable issue:
UPDATE "tasks" SET "type" = $1 WHERE "tasks"."type" IN ('PrimaryTask') AND "tasks"."id" = 2 [["type", "PrimaryTask"]]
So the update is failing because the WHERE clause causes the record not to be found.
Figured it out. Turns out there was a bug in ActiveRecord version 4.0.0. It has since been patched. The key change this patch introduced was to set the changes correctly in both instances. So now you can call save on the original instance (in my example secondary_task) and it will change the type in the database. Note that calling save on the new instance (for me primary_task) will NOT save the changes, because of the behavior described in the question: it will include a WHERE clause in the SQL UPDATE call that will cause the record not to be found and thus the call to do nothing.
Here's what works with ActiveRecord > 4.1.0:
id = 1
secondary_task = SecondaryTask.find(id)
primary_task = secondary_task.becomes!(PrimaryTask)
secondary_task.changes # => { "type"=>["SecondaryTask","PrimaryTask"] }
primary_task.changes # => { "type"=>["SecondaryTask","PrimaryTask"] }
secondary_task.save! # => true
primary_task.reload # => works because the record was updated as expected
secondary_task.reload # => raises ActiveRecord::RecordNotFound, as expected
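The effect of the WHERE clause can be illustrated with a toy in-memory table. This is only a sketch of the SQL semantics described above, not ActiveRecord code; update_type and the sample row are hypothetical.

```ruby
# Toy stand-in for the "tasks" table and the UPDATE statement from the log.
# update_type is a hypothetical helper; it is not ActiveRecord code.
def update_type(rows, id:, scoped_types:, new_type:)
  # WHERE "tasks"."type" IN (scoped_types) AND "tasks"."id" = id
  matching = rows.select { |row| row[:id] == id && scoped_types.include?(row[:type]) }
  # SET "type" = new_type
  matching.each { |row| row[:type] = new_type }
  matching.size # number of rows the UPDATE touched
end

rows = [{ id: 1, type: "SecondaryTask" }]

# save! on the PrimaryTask instance: scoped to the new type, so nothing matches.
puts update_type(rows, id: 1, scoped_types: ["PrimaryTask"], new_type: "PrimaryTask")

# save! on the original SecondaryTask instance: scoped to the old type, so the row is found.
puts update_type(rows, id: 1, scoped_types: ["SecondaryTask"], new_type: "PrimaryTask")
```

The first call changes no rows because the stored type is still "SecondaryTask"; only the second call, scoped to the old type, finds the row and rewrites it.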

Unit Testing Tire (Elastic Search) - Filtering Results with Method from to_indexed_json

I am testing my Tire / ElasticSearch queries and am having a problem with a custom method I'm including in to_indexed_json. For some reason, it doesn't look like it's getting indexed properly - or at least I cannot filter with it.
In my development environment, my filters and facets work fine and I get the expected results. However, in my tests I consistently see zero results, and I cannot figure out where I'm going wrong.
I have the following:
def to_indexed_json
to_json methods: [:user_tags, :location_users]
end
For which my user_tags method looks as follows:
def user_tags
tags.map(&:content) if tags.present?
end
Tags is a polymorphic relationship with my user model:
has_many :tags, :as => :tagable
My search block looks like this:
def self.online_sales(params)
s = Tire.search('users') { query { string '*' }}
filter = []
filter << { :range => { :created_at => { :from => params[:start], :to => params[:end] } } }
filter << { :terms => { :user_tags => ['online'] }}
s.facet('online_sales') do
date :created_at, interval: 'day'
facet_filter :and, filter
end
end
end
I have checked the user_tags are included using User.last.to_indexed_json:
{"id":2,"username":"testusername", ... "user_tags":["online"] }
In my development environment, if I run the following query, I get a per day list of online sales for my users:
@sales = User.online_sales(start_date: Date.today - 100.days).results.facets["online_sales"]
"_type"=>"date_histogram", "entries"=>[{"time"=>1350950400000, "count"=>1, "min"=>6.0, "max"=>6.0, "total"=>6.0, "total_count"=>1, "mean"=>6.0}, {"time"=>1361836800000, "count"=>7, "min"=>3.0, "max"=>9.0, "total"=>39.0, "total_count"=>7, "mean"=>#<BigDecimal:7fabc07348f8,'0.5571428571 428571E1',27(27)>}....
In my unit tests, I get zero results unless I remove the facet filter:
{"online_sales"=>{"_type"=>"date_histogram", "entries"=>[]}}
My test looks like this:
it "should test the online sales facets", focus: true do
User.index.delete
User.create_elasticsearch_index
user = User.create(username: 'testusername', value: 'pass', location_id: @location.id)
user.tags.create content: 'online'
user.tags.first.content.should eq 'online'
user.index.refresh
ws = User.online_sales(start: (Date.today - 10.days), :end => Date.today)
puts ws.results.facets["online_sales"]
end
Is there something I'm missing, doing wrong or have just misunderstood to get this to pass? Thanks in advance.
-- EDIT --
It appears to be something to do with the tags relationship. I have another method, location_users, which is a has_many through relationship. This is updated on index using:
def location_users
location.users.map(&:id)
end
I can see an array of location_users in the results when searching. Doesn't make sense to me why the other polymorphic relationship wouldn't work..
-- EDIT 2 --
I have fixed this by putting this in my test:
User.index.import User.all
sleep 1
Which is silly. And, I don't really understand why this works. Why?!
By default, Elasticsearch updates its indexes once per second.
This is a performance thing because committing your changes to Lucene (which ES uses under the hood) can be quite an expensive operation.
If you need it to update immediately include refresh=true in the URL when inserting documents. You normally don't want this since committing every time when inserting lots of documents is expensive, but unit testing is one of those cases where you do want to use it.
From the documentation:
refresh
To refresh the index immediately after the operation occurs, so that the document appears in search results immediately, the refresh parameter can be set to true. Setting this option to true should ONLY be done after careful thought and verification that it does not lead to poor performance, both from an indexing and a search standpoint. Note, getting a document using the get API is completely realtime.
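The timing issue can be pictured with a toy model of an index that buffers writes until it is refreshed. ToyIndex below is a hypothetical class written for illustration; it is not Tire or Elasticsearch code and only mimics the "searchable after refresh" behaviour described in the documentation quote.

```ruby
# Toy model of a near-real-time index (not Tire or Elasticsearch code):
# documents are buffered on write and only become searchable after a refresh.
class ToyIndex
  def initialize
    @pending = []
    @searchable = []
  end

  # Store a document; pass refresh: true to make it searchable immediately.
  def store(doc, refresh: false)
    @pending << doc
    refresh! if refresh
    doc
  end

  # Move buffered documents into the searchable set.
  def refresh!
    @searchable.concat(@pending)
    @pending.clear
    self
  end

  # Return searchable documents whose :user_tags include the given tag.
  def search(tag)
    @searchable.select { |doc| doc[:user_tags].include?(tag) }
  end
end

index = ToyIndex.new
doc = { id: 2, user_tags: ["online"] }
index.store(doc)
puts index.search("online").size # nothing is searchable before the refresh
index.refresh!
puts index.search("online").size # the document is visible after the refresh
```

In a test, storing with refresh: true (or refreshing explicitly after all documents are written) plays the role that sleep 1 plays in the workaround above.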

Rails Cache Key generated as ActiveRecord::Relation

I am attempting to generate a fragment cache (using a Dalli/Memcached store) however the key is being generated with "#" as part of the key, so Rails doesn't seem to be recognizing that there is a cache value and is hitting the database.
My cache key in the view looks like this:
cache([@jobs, "index"]) do
The controller has:
@jobs = @current_tenant.active_jobs
With the actual Active Record query like this:
def active_jobs
self.jobs.where("published = ? and expiration_date >= ?", true, Date.today).order("(featured and created_at > now() - interval '" + self.pinned_time_limit.to_s + " days') desc nulls last, created_at desc")
end
Looking at the rails server, I see the cache read, but the SQL Query still runs:
Cache read: views/#<ActiveRecord::Relation:0x007fbabef9cd58>/1-index
Read fragment views/#<ActiveRecord::Relation:0x007fbabef9cd58>/1-index (1.0ms)
(0.6ms) SELECT COUNT(*) FROM "jobs" WHERE "jobs"."tenant_id" = 1 AND (published = 't' and expiration_date >= '2013-03-03')
Job Load (1.2ms) SELECT "jobs".* FROM "jobs" WHERE "jobs"."tenant_id" = 1 AND (published = 't' and expiration_date >= '2013-03-03') ORDER BY (featured and created_at > now() - interval '7 days') desc nulls last, created_at desc
Any ideas as to what I might be doing wrong? I'm sure it has to do with the key generation and ActiveRecord::Relation, but I'm not sure how.
Background:
The problem is that the string representation of the relation is different each time your code is run:
views/#<ActiveRecord::Relation:0x007fbabef9cd58>/...
(the 0x007fbabef9cd58 part changes on every request)
So you get a different cache key each time.
Besides that it is not possible to get rid of database queries completely. (Your own answer is the best one can do)
Solution:
To generate a valid key, instead of this
cache([@jobs, "index"])
do this:
cache([@jobs.to_a, "index"])
This queries the database and builds an array of the models, from which the cache_key is retrieved.
PS: I could swear using relations worked in previous versions of Rails...
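The difference between the two keys can be reproduced in plain Ruby. In this sketch FakeRelation, JobRecord and stable_key are hypothetical stand-ins (not Rails classes): FakeRelation keeps Ruby's default to_s, which embeds the object's address, while stable_key builds the key from each record's own cache_key, as happens once the relation is converted with to_a.

```ruby
# Stand-in for a relation: a plain object whose default to_s includes its address.
class FakeRelation
  def initialize(records)
    @records = records
  end

  def to_a
    @records
  end
end

# Stand-in for a model instance with a per-record cache key.
JobRecord = Struct.new(:id, :updated_at) do
  def cache_key
    "jobs/#{id}-#{updated_at}"
  end
end

# Key built from the records themselves rather than from the relation object.
def stable_key(relation)
  relation.to_a.map(&:cache_key).join("/")
end

records = [JobRecord.new(1, "20130303"), JobRecord.new(2, "20130304")]
first_request = FakeRelation.new(records)
second_request = FakeRelation.new(records)

puts first_request.to_s == second_request.to_s             # false: a new object per request
puts stable_key(first_request) == stable_key(second_request) # true: same records, same key
```

The key only changes when a record's id or updated_at changes, which is the property a fragment cache needs.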
We've been doing exactly what you're mentioning in production for about a year. I extracted it into a gem a few months ago:
https://github.com/cmer/scope_cache_key
Basically, it allows you to use a scope as part of your cache key. There are significant performance benefits to doing so since you can now cache a page containing multiple records in a single cache element rather than looping each element in the scope and retrieving caches individually. I feel that combining this with the standard "Russian Doll Caching" principles is optimal.
I have had similar problems: I have not been able to successfully pass relations to the cache function, and your @jobs variable is a relation.
I coded up a solution for cache keys that deals with this issue along with some others that I was having. It basically involves generating a cache key by iterating through the relation.
A full write up is on my site here.
http://mark.stratmann.me/content_items/rails-caching-strategy-using-key-based-approach
In summary, I added a get_cache_key method to ActiveRecord::Base:
module CacheKeys
extend ActiveSupport::Concern
# Instance Methods
def get_cache_key(prefix=nil)
cache_key = []
cache_key << prefix if prefix
cache_key << self
self.class.get_cache_key_children.each do |child|
if child.macro == :has_many
self.send(child.name).all.each do |child_record|
cache_key << child_record.get_cache_key
end
end
if child.macro == :belongs_to
cache_key << self.send(child.name).get_cache_key
end
end
return cache_key.flatten
end
# Class Methods
module ClassMethods
def cache_key_children(*args)
@v_cache_key_children = []
# validate the children
args.each do |child|
#is it an association
association = reflect_on_association(child)
if association == nil
raise "#{child} is not an association!"
end
@v_cache_key_children << association
end
end
def get_cache_key_children
return @v_cache_key_children ||= []
end
end
end
# include the extension
ActiveRecord::Base.send(:include, CacheKeys)
I can now create cache fragments by doing
cache(@model.get_cache_key(['textlabel'])) do
I've done something like Hopsoft, but it uses the method in the Rails Guide as a template. I've used the MD5 digest to distinguish between relations (so User.active.cache_key can be differentiated from User.deactivated.cache_key), and used the count and max updated_at to auto-expire the cache on updates to the relation.
require "digest/md5"
module RelationCacheKey
def cache_key
model_identifier = name.underscore.pluralize
relation_identifier = Digest::MD5.hexdigest(to_sql.downcase)
max_updated_at = maximum(:updated_at).try(:utc).try(:to_s, :number)
"#{model_identifier}/#{relation_identifier}-#{count}-#{max_updated_at}"
end
end
ActiveRecord::Relation.send :include, RelationCacheKey
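To see how such a key behaves without a database, here is a plain-Ruby sketch of the same recipe. relation_cache_key is a hypothetical helper that takes the SQL string, the row count and the newest updated_at as arguments instead of reading them from a relation, and it uses strftime in place of the Rails-only to_s(:number) format.

```ruby
require "digest/md5"

# Hypothetical helper mirroring the recipe above with plain values:
# model name, MD5 of the (downcased) SQL, row count, newest updated_at.
def relation_cache_key(model_identifier, sql, count, max_updated_at)
  relation_identifier = Digest::MD5.hexdigest(sql.downcase)
  # Plain-Ruby substitute for to_s(:number); "empty" when there are no rows.
  stamp = max_updated_at ? max_updated_at.utc.strftime("%Y%m%d%H%M%S") : "empty"
  "#{model_identifier}/#{relation_identifier}-#{count}-#{stamp}"
end

newest = Time.utc(2013, 3, 3, 12, 0, 0)
active = relation_cache_key("users", "SELECT * FROM users WHERE active = 't'", 2, newest)
inactive = relation_cache_key("users", "SELECT * FROM users WHERE active = 'f'", 2, newest)

puts active
puts active == inactive # false: different SQL gives a different digest
```

Adding or removing a row changes the count, and updating a row changes the timestamp, so either event produces a new key and the old fragment is no longer read.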
While I marked @mark-stratmann's response as correct, I actually resolved this by simplifying the implementation. I added touch: true to my model relationship declaration:
belongs_to :tenant, touch: true
and then set the cache key based on the tenant (with a required query param as well):
<% cache([@current_tenant, params[:query], "#{@current_tenant.id}-index"]) do %>
That way if a new Job is added, it touches the Tenant cache as well. Not sure if this is the best route, but it works and seems pretty simple.
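The idea can be sketched in plain Ruby. ToyTenant and ToyJob below are hypothetical stand-ins, not ActiveRecord models; the constructor of ToyJob does by hand what belongs_to :tenant, touch: true is relied on to do here, namely bump the parent's updated_at so that its cache key changes.

```ruby
# Toy parent whose cache key is derived from its id and updated_at.
class ToyTenant
  attr_reader :id, :updated_at

  def initialize(id, updated_at)
    @id = id
    @updated_at = updated_at
  end

  # Bump the timestamp, as a touch on the parent would.
  def touch(time)
    @updated_at = time
  end

  def cache_key
    "tenants/#{id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S')}"
  end
end

# Toy child that touches its parent when it is created.
class ToyJob
  attr_reader :tenant

  def initialize(tenant, created_at)
    @tenant = tenant
    tenant.touch(created_at) # stands in for touch: true on the association
  end
end

tenant = ToyTenant.new(1, Time.utc(2013, 3, 3))
puts tenant.cache_key
ToyJob.new(tenant, Time.utc(2013, 3, 4))
puts tenant.cache_key # differs from the first key, so the old fragment is skipped
```

Because the fragment is keyed on the tenant, any new job invalidates it without the view having to inspect the jobs relation at all.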
I'm using this code:
class ActiveRecord::Base
def self.cache_key
pluck("concat_ws('/', '#{table_name}', group_concat(#{table_name}.id), date_format(max(#{table_name}.updated_at), '%Y%m%d%H%i%s'))").first
end
def self.updated_at
maximum(:updated_at)
end
end
Maybe this can help you out:
https://github.com/casiodk/class_cacher
It generates a cache_key from the model itself, and you may be able to reuse some of the principles in the codebase.
As a starting point you could try something like this:
def self.cache_key
["#{model_name.cache_key}-all",
"#{count}-#{updated_at.utc.to_s(cache_timestamp_format) rescue 'empty'}"
] * '/'
end
def self.updated_at
maximum :updated_at
end
I have a normalized database where multiple models relate to the same other model; think of clients, locations, etc. all having addresses by means of a street_id.
With this solution you can generate cache_keys based on scope, e.g.
cache [@client, @client.locations] do
# ...
end
cache [@client, @client.locations.active, 'active'] do
# ...
end
and I could simply modify self.updated_at from above to also include associated objects (has_many does not support "touch", so an update to the street would otherwise go unnoticed by the cache):
belongs_to :street
def cache_key
[street.cache_key, super] * '/'
end
# ...
def self.updated_at
[maximum(:updated_at),
joins(:street).maximum('streets.updated_at')
].max
end
As long as you don't "undelete" records and use touch in belongs_to, you should be alright with the assumption that a cache key made of count and max updated_at is sufficient.
I'm using a simple patch on ActiveRecord::Relation to generate cache keys for relations.
require "digest/md5"
module RelationCacheKey
def cache_key
Digest::MD5.hexdigest to_sql.downcase
end
end
ActiveRecord::Relation.send :include, RelationCacheKey
