Cannot alter model attribute directly when carrierwave uploader is mounted - ruby-on-rails

I am using activeadmin, and the following code is supposed to assign a photo to a Card if found in a directory.
The Card model won't accept the new photo: the new value is never assigned, even though save returns true.
UPDATE: The problem is that the attribute has a carrierwave uploader mounted on it. Commenting out the mount made it work, but it really isn't an acceptable solution since this is an action that will have to be made repeatedly... Is there a way to unmount the uploader or set the attribute value directly with the uploader mounted? I can't seem to find anything on that on the web...
UPDATE: Even ActiveRecord::Base.connection.execute("update cards set photo='#{f.to_s}' where id=#{card.id};") fails when carrierwave is mounted... It's taken over for good I guess
UPDATE: It turns out my question is a duplicate of
Manually updating attributes mounted by Carrierwave Uploader
from the log:
Card Load (0.5ms) SELECT `cards`.* FROM `cards` WHERE `cards`.`gatherer_id` = 244675 LIMIT 1
(0.2ms) BEGIN
Card Load (0.4ms) SELECT `cards`.* FROM `cards` WHERE `cards`.`id` = 1377 LIMIT 1
SQL (0.5ms) UPDATE `cards` SET `photo` = NULL, `updated_at` = '2013-11-22 11:20:44' WHERE `cards`.`id` = 1377
(4.1ms) COMMIT
When carrierwave is not mounted, the attribute is updated just fine. Afterwards, when carrierwave is mounted again, it returns the file location according to the uploader's store_dir as expected. The only problem is that I cannot set the value while the uploader is mounted. As a last resort I will do it with plain old SQL; I just don't like the idea much.
activeadmin action:
ActiveAdmin.register_page 'Import FTP photos' do
  menu label: 'Import FTP photos', parent: 'Cards'
  #action_item do
  #  link_to 'View Site', '/'
  #end
  content do
    require 'fileutils'
    para 'Importing photos...'
    from_dir = 'var/uploads/card_photos'
    uploads_dir = 'public/uploads/card/photo/'
    Dir.foreach(from_dir) do |f|
      next if f == '.' or f == '..'
      from_file = from_dir + '/' + f.to_s
      id = f.gsub(/\.\w+$/, '')
      card = Card.find_by_gatherer_id(id)
      if card
        to_dir = uploads_dir + card.id.to_s
        to_file = to_dir + '/' + f.to_s
        # Dir.new raises if the directory is missing, so test for existence instead
        Dir.mkdir(to_dir) unless File.directory?(to_dir)
        FileUtils.remove(to_file) if File.exist?(to_file)
        FileUtils.mv(from_file, to_file)
        card.photo = f.to_s
        card.save
      end
    end
  end
end
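The file-handling part of the action can be tried on its own, outside Rails and activeadmin. The following is a minimal sketch, not the asker's code: the helper names move_photo and lookup_id are hypothetical, it runs against a temporary directory, and it swaps the gsub call for File.basename with File.extname, which gives the same result for simple names like 244675.jpg. It does not touch the carrierwave assignment problem.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: moves an incoming photo into a per-record directory
# and returns the destination path.
def move_photo(from_file, uploads_dir, record_id)
  to_dir = File.join(uploads_dir, record_id.to_s)
  FileUtils.mkdir_p(to_dir)                 # no-op if the directory already exists
  to_file = File.join(to_dir, File.basename(from_file))
  FileUtils.rm_f(to_file)                   # remove a stale copy, if any
  FileUtils.mv(from_file, to_file)
  to_file
end

# Hypothetical helper: the lookup id is the filename without its extension.
def lookup_id(filename)
  File.basename(filename, File.extname(filename))
end

Dir.mktmpdir do |tmp|
  from_dir = File.join(tmp, 'incoming')
  FileUtils.mkdir_p(from_dir)
  source = File.join(from_dir, '244675.jpg')
  File.write(source, 'fake image bytes')

  dest = move_photo(source, File.join(tmp, 'uploads'), 1377)
  puts lookup_id('244675.jpg')   # 244675
  puts File.exist?(dest)         # true
end
```

FileUtils.mkdir_p avoids the existence check entirely, which is why the sketch uses it instead of Dir.mkdir.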
card model
class Card < ActiveRecord::Base
  mount_uploader :photo, PhotoUploader
  belongs_to :card_set
  belongs_to :card_artist
  def to_s
    caption
  end
  private
  def card_params
    params.permit!
  end
end

Related

ActiveStorage how to prevent duplicate file uploads ; find by filename

I am parsing email attachments and uploading them to ActiveStorage in S3.
We would like it to ignore duplicates, but I cannot see how to query by these attributes.
class Task < ApplicationRecord
has_many_attached :documents
end
then in my email webhook job
attachments.each do |attachment|
  tempfile = open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")])
  # i'd like to do something like this
  next if task.documents.where(filename: tempfile.filename, bytesize: tempfile.bytesize).exists?
  # this is what i'm currently doing
  task.documents.attach(
    io: tempfile,
    filename: attachment[:name],
    content_type: attachment[:content_type]
  )
end
Unfortunately, if someone forwards the same files, we get duplicates, and often more than one.
Edit with current solution:
tempfile = open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")])
md5_digest = Digest::MD5.file(tempfile).base64digest
# if this digest already exists as attached to the file then we're all good.
next if ActiveStorage::Blob.joins(:attachments).where({
  checksum: md5_digest,
  active_storage_attachments: {name: 'documents', record_type: 'Task', record_id: task.id}
}).exists?
Rails utilizes 2 tables for storing attachment data; active_storage_attachments and active_storage_blobs
The active_storage_blobs table houses a checksum of the uploaded file.
You can easily join this table to verify the existence of a file.
Going from @gustavo's answer I came up with the following:
attachments.each do |attachment|
  tempfile = Tempfile.new
  tempfile.binmode
  # write the downloaded bytes, not the IO object itself
  tempfile.write(open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")]).read)
  tempfile.flush
  checksum = Digest::MD5.file(tempfile.path).base64digest
  if task.documents.joins(:documents_blobs).exists?(active_storage_blobs: {checksum: checksum})
    tempfile.unlink
    next
  end
  #... Your attachment saving code here
end
Note: Remember to require 'tempfile' in the class where you are using this
What happens if they change the filename anyway (which happens many times with things like filename(2).xlsx) but the content is the same?
Maybe a better approach would be to compare the checksum? I believe that the ActiveStorage object will already store that, for saved files. You could do something like:
attachments.each do |attachment|
  tempfile = open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")])
  checksum = Digest::MD5.file(tempfile.path).base64digest
  # i'd like to do something like this
  next if task.documents.where(checksum: checksum).exists?
  #...
end
That way you know it is the same physical file regardless of the incoming filename.
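The checksum idea can be checked in plain Ruby without ActiveStorage. This is only a sketch under assumptions: the Set stands in for the checksums of blobs already attached to the task, and the attachment hashes are made-up sample data. It shows that a renamed copy of the same content produces the same base64 MD5 digest and is skipped.

```ruby
require 'digest'
require 'set'

# Base64-encoded MD5 of the content, the same digest format that the
# snippets above compute with base64digest.
def content_checksum(bytes)
  Digest::MD5.base64digest(bytes)
end

# Hypothetical in-memory stand-in for "checksums already attached to the task".
seen = Set.new

# Made-up sample attachments; the second is a renamed copy of the first.
incoming = [
  { name: 'report.xlsx',    body: 'same bytes' },
  { name: 'report(2).xlsx', body: 'same bytes' },
  { name: 'other.xlsx',     body: 'different bytes' }
]

kept = incoming.select do |attachment|
  # Set#add? returns nil when the checksum is already present,
  # so duplicates are filtered out regardless of filename.
  seen.add?(content_checksum(attachment[:body]))
end

puts kept.map { |a| a[:name] }.inspect  # ["report.xlsx", "other.xlsx"]
```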

ActiveRecord #becomes! record not saving

I've got some STI in my data model. There are two types of Task records: PrimaryTask and SecondaryTask. So my ActiveRecord models look like this:
class Task < ActiveRecord::Base
end
class PrimaryTask < Task
  has_many :secondary_tasks
end
class SecondaryTask < Task
  belongs_to :primary_task
end
I want to provide a way to "promote" a SecondaryTask to a PrimaryTask permanently (as in, persisted in the database). From perusing the docs, looks like the #becomes! method is what I want, but I can't get it to save the changes in the database.
id = 1
secondary_task = SecondaryTask.find(id)
primary_task = secondary_task.becomes!(PrimaryTask)
primary_task.id # => 1
primary_task.class # => PrimaryTask
primary_task.type # => "PrimaryTask"
primary_task.new_record? # => false
primary_task.changes # => { "type"=>[nil,"PrimaryTask"] }
primary_task.save! # => true
primary_task.reload # => raises ActiveRecord::RecordNotFound: Couldn't find PrimaryTask with id=1 [WHERE "tasks"."type" IN ('PrimaryTask')]
# Note: secondary_task.reload works fine, because the task's type did not change in the DB
Any idea what's up? I tried the following things, to no avail. Am I misunderstanding becomes!?
Force the record to be 'dirty' in case the save! call was a no-op because none of the attributes were marked dirty (primary_task.update_attributes(updated_at: Time.current) -- didn't help)
Destroy secondary_task in case the fact that they both have the same id was a problem. Didn't help. The SecondaryTask record was deleted but no PrimaryTask was created (despite the call to save! returning true)
UPDATE 1
The logs show the probable issue:
UPDATE "tasks" SET "type" = $1 WHERE "tasks"."type" IN ('PrimaryTask') AND "tasks"."id" = 2 [["type", "PrimaryTask"]]
So the update is failing because the WHERE clause causes the record not to be found.
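The effect of that WHERE clause can be mimicked with an in-memory array. This is a simulation under assumptions, not ActiveRecord: update_type is a hypothetical helper that imitates an UPDATE scoped by type and id and returns the number of rows affected.

```ruby
# In-memory stand-in for the tasks table (hypothetical data).
rows = [{ id: 2, type: 'SecondaryTask' }]

# Imitates: UPDATE tasks SET type = new_type WHERE type IN (types) AND id = id
# Returns the number of rows affected.
def update_type(rows, id:, types:, new_type:)
  matched = rows.select { |r| r[:id] == id && types.include?(r[:type]) }
  matched.each { |r| r[:type] = new_type }
  matched.size
end

# Scoped to the new class: the stored row still says SecondaryTask, so nothing matches.
affected_new = update_type(rows, id: 2, types: ['PrimaryTask'], new_type: 'PrimaryTask')

# Scoped to the original class: the row matches and is updated.
affected_old = update_type(rows, id: 2, types: ['SecondaryTask'], new_type: 'PrimaryTask')

puts affected_new  # 0
puts affected_old  # 1
```

An update that matches zero rows is not an error at the SQL level, which is consistent with save! returning true while nothing changes in the database.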
Figured it out. Turns out there was a bug in ActiveRecord version 4.0.0. It has since been patched. The key change this patch introduced was to set the changes correctly in both instances. So now you can call save on the original instance (in my example secondary_task) and it will change the type in the database. Note that calling save on the new instance (for me primary_task) will NOT save the changes, because of the behavior described in the question: it will include a WHERE clause in the SQL UPDATE call that will cause the record not to be found and thus the call to do nothing.
Here's what works with ActiveRecord > 4.1.0:
id = 1
secondary_task = SecondaryTask.find(id)
primary_task = secondary_task.becomes!(PrimaryTask)
secondary_task.changes # => { "type"=>["SecondaryTask","PrimaryTask"] }
primary_task.changes # => { "type"=>["SecondaryTask","PrimaryTask"] }
secondary_task.save! # => true
primary_task.reload # => works because the record was updated as expected
secondary_task.reload # => raises ActiveRecord::RecordNotFound, as expected

Ruby class attributes not querying correctly

I have a class in my Ruby on Rails project for Institutions that I recently had to add an attribute to, an :identifier. The class has a custom metadata field that accompanies it for searching and indexing purposes. The problem is that the new attribute isn't helping me find objects the way I wanted. If I try to query for an object using the :identifier, I consistently get an empty array. And yes, I have checked multiple times to ensure that the test object actually exists.
This is the model:
class Institution < ActiveFedora::Base
  include Hydra::AccessControls::Permissions
  # NOTE with rdf datastreams must query like so ins = Institution.where(desc_metadata__name_tesim: "APTrust")
  has_metadata "rightsMetadata", type: Hydra::Datastream::RightsMetadata
  has_metadata 'descMetadata', type: InstitutionMetadata
  has_many :intellectual_objects, property: :is_part_of
  has_attributes :name, :brief_name, :identifier, datastream: 'descMetadata', multiple: false
  validates :name, :identifier, presence: true
  validate :name_is_unique
  validate :identifier_is_unique
  def users
    User.where(institution_pid: self.pid).to_a.sort_by(&:name)
  end
  private
  def name_is_unique
    errors.add(:name, "has already been taken") if Institution.where(desc_metadata__name_ssim: self.name).reject{|r| r == self}.any?
  end
  def identifier_is_unique
    count = 0;
    Institution.all.each do |inst|
      count += 1 if inst.identifier == self.identifier
    end
    if(count > 0)
      errors.add(:identifier, "has already been taken")
    end
    #errors.add(:identifier, "has already been taken") if Institution.where(desc_metadata__identifier_ssim: self.identifier).reject{|r| r.identifier == self.identifier}.any?
  end
end
As you can see, I had to write a very different method to check for the uniqueness of an identifier because the .where method wasn't returning anything. I didn't realize that was the problem though until I started working on the show model in the controller (below):
def show
  identifier = params[:identifier] << "." << params[:format]
  @institution = Institution.where(desc_metadata__identifier_ssim: identifier)
end
This never returns anything even though I have several Institution objects in my database and have double and triple checked that the URL parameters are correct. And part of double checking that was searching for objects in the console. Here's the output:
Loading development environment (Rails 4.0.3)
2.0.0-p353 :001 > ap = Institution.where(desc_metadata__name_ssim: "APTrust")
ActiveFedora: loading fedora config from /Users/kec6en/HydraApp/fluctus/config/fedora.yml
ActiveFedora: loading solr config from /Users/kec6en/HydraApp/fluctus/config/solr.yml
Loaded datastream list for aptrust-dev:379 (3.2ms)
Loaded datastream profile aptrust-dev:379/RELS-EXT (2.7ms)
Loaded datastream content aptrust-dev:379/RELS-EXT (2.4ms)
Loaded datastream profile aptrust-dev:379/descMetadata (2.6ms)
Loaded datastream profile aptrust-dev:379/descMetadata (3.5ms)
Loaded datastream content aptrust-dev:379/descMetadata (3.1ms)
=> [#<Institution pid: "aptrust-dev:379", name: "APTrust", brief_name: "apt", identifier: "aptrust.org">]
2.0.0-p353 :002 > apt = Institution.where(desc_metadata__identifier_ssim: "aptrust.org")
=> []
2.0.0-p353 :003 >
As you can see, I'm querying for an identifier that does exist but it's not finding anything. For reference, here is the metadata datastream that I'm working off of. Note that the identifier is indexed as :stored_searchable so I should be able to query for it.
class InstitutionMetadata < ActiveFedora::RdfxmlRDFDatastream
  map_predicates do |map|
    map.name(in: RDF::DC, to: 'title') { |index| index.as :symbol, :stored_searchable }
    map.brief_name(in: RDF::DC, to: 'alternative')
    map.identifier(in: RDF::DC, to: 'identifier') { |index| index.as :symbol, :stored_searchable }
  end
end
I modeled it after the name attribute because that one appears to be working. Any ideas why the identifier isn't?
Thanks!
Why don't you just search the identifier directly instead of using the desc_metadata?
Institution.where identifier: "aptrust.org"
Using desc_metadata__identifier_tesim instead of desc_metadata__identifier_ssim seems to work for finding objects, although it still doesn't work for the uniqueness checking method I wrote.

Rails Cache Key generated as ActiveRecord::Relation

I am attempting to generate a fragment cache (using a Dalli/Memcached store) however the key is being generated with "#" as part of the key, so Rails doesn't seem to be recognizing that there is a cache value and is hitting the database.
My cache key in the view looks like this:
cache([@jobs, "index"]) do
The controller has:
@jobs = @current_tenant.active_jobs
With the actual Active Record query like this:
def active_jobs
  self.jobs.where("published = ? and expiration_date >= ?", true, Date.today).order("(featured and created_at > now() - interval '" + self.pinned_time_limit.to_s + " days') desc nulls last, created_at desc")
end
Looking at the rails server, I see the cache read, but the SQL Query still runs:
Cache read: views/#<ActiveRecord::Relation:0x007fbabef9cd58>/1-index
Read fragment views/#<ActiveRecord::Relation:0x007fbabef9cd58>/1-index (1.0ms)
(0.6ms) SELECT COUNT(*) FROM "jobs" WHERE "jobs"."tenant_id" = 1 AND (published = 't' and expiration_date >= '2013-03-03')
Job Load (1.2ms) SELECT "jobs".* FROM "jobs" WHERE "jobs"."tenant_id" = 1 AND (published = 't' and expiration_date >= '2013-03-03') ORDER BY (featured and created_at > now() - interval '7 days') desc nulls last, created_at desc
Any ideas as to what I might be doing wrong? I'm sure it has to do w/ the key generation and ActiveRecord::Relation, but i'm not sure how.
Background:
The problem is that the string representation of the relation is different each time your code is run. The object address embedded in the key changes on every request:
views/#<ActiveRecord::Relation:0x007fbabef9cd58>/...
(the 0x007fbabef9cd58 part is what changes)
So you get a different cache key each time.
Besides that it is not possible to get rid of database queries completely. (Your own answer is the best one can do)
Solution:
To generate a valid key, instead of this
cache([@jobs, "index"])
do this:
cache([@jobs.to_a, "index"])
This queries the database and builds an array of the models, from which the cache_key is retrieved.
PS: I could swear using relations worked in previous versions of Rails...
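Why an identity-based key never hits the cache can be shown in plain Ruby. The sketch below is an illustration under assumptions, not Rails code: FakeRelation is a hypothetical stand-in for a relation, and the content-based key is just an MD5 of its query string.

```ruby
require 'digest'

# Hypothetical stand-in for a relation: a plain object wrapping a query string.
class FakeRelation
  attr_reader :sql

  def initialize(sql)
    @sql = sql
  end
end

a = FakeRelation.new('SELECT * FROM jobs WHERE published = true')
b = FakeRelation.new('SELECT * FROM jobs WHERE published = true')

# The default to_s embeds an object-specific id, so it differs per instance
# even when both objects describe the same query.
identity_keys_match = (a.to_s == b.to_s)

# A key derived from the content is the same for equivalent queries.
content_key = ->(rel) { Digest::MD5.hexdigest(rel.sql) }
content_keys_match = (content_key.call(a) == content_key.call(b))

puts identity_keys_match  # false
puts content_keys_match   # true
```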
We've been doing exactly what you're mentioning in production for about a year. I extracted it into a gem a few months ago:
https://github.com/cmer/scope_cache_key
Basically, it allows you to use a scope as part of your cache key. There are significant performance benefits to doing so, since you can now cache a page containing multiple records in a single cache element rather than looping over each element in the scope and retrieving caches individually. I feel that combining this with the standard "Russian Doll Caching" principles is optimal.
I have had similar problems: I have not been able to successfully pass relations to the cache function, and your @jobs variable is a relation.
I coded up a solution for cache keys that deals with this issue along with some others that I was having. It basically involves generating a cache key by iterating through the relation.
A full write up is on my site here.
http://mark.stratmann.me/content_items/rails-caching-strategy-using-key-based-approach
In summary, I added a get_cache_key method to ActiveRecord::Base:
module CacheKeys
  extend ActiveSupport::Concern
  # Instance Methods
  def get_cache_key(prefix=nil)
    cache_key = []
    cache_key << prefix if prefix
    cache_key << self
    self.class.get_cache_key_children.each do |child|
      if child.macro == :has_many
        self.send(child.name).all.each do |child_record|
          cache_key << child_record.get_cache_key
        end
      end
      if child.macro == :belongs_to
        cache_key << self.send(child.name).get_cache_key
      end
    end
    return cache_key.flatten
  end
  # Class Methods
  module ClassMethods
    def cache_key_children(*args)
      @v_cache_key_children = []
      # validate the children
      args.each do |child|
        #is it an association
        association = reflect_on_association(child)
        if association == nil
          raise "#{child} is not an association!"
        end
        @v_cache_key_children << association
      end
    end
    def get_cache_key_children
      return @v_cache_key_children ||= []
    end
  end
end
# include the extension
ActiveRecord::Base.send(:include, CacheKeys)
I can now create cache fragments by doing
cache(@model.get_cache_key(['textlabel'])) do
I've done something like Hopsoft, but it uses the method in the Rails Guide as a template. I've used the MD5 digest to distinguish between relations (so User.active.cache_key can be differentiated from User.deactivated.cache_key), and used the count and max updated_at to auto-expire the cache on updates to the relation.
require "digest/md5"
module RelationCacheKey
  def cache_key
    model_identifier = name.underscore.pluralize
    relation_identifier = Digest::MD5.hexdigest(to_sql.downcase)
    max_updated_at = maximum(:updated_at).try(:utc).try(:to_s, :number)
    "#{model_identifier}/#{relation_identifier}-#{count}-#{max_updated_at}"
  end
end
ActiveRecord::Relation.send :include, RelationCacheKey
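The way such a key expires itself can be seen without a database. The following is a rough sketch under assumptions: relation_cache_key is a hypothetical helper, the records are plain hashes with an :updated_at Time, and the timestamp is formatted with strftime instead of the Rails :number format.

```ruby
require 'digest'

# Builds a key from the query text plus the data it currently matches.
# `records` is a hypothetical array of hashes with an :updated_at Time.
def relation_cache_key(model_name, sql, records)
  digest = Digest::MD5.hexdigest(sql.downcase)
  latest = records.map { |r| r[:updated_at] }.max
  stamp  = latest ? latest.utc.strftime('%Y%m%d%H%M%S') : 'empty'
  "#{model_name}/#{digest}-#{records.size}-#{stamp}"
end

sql  = 'SELECT * FROM users WHERE active = 1'
rows = [{ updated_at: Time.utc(2013, 3, 3, 12, 0, 0) }]

before = relation_cache_key('users', sql, rows)
rows << { updated_at: Time.utc(2013, 3, 4, 8, 30, 0) }   # a record is added
after  = relation_cache_key('users', sql, rows)

puts before
puts after
puts before == after  # false
```

The digest part separates one scope from another, while the count and the latest timestamp change whenever a record is added or updated, so stale fragments are simply never read again.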
While I marked @mark-stratmann's response as correct, I actually resolved this by simplifying the implementation. I added touch: true to my model relationship declaration:
belongs_to :tenant, touch: true
and then set the cache key based on the tenant (with a required query param as well):
<% cache([@current_tenant, params[:query], "#{@current_tenant.id}-index"]) do %>
That way if a new Job is added, it touches the Tenant cache as well. Not sure if this is the best route, but it works and seems pretty simple.
I'm using this code:
class ActiveRecord::Base
  def self.cache_key
    pluck("concat_ws('/', '#{table_name}', group_concat(#{table_name}.id), date_format(max(#{table_name}.updated_at), '%Y%m%d%H%i%s'))").first
  end
  def self.updated_at
    maximum(:updated_at)
  end
end
Maybe this can help you out:
https://github.com/casiodk/class_cacher
It generates a cache_key from the model itself, but maybe you can use some of the principles in the codebase.
As a starting point you could try something like this:
def self.cache_key
  ["#{model_name.cache_key}-all",
   "#{count}-#{updated_at.utc.to_s(cache_timestamp_format) rescue 'empty'}"
  ] * '/'
end
def self.updated_at
  maximum :updated_at
end
I have a normalized database where multiple models relate to the same other model: think of clients, locations, etc. all having addresses by means of a street_id.
With this solution you can generate cache_keys based on scope, e.g.
cache [@client, @client.locations] do
  # ...
end
cache [@client, @client.locations.active, 'active'] do
  # ...
end
and I could simply modify self.updated_at from above to also include associated objects (because has_many does not support "touch", so if I updated the street, the cache would not notice otherwise):
belongs_to :street
def cache_key
  [street.cache_key, super] * '/'
end
# ...
def self.updated_at
  [maximum(:updated_at),
   joins(:street).maximum('streets.updated_at')
  ].max
end
As long as you don't "undelete" records and use touch in belongs_to, you should be alright with the assumption that a cache key made of count and max updated_at is sufficient.
I'm using a simple patch on ActiveRecord::Relation to generate cache keys for relations.
require "digest/md5"
module RelationCacheKey
  def cache_key
    Digest::MD5.hexdigest to_sql.downcase
  end
end
ActiveRecord::Relation.send :include, RelationCacheKey

In ActiveRecord how do I use 'changed' (dirty) in a before_save callback?

I want to set my summary field to a sanitized version of the body field, but only if the user does not supply their own summary ie. params[:document][:summary] is blank.
This appears to work fine if I create a new record, if I enter a summary it is saved, if I don't the body is used to generate the summary.
However when I update the record the summary always gets overridden. From my log files I can see that 'generate_summary' gets called twice, the second time the 'changes' hash is empty.
class Document < ActiveRecord::Base
  # Callbacks
  before_save :generate_summary
  private
  def generate_summary
    @counter ||= 1
    logger.debug '$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$'
    logger.debug @counter.to_s
    logger.debug 'changes: ' + self.changes.inspect
    self.summary = Sanitize.clean(self.body).to(255) if self.body && (!self.summary_changed? or self.summary.blank?)
    @counter = @counter + 1
  end
end
Log on Update:
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
1
changes: {"summary"=>["asdasdasdasd", "three.co.uk"]}
Page Update (0.7ms) UPDATE documents SET meta_description = 'three.co.uk', summary = 'three.co.uk', updated_at = '2009-09-30 11:37:08' WHERE id = 77
SQL (0.6ms) COMMIT
SQL (0.1ms) BEGIN
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
2
changes: {}
Page Update (0.5ms) UPDATE documents SET meta_description = 'asdasdasdasd', summary = 'asdasdasdasd', updated_at = '2009-09-30 11:37:08' WHERE id = 77
Your controller probably saves twice, as said by @nasmorn. You can also check that your body has changed before updating your summary.
if self.body_changed? && (!self.summary_changed? or self.summary.blank?)
  self.summary = Sanitize.clean(self.body).to(255)
end
Only logical explanation is that the controller somehow saves twice.
Is this log from the console where you call update on the record or is it from a real request that comes through the controller?
It seems 'update_attributes' triggers the before_save callback, so in my controller 'generate_summary' is called twice once by 'update_attributes' and once by 'save'. This is not expected behaviour.
Checking that the body has changed, as suggested by @vincent, seems to prevent the unexpected behaviour.
The way I bypass the multiple before_save calls is by introducing
attr_accessor :object_saved
and then, inside the callback method registered with
before_save :before_save_method
I do this:
def before_save_method
  if self.object_saved.nil?
    self.object_saved = true
    # Do Something
  end
end
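The guard can be exercised in plain Ruby to see that the body runs only once. This is a sketch under assumptions: Record is a hypothetical stand-in class with no ActiveRecord involved, the callback is called by hand instead of being registered with before_save, and the runs counter exists only to make the effect visible.

```ruby
# Plain-Ruby stand-in for a model; no ActiveRecord involved.
class Record
  attr_accessor :object_saved
  attr_reader :runs

  def initialize
    @runs = 0
  end

  # In a real model this would be registered with before_save.
  def before_save_method
    return unless object_saved.nil?
    self.object_saved = true
    @runs += 1   # the "Do Something" part
  end
end

record = Record.new
2.times { record.before_save_method }   # simulates two saves in one request
puts record.runs  # 1
```

Note that the flag is never reset, so any later save on the same instance skips the guarded work too. That is fine for a single request but worth keeping in mind for long-lived objects.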
