Rails fixtures use empty_clob() with CLOB fields but nothing else


I'm back with Rails fixtures after seeing they were much improved since the last time I used them.
# models.yml
one:
  id: 1
  clob_field: "My Text"
When the models fixture is loaded into the DB, I can see that the CLOB text ("My Text") is replaced with an empty_clob() call in the INSERT statement.
As I understand it, the Oracle enhanced adapter should then issue a separate UPDATE statement that sets clob_field to the real value, but that statement is never executed, so the value remains blank.
Any idea why that is?

I traced the fixture loading and found that this was caused by the combination of specifying a schema name along with the table name (self.table_name = "SCHEMA_OWNER.TABLE_NAME") and using upper-case table names.
I've worked around the issue by overriding the insert_fixture method (in the oracle-enhanced adapter) so that it normalizes table_name properly.
Now write_lobs is called correctly.
UPDATE
Here's the change as requested by @jeff-k
# config/initializers/oracle_enhanced_adapter.rb
...
ActiveSupport.on_load(:active_record) do
  ActiveRecord::ConnectionAdapters::OracleEnhancedAdapter.class_eval do
    # Overriding this method to account for including the schema_name in the table name
    # which is implemented to work around another limitation of having the schema_owner different
    # than the connected user
    #
    # Inserts the given fixture into the table. Overridden to properly handle lobs.
    def insert_fixture(fixture, table_name) #:nodoc:
      super
      if table_name =~ /\./i
        table_name = table_name.downcase.split('.')[1]
      end
      if ActiveRecord::Base.pluralize_table_names
        klass = table_name.to_s.singularize.camelize
      else
        klass = table_name.to_s.camelize
      end
      klass = klass.constantize rescue nil
      if klass.respond_to?(:ancestors) && klass.ancestors.include?(ActiveRecord::Base)
        write_lobs(table_name, klass, fixture, klass.lob_columns)
      end
    end
  end
end
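The table-name handling in the override can be tried on its own. This is a minimal sketch in plain Ruby with no Rails or adapter loaded; unqualified_table_name is a hypothetical helper name, not part of the oracle-enhanced adapter.

```ruby
# Hypothetical helper (not part of oracle-enhanced): drops an optional
# "SCHEMA." prefix and downcases what is left, similar to the
# table_name.downcase.split('.')[1] step in the override above.
# Unlike the override, it also downcases names that have no schema prefix.
def unqualified_table_name(table_name)
  table_name.to_s.downcase.split('.').last
end

puts unqualified_table_name("SCHEMA_OWNER.TABLE_NAME")
puts unqualified_table_name("models")
```

The resulting lower-case name is what singularize/camelize need in order to resolve the model class, which is presumably why write_lobs was skipped before.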

Related

solr, sunspot, bad request, illegal character

I am introducing Sunspot search into my project. I got a POC working by just searching by the name field. When I introduced the description field and reindexed Solr, I got the following error.
** Invoke sunspot:reindex (first_time)
** Invoke environment (first_time)
** Execute environment
** Execute sunspot:reindex
Skipping progress bar: for progress reporting, add gem 'progress_bar' to your Gemfile
rake aborted!
RSolr::Error::Http: RSolr::Error::Http - 400 Bad Request
Error: {'responseHeader'=>{'status'=>400,'QTime'=>18},'error'=>{'msg'=>'Illegal character ((CTRL-CHAR, code 11))
at [row,col {unknown-source}]: [42,1]','code'=>400}}
Request Data: "<?xml version=\"1.0\" encoding=\"UTF-8\"?><add><doc><field name=\"id\">ItemsDesign 1322</field><field name=\"type\">ItemsDesign</field><field name=\"type\">ActiveRecord::Base</field><field name=\"class_name\">ItemsDesign</field><field name=\"name_text\">River City Clocks Musical Multi-Colored Quartz Cuckoo Clock</field><field name=\"description_text\">This colorful chalet style German quartz cuckoo clock accurately keeps time and plays 12 different melodies. Many colorful flowers are painted on the clock case and figures of a Saint Bernard and Alpine horn player are on each side of the clock dial. Two decorative pine cone weights are suspended beneath the clock case by two chains. The heart shaped pendulum continously swings back and forth.
On every
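I'm assuming that the bad character is the one you can see at the bottom of the request data (control character code 11, a vertical tab, according to the error message). That character is littered through a lot of the descriptions. I'm not even sure what character that is.
What can I do to get Solr to ignore it, or to clean the data so that Solr can handle it?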
Thanks

Put the following in an initializer to automatically clean sunspot calls of any UTF8 control characters:
# config/initializers/sunspot.rb
module Sunspot
  #
  # DataExtractors present an internal API for the indexer to use to extract
  # field values from models for indexing. They must implement the #value_for
  # method, which takes an object and returns the value extracted from it.
  #
  module DataExtractor #:nodoc: all
    #
    # AttributeExtractors extract data by simply calling a method on the block.
    #
    class AttributeExtractor
      def initialize(attribute_name)
        @attribute_name = attribute_name
      end
      def value_for(object)
        Filter.new( object.send(@attribute_name) ).value
      end
    end
    #
    # BlockExtractors extract data by evaluating a block in the context of the
    # object instance, or if the block takes an argument, by passing the object
    # as the argument to the block. Either way, the return value of the block is
    # the value returned by the extractor.
    #
    class BlockExtractor
      def initialize(&block)
        @block = block
      end
      def value_for(object)
        Filter.new( Util.instance_eval_or_call(object, &@block) ).value
      end
    end
    #
    # Constant data extractors simply return the same value for every object.
    #
    class Constant
      def initialize(value)
        @value = value
      end
      def value_for(object)
        Filter.new(@value).value
      end
    end
    #
    # A Filter to allow easy value cleaning
    #
    class Filter
      def initialize(value)
        @value = value
      end
      def value
        strip_control_characters @value
      end
      def strip_control_characters(value)
        return value unless value.is_a? String
        value.chars.inject("") do |str, char|
          unless char.ascii_only? and (char.ord < 32 or char.ord == 127)
            str << char
          end
          str
        end
      end
    end
  end
end
Source (Sunspot Github Issues): Sunspot Solr Reindexing failing due to illegal characters

I tried the solution @thekingoftruth proposed; however, it did not solve the problem. I found an alternative version of the Filter class in the same GitHub thread that he links to, and that solved my problem.
The main difference was that I use nested models through HABTM relationships.
This is my search block in the model:
searchable do
  text :name, :description, :excerpt
  text :venue_name do
    venue.name if venue.present?
  end
  text :artist_name do
    artists.map { |a| a.name if a.present? } if artists.present?
  end
end
Here is the initializer that worked for me:
(in: config/initializers/sunspot.rb)
module Sunspot
  #
  # DataExtractors present an internal API for the indexer to use to extract
  # field values from models for indexing. They must implement the #value_for
  # method, which takes an object and returns the value extracted from it.
  #
  module DataExtractor #:nodoc: all
    #
    # AttributeExtractors extract data by simply calling a method on the block.
    #
    class AttributeExtractor
      def initialize(attribute_name)
        @attribute_name = attribute_name
      end
      def value_for(object)
        Filter.new( object.send(@attribute_name) ).value
      end
    end
    #
    # BlockExtractors extract data by evaluating a block in the context of the
    # object instance, or if the block takes an argument, by passing the object
    # as the argument to the block. Either way, the return value of the block is
    # the value returned by the extractor.
    #
    class BlockExtractor
      def initialize(&block)
        @block = block
      end
      def value_for(object)
        Filter.new( Util.instance_eval_or_call(object, &@block) ).value
      end
    end
    #
    # Constant data extractors simply return the same value for every object.
    #
    class Constant
      def initialize(value)
        @value = value
      end
      def value_for(object)
        Filter.new(@value).value
      end
    end
    #
    # A Filter to allow easy value cleaning
    #
    class Filter
      def initialize(value)
        @value = value
      end
      def value
        if @value.is_a? String
          strip_control_characters_from_string @value
        elsif @value.is_a? Array
          @value.map { |v| strip_control_characters_from_string v }
        elsif @value.is_a? Hash
          @value.inject({}) do |hash, (k, v)|
            hash.merge( strip_control_characters_from_string(k) => strip_control_characters_from_string(v) )
          end
        else
          @value
        end
      end
      def strip_control_characters_from_string(value)
        return value unless value.is_a? String
        value.chars.inject("") do |str, char|
          unless char.ascii_only? && (char.ord < 32 || char.ord == 127)
            str << char
          end
          str
        end
      end
    end
  end
end

You need to strip control characters from the UTF-8 text when you save your content. Solr cannot index them and throws this error.
http://en.wikipedia.org/wiki/UTF-8#Codepage_layout
You can use something like this:
name.gsub!(/\p{Cc}/, "")
edit:
If you want to apply this globally, I think it should be possible to override the value_for methods in AttributeExtractor and, if needed, BlockExtractor.
https://github.com/sunspot/sunspot/blob/master/sunspot/lib/sunspot/data_extractor.rb
I haven't checked this.
If you manage to write a global patch, please let me know.
I ran into the same issue recently.
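The gsub approach can be wrapped in a small helper that also copes with arrays and hashes, which is where the nested-model case tripped up the first initializer. This is only a sketch in plain Ruby: strip_control_characters here is a hypothetical standalone method, not part of Sunspot, and note that \p{Cc} also removes tabs and newlines.

```ruby
# Hypothetical helper, not part of Sunspot: removes Unicode control
# characters (category Cc, which includes tab, newline and vertical tab)
# from strings, recursing into arrays and hashes. Other values pass through.
def strip_control_characters(value)
  case value
  when String
    value.gsub(/\p{Cc}/, "")
  when Array
    value.map { |item| strip_control_characters(item) }
  when Hash
    value.each_with_object({}) do |(key, item), cleaned|
      cleaned[strip_control_characters(key)] = strip_control_characters(item)
    end
  else
    value
  end
end

puts strip_control_characters("swings back and forth.\vOn every")
```

Using the non-bang gsub avoids the surprise that gsub! returns nil when nothing was replaced.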

Rails Cache Key generated as ActiveRecord::Relation

I am attempting to generate a fragment cache (using a Dalli/Memcached store) however the key is being generated with "#" as part of the key, so Rails doesn't seem to be recognizing that there is a cache value and is hitting the database.
My cache key in the view looks like this:
cache([@jobs, "index"]) do
The controller has:
@jobs = @current_tenant.active_jobs
With the actual Active Record query like this:
def active_jobs
  self.jobs.where("published = ? and expiration_date >= ?", true, Date.today).order("(featured and created_at > now() - interval '" + self.pinned_time_limit.to_s + " days') desc nulls last, created_at desc")
end
Looking at the rails server, I see the cache read, but the SQL Query still runs:
Cache read: views/#<ActiveRecord::Relation:0x007fbabef9cd58>/1-index
Read fragment views/#<ActiveRecord::Relation:0x007fbabef9cd58>/1-index (1.0ms)
(0.6ms) SELECT COUNT(*) FROM "jobs" WHERE "jobs"."tenant_id" = 1 AND (published = 't' and expiration_date >= '2013-03-03')
Job Load (1.2ms) SELECT "jobs".* FROM "jobs" WHERE "jobs"."tenant_id" = 1 AND (published = 't' and expiration_date >= '2013-03-03') ORDER BY (featured and created_at > now() - interval '7 days') desc nulls last, created_at desc
Any ideas as to what I might be doing wrong? I'm sure it has to do w/ the key generation and ActiveRecord::Relation, but i'm not sure how.

Background:
The problem is that the string representation of the relation is different each time your code is run:
|This changes|
views/#<ActiveRecord::Relation:0x007fbabef9cd58>/...
So you get a different cache key each time.
Besides that it is not possible to get rid of database queries completely. (Your own answer is the best one can do)
Solution:
To generate a valid key, instead of this
cache([@jobs, "index"])
do this:
cache([@jobs.to_a, "index"])
This queries the database and builds an array of the models, from which the cache_key is retrieved.
PS: I could swear using relations worked in previous versions of Rails...
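The difference between the two keys can be reproduced without Rails. In this sketch FakeRelation and FakeRecord are hypothetical stand-ins, not ActiveRecord classes: the first relies on Ruby's default to_s, which embeds a per-object identifier, while the second derives its key only from its data.

```ruby
# Stand-in for a relation: a plain object with the default #to_s,
# which looks like #<FakeRelation:0x000...> and differs per instance.
class FakeRelation; end

# Stand-in for a record whose cache_key depends only on id and timestamp.
FakeRecord = Struct.new(:id, :updated_at) do
  def cache_key
    "jobs/#{id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S')}"
  end
end

# Builds one key from a list of records, roughly what passing an array
# of models to cache() boils down to.
def list_cache_key(records)
  records.map(&:cache_key).join("/")
end

stamp = Time.utc(2013, 3, 3, 12, 0, 0)
puts FakeRelation.new.to_s
puts list_cache_key([FakeRecord.new(1, stamp), FakeRecord.new(2, stamp)])
```

Two separately built lists with the same data produce the same key, whereas two FakeRelation objects never share a string representation.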

We've been doing exactly what you're mentioning in production for about a year. I extracted it into a gem a few months ago:
https://github.com/cmer/scope_cache_key
Basically, it allows you to use a scope as part of your cache key. There are significant performance benefits to doing so since you can now cache a page containing multiple records in a single cache element rather than looping each element in the scope and retrieving caches individually. I feel that combining this with with the standard "Russian Doll Caching" principles is optimal.

I have had similar problems: I have not been able to successfully pass relations to the cache function, and your @jobs variable is a relation.
I coded up a solution for cache keys that deals with this issue along with some others that I was having. It basically involves generating a cache key by iterating through the relation.
A full write-up is on my site here.
http://mark.stratmann.me/content_items/rails-caching-strategy-using-key-based-approach
In summary, I added a get_cache_key method to ActiveRecord::Base
module CacheKeys
  extend ActiveSupport::Concern
  # Instance Methods
  def get_cache_key(prefix=nil)
    cache_key = []
    cache_key << prefix if prefix
    cache_key << self
    self.class.get_cache_key_children.each do |child|
      if child.macro == :has_many
        self.send(child.name).all.each do |child_record|
          cache_key << child_record.get_cache_key
        end
      end
      if child.macro == :belongs_to
        cache_key << self.send(child.name).get_cache_key
      end
    end
    return cache_key.flatten
  end
  # Class Methods
  module ClassMethods
    def cache_key_children(*args)
      @v_cache_key_children = []
      # validate the children
      args.each do |child|
        # is it an association
        association = reflect_on_association(child)
        if association == nil
          raise "#{child} is not an association!"
        end
        @v_cache_key_children << association
      end
    end
    def get_cache_key_children
      return @v_cache_key_children ||= []
    end
  end
end
# include the extension
ActiveRecord::Base.send(:include, CacheKeys)
I can now create cache fragments by doing
cache(@model.get_cache_key(['textlabel'])) do

I've done something like Hopsoft, but it uses the method in the Rails Guide as a template. I've used the MD5 digest to distinguish between relations (so User.active.cache_key can be differentiated from User.deactivated.cache_key), and used the count and max updated_at to auto-expire the cache on updates to the relation.
require "digest/md5"
module RelationCacheKey
  def cache_key
    model_identifier = name.underscore.pluralize
    relation_identifier = Digest::MD5.hexdigest(to_sql.downcase)
    max_updated_at = maximum(:updated_at).try(:utc).try(:to_s, :number)
    "#{model_identifier}/#{relation_identifier}-#{count}-#{max_updated_at}"
  end
end
ActiveRecord::Relation.send :include, RelationCacheKey
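How the pieces of that key behave can be seen in a plain-Ruby sketch that takes the SQL, count and timestamp as arguments instead of querying a database. relation_cache_key is a hypothetical function name, and the "empty" fallback for a missing timestamp is an assumption of this sketch rather than something the module above does.

```ruby
require "digest/md5"

# Hypothetical standalone version of the key format used above:
# model name, MD5 of the (downcased) SQL, row count and newest timestamp.
def relation_cache_key(model_identifier, sql, count, max_updated_at)
  relation_identifier = Digest::MD5.hexdigest(sql.downcase)
  # Assumption of this sketch: use "empty" when there is no timestamp.
  timestamp = max_updated_at ? max_updated_at.utc.strftime("%Y%m%d%H%M%S") : "empty"
  "#{model_identifier}/#{relation_identifier}-#{count}-#{timestamp}"
end

sql = %(SELECT "jobs".* FROM "jobs" WHERE "jobs"."tenant_id" = 1)
puts relation_cache_key("jobs", sql, 2, Time.utc(2013, 3, 3, 12, 0, 0))
```

The digest separates one scope from another, while the count and timestamp make the key change when rows are added, removed or updated.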

While I marked @mark-stratmann's response as correct, I actually resolved this by simplifying the implementation. I added touch: true to my model relationship declaration:
belongs_to :tenant, touch: true
and then set the cache key based on the tenant (with a required query param as well):
<% cache([@current_tenant, params[:query], "#{@current_tenant.id}-index"]) do %>
That way, if a new Job is added, it touches the Tenant as well, which changes the tenant's cache key. Not sure if this is the best route, but it works and seems pretty simple.

I'm using this code:
class ActiveRecord::Base
  def self.cache_key
    pluck("concat_ws('/', '#{table_name}', group_concat(#{table_name}.id), date_format(max(#{table_name}.updated_at), '%Y%m%d%H%i%s'))").first
  end
  def self.updated_at
    maximum(:updated_at)
  end
end

Maybe this can help you out:
https://github.com/casiodk/class_cacher. It generates a cache_key from the model itself, but you may be able to reuse some of the principles in the codebase.

As a starting point you could try something like this:
def self.cache_key
  ["#{model_name.cache_key}-all",
   "#{count}-#{updated_at.utc.to_s(cache_timestamp_format) rescue 'empty'}"
  ] * '/'
end
def self.updated_at
  maximum :updated_at
end
I have a normalized database where multiple models relate to the same other model; think of clients, locations, etc. all having addresses by means of a street_id.
With this solution you can generate cache keys based on scope, e.g.
cache [@client, @client.locations] do
  # ...
end
cache [@client, @client.locations.active, 'active'] do
  # ...
end
and I could simply modify self.updated_at from above to also include associated objects (because has_many does not support "touch", so if I updated the street, the cache would not notice otherwise):
belongs_to :street
def cache_key
  [street.cache_key, super] * '/'
end
# ...
def self.updated_at
  [maximum(:updated_at),
   joins(:street).maximum('streets.updated_at')
  ].max
end
As long as you don't "undelete" records and you use touch in belongs_to, you should be alright with the assumption that a cache key made of count and max updated_at is sufficient.

I'm using a simple patch on ActiveRecord::Relation to generate cache keys for relations.
require "digest/md5"
module RelationCacheKey
  def cache_key
    Digest::MD5.hexdigest to_sql.downcase
  end
end
ActiveRecord::Relation.send :include, RelationCacheKey

How can I hide a column from a model in Rails 3.2?

Prior to Rails 3.1, we could update the self.columns method of ActiveRecord::Base.
But that doesn't seem to work now.
Now it seems if I remove a column from a table, I am forced to restart the Rails server. If I don't I keep getting errors when INSERTs to the table happen. Rails still thinks the old column exists, even though it's not in the database anymore.

Active Record does not support this out of the box, because it queries the database to get the columns of a model (unlike Merb's ORM tool, Datamapper).
Nonetheless, you can patch this feature on Rails with (assuming, for instance, you want to ignore columns starting with "deprecated" string):
module ActiveRecord
  module ConnectionAdapters
    class SchemaCache
      def initialize(conn)
        @connection = conn
        @tables = {}
        @columns = Hash.new do |h, table_name|
          columns = conn.columns(table_name, "#{table_name} Columns").reject { |c| c.name.start_with? "deprecated"}
          h[table_name] = columns
        end
        @columns_hash = Hash.new do |h, table_name|
          h[table_name] = Hash[columns[table_name].map { |col|
            [col.name, col]
          }]
        end
        @primary_keys = Hash.new do |h, table_name|
          h[table_name] = table_exists?(table_name) ? conn.primary_key(table_name) : nil
        end
      end
    end
  end
end
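The patch relies on a hash with a default block that computes, filters and memoizes the column list the first time a table is asked for. The sketch below shows that mechanism in plain Ruby; FilteredColumnCache, Column and the schema hash are hypothetical stand-ins for the Rails internals, not real Rails classes.

```ruby
# Hypothetical stand-in for a column object; only the name matters here.
Column = Struct.new(:name)

# Hypothetical cache: looks columns up lazily, drops those whose name
# starts with ignored_prefix, and remembers the result per table.
class FilteredColumnCache
  attr_reader :lookups

  def initialize(schema, ignored_prefix)
    @lookups = 0
    @columns = Hash.new do |cache, table_name|
      @lookups += 1 # counts how often the "database" is consulted
      cache[table_name] = schema.fetch(table_name).reject do |column|
        column.name.start_with?(ignored_prefix)
      end
    end
  end

  def columns(table_name)
    @columns[table_name]
  end

  # Forgets one table so the next lookup recomputes its columns.
  def clear_table(table_name)
    @columns.delete(table_name)
  end
end

schema = { "jobs" => [Column.new("id"), Column.new("title"), Column.new("deprecated_notes")] }
cache = FilteredColumnCache.new(schema, "deprecated")
puts cache.columns("jobs").map(&:name).inspect
```

Because the result is memoized, a column dropped from the table stays visible until the entry is cleared, which matches the stale-column behaviour described in the question.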

You can clear the ActiveRecord schema cache:
ActiveRecord::Base.connection.schema_cache.clear_table_cache!(:table_name)
Then it'll be reloaded the next time you reference a model that uses that table.

How to test the number of database calls in Rails

I am creating a REST API in rails. I'm using RSpec. I'd like to minimize the number of database calls, so I would like to add an automatic test that verifies the number of database calls being executed as part of a certain action.
Is there a simple way to add that to my test?
What I'm looking for is some way to monitor/record the calls that are being made to the database as a result of a single API call.
If this can't be done with RSpec but can be done with some other testing tool, that's also great.

The easiest thing in Rails 3 is probably to hook into the notifications api.
This subscriber
class SqlCounter < ActiveSupport::LogSubscriber
  def self.count= value
    Thread.current['query_count'] = value
  end
  def self.count
    Thread.current['query_count'] || 0
  end
  def self.reset_count
    result, self.count = self.count, 0
    result
  end
  def sql(event)
    self.class.count += 1
    puts "logged #{event.payload[:sql]}"
  end
end
SqlCounter.attach_to :active_record
will print every executed SQL statement to the console and count them. You could then write specs such as
expect do
  # do stuff
end.to change(SqlCounter, :count).by(2)
You'll probably want to filter out some statements, such as ones starting/committing transactions or the ones active record emits to determine the structures of tables.
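One way to do that filtering is a small predicate that the subscriber's sql method consults before incrementing the counter. This is a sketch under assumptions: countable_query? is a hypothetical helper, and it assumes that Active Record labels its table-structure queries with the payload name "SCHEMA" and that transaction statements start with the keywords listed.

```ruby
# Statements that manage transactions rather than read or write data.
TRANSACTION_STATEMENTS = /\A\s*(BEGIN|COMMIT|ROLLBACK|SAVEPOINT|RELEASE)\b/i

# Hypothetical helper: returns true when a notification payload
# (a hash with :name and :sql keys) should be counted as a real query.
def countable_query?(payload)
  # Assumption: schema introspection queries carry the name "SCHEMA".
  return false if payload[:name] == "SCHEMA"
  payload[:sql] !~ TRANSACTION_STATEMENTS
end

puts countable_query?(name: "Job Load", sql: 'SELECT "jobs".* FROM "jobs"')
puts countable_query?(name: nil, sql: "BEGIN")
```

Inside the subscriber this would read something like: self.class.count += 1 if countable_query?(event.payload).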

You may be interested in using explain. But that won't be automatic. You will need to analyse each action manually. But maybe that is a good thing, since the important thing is not the number of db calls, but their nature. For example: Are they using indexes?
Check this:
http://weblog.rubyonrails.org/2011/12/6/what-s-new-in-edge-rails-explain/

Use the db-query-matchers gem.
expect { subject.make_one_query }.to make_database_queries(count: 1)

Fredrick's answer worked great for me, but in my case, I also wanted to know the number of calls for each ActiveRecord class individually. I made some modifications and ended up with this in case it's useful for others.
class SqlCounter < ActiveSupport::LogSubscriber
  # Returns the number of database "Loads" for a given ActiveRecord class.
  def self.count(clazz)
    name = clazz.name + ' Load'
    Thread.current['log'] ||= {}
    Thread.current['log'][name] || 0
  end
  # Returns a list of ActiveRecord classes that were counted.
  def self.counted_classes
    log = Thread.current['log'] || {}
    loads = log.keys.select {|key| key =~ /Load$/ }
    loads.map { |key| Object.const_get(key.split.first) }
  end
  def self.reset_count
    Thread.current['log'] = {}
  end
  def sql(event)
    name = event.payload[:name]
    Thread.current['log'] ||= {}
    Thread.current['log'][name] ||= 0
    Thread.current['log'][name] += 1
  end
end
SqlCounter.attach_to :active_record
# count now takes a class, so use the block form of change.
# SomeModel is a placeholder for the ActiveRecord class you are counting.
expect do
  # do stuff
end.to change { SqlCounter.count(SomeModel) }.by(2)

Rails - Exclude an attribute from being saved

I have a column named updated_at in postgres. I'm trying to have the db set the time by default. But Rails still executes the query updated_at=NULL. But postgres will only set the timestamp by default when updated_at is not in the query at all.
How do I have Rails exclude a column?

You can disable this behaviour by setting ActiveRecord::Base class variable
record_timestamps to false.
In config/environment.rb, Rails::Initializer.run block :
config.active_record.record_timestamps = false
(if this doesn't work, try instead ActiveRecord::Base.record_timestamps = false at the end of the file)
If you want to set only for a given model :
class Foo < ActiveRecord::Base
  self.record_timestamps = false
end
Credit to Jean-François at http://www.ruby-forum.com/topic/72569

I've been running into a similar issue in Rails 2.2.2. As of this version there is an attr_readonly method in ActiveRecord, but create doesn't respect it, only update. I don't know if this has been changed in the latest version. I overrode the create method to force it to respect this setting.
def create
  if self.id.nil? && connection.prefetch_primary_key?(self.class.table_name)
    self.id = connection.next_sequence_value(self.class.sequence_name)
  end
  quoted_attributes = attributes_with_quotes(true, false)
  statement = if quoted_attributes.empty?
    connection.empty_insert_statement(self.class.table_name)
  else
    "INSERT INTO #{self.class.quoted_table_name} " +
    "(#{quoted_attributes.keys.join(', ')}) " +
    "VALUES(#{quoted_attributes.values.join(', ')})"
  end
  self.id = connection.insert(statement, "#{self.class.name} Create",
    self.class.primary_key, self.id, self.class.sequence_name)
  @new_record = false
  id
end
The change is just to pass false as the second parameter to attributes_with_quotes, and use quoted_attributes.keys for the column names when building the SQL. This has worked for me. The downside is that by overriding this you will lose before_create and after_create callbacks, and I haven't had time to dig into it enough to figure out why. If anyone cares to expand/improve on this solution or offer a better solution, I'm all ears.
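The effect of leaving read-only attributes out of the column list can be illustrated without Rails. In this sketch insert_statement and naive_quote are hypothetical helpers, the quoting is deliberately simplistic and not safe for real input, and DEFAULT VALUES is the PostgreSQL form for an insert with no columns.

```ruby
# Hypothetical, illustration-only quoting; do not use for real SQL.
def naive_quote(value)
  case value
  when nil     then "NULL"
  when Numeric then value.to_s
  else "'#{value.to_s.gsub("'", "''")}'"
  end
end

# Hypothetical helper: builds an INSERT that omits the read-only columns
# entirely, so the database can apply its own column defaults to them.
def insert_statement(table_name, attributes, readonly: [])
  writable = attributes.reject { |name, _value| readonly.include?(name) }
  return "INSERT INTO #{table_name} DEFAULT VALUES" if writable.empty?

  columns = writable.keys.join(", ")
  values  = writable.values.map { |value| naive_quote(value) }.join(", ")
  "INSERT INTO #{table_name} (#{columns}) VALUES(#{values})"
end

puts insert_statement("foos", { "name" => "first", "updated_at" => nil }, readonly: ["updated_at"])
```

Because updated_at never appears in the statement, the database default is used instead of an explicit NULL.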
