I saw other threads stating how to do it for MySQL, and even how to do it in Java, but not how to set the query timeout in Ruby.
I'm trying to use the setQueryTimeout function in JRuby using OJDBC7, but can't find how to do it in Ruby. I've tried the following:
@c.connection.instance_variable_get(:@connection).instance_variable_set(:@query_timeout, 1)
@c.connection.instance_variable_get(:@connection).instance_variable_set(:@read_timeout, 1)
@c.connection.setQueryTimeout(1)
I also tried modifying my database.yml file to include
adapter: jdbc
driver: oracle.jdbc.driver.OracleDriver
timeout: 1
None of the above had any effect, other than setQueryTimeout, which threw a method error.
Any help would be great.
So I found a way to make it work, but I don't like it. It's very hackish and orphans queries on the database, but it at least allows my app to continue executing. I would still love to find a way to cancel the statement so I'm not orphaning queries that take longer than 10 seconds.
require 'timeout'

query_thread = Thread.new {
  # execute query
}
begin
  Timeout.timeout(10) do
    query_thread.join
  end
rescue Timeout::Error
  Thread.kill(query_thread) # abandons the thread; the query keeps running on the database
  results = Array.new
end
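For what it's worth, if you can get hold of the underlying java.sql.Statement, calling its cancel method aborts the server-side work instead of orphaning it. A sketch under that assumption (how you obtain raw_statement is adapter-specific; the getter patch in the answer below shows one way):

require 'timeout'

# Hedged sketch: cancel the raw JDBC statement on timeout instead of only
# killing the Ruby thread. raw_statement (a java.sql.Statement) is an assumption.
def run_with_cancel(raw_statement, sql, seconds)
  result = nil
  query_thread = Thread.new do
    begin
      result = raw_statement.execute_query(sql)
    rescue Exception
      nil # a cancelled statement raises a Java SQLException in this thread
    end
  end
  Timeout.timeout(seconds) { query_thread.join }
  result
rescue Timeout::Error
  raw_statement.cancel # java.sql.Statement#cancel stops the database-side work
  query_thread.join
  nil
end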
Query timeout on an Oracle DB works for me with Rails 4 and JRuby.
With JRuby you can use the JDBC method statement.setQueryTimeout to define a query timeout.
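For reference, outside of ActiveRecord the raw JDBC API looks like this from JRuby (URL, user, and password are illustrative placeholders):

require 'java'
# with ojdbc7.jar on the classpath:

url  = 'jdbc:oracle:thin:@localhost:1521:XE'
conn = java.sql.DriverManager.get_connection(url, 'scott', 'tiger')
stmt = conn.create_statement
stmt.set_query_timeout(10) # seconds; the driver raises a SQLTimeoutException when exceeded
rs = stmt.execute_query('select count(*) from dual')
rs.next
puts rs.get_int(1)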
However, this requires patching the oracle_enhanced adapter, as shown below.
The example below implements an iterator-style query that does not store the whole result in an array, and which also applies a query timeout.
# Hold an open SQL cursor and iterate over the SQL result without storing the whole result in an array
# Peter Ramm, 02.03.2016

# Expand the class with a getter to allow access to the internal variable @raw_statement
ActiveRecord::ConnectionAdapters::OracleEnhancedJDBCConnection::Cursor.class_eval do
  def get_raw_statement
    @raw_statement
  end
end
# A class extension by module declaration (module ActiveRecord, module ConnectionAdapters,
# module OracleEnhancedDatabaseStatements) does not work as an engine with the Winstone
# application server, therefore the class ActiveRecord::ConnectionAdapters::OracleEnhancedAdapter
# is patched directly and extended with the method iterate_query.
ActiveRecord::ConnectionAdapters::OracleEnhancedAdapter.class_eval do

  # Comparable with ActiveRecord::ConnectionAdapters::OracleEnhancedDatabaseStatements.exec_query,
  # but without storing the whole result in memory
  def iterate_query(sql, name = 'SQL', binds = [], modifier = nil, query_timeout = nil, &block)
    type_casted_binds = binds.map { |col, val|
      [col, type_cast(val, col)]
    }
    log(sql, name, type_casted_binds) do
      cursor = nil
      cached = false
      if without_prepared_statement?(binds)
        cursor = @connection.prepare(sql)
      else
        unless @statements.key? sql
          @statements[sql] = @connection.prepare(sql)
        end
        cursor = @statements[sql]
        binds.each_with_index do |bind, i|
          col, val = bind
          cursor.bind_param(i + 1, type_cast(val, col), col)
        end
        cached = true
      end

      cursor.get_raw_statement.setQueryTimeout(query_timeout) if query_timeout

      cursor.exec

      if name == 'EXPLAIN' and sql =~ /^EXPLAIN/
        res = true
      else
        columns = cursor.get_col_names.map do |col_name|
          @connection.oracle_downcase(col_name).freeze
        end
        fetch_options = {:get_lob_value => (name != 'Writable Large Object')}
        while row = cursor.fetch(fetch_options)
          result_hash = {}
          columns.each_index do |index|
            result_hash[columns[index]] = row[index]
            row[index] = row[index].strip if row[index].class == String # remove possible 0x00 at end of string, which leads to errors in Internet Explorer
          end
          result_hash.extend SelectHashHelper
          modifier.call(result_hash) unless modifier.nil?
          yield result_hash
        end
      end

      cursor.close unless cached
      nil
    end
  end # iterate_query
end # class_eval
class SqlSelectIterator

  def initialize(stmt, binds, modifier, query_timeout)
    @stmt          = stmt
    @binds         = binds
    @modifier      = modifier # proc for modification of each record
    @query_timeout = query_timeout
  end

  def each(&block)
    # execute the SQL and call the block for every record of the result
    ActiveRecord::Base.connection.iterate_query(@stmt, 'sql_select_iterator', @binds, @modifier, @query_timeout, &block)
  end
end
Use the class SqlSelectIterator like this:
SqlSelectIterator.new(stmt, binds, modifier, query_timeout).each do |record|
  process(record)
end
Related
I'm intrigued: how does User.where(name: 'Foo').where(age: 20) work with only one database call?
Does the where method know it is the last in the chain and change its behavior? If so, how is that done?
It returns self each time after adding the conditions to its internal query builder, so calls can be chained.
where doesn't need to know where it is in the chain; ActiveRecord waits until the relation is enumerated before ever querying the database. Example:
users = User.where(active: true)
users.loaded? # false
users.each { }
users.loaded? # true
each, map, first, last, etc. will all trigger the query to be loaded.
Here's an example of a super-naive query builder:
class FakeRecord
  include Enumerable

  def self.all_args
    @all_args ||= []
  end

  def self.where(*args)
    all_args << args
    self
  end

  def self.each
    puts "Executing sql #{all_args.join(", ")}"
    yield [1, 2, 3]
  end
end
FakeRecord.where(potato: true).where(dinosaur: false).each do |thing|
  puts thing
end
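Running that last snippet prints the accumulated conditions just once (something like "Executing sql {:potato=>true}, {:dinosaur=>false}") before yielding the fake row, mirroring how a real relation defers its single query until it is enumerated.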
I have an API endpoint that accounts for a little less than half of the average response time (on average taking about 514 ms, yikes). The endpoint simply returns some statistics about stored data scoped to particular time periods, such as this week, last week, this month, and so on...
There are a number of ways we could reduce its impact, like getting the clients to hit it less often and with more specific queries, such as only asking for "this week" when only that data is used. Here we focus on what can be done at the database level first. In our current implementation we generate this data for all "time scopes" on the fly; the number of queries is enormous, and they are made multiple times per second. No caching is used, but maybe there is a way to use Rails's cache_key, or the low-level Rails.cache?
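For the caching idea, a minimal sketch with the low-level cache could wrap the generator shown below (the key layout, expiry, and the build_all_summaries helper are assumptions for illustration):

def self.generate_for(user)
  # Hypothetical: user.updated_at only busts the cache if foo changes touch the user.
  Rails.cache.fetch(["foo_summaries", user.id, user.updated_at], expires_in: 15.minutes) do
    build_all_summaries(user) # the uncached work shown below
  end
end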
The current implementation looks something like this:
class FooSummaries
  include SummaryStructs

  def self.generate_for(user)
    @user = user
    summaries = Struct::Summaries.new
    TimeScope::TIME_SCOPES.each do |scope|
      foos = user.foos.by_scope(scope.to_sym)
      summary = Struct::Summary.new
      # e.g.: summaries.last_week = build_summary(foos)
      summaries.send("#{scope}=", build_summary(summary, foos))
    end
    summaries
  end

  def self.build_summary(summary, foos)
    summary.all_quuz = @user.foos_count
    summary.all_quux = all_quux(foos)
    summary.quuw     = quuw(foos).to_f
    %w[foo bar baz qux].product(
      %w[quux quuz corge]
    ).each do |a, b|
      # e.g.: summary.foo_quux = quux(foos, "foo")
      summary.send("#{a.downcase}_#{b}=", send(b, foos, a) || 0)
    end
    summary
  end

  def self.all_quuz(foos)
    foos.count
  end

  def self.all_quux(foos)
    foos.sum(:quux)
  end

  def self.quuw(foos)
    foos.quuwable.total_quuw
  end

  def self.corge(foos, foo_type)
    return if foos.count.zero?
    count = quuz(foos, foo_type) || 0
    count.to_f / foos.count
  end

  def self.quux(foos, foo_type)
    case foo_type
    when "foo"
      foos.where(foo: true).sum(:quux)
    when "bar"
      foos.bar.where(foo: false).sum(:quux)
    when "baz"
      foos.baz.where(foo: false).sum(:quux)
    when "qux"
      foos.qux.sum(:quux)
    end
  end

  def self.quuz(foos, foo_type)
    case foo_type # was `trip_type` in the original, which is undefined here
    when "foo"
      foos.where(foo: true).count
    when "bar"
      foos.bar.where(foo: false).count
    when "baz"
      foos.baz.where(foo: false).count
    when "qux"
      foos.qux.count
    end
  end

  # a bare `private_class_method` with no arguments has no effect, so name the methods
  private_class_method :build_summary, :all_quuz, :all_quux, :quuw,
                       :corge, :quux, :quuz
end
To avoid changing the model, or creating migrations for a table to store this data (both of which may be valid and better solutions), I decided it might be easier to construct one large SQL query and execute it all at once, in the hope that building the query string and executing it would be faster than the overhead of ActiveRecord setting up and tearing down many SQL queries.
The new approach looks something like this; it horrifies me, and I know there must be a more elegant way:
class FooSummaries
  include SummaryStructs

  def self.generate_for(user)
    results = ActiveRecord::Base.connection.execute(build_query_for(user))
    results.each do |result|
      # build up summary struct from query results
    end
  end

  def self.build_query_for(user)
    TimeScope::TIME_SCOPES.map do |scope|
      time_scope = TimeScope.new(scope)
      %w[foo bar baz qux].map do |foo_type|
        %[
          select
            '#{scope}_#{foo_type}',
            sum(quux) as quux,
            count(*) as quuz,
            round(100.0 * (count(*) / #{user.foos_count.to_f}), 3) as corge
          from
            "foos"
          where
            "foos"."user_id" = #{user.id}
            and "foos"."foo_type" = '#{foo_type.humanize}'
            and "foos"."end_time" between '#{time_scope.from}' and '#{time_scope.to}'
            and "foos"."foo" = '#{foo_type == 'foo' ? 't' : 'f'}'
          union
        ]
      end
    end.join.reverse.sub("union".reverse, "").reverse
  end
end
The funny way of replacing the last occurrence of union also horrifies me, but it seems to work. There must be a better way, as there are probably many things wrong with the above implementation(s). It may be helpful to note that I use PostgreSQL and have no problem writing queries that are not portable to other databases. Any advice is truly appreciated!
Thanks for reading!
Update: I found a solution that works for me and sped up the endpoint that uses this service object by 500%! Essentially the idea is: instead of building a query string and executing it for each set of parameters, we create a prepared statement with prepare, followed by exec_prepared calls that pass parameters to the query. Since this query is made many times over, this is a useful optimization because, per the documentation:
A prepared statement is a server-side object that can be used to optimize performance. When the PREPARE statement is executed, the specified statement is parsed, analyzed, and rewritten. When an EXECUTE command is subsequently issued, the prepared statement is planned and executed. This division of labor avoids repetitive parse analysis work, while allowing the execution plan to depend on the specific parameter values supplied.
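Stripped to its essentials, the pg gem's prepared-statement API works like this (database and table names are placeholders):

require 'pg'

conn = PG.connect(dbname: 'app_development')
conn.prepare('user_count', 'select count(*) from users where created_at > $1')
result = conn.exec_prepared('user_count', [Time.now - 3600])
puts result[0]['count'] # parsing and analysis happened once, at prepare time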
We prepare the query like so:
def prepare_query!
  ActiveRecord::Base.transaction do
    connection.prepare("foos_summary",
      %[with scoped_foos as (
          select
            *
          from
            "foos"
          where
            "foos"."user_id" = $3
            and ("foos"."end_time" between $4 and $5)
        )
        select
          $1::text as scope,
          $2::text as foo_type,
          sum(quux)::float as quux,
          sum(eggs + bacon + ham)::float as food,
          count(*) as count,
          round((sum(quux) / nullif(
            (select
               sum(quux)
             from
               scoped_foos), 0))::numeric,
            5)::float as quuz
        from
          scoped_foos
        where
          (case $6
           when 'Baz'
             then (baz = 't')
           else
             (baz = 'f' and foo_type = $6)
           end
          )
      ])
  end
end # this closing `end` for the method was missing in the original snippet
You can see that the query uses a common table expression for readability and to avoid writing the same select twice.
Then we execute the query, passing in the parameters we need:
def connection
  @connection ||= ActiveRecord::Base.connection.raw_connection
end

def query_results
  prepare_query! unless query_already_prepared?
  @results ||= TimeScope::TIME_SCOPES.map do |scope|
    time_scope = TimeScope.new(scope)
    %w[bacon eggs ham spam].map do |foo_type|
      connection.exec_prepared("foos_summary",
                               [scope,
                                foo_type,
                                @user.id,
                                time_scope.from,
                                time_scope.to,
                                foo_type.humanize])
    end
  end
end
Where query_already_prepared? is a simple check against the prepared-statements view maintained by Postgres:
def query_already_prepared?
  connection.exec(%(select
                      name
                    from
                      pg_prepared_statements
                    where name = 'foos_summary')).count.positive?
end
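One caveat worth knowing: prepared statements and the pg_prepared_statements view are both per-session, so this check only works because it runs on the same raw connection that prepared the query.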
A nice solution, I thought! Hopefully the technique illustrated here will help others with similar problems.
I have this method in my models/images.rb model. I am starting with testing and having a hard time coming up with tests for it. Would appreciate your help.
def self.tags
  t = "db/data.csv"
  @arr = []
  csvdata = CSV.read(t)
  csvdata.shift
  csvdata.each do |row|
    row.each_with_index do |l, i|
      unless l.nil?
        @arr << l
      end
    end
  end
  @arr
end
First off, a word of advice: CSV is probably the worst imaginable data format and is best avoided unless absolutely necessary, such as when the client insists that manipulating data in MS Excel is a good idea (it is not).
If you have to use CSV, don't use a method name like .tags, which can be confused with a regular ActiveRecord relation.
Testing methods that read from the file system can be quite difficult.
To start with you might want to alter the signature of the method so that you can pass a file path.
def self.tags(file = "db/data.csv")
  # ...
end
That way you can pass a fixture file so that you can test it deterministically.
RSpec.describe Image do
  describe "tags" do
    let(:file) { Rails.root.join('spec', 'support', 'fixtures', 'tags.csv') }

    it 'returns the cell values as an array' do
      # for a fixture whose header row is followed by rows containing "bar" and "baz"
      expect(Image.tags(file)).to eq ['bar', 'baz']
    end
  end
end
However, your method is very idiosyncratic:
def self.tags
  t = "db/data.csv"
  @arr = []
self.tags makes it a class method, yet you are declaring @arr as an instance variable.
Additionally, Ruby's Enumerable module provides so many methods for manipulating arrays that accumulating into an outer variable in a loop is not needed.
def self.tags(file = "db/data.csv")
  csv_data = CSV.read(file)
  csv_data.shift           # drop the header row
  csv_data.flatten.compact # flatten the rows and remove nil cells, as the original loop did
end
I have a User model in a Ruby on Rails application that has multiple methods like this:
getClient() returns an object that knows how to find certain info for a date
processHeaders() is a function that processes output and updates some values in the database
refreshToken() is a function that is called when an error occurs while requesting data from the object returned by getClient()
def transactions_on_date(date)
  if blocked?
    # do something
  else
    begin
      output = getClient().transactions(date)
      processHeaders(output)
      return output
    rescue UnauthorizedError => ex
      refresh_token()
      output = getClient().transactions(date)
      process_fitbit_rate_headers(output)
      return output
    end
  end
end

def events_on_date(date)
  if blocked?
    # do something
  else
    begin
      output = getClient().events(date)
      processHeaders(output)
      return output
    rescue UnauthorizedError => ex
      refresh_token()
      output = getClient().events(date)
      processHeaders(output)
      return output
    end
  end
end
I have several functions in my User class that look exactly the same. The only difference among them is the line output = getClient().something(date). Is there a way I can make this code cleaner so that I do not have a repetitive list of functions?
The answer is usually passing in a block and doing it functional style:
def handle_blocking(date)
  if blocked?
    # do something
  else
    begin
      output = yield(date)
      processHeaders(output)
      output
    rescue UnauthorizedError => ex
      refresh_token
      output = yield(date)
      process_fitbit_rate_headers(output)
      output
    end
  end
end
Then you call it this way:
handle_blocking(date) do |date|
  getClient.something(date)
end
That allows a lot of customization. The yield call executes the block of code you've supplied and passes in the date argument to it.
The process of DRYing up your code often involves looking for patterns and boiling them down to useful methods like this. Using a functional approach can keep things clean.
Yes, you can use Object#send: getClient().send(:method_name, date).
BTW, getClient is not a proper Ruby method name. It should be get_client.
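Combining the two ideas, you could generate the wrappers in a loop. A sketch, assuming the handle_blocking helper from the first answer (the method list is illustrative, and get_client is the renamed getClient):

class User
  # Defines transactions_on_date, events_on_date, ... dynamically.
  %i[transactions events].each do |kind|
    define_method("#{kind}_on_date") do |date|
      handle_blocking(date) { |d| get_client.send(kind, d) }
    end
  end
end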
How about a combination of both answers:
class User
  def method_missing(sym, *args)
    m_name = sym.to_s
    if m_name.end_with? '_on_date'
      prop = m_name.split('_').first.to_sym
      handle_blocking(args.first) { getClient().send(prop, args.first) }
    else
      super(sym, *args)
    end
  end

  def respond_to?(sym, private = false)
    sym.to_s.end_with?('_on_date') || super(sym, private) # was `m_name`, which is undefined here
  end

  def handle_blocking(date)
    # see other answer
  end
end
Then you can call transactions_on_date, events_on_date, foo_on_date, and it would work.
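Note that on modern Rubies the conventional hook is respond_to_missing? rather than overriding respond_to? directly; it also makes method(:transactions_on_date) work.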
I am introducing Sunspot search into my project. I got a proof of concept working by searching on just the name field. When I introduced the description field and reindexed Solr, I got the following error:
** Invoke sunspot:reindex (first_time)
** Invoke environment (first_time)
** Execute environment
** Execute sunspot:reindex
Skipping progress bar: for progress reporting, add gem 'progress_bar' to your Gemfile
rake aborted!
RSolr::Error::Http: RSolr::Error::Http - 400 Bad Request
Error: {'responseHeader'=>{'status'=>400,'QTime'=>18},'error'=>{'msg'=>'Illegal character ((CTRL-CHAR, code 11))
at [row,col {unknown-source}]: [42,1]','code'=>400}}
Request Data: "<?xml version=\"1.0\" encoding=\"UTF-8\"?><add><doc><field name=\"id\">ItemsDesign 1322</field><field name=\"type\">ItemsDesign</field><field name=\"type\">ActiveRecord::Base</field><field name=\"class_name\">ItemsDesign</field><field name=\"name_text\">River City Clocks Musical Multi-Colored Quartz Cuckoo Clock</field><field name=\"description_text\">This colorful chalet style German quartz cuckoo clock accurately keeps time and plays 12 different melodies. Many colorful flowers are painted on the clock case and figures of a Saint Bernard and Alpine horn player are on each side of the clock dial. Two decorative pine cone weights are suspended beneath the clock case by two chains. The heart shaped pendulum continously swings back and forth.
The request data is truncated above. I'm assuming the bad character is the invisible one (CTRL-CHAR, code 11, a vertical tab) at the end of the description, which is littered throughout a lot of the descriptions. I'm not even sure what character it is.
What can I do to get Solr to ignore it, or to clean the data so that Solr can handle it?
Thanks
Put the following in an initializer to automatically clean Sunspot calls of any UTF-8 control characters:
# config/initializers/sunspot.rb
module Sunspot
  #
  # DataExtractors present an internal API for the indexer to use to extract
  # field values from models for indexing. They must implement the #value_for
  # method, which takes an object and returns the value extracted from it.
  #
  module DataExtractor #:nodoc: all
    #
    # AttributeExtractors extract data by simply calling a method on the block.
    #
    class AttributeExtractor
      def initialize(attribute_name)
        @attribute_name = attribute_name
      end

      def value_for(object)
        Filter.new( object.send(@attribute_name) ).value
      end
    end

    #
    # BlockExtractors extract data by evaluating a block in the context of the
    # object instance, or if the block takes an argument, by passing the object
    # as the argument to the block. Either way, the return value of the block is
    # the value returned by the extractor.
    #
    class BlockExtractor
      def initialize(&block)
        @block = block
      end

      def value_for(object)
        Filter.new( Util.instance_eval_or_call(object, &@block) ).value
      end
    end

    #
    # Constant data extractors simply return the same value for every object.
    #
    class Constant
      def initialize(value)
        @value = value
      end

      def value_for(object)
        Filter.new(@value).value
      end
    end

    #
    # A Filter to allow easy value cleaning
    #
    class Filter
      def initialize(value)
        @value = value
      end

      def value
        strip_control_characters @value
      end

      def strip_control_characters(value)
        return value unless value.is_a? String

        value.chars.inject("") do |str, char|
          unless char.ascii_only? and (char.ord < 32 or char.ord == 127)
            str << char
          end
          str
        end
      end
    end
  end
end
Source (Sunspot Github Issues): Sunspot Solr Reindexing failing due to illegal characters
I tried the solution @thekingoftruth proposed; however, it did not solve the problem. I found an alternative version of the Filter class in the same GitHub thread that he links to, and that solved my problem.
The main difference is that I use nested models through HABTM relationships.
This is my search block in the model:
searchable do
  text :name, :description, :excerpt
  text :venue_name do
    venue.name if venue.present?
  end
  text :artist_name do
    artists.map { |a| a.name if a.present? } if artists.present?
  end
end
Here is the initializer that worked for me (in config/initializers/sunspot.rb):
module Sunspot
  #
  # DataExtractors present an internal API for the indexer to use to extract
  # field values from models for indexing. They must implement the #value_for
  # method, which takes an object and returns the value extracted from it.
  #
  module DataExtractor #:nodoc: all
    #
    # AttributeExtractors extract data by simply calling a method on the block.
    #
    class AttributeExtractor
      def initialize(attribute_name)
        @attribute_name = attribute_name
      end

      def value_for(object)
        Filter.new( object.send(@attribute_name) ).value
      end
    end

    #
    # BlockExtractors extract data by evaluating a block in the context of the
    # object instance, or if the block takes an argument, by passing the object
    # as the argument to the block. Either way, the return value of the block is
    # the value returned by the extractor.
    #
    class BlockExtractor
      def initialize(&block)
        @block = block
      end

      def value_for(object)
        Filter.new( Util.instance_eval_or_call(object, &@block) ).value
      end
    end

    #
    # Constant data extractors simply return the same value for every object.
    #
    class Constant
      def initialize(value)
        @value = value
      end

      def value_for(object)
        Filter.new(@value).value
      end
    end

    #
    # A Filter to allow easy value cleaning
    #
    class Filter
      def initialize(value)
        @value = value
      end

      def value
        if @value.is_a? String
          strip_control_characters_from_string @value
        elsif @value.is_a? Array
          @value.map { |v| strip_control_characters_from_string v }
        elsif @value.is_a? Hash
          @value.inject({}) do |hash, (k, v)|
            hash.merge( strip_control_characters_from_string(k) => strip_control_characters_from_string(v) )
          end
        else
          @value
        end
      end

      def strip_control_characters_from_string(value)
        return value unless value.is_a? String

        value.chars.inject("") do |str, char|
          unless char.ascii_only? && (char.ord < 32 || char.ord == 127)
            str << char
          end
          str
        end
      end
    end
  end
end
You need to get rid of control characters from UTF-8 while saving your content; Solr will not index them properly and will throw this error.
http://en.wikipedia.org/wiki/UTF-8#Codepage_layout
You can use something like this:
name.gsub!(/\p{Cc}/, "")
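For example, you could strip them in a callback so Solr never sees them. A sketch, where the model and the list of columns are assumptions:

# Hypothetical: clean control characters before the record is saved.
# Note that \p{Cc} also matches newlines and tabs, so only apply it where that is acceptable.
class ItemsDesign < ActiveRecord::Base
  before_save :strip_control_characters

  private

  def strip_control_characters
    %w[name description].each do |column|
      self[column] = self[column].gsub(/\p{Cc}/, "") if self[column].is_a?(String)
    end
  end
end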
edit:
If you want to override it globally, I think it could be possible by overriding the value_for methods in AttributeExtractor and, if needed, BlockExtractor.
https://github.com/sunspot/sunspot/blob/master/sunspot/lib/sunspot/data_extractor.rb
I haven't verified this myself; if you manage to add a global patch, please let me know.
I had the same issue recently.