I have a piece of code:
config_item_relation = OTRS::Relation.new
config_item_threads = []
config_items.each do |ci|
  config_item_threads << Thread.new do
    config_item_relation << Tracker::ConfigItem.object_preprocessor(ci.first)
  end
end
Which is causing this error:
LoadError: Expected app/models/tracker/config_item.rb to define Tracker::ConfigItem
If I comment out the thread creation as such:
config_item_relation = self.superclass::Relation.new
config_item_threads = []
config_items.each do |ci|
  #config_item_threads << Thread.new do
  config_item_relation << Tracker::ConfigItem.object_preprocessor(ci.first)
  #end
end
The code runs just fine, except of course it won't do it in separate threads.
The referenced file in the error is indeed defining Tracker::ConfigItem.
class Tracker::ConfigItem < OTRS::ConfigItem
It's a class I use in many places elsewhere with no issue, until I use it with Thread here.
I have the same Thread usage against a different, but extremely similar class (same inheritance) in the same code chunk that works perfectly fine:
ticket_threads = []
if tickets
  ticket_relation = self.superclass::Relation.new
  tickets.each do |t|
    ticket_threads << Thread.new do
      ticket_relation << Tracker::Ticket.object_preprocessor(t)
    end
  end
end
Am I missing something with these threads?
It looks like Tracker::ConfigItem isn't loaded at this point, so multiple threads all start trying to load the same file at once.
require is very much thread-unsafe in current versions of Ruby and can give rise to a variety of race conditions. I'd stick a
require_dependency 'tracker/config_item'
at the top of the file, so that you are sure the requiring happens on the main thread rather than your child threads fighting it out.
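If you want to see the safe shape of that fan-out in isolation, here is a minimal runnable sketch using plain-Ruby stand-ins (not the OTRS or Tracker classes): everything is loaded on the main thread before any worker starts, and the shared collection is guarded by a mutex, since pushing to a shared collection from several threads is its own hazard.

```ruby
# Plain-Ruby stand-ins for the real classes: load everything up front on
# the main thread, then fan work out to threads and collect results
# under a mutex so concurrent << calls can't interleave.
results = []
mutex   = Mutex.new
items   = (1..5).to_a

threads = items.map do |item|
  Thread.new do
    processed = item * 2 # stand-in for object_preprocessor
    mutex.synchronize { results << processed }
  end
end
threads.each(&:join)

results.sort # => [2, 4, 6, 8, 10]
```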
I have a class and attributes there
class Person
  attr_accessor :name

  def say_hello
    puts "person name #{self.name}"
  end
end
Now I want to execute the say_hello method, but from threads:
queue_thread = []
1..100.times do |number|
  person = Person.new
  person.name = number.to_s
  thread_to_run = Thread.new { person.say_hello }
  queue_thread << thread_to_run
end
queue_thread.map { |thread_current| thread_current.join }
Do you have any idea how to do this? From what I can tell, the problem is that the thread does not pick up the instance variables of the object.
The correct output in the console should be:
"person name 1"
"person name 2"
"person name ..."
"person name etc"
The issue with this code is that it fires off multiple threads before calling join; in that time, the threads may run in any order, owing to the asynchronous nature of threads.
One option would be to simply join as soon as the thread is invoked. This will in effect pause the iteration until the thread completes, so you know they'll stay in order:
100.times do |number|
  person = Person.new
  person.name = number.to_s
  Thread.new { person.say_hello }.join
end
Note there is really no point in using a thread here, but it will at least show you how join works.
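A variant worth knowing (not from the original answer, just a sketch): create all the threads first so they run concurrently, then join them in creation order and collect each thread's return value with Thread#value. The work overlaps, but the results still come back in order.

```ruby
# Each thread returns its string; mapping with &:value joins the threads
# in creation order and collects their return values, so the output
# order is deterministic even though the threads ran concurrently.
results = (1..5).map do |n|
  Thread.new { "person name #{n}" }
end.map(&:value)

puts results
```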
Another option (which also unnecessarily uses threads) is to delay the thread invocation by storing it in a lambda. This is basically the same thing, but it lets you split the work into two iterations:
queue_threads = []
1..100.times do |number|
  person = Person.new
  person.name = number.to_s
  thread_lambda = -> { Thread.new { person.say_hello } }
  queue_threads.push(thread_lambda)
end
queue_threads.map { |thread_lambda| thread_lambda.call.join }
Also note that 1..100.times isn't doing what you think it is. It is the same thing as saying 100.times; for example, if you said 99..100.times, the 99 would be ignored and you would get 100 iterations, not 1. If you want to iterate over a range, you need parentheses: (99..100).each do |i|.
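You can see the precedence at work in a quick sketch: 1..100.times { } parses as 1..(100.times { }), so the block still runs 100 times regardless of the left-hand number.

```ruby
count = 0
1..100.times { count += 1 } # parses as 1..(100.times { ... })
# count is now 100 -- the leading "1.." changed nothing

range_items = (99..100).to_a # parentheses make a real two-element range
# range_items == [99, 100]
```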
Can someone please give a concrete example demonstrating non-thread safety? (in a similar manner to a functioning version of mine below if possible)
I need an example class that demonstrates a non-thread safe operation such that I can assert on the failure, and then enforce a Mutex such that I can test that my code is then thread safe.
I have tried the following with no success, as the threads do not appear to run in parallel. Assuming the ruby += operator is not threadsafe, this test always passes when it should not:
class TestLock
  attr_reader :sequence

  def initialize
    @sequence = 0
  end

  def increment
    @sequence += 1
  end
end
# RSpec test
it 'does not allow parallel calls to increment' do
  test_lock = TestLock.new
  threads = []
  list1 = []
  list2 = []
  start_time = Time.now + 2
  threads << Thread.new do
    loop do
      if Time.now > start_time
        5000.times { list1 << test_lock.increment }
        break
      end
    end
  end
  threads << Thread.new do
    loop do
      if Time.now > start_time
        5000.times { list2 << test_lock.increment }
        break
      end
    end
  end
  threads.each(&:join) # wait for all threads to finish
  expect(list1 & list2).to eq([])
end
Here is an example which, instead of hunting for a race condition in addition or concatenation, uses a blocking file write.
To summarize the parts:
file_write method performs a blocking write for 2 seconds.
file_read reads the file and assigns it to a global variable to be referenced elsewhere.
NonThreadsafe#test calls these methods in succession, in their own threads, without a mutex. sleep 0.2 is inserted between the calls to ensure that the blocking file write has begun by the time the read is attempted. join is called on the second thread, so we can be sure it has assigned the read value to the global variable. The method then returns the read value from that global.
Threadsafe#test does the same thing, but wraps each method call in a mutex.
Here it is:
module FileMethods
  def file_write(text)
    File.open("asd", "w") do |f|
      f.write text
      sleep 2
    end
  end

  def file_read
    $read_val = File.read "asd"
  end
end
class NonThreadsafe
  include FileMethods

  def test
    `rm asd`
    `touch asd`
    Thread.new { file_write("hello") }
    sleep 0.2
    Thread.new { file_read }.join
    $read_val
  end
end
class Threadsafe
  include FileMethods

  def test
    `rm asd`
    `touch asd`
    semaphore = Mutex.new
    Thread.new { semaphore.synchronize { file_write "hello" } }
    sleep 0.2
    Thread.new { semaphore.synchronize { file_read } }.join
    $read_val
  end
end
And tests:
expect(NonThreadsafe.new.test).to be_empty
expect(Threadsafe.new.test).to eq("hello")
As for an explanation: the reason the non-threadsafe version shows the file's read value as empty is that the blocking write operation is still happening when the read takes place. When you synchronize on the Mutex, though, the write completes before the read. Note also that the .join in the threadsafe example takes longer than in the non-threadsafe one; that's because it sleeps for the full duration specified in the write thread.
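If you want a race that fails reliably without touching the filesystem, a check-then-act update with a forced sleep between the read and the write will do it. This is a contrived sketch (real races are rarely this cooperative), but it demonstrates the lost-update failure the original test was hunting for, and shows the mutex restoring correctness:

```ruby
# Both threads read the counter, sleep through a context switch, then
# write back read + 1 -- so one increment is lost.
counter = 0
threads = 2.times.map do
  Thread.new do
    temp = counter
    sleep 0.05 # yield the scheduler mid read-modify-write
    counter = temp + 1
  end
end
threads.each(&:join)
counter # => 1, not 2 -- the updates collided

# Wrapping the whole read-modify-write in a mutex serializes the threads.
safe_counter = 0
lock = Mutex.new
threads = 2.times.map do
  Thread.new do
    lock.synchronize do
      temp = safe_counter
      sleep 0.05
      safe_counter = temp + 1
    end
  end
end
threads.each(&:join)
safe_counter # => 2
```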
So I stumbled across this: https://github.com/typhoeus/typhoeus
I'm wondering if this is what I need to speed up my rake task.
Event.all.each do |row|
  begin
    url = urlhere + row.first + row.second
    doc = Nokogiri::HTML(open(url))
    doc.css('.table__row--event').each do |tablerow|
      table = tablerow.css('.table__cell__body--location').css('h4').text
      next unless table == row.eventvenuename
      tablerow.css('.table__cell__body--availability').each do |button|
        buttonurl = button.css('a')[0]['href']
        if buttonurl.include? '/checkout/external'
        else
          row.update(row: buttonurl)
        end
      end
    end
  rescue Faraday::ConnectionFailed
    puts "connection failed"
    next
  end
end
I'm wondering whether this would speed it up, or whether it wouldn't because I'm doing a .each.
If it would, could you provide an example?
Sam
If you set up Typhoeus::Hydra to run parallel requests, you might be able to speed up your code, assuming that the Kernel#open calls are what's slowing you down. Before you optimize, you might want to run benchmarks to validate this assumption.
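A quick way to test that assumption is to time the fetch step against the processing step with Benchmark from the standard library. The sketch below uses a hypothetical sleep as a stand-in for the open(url) call, since the real URLs aren't available here:

```ruby
require 'benchmark'

# Hypothetical stand-ins: compare where the time actually goes before
# reaching for parallelism.
fetch_time = Benchmark.realtime { sleep 0.05 }                    # stand-in for open(url)
parse_time = Benchmark.realtime { 10_000.times { "row".upcase } } # stand-in for the CSS work

puts format("fetch: %.3fs, parse: %.3fs", fetch_time, parse_time)
```

If the fetch dwarfs the parse, parallel requests are worth the restructuring below; if not, the bottleneck is elsewhere.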
If it is true, and parallel requests would speed it up, you would need to restructure your code to load events in batches, build a queue of parallel requests for each batch, and then handle them after they execute. Here's some sketch code.
class YourBatchProcessingClass
  def initialize(batch_size: 200)
    @batch_size = batch_size
    @hydra = Typhoeus::Hydra.new(max_concurrency: @batch_size)
  end

  def perform
    # Get an array of records
    Event.find_in_batches(batch_size: @batch_size) do |batch|
      # Store all the requests so we can access their responses later.
      requests = batch.map do |record|
        request = Typhoeus::Request.new(your_url_build_logic(record))
        @hydra.queue request
        request
      end
      @hydra.run # Run requests in parallel

      # Process responses from each request
      requests.each do |request|
        your_response_processing(request.response.body)
      end
    end
  rescue WhateverError => e
    puts e.message
  end

  private

  def your_url_build_logic(event)
    # TODO
  end

  def your_response_processing(response_body)
    # TODO
  end
end

# Run the service by calling this in your Rake task definition
YourBatchProcessingClass.new.perform
Ruby can be used for pure scripting, but it functions best as an object-oriented language. Decomposing your processing work into clear methods can help clarify your code and help you catch things like Tom Lord mentioned in the comments on your question. Also, instead of wrapping your whole script in a begin..rescue block, you can use method-level rescues as in #perform above, or just wrap @hydra.run.
As a note, .all.each is a memory hog, and is thus considered a bad solution to iterating over records: .all loads all of the records into memory before iterating over them with .each. To save memory, it's better to use .find_each or .find_in_batches, depending on your use case. See: http://api.rubyonrails.org/classes/ActiveRecord/Batches.html
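The batching idea itself needs no database to see: find_in_batches behaves much like each_slice over the result set, holding only one slice in memory at a time. A plain-Ruby sketch (not ActiveRecord) of the shape:

```ruby
records = (1..10).to_a # stand-in for the Event rows

# Process four records at a time instead of loading all of them at once.
batches = []
records.each_slice(4) { |batch| batches << batch }

batches # => [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
```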
When I try to run the following code in specs, it gives me stack level too deep. It works fine in the console.
def order_fulfillments_without_receipts
  @order_fulfillments_without_receipts = []
  OrderReconciliation.includes(:order_fulfillment).
    where(data_entry_status: OrderReconciliation.data_entry_statuses[:pending_entry]).
    find_in_batches do |group|
      group.select do |reconciliation|
        select_reconciliation?(reconciliation)
      end
    end
  @order_fulfillments_without_receipts
end

def select_reconciliation?(reconciliation)
  order_fulfillment = reconciliation.order_fulfillment
  receipt_urls_empty = order_fulfillment.get_receipt_urls.empty?
  order_fulfillment_id = order_fulfillment.id
  @order_fulfillments_without_receipts << order_fulfillment_id
  receipt_urls_empty || order_fulfillments_without_receipts.include?(order_fulfillment_id)
end
How should I fix it to avoid stack level too deep?
You have a bug in your code: on the last line of the select_reconciliation? method, after the ||, you have order_fulfillments_without_receipts, but I think you meant @order_fulfillments_without_receipts.
Without the @ you're calling the order_fulfillments_without_receipts method, hence the infinite loop.
Why this happens in your tests and not in your console must have to do with what receipt_urls_empty is in each case: in your tests it's false, and in your console it's true, so the || short-circuits before the recursive call.
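The failure mode is easy to reproduce in isolation (hypothetical names, same shape as the bug above): a method and a helper that accidentally calls the method, instead of reading the instance variable, re-enter each other until the stack blows.

```ruby
class RecursionDemo
  def values
    @values = []
    helper
    @values
  end

  def helper
    # Meant to read @values; without the @ this re-enters the `values`
    # method, which calls helper again, and so on until SystemStackError.
    values.include?(1)
  end
end

begin
  RecursionDemo.new.values
rescue SystemStackError
  puts "stack level too deep"
end
```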
I'm using rails 3.2.11 and ruby 1.9.3.
I have a slow page and I know I have many ways to optimize it. Currently I am focused on the method update_attributes.
Here is my code:
def create
  @user = current_user
  @demo = @user.demos.new
  race_ethnicity_response = []
  params[:race_ethnicity_response].each do |response, value|
    race_ethnicity_response << response if value != '0'
  end
  params[:demo][:race_ethnicity_response] = race_ethnicity_response.join(', ')[0, 254]
  @demo.update_attributes(params[:demo])
end
Or should I use something like build and save or create?
@demo = @user.demos.build
...
@demo.save!
Or
@user.demos.create!(params[:demo])
I am curious which is faster. I know that if it only saves 2ms, I should use whichever is more correct and readable.
On a small operation like this you aren't going to see much of a performance difference. Go for readability + maintainability. The code above does seem a little scattered, particularly the middle block. Here is a straightforward approach although I may be missing something related to the params[:race_ethnicity_response] loop.
@demo = Demo.new(params[:demo])
@demo.race_ethnicity_response = params[:race_ethnicity_response].reject { |_response, value| value == '0' }.keys.join(', ')[0, 254]
current_user.demos << @demo
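The filtering step can be checked on its own with hypothetical form data: filtering the params hash and joining the surviving keys reproduces what the original each loop built.

```ruby
# Hypothetical checkbox params: '0' means the box was unchecked.
responses = { 'asian' => '1', 'white' => '0', 'hispanic' => '1' }

joined = responses.select { |_name, value| value != '0' }.keys.join(', ')[0, 254]
# joined == "asian, hispanic"
```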