Multithreading in Ruby in EC2 causing weird behavior

I have the following code that I run in a rake task in rails:
10.times do |i|
  Thread.new do
    puts "#{i}"
  end
end
When I run this locally, I get the following:
0
3
5
1
7
8
2
4
9
6 (with new lines)
However, when I run the same code in EC2 via the same rake task, it will print out maybe one or two lines, and then the task will terminate. I'm not sure why, but it seems my EC2 instance can't handle the multithreading for some reason.
Any insights why?

You've just been getting lucky locally - there is nothing that guarantees that your 10 threads will execute to completion before your program exits. If you want to wait for your threads then you must do so explicitly:
threads = 10.times.collect do |i|
  Thread.new do
    puts i
  end
end
threads.each(&:join)
The join method blocks the calling thread until the specified thread has completed. It returns the thread itself; if you want the return value of the thread's block, call value instead, which also waits for the thread to finish.
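A minimal sketch of how join and value differ, using only core Ruby. The squaring work is a placeholder chosen for illustration and is not part of the original rake task:

```ruby
# Each thread computes a value; the last expression of the block is the
# thread's return value.
threads = 10.times.map do |i|
  Thread.new { i * i }
end

# join waits for each thread and returns the Thread object itself.
joined = threads.map(&:join)

# value also waits, then returns whatever the thread's block returned.
squares = threads.map(&:value)

puts joined.all? { |t| t.is_a?(Thread) }
puts squares.inspect
```

Because the results are collected in the order the threads were created, the values come back in a stable order even though the threads themselves may run in any order.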

Related

Sidekiq jobs won't run in same time in different queues

I have 2 Sidekiq workers:
Foo:
# frozen_string_literal: true
class FooWorker
  include Sidekiq::Worker
  sidekiq_options queue: :foo
  def perform
    loop do
      File.open(File.join(Rails.root, 'foo.txt'), 'w') { |file| file.write('FOO') }
    end
  end
end
Bar:
# frozen_string_literal: true
class BarWorker
  include Sidekiq::Worker
  sidekiq_options queue: :bar
  def perform
    loop do
      File.open(File.join(Rails.root, 'bar.txt'), 'w') { |file| file.write('BAR') }
    end
  end
end
Both have pretty much the same functionality and run on different queues. The YAML file looks like this:
---
:queues:
  - foo
  - bar
development:
  :concurrency: 5
The problem is that, even though both are running and showing on the Busy page of the Sidekiq UI, only one of them actually creates a file and writes content to it. Shouldn't Sidekiq be multi-threaded?
Update:
This happens only on my machine.
I created a new project with rails new and got the same result.
I cloned a colleague's project and ran his Sidekiq, and it works!
I used his Sidekiq version: not working!
New Update:
This also happens on my colleague's machine if he clones my project.
If I run 2 jobs with a finite loop (like 10 times do something with a sleep), the first job is executed and then the second, but after the second finishes and they start again, both run at the same time as expected. Everyone who cloned the project from github.com/ArayB/sidekiq-test encountered the problem.

It's not an issue with Sidekiq. It's an issue somewhere in Ruby/MRI/Thread/GIL. Google for more info, but my understanding is that MRI has used native threads since Ruby 1.9 (older versions used "green threads", which only simulate threading), yet the Global Interpreter Lock still means that only one thread can execute Ruby code at a time.
It's interesting that with only two threads the system isn't giving time to the second thread. No idea why, but it must realize its mistake when you run it again.
Interestingly, if you run the same app but instead fire off 10 TestWorkers (and tweak the output so you can tell them apart), Sidekiq will run all 10 "at once".
10.times {|i| TestWorker.perform_async(i) }
Here is the tweaked worker. Be sure to flush the output, because otherwise threading and TTY buffering can make the output not reflect reality.
class TestWorker
  include Sidekiq::Worker
  def perform(n)
    10.times do |i|
      puts "#{n} - #{i} - #{Time.current}"
      $stdout.flush
      sleep 1
    end
  end
end
Some interesting links:
https://en.wikipedia.org/wiki/Green_threads
http://ruby-doc.org/core-2.4.1/Thread.html#method-c-pass
https://github.com/ruby/ruby/blob/v2_4_1/thread.c
Does ruby have real multithreading?
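A small sketch, independent of Sidekiq, of why the sleeping workers above overlap: under MRI a thread gives up the interpreter lock while it sleeps or waits on I/O, so several sleeping threads make progress together. The thread count and sleep duration are arbitrary values chosen for illustration:

```ruby
require 'benchmark'

# Five threads that each sleep 0.2s. Run one after another this would
# take about 1s; run as threads it takes roughly one sleep period,
# because sleeping threads do not hold the interpreter lock.
elapsed = Benchmark.realtime do
  threads = 5.times.map do |n|
    Thread.new do
      sleep 0.2
      n
    end
  end
  threads.each(&:join)
end

puts format('elapsed: %.2fs', elapsed)
```

A tight loop that never sleeps or blocks, like the loop in FooWorker and BarWorker, behaves differently: it only yields when the interpreter decides to switch threads.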

Why isn't .with_lock working as I expect it to?

I would expect the below code to force exclusive access to the code block within, but it isn't. In my test, each thread is able to run the block concurrently.
Assume a Rails environment and user is an activerecord object. Also note that this is a somewhat arbitrary test I wrote in order to resolve a concurrency issue I'm experiencing with web requests.
user = User.first
threads = []
3.times do |i|
  threads << Thread.new do
    user.with_lock do
      puts "start: #{i}"
      sleep 1
      puts "stop: #{i}"
    end
  end
end
threads.each(&:join)
Expected output:
start: 1
stop: 1
start: 2
stop: 2
start: 3
stop: 3
Actual output:
start: 1
start: 2
start: 3
stop: 1
stop: 2
stop: 3
What am I missing? Does rails .with_lock not work within standard ruby threads? Or, is this possibly due to my test environment using sqlite3?

It appears to have been related to sqlite3.

You've evidently figured this out already, but to clarify for others encountering this problem:
Sqlite3 doesn't support row-level locking. Therefore, Arel simply ignores any locks when building the query:
https://github.com/rails/arel/blob/master/lib/arel/visitors/sqlite.rb#L7
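If the goal is only to serialize a block across threads inside a single Ruby process, a plain Mutex produces the start/stop ordering the question expected without relying on database row locks. This is a sketch of a substitute technique, not of with_lock itself: it gives no protection across processes or servers, which is what a database lock is for. The events array is a hypothetical stand-in for the puts calls:

```ruby
lock = Mutex.new
events = []

threads = 3.times.map do |i|
  Thread.new do
    # Only one thread at a time can be inside synchronize.
    lock.synchronize do
      events << "start: #{i}"
      sleep 0.05
      events << "stop: #{i}"
    end
  end
end
threads.each(&:join)

puts events
```

Each start line is immediately followed by its matching stop line. The order in which the three threads acquire the mutex is not guaranteed.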

Code not getting executed in Ruby threads

I have a rake task that uses some threads and now I'm getting to a really strange case...
some code didn't get executed so I started playing with simple puts statements...
Basically I have this:
Thread.new do
  puts "hi"
  puts "there"
  [more code]
end
These are three consecutive runs of my rake task:
$ rake task:execute
hi
there
$ rake task:execute
[nothing!]
$ rake task:execute
hi
I tried Ruby 2.0 and 2.1.
I don't know if the problem is just in puts, but I think not: the code didn't get executed, which is why I started debugging with printouts, only to discover that even those don't (always) get executed.
Strange?

You need to save the reference to all the threads and then call join on each of them to wait for them to complete. Ruby will not wait for other threads once the main thread exits.
threads = 3.times.map { Thread.new { puts "hello" } }
# do something else while threads run, if you want
threads.each(&:join)
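The effect can be reproduced deterministically by running two tiny scripts in child Ruby processes, one that joins its thread and one that does not. This is a sketch for illustration; the one-line scripts and the 0.2 second delay are made up for the demonstration:

```ruby
require 'rbconfig'

# Script run in a child Ruby process: the thread is never joined, so the
# process exits while the thread is still sleeping.
no_join = 'Thread.new { sleep 0.2; puts "hello" }'
# Same script, but the main thread waits for the thread before exiting.
with_join = 'Thread.new { sleep 0.2; puts "hello" }.join'

out_no_join = IO.popen([RbConfig.ruby, '-e', no_join], &:read)
out_with_join = IO.popen([RbConfig.ruby, '-e', with_join], &:read)

puts "without join: #{out_no_join.inspect}"
puts "with join:    #{out_with_join.inspect}"
```

In the rake task the thread sometimes gets a little time to run before the main thread exits, which is why the output varies from run to run.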

Your main thread (the rake task itself) is probably completing before your subthread completes. You can do something like this:
t = Thread.new do
  puts "hi"
  puts "there"
  [more code]
end
[do other stuff in the main thread]
t.join # Let the subthread catch up

Rails threads testing db lock

Let's say we want to test that the database is being locked.
$transaction = Thread.new {
  Rails.logger.debug 'transaction process start'
  Inventory.transaction do
    inventory.lock!
    Thread.stop
    inventory.units_available = 99
    inventory.save
  end
}
$race_condition = Thread.new {
  Rails.logger.debug 'race_condition process start'
  config = ActiveRecord::Base.configurations[Rails.env].symbolize_keys
  config[:flags] = 65536 | 131072 | Mysql2::Client::FOUND_ROWS
  begin
    connection = Mysql2::Client.new(config)
    $transaction.run
    $transaction.join
  rescue NoMethodError
  ensure
    connection.close if connection
  end
}
Rails.logger.debug 'main process start'
$transaction.join
Rails.logger.debug 'main process after transaction.join'
sleep 0.1 while $transaction.status != 'sleep'
Rails.logger.debug 'main process after sleep'
$race_condition.join
Rails.logger.debug 'main process after race_condition.join'
In theory, I'd think it would run the transaction thread, then wait (Thread.stop); then the main process would see that it's sleeping and start the race condition thread (which will be trying to alter data in the locked table when it actually works). Then the race condition thread would resume the transaction thread once it was done.
What's weird is the trace:
main process start
transaction process start
race_condition process start
Coming from Node.js, it seems like threads aren't exactly user friendly, though there has to be a way to get this done.
Is there an easier way to lock the database, then try to change it with a different thread?

Thread.new automatically starts the thread.
But that does not mean that it is executing yet.
That depends on the operating system, Ruby or JRuby, how many cores, etc.
In your example the main thread runs until
$transaction.join,
and only then does your transaction thread start, just by chance.
It runs until Thread.stop, then your $race_condition thread starts, because both others are blocked (it might have started before).
So that explains your log.
You have two $transaction.join calls.
They wait until the thread exits. A thread can only exit once, but calling join on a thread that has already finished simply returns immediately, so the second call should not hang by itself.
For your test, you need some sort of explicit synchronization, so that your race thread writes exactly when the transaction thread is in the middle of the transaction. You can do this with a Mutex, but some sort of message passing would be better. The following blog post may help:
http://www.engineyard.com/blog/2011/a-modern-guide-to-threads/
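One way to sketch the message-passing idea is with two Queue objects from core Ruby acting as hand-off signals. The queue names and log strings are hypothetical, and the comments mark where the real Inventory.transaction and Mysql2 calls from the question would go; this only shows the ordering mechanism, not the database work:

```ruby
in_transaction = Queue.new # signals "the lock is now held"
may_commit = Queue.new     # signals "the competing write was attempted"
log = []

transaction = Thread.new do
  # Placeholder for: Inventory.transaction { inventory.lock! ... }
  log << 'transaction: lock acquired'
  in_transaction << :ready
  may_commit.pop # blocks until the race thread has made its attempt
  log << 'transaction: commit'
end

race = Thread.new do
  in_transaction.pop # blocks until the transaction holds the lock
  # Placeholder for the competing write on a second connection.
  log << 'race: attempting write'
  may_commit << :go
end

[transaction, race].each(&:join)
puts log
```

Queue#pop blocks until something is pushed, so the three log lines always appear in the same order regardless of how the threads are scheduled. In a real test the competing write would itself block on the row lock, so it would probably need a lock wait timeout before signalling.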

To make any resource mutually exclusive, you need to use the Mutex class and its synchronize method to lock the resource while one thread is using it. You have to do something like this:
semaphore = Mutex.new
and use it inside the Thread instance.
$transaction = Thread.new {
  semaphore.synchronize {
    # Do whatever you want with the *your shared resource*
  }
}
This way you can prevent race conditions on the shared resource.
Hope this helps.

Rails threading - multiple tasks

I am trying to run multiple tasks, each of which accesses the database, and I am trying to run the tasks in separate threads of execution.
I played around and tried allow_concurrency, which I have set to true, and config.thread_safe!, but I get non-deterministic errors; for example, sometimes a class is missing, or a constant...
Here is some code:
grabbers = get_grabber_name_list
threads = []
grabbers.each { |grabber|
  threads << Thread.new {
    ARGV[0] = grabber
    if (@@last_run_timestamp[grabber.to_sym].blank? || (@@last_run_timestamp[grabber.to_sym] >= AbstractGrabber.aff_net_accounts(grabber, "grab_interval").seconds.ago))
      Rake::Task["aff_net:import:" + grabber].execute
      @@last_run_timestamp.merge!({grabber.to_sym => Time.now})
    end
  }
}
threads.each {|t| t.join }
thanks

I've recently implemented a Rails application that uses threads and made a few discoveries:
First, if you're writing to any arrays or hashes (i.e., complex types) outside your thread, wrap them in a mutex. It looks to me like hash and array references may not be thread safe. It seems unlikely that hash/array element indexing isn't thread safe but all I know is that after I put the external data structures in a mutex before writing, problems disappeared.
Second, close your ActiveRecord connection when the thread terminates, otherwise you can end up creating a large number of stale connections. Here's a post about how to do this. I don't know if it still applies for Rails versions > 2.2 but after I started closing connections explicitly, my problems disappeared. The author suggests monkey-patching ActiveRecord to do this automatically but I decided to release connections explicitly in my code.
Here's a sample of code that's working for me:
mutex = Mutex.new
my_array = []
threads = []
1.upto(10) do |i|
  threads << Thread.new {
    begin
      do_some_stuff
      mutex.synchronize {
        # You'd think that each thread would only touch its own personal
        # array element but without a mutex, I run into problems.
        my_array[i] = some_computed_value
      }
    ensure
      ActiveRecord::Base.connection_pool.release_connection
    end
  }
}
threads.each {|t| t.join}
By the way, if you're using threads to take advantage of multi-core CPUs, you'll need to use JRuby. As far as I know, JRuby is the only implementation that can take advantage of native CPU threads. If you use threads so you can do other things while waiting on network connections or some other non-CPU task, this isn't an issue.
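Applied to the shared timestamp hash in the question, the same mutex pattern might look like the sketch below. It runs without Rails: the grabber names and the last_run hash are hypothetical stand-ins, and Time.now replaces the real import work:

```ruby
mutex = Mutex.new
last_run = {}                   # shared across threads (stand-in for the class variable)
grabbers = %w[alpha beta gamma] # hypothetical grabber names

threads = grabbers.map do |grabber|
  Thread.new do
    finished_at = Time.now      # placeholder for running the import task
    # Only the write to the shared hash needs to be inside the mutex.
    mutex.synchronize { last_run[grabber.to_sym] = finished_at }
  end
end
threads.each(&:join)

puts last_run.keys.sort.inspect
```

Keeping the slow work outside synchronize lets the threads overlap; only the short hash update is serialized.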

You should probably do this using background workers. There are a few options for background worker libraries, but my favourite is delayed_job (http://github.com/tobi/delayed_job).
It should be pretty easy to convert the code you posted into background jobs.
