On create of an object in Rails, I want to automatically assign it a stock image from the assets directory which can be overwritten later by a user.
To do this, I call the following private method when the object is created:
def save_stock_image
  image_path = Dir.glob(<list-of-images-from-directory>).sample
  File.open(image_path) do |file|
    self.image = file
    self.save!
  end
end
However, after 6 RSpec tests, I begin to receive the following error:
Failure/Error: let(:object) { create(:object) }
Errno::EMFILE:
Too many open files - /tmp/16020130822-36578-q8j9v9.jpg
# ./app/models/object.rb:502:in `block in save_stock_image'
# ./app/models/object.rb:501:in `open'
# ./app/models/object.rb:501:in `save_stock_image'
# ./spec/controllers/object_controller_spec.rb:318:in `block (3 levels) in <top (required)>'
# ./spec/controllers/object_controller_spec.rb:344:in `block (4 levels) in <top (required)>'
The above error occurs on ~40 of 60 tests. I've looked at a few SO questions, as well as https://github.com/thoughtbot/paperclip/issues/1122 and https://github.com/thoughtbot/paperclip/issues/1000. The closest answer I could find was to ensure the file descriptor was being closed. Before switching to the block form of File.open, I explicitly closed the file with file.close; that didn't work either.
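For reference, the block form of File.open does close the descriptor as soon as the block returns, which is easy to verify in isolation (a standalone sketch, unrelated to Paperclip):

```ruby
# The block form of File.open closes the handle automatically when the
# block exits, even if an exception is raised inside it.
handle = nil
File.open(__FILE__) { |file| handle = file }
handle.closed?  # => true
```

So the leak is unlikely to be this handle itself; the more likely suspects are the files Paperclip opens internally.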
Something obvious that I'm doing wrong? Is there a better way to accomplish what I'm trying to do?
UPDATE
It looks like it has something to do with the temporary files which Paperclip creates before they are uploaded to S3. Is there something with closing those tempfiles that I'm missing?
Just ran into this myself. It looks like the master branch has a fix. See my comments here:
https://github.com/thoughtbot/paperclip/issues/1326?source=cc
If this is a development/test environment and you want a quick fix: identify the Resque process id, kill it, and restart the Resque server. Additionally, you may try the following:
Redis.current.client.reconnect
$redis = Redis.current
Just ran into this and the latest code didn't help me. So, I delegated the work of closing those temp files to the OS by spawning a child process:
def save_stock_image
  ActiveRecord::Base.connection.disconnect!
  Process.fork do
    image_path = Dir.glob(<list-of-images-from-directory>).sample
    File.open(image_path) do |file|
      self.image = file
      self.save!
    end
  end
  Process.wait
  ActiveRecord::Base.establish_connection
end
Also, consider placing a timeout on the Process.wait, as suggested here: Waiting for Ruby child pid to exit
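A sketch of that timeout, assuming a POSIX system where Process.fork is available (wait_for_child is a hypothetical helper name, not part of the answer above):

```ruby
require 'timeout'

# Hypothetical helper: wait for a forked child, but give up after `seconds`.
# Returns true if the child exited in time, false if we had to kill it.
def wait_for_child(pid, seconds)
  Timeout.timeout(seconds) { Process.wait(pid) }
  true
rescue Timeout::Error
  Process.kill('KILL', pid) # child is stuck; kill and reap it so no zombie remains
  Process.wait(pid)
  false
end
```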
Related
I'm currently working on a project to enable database backed configurations in the frontend of our application. These need to be loaded after application initialization, so I created a module to load them and added a call to it in environment.rb, after Rails.application.initialize!.
The problem is that when this code is enabled, my console gets flooded with listen loop errors with bad file descriptors like:
2020-01-24 09:18:16 -0500: Listen loop error: #<Errno::EBADF: Bad file descriptor>
/Users/fionadurgin/.asdf/installs/ruby/2.6.5/lib/ruby/gems/2.6.0/gems/puma-4.3.1/lib/puma/server.rb:383:in `select'
/Users/fionadurgin/.asdf/installs/ruby/2.6.5/lib/ruby/gems/2.6.0/gems/puma-4.3.1/lib/puma/server.rb:383:in `handle_servers'
/Users/fionadurgin/.asdf/installs/ruby/2.6.5/lib/ruby/gems/2.6.0/gems/puma-4.3.1/lib/puma/server.rb:356:in `block in run'
When I disable either the call to the ConfigurationLoader or the methods I'm calling on the model, I no longer get these errors.
The rub is that I can't reproduce this issue on another machine, or in specs. I've tried on two other laptops and on one of our staging servers and they work perfectly with the ConfigurationLoader enabled.
I've tried restarting my computer, working from a freshly cloned repository, and setting all the file permissions for the application to 777. Nothing has worked so far.
Here's the ConfigurationLoader module:
module ConfigurationLoader
  # Overrides client default configurations if frontend configurations exist
  def self.call
    Configurations::ImportRowMapping.override_configurations
  rescue ActiveRecord::NoDatabaseError => e
    log_no_database_error(e)
  rescue ActiveRecord::StatementInvalid => e
    log_statement_invalid_error(e)
  rescue Mysql2::Error::ConnectionError => e
    log_connection_error(e)
  end

  def self.log_no_database_error(error)
    Rails.logger.warn(
      'Could not initialize database backed configurations, database does '\
      'not exist'
    )
    Rails.logger.warn(error.message)
  end

  def self.log_statement_invalid_error(error)
    Rails.logger.warn(
      'Could not initialize database backed configurations, table does '\
      'not exist'
    )
    Rails.logger.warn(error.message)
  end

  def self.log_connection_error(error)
    Rails.logger.warn(
      'Could not initialize database backed configurations, could not '\
      'connect to database'
    )
    Rails.logger.warn(error.message)
  end
end
The call in environment.rb:
# Load the Rails application.
require_relative 'application'
require_relative 'configuration_loader'
# Initialize the Rails application.
Rails.application.initialize!
ConfigurationLoader.call
And the model method being called:
def self.override_configurations
  return unless any?

  Rails.application.client.payroll_service_file.payroll_service_file
       .mappings = all.to_a
end
I'll note here that I get the errors when either the guard clause or the assignment is enabled.
Anyone have any ideas about what's going on? I'm about at my wits' end.
So I'm still not sure on the exact cause of the problem, but the solution was to move the configuration loader call out of environment.rb and into an after_initialize block in application.rb.
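For anyone landing here, a sketch of that change (MyApp is a placeholder for your application module):

```ruby
# config/application.rb
module MyApp
  class Application < Rails::Application
    # Runs after the framework and all initializers have finished loading,
    # once the database connection is fully configured.
    config.after_initialize do
      ConfigurationLoader.call
    end
  end
end
```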
I'm noticing a strange issue that I haven't been able to solve for a few days.
I have a Rails 5 API server with system tests using RSpec and Capybara + Selenium-webdriver driving headless Chrome.
I'm using Capybara.app_host = 'http://localhost:4200' to make the tests hit a separate development server which is running an Ember front-end. The Ember front-end looks at the user agent to know to then send requests to the Rails API test database.
All the tests run fine except for ones which use RSpec file fixtures.
Here's one spec that is failing:
describe 'the affiliate program', :vcr, type: :system do
  fixtures :all

  before do
    Capybara.session_name = :affiliate
    visit('/')
    signup_and_verify_email(signup_intent: :seller)
    visit_affiliate_settings
  end

  it 'can use the affiliate page' do
    affiliate_token = page.text[/Your affiliate token is \b(.+?)\b/i, 1]
    expect(affiliate_token).to be_present

    # When a referral signs up.
    Capybara.session_name = :referral
    visit("?client=#{affiliate_token}")
    signup_and_verify_email(signup_intent: :member)
    refresh

    # It can track the referral.
    Capybara.session_name = :affiliate
    refresh
    expect(page).to have_selector('.referral-row', count: 1)

    # When a referral makes a purchase.
    Capybara.session_name = :referral
    find('[href="/videos"]').click
    find('.price-area .coin-usd-amount', match: :first).click
    find('.cart-dropdown-body .checkout-button').click
    find('.checkout-button').click
    wait_for { find('.countdown-timer') }
    order = Order.last
    order.force_complete_payment!
    Rake::Task['affiliate_referral:update_amounts_earned'].invoke

    # It can track the earnings.
    Capybara.session_name = :affiliate
    refresh
    amount = (order.price * AffiliateReferral::COMMISSION_PERCENTAGE).floor.to_f
    amount_in_dom = find('.referral-amount-earned', match: :first).text.gsub(/[^\d\.]/, '').to_f * 100
    expect(amount).to equal(amount_in_dom)
  end
end
This will fail maybe 99% of the time. There is the odd case where it passes. I can get my test suite to eventually pass by running it on a loop for a day.
I ended up upgrading all versions to the latest (Node 10, latest Ember, latest Rails) but the issue persists.
I can post a sample repo that reproduces the issue later. I just wanted to get this posted in case anyone has encountered the issue.
Here's a typical stack trace when the timeout happens:
1.1) Failure/Error: page.evaluate_script('window.location.reload()')
Net::ReadTimeout:
Net::ReadTimeout
# /home/mhluska/.rvm/gems/ruby-2.5.1/gems/webmock-3.3.0/lib/webmock/http_lib_adapters/net_http.rb:97:in `block in request'
# /home/mhluska/.rvm/gems/ruby-2.5.1/gems/webmock-3.3.0/lib/webmock/http_lib_adapters/net_http.rb:110:in `block in request'
# /home/mhluska/.rvm/gems/ruby-2.5.1/gems/webmock-3.3.0/lib/webmock/http_lib_adapters/net_http.rb:109:in `request'
# /home/mhluska/.rvm/gems/ruby-2.5.1/gems/selenium-webdriver-3.14.0/lib/selenium/webdriver/remote/http/default.rb:121:in `response_for'
# /home/mhluska/.rvm/gems/ruby-2.5.1/gems/selenium-webdriver-3.14.0/lib/selenium/webdriver/remote/http/default.rb:76:in `request'
# /home/mhluska/.rvm/gems/ruby-2.5.1/gems/selenium-webdriver-3.14.0/lib/selenium/webdriver/remote/http/common.rb:62:in `call'
# /home/mhluska/.rvm/gems/ruby-2.5.1/gems/selenium-webdriver-3.14.0/lib/selenium/webdriver/remote/bridge.rb:164:in `execute'
# /home/mhluska/.rvm/gems/ruby-2.5.1/gems/selenium-webdriver-3.14.0/lib/selenium/webdriver/remote/oss/bridge.rb:584:in `execute'
# /home/mhluska/.rvm/gems/ruby-2.5.1/gems/selenium-webdriver-3.14.0/lib/selenium/webdriver/remote/oss/bridge.rb:267:in `execute_script'
# /home/mhluska/.rvm/gems/ruby-2.5.1/gems/selenium-webdriver-3.14.0/lib/selenium/webdriver/common/driver.rb:211:in `execute_script'
# /home/mhluska/.rvm/gems/ruby-2.5.1/gems/capybara-3.8.2/lib/capybara/selenium/driver.rb:84:in `execute_script'
# /home/mhluska/.rvm/gems/ruby-2.5.1/gems/capybara-3.8.2/lib/capybara/selenium/driver.rb:88:in `evaluate_script'
# /home/mhluska/.rvm/gems/ruby-2.5.1/gems/capybara-3.8.2/lib/capybara/session.rb:575:in `evaluate_script'
# ./spec/support/selenium.rb:48:in `refresh'
# ./spec/support/pages.rb:70:in `signup_and_verify_email'
# ./spec/system/payment_spec.rb:43:in `block (3 levels) in <top (required)>'
I should point out it doesn't always happen with page.evaluate_script('window.location.reload()'). It can happen with something benign like visit('/').
Edit: I tried disabling Ember FastBoot (server-side rendering) using the DISABLE_FASTBOOT env variable and suddenly all tests pass. I'm thinking that somehow the RSpec fixtures are causing Ember FastBoot to not finish rendering in some cases. This certainly lines up with dropped connections I've occasionally seen in production logs.
I've been experimenting with the client code and it may be due to my use of FastBoot's deferRendering call.
Edit: I'm using the following versions:
ember-cli: 3.1.3
ember-data: 3.0.2
rails: 5.2.1
rspec: 3.8.0
capybara: 3.8.2
selenium-webdriver: 3.14.0
google chrome: 69.0.3497.100 (Official Build) (64-bit)
Edit: I'm using this somewhat flaky Node/Express library fastboot-app-server to do server-side rendering. I've discovered that it sometimes strips important response headers (Content-Type and Content-Encoding). I'm wondering if this is contributing to the issue.
Edit: I added a strict Content Security Policy to make sure there are no external requests running during the test suite that could be causing the Net::ReadTimeout.
I inspect the Chrome network tab at the point when it locks up and it seems to be loading nothing. Manually refreshing the browser allows the tests to pick up and continue running. How strange.
I've spent a couple weeks on this now and it may be time to give up on Selenium tests.
I upgraded to Chrome 70 and chromedriver 2.43. It didn't seem to make a difference.
I tried using the rspec-retry gem to force a refresh when the timeout occurs but the gem seems to fail to catch the timeout exception.
I've inspected the raw request to chromedriver where things hang. It looks like it's always POST http://127.0.0.1/session/<session id>/refresh. I tried refreshing in an alternate way: visit(page.current_path) which seems to fix things!
I finally got my test suite to pass by switching page.driver.browser.navigate.refresh to visit(page.current_path).
I know it's an ugly hack but it's the only thing I could find to get things working (see my various attempts in the question edits).
I looked at the request to chromedriver that was causing the timeouts each time: POST http://127.0.0.1/session/<session id>/refresh. I can only guess that it's some kind of issue with chromedriver. Perhaps incidentally, it only hangs when multiple chromedriver instances are active (which happens when multiple Capybara sessions are being used).
Edit: I needed to account for query params as well:
def refresh
  query = URI.parse(page.current_url).query
  path = page.current_path
  path += "?#{query}" if query.present?
  visit(path)
end
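The path-plus-query reconstruction itself can be checked outside Capybara with plain URI (page.current_url and page.current_path are Capybara methods; a literal URL stands in for them here, and present? is ActiveSupport, so plain Ruby uses a nil/empty check):

```ruby
require 'uri'

# Rebuild "path?query" from a full URL, mirroring the refresh helper above.
uri = URI.parse('http://localhost:4200/videos?client=abc123')
path = uri.path
path += "?#{uri.query}" if uri.query && !uri.query.empty?
path  # => "/videos?client=abc123"
```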
I tried just doing visit(page.current_url) but that was giving timeouts as well.
I am basically running a background process that checks for files and then updates a Rails model based on the data discovered in each file. However, I can't access the model from within the thread because of an error.
Here's my example:
def check_logs
  while @start == 1
    results = Dir.glob("#{@path}/*.txt")
    unless results.empty?
      results.each do |result|
        file_name = result.split("/")[-1]
        data = File.open(result).read
        if file_name.include? "get"
          data_contents = data.split("\n")
          time = data_contents[0]
          ExamResult.create(time: time)
        end
        FileUtils.rm_rf result
      end
    end
    sleep 5
  end
end
def start_agent
  @start = 1
  Thread.start { check_logs }
end

def stop_agent
  @start = 0
end
However, while it's in the background, this is the error that I see coming across the console:
terminated with exception (report_on_exception is true):
Traceback (most recent call last):
	5: from portal/lib/custom_rb/exam_results/exam_custom.rb:69:in `block in start_monitoring_agent'
	4: from portal/lib/custom_rb/exam_results/exam_custom.rb:40:in `check_logs'
	3: from portal/lib/custom_rb/exam_results/exam_custom.rb:40:in `each'
	2: from portal/lib/custom_rb/exam_results/exam_custom.rb:46:in `block in check_logs'
	1: from /home/nutella/.rvm/gems/ruby-2.5.1/gems/activesupport-5.1.6/lib/active_support/dependencies.rb:202:in `const_missing'
/home/user/.rvm/gems/ruby-2.5.1/gems/activesupport-5.1.6/lib/active_support/dependencies.rb:496:in `load_missing_constant': A copy of ExamResult has been removed from the module tree but is still active! (ArgumentError)
My goal here is just to have a backgrounded process to monitor for logs. I've seen some other posts about this same exact error, but perhaps I could be doing this a little better other than the solutions provided for them.
Any thoughts or feedback would be greatly appreciated.
I don't think the provided error is related to the code above.
Usually, this error happens when you modify your classes at runtime with metaprogramming.
Take a look at the places where you require or define ExamResult; it looks like you require it several times in your code.
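If class reloading is indeed the culprit, one documented way to make a long-lived thread cooperate with Rails code reloading is to wrap its work in the framework's executor. A sketch under that assumption (process_pending_files is a hypothetical extraction of the Dir.glob body from the question; Rails 5+ API):

```ruby
def check_logs
  while @start == 1
    # The executor makes each pass participate in Rails' load/unload
    # lifecycle, so autoloaded constants like ExamResult are not yanked
    # out from under the thread between reloads. Wrapping each iteration
    # (rather than the whole loop) avoids holding the load interlock.
    Rails.application.executor.wrap do
      process_pending_files
    end
    sleep 5
  end
end
```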
I am using a background job to import user data from a CSV file into my database. At first I did this the "hard" way, by simply calling a method on my User model and passing the file path transmitted via a form file_field:
User.import_csv(params[:file].path)
Worked well locally and on production (heroku).
Now when it comes to huge CSV files, I understood that I need a job to perform this import in the background. I am familiar with redis and sidekiq so the job was built quickly.
CsvImportJob.perform_async(URI.parse(params[:file].path))
and in my worker:
def perform(file_path)
  User.import_csv(file_path)
end
Well, that also works perfect locally but as soon as I hit this on production, I see the following error in my log:
» 10 Aug 2015 13:56:26.596 2015-08-10 11:56:25.987726+00:00 app worker.1 - - 3 TID-oqvt6v1d4 ERROR: Actor crashed!
» 10 Aug 2015 13:56:26.596 2015-08-10 11:56:25.987728+00:00 app worker.1 - - Errno::ENOENT: No such file or directory # rb_sysopen - /tmp/RackMultipart20150810-6-14u804c.csv
» 10 Aug 2015 13:56:26.596 2015-08-10 11:56:25.987730+00:00 app worker.1 - - /app/vendor/ruby-2.2.2/lib/ruby/2.2.0/csv.rb:1256:in `initialize'
This is meant to be the file_path variable.
Somehow Heroku is not able to find the file when I pass its path to a Sidekiq job. Without Sidekiq, it works.
I don't really know how to tackle this issue so any help is appreciated.
I had the same experience, you can look at a similar project of mine at https://github.com/coderaven/datatable-exercise/tree/parallel_processing
(Basically just focus on object_record.rb model and the jobs: import_csv_job.rb and process_csv_job.rb)
The error: Errno::ENOENT: No such file or directory # rb_sysopen
If this works on Heroku without Sidekiq, then the path you are passing (in your example, under /tmp/) is valid at upload time.
So here are two probable problems and their solutions:
1.) You have saved the file to a path that is unknown or inaccessible to the application by the time the job runs. When you handle the CSV import without Sidekiq, the uploaded file is saved temporarily and stays available until you finish processing it. In a job scheduler like Sidekiq, however, the path must point to a file that still exists and is accessible to the worker when the job actually executes.
Solution: Save the file to storage somewhere. Heroku has an ephemeral filesystem, so you cannot persist files from the running web app; to work around this, upload the file to an Amazon S3-like service (you can also use Google Drive, as I did) and then give the path to your Sidekiq worker so it can access and process it later.
2.) If the paths are correct and the files are saved correctly, then from my experience the problem may be that you are using File.open instead of open-uri's open method. File.open does not accept remote files; you need to require open-uri in your worker and use its open method to handle remote files.
e.g.:
require 'open-uri'

class ProcessCsvJob < ActiveJob::Base
  queue_as :default

  def perform(csv_path)
    csv_file = open(csv_path, 'rb:UTF-8')
    SmarterCSV.process(csv_file) do |array|
      .... code here for processing ...
    end
  end
end
I'm fully aware this question is almost a year old, so if you have already solved it, this answer can also serve as a documentation archive for those who run into the same problem.
You can't pass a file object to the perform method.
The fix is to massage the data beforehand and pass in the parameters you need directly.
Something like...
def import_csv(file)
  CSV.foreach(file.path, headers: true) do |row|
    new_user = { email: row[0], password: row[1] }
    CsvImportJob.perform_async(new_user)
  end
end
Note: you'd call CsvImportJob.perform_later for Sidekiq with ActiveJob and Rails 5.
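One gotcha with that approach: Sidekiq serializes job arguments as JSON, so a hash pushed with symbol keys arrives in perform with string keys. A minimal, Sidekiq-free simulation of that round trip:

```ruby
require 'json'

new_user = { email: 'a@example.com', password: 'secret' }

# Simulate Sidekiq's argument serialization: to JSON and back again.
deserialized = JSON.parse(JSON.generate(new_user))

deserialized['email']  # => "a@example.com"
deserialized[:email]   # => nil — symbol keys do not survive the round trip
```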
You got the error because in production/staging, the web app and Sidekiq run on different servers.
Use my solution: upload the CSV to Google Cloud Storage.
class Services::Downloader
  require 'fog'

  StorageCredentials = YAML.load_file("#{::Rails.root}/config/g.yml")[Rails.env]

  def self.download(file_name, local_path)
    storage = Fog::Storage.new(
      provider: "Google",
      google_storage_access_key_id: StorageCredentials['key_id'],
      google_storage_secret_access_key: StorageCredentials['access_key'])
    storage.get_bucket(StorageCredentials['bucket'])
    f = File.open(local_path)
    storage.put_object(StorageCredentials['bucket'], file_name, f)
    storage.get_object_https_url(StorageCredentials['bucket'], file_name, Time.now.to_f + 24.hours)
  end
end
The User class:
class User < ApplicationRecord
  require 'csv'
  require 'open-uri'

  def self.import_data(file)
    load_file = open(file)
    data = CSV.read(load_file, { encoding: "UTF-8", headers: true, header_converters: :symbol, converters: :all })
    ...
The worker:
class ImportWorker
  include Sidekiq::Worker
  sidekiq_options queue: 'workers', retry: 0

  def perform(filename)
    User.import_data(filename)
  end
end
And the code to start the worker:
path = Services::Downloader.download(zip.name, zip.path)
ImportWorker.perform_async(path)
I have a block of code that creates Tempfiles:
@tmp_file = Tempfile.new("filename")
I keep them closed after creation:
@tmp_file.close unless @tmp_file.closed?
When there is a need to add data to the temp files, I open them and add data as below:
def add_row_to_file(row)
  @tmp_file.open
  @tmp_file.read
  @tmp_file.print(row.to_json + "\n")
end
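For context, Tempfile#open really does reopen a closed tempfile in place, which is what the production path above relies on (a standalone sketch):

```ruby
require 'tempfile'
require 'json'

tmp = Tempfile.new('demo')
tmp.close
was_closed = tmp.closed?       # => true

tmp.open                       # Tempfile#open reopens the same file ("r+")
tmp.print({ 'a' => 1 }.to_json + "\n")
tmp.rewind
contents = tmp.read            # => "{\"a\":1}\n"
tmp.close!                     # close and delete the underlying file
```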
All is well, but for testing I have stubbed Tempfile as below, and it raises an error when a test case runs into add_row_to_file(row):
buffers = {}
Tempfile.stub(:new) do |file_name|
  buffer = StringIO.new
  buffers[file_name] = buffer
end
Error message is :
Failure/Error: ]],
NoMethodError:
private method `open' called for #<StringIO:0x00000010b867c0>
I want to keep the temp files closed on creation because of the OS-level limit on open files (I have to deal with uploading a lot of tempfiles to S3),
but in tests I have the problem of open being a private method on StringIO.
Any idea how to solve this problem? Thanks.
I have a workaround, which is to skip closing the StringIO when in the test environment:
@tmp_file.close unless @tmp_file.closed? || Rails.env.test?
and to update add_row_to_file(row) as below:
def add_row_to_file(row)
  @tmp_file.open unless Rails.env.test?
  @tmp_file.read unless Rails.env.test?
  @tmp_file.print(row.to_json + "\n")
end
Apart from the workaround provided, if we want 100% code coverage in test we can try the below.
Do not close the stream if it is a StringIO:
@tmp_file.close unless @tmp_file.is_a?(StringIO)
That way we don't need to reopen it while in test:
def add_row_to_file(row)
  @tmp_file = File.open(@tmp_file, 'a+') unless @tmp_file.is_a?(StringIO)
  @tmp_file.print(row.to_json + "\n")
end
This way we can test the Tempfile logic without actually creating files in the test environment, while keeping 100% code coverage.
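And a Rails-free sanity check that the StringIO double behaves like the tempfile for the write path (a minimal sketch):

```ruby
require 'stringio'
require 'json'

# A StringIO accepts the same print/read calls the production code makes
# on the Tempfile, without touching the filesystem.
buffer = StringIO.new
row = { 'id' => 1 }
buffer.print(row.to_json + "\n")
buffer.rewind
line = buffer.read  # => "{\"id\":1}\n"
```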