Error: too many emails per second on Heroku

I have a Rails 4 app on Heroku sending emails via deliver_later. Sidekiq is running and working, and I have config.active_job.queue_adapter = :sidekiq in my config.
I also see [ActiveJob] [ActionMailer::DeliveryJob] in the logs, which makes me believe emails are being sent as background jobs.
So why do I still get errors about too many emails per second?
Net::SMTPUnknownError: could not get 3xx (550: 550 5.7.0 Requested action not taken: too many emails per second
I just noticed I have Sidekiq concurrency set to 3. Maybe that's the problem?

This is not going to be related to Sidekiq (well, not directly, anyway). It relates to whatever SMTP server you are using and that server's rate limiting. If you have multiple instances of Sidekiq firing off emails near simultaneously, this error can get thrown when you have a relatively low rate limit.
For instance, if you are using Mailtrap as an SMTP server in a development or staging environment, a free account is limited to 2 emails per second, so it's pretty easy to run into this error.
If you can't avoid running into this issue for your purposes, you can try a few different strategies using the wait keyword argument of ActionMailer's deliver_later method, e.g.:
users_i_need_to_email.each_with_index do |user, index|
  t = 5 * index
  UserMailer.send_important_stuff(user).deliver_later(wait: t.seconds)
end
In the event your deliver_later calls are happening in different processes or flows, you can get away with randomizing the delivery time (say, some random delay of up to 5 minutes).
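For example (a sketch; the 5-minute cap is arbitrary):
# Spread deliveries out by a random delay of up to 5 minutes
UserMailer.send_important_stuff(user).deliver_later(wait: rand(1..300).seconds)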
Obviously, neither solution above is foolproof, so you are best off using an SMTP server that is adequate for your purposes.

Definitely seems to be an issue with Sidekiq concurrency set to 3.
Additional info: setting concurrency to 1 fixes the issue, but I don't want all jobs to be processed by a single worker. It would be great if Sidekiq let you set concurrency per queue, so emails could use a single worker while other jobs could use multiple workers, but the Sidekiq author has said that's not possible.
I found a solution to my particular problem on Heroku, which involves putting each queue in its own Heroku dyno; that lets you set the concurrency for each dyno separately.
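For reference, the per-queue dyno approach boils down to a Procfile along these lines (a sketch; the queue names and concurrency values are illustrative):
worker: bundle exec sidekiq -c 5 -q default
mailer: bundle exec sidekiq -c 1 -q mailers
The mailer dyno runs its queue with a concurrency of 1, so deliveries are serialized without slowing down the other queues.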

Related

ActiveJob queue limiting

Seems like the task is simple and straightforward: I need to limit the number of jobs that can be performed at the same time so my server won't blow up. But Google is silent; perhaps I'm doing something wrong? Enlighten me, please.
I use the standard async adapter.
It's not recommended to use the default Rails async adapter in production, especially on Heroku dynos, which restart at least once per day.
For enqueuing and executing jobs in production you need to set up a
queuing backend, that is to say you need to decide on a 3rd-party
queuing library that Rails should use. Rails itself only provides an
in-process queuing system, which only keeps the jobs in RAM. If the
process crashes or the machine is reset, then all outstanding jobs are
lost with the default async backend. This may be fine for smaller apps
or non-critical jobs, but most production apps will need to pick a
persistent backend.
There are plenty of supported adapters to choose from, such as:
Sidekiq
Resque
Delayed Job
They're easy to get started with; each provides clear instructions and examples.
If you would like to use the default async adapter in development and want to limit the maximum number of jobs executed in parallel, you can add the following line to your config/application.rb file:
config.active_job.queue_adapter = ActiveJob::QueueAdapters::AsyncAdapter.new(min_threads: 1, max_threads: 2, idletime: 600.seconds)
So in this case at most two jobs will run at the same time. I think the default maximum is twice the number of processors.
I have a case where I need to limit it to one, and that works just fine (Rails 7).
Source: https://api.rubyonrails.org/classes/ActiveJob/QueueAdapters/AsyncAdapter.html
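If you want to see the cap in action, you can enqueue a few slow jobs (a sketch; SlowJob is a hypothetical name):
class SlowJob < ApplicationJob
  def perform(n)
    sleep 5
    Rails.logger.info "SlowJob #{n} finished"
  end
end

5.times { |n| SlowJob.perform_later(n) }
With max_threads: 2, at most two of the five jobs will be running at any moment.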

RoR: multiple calls in a row to the same long-response-time controller

Update:
Read "Indicate to an ajax process that the delayed job has completed" before if you have the same problem. Thanks Gene.
I have a problem with concurrency. I have a controller scraping a few web sites, but each call to my controller needs about 4-5 seconds to respond.
So if I call 2 (or more) times in a row, the second call needs wait for the first call before starting.
So how I can fix this problem in my controller? Maybe with something like EventMachine?
Update & Example:
application_controller.rb
def func1
  i = 0
  while i <= 2
    puts "func1 at: #{Time.now}"
    sleep(2)
    i += 1
  end
end

def func2
  j = 0
  while j <= 2
    puts "func2 at: #{Time.now}"
    sleep(1)
    j += 1
  end
end
whatever_controller.rb
puts ">>>>>>>> Started At #{Time.now}"
func1()
func2()
puts "End at #{Time.now}"
Now I request http://myawesome.app/whatever several times at the same time from the same user/browser/etc.
I tried Heroku (and local) with Unicorn but without success; this is my setup:
unicorn.rb http://pastebin.com/QL0wdGx0
Procfile http://pastebin.com/RrTtNWJZ
Heroku setup https://www.dropbox.com/s/wxwr5v4p61524tv/Screenshot%202014-02-20%2010.33.16.png
Requirements:
I need a RESTful solution. This is an API, so it needs to respond with JSON.
More info:
Right now I have 2 cloud servers running:
Heroku with Unicorn
Engine Yard Cloud with Nginx + Passenger
You're probably using WEBrick in development mode. WEBrick only handles one request at a time.
You have several options; many Ruby web servers exist that can handle concurrent requests.
Here are a few of them.
Thin
Thin was originally based on Mongrel and uses EventMachine to handle multiple concurrent connections.
Unicorn
Unicorn uses a master process that dispatches requests to web workers; 4 workers means 4 possible concurrent requests.
Puma
Puma is a relatively new Ruby server; its shiny feature is that it handles concurrent requests in threads. Make sure your code is thread-safe! (A minimal config sketch follows this list.)
Passenger
Passenger is a Ruby server bundled inside Nginx or Apache; it's great for production and development.
Others
These are a few alternatives; many others exist, but I think these are the most used today.
To use any of these servers, please check their instructions. They are generally available in each project's GitHub README.
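As an illustration, a threaded Puma setup might look like this (a minimal sketch of config/puma.rb; the worker and thread counts are illustrative, not tuned values):
workers 2           # forked worker processes
threads 1, 5        # min and max threads per worker; your code must be thread-safe
preload_app!        # load the app before forking so workers share memory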
For any long-response-time controller action, the delayed_job gem is a fine way to go. While it is often used for bulk mailing, it works just as well for any long-running task.
Your controller starts the delayed job and responds immediately with a page that has a placeholder - usually a graphic with a progress indicator - and Ajax or a timed reload that updates the page with the full information when it's available. Some information on how to approach this is in this SO article.
Not mentioned in the article is that you can use Redis or some other memory cache to store the results rather than the main database.
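A minimal sketch of that pattern, using delayed_job's Struct-based jobs (ScrapeJob, ScrapeResult, and do_scrape are hypothetical names):
ScrapeJob = Struct.new(:result_id) do
  def perform
    result = ScrapeResult.find(result_id)
    result.update!(data: do_scrape(result), status: "done")  # runs in the worker process
  end
end

class ScrapesController < ApplicationController
  def create
    result = ScrapeResult.create!(status: "pending")
    Delayed::Job.enqueue(ScrapeJob.new(result.id))
    # Respond immediately; the placeholder page polls #show until status is "done"
    render json: { id: result.id, status: result.status }, status: :accepted
  end

  def show
    result = ScrapeResult.find(params[:id])
    render json: { id: result.id, status: result.status, data: result.data }
  end
end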
The answers above are part of the solution: you need a server environment that can properly dispatch concurrent requests to separate workers; Unicorn and Passenger can both do this by creating workers in separate processes or threads. This allows many workers to sit around waiting without blocking other incoming requests.
If you are building a typical bot whose main job is to get content from other sources, these solutions may be fine. But if what you need is a simple controller that can accept hundreds of concurrent requests, all of which send independent requests to other servers, you will need to manage threads or processes yourself. Your goal is to have many workers waiting to do a simple job, and one or more masters whose job is to send requests and then be there to receive the responses. Ruby's Thread class is simple and works well for cases like this with Ruby 2.x or 1.9.3.
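A sketch of that fan-out idea with plain threads (the URLs are placeholders):
require "net/http"

urls = ["http://example.com/a", "http://example.com/b"]
threads = urls.map do |url|
  Thread.new { Net::HTTP.get(URI(url)) }  # each thread blocks on I/O independently
end
bodies = threads.map(&:value)  # Thread#value joins each thread and returns its result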
You would need to provide more detail about what you are trying to do to get a more specific solution.
Try something like Unicorn, as it handles concurrency via workers. If there's a lot of work to be done per request, something else to consider is spinning up a delayed_job per request.
The one issue with delayed_job is that the response won't be synchronous, meaning it won't be returned to the user's browser.
However, you could have the delayed job save its response to a table in the DB. Then you can query that table for each request and its related response.
What Ruby version are you using?
Ruby & Webserver
Ruby
If it's a simple application, I would recommend the following: try Rubinius (rbx) or JRuby, as they are better at concurrency. They have drawbacks, though: they're not mainline Ruby, so some extensions won't work. But if it's a simple app you should be fine.
Webserver
Use Puma, or Unicorn if you have the patience to set it up.
If your app is hitting an API service
You indicate that the global lock is killing you when you are scraping other sites (presumably ones that allow scraping). If this is the case, something like Sidekiq or delayed_job should be used, but with caution: make those jobs idempotent, i.e. safe to run multiple times. If you start hitting a website repeatedly, you will reach its rate limit pretty quickly; e.g. Twitter limits you to 150 requests per hour. So use background jobs with caution.
If you're the one serving the data
However, reading your question, it sounds like your controller is the API and the lock is caused by users hitting it.
If that's the case, you should use dalli + memcached to serve your data. That way you won't be I/O-bound by SQL lookups, since memcached is memory-based. MEMORY SPEED > I/O SPEED
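As an illustration (a sketch assuming config.cache_store = :mem_cache_store, which uses dalli, and an illustrative Item model):
def show
  # Serve from memcached when possible; fall back to the DB on a cache miss
  payload = Rails.cache.fetch("api/items/#{params[:id]}", expires_in: 5.minutes) do
    Item.find(params[:id]).as_json
  end
  render json: payload
end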

How do I handle long requests for a Rails App so other users are not delayed too much?

I have a Rails app on a free tier on Heroku and it recently started getting some users. One of the events in my app involves querying another API and can take up to 10 seconds to finish. How do I make sure other users who visit a simple page at the same time (as another user's API event) don't need to wait 10 seconds for their page to load?
Do I need to pay for more dynos? Is this something that can be solved with the delayed_job gem? Would another host (like AppFog or OpenShift) be able to handle simultaneous requests faster?
Update:
This question suggests manually handling threads instead of using delayed_job.
That sounds like a Delayed Job situation. If the first request is just waiting, the most efficient thing to do is assign a process to wait for it to complete and cut the Rails process loose to handle another request.
Yes, you need more dynos, especially worker dynos; those are the ones that do work in the background. This RailsCast on background jobs can also help:
http://railscasts.com/episodes/366-sidekiq
Also, here is a quick tutorial on adding Unicorn with multiple workers to your free Heroku instance:
https://devcenter.heroku.com/articles/rails-unicorn
You divide your dyno into two or more worker processes, and then each one can handle a different request.
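The gist of that tutorial is a config/unicorn.rb along these lines (a sketch; the worker count and timeout are illustrative):
worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3)  # workers per dyno
timeout 15
preload_app true

before_fork do |_server, _worker|
  # Disconnect so each forked worker opens its own DB connection
  ActiveRecord::Base.connection.disconnect! if defined?(ActiveRecord::Base)
end

after_fork do |_server, _worker|
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)
end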
What kind of app server are you using? If you are using Passenger or Unicorn, you can have multiple worker processes that handle simultaneous requests:
http://www.modrails.com/documentation/Users%20guide%20Apache.html#_passengermaxinstancesperapp_lt_integer_gt

Performance issue with rake task: would adding a worker dyno resolve the issue?

In our application we use a rake task to send emails to around 11,000 users. Each email is sent as a delayed job, like so:
users.each do |a|
  a.delay.send_email(body, text)
end
It was working perfectly two weeks back and suddenly slowed down. It used to send all those emails within a single day, but currently it takes much longer.
We have tried to track down this performance issue but haven't found anything so far:
1. We investigated the code and tried with a single delayed job. We commented out the part that reads from the DB, etc., but it takes the same amount of time.
2. We tried commenting out the email-sending part, but the time taken to execute the delayed job was the same.
Later on we noticed the Heroku worker dyno. We currently have 1 worker and 2 web dynos. Is that the reason it is getting delayed? If so, how was it working before? Would adding more workers improve performance?

E-mailing users through RoR app

I am at the tail end of building a forum/Q&A community-based application, and I would like to add email notifications. The app has several different entities, including: threads, questions, projects, photos, etc. The goal is that a user can "subscribe" to any number of these entities, queuing an e-mail whenever the entity receives new comments or activity. This functionality is very similar to facebook and forums.
I have looked into ActionMailer (with rake tasks and delayed jobs), MailChimp API (and plugins), and other app mailers (PostageApp and Postmark).
I am leaning away from ActionMailer because of potential issues with memory hogging and server overload. The app will be running on Heroku, but I'm afraid the servers could easily be overwhelmed sending out potentially hundreds of emails every few minutes.
Another complexity is that there will be different types of subscriptions (instant email notification, daily email notification) based on user preference.
What would be the best way to manage email for functionality like this? Any tips/ideas are greatly appreciated!
You can use ActionMailer to send with SendGrid or Postmark. PostageApp still needs an SMTP server and adds an additional dependency, but it can be nice to have. MailChimp is for newsletters only, I believe, so that's probably not much use for you here.
Giving a high level overview here, a few things are important:
1. Keep mailer logic from cluttering controllers.
2. Prevent delaying responses to user requests.
3. Avoid issues with "application overload".
4. Handle event-based and periodic emails.
To address #1, you will want to use an Observer to decide when to send an event-based email. To account for #2 and #3, you can drop in DelayedJob so that emails are sent in the background. You can use SendGrid with ActionMailer on Heroku pretty easily (especially if you drop in Pony). For #4, just create a rake task that handles the business logic of deciding who to email and queues the send jobs as DJ tasks, the same way the Observer would.
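For example, an Observer that queues notification emails through DelayedJob might look like this (a sketch; CommentObserver and NotificationMailer are illustrative names, and Rails-3-era observers are assumed):
class CommentObserver < ActiveRecord::Observer
  def after_create(comment)
    comment.thread.subscribers.each do |user|
      # delayed_job's .delay enqueues the mailer call and delivers it in a worker
      NotificationMailer.delay.new_comment_email(user.id, comment.id)
    end
  end
end
Remember to register the observer with config.active_record.observers = :comment_observer.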
And just to be clear, DelayedJob will execute jobs in a separate process. In the case of Heroku, each DelayedJob worker runs in a different dyno, which is an entirely separate stack/environment (quite possibly on a different physical server). There won't be any issues with your app getting overloaded this way, unless of course your database can't keep up with adding jobs (in which case you can use Redis as a DJ store instead). You can easily scale horizontally by adding more DJ workers as needed.
Take a look at SimpleWorker, a cloud-based background processing / worker queue.
It's an add-on for Heroku and is able to scale up and out to handle a set of one-time posts or scheduled emails. (A master job, for example, can be scheduled, and when it comes off schedule to run, it queues up tens, hundreds, or thousands of jobs to run concurrently across a scaled-out infrastructure.)
Heroku workers can work fine given that they run as separate processes, but if you have variable load, you want a service that can scale up and down with the jobs -- so you a) don't pay for unused capacity and b) can handle burst traffic and batch output.
(Disclosure: I work for the company.)
