E-mailing users through RoR app

I am at the tail end of building a forum/Q&A community-based application, and I would like to add email notifications. The app has several different entities, including threads, questions, projects, photos, etc. The goal is that a user can "subscribe" to any number of these entities, queuing an e-mail whenever the entity receives new comments or activity. This functionality is very similar to Facebook and to typical forums.
I have looked into ActionMailer (with rake tasks and delayed jobs), MailChimp API (and plugins), and other app mailers (PostageApp and Postmark).
I am leaning against ActionMailer, because of potential issues with memory hogging and server overload. The app will be running on Heroku, but I'm afraid the servers could be easily overwhelmed sending out potentially hundreds of emails every few minutes.
Another complexity is that there will be different types of subscriptions (instant email notification, daily email notification) based on user preference.
What would be the best way to manage email for functionality like this? Any tips/ideas are greatly appreciated!

You can use ActionMailer to send with SendGrid, or Postmark. PostageApp still needs an SMTP server and adds an additional dependency, but it can be nice to have. MailChimp is for newsletters only I believe, so that's probably not much use for you here.
Giving a high-level overview here, a few things are important:
Keep mailer logic from cluttering controllers.
Prevent delaying responses to user requests.
Avoid issues with "application overload".
Handle event-based and periodic emails.
To address #1, you will want to use an Observer to decide when to send an event-based email. To account for #2 and #3, you can drop in DelayedJob so that emails are sent in the background. You can use SendGrid with ActionMailer on Heroku pretty easily (especially if you drop in Pony). For #4 you should just create a rake task that handles the business logic of deciding who to email and queues the send jobs as DJ tasks like the Observer would.
And just to be clear, DelayedJob will execute jobs in a separate process. In the case of Heroku, you're actually running each DelayedJob worker in a different Dyno, which is an entirely separate stack/environment (quite probably on a different physical server). There won't be any issues with your app getting overloaded this way, unless of course your database can't keep up with adding jobs (in which case you can use Redis as a DJ store instead). You can easily scale horizontally by adding more DJ workers as needed.
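To make #1-#4 concrete, here is a minimal sketch of the Observer + DelayedJob approach. The model, mailer, and subscription names are all assumptions, and the observer would still need to be registered via config.active_record.observers:

    # Event-based (#1): the observer keeps mailer logic out of the controllers,
    # and delayed_job's .delay pushes the actual send into a worker (#2, #3).
    class CommentObserver < ActiveRecord::Observer
      def after_create(comment)
        comment.subscribers.each do |user|   # 'subscribers' is assumed
          Notifier.delay.new_comment_email(user.id, comment.id)
        end
      end
    end

    # Periodic (#4): a rake task run by cron or Heroku Scheduler queues digests.
    namespace :notifications do
      task daily_digest: :environment do
        User.where(subscription: "daily").find_each do |user|
          Notifier.delay.daily_digest_email(user.id)
        end
      end
    end

Passing ids rather than whole objects keeps the serialized jobs small, and the worker re-fetches fresh records when the job actually runs.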

Take a look at SimpleWorker, a cloud-based background processing / worker queue.
It's an add-on for Heroku and is able to scale up and out to handle a set of one-time posts or scheduled emails. (A master job, for example, can be scheduled, and when it comes up to run, it queues up tens, hundreds, or thousands of jobs to run concurrently across a scaled-out infrastructure.)
Heroku workers can work fine, given that they run as separate processes, but if you have variable load, you want a service that can scale up and down with the jobs -- so you a) don't pay for unused capacity and b) can handle bursts of traffic and batch output.
(Disclosure: I work for the company.)

Related

Options for managing ActionMailer-generated queues

An application, hosted as the only application on a server, will be handling e-mails for large numbers of users who will launch groups of mailings. Most other processing by the application is not very intense.
While the volumes of mail will not be massive, they are significant: in the thousands per day. Mails will mostly be sent as individual items following an action that involves multiple mail recipients; a lag will occur between individual items and within sub-groups of the mail recipients.
In other words, each mail can have a calculation as to the time when it should be issued.
There are multiple options for handling queues, which I would group into two categories.
a) RAM-based objects. These have the disadvantage of losing the queues if something happens to the server.
b) Database-based objects. These require more processing. (I can only think of a mechanism whereby the mails are stored with their release time, and a cron job (scheduler gem) checks every minute for unreleased mails; where a mail's datetime is < Time.now, it is sent off and its 'released' attribute updated. A sketch of this follows below.)
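For illustration, a minimal sketch of option (b), assuming a hypothetical OutboundMail model with release_at and released columns and a UserMailer (all names are illustrative):

    # Each row stores its computed send time; 'released' marks mails sent.
    class OutboundMail < ActiveRecord::Base
      scope :due, -> { where(released: false).where("release_at <= ?", Time.now) }
    end

    # Rake task invoked every minute by cron (or the whenever/clockwork gems):
    namespace :mailer do
      task release_due: :environment do
        OutboundMail.due.find_each do |mail|
          UserMailer.notification(mail).deliver  # deliver_now on Rails 4.2+
          mail.update(released: true)
        end
      end
    end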
Not having experience with any of the queuing options, my question is: based on your experience, which option and which ActiveJob adapter (or none!) make the most sense given the context, while containing complexity?

Should I make a separate app to send push notifications to 40-50k users in a RoR app, or use background jobs?

I have a Rails application that is in fact the backend of a popular iOS application with a user base of 200k users who need to be notified from time to time.
Daily, 40-50k users will be notified using push notifications. These will be both realtime and scheduled; e.g., if a new user signs up he will be notified within a few seconds, while scheduled notifications run at 10 pm daily for a limited set of users, ranging from 10k-30k and sometimes up to 100k.
I will also be doing business reporting to generate lists of users fulfilling certain criteria, which requires firing MySQL queries that can take up to 1-2 minutes.
My area of concern is: should I have a separate application with a separate mirror DB for sending push notifications, so my iOS users don't feel lag while using the application when push notifications or a business reporting query are triggered?
Or should I use background jobs like Rails Active Job, Sidekiq, or Sucker Punch to send the push notifications and trigger the business reporting queries?
Are background jobs in Rails powerful enough to manage this load without app users feeling any lag?
My application stack is:
Rails: 4.1.6
Ruby: 2.2
DB: MySQL
PaaS: AWS Elastic Beanstalk
iOS push gem: Houston
In my opinion, there are several factors that affect my decision.
1. Does your service need to keep many persistent connections?
If your answer is YES, then use another language which has better asynchronous IO (like Node.js) to implement your push service.
If your answer is NO, which means you only send requests to third-party services (like APNS), then consider the next factor.
2. Do you have to reuse your domain model in your push service?
If your answer is YES, then stick to Active Job + Sidekiq (see the sketch after this list).
If your answer is NO, which means you only use some fields (like id, name) of some table (like users), then consider the next factor.
3. Does your server have a limited memory resource?
A Rails process often consumes several hundred MB of memory, and Sidekiq requires a separate Rails process that can't be preforked (which means it does not share memory with your Rails app).
So if your answer is YES, then consider creating a separate lightweight push service.
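For the factor-2 route, here is a rough sketch of a Sidekiq worker using the Houston gem from the question's stack. The certificate path, queue name, and enqueue call are assumptions; note also that Active Job only ships with Rails 4.2+, so on Rails 4.1.6 you would use the plain Sidekiq worker API directly, as shown:

    class PushNotificationWorker
      include Sidekiq::Worker
      sidekiq_options queue: :push, retry: 3

      def perform(device_token, message)
        client = Houston::Client.production
        client.certificate = File.read("config/apns_certificate.pem") # assumed path

        notification = Houston::Notification.new(device: device_token)
        notification.alert = message
        client.push(notification)
      end
    end

    # Enqueue on signup, e.g. from an after_create callback:
    # PushNotificationWorker.perform_async(user.device_token, "Welcome!")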
As for the mirror database: if I had to run heavy queries before pushing, I would definitely use a mirror database.

One Delayed_job per email vs. delayed_job for all emails?

As part of my app I am sending out an email to many users daily. Depending on their status they will be sent one of five possible types of emails.
The logic that determines which email the user receives is fairly long.
Should I:
1) create a delayed_job for each email, or
2) send the entire logic (50 lines of Ruby) with the send commands into a single job?
What are the pros/cons of either approach?
Further to Sabyasachi Ghosh's answer, here are the differences between DelayedJob and Resque:
DelayedJob:
Relies on the DB
Requires ActiveRecord
Uses Ruby objects (not just references)
Has much deeper queuing functionality (queue depth etc.)
Runs much heavier than Resque
Resque:
Relies on Redis
Lightweight
Runs independently of ActiveRecord
Is meant to process references (not entire objects)
Modularity
In answer to your question, I would look at modularity
Rails is based on the principle of DRY code, which essentially means you should be as modular as possible (reusing code wherever you can). This leads to efficiency and simpler development cycles.
In light of this, you have to observe your queueing functionality from the perspective of modularity. What does the queuing system actually do?
It queues things
Therefore, you want to include as little code as possible in the queuing system
I would create a Redis instance (you can get them on Heroku) and use Resque to queue specific information (such as an id or email address).
This will allow you to use Resque to run through the Redis list, sending as many emails as you need.
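A minimal sketch of that reference-based approach (the class and mailer names are made up):

    # Resque worker: receives only a user id, looks the record up, and sends.
    class EmailSender
      @queue = :email

      def self.perform(user_id)
        user = User.find(user_id)
        # The ~50 lines of "which of the five emails?" logic live here,
        # in one place, outside the queueing system itself.
        UserMailer.status_email(user).deliver
      end
    end

    # Enqueue one lightweight job per user:
    # User.find_each { |user| Resque.enqueue(EmailSender, user.id) }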
If you have a lot of logic and need to send a bunch of emails, I recommend not putting it all in a delayed job; better to use Resque (https://github.com/resque/resque) or Sidekiq (http://sidekiq.org/), since while sending email Delayed Job will lock your database and your performance will suffer.
If you have only a little logic and a small number of emails, just go with a delayed job for each email, as it is easy to set up and implement.
I think you should send each email via its own delayed job: if anything happens to a big combined job, like it crashing or being stopped, then when the system re-executes it you can get problems (such as re-sending emails that already went out), so I suggest adding each email as its own delayed job.

Setting up queues for resque workers with multiple similar jobs

I have a number of different event-driven emails being sent with action mailer (ex. send an email when a user follows you etc.) and I need to move all of this into resque workers. My question is what is the best way to set up these workers? Should I create a separate file for each type of email being sent, or would it make sense to make one file for all emails and put them each in different classes within that file? The latter makes more sense to me, and if I do that, should I assign all of the emails to the same queue or different queues?
This is a very subjective question as it really depends on your volume, setup, etc.
For my main project that handles a bunch of e-mail, I have a single Rails mailer class that handles all my notifications, and in turn, I have a single Resque worker to handle mailing those out, all tied to my :email queue.
You probably wouldn't really need to worry about multiple queues until performance became an issue and you needed to start giving higher priority to certain queues, etc.
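For illustration, a rough sketch of that single-mailer, single-worker setup (all names here are invented):

    # One mailer class holding every notification email:
    class Notifications < ActionMailer::Base
      def new_follower(user_id, follower_id)
        @follower = User.find(follower_id)
        mail to: User.find(user_id).email, subject: "You have a new follower"
      end
      # ...each other notification becomes another method on this class
    end

    # One Resque worker on the :email queue, dispatching by mailer method name:
    class NotificationJob
      @queue = :email

      def self.perform(mailer_method, *args)
        Notifications.send(mailer_method, *args).deliver
      end
    end

    # Resque.enqueue(NotificationJob, :new_follower, user.id, follower.id)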

Practical use of delayed background job when dealing with many users

When a background job starts, it's sent to the back of a queue where a worker handles it; one task clears and the next starts. I think I've got this one right, except I don't understand the practical side of it in some cases. Sure, if you're a company sending out 15,000 newsletters once a week, using a delayed job makes perfect sense. But when you have an application of even 100 users, in which some task is long enough to need background work (like sending/fetching emails that might take a minute), then each user will have to wait in line while another user's task gets cleared (in the case of a single worker).
This is the part I'm not sure I'm getting right. I'm talking about the same job, but run individually for each user. Does that count as a job per user? If I have 100 users, do I need to keep 100 workers so that no one's process gets tied up?
I've tried using delayed_job to simulate that, and indeed when I sign in with a different account I have to wait until another user's email gets sent until mine is. While the plugin is swift and simple to work with, I think it's not the right approach here.
I've also tried using Ajax, but since it's an HTTP request it ties up the browser in loading mode until it gets a response from the server (even with async: true). Not sure if I ruled this one out too quickly, but I was sort of looking for a more elegant server-side solution.
Is there a way to achieve a background job like this? (I've heard of different, mostly commercial solutions promising little waiting time, but I'm interested in completely eliminating the queue between users.) If not, is there a method to make an Ajax request without waiting for a response? I realize these two questions are drastically different, but each seems like an appropriate solution to this problem.
Resque is a background processing engine that can support multiple queues.
Ways you could use this:
Group your tasks into queues that make sense given their priority. If you need fast response times, put the task in a 'foreground' queue; slow tasks (like sending/receiving emails) can go in a 'background' queue (a sketch of this follows below).
Have one queue per user (you will need to have many many workers for this)
This SO question also gives a way to use delayed_job with multiple queues/tables.
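As a sketch of the first approach: with Resque the queue is just a class attribute, and a worker started with a comma-separated queue list drains them in priority order (the job names here are hypothetical):

    # Fast, user-visible work goes on the 'foreground' queue:
    class SendWelcomeEmailJob
      @queue = :foreground
      def self.perform(user_id)
        # quick, latency-sensitive work for this user
      end
    end

    # Slow work (e.g. fetching a user's mailbox) goes on 'background':
    class FetchEmailsJob
      @queue = :background
      def self.perform(user_id)
        # slow IMAP fetch for this user
      end
    end

    # Start a worker that always drains foreground before background:
    #   QUEUE=foreground,background rake resque:work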
The purpose of delayed_job and other message queues is to asynchronously process jobs outside of your core application. I always use a queue for sending email since I'm relying on an outside application (sometimes a third-party API like Gmail) to send them, and I can't guarantee its availability or operating efficiency.
So for your use case, even with very few users, I highly recommend offloading emails to delayed_job. This will speed up your front end (ajax) and will also give you retries upon failure. You could spin up multiple workers to process the queue, but it shouldn't be necessary with your numbers unless your calls to send mail are taking a really long time (more than a couple seconds?).
And yes in most situations I'd create separate jobs for each user even though the message might be identical. The only time I'd process them all together would be if the email application / API has bulk sending and you can reduce the number of calls significantly by sending a large payload in a few calls.
