I'm interested in creating a system that can queue user ids into categories and then poll values at regular intervals in order to run some code with them.
I'm unsure how to do this in Rails. My first thought would be to have some sort of temporary DB table that stores the ids alongside their categories and resets if the server restarts. I have no idea how I would implement the background process that repeatedly processes entries. Could I achieve all of this with some sort of background worker?
There are many ways to execute background jobs in Rails; try any of these:
sidekiq
delayed_job
microservices
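For the polling part specifically, here is a minimal sketch with sidekiq, assuming a QueuedUser ActiveRecord model with user_id and category columns (both hypothetical names); the worker re-enqueues itself to poll at a fixed interval:

require 'sidekiq'

class ProcessQueuedUsersJob
  include Sidekiq::Worker

  def perform(category)
    # Process and clear every entry currently queued for this category.
    QueuedUser.where(category: category).find_each do |entry|
      # ... run your code with entry.user_id here ...
      entry.destroy
    end
  ensure
    # Poll again in 5 minutes. Gems like sidekiq-scheduler can do this
    # declaratively instead of having the job re-enqueue itself.
    self.class.perform_in(5.minutes, category)
  end
end

Kick it off once per category with ProcessQueuedUsersJob.perform_async("some_category") and it keeps polling from there.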
My Rails app syncs calendar events from Gmail through the Nylas API. I am storing all the events and associated calendars in my app (either creating new records or updating existing ones). It takes a very long time; in fact, I get timeout errors on my Heroku-hosted Rails app whenever I try to sync a calendar, and I'm not sure why. To address this, I want to either start caching the data (using Redis or Memcached, though I still don't know exactly how) or run the sync in a background job (using Delayed_Job or Resque).
I wanted to know how others would tackle this problem. I'd appreciate feedback not only on which approach to take, but also pointers on how to implement it.
If you need fast, persistent access within your app to a large set of calendar events that are ultimately sourced from an external system, then I'd create (in fact, have already created) models for the calendars and the events. The structure ought to be fairly obvious from the structure of the API, and I would persist them in your database so you can use ActiveRecord methods to retrieve and sort them.
It's unlikely that you'd need a caching layer on top of the model.
Synchronisation is definitely a background job.
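A rough sketch of what that job could look like with ActiveJob; Event, remote_id, and the payload fields are illustrative names, and fetch_events_from_nylas is a placeholder for your actual Nylas client call, not the real gem API:

class CalendarSyncJob < ActiveJob::Base
  queue_as :default

  def perform(calendar_id)
    fetch_events_from_nylas(calendar_id).each do |payload|
      # Upsert each event so repeated syncs update rather than duplicate.
      event = Event.find_or_initialize_by(remote_id: payload[:id])
      event.update!(title: payload[:title], starts_at: payload[:starts_at])
    end
  end

  private

  # Placeholder: replace with your actual Nylas API call.
  def fetch_events_from_nylas(calendar_id)
    []
  end
end

Enqueue it with CalendarSyncJob.perform_later(calendar.id) and the web request returns immediately instead of timing out.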
You can use both, but mainly background jobs: delayed_job or sidekiq.
You can also run a cron task that periodically updates the calendar data in your app.
For reads, fetch from the memory store first and fall back to the database; this is where Memcached is useful.
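As a sketch of that read path using Rails' built-in read-through caching (the cache key format and expiry are arbitrary choices here):

def cached_events(calendar_id)
  # Returns the cached copy if present; otherwise runs the block,
  # stores the result in the cache store (e.g. Memcached), and returns it.
  Rails.cache.fetch("calendar/#{calendar_id}/events", expires_in: 10.minutes) do
    Event.where(calendar_id: calendar_id).order(:starts_at).to_a
  end
end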
I am using Twilio to send/receive texts in a Rails 4.2 app. I am sending in bulk, around 1000 at a time, and receiving sporadically.
Currently, when I receive a text I save it to the DB (to, from, body) and then pass that record to an ActiveJob worker to process later. For sending messages I persist the Twilio params to another table and pass that record to a different ActiveJob worker. Since I often send in batches, I have two workers: the first outgoing-message worker sends a single message; the second queries the DB, finds all the users who should receive the message, creates a DB record for each message to be sent, and passes each record to the first worker. So the second one basically just creates a bunch of jobs for the first one to process.
Right now I have the workers destroying the records once they finish processing (both incoming and outgoing). I am worried about not persisting things in case the server, Redis, or Resque goes down, but I do not know if this is actually a good design pattern. It was suggested to me to just use a vanilla Ruby object and pass its id to the worker, but I am not sure how that affects data reliability. So is it overkill to be creating all these DB records, and should I just be creating vanilla Ruby objects and passing their ids to the workers?
Any and all insight is appreciated,
Drew
It seems to me that the approach of sending a minimal amount of data to your jobs is the best approach. Check out the 'Best Practices' section on the sidekiq wiki: https://github.com/mperham/sidekiq/wiki/Best-Practices
What if your queue backs up and that quote object changes in the meantime? Don't save state to Sidekiq, save simple identifiers. Look up the objects once you actually need them in your perform method.
Also, in terms of reliability: you should be worried about your job queue going down. It happens. You either design your system to be tolerant of a failure, or you find a job queue system with higher reliability guarantees (though even then, no queue system can guarantee 100% message delivery). Sidekiq Pro has better reliability guarantees than the free version, but if you design your jobs with a little forethought, you can create jobs that scan your database after a crash and re-queue any work that may have been lost.
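As a sketch of that pattern, assuming you keep (rather than destroy) each Message record and add a sent_at column to flag completed work:

class OutgoingMessageWorker
  include Sidekiq::Worker

  def perform(message_id)
    message = Message.find_by(id: message_id)
    return if message.nil? || message.sent_at? # gone or already handled

    # ... send via Twilio here ...
    message.update!(sent_at: Time.current)
  end
end

# After a crash, re-enqueue anything that never went out:
Message.where(sent_at: nil).find_each do |m|
  OutgoingMessageWorker.perform_async(m.id)
end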
How much work you spend designing fault-tolerant solutions really just depends on how critical it is that your information makes it from point A to point B :)
I have a Rails app hosted on Heroku. I have to run long backend calculations and queries against a MySQL database.
My understanding is that using the DelayedJob or Whenever gems to invoke backend processes will still impact the Rails (front-end) server's performance. Therefore, I would like to set up two different Rails servers.
The first server is for front-end (responding to users' requests) as in a regular Rails app.
The second server (also a Rails server) is for backend queries and calculations only. It will only read from MySQL, do the calculations, then write results into a separate Redis server.
My sense is that not a lot of Rails developers do this; they prefer running background jobs on a single Rails server and adding more workers as needed. Is my server structure a good design, or is it overkill? Is there any pitfall I should be aware of?
Thank you.
I don't see any reason why a background job like DelayedJob would cause any more overhead on your main application than a separate server would. The DelayedJob worker runs in its own process, so the dynos for your main app aren't affected. The only impact could be on the database queries, but that will be the same whether they come from a background job or from another app altogether accessing the same database.
I would recommend using DelayedJob and workers on your primary app. It keeps things simple and shouldn't be any worse performance wise.
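For example, with delayed_job's .delay you can push the calculation onto the queue from the same codebase; ReportBuilder here is a hypothetical stand-in for your backend queries:

# Runs later in the worker process, not in the web dyno:
ReportBuilder.new(user_id).delay.run

# On Heroku the jobs are picked up by a separate worker dyno,
# declared in your Procfile, e.g.:  worker: bundle exec rake jobs:work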
One other thing to consider if you are really worried about performance is to have a database "follower", this is effectively a second database that keeps itself up to date with your primary database but can only be used for reads (not writes). There may be better documentation about it, but you can get the idea here https://devcenter.heroku.com/articles/fast-database-changeovers#create_a_follower. You could then have these lengthy background jobs read data from here leaving your main database completely unaffected.
I need to access and pull data from a number of APIs over the course of a number of days. This is streaming data, so the process will be running all the time. Each process will pull in data and insert it into a separate Google Fusion Table.
I want to run these processes in the background and forget about them, just being able to monitor whether they fail and don't restart.
I have looked at Delayed Job, Resque, Beanstalk, etc., and my question is: can these run processes concurrently? I don't want to queue processes, just run them in the background.
I looked at Spawn as well, but didn't completely understand how it worked.
So what options are available to me? Does anybody have any recommendations?
I would use the whenever gem to schedule cron jobs to pull data.
# config/schedule.rb — note that whenever needs runner/rake/command
# entries (not direct method calls) to generate cron jobs
every 2.hours do
  runner "YourApi.do_whatever"
  runner "SecondApi.do_the_thing"
end
Maybe a custom background daemon is a better fit for you; take a look at daemon_generator. Note that you will probably have to do some extra work if you want to do things concurrently, but just processing things serially should be quite easy.
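A bare-bones sketch of such a daemon, one process per API; PullClient and store_row are hypothetical placeholders for your API wrapper and persistence code:

loop do
  begin
    PullClient.fetch_batch.each { |row| store_row(row) }
  rescue StandardError => e
    Rails.logger.error("pull failed: #{e.message}")
    # Decide here whether to keep going or exit so a supervisor
    # (monit, god, upstart, ...) can restart the process.
  end
  sleep 30 # a true streaming endpoint may not need this pause
end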
Here is my scenario: I have some data coming in over a serial port. I want to keep collecting this data and storing it in the database. The Rails app then uses this data from the database to show some statistics, like graphs and such.
So my question is: how can I keep collecting this data in a thread separate from the Rails app, while everything else works like any other Rails app against the database?
If there is a better way of doing this, please advise.
PS: I don't have any problem reading from the serial port. This is about doing that task from the Rails app in a separate thread.
Use the delayed_job gem.
Delayed_job (or DJ) encapsulates the common pattern of asynchronously executing longer tasks in the background.
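For the serial-port case, a sketch might look like this: a plain Ruby job object enqueued once, where SerialReader and Reading are hypothetical stand-ins for your existing serial-port code and the model you store readings in:

class SerialCollectorJob
  # delayed_job only requires that the enqueued object respond to #perform.
  def perform
    SerialReader.each_line do |line|
      Reading.create!(raw: line, read_at: Time.current)
    end
  end
end

Delayed::Job.enqueue(SerialCollectorJob.new)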
Another thing you can check out is the resque gem, which GitHub published. It uses Redis as a work queue and has a very handy web interface that helps you control the workers. It is very similar to delayed_job, but may fit your needs better.
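The equivalent Resque worker is just a class with a queue name and a class-level perform; the collection loop itself would be the same as in the delayed_job sketch above:

class SerialCollector
  @queue = :serial # arbitrary queue name

  def self.perform
    # ... same collection loop as above ...
  end
end

Resque.enqueue(SerialCollector)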