Medium scale repeated tasks in Rails - ruby-on-rails

I'm building a web app that tracks stats for a game. The API for that game (Steam Web API) only allows me to retrieve data from the most recently played match.
When a user requests their stats to be tracked, I need to make a call to the Steam Web API every 10 minutes or so to check if they have played another match, then store it in the database if they have. The problem is that I check every 10 minutes for every user...
Is there a way to schedule this efficiently so that the server doesn't get overloaded? This application could potentially serve 10,000+ users.
Please feel free to correct anything in this question if I got something obviously wrong.

There are many gems for recurring events in ruby, like:
ice_cube
recurrence
However, if you are going to have lots of users, and you are worried about server load, I would suggest not using your rails app to do this.
Instead, build another service which doesn't run on your rails app to update your database with statistics.
Having an independent service like this decoupled from your main app allows you to easily put it on another server, and/or have it scale independently from your webapp.
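One way such a standalone service could avoid hitting the Steam Web API for every user at the same moment is to spread the users across the 10-minute window, so that each minute only handles about a tenth of them. The following is a minimal sketch in plain Ruby. The helper names `bucket_for` and `users_due`, and the idea of bucketing by user ID modulo the interval, are assumptions made for illustration; they do not come from the question or from any gem.

```ruby
# Hypothetical helpers: spread users evenly over a polling interval so that
# each scheduler tick only processes a slice of them.
INTERVAL_MINUTES = 10

# Assign a user to one of `interval` buckets based on their numeric ID.
def bucket_for(user_id, interval = INTERVAL_MINUTES)
  user_id % interval
end

# Return the IDs that should be checked during the given minute of the hour.
def users_due(user_ids, minute, interval = INTERVAL_MINUTES)
  current_bucket = minute % interval
  user_ids.select { |id| bucket_for(id, interval) == current_bucket }
end

user_ids = (1..10_000).to_a
due_now = users_due(user_ids, Time.now.min)
puts "Checking #{due_now.size} of #{user_ids.size} users this minute"
```

Each user is still checked once every 10 minutes, but the load per minute stays roughly constant instead of arriving in one burst.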

Related

Interval based API access and processing different DSL

Background
I'm currently working on a small Rails 5 project that needs to access and process an external API. There is a ruby wrapper gem available for the API, so accessing the data is not a problem.
Problem description
There are two parts of the equation that I am currently missing, and hoping someone out there can help me with.
1: I need to call the API, via Rails, every 15 minutes. How can I realize this? I was looking towards Active Job for this, but my research kind of stalled after getting no useful results.
2: The external API has different domain models and a different domain-specific language than my application. How can I map the different models without changes in Active Record?
1: I need to call the API, via Rails, every 15 minutes. How can I realize this? I was looking towards Active Job for this, but my research kind of stalled after getting no useful results.
You can solve the first problem with recurring tasks. The main idea is to run a process that performs some operations every x minutes (or days, or whatever fits your problem).
There are several tools that you can use. One of them, cron, is built into Unix systems. You can read about it in the system's manual, and you can easily manage it using the whenever gem. The main disadvantage is that you need access to the system's cron, which may be non-trivial on non-bare machines (for example Platform as a Service hosts such as Heroku).
You should also take a look at clockwork, which does not rely on the system's cron. It uses a separate process that runs all the time and keeps an eye on the defined tasks.
With the second approach (a separate process) you need to remember that time-consuming instructions may "lock" the process and postpone other tasks. In this case, you may want to use background processing such as sidekiq or delayed_job. The idea is to use one process for scheduling tasks at certain times and another process to work through those tasks as soon as they appear in the queue.
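The split between a scheduling process and a processing process can be sketched with nothing but Ruby's built-in `Queue` and a thread. This is only an illustration of the idea, not how sidekiq or delayed_job are implemented; the job names and the `:stop` sentinel are made up for the example.

```ruby
# Stand-in for a real job queue (sidekiq, delayed_job): a thread-safe Queue.
jobs = Queue.new
results = Queue.new

# Worker: picks up jobs as soon as they appear, so slow work never blocks
# the scheduler.
worker = Thread.new do
  while (job = jobs.pop) != :stop
    results << "processed #{job}"
  end
end

# Scheduler: its only responsibility is to enqueue work when it is due.
# A real scheduler would do this on a timer; here three ticks are simulated.
3.times { |tick| jobs << "api_poll_#{tick}" }
jobs << :stop
worker.join

processed = []
processed << results.pop until results.empty?
puts processed
```

Because the scheduler only enqueues, a slow API call delays the worker but never the next scheduled tick.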
2: The external API has different domain models and a different domain-specific language than my application. How can I map the different models without changes in Active Record?
You need to create a client that consumes the API and maps its responses into the models that you have in your application. This way, you don't need to make your models' schema dependent on the API's schema. Take a look at the resource_kit gem, which is a sample solution that uses this approach.
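A minimal sketch of such a mapping layer in plain Ruby is shown below. It does not use resource_kit; the `Contact` struct, the `ContactMapper` class, and the payload keys (`cust_no`, `full_name`, `mail`) are all hypothetical and stand in for whatever the real API and application use.

```ruby
# The application's own domain object; name and fields are hypothetical.
Contact = Struct.new(:external_id, :name, :email, keyword_init: true)

# Translates the API's vocabulary into the application's vocabulary, so the
# rest of the app never sees the API's field names.
class ContactMapper
  def self.from_api(payload)
    Contact.new(
      external_id: payload.fetch("cust_no"),
      name: payload.fetch("full_name"),
      email: payload.fetch("mail").downcase
    )
  end
end

# Example payload, as the external API might return it (assumed shape).
payload = { "cust_no" => 42, "full_name" => "Ada Lovelace", "mail" => "Ada@Example.com" }
contact = ContactMapper.from_api(payload)
puts contact.name
```

If the API renames a field, only the mapper changes; the model and its database schema stay as they are.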
Hi hdauven,
Calling the API every 15 minutes will affect your server's performance, so do it with sidekiq, which runs the work as a background job, and use sidetiq to perform the task automatically every 15 minutes.
You are only accessing an API, so why are you worried about the different domain?

iOS chat app design

I'm building a simple chat app on iOS for fun (and to have projects to gain experience from), using socketsIO and a node backend. I am trying to figure out the best design for messages. I was planning to use a mongoDB database where each conversation would have its message data stored. Whenever the client sends a new message to the server, the server adds it to the appropriate conversation in the database.
I was also hoping to create a user Sign Up/Log In system which would add you to the database.
However, I've googled around quite a bit and I am really not sure if creating a database made up of conversations (that get updated whenever a sentMessage event is triggered) and user data is the right way to go.
Additionally, I've seen some people talk about saving the chats on the actual devices themselves, not in a database? What is the common design pattern for a chat app like this?
For the design, I would use socket.io for emitting messages as well. It has a great community behind it. I would also use MongoDB, because everything uses the JSON format and it integrates well with Node, since both are JavaScript-based.
Now the part you are interested in is Redis. Redis is an in-memory database (it keeps its data in RAM) and should be used alongside MongoDB if you are going to have higher traffic, need quick responses, or want less hanging and waiting.
Redis would be your temporary store for the chat session, because disk writes, reads, and queries are a lot of work for the machine (looking at you, MongoDB) if you plan on saving the chat with every message. Used that way, MongoDB would not scale all that well in the long run and is not as fast as Redis. Mind you, the Redis database will only hold a temporary chat log of, let's say, the last 1 million chat sessions or some other limit (it's all in RAM, so the size is limited; you can't have terabytes or hundreds of gigabytes of RAM on one server).
So the data flow would look something like this:
user sends message
server receives the message via HTTP(S) POST/PUT (Ajax/Observable)
server uses socket.io to emit the message to the designated user while saving the message to Redis under a specific key/session/message
designated user gets the update on their screen via an io event
-- in between, there should be a check on whether the Redis DB is getting full. If it is full, remove the oldest 10,000 inactive messages (they could be from a year ago if the server hasn't filled up yet) to make some space.
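The "save to a size-limited temporary store and evict old messages when it fills up" part of the flow above can be sketched as follows. This is a plain Ruby stand-in that uses a Hash instead of Redis, purely to show the eviction logic; the class name `RecentMessageCache` and its parameters are hypothetical, and a real deployment would talk to Redis itself.

```ruby
# In-memory stand-in for the temporary message store described above.
class RecentMessageCache
  def initialize(max_messages:, evict_count:)
    @max_messages = max_messages
    @evict_count = evict_count
    @messages = {} # Ruby hashes keep insertion order, so oldest comes first
    @next_id = 0
  end

  # Save a message, making room first if the store is full.
  def store(conversation_id, text)
    evict_oldest if @messages.size >= @max_messages
    @next_id += 1
    @messages[@next_id] = { conversation_id: conversation_id, text: text }
    @next_id
  end

  def size
    @messages.size
  end

  # Texts still cached for one conversation, oldest first.
  def for_conversation(conversation_id)
    @messages.values
             .select { |m| m[:conversation_id] == conversation_id }
             .map { |m| m[:text] }
  end

  private

  # Drop the oldest @evict_count messages to free space.
  def evict_oldest
    @messages.keys.first(@evict_count).each { |id| @messages.delete(id) }
  end
end

cache = RecentMessageCache.new(max_messages: 3, evict_count: 2)
%w[hi hello hey yo].each { |text| cache.store("room-1", text) }
puts cache.for_conversation("room-1").inspect
```

Messages evicted from the temporary store would still need to live in the permanent database if the full history matters.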
Saving the chat on the phone is an okay idea, as it would save the user's data/bandwidth and they could look at their messages while offline.
One solution is SQLite, a lightweight library that sits inside your app and acts as a database you can run queries on; if you're familiar with RDBMSs, you will have no problem implementing it. But then you have to find a good way to manage saving data across Redis, SQLite, and MongoDB.

Will caching or background job improve my Rails App performance?

My Rails App syncs calendar events from gmail through the Nylas API. I am storing all the events and associated calendars on my app (either creating new or updating existing). It takes a very long time, in fact, I get timeout errors on my Heroku hosted Rails App whenever I try to sync a calendar. Not sure why it takes a very long time. So to react, I want to either start caching (using Redis or Memcached) the data (still don't know exactly how I will do that) OR run the sync in a background job (using Delayed_Job or Resque).
I wanted to know how others would tackle this problem. I would appreciate feedback not only on which approach to take, but also pointers on how to implement it.
If you need fast, persistent access within your app to a large set of calendar events that are ultimately sourced from an external system, then I'd create (in fact have already created) models for the calendars and the events. The structure ought to be fairly obvious from the structure of the API, and I would just persist them in your database so you can use ActiveRecord methods to retrieve/sort them.
It's unlikely that you'd need a caching layer on top of the model.
Synchronisation is definitely a background job.
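Whichever background job library runs the sync, the "create new or update existing" step is easier to retry safely if it is keyed on the external ID. The sketch below shows that idea with an in-memory Hash instead of ActiveRecord; `EventStore`, `upsert`, and the payload keys `id` and `title` are hypothetical names chosen for the example, not the Nylas API's actual fields.

```ruby
# Hypothetical local store keyed by the external event ID. In a Rails app
# this role would be played by a model and its database table.
class EventStore
  def initialize
    @events = {}
  end

  # Create the event if it is new, otherwise overwrite the stored copy.
  # Returns which of the two happened.
  def upsert(api_event)
    id = api_event.fetch("id")
    action = @events.key?(id) ? :updated : :created
    @events[id] = { title: api_event.fetch("title") }
    action
  end

  def title_of(id)
    @events.fetch(id)[:title]
  end

  def count
    @events.size
  end
end

store = EventStore.new
puts store.upsert({ "id" => "evt-1", "title" => "Standup" })
puts store.upsert({ "id" => "evt-1", "title" => "Standup (moved)" })
puts store.count
```

Running the same sync twice leaves one record per event, so a job that times out and is retried does not create duplicates.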
You can use both, but mainly background jobs, with delayed_job or sidekiq.
You can also run a cron task that periodically updates the calendar data in your app.
To fetch data, you can read from the in-memory store first and fall back to the database, which is where memcached will be useful.

Best way to run rails with long delays

I'm writing a Rails web service that interacts with various pieces of hardware scattered throughout the country.
When a call is made to the web service, the Rails app then attempts to contact the appropriate piece of hardware, get the needed information, and reply to the web client. The time between the client's call and the reply may be up to 10 seconds, depending upon lots of factors.
I do not want to split the web service call in two (ask for information, answer immediately with a pending reply, then force another api call to get the actual results).
I basically see two options. Either run JRuby and use multithreading or else run several regular Ruby instances and hope that not many people try to use the service at a time. JRuby seems like the much better solution, but it still doesn't seem to be mainstream and have out of the box support at Heroku and EngineYard. The multiple instance solution seems like a total kludge.
1) Am I right about my two options? Is there a better one I'm missing?
2) Is there an easy deployment option for JRuby?
I do not want to split the web service call in two (ask for information, answer immediately with a pending reply, then force another api call to get the actual results).
From an engineering perspective, this seems like it would be the best alternative.
Why don't you want to do it?
There's a third option: If you host your Rails app with Passenger and enable global queueing, you can do this transparently. I have some actions that take several minutes, with no issues (caveat: some browsers may time out, but that may not be a concern for you).
If you're worried about browser timeout, or you cannot control the deployment environment, you may want to process it in the background:
User requests data
You enter request into a queue
Your web service returns a "ticket" identifier to check the progress
A background process processes the jobs in the queue
The user polls back, referencing the "ticket" id
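The ticket flow above can be sketched in a few lines of Ruby. Everything here is kept in memory for illustration: the `TicketDesk` class and its method names are hypothetical, and a real application would persist tickets in a database and run the worker in a separate process.

```ruby
require "securerandom"

# Hypothetical in-memory ticket store illustrating the request/poll flow.
class TicketDesk
  def initialize
    @tickets = {}
    @pending = []
  end

  # Accept the request, put it in the queue, and hand back a ticket ID.
  def submit(request)
    id = SecureRandom.uuid
    @tickets[id] = { status: :pending, result: nil }
    @pending << [id, request]
    id
  end

  # The background process works through the queue; the block does the
  # actual slow work for each request.
  def work_off
    until @pending.empty?
      id, request = @pending.shift
      @tickets[id] = { status: :done, result: yield(request) }
    end
  end

  # The client polls with its ticket ID.
  def check(id)
    @tickets.fetch(id)
  end
end

desk = TicketDesk.new
ticket = desk.submit("device-7")
puts desk.check(ticket)[:status]
desk.work_off { |device| "reading from #{device}" }
puts desk.check(ticket)[:result]
```

The web request returns as soon as `submit` finishes, so the slow hardware call never ties up a web worker.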
As far as hosting in JRuby, I've deployed a couple of small internal applications using the glassfish gem, but I'm not sure how much I would trust it for customer-facing apps. Just make sure you run config.threadsafe! in production.rb. I've heard good things about Trinidad, too.
You can also run the web service call in a delayed background job so that it doesn't tie up a web server, and it can even be run on a separate physical box. This is also a much more scalable approach. If you make the web call using AJAX, you can ping the server every second or two to see if your results are ready. That way your client is not held in limbo while the results are being calculated, and the request does not time out.

Is it possible to have a stateless timed function

I'm trying to set a reminder in a system to fire at a certain time.
This is a web based app, so it's not like it will be in memory all the time.
Ideally I'd like to avoid using a service or job on the server (mainly out of curiosity, to see if there is a more efficient way to do it).
For example, imagine how many eBay bids are ending all the time, with emails being sent out seemingly perfectly on time.
Do people reckon there is just a big loop going over and over, moving items into a queue and so on? Or is there something lower level helping out (stored procedures, triggers, etc.)?
Thanks everyone.
What you have to realize about eBay - and most large database-backed websites - is that the interactions between humans and the database that come through the web server are only a part (sometimes a very small part) of the functionality of the system.
To use eBay as an example, the email that goes out when auctions expire is not handled by a web server. They are far more likely to have that scripted. In other words, there is another program running on a number of their systems that looks at the database for ended auctions, does some processing on them, sends emails, etc.
If I were doing something similar (albeit on a much smaller scale), I'd have my web services built in the usual way, but have a job that is run automatically every few minutes to do the maintenance work. It would start up, look at the database for work, process anything that was required, then exit.
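A single run of such a maintenance job might look roughly like the sketch below. The `Reminder` struct stands in for a database row and `run_reminder_job` is a hypothetical name; the point is only the shape of the job: find what is due, process it, mark it done, and exit.

```ruby
# Hypothetical reminder record; in a real app this would be a database row.
Reminder = Struct.new(:id, :fire_at, :sent)

# One run of the job: find due, unsent reminders, deliver each one via the
# given block, and mark it as sent. Returns the IDs that were delivered.
def run_reminder_job(reminders, now: Time.now)
  due = reminders.select { |r| !r.sent && r.fire_at <= now }
  due.each do |reminder|
    yield reminder        # e.g. send the email
    reminder.sent = true  # so the next run does not deliver it again
  end
  due.map(&:id)
end

now = Time.now
reminders = [
  Reminder.new(1, now - 60, false),   # due a minute ago
  Reminder.new(2, now + 3600, false), # not due yet
  Reminder.new(3, now - 120, true)    # already sent
]
delivered = run_reminder_job(reminders, now: now) { |r| puts "Sending reminder #{r.id}" }
```

Because the state lives in the stored records rather than in the process, the job itself stays stateless: it can be started by cron every few minutes and exit when it is done.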

Resources