Sending http requests every n seconds in my Rails app - ruby-on-rails

What's the best way to send HTTP requests to an external API every n seconds, where n changes after every request?
I have an infinite loop which calculates the time interval and sends the HTTP request, but I don't know the best way to run it in a Rails app.
I thought Sidekiq would be the perfect solution: chained jobs, where each job sends the request, calculates the time interval, and schedules another job with set(wait: n). But it looks like Sidekiq has a polling interval, so set(wait: n) does not run the request in exactly n seconds.
How would you do something like this?

You are right about Sidekiq; I think it is the best solution. The polling interval can be configured via average_scheduled_poll_interval (see the Sidekiq documentation).
Do this:
Create an async job.
After the job completes, enqueue the same job again and ask Sidekiq to wait some time: SMSDelegationJob.set(wait: 10.seconds).perform_later
Develop good logic for exception handling.
Set a low polling interval.
Start the root job manually or via the console.
Good luck with it.
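The chained-job idea can be sketched in plain Ruby. This is only a minimal sketch, not the answerer's code: ChainedPoller and its request, next_interval, and scheduler arguments are hypothetical names, and the 30-second fallback is an assumption. In a Rails app the scheduler would presumably wrap a call such as SMSDelegationJob.set(wait: wait.seconds).perform_later, and the rescue branch stands in for the exception handling mentioned above.

```ruby
# Hypothetical sketch of one link in a chain of self-rescheduling jobs.
class ChainedPoller
  DEFAULT_RETRY_INTERVAL = 30 # seconds; assumed fallback after a failed request

  # request:       callable that performs the HTTP call and returns a response
  # next_interval: callable that maps a response to the next wait in seconds
  # scheduler:     callable that enqueues the next run after the given wait
  def initialize(request:, next_interval:, scheduler:)
    @request = request
    @next_interval = next_interval
    @scheduler = scheduler
  end

  # Sends one request, then always schedules the next link in the chain,
  # even when the request failed, so the chain never silently stops.
  def perform
    wait = begin
      @next_interval.call(@request.call)
    rescue StandardError => e
      warn "request failed: #{e.message}"
      DEFAULT_RETRY_INTERVAL
    end
    @scheduler.call(wait)
    wait
  end
end

scheduled = []
poller = ChainedPoller.new(
  request: -> { { "retry_after" => 7 } },                  # stand-in for the real HTTP call
  next_interval: ->(response) { response["retry_after"] },
  scheduler: ->(wait) { scheduled << wait }                # in Rails: enqueue the job with set(wait:)
)
poller.perform
```

Because the next run is scheduled in both the success and the failure path, a single failed request does not break the chain; whether that is the right behaviour depends on the API being called.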

Is it n seconds between requests (i.e. from when the last one completed to when the next one starts), or should they start every n seconds, regardless of how long the last one took (or if it was successful or not)?
Answering that question should tell you whether the requests need to be made in parallel (using some form of concurrency), or whether you could just do it from a single long-lasting process.
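The difference between the two interpretations comes down to how long to pause after each request. The sketch below assumes a single long-lived process; pause_before_next, run, and the mode names are hypothetical, and the block is assumed to send the request and return the next n.

```ruby
# Seconds to pause after a request that took `elapsed` seconds.
#   :fixed_delay -> n seconds between the end of one request and the start of the next
#   :fixed_rate  -> requests start every n seconds; 0 when the request overran its slot
def pause_before_next(n, elapsed, mode:)
  case mode
  when :fixed_delay then n
  when :fixed_rate then [n - elapsed, 0].max
  else raise ArgumentError, "unknown mode: #{mode.inspect}"
  end
end

# Monotonic clock, so wall-clock adjustments do not distort the measurement.
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Runs `requests` iterations in the calling thread.
def run(mode:, requests:)
  requests.times do
    started = monotonic_now
    n = yield # the block sends the request and returns the next n
    sleep(pause_before_next(n, monotonic_now - started, mode: mode))
  end
end
```

In fixed-rate mode a request that takes longer than n delays the next one, which is the case where some form of concurrency is needed.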

Related

How can I use Sidekiq delay with a worker

I have a worker that makes multiple calls to an external API. The problem is that there is a threshold on how many calls we can make to this API per hour.
What I'd like to do is create a worker which makes these sequential calls to the external API. If we get an error in between these calls because we've reached the number of connections allowed in that hour, the worker would save the document and schedule a new worker to complete the remaining API calls at a later time (maybe 1 or 2 hours later; ideally this should be configurable, e.g. 10 minutes, 1 hour, etc.).
Is there some way I could achieve this?
With Sidekiq you can schedule when a job will be executed with a friendly API:
MyWorker.perform_in(3.hours, 'mike', 1) # Expect a duration
MyWorker.perform_at(3.hours.from_now, 'mike', 1) # Expect a date
See the Scheduled Jobs page of the Sidekiq wiki.
You want Sidekiq Enterprise and its Rate Limiting API. The alternative is tracking the rate limit yourself and rescheduling the job manually.
https://github.com/mperham/sidekiq/wiki/Ent-Rate-Limiting
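Tracking the limit yourself could look roughly like this. It is a plain-Ruby sketch under assumptions: RateLimitReached stands in for whatever error the API client raises, and process_until_rate_limited and RETRY_DELAY are hypothetical names. Inside a real Sidekiq worker the leftover items would be handed to something like MyWorker.perform_in(RETRY_DELAY, remaining).

```ruby
# Hypothetical error raised by the API client when the hourly quota is hit.
class RateLimitReached < StandardError; end

# Calls the block for each item in order. Stops at the first RateLimitReached
# and returns the items that still need processing (empty when all succeeded).
def process_until_rate_limited(items)
  items.each_with_index do |item, index|
    begin
      yield item
    rescue RateLimitReached
      return items.drop(index)
    end
  end
  []
end

RETRY_DELAY = 60 * 60 # seconds; assumed to be configurable (10 minutes, 1 hour, ...)

calls_allowed = 2 # simulated quota for this example
remaining = process_until_rate_limited(%w[a b c d]) do |_item|
  raise RateLimitReached if calls_allowed.zero?
  calls_allowed -= 1
end
# `remaining` now holds the unprocessed items to pass to the rescheduled job.
```

The item that triggered the error is included in the remainder, so it is retried rather than lost.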

enqueuing jobs using sucker punch

I have a question about enqueuing jobs using Sucker Punch.
I have 2000+ search keywords in my database, and I want to know the Google and Bing ranking for each of them. For this I'm using the AuthorityLabs API, but AuthorityLabs will only process 1000 POST requests per hour. I'm sending each request to AuthorityLabs as a background job using Sucker Punch. How can I make sure that only 1000 jobs run in one hour and the remaining jobs start only after that hour? I also want to run these jobs daily to analyse the rank changes.
Rate limiting is not a concern of your queue system, much less of SuckerPunch, which is not designed to handle advanced delaying or queuing; it just moves asynchronous jobs to a thread from a thread pool.
If you really want to have rate limiting, use a real queue system like Sidekiq, and put some actual code to work.
Sidekiq Enterprise supports it natively: https://github.com/mperham/sidekiq/wiki/Ent-Rate-Limiting
Sidekiq-throttler seems to provide the same functionality: https://github.com/gevans/sidekiq-throttler
But you can also just delay execution (pre-emptively limiting the rate) by enqueuing jobs at specific times in the future (each executing about 4 seconds after the previous one, since 1000 requests per hour is one every 3.6 seconds), or by enqueuing just one job that performs the next outstanding request and then enqueues itself again with the same delay.
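The pre-emptive approach amounts to computing a delay for each job up front. A minimal sketch, assuming the 1000-requests-per-hour limit from the question (one request every 3.6 seconds); staggered_delays and RankCheckJob are hypothetical names.

```ruby
# Spreads `count` jobs so that no more than `limit` are due to start within
# any `period` seconds. Returns the delay in seconds for each job, in enqueue order.
def staggered_delays(count, limit:, period:)
  spacing = period.to_f / limit
  Array.new(count) { |i| (i * spacing).round(1) }
end

delays = staggered_delays(2000, limit: 1000, period: 3600)
# With ActiveJob, each keyword would then be enqueued roughly as:
#   RankCheckJob.set(wait: delay.seconds).perform_later(keyword)
```

This spaces only the scheduled start times; queue latency and retries can still bunch requests together, so a small safety margin on the limit may be wise.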
As always with open source, check the code and decide by yourself.
Could you do something like this?
YourProcessingJob.set(wait: 1.hours).perform_later
Possibly in a custom rake task...

How to specify timeout for a particular request in ruby on rails?

How can I specify a timeout of 2 minutes for a particular request in a Rails application? One of my application's requests takes more than 5 minutes in some cases. In that case I would like to stop processing the request once it has taken more than 2 minutes.
I need this configuration at the application level, so that if other requests of this type appear in future I don't have to make any special changes other than listing the action in that configuration. There are some requests which take more than 10 minutes as well, but they should not be affected.
Thanks,
Setting the timeout for a request that high is generally bad practice. Making your users wait for minutes on end for a request to finish isn't a good idea.
Instead, this type of long-running task should be placed into a job queue for a worker process to run at its convenience, independent of the web request. This allows the web request to finish very quickly, keeping your user happy, and keeps the long-running task out of your web process, freeing it to do what it is supposed to do (serve web requests).
Consider a gem like delayed_job. Describing how to work it into your application is outside the scope of this question; my answer only points out that modifying the timeout is very likely the wrong 'answer' and that you're better off looking at a job queue.

Grails non time based queuing

I need to process files which get uploaded, and processing can take as little as 1 second or as much as 10 minutes. Currently my solution is a Quartz job with a timer of 30 seconds that processes an arbitrary job whenever it fires. There are several problems with this.
One: if the job takes less than a few seconds, it is wasteful to make things wait 30 seconds for the job queue.
Two: if there is only one long job in the queue, it could feasibly try to do it twice.
What I want is a timeless queue: when things are added, they are started immediately if there is a free worker. Is there a solution for this? I was looking at jesque, but I couldn't tell if it can do this.
What you are looking for is a basic message queue. There are lots of options out there, but my favorite for Grails is RabbitMQ. The Grails plugin for it is quite good and it performs well in my experience.
In general, message queues allow you to have N producers (things "creating jobs") adding work messages to a queue and M consumers pulling jobs off the queue and processing them. When a worker completes its job, it simply asks the queue for the next job to process, and if there is none, it just waits for the queue to give it something to do. The queue also keeps track of the success or failure of message processing (you can control this) so that the same message is not given to more than one worker.
This has the advantage of not relying on polling (so you can start processing as soon as things come in), and it's also much more scalable. You can scale both your producers and consumers up or down as needed, decoupling the inputs from the outputs so that you can take a traffic spike and then work your way through it as you have the resources (workers) available.
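The producer/consumer pattern described above can be illustrated in-process with Ruby's standard Queue. This is only an analogy for how a broker such as RabbitMQ hands out work, not the RabbitMQ or Grails plugin API; the worker count, file names, and the upcase "processing" step are made up for the example.

```ruby
jobs = Queue.new    # thread-safe queue from Ruby's core library
results = Queue.new

# M consumers: each blocks in pop until a job arrives, so there is no polling.
workers = Array.new(3) do |id|
  Thread.new do
    while (job = jobs.pop) # pop returns nil once the queue is closed and empty
      results << [id, job, job.upcase] # upcase stands in for real processing
    end
  end
end

# N producers: add work messages to the queue as uploads come in.
producers = Array.new(2) do |p|
  Thread.new { 5.times { |i| jobs << "file-#{p}-#{i}" } }
end

producers.each(&:join)
jobs.close # no more work; idle workers wake up and exit
workers.each(&:join)

processed = []
processed << results.pop until results.empty?
```

Each job is popped by exactly one worker, and a free worker picks up new work as soon as it is added, which is the "timeless queue" behaviour the question asks for.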
To solve problem one just make the job check for new uploaded files every 5 seconds (or 3 seconds, or 1 second). If the check for uploaded files is quick then there is no reason you can't run it often.
For problem two you just need to record when you start processing a file to ensure it doesn't get picked-up twice. You could create a table in the database, or store the information in memory somewhere.
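Recording that a file has been picked up could be as small as the following in-memory sketch. ClaimRegistry and the file id are hypothetical; this only protects workers inside one process, and a database table with a unique constraint would be the usual choice when several processes share the work.

```ruby
require "set"

# In-memory record of files that are being processed (or already were).
class ClaimRegistry
  def initialize
    @claimed = Set.new
    @lock = Mutex.new
  end

  # Returns true only for the first caller that claims a given file.
  def claim(file_id)
    # Set#add? returns nil when the element was already present.
    @lock.synchronize { !@claimed.add?(file_id).nil? }
  end
end

registry = ClaimRegistry.new
first = registry.claim("upload-42")  # the first caller may process the file
second = registry.claim("upload-42") # later callers should skip it
```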

what would be the possible approach to go : SQS or SNS?

I am going to build a Rails application which integrates Amazon's cloud services.
I have explored Amazon's SNS service, which offers public subscription, which I don't want. I want to notify only a particular subscriber.
For example, if I have 5 subscribers on one topic, the notification should go to one particular subscriber.
I have also explored Amazon's SQS, for which I have to write a poller which monitors the queue for messages. SQS also has a lock mechanism, but the problem is that it is distributed, so there is a chance of getting the same message for processing from another copy of the queue.
I want to know which approach to take.
SQS sounds like what you want.
You can run multiple "worker" processes that compete over messages in the queue. Each message is only consumed once. The logic behind the "lock" / timeout that you mention is as follows: if one of your workers were to die after downloading a message, but before processing it, then you want that message to eventually time out and be re-downloaded for processing on another node.
Yes, SQS is built on a polling model. For example, I have a number of use cases in which I use a minutely cron job to poll for new messages in the queue and take action on any messages found. This pattern is stupid simple to build and works wonders for a bunch of use cases -- a handy little "client" script that pushes a message into the queue, and the cron activated script that will process that message within a minute or so.
If your message pattern is extremely sparse -- eg, only a few messages a day -- it may seem wasteful to poll constantly while the queue is empty. It hardly matters.
My original calculation was that a minutely cron job would cost $0.04 (now $0.02) per month. Since then, SQS added a "Long-Polling" feature that lets you achieve sub-second latency on processing new messages by sending 1 "long-poll" message every 20 seconds to poll an idle queue. Plus, they dropped the price 50%. So per month, that's 131k messages (~$0.06), a little bit more expensive, but with near realtime request processing.
Keep in mind that a minutely cron job I described only costs ~$0.04 / month in request load (30d*24h*60m * 1c / 10k msgs). So at a minutely clip, cost shouldn't really be a concern here. Even polling every second, the price rises only to $2.59 / mo, not exactly a bank buster.
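The arithmetic behind these figures can be reproduced as follows. The prices are the ones quoted in the answer (1 cent per 10k requests, later halved), and a 30-day month is assumed; monthly_polling_cost is a hypothetical helper, and current SQS pricing may differ.

```ruby
SECONDS_PER_MONTH = 30 * 24 * 60 * 60 # 30-day month, as in the answer

# Monthly request cost in dollars for polling once every `interval` seconds.
def monthly_polling_cost(interval, price_per_10k:)
  requests = SECONDS_PER_MONTH / interval
  (requests * price_per_10k / 10_000.0).round(4)
end

original_price = 0.01 # dollars per 10k requests ("1c / 10k msgs")
halved_price = original_price / 2

minutely = monthly_polling_cost(60, price_per_10k: original_price)
minutely_now = monthly_polling_cost(60, price_per_10k: halved_price)
long_poll = monthly_polling_cost(20, price_per_10k: halved_price)
every_second = monthly_polling_cost(1, price_per_10k: original_price)
```

These come out at roughly $0.043, $0.022, $0.065, and $2.59 per month. The long-poll case is 129,600 requests in a 30-day month, close to the 131k quoted above.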
However, it is possible to avoid frequent polling using a webservice that takes an SNS HTTP message. Such an architecture would work as follows: client pushes message to SNS, which pushes message to SQS and routes an HTTP request to your webservice, triggering it to drain the queue. You'd still want to poll the queue hourly or daily, just in case an HTTP request was dropped. In the end though, I'm not sure I can think of any scenario which really justifies such complexity. I'd much rather pay $0.04 a month to have a dirt simple cron job polling my queue.
