I am running a Rails app and using Redis for jbuilder's cache and the Sidekiq queue. I use Sidekiq to send emails asynchronously. Every time I try to send mass emails, say 20k emails in the background, after a while all the background jobs in the Sidekiq queue are cleared, leaving 0 jobs in the queue.
I filed an issue on the Sidekiq GitHub page (link), and the author said something or someone could be flushing my Redis. No one is flushing Redis manually, so how can I find out when and how Redis gets flushed?
I've checked the Redis log file and found nothing strange.
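Here is the documentation on renaming certain commands (the rename-command directive in redis.conf). Perhaps consider renaming FLUSHALL and FLUSHDB to something unusual, so that whatever is issuing the flush gets an error instead of silently clearing your data.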
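One way to confirm that a flush is actually being issued is to look at the output of Redis's INFO commandstats section, which lists per-command call counts (the counters reset when the server restarts or after CONFIG RESETSTAT). The following is a minimal sketch that parses that text in plain Ruby; the helper name flush_call_counts and the sample text are hypothetical, and in practice the text would come from something like redis-cli INFO commandstats.

```ruby
# Parse the text of Redis `INFO commandstats` and report flush calls.
# `flush_call_counts` is a hypothetical helper, not part of any library.
def flush_call_counts(info_text)
  counts = { "flushdb" => 0, "flushall" => 0 }
  info_text.each_line do |line|
    # Lines look like: cmdstat_flushdb:calls=3,usec=120,usec_per_call=40.00
    match = line.strip.match(/\Acmdstat_(flushdb|flushall):calls=(\d+)/)
    counts[match[1]] = match[2].to_i if match
  end
  counts
end

# Hypothetical sample output, for illustration only.
SAMPLE_INFO = <<~INFO
  # Commandstats
  cmdstat_get:calls=1500,usec=3000,usec_per_call=2.00
  cmdstat_flushdb:calls=2,usec=180,usec_per_call=90.00
INFO

p flush_call_counts(SAMPLE_INFO)
```

A non-zero count for flushdb or flushall means some client sent the command, which narrows the search to application code (for example a cache clear) rather than Redis itself.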
Related
We have a Rails 6.1 app and are using Sidekiq for background jobs. Sometimes the Sidekiq web UI resets to its initial state, showing the completed/failed job counters near zero, which indicates a recent Redis reset. What could be causing this?
This was caused by Rails using the same Redis as its cache store: someone ran Rails.cache.clear, which in turn calls the Redis command FLUSHDB, which clears the whole DB.
There are two possible solutions:
use separate Redis instance for Rails cache and Sidekiq workers (IMO overkill for most cases)
use a separate Redis DB for Rails.cache and Sidekiq. This still means that you need a separate REDIS_URL config for Rails.cache and Sidekiq, but both can point at the same server with different DB numbers (by default Redis supports up to 16 separate DBs). To do this set
REDIS_URL_RAILS=redis://host:10000/0
REDIS_URL_SIDEKIQ=redis://host:10000/1
And adjust your configs accordingly. The number at the end of the url is the ID of the database to use (by default DB=0 is used).
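As a rough illustration of the DB-number convention above, here is a small sketch that reads the two URLs and warns when they still resolve to the same Redis DB. The helper names redis_db_index and shared_redis_db? are hypothetical, the fallback URLs are the example values from this answer, and 6379 is assumed as the port when the URL omits one.

```ruby
require "uri"

# Return the Redis DB index encoded in the URL path (defaults to 0).
def redis_db_index(url)
  db = URI.parse(url).path.to_s.delete_prefix("/")
  db.empty? ? 0 : Integer(db, 10)
end

# True when both URLs point at the same host, port, and DB index.
def shared_redis_db?(url_a, url_b)
  a = URI.parse(url_a)
  b = URI.parse(url_b)
  a.host == b.host &&
    (a.port || 6379) == (b.port || 6379) &&
    redis_db_index(url_a) == redis_db_index(url_b)
end

# Fallback values are the example URLs from the answer above.
rails_url   = ENV.fetch("REDIS_URL_RAILS", "redis://host:10000/0")
sidekiq_url = ENV.fetch("REDIS_URL_SIDEKIQ", "redis://host:10000/1")

if shared_redis_db?(rails_url, sidekiq_url)
  warn "Rails.cache and Sidekiq share a Redis DB; a cache clear may wipe the queues"
end
```

In the app itself, the two URLs would then be passed to the cache store configuration (config.cache_store = :redis_cache_store, { url: ... }) and to Sidekiq (config.redis = { url: ... } inside both Sidekiq.configure_server and Sidekiq.configure_client).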
I want the worker to run on a specific date. I am able to schedule jobs in Sidekiq, and the Sidekiq UI shows the scheduled jobs correctly. But for an unknown reason my Sidekiq data (processed count, scheduled jobs, etc.) gets deleted and everything is reset to 0 in the Sidekiq UI. Can someone please help me understand this issue?
I suspect you are calling flush on Redis, clearing all your data.
I have a sucker_punch worker which is processing a CSV file. I initially had a problem with the CSV file disappearing when the dyno powered down; to fix that I'm going to set up S3 for file storage.
But my current concern is whether a dyno powering down will stop my worker in its tracks.
How can I prevent that?
Since sucker_punch uses a separate thread on the same dyno and does not use an external queue or persistence (the way delayed_job, sidekiq, and resque do) you will be subject to losing the job when your dyno gets rebooted or stopped and you'll have no way to restart the job. On Heroku, dynos are rebooted at least once a day. If you need persistence and the ability to retry a job in the event a dyno goes down, I'd say switch to one of the other job libraries:
https://github.com/collectiveidea/delayed_job
https://github.com/mperham/sidekiq
https://github.com/resque/resque
However, these require using a Heroku add-on. You can get away with the free version, but you will still have to pay for the extra worker process. Other than that, you'd have to implement your own persistence and retrying by wrapping sucker_punch. Here's a discussion on adding those features to sucker_punch: https://github.com/brandonhilkert/sucker_punch/issues/21 They basically say to use Sidekiq instead.
Is there a way to permanently remove jobs from a resque queue? The following commands remove the jobs, but when I restart the workers and the resque server, the jobs load back up.
Resque::Job.destroy("name_queue", Class)
OR
Resque.remove_queue("name_queue")
The problem is you're not removing the specific instance of the job that you added to your Redis server through resque. So when you remove the queue then add it back when you restart the server, all the data from that queue could still be in your Redis server. You can work around this in your job.perform depending on your implementation. For instance, if you want to manipulate a model through resque you could check to see if that model has been destroyed before manipulating it.
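The guard suggested above could look roughly like this. It is only a sketch: RECORDS is a hypothetical in-memory stand-in for a model table, and in a Rails app the lookup would more likely be something like a find_by(id: ...) call on the model. The class-level perform method and the @queue instance variable follow the usual Resque job convention.

```ruby
# Hypothetical in-memory stand-in for a model table, for illustration only.
RECORDS = { 1 => { name: "report" } }

class UpdateRecordJob
  @queue = :name_queue

  # Resque calls the class-level `perform` with the enqueued arguments.
  def self.perform(record_id)
    record = RECORDS[record_id]
    return :skipped if record.nil? # record was destroyed; nothing to do

    record[:processed] = true
    :processed
  end
end

p UpdateRecordJob.perform(1) # record exists, so it is processed
p UpdateRecordJob.perform(2) # record is missing, so the job is a no-op
```

With this check in place, a stale job that reappears after a restart does nothing harmful when its record no longer exists.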
Our Rails web app has to download and unpack archives of HTML pages from FTP on request, so the user can view them in the browser.
The archive can be quite big, so the user has to wait until it is downloaded and unpacked on the server.
I implemented a progress bar by calling fork/Process.detach in the user's request, so that the request finishes but the download/unpack process keeps running in the background. JavaScript rendered in the browser pings our server for status until everything is ready and then redirects the user to the unpacked HTML pages.
As long as the user requests one archive, everything goes smoothly, but if they run 2 or more requests at the same time (so that more forks are started), it seems that only one of them completes and the rest expire, time out, or get killed by Passenger(?). I suppose it's an issue with Passenger and forking.
I am not sure whether it's possible to fix this, so I guess I need to switch to another solution. The solution needs to permit immediate and parallel processing of downloads, so that if the user requests multiple archives, they see download/decompression progress for all of them at the same time.
I was thinking about running a background rake job immediately, but it seems very slow to start up (there are also a lot of cron rake tasks running every minute on our server). The reason I liked fork was that it was very fast to start. I know there is delayed_job, and we use it heavily for other tasks, but can it start multiple processes at the same time immediately, without queueing?
Solved by keeping the fork and using a single DJ worker. This way I can have as many processes starting at the same time as needed, without trouble with Passenger and without modifying our product's gemset (which we are trying to avoid, since it resulted in bugs in the past).
I'm not sure whether forking inside a DJ worker can cause any trouble, so I asked at
running fork in delayed job
If I were free to modify the gemset, I'd probably use Resque as wrdevos suggested, or Sidekiq, or girl_friday (but that's less likely because it depends on the server running).
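For reference, the fork/Process.detach pattern kept in this solution can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code from the app: start_background_task and the status file are hypothetical, the sleep stands in for the download/unpack work, and fork is only available on Unix-like systems.

```ruby
require "tmpdir"

# Fork a child that runs the block and records its status in a file,
# then detach so the parent does not have to wait on it.
def start_background_task(status_path)
  pid = fork do
    begin
      File.write(status_path, "running")
      yield
      File.write(status_path, "done")
    rescue StandardError
      File.write(status_path, "failed")
    ensure
      exit!(0) # skip at_exit hooks inherited from the parent
    end
  end
  Process.detach(pid) # returns a thread that reaps the child
end

status_path = File.join(Dir.tmpdir, "unpack_status_#{Process.pid}")
reaper = start_background_task(status_path) { sleep 0.1 } # stand-in for download/unpack
reaper.join # only for this demo; a request handler would return immediately
puts File.read(status_path)
```

The status file plays the role of whatever the JavaScript poller checks; each request gets its own child process, so several archives can be processed in parallel.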
Use Resque: https://github.com/defunkt/resque
More on bg jobs and Resque here.
https://github.com/blog/542-introducing-resque