I'd love to be notified when any of my background tasks causes an error. We use DJ on heroku and NewRelic is monitoring it very well, i can even see the errors themselves, i'm just not sure how to create a rule for an alert or a notification for these specific transactions (or for any background task error for that manner)
Thanks a bunch!
For the agents, .Net, PHP, Java, and the like, New Relic alerts on only two events: downtime events and low Apdex scores. The best you could do is pull a specific transaction out as a Key Transaction which would allow you to assign it an Apdex threshold that is different (usually lower) than the main application so that if a small number of errors do occur it will lower the Apdex enough to create an alert. Since key transaction Apdex scores feed into the main application Apdex score it will lower that one as well as a side effect.
Related
I want to create SQS using code whenever it is required to send messages and delete it after all messages are consumed.
I just wanted to know if there is some delay required between creating an SQS using Java code and then sending messages to it.
Thanks.
Virendra Agarwal
You'll have to try it and make observations. SQS is a dostributed system, so there is a possibility that a queue might not immediately be usable, though I did not find a direct documentation reference for this.
Note the following:
If you delete a queue, you must wait at least 60 seconds before creating a queue with the same name.
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_CreateQueue.html
This means your names will always need to be different, but it also implies something about the internals of SQS -- deleting a queue is not an instantaneous process. The same might be true of creation, though that is not necessarily the case.
Also, there is no way to know with absolute certainty that a queue is truly empty. A long poll that returns no messages is a strong indication that there are no messages remaining, as long as there are also no messages in flight (consumed but not deleted -- these will return to visibility if the consumer resets their visibility or improperly handles an exception and does not explicitly reset their visibility before the visibility timeout expires).
However, GetQueueAttributes does not provide a fail-safe way of assuring a queue is truly empty, because many of the counter attributes are the approximate number of messages (visible, in-flight, etc.). Again, this is related to the distributed architecture of SQS. Certain rare, internal failures could potentially cause messages to be stranded internally, only to appear later. The significance of this depends on the importance of the messages and the life cycle of the queue, and the risks of any such an issue seem -- to me -- increased when a queue does not have an indefinite lifetime (i.e. when the plan for a queue is to delete it when it is "empty"). This is not to imply that SQS is unreliable, only to make the point that any and all systems do eventually behave unexpectedly, however rare or unlikely.
Apple states that the app using the background mode shouldn't perform expensive, battery consuming tasks when launched in the background. What exactly is considered a batter consuming task? To be specific: is searching and array of 100 entries acceptable? What about 1000?
A battery-consuming task is just that - a task that uses so much CPU that it makes considerable "dent" on the battery, as measured by the "battery percentage" screen. To that end, searching 100, 1000, or 1000000 items a single time is unlikely to do any damage. On the other hand, searching a 10-item list fifty times a second is very likely to make your task a high energy consumer. Same goes for downloading data multiple times per minute, using location services, etc.
A rule of thumb is very simple: go to the "Last 24 hours" view of the "Battery percentage" screen, and see if your app is listed there. If it is not there, of if it is below "Home & Lock Screen", your app is fine.
The remark from the guidelines you are referencing is meant to keep people form draining a user's battery in the background.
As a lot of applications use GPS and/or radio in the background and those are way more power consuming than searching tiny arrays, searching some array will probably be fine, as long as it has a reasonable size.
You shouldn't create an app that helps SETI or Folding or searches for the next biggest prime - or excessively use radio. But for small tasks like your's, this guideline is not to be concerned. Yet, this is only an estimate and in the end, the review process will decide on this on a case by case basis.
The general rule of thumb from apple is:
Always try to avoid doing any background work unless doing so improves the overall user experience. An app might move to the background because the user launched a different app or because the user locked the device and is not using it right now. In both situations, the user is signaling that your app does not need to be doing any meaningful work right now. Continuing to run in such conditions will only drain the device’s battery and might lead the user to force quit your app altogether. So be mindful about the work you do in the background and avoid it when you can.
So the question for you to decide if this is okay is: Does the user expect that to be done right now? rather than does that use too many battery?. If searching that array cannot be deferred to when the app is active again (I cannot see a reason for not deferring it, but there might be), you are fine searching the array in background.
In my Rails application, I have a long calculation requiring a lot of database access.
To make it short, my calculation took 25 seconds.
When implementing the same calculation within a background job (a big single worker), the same calculation take twice the same time (ie 50 seconds). I have try several technics to put the job in a background process put none add an impact on my performances => using DelayJob / Sidekiq / doing the process within my rails but in a thread created for the work, but all have the same impact on my performances *2.
This performance difference only exist in rails 'production' environment. It looks like there is an optimisation done by rails that is not done in my background job.
My technical environment is the following =>
I am using ruby 2.0 / rails 4
I am using unicorn (but I have same problem without it).
The job is using Rails.cache to store some partial computation.
I am using postgresql
Does anybody has an clue where this impact might come from ?
I'm assuming you're comparing the background job speed to the speed of running the operation during a web request? If so, you're likely benefiting from Rails's QueryCache, which caches db queries during a web request. Try disabling it like described here:
Disabling Rails SQL query caching globally
If that causes the web request version of the job to take as long as the background job, you've found your culprit. You can then enable the query cache on your background job to speed it up (if it makes sense for your application).
Background job is not something that need to used for speed-up things. It's main meaning is to 'fire and forget' and remove 25 seconds of calculating synchronously and adding some more of calculating asynchronously. So you can give user response that she's request is processing and return with calculation later.
You may take speed gain from background job by splitting big task on some small and running them at same time. In your case I think it's something impossible to use, because of dependency of operations in yours calculation.
So if you want to speed you calculation, you need to look into denormalization of your data structure, storing some calculated values for your big calculation on moment when source data for this calculation updated. So you will calculate less on user request for results and more on data storage. And it's good place for use background job. So you finish your update of data, create background task for update caches. And if user request for calculation comes before this task is finished you will still need to wait for cache fill-up.
Update: I think I am still need to answer your main question. So basically this additional time on background task processing is comes from implementation. Because of 'fire and forget' approach no one need that background task scheduler will consume big amount of processor time just monitoring for new jobs. I am not sure completely but think that if your calculation will be two times more complex, time gain will be same 25 seconds.
My guess is that the extra time is coming from the need for your background worker to load rails and all of your application. My clue is that you said the difference was greatest with Rails in production mode. In production mode, subsequent calls to the app make use of the app and class cache.
How to check this hypotheses:
Change your background job to do the following:
print a log message before you initiate the worker
start the worker
run your calculation. As part of your calculation startup, print a log message
print another log message
run your calculation again
print another log message
Then compare the two times for running your calculation.
Of course, you'll also gain some extra time benefits from database caching, code might remain resident in memory, etc. But if the second run is much much faster, then the fact that the second run didn't restart Rails is more significant.
Also, the time between the log message from steps 1 and 3 will also help you understand the start up times.
Fixes
Why wait?
Most important: why do you need the results faster? Eg, tell your user that the result will be emailed to them after it is calculated. Or let your user see that the calculation is proceeding in the background, and later, show them the result.
The key for any long running calculation is to do it in the background and encourage the user to not wait for the result. They should be able to do something else until they get the result.
Start the calculation automatically As soon as the user logs in, or after they do something interesting, start the calculation. That way, when (and if) the user asks for the calculation, the answer will either be already done or will soon be done.
Cache the result and bust the cache as needed Similar to the above, start the calculation periodically and automatically. If the user changes some data, then restart the calculation by busting the cache. There are also ways to halt any on-going calculation if data is changed during the calculation.
Pre-calculate part of the calculation Why are you taking 25 seconds or more for a dbms calculation? Could be that you should change the calculation. Investigate adding indexes, summary tables, de-normalizing, splitting the calculation into smaller steps that can be pre-calculated, etc.
I'd like to infrequently open a Twitter streaming connection with TweetStream and listen for new statuses for about an hour.
How should I go about opening the connection, keeping it open for an hour, and then closing it gracefully?
Normally for background processes I would use Resque or Sidekiq, but from my understanding those are for completing tasks as quickly as possible, not chilling and keeping a connection open.
I thought about using a global variable like $twitter_client but that wouldn't horizontally scale.
I also thought about building a second application that runs on one box to handle this functionality, but that seems excessive if it can be integrated into the main app somehow.
To clarify, I have no trouble starting a process, capturing tweets, and using them appropriately. I'm just not sure what I should be starting. A new app? A daemon of some sort?
I've never encountered a problem like this, and am completely lost. Any direction would be much appreciated!
Although not a direct fix, this is what I would look at:
Time
You're working with time, so I'd look at what time-centric processes could be used to induce the connection for an hour
Specifically, I'd look at running a some sort of job on the server, which you could fire at specific times (programmatically if required), to open & close the connection. I only have experience with resque, but as you say, it's probably not up to the job. If I find any better solutions, I'll certainly update the answer
Storage
Once you've connected to TweetStream, you'll want to look at how you can capture the tweets for that time period. It seems a waste to create a data table just for the job, so I'd be inclined to use something like Redis to store the tweets that you need
This can then be used to output the tweets you need, allowing you to simulate storing / capturing them, but then delete them after the hour-window has passed
Delivery
I don't know what context you're using this feature in, so I'll just give you as generic process idea as possible
To display the tweets, I'd personally create some sort of record in the DB to show the time you're pinging TweetStream that day (if it changes; if it's constant, just set a constant in an initializer), and then just include some logic to try and get the tweets from Redis. If you're able to collect them, show them as you wish, else don't print anything
Hope that gives you a broader spectrum of ideas?
I'm building an online calendar in Ruby on Rails that needs to send out email notifications whenever a user-created event is about to start/finish (i.e you get a reminder when a meeting is 5 minutes away). What's the best way of figuring out when an event is about to start? Would there be a cron task that checks through all events to find out which ones are starting within a certain threshold (i.e 5 minutes) ? A cron task seems inefficient to me, so I'm wondering what might be a better solution. My events are stored in a mySQL database. There must be a design pattern for this... I'm just at a loss for what to search for.
Any help would be greatly appreciated. Thanks.
In all likelihood you will probably implement some background queuing mechanism to actually deliver the notifications - at least you certain should be considering this approach.
Assuming this, why not create your delayed notification jobs at event creation time to be delivered when the associated event is starting or finishing. The background queue, which is already waking up periodically to look for work, will pick these up and run them.
However adopting this approach requires you to consider the following (at least):
Removing queued notification job if the associated event is removed
Amending the notification job if the associated event is amended (say a new time)
Ensuring that the polling resolution of the queuing system does not allow notifications to be delivered so late as to be useless.
If you haven't picked a queuing solution for your application you should consider these options