I execute queries that traverse the whole, quite large graph. At the moment some of them take 10 minutes and others run for 3 hours or more. However, this is just the start. I already have to abort some of them after a few hours because I cannot tell whether they will finish in ten minutes or in ten years. It would be very helpful to see some kind of progress during execution.
At the moment such a feature does not (yet) exist. There is a plan to provide a way to conveniently kill a running query in one of the upcoming releases.
In the meantime you can use a feature called guards, which is not part of the public API. Mark Needham did a nice writeup at http://www.markhneedham.com/blog/2013/10/17/neo4j-setting-query-timeout/
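Based on that writeup, the guard boils down to a timeout property in the server configuration. It is an unsupported feature, so treat the property name below as an assumption that may differ between Neo4j versions:

    # conf/neo4j-server.properties
    # Unsupported "guard" timeout taken from the writeup above; verify the
    # property name against your Neo4j version, and note the guard itself may
    # also need to be enabled separately depending on the release.
    org.neo4j.server.webserver.limit.executiontime=60000

With this in place, the server should abort any request that runs longer than the configured number of milliseconds (60 seconds here).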
I worked with Zapier a few years ago and remember that they expect a minimum number of triggers and actions before an app can go live in the app directory. However, I'm not finding that document right now (or maybe it was never there and I'm mistaken?).
Does anyone have context on the minimum number of triggers and actions needed to go live with a Zapier app?
There's no hard and fast rule here; 1 is the minimum. Initially it's recommended to have no more than 5 of each type. From the docs:
We recommend your Zapier integration have no more than 5 of each (trigger, action, or search) at first; we suggest starting with your most popular 2-3 use cases.
You shouldn't pad your integration; it should have enough important functionality to be useful.
I want to make a game in Rails (not with Flash, just HTML). Every action should take some time to execute. For example, the user can send his hero an action such as "go learn ", which should last for 10 minutes. What's the best way to implement this?
I want to store the player's tasks in my database, but how should I handle their execution?
Way 1: when the user logs in or does something, check his tasks and look for finished ones.
Way 2: poll the tasks in my app every X seconds and look for finished ones.
Way 3: use something like the Delayed Job gem. Do you think it's a good fit for my problem?
You could use Delayed Job to run the task. The problem with that is that you will have to manage "many" workers when there is extra load on the site, but it's not that bad either; it's doable as long as it runs every task exactly 10 minutes later.
You could also use a combined approach of ways 1 and 2, which would generally work; see the sketch below.
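Here is a minimal sketch of that combined approach, assuming a Task model with a finishes_at timestamp (all names are illustrative, not from the question):

    # Enqueue: record when the task should finish, and optionally schedule a
    # Delayed Job as a fallback so the task completes even if the user stays away.
    task = hero.tasks.create!(:name => "learn", :finishes_at => 10.minutes.from_now)
    task.delay(:run_at => task.finishes_at).finish!   # way 3 (Delayed Job)

    # Lazy check (way 1): run this when the user logs in or acts.
    hero.tasks.where("finishes_at <= ? AND finished = ?", Time.now, false).each(&:finish!)

finish! would mark the task as done and apply its effects; make it idempotent so the lazy check and the background job can't apply the reward twice.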
I am now researching a solution for a similar task and recently came across a RailsCasts episode that I found worth noting down.
Using custom daemons (and a number of other interesting topics), found here.
P.S.
I see that you asked this question a while back. Could you please share how you ended up implementing your solution?
Cheers
I'm running a Rails 3.0 application on Heroku and using the New Relic add-on/service.
I have been looking at the transaction traces feature (available in the Pro version) to understand a little more about the performance characteristics of the application. However, a significant portion of the time (30-50%) is "uninstrumented time". After making a few stabs at placing method_tracer calls and going through the rather slow cycle of testing whether I get more information, I feel this is going nowhere fast.
It seems the PHP New Relic agent has a great feature for getting very detailed traces without needing to guess where to put method tracers: http://newrelic.com/docs/php/php-agent-faq#top100
Is there anything similar to this for Ruby?
Note: I'm already using rpm_contrib to get some more info and have garbage collection stats enabled. Also, this is not about fixing a performance problem, just about understanding how to better use the available performance tools and scratching a niggling itch about that uninstrumented time.
There isn't currently anything similar for Ruby. I'll mention it to the Ruby engineer when I get a chance. My guess is unless a lot of requests come in for it, it won't be at the top of the list for a while, though. In the meantime, you can use the method tracers to figure out the uninstrumented time.
Hope that helps.
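For reference, attaching a method tracer looks roughly like this (controller and metric names are illustrative):

    require 'new_relic/agent/method_tracer'

    class PagesController < ApplicationController
      include ::NewRelic::Agent::MethodTracer

      def show
        # ... the action you suspect ...
      end
      # Time spent in #show is reported under the given custom metric.
      add_method_tracer :show, 'Custom/PagesController/show'
    end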
Method tracers can work well, but if you have a lot of code in your controller, try a binary search using trace_execution_scoped, which records the time spent in a block of code:
http://newrelic.github.com/rpm/NewRelic/Agent/MethodTracer/InstanceMethods/TraceExecutionScoped.html#method-i-trace_execution_scoped
Add a couple of calls to this, give each metric a sensible name like "Custom/MySlowControllerAction/block0" (the first argument to trace_execution_scoped), and repeat.
The metrics you name will show up not just in Transaction Traces, but also in the Performance Breakdown for the controller action under the Web Transactions tab, so you'll see average time in that block of code across all requests, not just the slow ones.
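A minimal sketch of that binary-search approach (action and metric names are illustrative):

    class MySlowController < ApplicationController
      include ::NewRelic::Agent::MethodTracer

      def action
        trace_execution_scoped(['Custom/MySlowControllerAction/block0']) do
          # first half of the action's work
        end
        trace_execution_scoped(['Custom/MySlowControllerAction/block1']) do
          # second half; split whichever block turns out slower and repeat
        end
      end
    end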
We are experiencing some serious scaling challenges with our intelligent search engine/aggregator. Our database holds around 200k objects. From profiling and New Relic, it seems most of our trouble comes from the database. We are using the smallest dedicated database Heroku provides (Ronin).
We have been looking into indexing and caching. So far we have managed to solve our problems by reducing database calls and caching content intelligently, but now even that seems to be reaching its limit. We are constantly asking ourselves whether our code/configuration is good enough or whether we are simply not using enough "hardware".
We suspect that the database solution we buy from Heroku may be performing insufficiently. For example, a simple count (no joins, nothing else) over the 200k items takes around 250 ms. That seems like a long time, even granting that Postgres is known for poor COUNT performance.
We have also started using geolocation lookups based on latitude/longitude. Both columns are indexed floats. A distance calculation involves fairly complicated math, but we are using the widely recommended geocoder gem, which is supposed to generate well-optimized queries. Even so, geocoder takes 4-10 seconds to perform a lookup over, say, 40,000 objects, returning only the nearest 10. This again sounds like a long time, and the experienced people we have consulted say it sounds very odd, again hinting at database performance.
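The lookup itself looks roughly like this, via geocoder's near scope (model name and coordinates are illustrative):

    lat, lng = 40.7, -74.0                        # example coordinates
    # Returns the 10 closest records within 20 miles, ordered by distance.
    nearest = Item.near([lat, lng], 20).limit(10)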
So basically we wonder: What can we expect from the database? Might there be a problem? And what can we expect if we decide to upgrade?
An additional question I have is: I read here that we can improve performance by loading the entire database into memory. Are we supposed to configure this ourselves, and if so, how?
UPDATE ON THE LAST QUESTION:
I got this from the helpful people at Heroku support:
"What this means is having enough memory (a large enough dedicated
database) to store your hot data set in memory. This isn't something
you have to do manually, Postgres is configured automatically use all
available memory on our dedicated databases.
I took a look at your database and it looks like you're currently
using about 1.25 GB of RAM, so you haven't maxed your memory usage
yet."
UPDATE ON THE NUMBERS AND FIGURES
Okay, so now I've had time to look into the numbers and figures, and I'll try to answer the questions as follows:
First of all, the DB consists of around 29 tables with a lot of relations. But in reality most queries are run against a single table (some additional resources are joined in to provide all the information needed for the views).
The table has 130 columns.
Currently it holds around 200k records, but only 70k are active; hence all indexes are partial indexes on this "state".
All columns we search on are properly indexed; none is of text type, and many are just booleans.
Answers to questions:
The baseline performance is hard to pin down; we have so many different SELECTs. The time taken typically varies from 90 ms to 250 ms when selecting a limit of 20 rows. We have a LOT of counts on the same table, varying from 250 ms to 800 ms.
That's hard to say, because they won't commit to a figure.
We have around 8-10 users/clients running requests at the same time.
Our query load: In new relic's database reports it says this about the last 24 hours: throughput: 9.0 cpm, total time: 0.234 s, avg time: 25.9 ms
Yes, we have examined the query plans of our long-running queries. The count queries are especially slow, often over 500 ms for a pretty simple count over the 70k active records on indexed columns, with a result of around 300.
I've tuned a few Rails apps hosted on Heroku, and also hosted on other platforms, and usually the problems fall into a few basic categories:
Doing too much in Ruby that could be done at the DB level (sorting, filtering, joining data, etc.)
Slow queries
Inefficient use of indexes (not enough, or too many)
Trying too hard to do it all in the db (this is not as common in rails, but does happen)
Not optimizing cacheable data
Not effectively using background processing
Right now it's hard to help you because your question doesn't contain any specifics. I think you'll get a better response if you pinpoint the biggest issue you need help with and then ask about it.
Some info that will help us help you:
What is the average response time of your actions? (from new relic, request-log-analyzer, logs)
What is the slowest request that you want help with?
What are the queries and code in that request?
Is the site's performance different when you run it locally vs. heroku?
In the end I think you'll find that it is not an issue specific to Heroku, and if you had your app deployed on Amazon, Engine Yard, etc., you'd see the same performance. The good news is that your problems are common and shouldn't be too hard to fix once you've done some benchmarking and profiling.
-John McCaffrey
We are constantly asking...
...this seems a lot...
...that is suspected...
...What can we expect...
Good news! You can put an end to seeming, suspecting, wondering, and expecting through the magic of measurement!
Seriously though, you've not mentioned any of the basic points you'd need to get a useful answer:
What's the baseline performance of the DB running a sequential scan and single-row index fetches? You say Heroku says your DB fits in RAM, so you shouldn't see disk I/O issues when you measure.
Does this performance match whatever Heroku says it should be?
How many concurrent clients?
What's your query load - what queries and how often?
Have you checked the query plans for any of your suspiciously long-running queries?
Once you've got this sort of information, maybe someone can say something useful. As it stands anything you read here is just guesswork.
First: check your Postgres configuration (run SHOW ALL from within psql or another client, or just look at postgresql.conf in the data directory). The parameter with the largest impact on performance is effective_cache_size, which should be set to roughly total_physical_ram - memory_in_use_by_kernel_and_all_processes. For a 4 GB machine, this often works out to around 3 GB (4 - 1). (This is very coarse tuning, but it will give the best results as a first step.)
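A quick way to inspect the current value from a Rails console (the setting itself lives in postgresql.conf, e.g. effective_cache_size = 3GB; on Heroku's dedicated plans it is managed for you):

    # SHOW reports the live value of a single configuration parameter.
    setting = ActiveRecord::Base.connection.execute("SHOW effective_cache_size;").first
    puts setting.inspect   # e.g. {"effective_cache_size"=>"3GB"}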
Second: why do you want all those counts? Better to use a typical query: ask only for what is needed, not for what is available. (Reason: there is no possible optimisation for a COUNT(*); either the whole table or a whole index needs to be scanned.)
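When a count is only used to decide whether any matching rows exist, an existence check is much cheaper than COUNT(*) because it can stop at the first hit (model and column names are illustrative):

    Item.where(:state => "active").exists?   # SELECT 1 ... LIMIT 1; stops at the first row
    Item.where(:state => "active").count     # must scan the whole table or index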
Third: start gathering and analysing some query plans for typical queries that perform badly. You can get a query plan by putting EXPLAIN ANALYZE before the actual query. (Another way is to increase the logging level and obtain them from the logfile.) A bad query plan can point you at missing statistics or indexes, or even at bad data modelling.
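For example, from a Rails console (the same statement can be pasted straight into psql; table and column names are illustrative):

    plan = ActiveRecord::Base.connection.execute(
      "EXPLAIN ANALYZE SELECT COUNT(*) FROM items WHERE state = 'active';"
    )
    plan.each { |row| puts row["QUERY PLAN"] }   # prints one plan node per line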
New Relic monitoring can be included as an add-on for Heroku (http://devcenter.heroku.com/articles/newrelic). At the very least this should give you a lot of insight into what is happening behind the scenes, and may help you pinpoint some issues.
How would you update attributes in your database based on the time of day or on what day it is? I have three attributes (energy, hunger, and happiness) that I want to decrease by ten every hour, but I don't quite know how to go about doing this. I know there are timestamps in the database, but I don't really know how to use them. I also want to change the players' skills every day based on their job: if a player has a certain job, add 2 to intelligence every day. But I don't know how to apply that 2 every day. I would greatly appreciate any help with this problem.
A couple of options:
Cron job: you could set up your cron job to access the database directly through a SQL script (probably the simplest solution of all in terms of setup) or go through your Rails application first (e.g. in case you need to run additional business logic before updating the database; you mentioned something about updating it based on the user's job). See this post for the latter approach, and the rake-task sketch after this list.
Background task: Take a look at Starling/Workling or Backgroundrb. You can use either of these to run a background task that could update your database at regular intervals.
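A minimal sketch of the cron-driven variant as a rake task (model, attribute, and task names are illustrative):

    # lib/tasks/game.rake
    # Hourly crontab entry, e.g.:
    #   0 * * * * cd /path/to/app && bundle exec rake game:tick RAILS_ENV=production
    namespace :game do
      desc "Apply hourly stat decay to every player"
      task :tick => :environment do
        # GREATEST keeps each stat from dropping below zero (Postgres/MySQL).
        Player.update_all(
          "energy = GREATEST(energy - 10, 0), " +
          "hunger = GREATEST(hunger - 10, 0), " +
          "happiness = GREATEST(happiness - 10, 0)"
        )
      end
    end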
There are two common but fundamentally different ways of achieving this:
During each request, simulate the amount of time that has passed since the last request. If a user makes two requests three hours apart, simulate three hours passing by subtracting 30 happiness (10/hour times 3 hours) all at once. This is less resource-intensive but requires a little more thinking on your part. It's not difficult for something as simple as "lower a value by 10 every hour", but more complex interactions are harder to model; a sketch of this approach follows after these two options.
Run a cron job that invokes an action in your application every hour, on the hour, to deduct 10 happiness from each account. This is conceptually easier but involves a lot of overhead if you have many users, especially when some of them are idle for long periods.
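A minimal sketch of the first approach (lazy simulation), assuming a last_simulated_at timestamp column on the player model; all names are illustrative:

    class Player < ActiveRecord::Base
      DECAY_PER_HOUR = 10

      # Call from a before_filter on any request that touches this player.
      def catch_up!
        hours = ((Time.now - last_simulated_at) / 1.hour).floor
        return if hours < 1

        decay = hours * DECAY_PER_HOUR
        self.energy    = [energy    - decay, 0].max
        self.hunger    = [hunger    - decay, 0].max
        self.happiness = [happiness - decay, 0].max
        # Advance by whole hours only, so fractional time is not lost.
        self.last_simulated_at = last_simulated_at + hours.hours
        save!
      end
    end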