Get execution time of every part of a profile plan - neo4j

Is there a way to measure the time it takes to perform each part of a Neo4j execution plan?
I can see the total execution time and total db hits, as well as the db hits and estimated rows for each part of the execution plan, but not the time it takes to perform each part, for example the time a 'Filter' or 'Expand(All)' operation takes.

No, you can't.
But you do have the number of db hits for each box, so you already have a view of the resource consumption of each part.
Why do you want to know the time of each part?
Update after comment
A db hit is an abstract unit of work for the database. The more db hits a box has, the more work it has to do, and so the more time it takes.
On the other hand, execution time depends a lot on the state of your machine: are other processes using the CPU, memory, network, or hard drive?
So comparing execution times is a bad habit; you should compare db hits instead.
Db hits always correlate with a query's execution time, but the opposite is not necessarily true.
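For example, profiling a query like the following (the labels and properties here are only an illustration, not taken from the question) reports db hits and estimated rows for every operator box:
PROFILE
MATCH (p:Person)-[:KNOWS]->(f:Person)
WHERE p.name = 'Alice'
RETURN f.name
Each operator in the resulting plan (e.g. NodeByLabelScan, Filter, Expand(All)) carries its own db hits figure, and those are the numbers worth comparing between plan variants.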

Related

Neo4j query execution time: when executing the same query multiple times, only the first one seems to be correct

I'm using the LDBC dataset (SF = 1) to test execution time in Neo4j 4.0.1. I connect from Java and use ResultSummary.resultAvailableAfter() to get the execution time, i.e. the time until the result is available and starts streaming.
For a given query, the execution time of the first run seems reasonable, on the order of hundreds of ms, but when I keep running the same query, the execution time drops to almost 0.
I guess this is an effect of the query cache, but is there a proper approach to testing query execution time that gives a reasonable result?
Right now I can only restart the database to get numbers that seem correct.
I guess this is because Neo4j caches the query result directly and just fetches it when the same query is executed again. Is there any way to avoid this, i.e. to let Neo4j do its normal caching (nodes and relationships) without caching the query result itself?
Thanks!
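For reference, a minimal sketch of the measurement described above, using the standard Neo4j Java driver (4.x API); the connection details and the query itself are placeholders:
import java.util.concurrent.TimeUnit;
import org.neo4j.driver.*;
import org.neo4j.driver.summary.ResultSummary;
public class TimingCheck {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {
            Result result = session.run("MATCH (p:Person) RETURN count(p)");
            ResultSummary summary = result.consume();
            // time until the server made the result available for streaming
            long ms = summary.resultAvailableAfter(TimeUnit.MILLISECONDS);
            System.out.println("Result available after " + ms + " ms");
        }
    }
}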
The page cache is most likely responsible for the results you are seeing. (I had some discussions with Neo4j engineers while building a Neo4j cluster, and their suggestions for optimizing our cluster's performance seemed to point to this.) You should set the page cache size to 0 or very close to 0 (say 1 MB or something similarly low). You can read about the memory settings here: https://neo4j.com/docs/operations-manual/current/performance/memory-configuration/
The specific setting you need to change is
dbms.memory.pagecache.size=1M
or set it to 0. Set this explicitly to a value; don't leave the setting commented out, or Neo4j may assign a default size to the page cache. Restart your server/cluster after the settings change and see what performance numbers you come up with. You should also check how your cache looks by running
:sysinfo
command in the browser before and after running your queries.
There is no direct setting to tell Neo4j what to cache; that is, rightly, decided by the server itself.
Sorry, I don't have enough reputation points to leave a comment on your question!

VoltDB Stored Procedures history

I have many stored procedures running in VoltDB and it seems like one of them is causing CPU spikes every now and then, but I don't know which one.
Is there somewhere I can see the history of all the stored procedures that ran, so that I can pinpoint the problematic one based on the time it occurred?
I have tried turning command logging on, but it's a binary file, so I have no way of reading it.
My next option is to log from inside the stored procedures, but I'd prefer to keep that as a last resort because it would require some extra development/deployment and it wouldn't cover internal procedures.
Is there any way to log, or somehow see, when stored procedures ran?
There isn't a log of every transaction in VoltDB that a user can review. The command log is not meant to be readable and only includes writes. However, there are some tools you can use to identify poorly performing or long-running procedures.
You can call "exec @Statistics PROCEDUREPROFILE 0;" to get a summary of all the procedures that have been executed, including the number of invocations and the average execution time in nanoseconds. If one particular procedure is the problem, it may stick out.
You can also grep the volt.log file for the phrase "is taking a long time", which is a message printed when a procedure or SQL statement takes longer than 1 second to execute.
Also, there is a script in the tools subdirectory called watch_performance.py, which can be used to monitor performance. It is similar to calling "exec @Statistics PROCEDUREPROFILE 0;" at regular intervals, except that some columns are gathered from additional @Statistics selectors and the output is formatted for readability. Running "./watch_performance.py -h" will output help and usage information. For example, you might run it during a performance load to get a picture of the workload, or over a longer period of time, perhaps at less granular intervals, to see how the workload fluctuates over time.
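For example, from the command line (assuming sqlcmd is on your PATH; adjust the volt.log path to your installation), something like:
echo "exec @Statistics PROCEDUREPROFILE 0;" | sqlcmd
grep "is taking a long time" voltdbroot/log/volt.log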
Disclosure: I work for VoltDB

Background job taking twice the time of the same operation within Rails

In my Rails application, I have a long calculation requiring a lot of database access.
To keep it short: my calculation takes 25 seconds.
When I implement the same calculation in a background job (one big single worker), it takes twice as long (i.e. 50 seconds). I have tried several techniques to put the job in a background process, but none of them changed the outcome: Delayed Job, Sidekiq, and running the work inside my Rails process but in a thread created for it. All of them show the same 2x impact on performance.
This performance difference only exists in the Rails 'production' environment. It looks like Rails performs an optimisation there that is not applied in my background job.
My technical environment is the following:
I am using Ruby 2.0 / Rails 4.
I am using Unicorn (but I have the same problem without it).
The job is using Rails.cache to store some partial computations.
I am using PostgreSQL.
Does anybody have a clue where this impact might come from?
I'm assuming you're comparing the background job's speed to the speed of running the operation during a web request? If so, you're likely benefiting from Rails's QueryCache, which caches db queries during a web request. Try disabling it as described here:
Disabling Rails SQL query caching globally
If that causes the web request version of the job to take as long as the background job, you've found your culprit. You can then enable the query cache on your background job to speed it up (if it makes sense for your application).
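Here is a small sketch of both directions (ActiveRecord::Base.uncached and ActiveRecord::Base.cache are standard ActiveRecord; run_calculation is a placeholder for your 25-second operation):
# Compare fairly: disable the query cache around the in-request version
ActiveRecord::Base.uncached do
  run_calculation
end
# Or give the background job the same benefit: enable the query cache there
ActiveRecord::Base.cache do
  run_calculation
end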
A background job is not something to use for speeding things up. Its main purpose is 'fire and forget': it removes 25 seconds of synchronous calculation at the cost of some more calculation done asynchronously. That way you can tell the user that their request is being processed and return with the calculation later.
You may get a speed gain from background jobs by splitting a big task into smaller ones and running them at the same time. In your case I think that's impossible to use, because of the dependencies between the operations in your calculation.
So if you want to speed up your calculation, you need to look into denormalizing your data structure: store some pre-calculated values for your big calculation at the moment the source data for it is updated. You will then calculate less when the user requests results and more when the data is stored, and that is a good place to use a background job: finish your data update, then create a background task to update the caches. If a user requests the calculation before this task has finished, they will still have to wait for the cache to fill up.
Update: I think I still need to answer your main question. Basically, the additional time in background task processing comes from the implementation. Because of the 'fire and forget' approach, nobody wants the background task scheduler to consume a large amount of processor time just monitoring for new jobs. I am not completely sure, but I think that if your calculation were twice as complex, the overhead would stay at roughly the same 25 seconds.
My guess is that the extra time comes from the need for your background worker to load Rails and all of your application. My clue is that you said the difference was greatest with Rails in production mode. In production mode, subsequent calls to the app make use of the app and class cache.
How to check this hypothesis:
Change your background job to do the following:
print a log message before you initiate the worker
start the worker
run your calculation. As part of your calculation startup, print a log message
print another log message
run your calculation again
print another log message
Then compare the two times for running your calculation (a rough sketch of this experiment follows below).
Of course, you'll also gain some extra time benefits from database caching, code might remain resident in memory, etc. But if the second run is much much faster, then the fact that the second run didn't restart Rails is more significant.
Also, the time between the log messages from steps 1 and 3 will help you understand the startup time.
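A rough sketch of that experiment (Benchmark ships with Ruby's standard library; run_calculation again stands in for the real work):
require 'benchmark'
Rails.logger.info "about to start the worker"
first = Benchmark.realtime { run_calculation }   # first run pays any boot/warm-up cost
Rails.logger.info "first run: #{first.round(2)}s"
second = Benchmark.realtime { run_calculation }  # second run in the same process
Rails.logger.info "second run: #{second.round(2)}s"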
Fixes
Why wait?
Most important: why do you need the results faster? E.g., tell your user that the result will be emailed to them after it is calculated. Or let your user see that the calculation is proceeding in the background and, later, show them the result.
The key for any long running calculation is to do it in the background and encourage the user to not wait for the result. They should be able to do something else until they get the result.
Start the calculation automatically. As soon as the user logs in, or after they do something interesting, start the calculation. That way, when (and if) the user asks for the calculation, the answer will either be already done or will soon be done.
Cache the result and bust the cache as needed. Similar to the above, start the calculation periodically and automatically. If the user changes some data, restart the calculation by busting the cache (a small sketch follows below). There are also ways to halt an ongoing calculation if data changes during the calculation.
Pre-calculate part of the calculation. Why does a DBMS calculation take 25 seconds or more? It could be that you should change the calculation. Investigate adding indexes, summary tables, de-normalizing, splitting the calculation into smaller steps that can be pre-calculated, etc.
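Here is a small sketch of the 'cache the result and bust the cache' idea using Rails.cache, which the job already uses according to the question (the method names and cache key are illustrative):
def calculation_result(record)
  Rails.cache.fetch("calculation/#{record.id}", expires_in: 12.hours) do
    run_expensive_calculation(record)   # the 25-second computation
  end
end
# Bust the cache whenever the underlying data changes, e.g. from a model callback:
def bust_calculation_cache(record)
  Rails.cache.delete("calculation/#{record.id}")
end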

Lots of "COMMIT;" in the PostgreSQL log of slow queries

I am trying to optimize the PostgreSQL 9.1 database for a Rails app I am developing. In postgresql.conf I have set
log_min_duration_statement = 200
I then use PgBadger to analyze the log file. The statement which, by far, takes up most of the time is:
COMMIT;
I get no more information than this and I am very confused as to what statement this is. Does anyone know what I can do to get more detailed information about the COMMIT queries? All other queries show the variables used in the statement (SELECT, UPDATE, etc.), but the COMMIT queries do not.
As @mvp notes, if COMMIT is slow the usual reason is slow fsync()s, because every transaction commit must flush data to disk, usually with the fsync() call. That's not the only possible reason for slow commits, though. You might:
have slow fsync()s as already noted
have slow checkpoints stalling I/O
have a commit_delay set - I haven't verified that delayed commits get logged as long running statements, but it seems reasonable
If fsync() is slow, your best option is to restructure your work so you can run it in fewer, larger transactions. A reasonable alternative can be to use commit_delay to group commits; this can improve overall throughput, but it will actually slow individual transactions down.
Better yet, fix the root of the problem: upgrade to a RAID controller with a battery-backed write-back cache, or to high-quality SSDs that are power-fail safe. Ordinary disks can generally do no more than one fsync() per rotation, i.e. between 5,400 and 15,000 per minute depending on the drive, and that is the best case, assuming all they're doing is trivial flushes. With lots of transactions and lots of commits, that limits your throughput considerably. By contrast, if you have a durable write cache on a RAID controller or SSD, the OS doesn't need to make sure the data is actually on the platters; it only needs to make sure it has reached the durable write cache, which is massively faster because that cache is usually just some power-protected RAM.
It's possible fsync() isn't the real issue; it could be slow checkpoints. The best way to tell is to check the logs for complaints about checkpoints happening too frequently or taking too long. You can also enable log_checkpoints to record how long and how frequent checkpoints are.
If checkpoints are taking too long, consider raising the checkpoint completion target, checkpoint_completion_target (see the docs). If they're too frequent, increase checkpoint_segments.
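A hedged example of the relevant postgresql.conf knobs on 9.1 (the values are illustrative starting points, not recommendations):
log_checkpoints = on                   # log how frequent checkpoints are and how long they take
checkpoint_segments = 16               # raise this if checkpoints are too frequent
checkpoint_completion_target = 0.9     # spread checkpoint I/O over more of the interval
#commit_delay = 10                     # microseconds; only if you decide to group commits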
See Tuning your PostgreSQL server for more information.
COMMIT is a perfectly valid statement whose purpose is to commit the currently pending transaction. Because of the nature of what it really does, making sure that data is really flushed to disk, it is likely to take most of the time.
How can you make your app work faster?
Right now, it is likely that your code is using so-called auto-commit mode, that is, every statement is implicitly COMMITted.
If you explicitly wrap bigger blocks of work in BEGIN TRANSACTION; ... COMMIT; blocks, you will make your app work much faster by reducing the number of commits.
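Since this is a Rails app, that usually means wrapping the related writes in a single ActiveRecord transaction rather than letting each save issue its own commit (the model and variable names are illustrative):
ActiveRecord::Base.transaction do
  records.each(&:save!)   # one COMMIT for the whole block instead of one per save
end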
Good luck!
Try to log every query for a couple of days and then see what is going on in the transaction before the COMMIT statement.
log_min_duration_statement = 0

Tracking impressions/visits per web page

I have a site with several pages for each company, and I want to show how each page is performing in terms of the number of people visiting the profile.
We have already made sure that bots are excluded.
Currently, we record each hit in a DB with either an insert (for the first request of the day to a profile) or an update (for subsequent requests that day to the same profile). But, given that requests have gone from a few thousand per day to tens of thousands per day, these inserts/updates are causing major performance issues.
Assuming no JS solution, what would be the best way to handle this?
I am using Ruby on Rails, MySQL, Memcache, Apache, and HAProxy to run the overall show.
Any help will be much appreciated.
Thx
http://www.scribd.com/doc/49575/Scaling-Rails-Presentation-From-Scribd-Launch
You should start reading from slide 17.
I think performance isn't a problem if it's possible to build a solution like this for a website as big as Scribd.
Here are 4 ways to address this, from easy estimates to complex and accurate:
Track only a percentage (10% or 1%) of users, then multiply to get an estimate of the count.
After the first 50 counts for a given page, start updating the count only 1/13th of the time, by a count of 13. This helps when a few pages are doing most of the counting, while keeping small counts accurate. (Use 13 because it's hard to notice that the increment isn't 1.)
Save exact counts in a cache layer like memcache or local server memory, and write them all to disk when they reach 10 counts or have been in the cache for a certain amount of time (see the sketch after this list).
Build a separate counting layer that 1) always has the current count available in memory, 2) persists the count to its own tables/database, and 3) has calls that adjust both places.
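Here is a rough sketch of option 3 with the stack from the question (Memcache + MySQL under Rails). It uses the Dalli memcached client; DailyHit, the key scheme, and the flush cadence are illustrative, not prescribed:
require 'dalli'
CACHE = Dalli::Client.new('localhost:11211')
# Called on every profile request: one atomic memcache increment, no DB write
def record_hit(profile_id)
  CACHE.incr("hits/#{profile_id}/#{Date.today}", 1, nil, 1)
end
# Run periodically (cron or a background job): one UPDATE per profile per flush
def flush_hits(profile_ids)
  profile_ids.each do |id|
    key   = "hits/#{id}/#{Date.today}"
    count = CACHE.get(key).to_i
    next if count.zero?
    # assumes the day's row already exists (e.g. created on the first hit)
    DailyHit.where(profile_id: id, day: Date.today)
            .update_all(["hits = hits + ?", count])
    CACHE.decr(key, count)
  end
end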
