We are 2 working on a website with Ruby on Rails that receives GPS coordinates sent by a tracking system we developped. This tracking system send 10 coordinates every 10 seconds.
We have 2 servers to test our website and we noticed that one server is processing the 10 coordinates very quickly (less than 0.5 s) whereas the other server is processing the 10 coordinates in 5 seconds minimum (up to 20 seconds). We are supposed to use the "slow" server to put our website in production mode this is why we would try to solve this issue.
Here is an image showing the time response of the slow server (on the bottom we can see 8593 ms).
Slow Server
The second image shows the time response of the "quick" server.
Fast Server
The version of the website is the same. We upload it via Github.
We can easily reproduce the problem by sending fake coordinates with POSTMan and difference of time between the two servers remain the same. This means the problem does not come from our tracking system in my opinion.
I come here to find out what can be the origins of such difference. I guess it can be a problem from the server itself, or from some settings that are not imported with Github.
We use Sqlite3 for our database.
However I do not even know where to look to find the possible differences...
If you need further information (such as lscpu => I am limited to a number of 2 links...) in order to help me, please do not hesitate. I will reply very quickly as I work on it all day long.
Thank you in advance.
EDIT : here are the returns of the lscpu commands on the server.
Fast Server :
Slow Server :
May be one big difference is the L2 cache...
My guess is that the answer is here but how can I know what is my value of pragma synchronous and how can I change it ?
The size of the .sqlite3 file I use is under 1 Mo for the tests. Both databases should be identical according to my schema.rb file.
The provider of the "slow" server solved the problem, however I do not know the details. Some things were consuming memory and slowing down everything.
By virtual server, it means finally that several servers are running on the same machine, each is attributed a part of the machine.
Thanks a lot for your help.
Related
I am running Ubuntu (64Bit) with Apache 2.2.17, Passenger 3.0.11, Ruby 1.9.3 and Rails 3.2.6
When accessing the web page (index.html) on my webpage the request takes ages to complete, somewhere around 30 second in extreme cases.
The server has plenty of memory available (top shows more than 4GB free), the Apache processes (there are 10 of them) each show 0% CPU in top and the load is also almost 0 and there are hardly any DB accesses as I cache most of the things with memcached.
The log files of Apache as well as Rails do not show any errors, on the contrary the render times shown in the RubyOnRails log file show excellent values (<100 ms).
So where to go from here?
Is the first request slow or all requests slow? Passengers shutdown after a given time interval. So intermittenly requests (requests with sufficient time span in between) will allow passengers to shutdown (only to be restarted at next request.
Passenger does the autoshutdown BY DESIGN. This is so because on a shared environment, there might be other user's apps. If your app is idle for a while, then the resources can be transferred to other people's app.
If you are on a tight budget and you have multiple apps hosted on the same server, then passenger is a great solution.
If you have only ONE app in your server which you control, then please reconfigure Passengers to NOT shutdown (if that indeed is your problem).
You can do "passenger-status" to see how many passengers are currently running and available for taking requests.
The configuration to ensure that Passengers stay up is PassengerMinInstances and PassengerPoolIdleTime.
Are you accessing it through a 'fake domain name' (added to your /etc/hosts file)?
If so, do
service avahi-daemon stop
At least that's what worked for me on ubuntu 10.10 :)
For some reason a DNS lookup is made on each and every request you do to the server, and when the domain doesn't exists, it times out ...
The performance issue has been keeping me busy for all these days. I believe I have nailed it down to Apache configuration: KeepAliveTimeout, it was set to a very high value (90), can't think why it was set that high, must have been a typo.
My understanding of KeepAliveTimeout is that the Apache process gets locked to the client for 90 seconds, even if the client isn't issuing any further requests, hence when traffic picks up (which it did on that day when performance was significantly reduced, page visits more than tripled) all Apache processes are busy waiting for the KeepAliveTimeout, while blocking all new requests coming in. This would also explain why the system was not showing much load at all, it was just sitting there waiting. I reduced the value down to 10, if traffic picks up I'll probably drop it to 5.
After performing load testing against an app hosted on Heroku, I am finding that the most DB intensive request takes 50-200ms depending upon load. It never gets slower, no matter the load. However, seemingly at random, the request will outright timeout (30s or more).
On Heroku, why might a relatively high performing query/request work perfectly 8 times out of 10 and outright timeout 2 times out of 10 as load increases?
If this is starting to seem like a question for Heroku itself, I'm looking to first answer the question of whether "bad code" could somehow cause this issue -- or if it is clearly a problem on their end.
A bit more info:
Multiple Dynos
Cedar Stack
Dedicated Heroku DB (16 connections, 1.7 GB RAM, 1 comp. unit)
Rails 3.0.7
Thanks in advance.
Since you have multiple dynos and a dedicated DB instance and are paying hundreds of dollars a month for their service, you should ask Heroku
Edit: I should have added that when you check your logs, you can look for a line that says "routing" That is the Heroku routing layer that takes HTTP request and sends them to your app. You can add those up to see how much time is being spent outside your app. Unfortunately I don't know how easy it is to get large volumes of those logs for a load test.
Ok. I know I don't have a lot of information. That is, essentially, the reason for my question. I am building a game using Flash/Flex and Rails on the back-end. Communication between the two is via WebORB.
Here is what is happening. When I start the client an operation calls the server every 60 seconds (not much, right?) which results in two database SELECTS and an UPDATE and a resulting response to the client.
This repeats every 60 seconds. I deployed a test version on heroku and NewRelic's RPM told me that response time degraded over time. One client with one task every 60 seconds. Over several hours the response time drifted from 150ms to over 900ms in response time.
I have been able to reproduce this in my development environment (my Macbook Pro) so it isn't a problem on Heroku's side.
I am not doing anything sophisticated (by design) in the server app. An action gets called, gets some data from the database, performs an AR update and then returns a response. No caching, etc.
Any thoughts? Anyone? I'd really appreciate it.
What does the development log say is slow for those requests? The view or db? If it's the db, check to see how many records there are in database and see how to optimize the queries. Maybe you need to index some fields.
Are you running locally in development or production mode? I've seen Rails apps performance degrade faster (memory usage) over time in development mode. I'm not sure if one can run an app on Heroku in development mode but if I were you I would check into that.
I have a website that is hanging every 5 or 10 requests. When it works, it works fast, but if you leave the browser sit for a couple minutes and then click a link, it just hangs without responding. The user has to push refresh a few times in the browser and then it runs fast again.
I'm running .NET 3.5, ASP.NET MVC 1.0 on IIS 7.0 (Windows Server 2008). The web app connects to a SQLServer 2005 DB that is running locally on the same instance. The DB has about 300 Megs of RAM and the rest is free for web requests I presume.
It's hosted on GoGrid's cloud servers, and this instance has 1GB of RAM and 1 Core. I realize that's not much, but currently I'm the only one using the site, and I still receive these hangs.
I know it's a difficult thing to troubleshoot, but I was hoping that someone could point me in the right direction as to possible IIS configuration problems, or what the "rough" average hardware requirements would be using these technologies per 1000 users, etc. Maybe for a webserver the minimum I should have is 2 cores so that if it's busy you still get a response. Or maybe the slashdot people are right and I'm an idiot for using Windows period, lol. In my experience though, it's usually MY algorithm/configuration error and not the underlying technology's fault.
Any insights are appreciated.
What diagnistics are available to you? Can you tell what happens when the user first hits the button? Does your application see that request, and then take ages to process it, or is there a delay and then your app gets going and works as quickly as ever? Or does that first request just get lost completely?
My guess is that there's some kind of paging going on, I beleive that Windows tends to have a habit of putting non-recently used apps out of the way and then paging them back in. Is that happening to your app, or the DB, or both?
As an experiment - what happens if you have a sneekly little "howAreYou" page in your app. Does the tiniest possible amount of work, such as getting a use count from the db and displaying it. Have a little monitor client hit that page every minute or so. Measure Performance over time. Spikes? Consistency? Does the very presence of activity maintain your applicaition's presence and prevent paging?
Another idea: do you rely on any caching? Do you have any kind of aging on that cache?
Your application pool may be shutting down because of inactivity. There is an Idle Time-out setting per pool, in minutes (it's under the pool's Advanced Settings - Process Model). It will take some time for the application to start again once it shuts down.
Of course, it might just be the virtualization like others suggested, but this is worth a shot.
Is the site getting significant traffic? If so I'd look for poorly-optimized queries or queries that are being looped.
Your configuration sounds fine assuming your overall traffic is relatively low.
To many data base connections without being release?
Connecting some service/component that is causing timeout?
Bad resource release?
Network traffic?
Looping queries or in code logic?
I have a simple Rails app deployed on a 500 MB Slicehost VPN. I'm the only one who uses the app. When I run it on my laptop, it's fast enough. But the deployed version is insanely slow. It take 6 to 10 seconds to load the login screen.
I would like to find out why it's so slow. Is it my code? (Don't think so because it's much faster locally, but maybe.) Is it Slicehost's server being overloaded? Is it the Internet?
Can someone suggest a technique or set of steps I can take to help narrow down the cause of this problem?
Update:
Sorry forgot to mention. I'm running it under CentOS 5 using Phusion Passenger (AKA mod_rails or mod_rack).
If it is just slow on the first time you load it is probably because of passenger killing the process due to inactivity. I don't remember all the details but I do recall reading people who used cron jobs to keep at least one process alive to avoid this lag that can occur with passenger needed to reload the environment.
Edit: more details here
Specifically - pool idle time defaults to 2 minutes which means after two minutes of idling passenger would have to reload the environment to serve the next request.
First, find out if there's a particularly slow response from the server. Use Firefox and the Firebug plugin to see how long each component (including JavaScript and graphics) takes to download. Assuming the main page itself is what is taking all the time, you can start profiling the application. You'll need to find a good profiler, and as I don't actually work in Ruby on Rails, I can't suggest any: google "profile ruby on rails" for some options.
As YenTheFirst points out, the server software and config you're using may contribute to a slowdown, but A) slicehost doesn't choose that, you do, as Slicehost just provides very raw server "slices" that you can treat as dedicated machines. B) you're unlikely to see a script that runs instantly suddenly take 6 seconds just because it's running as CGI. Something else must be going on. Check how much RAM you're using: have you gone into swap? Is the login slow only the first time it's hit indicating some startup issue, or is it always that slow? Is static content served slow? That'd tend to mean some network issue (either on the Slicehost side, or your local network) is slowing things down, assuming you're not in swap.
When you say "fast enough" you're being vague: does the laptop version take 1 second to the Slicehost 6? That wouldn't be entirely surprising, if the laptop is decent: after all, the reason slices are cheap is because they're a fraction of a full server. You're using probably 1/32 of an 8 core machine at Slicehost, as opposed to both cores of a modern laptop. The Slicehost cores are quick, but your laptop could be a screamer compared to 1/4 of core. :)
Try to pint point where the slowness lies
1/ application is slow, or infrastructure (network + web server)
put a static file on your web server, and access it through your browser
2/ If it is fast, it is probable a problem with application + server configuration.
database access is slow
try a page with a simpel loop: is it slow?
3/ If it slow, it is probably your infrastructure. You can check:
bad network connection: do a packet capture (with Wireshark for example) and look for retransmissions, duplicate packets, etc.
DNS resolution is slow?
server is misconfigured?
etc.
What is Slicehost using to serve it?
Fast options are things like: Mongrel, or apache's mod_rails (also called passenger phusion or
something like that)
These are dedicated servers (or plugins to servers) which run an instance of your rails app.
If your host isn't using that, then it's probably defaulting to CGI. Rails comes with a simple CGI script that will serve the page, but it reloads the app for every page.
(edit: I suspect that this is the most likely case, that your app is running off of the CGI in /webapp_directory/public/dispatch.cgi, which would explain the slowness. This tends to be a default deployment on many hosts, since it doesn't require extra configuration on their part, but it doesn't give good performance)
If your host supports "Fast CGI", rails supports that too. Fast CGI will open a CGI session, and keep it open for multiple pages, so you get much better performance, but it's not nearly as good as Mongrel or mod_rails.
Secondly, is it in 'production' or 'development' mode? The easy way to tell is to go to a page in your app that gives an error. If it shows you a stack trace, it's in development mode, which is slower than production mode. Mongrel and mod_rails have startup options to determine whether to run the app in production or development mode.
Finally, if your database is slow for whatever reason, that will be a big bottleneck as well. If you do have a good deployment (Mongrel/mod_rails/etc.) in production mode, try looking into that.
Do you have a lot of data in your DB? I would double check that you have indexed all the appropriate columns- because this can make a huge difference. On your local dev system, you probably have a lot more memory than on your 500 mb slice, which would result in the DB running a lot slower if you have big, un indexed tables. You can also run the slow queries logger in MySql to pinpoint columns without indexes.
Other than that, yes- passenger will need to spool up a process for you if you have not been using the site recently. If this is the case, you should see a significant speed increase on second, and especially third and later page loads.
You might want to run a local virtual machine with 500 MB. Are you doing a lot of client-server interaction? Delays over the WAN are significant
You might want to check out RPM (there's a free "lite" version too) and/or New Relic's Tune Up.
Your CPU time is guaranteed by Slicehost using the Xen virtualization system, so it's not that. Don't have the other answers for you, sorry! Might try 'top' on a console while you're trying to access the page.
If you are using FireFox and doing localhost testing (or maybe even on LAN) you may want to try editing the network.dns.disableIPv6 setting.
Type about:config in the address bar and filter for network.dns.disableIPv6 and double-click to set to true.
This bug has been reported mainly from Vista OS's, but some others as well.
You could try running 'top' when you SSH in to see which process is heavy. If you also have problems logging you, perhaps you may try getting Statistics in the Slicehost manager.
If you discover it is MySQL's fault, consider decreasing the number of servers it can spawn.
512 seems decent for Rails application, you might have to check if you misconfigured too.