Detecting end-user connection speed problems in Apache for Windows

Our company provides web-based management software (service desk, help desk, timesheets, etc.) for our clients.
One of them has been causing us a great headache for some months, complaining about the connection speed to our servers.
In our own tests, the connection and response speeds are always great.
Some information about this specific client:
They have about 300 PCs on their local network, all sharing the same bandwidth/server for internet access.
They don't allow us to ping their server, so we can't run a traceroute.
They claim every other site (Google, blogs, news, etc.) always responds fast. We know for a fact that they have no intention of misleading us and that this is true.
They might have up to 100 PCs simultaneously logged in to our software at any given time. They need to increase that to 300, so this is a major issue.
They have been helpful and collaborative on this issue, which we have been trying to resolve for a long time.
Some information about our server and software:
We have been able to serve more than 400 users at a single time without major speed loss for other clients.
We have gone to great lengths to make good use of data caching and opcode caching in the software itself, and we did notice the improvement (from fast to faster).
There are no database, CPU or memory bottlenecks or leaks. Other clients are able to access the server just fine.
We have little to no experience analyzing problems specific to a single end user (Apache running on Windows Server), and this is where I could use a lot of help.
Anything related to Apache configuration would also be helpful.
While all signs point to this being an internal problem on this specific client's network, we are dedicating effort to solving that too, if that is the case, but we do not have professionals trained to deal with network problems. The client does, but their main argument remains that 'all other sites are fast, only yours is slow'.

You might want to have a look at the tools from Google's Page Speed family: http://code.google.com/speed/page-speed/docs/overview.html
Your customer could run the Page Speed extension for you; maybe then you can find out what the problem is: http://code.google.com/speed/page-speed/docs/extension.html
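Server-side, one standard Apache diagnostic (a sketch of stock httpd directives, not something from the Page Speed docs) is to log how long each request took to serve and how many bytes actually crossed the wire, then compare this client's entries against everyone else's. If %D stays small for their requests while they still report slowness, the time is being lost on their network rather than in your stack:

    # %D = microseconds Apache spent serving the request
    # %I / %O = real bytes in/out, provided by mod_logio
    LoadModule logio_module modules/mod_logio.so
    LogFormat "%h %l %u %t \"%r\" %>s %b %D %I %O" timing
    CustomLog logs/timing.log timing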

Related

Scalable Delphi TCP server implementation

Any suggestions for components to use as a base for a scalable TCP server? I currently have an implementation that uses Indy, which works well for, say, 100 relatively active connections or 1,000 relatively inactive connections, but the one-thread-per-connection model limits the number of concurrent active connections that can be handled.
Let's say my goal might be 1,000 connections each processing 10 messages per second or 10,000 connections each processing 1 message per second on a good server (8-16 cores). Is this realistic? I'd really like to hear of any real-world implementations because I have found that what might work in theory does not necessarily work in practice and I do not want to be chasing a proposed solution that will not work.
Edit: IOCP would be good, but I only want to use commercial-grade classes/components, so they need to be as "professional" as Indy or IP*Works before I would think of using them. Furthermore, I have no intention of "rolling my own" solution - it would take too much time to make it commercial-grade. Lastly, I am looking for a significant improvement on what I already have. I am sure I can squeeze at least 20-50% more out of what I have (based on Indy), but I am never going to be able to handle 10,000 concurrent clients, or 10,000 messages per second, no matter how hard I try. Whether there is something out there that meets these conditions is another matter.
I have decided to accept the answer referring to the IOCP classes, even though I have not used them, because they look like the best path for investigation at this stage.
There is a project at http://voipobjects.com/ which is based on the former iopcclasses project.
It claims to handle thousands of simultaneous connections:
The IOCP engine is a set of classes, components and routines for the rapid creation of highly scalable, high-performance TCP/UDP applications. Applications created using the IOCP classes can handle thousands of simultaneous connections.
The library is written in Delphi; Delphi 7 - 2010 are supported.
The library uses I/O completion ports, the most powerful technology in the Win32 world for creating highly scalable, high-performance TCP/UDP applications. This technology is supported on all desktop Windows OSes except the old Win9x/WinME versions.
The library is licensed under MPL 1.1. It also includes some files from the JEDI project (a Winsock2 header translation).
https://bitbucket.org/voipobjects/iocpengine
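Not Delphi, but to make the completion-port model concrete: recent Python versions default to an IOCP-backed ("proactor") event loop on Windows, so a few lines show the shape of a server that multiplexes every connection onto one event loop instead of one OS thread per socket. An illustrative sketch only, not a drop-in replacement for Indy:

    import asyncio

    async def handle(reader, writer):
        # Each connection is a coroutine, not an OS thread; the loop
        # resumes it only when the completion port reports finished I/O.
        while data := await reader.read(4096):
            writer.write(data)
            await writer.drain()
        writer.close()

    async def main():
        server = await asyncio.start_server(handle, "0.0.0.0", 9000)
        async with server:
            await server.serve_forever()

    # On Windows this runs on the IOCP-backed proactor event loop.
    asyncio.run(main())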
My favorite Delphi network layer is ICS by Francois Piette. It's fantastically easy to understand, very scalable, and ultra-high-performance. Free and open source. It will probably scale to 1,000 clients for most people, without significant effort and without the complexity that gives me trouble when I use Indy.
I got about a 20% scalability/performance boost from switching all my stuff from Indy to ICS.
You should look at the RealThinClient SDK: http://www.realthinclient.com/about.htm
It's a well-proven solution with good support, and there are test results for different server solutions on its home page.
The real deciding factor is what you plan to do in each of those transactions.
I use Indy with Network Load Balanced Windows servers. One of these Delphi applications is serviced by 3 physical servers listening on one public IP address, where we have received millions of requests since yesterday with zero errors. Load overnight is pretty idle, so the actual rate is around 350 requests/second/server during the day, and there's plenty of room for growth.
If there's not a lot of CPU/memory needed per transaction, you might get away with one box using Indy. It all depends on the load... you likely can't write to 1,000 different files every second.
There are other items to worry about too, like the OS supporting this amount of activity. You may need to tweak some registry settings (see this Stack Overflow question).
IOCP is the way to go for ultra-capacity servers. I have used Indy for its ease of implementation and debugging for a very long time. I have my own IOCP implementation that I wrote years ago but never rolled out to production, as we simply haven't needed to.
My simple advice: I'd highly suggest rolling it out with Indy, using NLB as your crutch for load; after that, if you still want the utmost speed, write your own IOCP implementation so you can craft it towards your specific requirements. Note that this is based on knowing nothing of your actual implementation requirements.
I've tried multiple Delphi networking solutions and found that many, if not all, add complexity and code that impacts performance, footprint or both. So I started searching for the lightest wrapper around the Winsock API. Then I (re)discovered Delphi's own TTcpClient and TTcpServer components. Using them in blocking mode, and subclassing TCustomTcpServer to override the DoAccept method, I've had the best results so far.
If you expect a really high number of incoming connections serving (small) responses to (small) requests, it's highly advisable to implement I/O completion ports, as they handle incoming requests better.
I have been using ICS for the last 12 years. It is non-blocking. I support up to 2,000 concurrent connections, each receiving at least 5,000 bytes per second, sent as 1,000 bytes every 200 ms. I have never faced any problems, and the app's CPU usage is very small.
There is good support in the forum, though I have never needed it.
Shekar

Number of users to a webpage

I have a webpage that reads data from a Microsoft Access file on my website. How many users can visit that page at the same time?
Would the page crash at some point if too many users tried to visit it at the same time? Is it better to use a PHP file that reads the data from a text file, or is it just the same?
There are many variables that influence how many people can simultaneously use your website (loosely known as scalability), including your database, hardware, network, caching and more. And yes, at some point your performance will degrade if more and more users access the page.
It would be really hard to say how scalable your website is from the information you provided. PHP could be faster, but not necessarily. Always be skeptical about technologies that promise superior performance.
For the moment, your best option is to estimate how many concurrent users you are expecting and then use a load-testing tool like JMeter, Apache Bench or others to assess whether your website will stand up to the load.
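For instance, a first smoke test with Apache Bench might look like this (the URL is a placeholder; -n is the total number of requests, -c is how many run concurrently):

    ab -n 1000 -c 50 http://www.example.com/page.php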
It turns out that my website was hosted on Domain.com. Domain.com says that I have unlimited bandwidth, but in reality I don't.
My website was crashing because it was hosted with thousands of other websites on the same server, so the bandwidth is limited even though it says unlimited. My only solution was to host my website on a VPS, basically putting my website on a server by itself.

How to specifically identify server issues from a load test result (using LoadRunner)?

How do you isolate a performance issue to a specific component of the application infrastructure? Specifically, are there distinct markers in the result logs that distinguish between bottlenecks at web, application and/or database server levels?
I was asked this question in an interview and went blank on it. Seems this information is not available anywhere.
In addition to SiteScope and other agentless monitoring of system components, you need to make sure your scenario and scripts are working as expected. This includes proper error checking and use of transactions (among a host of other things). If the transactions are granular enough, they will give you insight into at least which requests have performance issues. Once you have these indicators, work with the infrastructure team to review logs and other information. Being an iterative process, tests can be made to focus on a smaller and smaller section of the infrastructure.
In addition, LoadRunner scripts don't have to come strictly 'through the front door'. If you have a multi-tiered system, scripts can be made to hit the web/app/database servers directly.
As for what to look for, focus on any measurements that show 'knees' or 'hockey stick' behaviour. You can hook into any of the server resource measurements directly in the Controller and integrate other teams' stats in the analysis phase. Compare with benchmarks at lower virtual-user levels to determine what is acceptable and unacceptable.
Good luck!
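To make the "granular transactions" point concrete, here is a sketch of the usual pattern in a LoadRunner (C) web script; the step name is hypothetical. Wrapping each logical step in its own transaction means a slow step shows up by name in the analysis instead of blending into one end-to-end number:

    lr_start_transaction("login");
    web_submit_form("login.do", LAST);    // hypothetical recorded step
    lr_end_transaction("login", LR_AUTO); // pass/fail decided automatically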
If the interview was focused on LoadRunner, and SiteScope was considered, I'd conclude that it was really about HP/Mercury solutions. In that case, I'd suggest you look into HP Diagnostics and its LoadRunner integration capabilities.
This type of information is usually not available just by looking at the standard results from a performance test.
Part of the information you are looking for MAY be found by using SiteScope to monitor all the relevant servers in the test. SiteScope offers many counters to look at, such as CPU, memory, disk I/O and network I/O, as seen on each server.
This information can give clues as to where the bottleneck is, and the more counters you add to SiteScope, the bigger the chance of pinpointing it.
It is a very common misconception that app-server and DB-server bottlenecks can be identified by just looking at the raw response times, hits, pages, etc. (web protocol), unless of course the URI accessed identifies the exact component(s) in the system...

What are the requirements for an application health monitoring system?

What, at a minimum, should an application health-monitoring system do for you (the developer) and/or your boss (the IT Manager) and/or the operations (on-call) staff?
What else should it do above the minimum requirements?
Is monitoring the 'infrastructure' applications (ms-exchange, apache, etc.) sufficient or do individual user applications, web sites, and databases also need to be monitored?
If the latter, what do you need to know about them?
ADDENDUM: Thanks for the input. I was really looking for application-level monitoring, not infrastructure monitoring, but it is good to know about both.
Whether the application is running (a minimal sketch of such a check follows this list).
Unusual CPU/memory/network usage.
Reports of any unhandled exceptions.
Status of various modules (if applicable).
Status of external components (databases, web services, file servers, etc.).
Number of pending background tasks (if applicable).
Maybe track usage of the application and report statistics on the most/least used functionality, so you know where optimizations are most beneficial.
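As a concrete floor, the first item in this list reduces to a few lines of Python. A sketch only: the URL, the thresholds and the print-as-alert are placeholders for your own health endpoint and notification hook:

    import time
    import urllib.error
    import urllib.request

    HEALTH_URL = "http://localhost:8080/health"  # hypothetical endpoint
    CHECK_EVERY = 60  # seconds

    def check_once():
        """Return a problem description, or None if the app looks healthy."""
        started = time.monotonic()
        try:
            with urllib.request.urlopen(HEALTH_URL, timeout=10) as resp:
                if resp.status != 200:
                    return "unexpected status %d" % resp.status
        except (urllib.error.URLError, OSError) as exc:
            return "unreachable: %s" % exc
        elapsed = time.monotonic() - started
        if elapsed > 5.0:  # arbitrary latency threshold
            return "slow response: %.1fs" % elapsed
        return None

    while True:
        problem = check_once()
        if problem:
            print("ALERT:", problem)  # stand-in for an email/SMS/pager hook
        time.sleep(CHECK_EVERY)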
The answer is 'it depends'. Why do you need to monitor? How large is your operations staff? Do you need reporting? What is the application environment? Who cares if the application fails? Who cares if an exception happens? Are any of the errors recoverable? I could ask questions like these for a long time.
Great question.
Some time ago we looked for an application-level monitoring solution for our needs, without any luck. Popular monitoring solutions are mostly aimed at monitoring infrastructure and, in my opinion, are too complicated for the requirements of most small and mid-sized companies.
We required (mainly) the following features:
alerts - we wanted to know about incidents as fast as possible
painless management - a hosted service would be best
visualizations - it's good to see what is going on and draw some knowledge from the data
Because we didn't find a suitable solution, we started to write our own. We finally ended up with an up-and-running service called AlertGrid. (You can check it out for free, of course.)
The idea behind it is to provide an easy way to handle custom monitoring scenarios. The integration API is very simple (one function with two required parameters). At the moment, we and others are using it to:
monitor scheduled tasks (cron jobs)
monitor entire application logic execution
alert on errors in applications
we are also working on examples of basic infrastructure monitoring using AlertGrid
This is such an open-ended question, but I would start with physical measurements.
1. Are all the machines I think are hosting this site pingable?
2. Are all the machines which should be serving content actually serving some content? (Ideally this would be hit from an external network.)
3. Is each expected service on each machine running?
3a. Have those services run recently?
4. Does each machine have hard drive space left? (Don't forget the DB.)
5. Have these machines been backed up? When was the last time?
Once one lays out the physical monitoring of the systems, one can address checks specific to a system:
1. Can an automated script log in? How long did it take?
2. How many users are live? Have there been a million fake accounts added?
...
These sorts of questions get more nebulous and can be very system-specific. They also can usually be derived reactively when responding to physical measurements: a hard drive fills up, and maybe it's because the web server logs grew when a bunch of agents created too many fake users. That kind of thing.
While plan A shouldn't necessarily be reactive, that is the way many a site sets up its monitoring system (a sketch of a couple of the physical checks above follows this answer).
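A Python sketch of the physical checks: a plain TCP connect stands in for items 1-2 ("is it up and serving?"), and a disk-usage call covers item 4. Hostnames, ports and paths are placeholders; in practice you'd run the disk check on each box via a scheduled task:

    import shutil
    import socket

    HOSTS = [("web01.example.com", 80), ("db01.example.com", 1433)]  # placeholders

    def port_open(host, port, timeout=3.0):
        # Cheap "is it actually serving?" probe: a plain TCP connect.
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    for host, port in HOSTS:
        print(host, port, "up" if port_open(host, port) else "DOWN")

    # Hard drive space left (item 4); alert when free space gets thin.
    usage = shutil.disk_usage("/")
    print("disk free: %.0f%%" % (100.0 * usage.free / usage.total))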
Minimum: make sure it is running :)
However, some other stuff would be very useful. For example, the CPU load, RAM usage and (on multiuser systems) which user is running what. Also, for applications that access the network, a list of network connections for each app. And (if you have access to the client computer(s)) it would be cool to see the app's window title: maybe check every 2-3 minutes whether it has changed and save it. A list of files open by the application could also be very useful, but it is not a must.
I think this is fairly simple - monitor so that you can be warned early enough before something goes wrong. That means monitor dependencies and the application itself.
It's really hard to provide specifics if you're not going to give details on the application you're monitoring, so I'd say use that as a general rule.
At a minimum, you want to know that the system is healthy. What defines 'healthy' is subjective: is it that the computers are up, that the needed resources exist, that data is flowing through the system, that the data is producing correct results, and so on?
In my project we monitor most of this and then some. It really comes down to the highest level at which you can verify that everything is working. In our case we need to check all the way down to the data output. If you only need to know whether the machines are up, that saves you from trying to show an inexperienced end user what is wrong.
There are also 'off-the-shelf' tools that will do a lot of the hard work for you if you are not digging too deep into data results. I particularly liked Nagios when I was looking around, but we needed more than it could easily show, so I wrote our own monitoring system. Basically, we also watch for 'peculiarities' in the system: memory/CPU spikes, etc.
Thanks everyone for the input. I was really looking for application-level monitoring, not infrastructure monitoring, but it is good to know about both.
The difference is:
infrastructure monitoring would be the servers plus MS Exchange Server, Apache, IIS, and so forth
application monitoring would be the user machines and the specific programs they use to do their jobs, and/or the servers plus the data-moving/backend applications they run to keep the data flowing
Sometimes it's hard to draw the line; an oversimplified definition might be 'if your team wrote it, it's an application; if you bought it, it's infrastructure'.
I think in practice it is best to monitor both.
What you need to do is break down the business process of the application and have the software emit events at the major business components. In addition, you'll need to create end-to-end synthetic transactions (e.g., emulating end users clicking on a website). All that data is then fed into a monitoring tool. In the past, I've used JMX for applications, flowing into Tivoli Monitoring's JMX Adapter, and I've written scripts that implement a 'fake user' and pipe the results into Tivoli Monitoring's Script Adapter. Tivoli Monitoring takes the data and creates application health and performance charts from that raw data.
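An end-to-end synthetic transaction can start as small as a timed, scripted login. This sketch assumes a hypothetical form endpoint and page marker, and the final print stands in for feeding the two facts (success and latency) into whatever monitoring tool you use:

    import time
    import urllib.parse
    import urllib.request

    LOGIN_URL = "https://app.example.com/login"  # hypothetical endpoint
    form = urllib.parse.urlencode({"user": "probe", "password": "secret"}).encode()

    started = time.monotonic()
    with urllib.request.urlopen(LOGIN_URL, data=form, timeout=15) as resp:
        ok = resp.status == 200 and b"Welcome" in resp.read()  # hypothetical marker
    elapsed = time.monotonic() - started

    # Report success + latency to the monitoring tool of your choice.
    print("synthetic login:", "ok" if ok else "FAILED", "%.2fs" % elapsed)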

How should I monitor potential threats to my site?

By looking at our DB's error log, we found a constant stream of almost-successful SQL injection attacks. Some quick coding avoided that, but how could I have set up a monitor for both the DB and the web server (including POST requests) to check for this? By this I mean: if there are off-the-shelf tools for script kiddies, are there off-the-shelf tools that will alert you to their sudden random interest in your site?
Funnily enough, Scott Hanselman had a post on UrlScan today, which is one thing you could do to help monitor and minimize potential threats. It's a pretty interesting read.
UrlScan does seem like a nice option for IIS 6 and 7; I also found dotDefender (paid), which also covers Apache or IIS 5-7, and I had found an SQL injection sanitation ISAPI filter.
It is also worth noting, in light of a recent widespread SQL injection attempt, that disallowing your webapp's DB user account from querying the system tables (in MS SQL Server, sysobjects and syscolumns) is a good idea.
I think this thread warrants more free solutions for Apache and other web servers.
Unfortunately, intrusion detection was not what I had in mind, so sgfree isn't exactly a website attack monitor, unless I'm misunderstanding how it works.
If you can go back and modify your app code, I'd suggest integrating log4j/log4net into the application. From there, you could write code that checks a form field or URL (say, at the global.asax level for .NET apps) and makes a log entry whenever malicious code is detected.
The nice thing about log4j/log4net is that you can configure an e-mail/pager/SMS-type appender, so as soon as a malicious attempt is caught, you are notified.
I'm in the process of merging some log4net code into our CMS system, and I'm looking to do just this in light of the influx of ASPRox attacks that have been coming our way.
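The detection itself can start as a naive pattern check run over each incoming parameter before the request is handled. A sketch in Python; the signatures are illustrative only and will both miss real attacks and flag some legitimate input:

    import re

    # Crude signatures of common injection probes; real WAF rule sets
    # are far more extensive, so treat this as illustrative only.
    SUSPICIOUS = re.compile(
        r"('|%27)\s*(or|and)\b|union\s+select|;\s*drop\s+table|xp_cmdshell",
        re.IGNORECASE,
    )

    def looks_malicious(value):
        return bool(SUSPICIOUS.search(value))

    for sample in ["name=bob", "id=1' OR '1'='1", "q=union select password"]:
        print(sample, "->", looks_malicious(sample))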
Monitoring web and DB access logs should alert you to things like this, but if you want a more fully featured alert system, I would suggest some kind of IDS/IPS. You'll need a spare machine, though, and a switch that can do port mirroring.
If you have those, then an IDS is a cheap way of monitoring your traffic for the many intrusion attempts there will be (and there will be lots). Snort-based (www.snort.org) IDSes are excellent, and there are some free, fully packaged versions available. One I have used is StrataGuard (http://sgfree.stillsecure.com/); it can be configured as an IDS (Intrusion Detection System) or an IPS (Intrusion Prevention System), and it is free to use if your traffic does not exceed 5 Mbps.
If you do go with an IDS/IPS, I'd advise you to let it run as a simple IDS for a month or so before you allow it to prevent attacks.
This may be overkill, but if you have a spare machine lying around, it can't hurt to have an IDS running passively.
You can set up your system to kick out an error message that then makes a JSON or HTTP call to a system that will monitor, report (log) and send out any kind of alert, such as SMS/email or a phone call.
Check out developer.alertcaster.com
Especially if you need to monitor multiple simultaneous events, which it sounds like you have going on, this might be a good fit.
