Web Server Log Analysis Tool - analysis

Any suggestions for an accurate Web Log analysis tool to generate reports on the IIS logs? We used WebTrends, but I don't feel it was accurate.

To analyze weblogs, I don't think you can go wrong with Analog: http://www.analog.cx/
If you are analyzing your own logs, which are often huge files, you will want the fastest analyzer you can find. Analog is fast.
You'll want one that's been around a while and is still supported. Analog just celebrated its 10th birthday.
Analog claims to be the most popular logfile analyser in the world.
It supports multiple languages.
Did I say it's free and open source?
As far as accuracy goes, no tool gives perfect results. JavaScript often fails to catch hits. Trying to track individual people's paths through a website (i.e. for analytics purposes) is fraught with problems. And even trying to differentiate hits from visits and screen out the bots is more of a black art than a science.
What is best is simply to have a tool that gives decent basic statistics that tell you what you need to know.
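To make the hits-versus-visits distinction concrete, here is a minimal sketch, not how Analog or any other tool actually does it. It assumes an IIS W3C extended log with a `#Fields:` header, chronological entries, a 30-minute inactivity cutoff for a "visit", and a crude user-agent keyword list for bots. The cutoff, the keyword list, and the sample data are all illustrative assumptions.

```python
from datetime import datetime, timedelta

BOT_MARKERS = ("bot", "crawler", "spider", "slurp")  # assumed heuristic, far from complete
VISIT_TIMEOUT = timedelta(minutes=30)                # assumed inactivity cutoff


def summarize(lines):
    """Count hits, bot hits, and approximate visits in W3C extended log lines."""
    fields = []
    hits = bot_hits = visits = 0
    last_seen = {}  # (client IP, user agent) -> time of that client's previous hit
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line.split()[1:]  # column names declared by the log itself
            continue
        if not line or line.startswith("#"):
            continue
        row = dict(zip(fields, line.split()))
        if "date" not in row or "time" not in row:
            continue  # cannot place this entry in time, so skip it
        agent = row.get("cs(User-Agent)", "-")
        if any(marker in agent.lower() for marker in BOT_MARKERS):
            bot_hits += 1
            continue
        hits += 1
        when = datetime.strptime(row["date"] + " " + row["time"], "%Y-%m-%d %H:%M:%S")
        key = (row.get("c-ip"), agent)
        previous = last_seen.get(key)
        # A new visit starts when this client was never seen or was idle too long.
        if previous is None or when - previous > VISIT_TIMEOUT:
            visits += 1
        last_seen[key] = when
    return {"hits": hits, "bot_hits": bot_hits, "visits": visits}


# Made-up sample data for illustration only.
SAMPLE = """#Fields: date time c-ip cs-method cs-uri-stem cs(User-Agent) sc-status
2008-10-01 10:00:00 10.0.0.1 GET /index.htm Mozilla/5.0 200
2008-10-01 10:00:05 10.0.0.1 GET /logo.gif Mozilla/5.0 200
2008-10-01 10:01:00 10.0.0.2 GET /index.htm Googlebot/2.1 200
2008-10-01 11:30:00 10.0.0.1 GET /index.htm Mozilla/5.0 200
""".splitlines()

print(summarize(SAMPLE))
```

Changing either the timeout or the bot list changes the "visits" figure, which is exactly why different tools report different numbers from the same log.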
I've looked at other tools, such as Deep Log Analyzer: http://www.deep-software.com/, which attempts to do analytics from your weblogs. But speed was a problem. They claim their new version 3.5 - April 2008, which I didn't try, has improved performance. The big advantage of a program like this is the advanced reporting you can do, including custom SQL requests. You have to purchase their professional version ($200) to do most of the analytics and custom queries. If Analog is too simple for you, then try the free version of Deep Log Analyzer.
And you can also try Microsoft's own Log Parser, as was the recommended answer in: https://stackoverflow.com/questions/157677/a-good-iis-log-viewer-for-large-log-files.
But you will need some extra skills to use it.

What are you wanting to analyze from your logs? There are a bunch of tools out there - free or paid for - that will go through the logs and spit out a great variety of figures. Some have real meaning, others are best used with a grain of salt.
What none will show you is "How many people are actually reading my wonderful web pages". Those that attempt to show "distinct site visitors" or any detailed metrics are at best a rough approximation to an indication of a vague trend...
But for what it's worth, we use Analog.

SHORT ANSWER:
You are correct to question the results; log analysis is not adequate to report actual traffic.
LONGER ANSWER:
WebTrends is a great tool for what it delivers. But as a previous administrator of a WebTrends installation, I found that web logs are notoriously bad at capturing metrics of interest.
For instance, if there is any caching in your web delivery stack (or on the consumer's side-- *I'm shaking my fist at YOU, AOL!), then your web logs instantly stop reflecting your site's actual activity. This is because log analysis assumes that all user consumption translates to an HTTP request back to the web server, which is then recorded in the IIS logs. When a cache answers the request, that does not happen.
In the future if you want more reliable results, you ultimately need to ensure that there exists a way to bust any caching strategy. The obvious answer is dynamic content. But if you do not want to rewrite all of your content in such a fashion, just ensure your web traffic analysis uses a dynamic call.
WebTrends actually offers a solution to this problem, called SDC server. This is exactly what Google Analytics offers as well-- it's a JavaScript call back to the analysis server.
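The idea behind such a "dynamic call" can be sketched as follows. This is not the SDC or Google Analytics API. It only illustrates, under assumed names, how a tracking request can be made unique so that no cache can answer it. The collector address `https://stats.example.com/hit.gif` and the parameter names are hypothetical.

```python
import time
import uuid
from urllib.parse import urlencode


def beacon_url(collector, page, referrer=""):
    """Build a tracking URL that differs on every call, so caches cannot serve it."""
    params = {
        "page": page,
        "ref": referrer,
        "ts": int(time.time() * 1000),  # client timestamp in milliseconds
        "cb": uuid.uuid4().hex,         # random cache-buster value
    }
    return collector + "?" + urlencode(params)


# Hypothetical collector endpoint, used for illustration only.
print(beacon_url("https://stats.example.com/hit.gif", "/products/index.htm"))
```

Because every URL is unique, each page view reaches the collector and is logged there, even when the page itself came out of a proxy or browser cache.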
...I could go for days on this. If you want more specific information, comment back. ;)
EDIT: With WebTrends, specifically, it is quite important to configure session tracking beyond their default IP/userAgent configuration. If your web server assigns a session cookie, you will find this will increase your reliability, especially for differentiating between users who may sit behind the same NAT.

I have had really good luck with SmarterStats, from SmarterTools.

There is a logging package for free from MSFT for viewing this information using SQL Reporting Services. Google it.

Doing it with the logs is only a good idea if it's internal. I'd use Google Analytics for anything on the internet.

I have been using Summary, which is paid-for software, for years, and love it. The cost of updates is getting to me, and paying for an update just to get user agent string updates out of the deal is getting bothersome. Not that there aren't other fixes; I just tend not to need them.
Anyone care to share how Summary compares to Analog, if you have used both?

Look at the XpoLog log analysis platform for web application server and web server logs. It is a log management and analysis platform that integrates with web server logs, creates reports, provides search and a log viewer, and also monitors for problems.

Related

Tools for analyzing application performance

We have an application deployed on a customer's server to which we have no access.
The only thing we can do is ask someone to perform trivial operations (using a profiler is not a trivial operation) and change our code, for example to add a logger.
So I want to know whether there is a good logger specialized for performance. I know about log4net, but it just logs some information. It would be good to get reports with charts and to present code measurements in a hierarchical view. I want a logger that separates requests from different users. Of course I could write it myself, but maybe some good free tools exist?
If you have no access, the only solution I know is: http://miniprofiler.com/
Other ideas:
ANTS Profiler
New Relic
Custom Attributes with PostSharp
Custom ASP.NET MVC filters
Instead of just logging you may want to think about using an application analytics tool. There are a variety of them available, many with a free usage tier. Depending on the tool you may need to do just a little extra coding to get the measurement data you want or it may require more significant custom coding.
First, one that I have personal experience with is Loupe from Gibraltar Software. This is a logging and analytics tool that started as a logging tool and I think will give you more of what you want with minimal code changes.
The other option I am familiar with is PreEmptive Analytics. This originated as an application analytics tool so you need to do a bit more coding to report on the more raw performance value data.
There are also other options in the .NET space including New Relic, Loggr, EQATEC, and Trackerbird that should all work with either a greater or lesser amount of custom coding for your situation.
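For a sense of how little code the "write it myself" route needs, here is a minimal sketch of a timing logger that nests measurements and keeps each user's requests separate. It is written in Python purely for illustration (the question concerns a .NET application), it is not thread-safe, and the class and method names are made up. It is not the API of any tool mentioned above.

```python
import time
from contextlib import contextmanager


class PerfLog:
    """Collects nested timings, grouped by the user who made the request."""

    def __init__(self):
        self.records = []  # one (user, depth, name, seconds) tuple per measurement
        self._depth = {}   # user -> current nesting depth

    @contextmanager
    def measure(self, user, name):
        depth = self._depth.get(user, 0)
        self._depth[user] = depth + 1
        slot = len(self.records)
        self.records.append(None)  # reserve the slot so output keeps start order
        start = time.perf_counter()
        try:
            yield
        finally:
            self.records[slot] = (user, depth, name, time.perf_counter() - start)
            self._depth[user] = depth

    def report(self, user):
        """Return one indented line per finished measurement for the given user."""
        return [
            "{}{}: {:.1f} ms".format("  " * depth, name, seconds * 1000)
            for (who, depth, name, seconds) in filter(None, self.records)
            if who == user
        ]


log = PerfLog()
with log.measure("alice", "handle request"):
    with log.measure("alice", "load document"):
        time.sleep(0.01)
    with log.measure("alice", "render"):
        time.sleep(0.005)

print("\n".join(log.report("alice")))
```

The report lines could be written to an ordinary log file and sent back by the customer, which fits the "no access to the server" constraint.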

Bug reports solution

Clarification/summary for the question -- we're looking for:
a hosted bug tracking system,
that is as convenient to use as lighthouse/github/launchpad,
can deal with attachments,
integrates email notifications and operations (implies operations in commit messages),
has a script-friendly API,
allows anonymous bug reports, or ones with an email but that do not require setting up an account for submission.
Lighthouse is close but fails on the last point, launchpad is similar, github also doesn't handle attachments. Tender is great for the last point, but fails as a general bug tracking system (and it looks like its open-source version will be limited to basically being a forum).
We looked into a number of applications to install and setup -- but with this range of requirements, they are always coming with a huge cost in terms of investing time in setting up and maintaining a working system.
In our (open-source) project we have been using Gnats for a really long time. It does what it was designed to do just fine, but it's getting to be pretty inconvenient: it's no longer maintained, has features that we never use, and lacks features that we'd want to use... It doesn't deal with attachments, has no easy way to perform actions via email, has no integration with commit messages, and has a web interface that was designed for 90s browsers. So I've been looking around in an attempt to find something that could replace it, hopefully some hosted solution to avoid the setup/maintenance hassle.
Probably the most impressive tool that I've seen is lighthouse: it has a very nice and practical interface, properly deals with attachments, is controllable via email, and can respond to commands in commit messages. But... it doesn't have any sane way to submit a bug anonymously -- and that's a major requirement, since we need any random user to be able to submit bugs through our IDE. (It seems that there is a possible hack to forward an email faking the From field, but that doesn't work very well -- specifically, the reporter should be included in the followup email exchange.) On the other side, there is the related tender tool, which is very good in that area, but is very basic otherwise -- too basic to serve as a bug tracking system.
There's a whole bunch of other sites that I've tried -- it seems that all of them require submitters to have an account, so they don't work well for our needs; they are also limited in various other ways (don't deal with attachments, no good email integration, etc.). It doesn't help that the meta-descriptions of these sites are usually pretty obscure: it took me hours to just figure out what tender/lighthouse are and how they're related, and no site mentions its inability to receive bug reports without registration. (I'm looking only at open-source-friendly sites, since we don't have any kind of budget for such things.)
There's also the option of installing some system locally, but bug tracking systems tend to be monsters that I'd like to avoid configuring and maintaining, if possible.
So the question is: is there anything obvious that I'm missing? Or to make it more concrete: is there a good comparison page somewhere that lays out popular options and their respective features explicitly?
JIRA is free for open source projects. It's far more user friendly than trac and bugzilla, and allows anonymous submissions and plugins. Unfortunately you'll need to host it on your own server, but from personal experience I can tell you that all you need to do is install a database (it can run without; but that's not a good idea) and it basically maintains itself.
Also is there a particular reason why Google Code or Sourceforge issue tracking tools wouldn't work? You don't need to use all their services if you don't want, you could use them purely for issue tracking.
Did you try trac? It is used by many open source projects.
FogBugz is one option. They'll host or you can run it yourself. My company looked at it but ... political considerations ... meant it is not viable here.
Have you looked at this Comparison of issue tracking systems on Wikipedia?
I have also found fixx, by hedgehoglab. Apparently it has the features that you care about most:
Get things done
fixx has an intuitive interface to enable quick bug reporting. Filling in a bug report is as easy as sending e-mail.
Ability to add multiple attachments to issues allowing you to attach screenshots and manage documents related to issues.
Clever notification options to keep relevant people informed while preventing issue tracker spam.
Also:
It has an open REST API.
I see that you are using Subversion as SCM. There is a Subversion integration with fixx.
Its only installation requirement is Sun JDK 1.5.0.
It seems to be free for open source projects, and a hosted version is "Coming soon".
Note that I have never used it, so I cannot give any recommendation.
The open source BugTracker.NET has support for the following areas that are giving you problems:
Attachments
Guest login
Email notifications
SVN commit integration
I found it easy to set up, maintain, and tweak. Of course, you might think otherwise if you are not familiar with .NET or do not have a Windows server available.
You might look at Unfuddle. They do allow an API for the submission of tickets and have your other points covered including attachments.
Take a look at repositoryhosting.com. They have a ready-made solution with trac / svn / git for you. It comes with all kinds of bells and whistles, such as the Agilo plug-in and automatic backup to the Amazon S3 bucket of your choice.
The prices are very reasonable.
Also, jumboxes offers a Trac / SVN virtual appliance that you can host in your own environment.
Redmine is a good open source option. You can check an online demo and a list of features.
It's not hosted though. But it's an interesting option.
And you can always check a list of different open source bug tracking alternatives
I've used ZenDesk in the past and it was rather hassle free.
In addition it has an api: http://www.zendesk.com/api.
Moreover I KNOW it can CC whosoever you want it to whenever anything happens.
We too are looking for a new solution.
At present we're using FogBugz, which is painfully slow.
We need our customers to be able to log bugs via email. Tender looks perfect, with the exception that it doesn't have any obviously usable ID fields that we can pass around. Is there a plugin or similar? I could knock up a browser extension to "goto bug id [whatever]" but that seems kludgy for what should surely be a core feature?

How can I test if my web application could handle heavy traffic?

What would be a proper way to simulate a large number of requests to test if my web application can handle it?
You could try using Microsoft's WCAT tools. Look here: http://support.microsoft.com/kb/231282
They're free, too. That's always nice.
Depending on your budget, you may be interested in some load testing software designed for this. A Google search brings up all sorts of alternatives. This is probably the best way to do it.
This one has a free trial version and isn't too pricey, but I would recommend shopping around first.
I've used JMeter in the past, and I find it to be very useful for stress/load testing a website, even ones written in ASP.NET (with or without MVC).
In general you would want to (with any tool) write a script of what an average user of your site would do. You may even end up creating multiple of these scripts. Tools like JMeter even allow for a random element to be added to a script. With these scripts created a load testing tool can then simulate as many users as you desire hitting your site.
I would recommend letting JMeter slowly ramp up the number of concurrent users while you watch the response time graph. The point where the response time starts climbing steeply is the point where you've hit the maximum number of users (given your scripts) that your site can handle.
ab and httperf are two more Unix-y options, if you don't mind delving in that direction.
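The ramp-up idea is independent of JMeter. Below is a minimal sketch of it, assuming the "script" is just a Python callable (`action`) that performs one user step, and that "too slow" means the mean response time exceeds twice the single-user baseline. The factor of 2, the step sizes, and the function names are arbitrary assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def timed(action):
    """Run one scripted user action and return how long it took, in seconds."""
    start = time.perf_counter()
    action()
    return time.perf_counter() - start


def ramp(action, steps, requests_per_user=5):
    """Run `action` at each concurrency level; return (users, mean seconds) pairs."""
    results = []
    for users in steps:
        with ThreadPoolExecutor(max_workers=users) as pool:
            timings = list(pool.map(timed, [action] * (users * requests_per_user)))
        results.append((users, mean(timings)))
    return results


def knee(results, factor=2.0):
    """Return the first user count whose mean time exceeds factor x the baseline."""
    baseline = results[0][1]
    for users, seconds in results[1:]:
        if seconds > baseline * factor:
            return users
    return None


# Stand-in action; a real test would request a page from the site instead.
for users, seconds in ramp(lambda: time.sleep(0.01), [1, 5, 10]):
    print(users, "users:", round(seconds * 1000, 1), "ms")
```

For a real site, `action` might wrap something like `urllib.request.urlopen(url).read()`, run from a machine other than the server so the load generator does not compete with the application for resources.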
There's a nice screencast for using httperf by peepcode.
Use the load testing tools from Visual Studio Team System. 2010 if you can get it.
The tools are great to use and provide wonderful instrumentation. There is also a programming model to go with the tools, allowing you to make some very complex testing scenarios possible.
Post the URL on stackoverflow.
Make it sound like a challenge, so lots of people come check it out: "Can you find the hidden performance problem in this app?"

Remote Machine Scan

As part of a web-application I'm building, I need to be able to scan the remote user's machine for viruses / malware, before they can continue using the web-application ... something like the McAfee On-Demand Scan.
I'm assuming that ActiveX would be the way to go (since all the On-Demand scanners of the antivirus companies seem to be ActiveX-based).
I'm a bit stuck on how to solve this problem. I'm hoping I don't have to rustle up something from scratch.
Does anybody have any ideas ? Is it possible to integrate some already available component into my code to do this ?
Do let me know if there's more information you need.
Regards,
Sonal.
Short Answer: Just don't do this.
Long Answer: I would seriously re-evaluate your requirements here. Forcing a virus scan from a webapp is essentially impossible to do properly, and serves no real purpose from the perspective of the webapp. The whole point of the web is that it's a request initiated by the user, and run inside a sandbox. Forcing access to the rest of the machine for something like a virus scan is deliberately the exact opposite of the way it is meant to work.
The only thing I can think of which would be sensible would be to offer an on-demand scan, for which you would be best to redirect your users to an expert in the area - Panda ActiveScan is probably as good as any. But services such as these rely on a downloadable program anyway, in the form of a Java applet, browser plugin or similar - it's not done over the web.
Is the user part of your company? Is this an application that they will be required to use as part of their employment? If not, I can hardly see people visiting your site and saying "Oooh... he wants me to download and run a program on my machine!" Sounds like a great way to get your site on a bunch of "block lists".
Also, do you have a commercial arrangement with a virus scanning company that would allow you to install multiple copies of their commercial software on people's machines? I'm guessing not.
Really, I have to agree with Colin. This idea sounds dead before it even starts.

What is the best Delphi n-tier low bandwidth technology?

I need to deploy a Delphi app in an environment that needs a centralized data and file storage system (for document imaging) but has multiple branch offices with relatively poor interconnectivity. I believe a 3-tier database application is the best way to go so I can provide a rich desktop experience with relatively light-weight data transfer needs. So far I have looked briefly at Delphi DataSnap, kbmMW and RemObjects SDK. It seems that kbmMW and RemObjects SDK use the least bandwidth. Does anyone have any experience in deploying any of these technologies in challenging environments with a significant number of users (I need to support 700+)? Thanks!
Depends if you are tied to remote datasets. If you aren't dataset bound then SOAP would likely be a good choice. Or, what I've done is write my own protocol that is similar to SOAP in nature. This was done before SOAP was standard and I'm glad I did - this gives you the ability to control more of the flow of data. It's given that if you have poor connectivity then you will be spending time supporting it. It's very nice if it's your own code you are supporting versus having to wait on a vendor. (Although KBM and REM are known to be pretty good vendors.)
Personal note: 700 users in a document imaging application over poor connectivity sounds like a mess. Spend the money on upgrading connectivity as it'll be cheaper in the long run.
Both kbmMW and RO SDK offer a binary format, which is more compact than the SOAP format; that matters especially if you are working with documents.
RO SDK seems to offer more GUI tools to help you build your services.
Also give a RealThinClient SDK a look, it's a lightweight remoting framework.
But whatever framework you go with, your design will make it fast or slow. I have some applications working on slow 128kb lines, and they work perfectly without any user complaints, but I don't do large file transfers.
One thing to remember... it's not the number of users, but the number of them using the resources at the same time that will be the issue. Attempt to develop your application "server stateless" if at all possible; this will allow greater flexibility in the long term if you find you have to add more servers to the pool to support your customer base. The hardest thing about n-tier is scaling beyond the first server... plan on that from the start. Each request should not know anything about a prior request... or at the very least the request should have a way of passing the context so the server can look it up in a session table or something.
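The "pass the context, look it up in a session table" approach can be sketched like this. It is a language-neutral illustration in Python, not Delphi code and not the API of kbmMW, RemObjects, or DataSnap. The in-memory `SESSIONS` dict stands in for a table shared by all servers in the pool, and every name is hypothetical.

```python
import uuid

SESSIONS = {}  # stand-in for a session table shared by every server in the pool


def login(user):
    """Create a session row and hand the client an opaque token for later requests."""
    token = uuid.uuid4().hex
    SESSIONS[token] = {"user": user, "last_doc": None}
    return token


def handle(token, request):
    """Serve one request using only the token and the shared table, no server memory."""
    session = SESSIONS.get(token)
    if session is None:
        return {"status": "unauthorized"}
    if request["op"] == "open":
        session["last_doc"] = request["doc"]
        return {"status": "ok", "user": session["user"], "doc": request["doc"]}
    if request["op"] == "reopen":
        return {"status": "ok", "user": session["user"], "doc": session["last_doc"]}
    return {"status": "bad-request"}


token = login("branch-office-3")
print(handle(token, {"op": "open", "doc": "invoice-7"}))
print(handle(token, {"op": "reopen"}))
```

Because `handle` keeps nothing between calls, any server that can reach the shared table can answer any request, which is what makes adding a second server straightforward.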
Personally, I would recommend RemObjects. I have used it with good results.
I don't know if it's the very best / most efficient (glad you asked this question!), but I've had good results w/RemObjects SDK + DataAbstract. The latter made much of the plumbing details less involved, which was helpful. Still implementing, but so far so good.
If you really wanna go "low-bandwidth" use BSD Sockets API - that'll give you full control over what's being sent and there you can send as little information as you want. Of course then you'll have to implement all the tiers yourself, but hey - that's still an option :D
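As a rough idea of what "implementing it yourself" involves at the lowest level, here is a minimal sketch of a length-prefixed binary frame that could be sent over a raw socket. The 5-byte header layout (1-byte message type, 4-byte payload length) is an arbitrary assumption for illustration, not a standard protocol, and the code is Python rather than Delphi.

```python
import struct

# Assumed header: 1-byte message type + 4-byte payload length, network byte order.
HEADER = struct.Struct("!BI")


def pack(msg_type, payload):
    """Prefix the payload bytes with the fixed-size header."""
    return HEADER.pack(msg_type, len(payload)) + payload


def unpack(buffer):
    """Return (msg_type, payload, rest), or None if the frame is still incomplete."""
    if len(buffer) < HEADER.size:
        return None
    msg_type, length = HEADER.unpack_from(buffer)
    end = HEADER.size + length
    if len(buffer) < end:
        return None
    return msg_type, buffer[HEADER.size:end], buffer[end:]


frame = pack(1, b"doc-42")
print(len(frame), unpack(frame))
```

The overhead is five bytes per message. The cost is that everything else (retries, authentication, versioning) also has to be written and supported by you.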
