Logging 404 errors (with target and referrer URL fields) - asp.net-mvc

I want to collect and analyze 404 data to address any real issues, in an ASP.NET MVC site (with ELMAH). The chief requirement is to store this information in a more specialized and dense but still queryable format, including the referring site/URL.
I can currently review 404's in ELMAH. However, I do not want ELMAH collecting all my 404's (at least not in the default format), because the error log grows too quickly. Typically only about 1% of an ELMAH 404 log is relevant; the rest is noise such as exception details about mundane vulnerability scans. Finding real errors then becomes very difficult, or even impossible if I have to truncate my ELMAH table weekly.
Also, even after collecting all that data, ELMAH does not offer dedicated fields for the target and referrer URLs (to query or aggregate on), and those are the fields that make managing 404's possible.
If there's a package (e.g. via NuGet) that is able to store to SQL, includes a presentation layer, can sort by most common errors or errors with actual referring sources, and even permits marking them seen/addressed so they do not show in future reports, that would be an ideal solution. Any solution providing a portion of that would be a great start.
In lieu of a recommendation, I will probably add a custom handler to ELMAH and log to SQL through my own data layer.
However, I'd prefer a packaged solution, and it need not leverage ELMAH. I can manually add a filter to ELMAH (Elmah reporting unwanted 404 errors, ELMAH - Filtering 404 Errors) if ELMAH is not part of the solution.
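As a rough illustration of the "dense but still queryable" format described above, here is a minimal sketch in Python with SQLite. It is not an ELMAH feature or an existing package; the table name `not_found`, its columns, and all function names are assumptions made for this example. Each (target, referrer) pair is stored once with a hit counter, and a `reviewed` flag hides entries that have already been addressed.

```python
import sqlite3


def open_log(path=":memory:"):
    """Create (if needed) a compact table keyed on target and referrer URL."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS not_found (
               target   TEXT NOT NULL,
               referrer TEXT NOT NULL DEFAULT '',
               hits     INTEGER NOT NULL DEFAULT 1,
               reviewed INTEGER NOT NULL DEFAULT 0,
               PRIMARY KEY (target, referrer)
           )"""
    )
    return conn


def record_404(conn, target, referrer=None):
    """Count one 404; a repeated (target, referrer) pair only bumps its counter."""
    key = (target, referrer or "")
    cur = conn.execute(
        "UPDATE not_found SET hits = hits + 1 WHERE target = ? AND referrer = ?", key
    )
    if cur.rowcount == 0:
        conn.execute("INSERT INTO not_found (target, referrer) VALUES (?, ?)", key)
    conn.commit()


def mark_reviewed(conn, target):
    """Hide every row for this target from future reports."""
    conn.execute("UPDATE not_found SET reviewed = 1 WHERE target = ?", (target,))
    conn.commit()


def report(conn, with_referrer_only=False):
    """Unreviewed 404s, most frequent first; optionally only those with a referrer."""
    sql = "SELECT target, referrer, hits FROM not_found WHERE reviewed = 0"
    if with_referrer_only:
        sql += " AND referrer <> ''"
    sql += " ORDER BY hits DESC, target"
    return conn.execute(sql).fetchall()
```

Calling `report(conn, with_referrer_only=True)` filters out hits with no referrer, which in practice tends to exclude most scanner traffic, though that is a heuristic rather than a guarantee.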

I'm one of the developers behind https://elmah.io. elmah.io offers some of the features you are looking for. You can search for errors by different key properties. Also the filter part can be implemented using our Rules option, where you can ignore errors from specific user agents and so on.
We are also creating an ErrorLog implementation for ELMAH, making it possible for you to store errors in Elasticsearch: https://github.com/elmahio/Elmah.Io.ElasticSearch. You could search and aggregate all of your 404's using a UI for Elasticsearch like Kibana.

Related

Orchard CMS - How to redirect URL request?

I've created an Orchard website that consists of many mini-websites, all served from the same DB. Each mini-website is selected through a theme that is triggered by the current [unique] URL.
It works as I'd hoped, but I wish to improve the Autoroute paths of my pages.
Currently I'm using:
{Content.Fields.PageOrderPart.SitesTaxonomy.Terms:0}/{Content.Slug}
Which results in:
www.site1.com/sitealpha/gallery
www.site2.com/sitebeta/gallery
What I would like is:
www.site1.com/gallery
www.site2.com/gallery
However, I still need to be able to differentiate pages with the same name [...hence why the Autoroute path above was created in the first place] - or I will obviously get permalink duplication errors.
Can anyone think of an ingenious way to sort this or is there some Orchard feature I may've missed, perhaps url rewrites or possibly an existing MVC method?
Many thanks for your input, PP
Further thoughts (hopefully this doesn't influence any other member's suggestions):
Rewrite rules: Probably aren't dynamic enough for the amount of content I have [still increasing] - and I could see any alterations to existing permalinks being a real nightmare.
Besides, for an unknown reason, I and some other Orchard users can't seem to get a rewrite action to work.
Routes: Honestly, I haven't played with these properly - I can see that capturing and dissecting a URL to stipulate the area / controller / action should be easy enough - but I'm not sure how to go about redirecting to a particular Orchard page?
FilterProvider, IActionFilter: I tested this scenario [which could become quite complicated code wise] and I'm not sure the performance is acceptable -- my dev system seems to really suffer with any code in the 'OnActionExecuting' method.
Update: I've investigated the IActionFilter scenario and it appears my initial performance worries were unfounded [i.e. a fresh install didn't behave any slower with some URL restructuring code on the 'OnActionExecuting' method].
My last hurdle is to discover why IIS Rewrite rules aren't working with Orchard:
https://stackoverflow.com/questions/35226580/orchard-cms-iis-rewrite-rules-do-not-work-for-rewrite-action-types
It could be that multi-tenancy is the feature you are after. You can have one codebase and one DB, but the tenants are completely separate sites, with separate admin areas and so on.
http://docs.orchardproject.net/Documentation/Setting-up-a-multi-tenant-orchard-site

Orbeon and REST API

We use Orbeon with a custom REST interface built on Apache CXF, and we were wondering why Orbeon Builder allows multiple sets of the same application/form.
Of course each set gets its own documentId, but on publish each form overwrites the other (given the same app/form).
So what was the idea behind that? It is manageable with a couple of forms, but we are looking at 300+ forms with multiple users building forms with the builder.
Besides the possibility of user error when renaming a form and accidentally overwriting another on publish, it is quite a headache from an administration point of view.
Speaking about the REST api:
We would like to return meaningful error messages from the persistence layer to the ui. Is that possible with the current builds of Orbeon and if so how? The 404/500 error message doesn't get displayed.
I hope Orbeon / another SO user could give us some insights about that.
It's mainly for historical reasons. We have an RFE to improve on this. Versioning, which is almost completely implemented now, will better allow handling multiple versions of a given form definition.
It's currently not possible to propagate error messages to the UI in the general case. It's possible upon Publish or when using the result-dialog when submitting, if the service returns an HTML response.

Encoding output from trusted sources such as AD

We've been having a debate at work recently about the merits of encoding output data from trusted sources such as Active Directory. We have a web application that displays a list of users queried from AD and allows them to be managed in various ways. The argument goes that if the data coming from AD is not HTML-encoded, then it's possible to inject script and perform XSS-style attacks against the site if you have access to the Domain Controller, for example by adding a script as the first name of an AD user.
The two schools of thought (1 for not validating and 2 for validating) seem to be:
1. If you've got access to the DC, you can do a lot worse than inject code into a site which displays information you've already got access to. You could also just view the information directly. So why bother?
2. If you were a domain admin, you could craft this attack, thus creating a backdoor which would enable you to get access to information even if you left the company.
I think the issue at hand is really a more generic one: do you need to guard against (and thus encode) output data from a trusted source, in addition to the common practice of guarding against malicious input?
It is good practice to always do output encoding. You've tagged the question with MVC, so I assume your web application is an MVC one. If you're using Razor views, output is automatically encoded. More details here.
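To make the point concrete, here is a small sketch of output encoding applied to values read from a directory. It uses Python's standard `html.escape` purely as a stand-in for the encoding that Razor applies automatically; the function name `render_user_row` and the sample values are hypothetical.

```python
from html import escape


def render_user_row(first_name, last_name):
    """Build one table row, encoding both values regardless of their source."""
    return "<tr><td>{}</td><td>{}</td></tr>".format(
        escape(first_name), escape(last_name)
    )


# A first name that someone with directory access set to a script tag.
row = render_user_row("<script>alert(1)</script>", "Smith")
print(row)
```

Because the markup characters are encoded, the browser displays the script text as a name instead of executing it, so the page does not depend on the directory data being clean.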

Logging large volumes of actions in a production MVC/SQL application

We are happy users of the ASP.NET MVC framework and SQL Server, currently using LINQ-to-SQL. It serves our needs well with a consumer-facing application with about 1.4 million users and 2+ million active uniques per month.
We are long overdue to start logging all user actions (views of articles, searches on our site, etc.) and we're trying to scope out the right architecture to do so.
We'd like the archiving system to be its own entity, and not part of the main SQL cluster that stores the production articles and search engine. We'd like it to be its own SQL cluster, starting out with just one box initially.
To simplify the problem, let's say we just want to log the search terms that these millions of users enter into our site each month, and we want to do so in the least cycle-intensive way possible.
My questions:
(1) Is there an asynchronous way to dump the search terms to a remote box? Does LINQ support async for this?
(2) Would you recommend building up a cache of say 1,000 (userId, searchTerm, date) logging items in a RAM cache, and then flushing those at intervals to the database? I assume this method would cut down on open/close connections.
Or am I thinking about this entirely wrong? We'd like to strike a balance between ease of implementation and robustness.
1) Sure you can; there are different solutions to achieve it. LINQ is not the tool you need for this.
2) There should not be any major improvement from doing it, since the "logging" is triggered only when a search is performed. You will end up with two calls instead of one, which is not a big deal.
A suggestion is to use AOP
You can create a clean and separate layer for logging using Postsharp (there are other alternatives though). You will then decorate your actions with the required logging attribute only when you need to trace what is passed to the action.
The main advantages of this approach are:
Logging logic doesn't reside inside your code (you don't need to change your methods code) but is executed before/after your method.
Clean separation of the Aspect from the target method.
You can easily switch on/off the aspects
AOP is a common practice, especially when it comes to behavior that can be added to more than one method, like logging, authentication and so on. And yes, it can be used in an async way.
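The attribute-based approach described above can be sketched with a Python decorator, which plays a similar role to an aspect: the logging runs before the method without changing the method body. This is only an analogy, not PostSharp's API; `log_call`, `search_log` and `search` are hypothetical names.

```python
import functools


def log_call(sink):
    """Return a decorator that records each call's name and arguments in `sink`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The "aspect": runs before the target and leaves its code untouched.
            sink.append((func.__name__, args, kwargs))
            return func(*args, **kwargs)
        return wrapper
    return decorator


search_log = []


@log_call(search_log)
def search(user_id, term):
    """Stand-in for a controller action; it knows nothing about logging."""
    return "results for " + term
```

Removing the `@log_call(...)` line switches the logging off for that method, which mirrors the on/off advantage listed above.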
1) I would suggest creating an HttpModule that "catches" all the search terms entered by the users. How and where you dump that information (you said you will use a SQL box) is another matter, outside the scope of the module, which should just capture the search terms.
Then you can create a component that contains the logic to store that information using async calls (or even a third-party component like Log4Net).
2) If you want to do a kind of batch insert, caching all the information you need to store and dumping it to SQL at some point, I would use MSMQ or another technology that provides reliability: I assume you don't want to lose all that information in the case of a system crash, etc.
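For reference, the in-memory batching idea from the question can be sketched as follows. This is a minimal illustration, not a recommendation: the class name `BatchLogger` and the `write_batch` callback are assumptions, and anything still sitting in the buffer is lost if the process crashes, which is the reliability concern raised above.

```python
import threading


class BatchLogger:
    """Collect (user_id, term, when) tuples and hand them off in batches."""

    def __init__(self, write_batch, max_items=1000):
        self._write_batch = write_batch  # e.g. a function doing one bulk insert
        self._max_items = max_items
        self._items = []
        self._lock = threading.Lock()

    def log(self, user_id, term, when):
        with self._lock:
            self._items.append((user_id, term, when))
            if len(self._items) < self._max_items:
                return
            # Swap the full buffer out under the lock, write it outside the lock.
            batch, self._items = self._items, []
        self._write_batch(batch)

    def flush(self):
        """Write whatever is buffered, e.g. on a timer or at shutdown."""
        with self._lock:
            batch, self._items = self._items, []
        if batch:
            self._write_batch(batch)
```

A timer that calls `flush()` every few seconds would bound how much data a crash can lose, at the cost of more frequent writes.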

ASP.NET MVC: cache with non-cachable portions

I have a heavy page that is cached. This is okay for anonymous users. They all see the same page.
The problem is for logged in users. They should have minor parts of the page re-rendered on every request (like personal notes on content in the page, etc.)
But all the rest of the page should still be cached (it does tons of SQL and calculations when rendered).
As a workaround I put placeholders in the page templates (like #var1#, #var2#, ...).
Then I have a controller method render the View into a string, and use string.Replace to substitute the real values for #var1# and the other placeholders.
Any cleaner way to do such kind of partial "non-caching"?
This is called donut caching.
The ASP.Net MVC framework doesn't currently support it, but it's planned for version 3.
To start things off, it might make sense to go through the page and see if there is anything about it that you can do to streamline or reduce the weight. Depending upon how bad things are, investing some time here might pay off in the long run.
That said, with regard to serving the content to anonymous as well as logged-in users, one option is to have two versions of the page: one for anonymous users and one for logged-in users. This may not be the best approach though, as it means that you now have two versions of the same page to maintain.
Given the lack of support for donut caching mentioned by SLaks, what I would likely do is cache the results of the calculations that are being done for the page (e.g. if you are querying a database for a table of data, cache a DataTable that you can check for before running the operation) and see what that does for the performance. It may not be the most elegant solution in the world, but it may solve the problems that you are having.
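The combination discussed in this thread (cache the expensive part, fill in the per-user part on every request) can be sketched like this. It is a language-agnostic illustration in Python, not ASP.NET MVC code; `render_heavy_page`, `render_for_user` and the `#notes#` placeholder are hypothetical names modeled on the question's #var1# workaround.

```python
import functools


@functools.lru_cache(maxsize=128)
def render_heavy_page(page_id):
    """Stand-in for the expensive render; the result is cached per page_id."""
    # In a real application the SQL and calculations would happen here.
    return "<h1>Page {}</h1><div>#notes#</div>".format(page_id)


def render_for_user(page_id, notes_html):
    """Reuse the cached body and fill in the per-user hole on every request."""
    return render_heavy_page(page_id).replace("#notes#", notes_html)
```

Only the cheap string replacement runs per request. The substituted value should already be encoded or otherwise trusted, since it is inserted into the markup verbatim.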
