How to handle requests to a clearly identifiable but incorrectly written resource URL?

Suppose I use a CMS that makes an article available under the following URL: http://example.com/article/1-my-first-and-famous-article/
Internally I can identify the requested article unequivocally by its id (1).
How should I handle requests to a wrong URL (caused by a typing error, manipulation, and so on)? For example, someone requests http://example.com/article/1-my-firsz-and-famous-article/ or http://example.com/article/1-this-article-is-stupid-idiot/. Should I respond with HTTP status code 301 and redirect to the right URL, or with 404 and show a not-found page (maybe with a redirection after a few seconds)? Which is preferable in terms of search engine optimization?

A wrong URL should return a 404 error; an existing page that has moved to a new location should return a 301 redirect.
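The rule in this answer can be sketched as a small lookup. This is only a minimal illustration, not how any particular CMS works: the `ARTICLES` and `MOVED` tables, the `resolve` function, and the old path `/article/1-old-title/` are hypothetical names made up for the example.

```python
# Hypothetical data: current canonical paths and known former locations.
ARTICLES = {"/article/1-my-first-and-famous-article/": "My first and famous article"}
MOVED = {"/article/1-old-title/": "/article/1-my-first-and-famous-article/"}


def resolve(path):
    """Return (status, location) for a requested path."""
    if path in ARTICLES:
        return 200, None         # exact match: serve the article
    if path in MOVED:
        return 301, MOVED[path]  # the page existed here and moved permanently
    return 404, None             # anything else is simply a wrong URL
```

With this rule, a mistyped slug such as /article/1-my-firsz-and-famous-article/ yields 404 even though the id prefix matches.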

Related

How to handle unauthorized accesses gracefully in backend?

I have a Ruby on Rails application which redirects users to the start or login page if they end up at a resource they are not authorized for.
For that, it redirects through a 302 Found.
This does not feel right to me, as for example a successful creation of a resource via POST also returns a 302, with the only difference being that it redirects to the created resource.
On the other hand, it does not seem possible to redirect a user while returning anything other than a 30X status code (such as 401 or 403, which would fit this case).
Am I missing something here, or am I already doing it correctly and this is just the way to go?
Well, I'd say that it depends on the context. For an API I'd go your way: if the user is trying to reach an endpoint without authentication or without sufficient permissions, I'd return a 401 or 403 respectively.
But for a web application without a separate frontend app, you have no choice but to tell the browser where it has to go next, and the only way of doing this is to use redirections (which are only 3xx HTTP codes, see https://developer.mozilla.org/en-US/docs/Web/HTTP/Status#redirection_messages).
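The distinction drawn in this answer could look roughly like the following. It is a framework-independent sketch rather than Rails code; the function name `access_denied`, its parameters, and the `/login` path are assumptions chosen for illustration.

```python
def access_denied(is_api, is_authenticated, login_url="/login"):
    """Return (status, location) for a request that may not access a resource."""
    if is_api:
        # API clients get a bare status code: 401 when not authenticated,
        # 403 when authenticated but lacking permission.
        return (403 if is_authenticated else 401), None
    # A server-rendered web app has to send the browser somewhere else,
    # which is only possible with a 3xx response.
    return 302, login_url
```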

Which RESTful action should I use to redirect to another site?

I have an app where I try to adhere to REST.
The app receives requests for external links that don't belong to the app, so the sole purpose of the action is to redirect the request to the external URL.
My suggestion is to have the following controller/action: redirects_controller#create.
Is my thinking correct or should it be the show action instead?
REST (apart from Rails) is about using the correct HTTP method for the correct action. The Rails part is just using the conventional controller action for a given HTTP method.
So, if you're doing a 301 or 302 redirect to another page, which browsers handle by issuing a GET request to the URL in the redirect response's Location header, do it in a show action. This will allow the user's browser to cache the other page when appropriate, and to not notify the user before redirecting.
(There is a way to redirect POSTs, but you didn't mention it so I expect you're talking about regular 301/302 redirects.)
Coming from a Java background, I'd say that REST actions must be related to CRUD operations. Requests that do not change the resource, as in your case where the intent is to redirect to another page, must be tied to the GET verb, or show in your example.
If you were to create a new resource, you would use POST.
A more detailed explanation can be found in level 2 of the Richardson Maturity Model.

Redirect() vs RedirectPermanent() in ASP.NET MVC

What's the difference between Redirect() and RedirectPermanent()? I have read some articles, but I don't understand when we must use Redirect() and when RedirectPermanent(). Can you show an example?
The basic difference between the two is that RedirectPermanent sends the browser an HTTP 301 (Moved Permanently) status code whereas Redirect will send an HTTP 302 status code.
Use RedirectPermanent if the resource has been moved permanently and will no longer be accessible in its previous location. Most browsers will cache this response and perform the redirect automatically without requesting the original resource again.
Use Redirect if the resource may be available in the same location (URL) in the future.
Example
Let's say that you have users in your system. You also have an option to delete existing users. Your website has a resource /user/{userid} that displays the details of a given user. If the user has been deleted, you must redirect to the /user/does-not-exist page. In this case:
If the user will never be restored again, you should use RedirectPermanent so the browser can go directly to /user/does-not-exist in subsequent requests even if the URL points to /user/{userid}.
If the user may be restored in the future, you should use a regular Redirect.
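The choice in this example boils down to picking one of two status codes. The sketch below is written in Python rather than ASP.NET MVC and does not use the framework's Redirect() or RedirectPermanent() methods; `deleted_user_redirect` and its `may_be_restored` flag are hypothetical names.

```python
from http import HTTPStatus


def deleted_user_redirect(may_be_restored):
    """Return (status, location) for a request to a deleted user's page."""
    # 302 keeps the old URL alive for later; 301 tells clients to forget it.
    status = HTTPStatus.FOUND if may_be_restored else HTTPStatus.MOVED_PERMANENTLY
    return int(status), "/user/does-not-exist"
```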
RedirectPermanent sends a 301 status code and Redirect sends a 302.
They send different response codes to the browser. 301 is a permanent redirect, 302 a temporary one. The end effect is the same, but if the client wants to index links (the most common client that does this is a search engine), then a permanent redirect tells the client to update its records to ignore the old link and start using the new one. A temporary redirect tells the client that the page is redirecting for now, but not to delete the old link from its indexing database.

Redirect on record not found?

In the book Agile Web Development with Rails, it is proposed that when someone tries to access some data on your web site and the record doesn't exist anymore, the user should be redirected to a working page and shown a message.
A user would go to /book/1, but a book with id 1 doesn't exist anymore, so the user is redirected to /books and shown the message "That book doesn't exist". It seems to be a good user experience, but it also seems to break the HTTP protocol. Should it be a temporary redirect? If so, a web crawler will keep hitting that page. Should it be a permanent redirect? If so, the previous content should be available there, and it isn't.
I think that a record-not-found page should issue a 404. Am I wrong? Hitting /book/1 where 1 doesn't exist anymore would return a 404 with the HTML showing exactly the same thing as /books, and maybe an error message.
Agile Web Development with Rails is against that option because the user might keep hitting /book/1, generating 404s only to see what can be seen in /books.
What do you think?
If the resource does not exist, send the 404 status code. It’s really that simple. Redirecting means that only the URL is (temporarily) not valid but the resource does exist.
If there's no 404, search engines have no way to discover that the object has been deleted. So I suppose it's a must.
I think there's a good compromise where you render a 404 template (complete with 404 status code) that prompts the user to continue to /books or /whatever.
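A minimal sketch of that compromise, independent of Rails: the response keeps the 404 status while the body points the user to /books. The function name `book_not_found` and the exact markup are assumptions for illustration.

```python
from html import escape


def book_not_found(book_id):
    """Return (status, html_body) for a book that does not exist."""
    body = (
        "<h1>That book doesn't exist</h1>"
        # Escape the id because it comes straight from the requested URL.
        f"<p>No book with id {escape(str(book_id))} was found. "
        '<a href="/books">Browse all books</a></p>'
    )
    return 404, body
```

Crawlers see the 404 and can drop the URL, while a human visitor still gets a useful way forward.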
If the record doesn't exist anymore, then you should probably use a 301 status code, "permanent redirect".
The difference between 301 and 404 is that a 404 error code should be used when the resource never existed, and a 301 when the resource existed but moved.

What's the correct response to unauthorized HTTP request?

I am writing a web application, and I am not sure what the correct response to an unauthorized request is. For the user it is convenient when the server responds with a 302 and redirects them to the login page. However, somewhere deep inside I feel that 401 is more correct. I am also a little afraid that the 302 could be misinterpreted by search engines.
So how do you respond to your unauthorized requests?
Edit
I am using ASP.NET MVC. This is not important from a theoretical point of view. However, ASP.NET forms authentication uses the 302 approach.
I also like the behavior where the user is redirected after a successful login to the page they originally requested. I am not sure if this can be implemented easily with the 401 approach.
I think the correct response is entirely dependent on the context of the request. In a web application intended for human (not machine) consumption, I prefer to either redirect to login if the user is not authenticated, or render an error page if the user is authenticated but not authorized. I won't typically return an unauthorized response, as it contains too little information for the typical user to help them use the application.
For a web service, I would probably use the unauthorized response. Since it is typically consumed by a program on the other end, there is no need to provide a descriptive error message or redirection. The developer using the service should be able to discern the correct changes to make to their code to use the service properly -- assuming I've done a good job of documenting interface usage with examples.
As for search engines, a properly constructed robots.txt file is probably more useful in restricting it to public pages.
401 seems grammatically correct. However, a 401 is actually a statement presented back to the browser to ask for credentials: the browser would then expect to check the WWW-Authenticate header so that it could challenge the user to enter the correct details.
To quote the spec:
The request requires user authentication. The response MUST include a WWW-Authenticate header field (section 14.47) containing a challenge applicable to the requested resource. The client MAY repeat the request with a suitable Authorization header field (section 14.8). If the request already included Authorization credentials, then the 401 response indicates that authorization has been refused for those credentials. If the 401 response contains the same challenge as the prior response, and the user agent has already attempted authentication at least once, then the user SHOULD be presented the entity that was given in the response, since that entity might include relevant diagnostic information. HTTP access authentication is explained in "HTTP Authentication: Basic and Digest Access Authentication" [43].
If you do a 302 you at least guarantee that the user will be directed to a page where they can log in if non-standard log in is being used. I wouldn't care much what search engines and the like think about 401's.
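To make the quoted requirement concrete, here is a rough sketch of what a conforming 401 response carries. It assumes the Basic scheme purely for illustration; the function name `challenge_response` and the realm value are hypothetical.

```python
def challenge_response(realm="example"):
    """Return (status, headers, body) for a 401 that challenges the client."""
    # The spec requires a WWW-Authenticate header on every 401 response.
    headers = {"WWW-Authenticate": f'Basic realm="{realm}"'}
    # The body is what the user sees if authentication keeps failing.
    body = "Authentication required."
    return 401, headers, body
```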
Send a 401 response, and include a login form on the page you return with it. (i.e. don't just include a link to the login page, include the whole form right there.)
I have to agree with you that the 401 result is actually the correct response.
That said, why not have a well-designed custom 401 page that shows the unauthorised message as well as a link to the login page, and that could use a 15-second JavaScript countdown to send users there automatically?
This way you give the correct 401 response to a bot, which is told that the page is restricted, while a real user gets redirected after being told that they are accessing a secured resource.
Don't bother about the search engines if your site is mainly used by humans. The ideal approach when a user reaches a protected page is to redirect them to a login page, so that they can be forwarded to the protected page after successful login.
You cannot accomplish that with a 401-error, unless you are planning to include a login form in the error page. From the usability point of view, the first case (302) is more reasonable.
Besides, you could write code to redirect humans to your login page, and search engines to 401.
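The redirect-then-return flow described in this answer might be sketched as follows. This is not how ASP.NET forms authentication is implemented; the `next` query parameter, the `/login` path, and both function names are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def login_redirect(requested_path, login_path="/login"):
    """Return (status, location) sending the user to login, remembering the target."""
    return 302, f"{login_path}?{urlencode({'next': requested_path})}"


def after_login_target(login_url, default="/"):
    """Return the path to forward to once login has succeeded."""
    query = parse_qs(urlsplit(login_url).query)
    target = query.get("next", [default])[0]
    # Only follow local paths, so the parameter cannot point to another site.
    if not target.startswith("/") or target.startswith("//"):
        return default
    return target
```

The check on the stored target is a design choice: without it, a crafted login link could forward users to an arbitrary external site after they sign in.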
How are the search engines going to index the secured pages in the first place? Unauthorized users, such as bots, shouldn't be getting that far, IMHO.

Resources