Rails - any fancy ways to handle 404s? - ruby-on-rails

I have a rails app I built for an old site I converted from another cms (in a non-rails language, hehe). Most of the old pages are mapped to the new pages using routes.rb. But there are still a few 404s.
I am a rails newb so I'm asking if there are any advanced ways to handle 404s. For example, if I was programming in my old language I'd do this:
Get the URL (script_name) that was being accessed and parse it.
Do a lookup in the database for any keywords, ids, etc found in the new URL.
If found, redirect to the page (or if multiple records are found, show them all on a results page and let user choose). With rails I'd probably want to do :status => :moved_permanently I'm guessing?
If not found, show a 404.
Are there any gems/plugins or tutorials you know of that would handle such a thing, if it's even possible. Or can you explain on a high level how that can be done? I don't need a full code sample, just a push in the right direction.
PS. It's a simple rails 3 app that uses a single Content model.

Put this in routes (after every other route that you have, this will capture every url)
match '*url' => 'errors#routing'
And now in errors controller in routing action you can implement any fancy logic that you want, and render a view as always (you might want to add :status => 404 to the render call). Requested url will be available in controller as params[:url].

There is an ugly way of doing this:
render :file => "#{RAILS_ROOT}/public/404.html", :layout => false, :status => 404
Maybe someone can come with a better solution.

Related

Rails: Find all occurrences of internal 404 links (is Rails "leaking" URLs?)

I came across something curious and was wondering how to fix it within Rails.
In my app, I have a Country model; it already contains records for all countries I'll ever need. However, many of them don't contain any data or are otherwise not yet relevant. These countries yield a 404 error, as they would if the record didn't exist in the first place:
begin
#country = Country.friendly.find(params[:id])
rescue ActiveRecord::RecordNotFound
#country = nil
end
if !#country.nil? && Country.with_data.include?(#country)
# render the view
else
render :file => "#{Rails.root}/public/404", :status => :not_found
end
So assuming that e.g. France contains data while Andorra doesn't, the following should happen:
mysite/country/france --> HTTP 200 OK
mysite/country/andorra --> HTTP 404 Not found
mysite/country/randomstring123 --> HTTP 404 Not found
This all works fine. However, what's curious is that when I track my site in Google Webmaster Tools, it is actually aware of some of the URLs that point to "empty" countries, and shows them to me as 404-yielding "crawling errors". (E.g., it knows mysite/country/andorra.) What I can't see is where Google got those URLs from. Those links are also not included in the WT "Internal Links" section, so that doesn't help.
The routes.rb excludes the index action:
resources :countries, path: "country", except: :index
I generate a sitemap with a custom controller, but it excludes the countries in question.
I conclude that there are two likely options:
It is possible that an earlier version of the sitemap controller included "empty" countries. They might then still be tried by Google eventually (as are some old URLs from a much outdated site structure > 9 months ago).
Otherwise, Rails somehow would have to "leak" URLs to these empty countries. Is there any "internal" way to check that? I'll also run external 404 checks but it would be good to know if I can somehow get an efficient output much alike rake routes somehow.

Rails: Model.find() or Model.find_by_id() to avoid RecordNotFound

I just realized I had a very hard to find bug on my website. I frequently use Model.find to retrieve data from my database.
A year ago I merged three websites causing a lot of redirections that needed to be handled. To do I created a "catch all"-functionality in my application controller as this:
around_filter :catch_not_found
def catch_not_found
yield
rescue ActiveRecord::RecordNotFound
require 'functions/redirections'
handle_redirection(request.path)
end
in addition I have this at the bottom of my routes.rb:
match '*not_found_path', :to => 'redirections#not_found_catcher', via: :get, as: :redirect_catcher, :constraints => lambda{|req| req.path !~ /\.(png|gif|jpg|txt|js|css)$/ }
Redirection-controller has:
def not_found_catcher
handle_redirection(request.path)
end
I am not sure these things are relevant in this question but I guess it is better to tell.
My actual problem
I frequently use Model.find to retrieve data from my database. Let's say I have a Product-model with a controller like this:
def show
#product = Product.find(params[:id])
#product.country = Country.find(...some id that does not exist...)
end
# View
<%= #product.country.name %>
This is something I use in some 700+ places in my application. What I realized today was that even though the Product model will be found. Calling the Country.find() and NOT find something causes a RecordNotFound, which in turn causes a 404 error.
I have made my app around the expectation that #product.country = nil if it couldn't find that Country in the .find-search. I know now that is not the case - it will create a RecordNotFound. Basically, if I load the Product#show I will get a 404-page where I would expect to get a 500-error (since #product.country = nil and nil.name should not work).
My question
My big question now. Am I doing things wrong in my app, should I always use Model.find_by_id for queries like my Country.find(...some id...)? What is the best practise here?
Or, does the problem lie within my catch all in the Application Controller?
To answer your questions:
should I always use Model.find_by_id
If you want to find by an id, use Country.find(...some id...). If you want to find be something else, use eg. Country.find_by(name: 'Australia'). The find_by_name syntax is no longer favoured in Rails 4.
But that's an aside, and is not your problem.
Or, does the problem lie within my catch all in the Application Controller?
Yeah, that sounds like a recipe for pain to me. I'm not sure what specifically you're doing or what the nature of your redirections is, but based on the vague sense I get of what you're trying to do, here's how I'd approach it:
Your Rails app shouldn't be responsible for redirecting routes from your previous websites / applications. That should be the responsibility of your webserver (eg nginx or apache or whatever).
Essentially you want to make a big fat list of all the URLs you want to redirect FROM, and where you want to redirect them TO, and then format them in the way your webserver expects, and configure your webserver to do the redirects for you. Search for eg "301 redirect nginx" or "301 redirect apache" to find out info on how to set that up.
If you've got a lot of URLs to redirect, you'll likely want to generate the list with code (most of the logic should already be there in your handle_redirection(request.path) method).
Once you've run that code and generated the list, you can throw that code away, your webserver will be handling the redirects form the old sites, and your rails app can happily go on with no knowledge of the previous sites / URLs, and no dangerous catch-all logic in your application controller.
That is a very interesting way to handle exceptions...
In Rails you use rescue_from to handle exceptions on the controller layer:
class ApplicationController < ActionController::Base
rescue_from SomeError, with: :oh_noes
private def oh_noes
render text: 'Oh no.'
end
end
However Rails already handles some exceptions by serving static html pages (among them ActiveRecord::RecordNotFound). Which you can override with dynamic handlers.
However as #joshua.paling already pointed out you should be handling the redirects on the server level instead of in your application.

How best to setup rails with only one view?

I'm working on a personal RoR project with an interesting sort of problem: the whole app only needs one HTML template.
Basically, the whole app is presented through HTML5 canvas (it's going to be a game of sorts). But I'd still like there to be URLs for accessing specific resources, such as '/player/1'.
So what's the best, DRYest way to do this? I'd really hate to specify the template in every action in the controllers.
render :file => "layout_file", :layout => false
You could define your view in app/views/layout/application.html.erb and leave all the others empty, but that wouldn't avoid the reloading of pages.
You should also have all your methods respond in json format.
Or just an old good:
render :nothing => true
at the end of your methods.

SEO fix with redirects at Ruby On Rails

I've made mistake and allowed two different routes pointing at same place. Now I've got troubles with duplicated content.
News could be viewed in two ways:
http://website.com/posts/321 and http://website.com/news/this-is-title/321
I want to fix this mess and my idea is to check by what link user is coming. For example if someone will came through http://website.com/posts/321 I would like to redirect visitor to correct route: http://website.com/news/this-is-title/321
My very first idea is to validate request url at Post controller and then in if statement decide about redirecting or simply displaying proper view. Is it good conception?
I think it's not the best fit.
You should do this at routes level using the redirect methods.
I don't think you should bother, take a look at canonical url's if you're worried about SEO
In your posts_controller.rb show:
def show
return redirect_to post_path(params[:id]) if request.fullpath.match /(your regex)/i, :status => 301, :notice => 'This page has been permanently moved'
#post = Post.find(...)
end
return redirect_to is important because you can't call redirect or render multiple times
match the regex on request.fullpath
if you're super concerned about SEO, set the status to 301. this tells search engines that the page has been permanently moved
the notice is optional and only for asthetics after the redirect in case the user has bookmarked the old page url

Rails Best Practices for RESTful controller CREATE and UPDATE methods

OK, I am trying to understand the best practices for the CREATE and UPDATE methods for both HTML and XML formats. The default code for a controller that the rails generator generates is a little unclear to me.
For the CREATE method, given a good save, the generator says to "redirect_to(#whatever)" for HTML and "render :xml => #whatever, :status => :created, :location => #whatever" for XML.
Given a bad save, the generator says to "render :action => 'new'" for HTML and "render :xml => #whatever.errors, :status => :unprocessable_entity" for XML.
However, for the UPDATE method, given a good update, the generator says to "redirect_to(#whatever)" for HTML and "head :ok" for XML.
And, given a bad update, the generator says to "render :action => 'edit'" for HTML and "render :xml => #whatever.errors, :status => :unprocessable_entity" for XML.
I understand this, and it makes sense to me, and works just fine - BUT, I have two questions:
First, for a successful CREATE and UPDATE, HTML format, why "redirect_to(#whatever)" instead of "render :action => 'show'"? I understand the differences between redirect and render, just more curious about which way you guys tend to do it and why. It seems the redirect would be an unnecessary extra trip for the browser to make.
Second, why "head :ok" upon successful UPDATE via XML, but "render :xml => #whatever, :status => :created, :location => #whatever" upon successful CREATE via XML? This seems inconsistent to me. Seems like a successful UPDATE via XML should be the same as a successful CREATE via XML. Seems like you would need the new/updated object returned so you could test it. How do you guys do it and why?
I'd already written this out when Sam C replied, but here it is anyway :-)
For the first part - why redirect instead of render? Two reasons I can think of:
1) It's consistent. If you rendered the show action and the user uses the back button later on return to that page, the user will see unexpected behaviour. Some versions of IE will give you some kind of session timeout error IIRC, other browsers may handle it a bit more gracefully.
Same goes for if the user bookmarked that page and returns to it at a later date using a GET request - they won't see the show action. Your app may throw an error or may render the index action because the user is requesting a URL like http://my.app.com/users, which would map to the index action when using a GET request.
2) if you render the show action without redirecting to a GET request and the user hits refresh, your browser will re-submit the POST request with all the same data, potentially creating duplicate instances of whatever it is you were creating. The browser will warn the user about this so they can abort, but it's potentially confusing and an unnecessary inconvenience.
As for the second part of your question, not too sure to be honest. My guess is that since you are already updating the object in question, you already have a copy of it so do not need another instance of it returned. Having said that, updating an object could trigger various callbacks which modify other attributes of the object, so returning that updated object with those modifications could make sense.
On create or update, redirect_to(#whatever) to clear the post, so that the user doesn't resubmit by refreshing. It also shows the correct url in the address bar for the create case, which posts to the collection path (/whatevers).
head :ok makes a minimal response on update, when usually you would already have the object in the dom. If you are updating the page after an update, the standard method is to use rjs views to update dom elements and render partials.

Resources