Googlebot receiving missing template error for an existing template - ruby-on-rails

In the last couple of days, we have started to receive a missing template error when Googlebot attempts to access our main home page (welcome/index). I have been staring at this for a couple of hours and know that I am just missing something simple.
A ActionView::MissingTemplate occurred in welcome#index:
Missing template welcome/index with {:handlers=>[:erb, :rjs, :builder, :rhtml, :rxml, :haml], :formats=>["*/*;q=0.9"], :locale=>[:en, :en]}
But the template does exist (index.html.haml). If it didn't, no one could access our home page.
Here is some additional environment information:
* REMOTE_ADDR : 66.249.72.139
* REMOTE_PORT : 56883
* REQUEST_METHOD : GET
* REQUEST_URI : /
* Parameters: {"controller"=>"welcome", "action"=>"index"}
Any insights you have would be greatly appreciated.

These errors come from the way GoogleBot formats its HTTP_ACCEPT header. While valid (see W3 reference), the header carries a q=0.6 parameter (the figure may vary), which is used as a separator. Since no other media type is specified, this q=0.6 is not necessary, and I assume this is why Rails doesn't treat the header correctly.
(It seems to depend on Rails version. On Rails 3.0.12, it raises a MissingTemplate exception.)
Adding the following code from a previous answer to the concerned controller is not sufficient: it responds with an error 406.
respond_to do |format|
  format.html
end
To make this work under Rails 3.0.12 and return something to the GoogleBot (better than a 406 error), you need to add this code, which sets the request's format to html as soon as a */*;q=0.6-like HTTP_ACCEPT is detected (Rails loads the header value into request.format).
# If the request's 'HTTP_ACCEPT' header indicates a '*/*;q=0.6' format,
# we set the format to :html.
# This is necessary for GoogleBot, which performs its requests with '*/*;q=0.6'
# or similar HTTP_ACCEPT headers.
if request.format.to_s =~ %r%\*\/\*%
  request.format = :html
end
respond_to do |format|
  format.html
end
While this works, the code has to be added to every controller action you want the GoogleBot to index, which is really not DRY!
To fix this issue once and for all, I implemented a small Rack middleware that does even better: it checks the request's HTTP_ACCEPT header and replaces any header matching */*;q=0.6 (the figures can vary) with the common */*. Since the q=0.6 has no meaning when it is not followed by another media type, this rewrite doesn't change the header's meaning. We don't force Rails into any given format; we just tell it, in a way it can understand, that any format will do.
You can find the middleware, the loading initializer and an integration test in this gist.
Gem version here:
https://github.com/ouvrages/rails_fix_google_bot_accept
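The gist and gem are not reproduced here. As a rough illustration of the idea described above, a minimal sketch of such a middleware might look like the following. The class name NormalizeWildcardAccept, the regular expression, and the echo_app stand-in are all assumptions made for this example, not the code from the gist.

```ruby
# Hypothetical sketch of the technique described above; NOT the gist's code.
class NormalizeWildcardAccept
  # Matches an Accept header that consists solely of "*/*" plus a q parameter,
  # e.g. "*/*;q=0.6" or "*/*; q=0.9". Headers listing other media types
  # are left untouched.
  LONE_WILDCARD_WITH_Q = %r{\A\s*\*/\*\s*;\s*q=[0-9.]+\s*\z}

  def initialize(app)
    @app = app
  end

  def call(env)
    accept = env["HTTP_ACCEPT"]
    # Rewrite only the lone-wildcard case; the meaning of the header is unchanged.
    env["HTTP_ACCEPT"] = "*/*" if accept && accept.match?(LONE_WILDCARD_WITH_Q)
    @app.call(env)
  end
end

# Stand-in for the downstream application: echoes the Accept header it received.
echo_app = ->(env) { [200, { "Content-Type" => "text/plain" }, [env["HTTP_ACCEPT"].to_s]] }
middleware = NormalizeWildcardAccept.new(echo_app)

puts middleware.call("HTTP_ACCEPT" => "*/*;q=0.6")[2].first           # rewritten
puts middleware.call("HTTP_ACCEPT" => "text/html,*/*;q=0.8")[2].first # unchanged
```

In a Rails application, a middleware like this would typically be registered with config.middleware.use in config/application.rb or an initializer; where exactly it should sit in the stack is left open here.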

I am getting the same error. I did some investigation and came to the conclusion that it is a 'bug' in Rails. */*;q=0.9 is the value of the HTTP Accept header. I'm not exactly sure what is going on, but in Rails 3.0 this works. In Rails 3.1 it returns a 500 response, and in Rails 3.2 it returns a 406 response.
Update:
There is an open bug regarding this issue. One workaround is to set this new option in Rails 3.1:
config.action_dispatch.ignore_accept_header = true
However... if you serve any pages other than HTML you'll need to rely on the extension to denote the type (e.g. /users/1.json) instead of accept headers.

The solution to the problem is to specify the format in your action.
Up until now, I had simply had the following in my index action
def index
end
Once I inserted a respond_to block
def index
  respond_to do |format|
    format.html
  end
end
I stopped getting the missing template errors.

the interesting part in the error that you posted is :formats=>["*/*;q=0.9"]
the rails-app tries to find a template for the format "*/*;q=0.9" which is not going to work.
i guess that google is somehow using this as a format query parameter like welcome?format=*/*;q=0.9
afaik latest rails versions will just render a 406 in those cases.

Related

missing a template for this request format and variant

I am new to Ruby on Rails and am trying to gain a strong understanding of how MVC works.
I did the following:
rails new bubblesman
rails generate controller bubble
in my bubble controller I created a method as follows:
def available
  puts "YEP!!!!!!"
end
I put the following in my routes file:
get 'welcome' => 'bubble#available'
I navigate to http://localhost:3000/welcome
I get the below error:
ActionController::UnknownFormat (BubbleController#available is missing a template for this request format and variant.
request.formats: ["text/html"]
request.variant: []
NOTE! For XHR/Ajax or API requests, this action would normally respond with 204 No Content: an empty white screen. Since you're loading it in a web browser, we assume that you expected to actually render a template, not… nothing, so we're showing an error to be extra-clear. If you expect 204 No Content, carry on. That's what you'll get from an XHR or API request. Give it a shot.):
what I also don't understand is if I put this in my helper controller instead of my main controller it all works fine.
You need to create the available.html.erb file within the app/views/bubble/ directory. When the route takes you to that action, Rails also renders the matching view, so if you put:
<h2>YEP!!!!</h2>
as the only line in that file, it should return that to you on the webpage.
In the future, you could use rails g scaffold bubbles and that will create a majority of the files (MVC) and routes for you.

InvalidCrossOriginRequest when trying to send a Javascript Asset

I'm trying to create an "asset controller" shim which will filter static asset requests so only authorized users can retrieve certain assets. I wanted to continue to use the asset pipeline, so I set up a route like this:
get 'assets/*assetfile' => 'assets#sendfile'
Then I created an AssetsController with one method "sendfile". Stripping it down to only the stuff that matters, it looks like this:
class AssetsController < ApplicationController
  def sendfile
    # Basically the following function forces the file
    # path to be Rails.root/public/assets/basename
    assetfilename = sanitize_filename(params[:assetfile] + '.' + params[:format])
    send_file(assetfilename)
  end
end
It looks like I have to run this in production mode, as Rails bypasses my route for assets in development. So I precompile my assets, and I can verify in the controller that the files exist where they are expected to be.
However, now the problem is that I'm getting a "ActionController::InvalidCrossOriginRequest" when the Javascript asset is requested (just using the default application.* assets for now). I've read about this error and I understand that as of Rails 4.1 there are special cross-origin protections for Javascript assets. Sounds good to me, but I don't understand where the "cross-origin" part is coming from. Using firebug, I can see that the asset requests are being requested from the same domain as the original page.
I am certain that this is the problem because I can solve it by putting "skip_before_action :verify_authenticity_token" in the beginning of my controller. However, I really don't want to do this (I don't fully understand why this check is necessary, but I'm sure there are very good reasons).
The application.html.erb file is unchanged from the default generated file so I assume it's sending the CSRF token when the request is made, just as it would if I didn't have my own controller for assets.
So what am I missing?
Ok, I think I answered my own question (unsatisfactorily). Again, long post, so bear with me. I mistakenly forgot to add this to my original questions, but I'm using Ruby 2.2.0 and Rails 4.2.4.
From looking at the code in "actionpack-4.2.4/lib/action_controller/metal/request_forgery_protection.rb", it looks like Rails is doing two checks. The first check is the "verify_authenticity_token" method, which does the expected validation of the authenticity token for POST requests. For GET requests, it ALSO sets a flag which causes a second check on the computed response to the request.
The check on the response simply says that if the request was NOT an XHR (AJAX) request AND the MIME Type of the response is "text/javascript", then raise an "ActionController::InvalidCrossOriginRequest", which was the error I was getting.
I verified this by setting the type to "application/javascript" for ".js" files in "send_file". Here's the code:
if request.format.js?
  send_file(assetfilename, type: 'application/javascript')
else
  send_file(assetfilename)
end
I can skip the response check altogether by just adding the following line to the top of my controller class:
skip_after_action :verify_same_origin_request
The check on the response seems pretty weak to me and it's not clear how this really provides further protection against CSRF. But I'll post that in another question.
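To make the response check described above concrete, here is a simplified paraphrase of that condition in plain Ruby. It is not the Rails source: the method name flagged_as_cross_origin_javascript? and its keyword arguments are invented for this illustration, and it models only the condition as stated in this answer (non-XHR request and a "text/javascript" response type), which Rails applies to requests flagged by the first check.

```ruby
# Simplified paraphrase of the condition described in this answer.
# NOT the Rails implementation; the method name and arguments are hypothetical.
def flagged_as_cross_origin_javascript?(xhr:, content_type:)
  # Flag the response when the request was not an XHR request
  # and the response MIME type is text/javascript.
  !xhr && content_type.to_s.start_with?("text/javascript")
end

# A plain <script> tag request answered with text/javascript is flagged.
puts flagged_as_cross_origin_javascript?(xhr: false, content_type: "text/javascript")        # true
# The same response to an XHR request is not.
puts flagged_as_cross_origin_javascript?(xhr: true, content_type: "text/javascript")         # false
# Sending the file as application/javascript also avoids the flag.
puts flagged_as_cross_origin_javascript?(xhr: false, content_type: "application/javascript") # false
```

Under this reading, changing the type passed to send_file sidesteps the check because the MIME type comparison no longer matches.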

Suppressing ActionView::MissingTemplate exception for Rails 3.x

Starting with Rails 3.0, from time to time, I've been receiving an exception notification like this:
ActionView::MissingTemplate: Missing template [...] with {:locale=>[:en],
:formats=>[:text], :handlers=>[:erb, :builder, :haml]}. Searched in: * [...]
For instance, an arbitrary hand-written URL like http://example.com/some/path/robots.txt raises the error. Not fun.
I reported the problem in this ticket quite a long time ago, and have been using the patch mentioned there, but the problem persists.
https://rails.lighthouseapp.com/projects/8994/tickets/6022-content-negotiation-fails-for-some-headers-regression
A fix is suggested in this blog post,
http://trevorturk.wordpress.com/2011/12/09/handling-actionviewmissingtemplate-exceptions/
To use this:
respond_to do |format|
  format.js
end
But it doesn't feel right to me, as I'm not interested in overloading an action with multiple formats. In my app, there are separate URLs for HTML and JSON API, so simple render should be sufficient.
Should I just swallow the exception by rescue_from ActionView::MissingTemplate and return 406 myself?
Is there a better way to handle this situation?
Or I can ask this way - in the first place, is there any real-world usefulness in raising this kind of exception on production?
If you've no need for formatted routes you can disable them with :format => false in your route specification, e.g.
get '/products' => 'products#index', :format => false
This will generate a RoutingError which gets converted to a 404 Not Found. Alternatively you can restrict it to a number of predefined formats:
get '/products' => 'products#index', :format => /(?:|html|json)/
If you want a formatted url but want it restricted to a single format then you can do this:
get '/products.json' => 'products#index', :format => false, :defaults => { :format => 'json' }
There are a number of valid reasons to raise this error in production - a missing file from a deploy for example or perhaps you'd want notification of someone trying to hack your application's urls.
What worked best for me is this, in application_controller.rb:
rescue_from ActionView::MissingTemplate, with: :not_found
After some source diving I found another way. Put this in an initializer.
ActionDispatch::ExceptionWrapper.rescue_responses.merge! 'ActionView::MissingTemplate' => :not_found
If you have a resource that will only ever be served in one format and you want to ignore any Accept header and simply force it to always output the default format you can remove the format from the template filename. So for instance, if you have:
app/views/some/path/robots.txt.erb
You can change it to simply
app/views/some/path/robots.erb
Some schools of thought would say this is a bad thing since you are returning data in a different format from what was requested, however in practice there are a lot of misbehaving user agents, not every site carefully filters content type requests, and consistently returning the same thing is predictable behavior, so I think this is a reasonable way to go.
Try adding
render nothing: true
at the end of your method.
If there are specific paths that periodically get called that generate errors -- and they are the same set of urls that get called regularly (i.e., robots.txt or whatever) -- the best thing to do if you can is to eliminate them from hitting your rails server to begin with.
How to do this depends on your server stack. One way to do it is to block these requests directly in Rack, before the URL is passed into Rails.
Another way may be to block it in NGINX or Unicorn, depending on which web listener you're using for your app.
I'd recommend looking into this and then coming back and posting an additional question here on "How to block URLs using Rack?" (or Unicorn or NGINX, or wherever you think it makes sense to block access).
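As a starting point for the Rack approach mentioned above, here is a minimal sketch, under the assumption that the noisy URLs can be described by a few regular expressions. The class name BlockNoisyPaths, the pattern list, and the 404 response are choices made for this example, not an established recipe.

```ruby
# Hypothetical sketch: answer known-noisy paths before they reach Rails.
class BlockNoisyPaths
  def initialize(app, patterns)
    @app = app
    @patterns = patterns
  end

  def call(env)
    path = env["PATH_INFO"].to_s
    if @patterns.any? { |pattern| pattern.match?(path) }
      # Short-circuit with a plain 404 instead of invoking the application.
      return [404, { "Content-Type" => "text/plain" }, ["Not Found"]]
    end
    @app.call(env)
  end
end

# Stand-in for the Rails application.
app = ->(_env) { [200, { "Content-Type" => "text/plain" }, ["OK"]] }

# Block robots.txt requests nested under some other path, but not /robots.txt itself.
blocker = BlockNoisyPaths.new(app, [%r{\A/.+/robots\.txt\z}])

puts blocker.call("PATH_INFO" => "/some/path/robots.txt").first # 404
puts blocker.call("PATH_INFO" => "/robots.txt").first           # 200
```

Because the middleware returns before calling the application, no template lookup happens for the blocked paths, so no MissingTemplate notification is generated for them.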

ActionView::MissingTemplate Error, Only When Visited By A Bot?

I have an action that serves my homepage. It works fine when visited normally (ie by a user in a web browser), but when visited by specific web crawlers, it throws the following error:
A ActionView::MissingTemplate occurred in tags#promoted:
Missing template tags/promoted with {:handlers=>[:erb, :rjs, :builder, :rhtml, :rxml], :formats=>["text/*"], :locale=>[:en, :en]} in view paths "/Apps/accounts/app/views", "/usr/local/rvm/gems/ruby-1.9.2-p180@accounts/gems/devise-1.3.0/app/views"
actionpack (3.0.4) lib/action_view/paths.rb:15:in `find'
It appears the bots are trying to fetch the text/* format, which there is no template for, which makes sense, so I tried to do the following in my action:
def promoted
  request.format = :html # force html to avoid causing missing template errors
  # more action stuff....
end
In essence, I am trying to force the request's format to html so it serves the html template.
Yet every time this set of bots requests this page, the missing template error occurs.
It's not that big of deal, but ideally I'd like to resolve this error, if only so I stop getting these error emails from my app.
Is the only way to make a file called my_action.text.erb and put some gibberish in it? Or can I solve this more elegantly?
I've been seeing these as well. You could use some middleware to rewrite these requests:
class Bot
  def initialize(app)
    @app = app
  end

  def call(env)
    # Rewrite the wildcard text/* Accept header to plain text/html
    h = env["HTTP_ACCEPT"]
    env["HTTP_ACCEPT"] = "text/html" if h == "text/*"
    @app.call(env)
  end
end
I forked a gem for killing off some MS Office Discovery Requests, and it seemed to make sense to add this middleware into it.
https://github.com/jwigal/rack-options-request
It turns out this specific set of bots are as dumb as a rock, and ignore any sort of request formatting as I was trying to do. I ended up disallowing these bots' User-Agents in my robots.txt. No more errors. However, if somebody has a more elegant solution, please post it and I'll mark it as the accepted answer, otherwise, I'll accept this one in a couple of days.

Why am I suddenly getting Missing Template errors with edge Rails (2.3)?

After freezing edge rails, all my controller examples are failing with
MissingTemplate errors.
e.g., "Missing template attachments/create.erb in view path app/views"
Trying to actually render the views gives me the same error.
I noticed I can fix most of them by using respond_to but I usually
never use it. I almost always only need to respond to one format in
one action so I omit respond_to and let Rails figure out which file to
render.
Does Rails suddenly require respond_to blocks in every action as of 2.3?
Just found this, which answers my question:
http://rails.lighthouseapp.com/projects/8994/tickets/1590-xhrs-require-explicit-respond_to
