How to let the Facebook scraper into dynamic, authenticated pages - ruby-on-rails

I have a social network that requires authentication and email verification before a user can enter. Once inside, users can only see content from their friends. It's actually really simple, even if it doesn't sound like it. Here is my authenticate before filter:
def authenticate
  if logged_in?
    redirect_to authentication_url if current_user.account_disabled
  else
    redirect_to root_url
  end
end
The problem I have is letting the Facebook scraper in to get the meta tags from some of the dynamic pages. I read that you can allow Facebook's user agent into non-public pages, but isn't that for pages that are protected in the robots.txt file? I'm not experienced with scrapers, but surely it will need a cookie and an enabled account to scrape the dynamic information on my site? I'm not even sure how to write the method that lets the scraper in, or where to write it.
I've thought about generating a token with SecureRandom.urlsafe_base64 for the scraper and making an exception on a blank page (with the meta data) that shouldn't be accessible to regular users, but technically that wouldn't be safe: if you looked at the right JS file (for the URL reference in the Open Graph action POST) and the meta tags, you could get protected user data. This idea doesn't seem even close to correct...
Any ideas?

User agents are easily faked. Be careful about allowing access based on user agent alone.
I believe Facebook has a way to allow scraping via its API instead.

As long as your content has unique URLs for what each user sees (normally protected by a login filter), you can allow access by checking the source IP or user agent to match the Facebook scraper.
However, like most social sites, you are likely using the same URLs to return customized contents rendered for the currently logged in user. This is inherently unscrapable - because there is a different version of say '/profile' for each user.
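A minimal sketch of the user-agent check described above, in plain Ruby. The marker strings are the ones Facebook's crawler is known to send; the method names and the `public_meta_page` flag are hypothetical, and, as noted earlier in the thread, the header can be faked, so this should only ever unlock a page that exposes nothing beyond the meta tags.

```ruby
# Substrings that Facebook's crawler sends in its User-Agent header.
FACEBOOK_AGENT_MARKERS = ["facebookexternalhit", "Facebot"].freeze

# Returns true when the User-Agent string looks like the Facebook scraper.
# This is only a hint: any client can send the same header.
def facebook_scraper?(user_agent)
  return false if user_agent.nil?
  FACEBOOK_AGENT_MARKERS.any? { |marker| user_agent.include?(marker) }
end

# Hypothetical decision logic for a before filter.
# `logged_in` and `public_meta_page` stand in for the app's own checks.
def access_decision(user_agent:, logged_in:, public_meta_page:)
  return :allow if logged_in
  return :allow_meta_only if public_meta_page && facebook_scraper?(user_agent)
  :redirect_to_root
end
```

In a Rails controller the user agent would come from request.user_agent, and the :allow_meta_only branch would render a stripped-down page containing only the Open Graph tags.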

Related

How to track from which site user came from to my rails app?

I want to store the URL of the user's referring website after successful registration. In this case I can't use request.referer because the user may visit a few pages on my website before registering. But I need the previous website's URL, for example http://google.com or http://facebook.com/somepage_id or whatever. I know that Google Analytics or Intercom can collect this data, but I want something simple, preferably without external APIs or libraries if possible.
There is a Ruby gem called 'ahoy' which can be used for this.
When someone visits your website, Ahoy creates a visit with lots of useful information:
traffic source - referrer, referring domain, landing page, search keyword
location - country, region, and city
technology - browser, OS, and device type
utm parameters - source, medium, term, content, campaign
See the Ahoy project page for more information.
When a user reaches your site, create a UserSession that stores request.referer and has a user_id column. During registration, bind the UserSession to the newly created User by filling in user_id. This way you get the site the user came from, plus some additional information from the UserSession (user agent, date, visited pages, etc.).
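The idea of capturing the referrer on the first visit rather than at registration can be sketched in plain Ruby. This is an illustration only: the `session` argument is any Hash-like store, and the method name and the `:first_referrer` key are made up for the example.

```ruby
require "uri"

# Remember the first external referrer seen in this session.
# `session` is any Hash-like store; `own_host` is the app's own host name.
def remember_referrer(session, referer, own_host)
  return session if session[:first_referrer] # already captured earlier
  return session if referer.nil? || referer.empty?

  host = begin
    URI.parse(referer).host
  rescue URI::InvalidURIError
    nil
  end
  # Ignore internal navigation; keep only referrers from other sites.
  session[:first_referrer] = referer if host && host != own_host
  session
end
```

Called from a before filter with request.referer and request.host, this leaves the value in the session so the registration action can copy it onto the new user record.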

Getting past anti-CSRF to log a user into a site when you know their username and password

This sounds a bit evil, bear with me though. It's also not specifically a Rails question even though the two sites in question use Rails. (Apologies in advance for both these things)
Imagine two websites which both use Ruby on Rails:
mysite.com, on which I'm a developer and have full access in terms of changing code etc., and also have an admin login, so I can manage user accounts.
theirsite.com, on which I have an admin login but no dev access. I know the people who run it but I'd rather not ask them any favours for political reasons. That is an option, however.
Using my admin login on each site I've made a user account for the same person. When they're logged into mysite.com, I'd like to be able to provide a button which logs them straight into theirsite.com. I have their username and password for theirsite.com stored in their user record in the mysite.com database, to facilitate this. The button is the submit button for a form which duplicates the form on the theirsite.com login page, with hidden fields for their username and password.
The stumbling block is that theirsite.com handles CSRF with an authenticity_token variable, which fails validation when the login is submitted from mysite.com.
My first attempt to get past this was, in the mysite.com controller which loads the page with the form, to scrape the theirsite.com login page to get an authenticity token, and then plug that into my form. But this isn't working.
If I load the theirsite.com login page and the mysite.com page with the remote login button in two browser tabs, and manually copy the authenticity_token from the theirsite.com form to the mysite.com form, then it works. This is because (I think) the authenticity_token is linked to my session via a cookie, and when I do it all in the same browser the session matches up, but when I get the authenticity token from theirsite.com via scraping (using Nokogiri, but I could use curl instead) it's not the same session.
Question A) So, I think that I also need to set a cookie so that the session matches up between the browser and the Nokogiri request that I make. But this might be impossible, and exactly the sort of thing that the anti-CSRF system was designed to defeat. Is that the case?
Question B) Let's say that I decide that, despite the politics, I need to ask the owner of theirsite.com to make a small change to allow me to log our users into theirsite.com when we know their theirsite.com username and password. What would be the smallest, safest change that I could ask them to make to allow this?
Please feel free to say "Get off SO you evil blackhat", I think that's a valid response. The question is a bit dodgy.
A) No, this is not possible: CSRF protection exists precisely to prevent actions like this. So "Get off SO you evil blackhat".
As per the question, I'm assuming that theirsite.com is using Rails (v3 or v4).
B) The smallest change that you could ask them to make is a special action for you, so that you can pass user credentials from your back end and the user will be logged in from there on.
That action will work something like this:
You'll have a special code which is passed along with the credentials so that the request can be verified on their servers. That code can either be a static predefined code or be generated on a minute/hour/day basis with the same algorithm on both sites.
The function that you'd be asking them to write for you will look like this:
Rails v3 and v4:
This action will be POST only.
# I'm assuming 'protect_from_forgery' is already set up in theirsite.com
class ApplicationController < ActionController::Base
  protect_from_forgery
end

# The changes to be made are as follows
class SomeController < ApplicationController
  # This turns off CSRF protection for the listed actions only
  skip_before_filter :verify_authenticity_token, only: [:login_outside]

  def login_outside
    # special_code_valid? is a placeholder name: check the special code here
    if special_code_valid?
      # Their login logic here
    end
  end
end
Check this link for further information on skipping CSRF protection in Rails
Rails 4 RequestForgeryProtection
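One way the "generated on a minute/hour/day basis with the same algorithm on both sites" code could look is an HMAC over the current time window, keyed with a secret both sites share. This is a sketch under assumptions, not what the answer's author necessarily had in mind: the 60-second window, the method names and the shared secret are all hypothetical.

```ruby
require "openssl"

WINDOW_SECONDS = 60 # assumed validity window

# Derive the code for the time window containing `time`.
def special_code(shared_secret, time = Time.now)
  window = time.to_i / WINDOW_SECONDS
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), shared_secret, window.to_s)
end

# Compare two strings without exiting early on the first mismatch.
def secure_equal?(a, b)
  return false unless a.bytesize == b.bytesize
  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Accept the current window and the previous one to tolerate clock skew.
def valid_special_code?(shared_secret, code, now = Time.now)
  [now, now - WINDOW_SECONDS].any? do |t|
    secure_equal?(special_code(shared_secret, t), code.to_s)
  end
end
```

mysite.com would send special_code(secret) along with the credentials over HTTPS, and the login_outside action on theirsite.com would call valid_special_code?(secret, params[:code]) before running its login logic.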
This shouldn't be too hard to do.
You need to send an AJAX GET request to their signup page, copy the authenticity_token with JavaScript, and then send an AJAX POST to the actual login route that creates a session, with the right credentials and authenticity_token.
One tricky part is finding out their login route. Try /sessions/new, or perhaps they have the URL in the form, so look at the HTML there. Good luck!
The other tricky part is knowing how the parameters are sent. Check the form's HTML. If all the input tags have user_ before their names, then you'll need to structure your parameters similarly, i.e. user_email, user_password.
It's entirely possible to fetch the CSRF token and submit your own form (because a login page is accessible to anyone!). However, it'll be difficult to know the details of their arrangement. Guessing and checking isn't too bad an option (again, /sessions/new is how I route my login; you should also try your own route to see if they have a similar one).
If that doesn't work, try taking a look at their GitHub account! It's very possible they haven't paid $7 a month and the repository is open to the public. You will easily be able to view their routes and parameter parsing that way.
Good luck!
This is impossible. Anti-CSRF works like this: the site sends a cookie to the user and injects a token into the form as a hidden field; the form post is accepted only if the token matches the cookie. If you serve the form from your side, you can't set that cookie (a cookie can only be set by the domain it belongs to).
If there is just one particular action you want to perform on their site, you can get away with browser automation (i.e. you run a browser on your server side, script the action and execute it).
As for B), "safest and smallest change" is a contradiction :) The smallest change would be to create a handler for a POST request on their side, to which you send the username and password (this handler HAS TO run over HTTPS) and which creates the auth cookie on their side.
As for safest: the whole concept of storing encrypted (not hashed) passwords is questionable at best (would you like your site to be listed at http://plaintextoffenders.com/ ?). Also, if the user changes their password on their side, you're screwed. A secure solution would be to store just the third-party UserID on your side and send the UserID with a timestamp to their side, asymmetrically encrypted with your private key (in effect, signed). They'll decrypt it (they'll need your public key), check that the timestamp is not too old, and if it isn't, create an auth cookie for the given user ID. There are also protocols for this (like SAML).
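The UserID-plus-timestamp scheme can be sketched with Ruby's OpenSSL bindings. The sketch uses an RSA signature, which is the usual way to realise "encrypt with your private key"; the token layout, the five-minute limit and the method names are assumptions made for illustration, and it assumes user IDs never contain the "|" separator.

```ruby
require "openssl"

MAX_AGE_SECONDS = 300 # assumed freshness limit

# mysite.com side: sign "user_id|timestamp" with the private key.
def issue_login_token(private_key, user_id, now = Time.now)
  payload = "#{user_id}|#{now.to_i}"
  signature = private_key.sign(OpenSSL::Digest.new("SHA256"), payload)
  "#{payload}|#{signature.unpack1('H*')}" # hex-encode the binary signature
end

# theirsite.com side: check signature and timestamp; return the user id or nil.
def verify_login_token(public_key, token, now = Time.now)
  user_id, timestamp, hex_signature = token.split("|", 3)
  return nil unless user_id && timestamp && hex_signature
  payload = "#{user_id}|#{timestamp}"
  signature = [hex_signature].pack("H*")
  return nil unless public_key.verify(OpenSSL::Digest.new("SHA256"), signature, payload)
  return nil if (now.to_i - timestamp.to_i).abs > MAX_AGE_SECONDS
  user_id
end
```

With this arrangement mysite.com never stores the theirsite.com password, and theirsite.com only needs the public key to decide whether to create the auth cookie. A production version would also want to reject replayed tokens within the freshness window.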
A)
What you are trying to do is really a form of a CSRF attack.
The idea behind a cross-site request forgery attack is that an attacker tricks a browser into performing an action on some site as the user who is currently logged in to that site. The user is usually identified by a session identifier stored in a cookie, and cookies are sent along automatically. This means that without protection, an attacker would be able to perform actions on the target site.
To prevent CSRF, a site typically includes an anti-CSRF token in pages, which is tied to the session and is sent along in requests made from the legitimate site.
This works because the token is unpredictable, and the attacker cannot read the token value from the legitimate site's pages.
I could list various ways in which CSRF protection may be bypassed, but these all depend on an incorrect implementation of the anti-CSRF mechanism. If you manage to do so, you have found a security vulnerability in theirsite.com.
For more background information about CSRF, see https://www.owasp.org/index.php/Cross-Site_Request_Forgery_(CSRF).
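To make the "token tied to the session" point concrete, here is a deliberately simplified model in plain Ruby. It is not Rails' actual implementation, and the method names are invented; it only shows why a token scraped in one session is rejected when posted from another.

```ruby
require "securerandom"

# Issue a token for this session and remember it server-side.
def issue_csrf_token(session)
  session[:csrf_token] ||= SecureRandom.urlsafe_base64(32)
end

# A form post is accepted only if it carries the token stored in the same session.
def csrf_valid?(session, submitted_token)
  expected = session[:csrf_token]
  return false if expected.nil? || submitted_token.nil?
  return false unless expected.bytesize == submitted_token.bytesize
  diff = 0
  expected.bytes.zip(submitted_token.bytes) { |a, b| diff |= a ^ b }
  diff.zero?
end
```

In the question's scenario the server-side scrape and the user's browser hold two different sessions, so the token fetched by the scraper fails csrf_valid? against the browser's session, while a token copied by hand in the same browser passes.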
B)
The smallest change which theirsite.com could do is to disable the CSRF protection check for the login page.
CSRF protection depends on the unpredictability of requests, and for login pages, the secret password itself protects against CSRF. An extra check through an anti-CSRF token is unnecessary.

Manually Supply Referral URL to Spring Security

We have some shopping cart pages which work with both guest and user paths. We want to allow a user to log in at any time during the process but don't really want to create yet another login page. I'd prefer that we simply redirect the user to the existing login page and tell Spring Security what URL to come back to.
I know this happens automatically when sessions time out and/or protected pages are requested without a session, but is there a way I can give the URL to Spring Security myself?
If you just need a simple return-to URL to retrieve the cart, then you are probably best to implement that yourself in an AuthenticationSuccessHandler. You can look at the source for SimpleUrlAuthenticationSuccessHandler and its parent for inspiration.
The default login mechanism uses the RequestCache and a SavedRequest, but that is intended to actually replay a request which would not otherwise be authorised. That's probably overkill in your case.

Specify Cookie Domain in Authlogic When Session Is Created

Is it possible to set the cookie domain to something other than the current domain when a session is created with Authlogic?
When a new account is created from our signup domain, I'd like to redirect the user to their subdomain account and log the user in.
Current controller:
def create
  @account = Account.new(params[:account])
  if @account.save
    @user_session = @account.user_sessions.create(@account.users.first)
    # I'd like the cookie domain to be [@account.subdomain, APP_CONFIG[:domain]].join(".")
    redirect_to admin_root_url(:host => [@account.subdomain, APP_CONFIG[:domain]].join("."))
  else
    render 'new'
  end
end
If you do:
config.action_controller.session[:domain] = '.YOURDOMAIN.COM'
in your production.rb file, that will keep everyone logged in on all subdomains of your domain. If you then add a filter (or whatever, but I use a filter so I know that works) that checks that someone is actually using the right domain before you show controller stuff, it works pretty well.
As an example, you could store the appropriate subdomain for the session as a session variable and give people link options to their specific things if they were on your main domain or looking at a page on someone else's subdomain.
This seems to be the general pattern for doing this sort of thing; otherwise, if you set a cookie specific to the subdomain, you won't be able to tell when they've logged in to the main site. I also have a 'users_domain?' helper that ends up getting called occasionally in views when I do this.
If you don't want to have those sorts of common web design patterns, wesgarrion's single use -> session creation on subdomain is also a way to go. I just thought I'd mention this as a design / interaction / code issue.
If you want to log them in on the subdomain, you can use Authlogic's single use token.
Check out the Params module for an example on logging in with the single use token.
Naturally, your action will log them in and create their session (on the subdomain) so they don't have to re-authenticate for the next request.
There are options to set the domain for the cookie in process_cgi() and session(), but I don't see a way to set those per-request in Authlogic. The authlogic mailing list is pretty responsive, though, and this seems like a pretty standard use-case that someone there would have tried and figured out. And uh, I saw your note on the google group, so never mind that.
If you have an application with multiple subdomains and don't want session cookies to be shared among them, or worse - have a top-level .domain session cookie with the same session_key floating around alongside your subdomain session cookie (Rails will keep one and toss the other - I believe simply based on the order in the request header) - you can use the dispatcher hooks to force the session cookie to subdomains.
Include the hook in ActionController from an extension:
base.send :after_dispatch, :force_session_cookies_to_subdomains
Set the domain in your after_dispatch hook:
def force_session_cookies_to_subdomains
  @env['rack.session.options'] = @env['rack.session.options'].merge(:domain => 'my_sub_domain')
end
For us, we look at @env['HTTP_HOST'] to determine what 'my_sub_domain' should be.
With this approach, the user's login must occur at the subdomain for the browser to accept the subdomain'ed cookie (unless using a pattern like the Authlogic Params to propagate to the next request against the subdomain).
Note: The browser will reject the subdomain'ed cookie when the request comes from the higher level domain. For us, this isn't a bad thing - it results in the same outcome that we require, that a top level session cookie doesn't get created and later sent to subdomains.
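How the subdomain might be derived from HTTP_HOST can be sketched in plain Ruby. The helper names and the `base_domain` argument are hypothetical, and the options hash only mimics the shape of 'rack.session.options'; the answer does not show how its own lookup was written.

```ruby
# Build the cookie domain for the subdomain a request arrived on.
# Returns nil when the host is the bare base domain or unrelated to it.
def session_cookie_domain(http_host, base_domain)
  host = http_host.to_s.split(":").first.to_s.downcase # drop any port
  suffix = ".#{base_domain.downcase}"
  return nil unless host.end_with?(suffix) && host.length > suffix.length
  host
end

# Return a copy of the session options with :domain set for subdomain requests.
def with_subdomain_cookie(options, http_host, base_domain)
  domain = session_cookie_domain(http_host, base_domain)
  domain ? options.merge(:domain => domain) : options
end
```

Inside the hook, the result of with_subdomain_cookie would be assigned back to the session options; requests on the top-level domain leave the options untouched.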
Another approach to a similar end might be to force a cookie to not be set when not from a subdomain. Not spending much time on it, the way I was able to accomplish this was -
request.env["rack.session"] = ActionController::Session::AbstractStore::SessionHash.new(self, request.env)
in an after filter in ApplicationController.

How to create password protected RSS feed in rails

Creating RSS feed in rails is easy. I need a simple way to password protect the RSS feed. I am thinking http basic authentication.
I googled but could not find any article about creating password protected RSS.
I have this in my ApplicationController
def xml_authorize
  if request.format == Mime::XML
    authenticate_or_request_with_http_basic do |username, password|
      username == 'foo' && password == 'bar'
    end
  end
end
Then I just apply a before_filter :xml_authorize to the actions that I want to password protect for XML requests only, but still want to serve normally with html.
Here's where I got the inspiration from.
Just use whatever auth system you use on your regular controllers. If the user is logged in and the session is alive, they will be able to download the feed.
How is HTTP authentication any different on RSS feeds than other controller actions? (This is not necessarily a rhetorical question - is it actually different?)
Have you tried just using the Rails HTTP Basic Authentication methods?
Feeds are meant to be fetched at regular intervals without user interaction.
RSS feeds can be consumed by something other than a browser. For example, I have a phone where I can create a widget for the start screen from an RSS feed link. Great function. Unfortunately, authentication does not work for this widget, and there is no user interface for entering a username and password. So the authentication needs to be part of the URL, with all the bad security implications...
On the other hand, you don't need a session to answer a feed request.
So the solution is to create a special key for the user and store it in the user table. Then use it when you display the link to the RSS feed. In the feed method, you use this key to retrieve the user, but you don't create a session. This way, even if an attacker somehow gets hold of a valid key, they can only view the RSS feed and cannot access the rest of your application.
If you already use a library for authentication, there may already be a solution implemented for this. In Authlogic, it is the class SingleAccessToken, and you need to add a column 'single_access_token' of type string to your user table. Authlogic then creates a cryptic key when you save the user record. You then add this key as the GET parameter 'user_credentials' to the URL of the private RSS feed.
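The per-user feed key can be sketched without any authentication library. This is an illustration in plain Ruby, not Authlogic's implementation: the `User` struct, the method names and the example.com URL are made up, and only the 'user_credentials' parameter name is taken from the answer above.

```ruby
require "securerandom"

# Stand-in for a user record with a column holding the feed key.
User = Struct.new(:name, :feed_token)

# Give the user a random key to embed in the feed URL.
def assign_feed_token(user)
  user.feed_token = SecureRandom.urlsafe_base64(24)
  user
end

# Build the private feed URL shown to the logged-in user.
def feed_url(user, base = "https://example.com/feed.rss")
  "#{base}?user_credentials=#{user.feed_token}"
end

# Look the user up by key for this one request; no session is created.
def user_for_feed(users, token)
  return nil if token.nil? || token.empty?
  users.find { |u| u.feed_token == token }
end
```

Because the lookup never writes to the session, a leaked key exposes the feed only; regenerating the token with assign_feed_token revokes the old URL.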
Like knoopx said, if you use an authentication system like Authlogic you should be able to specify the authentication type in the controller. You can then specify HTTP basic authentication. Then you can, if you choose, include the credentials in the URL for the RSS feed, e.g.:
http://username:password@example.com/rss
