What's the best way to keep users from sharing session cookies in Rails?
I think I have a good way to do it, but I'd like to run it by the Stack Overflow crowd to see if there's a simpler way first.
Basically I'd like to detect if someone tries to share a paid membership with others. Users are already screened at the point of login for logging in from too many different subnets, but some have tried to work around this by sharing session cookies. What's the best way to do this without tying sessions to IPs (lots of legitimate people use rotating proxies)?
The best heuristic I've found is the number of Class B subnets per unit of time (some ISPs use rotating proxies on different Class Cs). This has generated the fewest false positives for us, so I'd like to stick with this method.
Right now I'm thinking of applying a before filter to each request that keeps track, in memcached, of which subnets and session_ids a user has used, and applies the heuristic to that data to determine if the cookie is being shared.
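Roughly what I have in mind (a sketch of just the subnet-tracking part; Rails.cache is assumed to be memcached-backed, and MAX_SUBNETS, SUBNET_WINDOW and flag_possible_sharing are placeholder names/thresholds):

    class ApplicationController < ActionController::Base
      MAX_SUBNETS   = 3       # placeholder threshold, tuned to our data
      SUBNET_WINDOW = 1.hour  # placeholder sliding window

      before_action :track_session_subnets

      private

      def track_session_subnets
        return unless current_user

        class_b = request.remote_ip.split('.').first(2).join('.')  # e.g. "203.0"
        key     = "subnets:#{current_user.id}"

        subnets = Rails.cache.read(key) || {}
        subnets[class_b] = Time.current
        subnets.delete_if { |_net, seen_at| seen_at < SUBNET_WINDOW.ago }  # drop stale entries
        Rails.cache.write(key, subnets, expires_in: SUBNET_WINDOW)

        flag_possible_sharing(current_user) if subnets.size > MAX_SUBNETS
      end
    end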
Any simpler / easier to implement ideas? Any existing plugins that do this?
You could tie the session information to browser information. If people are coming in from 3 or 4 different browser types within a certain time period, you can infer that something suspicious may be going on.
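A rough sketch of that check (the fingerprint here is just the user agent; you could mix in Accept-Language or other headers, and the redirect target and handling are placeholders):

    class ApplicationController < ActionController::Base
      before_action :check_browser_fingerprint

      private

      def check_browser_fingerprint
        fingerprint = Digest::SHA256.hexdigest(request.user_agent.to_s)
        session[:browser_fingerprint] ||= fingerprint
        return if session[:browser_fingerprint] == fingerprint

        # Placeholder handling: log it, flag the account, or force a fresh login.
        reset_session
        redirect_to root_path, alert: 'Please sign in again.'
      end
    end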
An alternative answer relies on a bit of social engineering. If you have a heuristic you trust, you can warn users (at the top of the page) that you suspect they are sharing their account and that they are being watched closely. A "contact us" link in the warning would allow legitimate users to explain themselves (and thus be permanently de-flagged). This may minimize the problem enough to take it off your radar.
One way I can think of would be to set the same random value in both the session and a cookie with every page refresh. Check the two to make sure they are the same. If someone shares their session, the cookie and session will get out of sync.
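A rough sketch of that idea (flag_possible_sharing is a placeholder; note that concurrent Ajax requests can race against each other and cause false positives, so you'd probably flag rather than block):

    class ApplicationController < ActionController::Base
      before_action :verify_sync_token

      private

      def verify_sync_token
        if session[:sync_token] && session[:sync_token] != cookies.signed[:sync_token]
          # The session value and the companion cookie have drifted apart --
          # likely two people replaying the same session cookie.
          flag_possible_sharing(current_user)  # placeholder
        end

        token = SecureRandom.hex(16)
        session[:sync_token] = token
        cookies.signed[:sync_token] = { value: token, expires: 1.day.from_now }
      end
    end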
I am looking to set up two Rails apps (with the same TLD) which have single sign-on and share some user data. If I have railsapp.com, I will have the second app set up as otherapp.railsapp.com or railsapp.com/otherapp. I will most likely have railsapp.com handle registration/login etc. (open to suggestions if this is not the best solution).
So let's say I sign up, upload an avatar, and start accumulating user points on the main app; I can then browse to the other app and my profile there has the correct avatar and points total. Is there an easy way to achieve this? Do the available SSO solutions create the user in the second app with the same user ID? If not, how are they tied together? (i.e., how can I query the other app for the information I would like to be shared across the two: user points and avatar). I was initially looking at sharing a database, or at least the users table, between the two apps, but I can't help thinking there must be an easier solution.
I think the simplest solution is to set the cookie on the .railsapp.com domain; it will then be sent with requests to otherapp.railsapp.com or any other subdomain (just stressing that, as it might be a security concern). Remember to mark the cookie as secure!
One extra bit you might need to make this work is to store authentication tokens in a database so they can be shared between the two apps.
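Assuming the default cookie session store, the shared-domain part looks roughly like this (names are illustrative):

    # config/initializers/session_store.rb (sketch)
    Rails.application.config.session_store :cookie_store,
      key:    '_railsapp_session',
      domain: '.railsapp.com',          # shared with otherapp.railsapp.com and any other subdomain
      secure: Rails.env.production?     # mark the cookie as secure, as noted above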
Disclaimer: I don't have much experience with Rails anymore, so I'm not sure if some of the frameworks like Devise can do something like this out of the box.
Edit
Got curious and ... Google had the answer: http://codetheory.in/rails-devise-omniauth-sso/
I'm writing an app that makes some calls to my API that have restrictions. If users were to figure out what these URL routes were, and the proper parameters and how to specify them, then they could exploit it, right?
For example, if I'm casting a vote on something and I only want users to be able to cast one vote, a user who knows the route:
get '/castvote/' => 'votemanager#castvote'
could be problematic, could it not? Is it easy to figure out these API routes?
Does anyone know any ways to remove the possibility of this happening?
There is no way to hide AJAX calls - if nothing else, one just needs to open the Developer Tools Network panel and see what was sent. Everything on the client side is an open book, if you just know how to read it.
Instead, do validation on the server side: in your example, record the votes and the users that cast them; if a vote was already recorded by that user, don't let them vote again.
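For instance, a sketch against the route from the question (model and column names are made up, and the route should really be a POST rather than a GET):

    # Record each vote and reject duplicates per user.
    class Vote < ApplicationRecord
      belongs_to :user
      validates :user_id, uniqueness: { scope: :item_id,
                                        message: 'has already voted on this item' }
    end

    class VotemanagerController < ApplicationController
      def castvote
        vote = Vote.new(user: current_user, item_id: params[:item_id])
        if vote.save
          head :ok
        else
          head :unprocessable_entity   # duplicate or otherwise invalid vote
        end
      end
    end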
Your API should have authorization built into it. Only authorized users with specific access scopes should be allowed to consume your API. Check out the Doorkeeper and CanCanCan gems provided by the Rails community.
As others have said, adding access token/username/password authorisation is a good place to start. Also, if your application should only allow one vote per user, then this should be validated by your application logic on the server.
This is a broader problem. There's no way to stop users from figuring out how voting works and trying to game it, but there are different techniques used to make it harder. I list some solutions from least to most effective here:
Using a nonce or proof of work; in the case of Rails this is implemented through the authenticity token for non-GET requests. This requires the user to at least load the page before voting, therefore limiting scripted replay attacks (see the sketch after this list).
Recording the IP address or other identifiable information (e.g. browser fingerprinting). This limits the number of votes from a single device.
Requiring signup. This is what other answers suggest.
Requiring third-party login (e.g. Facebook, Twitter).
Requiring payment to cast a vote (as in TV talent shows).
None of those methods is perfect and you can quickly come up with ways to trick any of them.
The real question is what your threat model is and how hard you want to make it for users to cast fake votes. In my practical experience, requiring third-party login will ensure most votes are valid in typical use cases.
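To illustrate the first item: in Rails the nonce is essentially free once the vote route is non-GET and CSRF protection is enabled (a sketch, reusing the route names from the question):

    # config/routes.rb -- use POST so the authenticity token check applies
    post '/castvote' => 'votemanager#castvote'

    # app/controllers/application_controller.rb
    class ApplicationController < ActionController::Base
      # Rejects non-GET requests without a valid authenticity token,
      # which forces the voter to have loaded a real page first.
      protect_from_forgery with: :exception
    end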
So I have designed this voting thing which does not let somebody vote for the same article twice in 24 hours. However, once a person votes and sees that the vote was cast, or that they are falling within that 24-hour window, I disable the vote-casting button (and this is all Ajax, by the way).
But what do I do when a person closes their browser and comes back, or even refreshes the page? Obviously, they would not be able to cast a vote, because of my algorithm, but they would still succeed in making a call to the server. So if they really wanted to, they could keep refreshing the page and clicking on the vote and put unnecessary load on the server. How do I avoid that with some sort of client-side check?
I am using ASP.NET MVC, so session variables are out of the question.
Am I being over-concerned by this?
If voting happens only from logged in (known) members then you shouldn't have any problem.
If, on the other hand, everyone can vote then you need to store all user vote events:
timestamp
poll
poll_vote
ip
user agent
user uniqueness cookie
So you'll need a random hash sent out as a cookie. This will ensure that you don't accept another vote for the same poll from the same person.
If the user deletes their cookies, you fall back to plan B, where you don't allow more than (say) 10 votes from the same IP and user agent combination for 24 hours.
The system is not perfect since users can change IPs and (more easily) user agents. You'd need advanced pattern detection algorithms to detect suspicious votes. The good thing about storing all user vote events is that you can process these later on using a scheduler, or outsource the votes to someone else who can process them for you.
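A sketch of that scheme in Rails terms (the question is ASP.NET MVC, but the structure is the same on any stack; model names, column names and the limit of 10 are illustrative):

    class PollVote < ApplicationRecord
      # columns: poll_id, choice, ip, user_agent, voter_token, created_at
    end

    class PollVotesController < ApplicationController
      IP_UA_LIMIT = 10   # plan B: max votes per IP + user agent per 24 hours

      def create
        # Random uniqueness cookie, issued on first vote.
        token = cookies[:voter_token] ||= SecureRandom.hex(16)

        same_person = PollVote.where(poll_id: params[:poll_id], voter_token: token).exists?
        same_ip_ua  = PollVote.where(poll_id: params[:poll_id],
                                     ip: request.remote_ip,
                                     user_agent: request.user_agent)
                              .where('created_at > ?', 24.hours.ago)
                              .count >= IP_UA_LIMIT

        if same_person || same_ip_ua
          head :forbidden
        else
          PollVote.create!(poll_id: params[:poll_id], choice: params[:choice],
                           ip: request.remote_ip, user_agent: request.user_agent,
                           voter_token: token)
          head :ok
        end
      end
    end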
Good luck
Refreshing is not a problem
If you're doing all this voting using Ajax, refreshing a page won't do anything except load the page using GET.
If you're not using Ajax you should make sure you return a RedirectToAction/RedirectToRoute action result, which would also help you avoid refresh problems.
How do you recognise users
If you use some sort of user authentication, this re-voting is not a problem. But if your users are plain anonymous, you should store the IP address with your votes. This is how things are usually done, and it makes it possible to avoid session variables as well. Be aware, though, that this technique is not 100% perfect.
Cookies?
You could of course also use absolute expiration cookies. They'd expire in a day. Advanced users would of course be able to get around your voting restrictions, but they would be able to get around the other approaches as well. Sessions, by the way, are also based on cookies anyway.
Combination
But if you'd like to make your system as robust as possible, you'll probably use a combination of the above.
The best way would be to track who voted for what and when on the server (probably storing it in a database). In order to do this you must use an authentication system on your site (probably forms authentication) to identify users. Then every time someone tries to vote, you first check in your data storage whether they have already voted and when, and decide whether to accept the vote or not. This is the most reliable way.
If your site is anonymous (no authentication required to vote) then you could store a persistent cookie on the client computer that lasts for 24 hours and indicates that a vote has already been cast from this computer. Remember, though, that cookies might be disabled or removed, and are not a reliable way to identify a given user.
I am using ASP.NET MVC, so session variables are out of the question.
Any reason for that? Sessions are perfectly fine in ASP.NET MVC applications. It is in your case that they won't work because if the user closes the browser he will lose the session.
Obviously, they would not be able to cast a vote, because of my algorithm, but they would still succeed in making a call to the server. So if they really wanted to, they could keep refreshing the page and clicking on the vote and put unnecessary load on the server.
Automated bots could also put unnecessary load on your server, which is a much bigger concern than a single user pressing F5.
If you just want to ensure the user can only vote once on an article, then you just need to store a set (e.g. a HashSet) of all the article IDs that they've already voted on, and check it before allowing the vote.
If you still want a 24-hour limit, then you need to store a Dictionary&lt;articleId, DateTime&gt;; then you can check whether they have already voted for that article and, if so, when.
I've been working on a web app that could be prone to user abuse, especially spam comments/accounts. I know that reCAPTCHA will take care of bots as far as fake users are concerned, but it won't do anything for those users who create an account and somehow put their spam comments on autopilot (like I've seen on Twitter countless times).
The solution that I've thought up is to enable any user to flag another user and then have a list of flagged users (a boolean attribute) come up on a users index action only accessible by the admin. Then the users that have been flagged can become candidates for banning (another boolean attribute) or unflagging. Banned users will still be able to access the site but will have greatly reduced privileges. For certain reasons, I don't want to delete users entirely.
However, when I thought of it, I realized that going through a list of flagged users to decide which ones should be banned or unflagged could be potentially very time consuming for an admin. Short of hiring someone to do the unflagging/banning of users, is there a more automated and elegant way to go about this?
I would create a table named abuses, containing both the reported user and the one that filed the report. Instead of the flagged boolean field, I suggest having a counter cache column such as "abuse_count". When this column reaches a predefined value, you could automatically "ban" the user.
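A sketch of that (the threshold and column names are illustrative; abuse_count lives on the users table and banned is the boolean you already have):

    class Abuse < ApplicationRecord
      BAN_THRESHOLD = 5   # illustrative value

      belongs_to :reporter,      class_name: 'User'
      belongs_to :reported_user, class_name: 'User', counter_cache: :abuse_count

      # Don't let the same reporter pile up duplicate reports on one user.
      validates :reporter_id, uniqueness: { scope: :reported_user_id }

      after_create :auto_ban

      private

      def auto_ban
        # reload picks up the counter-cache increment made on create
        reported_user.update(banned: true) if reported_user.reload.abuse_count >= BAN_THRESHOLD
      end
    end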
Before "Web 2.0", web sites were moderated by administrators. Now, the goal is to get communities to moderate themselves. StackOverflow itself is a fantastic case study. The reputation system enables users to take on more "administrative" tasks as they prove themselves trustworthy. If you're allowing users to flag each other, you're already on this path. As for the details of the system (who can flag, unflag, and ban), I'd say you should look at various successful online communities (like StackOverflow) to see how they work, and how successful they are. In the end it will probably take some trial and error, since all communities differ.
If you want to write some code, you might create a script that looks for usage patterns typical of spammers (e.g. the same comment posted on multiple pages), though I think the goal should be to grow a community that does this for you. This may be more about planning than programming.
Some sophisticated spammers are happy to spend their time breaking your CAPTCHA if they feel the reward is high enough. You should also consider a spam-detection service such as Akismet, for which there's a great Rails plugin (https://github.com/joshfrench/rakismet).
There are other alternatives such as Defensio (https://github.com/thewebfellas/defensio-ruby), as well as a gem I found once which worked pretty well at detecting common blog spam, but I can't for the life of me find it any more.
I don't know much about SEO or how web spiders work, so forgive my ignorance here. I'm creating a site (using ASP.NET MVC) which has areas that display information retrieved from the database. The data is unique to the user, so there's no real server-side output caching going on. However, since the data can contain things the user may not wish to have displayed in search engine results, I'd like to prevent any spiders from accessing the search results page. Are there any special actions I should take to ensure that the search result directory isn't crawled? Also, would a spider even crawl a page that's dynamically generated, and would any actions preventing certain directories from being searched mess up my search engine rankings?
Edit: I should add that I'm reading up on the robots.txt protocol, but it relies on cooperation from the web crawler. However, I'd also like to prevent any data-mining users who will ignore the robots.txt file.
I appreciate any help!
You can prevent some malicious clients from hitting your server too heavily by implementing throttling on the server. "Sorry, your IP has made too many requests to this server in the past few minutes. Please try again later." In practice, though, assume that you can't stop a truly malicious user from bypassing any throttling mechanisms that you put in place.
Given that, here's the more important question:
Are you comfortable with the information that you're making available for all the world to see? Are your users comfortable with this?
If the answer to those questions is no, then you should be ensuring that only authorized users are able to see the sensitive information. If the information isn't particularly sensitive but you don't want clients crawling it, throttling is probably a good alternative. Is it even likely that you're going to be crawled anyway? If not, robots.txt should be just fine.
It seems like you have two issues.
First, a concern about certain data appearing in search results; second, malicious or unscrupulous users harvesting user-related data.
The first issue will be covered by appropriate use of a robots.txt file as all the big search engines honour this.
The second issue seems more to do with data privacy. The first question which immediately springs to mind is: If there is user information which people may not want displayed, why are you making it available at all?
What is the privacy policy for such data?
Do users have the ability to control what information is made available?
If the information is potentially sensitive but important to the system could it be restricted so it is only available to logged in users?
Check out the Robots exclusion standard. It's a text file that you put on your site that tells a bot what it can and can't index. You will also want to address what happens if a bot doesn't honour the robots.txt file.
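For example, a robots.txt placed at the site root that asks well-behaved crawlers to skip a (hypothetical) /search path looks like this:

    User-agent: *
    Disallow: /search/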
Use a robots.txt file, as mentioned. If that is not enough, then you can:
Block unknown user agents - hard to maintain, and easy for a bot to forge a browser's user agent (although most legitimate bots won't)
Block unknown IP addresses - not useful for a public site
Require logins
Throttle user connections - tricky to tune, and you will still be disclosing information (see the sketch below)
Perhaps use a combination. Either way it is a trade-off: if the public can browse to it, so can a bot. Be sure you don't block and alienate people in your attempts to block bots.
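For the throttling option, a sketch in Rails terms with the rack-attack gem (the question is ASP.NET MVC, but the idea is stack-agnostic; the numbers are illustrative):

    # Gemfile: gem 'rack-attack'
    # config/initializers/rack_attack.rb
    class Rack::Attack
      # Allow at most 300 requests per IP in any 5-minute window.
      throttle('req/ip', limit: 300, period: 5.minutes) do |req|
        req.ip
      end
    end
    # Older Rails versions also need: config.middleware.use Rack::Attack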
A few options:
force the user to login to view the content
add a CAPTCHA page before the content
embed content in Flash
load dynamically with JavaScript