How can I count unique views - ruby-on-rails

I have a tricky problem to solve. I need to be able to track how many unique views a URL (that includes a promo code) gets: www.mysite.com/hello/PROMO
The issues are:
Since I pay for views based on promo code, I want to make sure people can't fake views.
It's difficult for me to count views from the same IP as fraudulent, because a lot of my users are on college campuses (i.e. behind the same NAT, often using one of a handful of IP addresses).
Any ideas on how I can have a robust view tracking system, given my concerns re IP repeats amongst my users?
FYI - I am using Ruby on Rails.
Edit - My site uses Google Analytics, so I'm wondering how accurate GA is at determining unique visitors to a specific URL (and whether the API is capable of giving me that result).
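One possible direction (a sketch, not anything stated in the question) is to count a promo view at most once per visitor, keyed on a long-lived signed cookie, and to store a coarse device hash alongside it for later fraud review, since IP alone is too coarse behind campus NAT. PromoView and its columns are hypothetical, and a unique index on [promo_code, visitor_token] does the actual dedup work:

class PromoController < ApplicationController
  def show
    # One stable token per browser via a signed, long-lived first-party cookie.
    token = cookies.permanent.signed[:visitor_token] ||= SecureRandom.uuid
    # Secondary signal for reviewing suspicious patterns, not for dedup itself.
    device_hash = Digest::SHA256.hexdigest("#{request.remote_ip}|#{request.user_agent}")
    PromoView.create!(promo_code: params[:promo_code], visitor_token: token, device_hash: device_hash)
  rescue ActiveRecord::RecordNotUnique
    # The unique index on [promo_code, visitor_token] already counted this visitor.
  end
end

This still won't stop someone determined to clear cookies in a loop, so paying per raw view stays risky; it just keeps honest repeat visits from inflating the count.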

Related

Detecting use of iOS's "Hide my Email" on website signup

Apple's latest changes, which allow users to hide their IP, hide their email, etc., are creating problems for my web-based app (non-native), which relies on these things to build a sense of who a person is.
In most situations, I can see why these are great "features" to have; however, in my use case I have a voting platform that uses things like email address and IP to do a decent job of detecting duplicate or fraudulent votes (e.g., logins from other countries).
Now, before anyone says "These aren't foolproof ways of identifying a person" and derail my actual question: I know. I'm not looking for perfection, but these methodologies shed light on the 95%+ of people who might be trying to circumvent our voting system.
Apple placing the ability to circumvent these measures by being right up in front of the user as a first-class feature shoots major holes in my existing strategy.
Is there a way to detect whether a user is utilizing these methods, so that I could prompt them to sign up without using these features?
I think it would be easily justifiable to explain that, due to the nature of the application being a voting website, the ability to create multiple aliases would directly undermine the purpose of the site.
Perhaps there is an email address pattern to look for (I know in my test cases, I was getting @icloud.com email addresses).
If there is no reasonable way, I need to rethink the entire process of identifying individuals and preventing aliases (phone / text confirmation, etc).
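A minimal sketch of the "email address pattern" idea, assuming a Ruby web app: privaterelay.appleid.com is the relay domain used by Sign in with Apple, and the plain @icloud.com entry reflects what the asker reports seeing; the helper name and list are hypothetical and should be tuned to what you actually observe.

# Hypothetical domain list; blocking all of icloud.com would also catch
# ordinary iCloud users, so treat a match as a signal rather than a hard block.
RELAY_EMAIL_DOMAINS = %w[privaterelay.appleid.com icloud.com].freeze

def likely_relay_email?(email)
  domain = email.to_s.split("@").last.to_s.downcase
  RELAY_EMAIL_DOMAINS.include?(domain)
end

# During signup, for example:
#   if likely_relay_email?(params[:email])
#     # explain why a non-relay address is required, or trigger extra verification
#   end

As the comment notes, a pattern check like this is probably better used to trigger extra verification (phone/text confirmation) than to reject signups outright.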

How can I get the participants of a Vanity experiment?

tl;dr
Is there any way to get something like Vanity.experiment(:landing).participants_for_option(:a) returning an array of users?
The long story
I'm using the gem Vanity with a Rails 4.2 application and it is working nicely, but I want to inspect further the behaviour of participants.
I tested which kind of page converted more users: a classical signup page versus a signup-with-order page. The classical signup page led to almost three times more signups, but I'm still in the dark: among the signup-only users, I don't know how many ordered a product.
It sounds like you're trying to understand more about how an experiment affects different parts of your funnel.
At the aggregate level, one way to do that may be to use multiple metrics for your experiment at different parts of your funnel, e.g. calling track! for both signups and purchases.
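For instance, a sketch of an experiment with two metrics, assuming the standard Vanity DSL (the experiment, alternative and metric names here are made up, and each metric also needs its own definition under experiments/metrics):

# experiments/landing.rb
ab_test "Landing" do
  description "Classic signup page vs. signup-with-order page"
  alternatives :signup_only, :signup_with_order
  metrics :signup, :purchase
end

# Then record each funnel step where it actually happens, e.g. in controllers:
#   track! :signup     # after a successful registration
#   track! :purchase   # after a completed order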
Unfortunately, Vanity isn't set up very well to query for individual participants per alternative, because testing itself is aggregate. If you want to access alternatives per user, there are methods for that, for example, Vanity.playground.adapter.ab_showing(experiment, identity), see the docs.
If you're interested in doing more in-depth analytical queries, it might be worth using the SQL adapter: its schema tracks data per participant, and you could join to other tables that hold data about purchases, etc.
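For example, a hedged sketch of such a query, assuming Vanity's SQL/ActiveRecord adapter with its default vanity_participants table (experiment_id, identity, shown, seen, converted) and assuming your Vanity identity is the user's id; the User and Order models are placeholders, so verify the column meanings against your own schema first:

# Identities of participants shown alternative 0 of the "landing" experiment.
identities = ActiveRecord::Base.connection.select_values(<<~SQL)
  SELECT identity
  FROM vanity_participants
  WHERE experiment_id = 'landing'
    AND shown = 0
SQL

signup_only_users = User.where(id: identities)
orders_from_group = Order.where(user_id: signup_only_users.select(:id))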
Edit:
It looks like this has changed in the most recent version of Vanity:
https://github.com/alobato/vanity/blob/master/lib/vanity/playground.rb#L231
Vanity.playground.connection.ab_assigned(experiment_name, identity)
Vanity.playground.connection.ab_showing(experiment_name, identity)

Expanding a website - providing different contents across different places

I am working on a website that currently targets users from a specific geographic region. Now I would like to expand its user base to another region. The need is to serve different content to different regions with the same base functionality.
My initial thought (I might sound like a noob here) is to host the region-specific content in different databases, redirect users to region-specific domains, and thus map users geographically. Please suggest whether this is the right way to proceed.
Also, I would like to know whether there is a need to localize my website for these regions (the current language is English).
Please post your experiences in such scenarios and also your ideas to bring about the transition.
Thanks in advance.
How do you see users being matched to their specific regional content?
Will they be presented with an option to choose?
Will you use geo functions to determine location?
Will you use server based reverse DNS lookup to determine location?
Will each region get its own "entry" URL (aka different domains)?
The first three are fraught with their own specific problems...
Presenting a choice/menu is considered bad form because it adds to the number of "clicks" necessary for a user to get to the content they actually came for.
While geo functions are very widely supported in modern browsers, they are still seen as a privacy issue, in that a large number of users will not "allow" the functionality, meaning you'll have to fall back to a choice/menu approach anyway.
Server-based reverse DNS, while a common practice, is very unreliable, because many users use VPNs, proxies, Tor, etc. specifically to mask their actual location from this kind of lookup.
Personally, my experience has been to use completely separate entry URLs, all hosted as virtual domains on a single web server. This gives you plenty of ways to determine which entry URL was used to reach your code and then format/customize the content appropriately.
There is really no need to set up separate servers and/or databases to handle these different domains/regions.
With that said, even if the language is common across regions, it is a very good habit to configure your servers and databases to support UTF-8 end-to-end, so that if any language-specific options need to be supported in the future, you won't need to change your code to do so. This is especially true if your site will capture any user-generated input.
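As a rough, Rails-flavored illustration of the single-server, multiple-entry-URL approach (the question doesn't name a framework, and the domain-to-region mapping and the Article model here are hypothetical), the entry host alone can drive which content is served:

class ApplicationController < ActionController::Base
  # Hypothetical mapping from virtual domain to region.
  REGION_BY_HOST = {
    "example.com"   => :us,
    "example.co.uk" => :uk
  }.freeze

  before_action :set_region

  private

  def set_region
    # request.host tells you which entry URL the visitor used.
    @region = REGION_BY_HOST.fetch(request.host, :us)
  end
end

# Content queries then just filter by the detected region, e.g.:
#   Article.where(region: @region)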

Persistent way to recognize web clients other than using ip address?

I am creating a site where anyone is able to upvote and downvote content.
For the launch, I wish to not force people to create accounts in order to do this. However, without accounts, what is a reliable way to ensure people don't vote on the same content more than once?
The methods that I've looked at are ip based tracking and cookie/session based tracking.
Both have problems.
I am targeting a college campus, so many users could potentially share the same IP (through their dorm or apartment), whereas cookies/sessions are easily defeated if the user deletes them or even uses a script to vote.
(Being a college campus, there's probably many tech savvy students who may do this)
As far as technology goes, are there more reliable ways to accomplish this?
You have very few options here. Cookies were invented for just this kind of thing, but as you know they can be deleted or altered by those who know how. If there were a reliable, easy way to do this, it would have a catchy name and be well documented all over the web.
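For what it's worth, a minimal sketch of the cookie-based option in Rails (imperfect by design, since clearing cookies resets it); the Vote model, its columns, and the unique index on [content_id, voter_token] are hypothetical:

class VotesController < ApplicationController
  def create
    # One long-lived, signed token per browser.
    token = cookies.permanent.signed[:voter_token] ||= SecureRandom.uuid
    Vote.create!(content_id: params[:content_id], voter_token: token, value: params[:value])
    head :created
  rescue ActiveRecord::RecordNotUnique
    head :conflict # this browser has already voted on this item
  end
end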

Why would Google Search use client-side URL parameters?

Yesterday morning I noticed Google Search was using hash parameters:
http://www.google.com/#q=Client-side+URL+parameters
which seems to be the same as the more usual search (with search?q=Client-side+URL+parameters). (It seems they are no longer using it by default when doing a search using their form.)
Why would they do that?
More generally, I see hash parameters cropping up on a lot of web sites. Is it a good thing? Is it a hack? Is it a departure from REST principles? I'm wondering if I should use this technique in web applications, and when.
There's a discussion by the W3C of different use cases, but I don't see which one would apply to the example above. They also seem undecided about recommendations.
Google has many live experimental features that are turned on/off based on your preferences, location and other factors (probably random selection as well.) I'm pretty sure the one you mention is one of those as well.
What happens in the background when a hash is used instead of a query string parameter is that the page queries the "real" URL (http://www.google.com/search?q=hello) using JavaScript and then modifies the existing page with the content. This appears much more responsive to the user, since the page does not have to reload entirely. The reason for the hash is that browser history and state are maintained. If you go to http://www.google.com/#q=hello, you'll find that you actually get the search results for "hello" (even though your browser is really only requesting http://www.google.com/). With JavaScript turned off, however, it wouldn't work, and you'd just get the Google front page.
Hashes are appearing more and more as dynamic web sites are becoming the norm. Hashes are maintained entirely on the client and therefore do not incur a server request when changed. This makes them excellent candidates for maintaining unique addresses to different states of the web application, while still being on the exact same page.
I have been using them myself more and more lately, and you can find one example here: http://blixt.org/js -- If you have a look at the "Hash" library on that page, you'll see my implementation of supporting hashes across browsers.
Here's a little guide for using hashes for storing state:
How?
Maintaining state in hashes implies that your application (I'll call it an application, since you generally only use hashes for state in more advanced web solutions) relies on JavaScript. Without JavaScript, the only function of hashes would be to tell the browser to find content somewhere on the page.
Once you have implemented some JavaScript to detect changes to the hash, the next step would be to parse the hash into meaningful data (just as you would with query string parameters.)
Why?
Once you've got the state in the hash, it can be modified by your code (or your user) to represent the current state in your application. There are many reasons for why you would want to do this.
One common case is when only a small part of a page changes based on a variable, and it would be inefficient to reload the entire page to reflect that change (Example: You've got a box with tabs. The active tab can be identified in the hash.)
Other cases are when you load content dynamically in JavaScript, and you want to tell the client what content to load (Example: http://beta.multifarce.com/#?state=7001, will take you to a specific point in the text adventure.)
When?
If you have a look at my "JavaScript realm" you'll see a borderline overkill case. I did it simply because I wanted to cram as much JavaScript dynamics into that page as possible. In a normal project I would be conservative about when to do this, and only do it when you will see positive changes in one or more of the following areas:
- User interactivity. Usually the user won't see much difference, but the URLs can be confusing. Remember loading indicators! Loading content dynamically can be frustrating to the user if it takes time.
- Responsiveness (time from one state to another)
- Performance (bandwidth, server CPU)
No JavaScript?
Here comes a big deterrent. While you can safely rely on 99% of your users to have a browser capable of using your page with hashes for state, there are still many cases where you simply can't rely on this. Search engine crawlers, for example. While Google is constantly working to make their crawler work with the latest web technologies (did you know that they index Flash applications?), it still isn't a person and can't make sense of some things.
Basically, you're at a crossroads between compatibility and user experience.
But you can always build a road in between, which of course requires more work. In less metaphorical terms: implement both solutions so that there is a server-side URL for every client-side URL that outputs relevant content. For compatible clients, it would redirect them to the hash URL. This way, Google can index "hard" URLs, and when users click them, they get the dynamic state stuff!
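As a rough, Rails-style sketch of "a server-side URL for every client-side URL" (the route, controller and Result model are illustrative, not anything the answer prescribes): the server URL always returns full rendered results, which is what crawlers and no-JS browsers see, while JS-capable clients can take over and keep subsequent state in the hash.

# config/routes.rb
#   get "/search", to: "search#index"

class SearchController < ApplicationController
  def index
    @query   = params[:q].to_s
    @results = Result.search(@query)  # hypothetical search call
    # Renders complete HTML; the page's JavaScript may then rewrite links to
    # the #q=... form and fetch later result sets dynamically without reloads.
  end
end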
Recently, Google also stopped serving direct links in search results, offering redirects instead.
I believe both have to do with gathering usage statistics: what searches were performed by the same user, in what sequence, which of the search results the user followed, etc.
P.S. Now, that's interesting: direct links are back. I absolutely remember seeing only redirects there in the last couple of weeks. They are definitely experimenting with something.
