What does __utma mean? [closed]

What does it mean when you see things like:
?__utma=1.32168570.1258672608.1258672608.1259628772.2&__utmb=1.4.10.1259628772&
etc. in the URL string?
Maybe it's simple, but I'm thinking it's something I'm not aware of because I see it every now and again.

Here's a good link to explain them. They are cookies used by Google Analytics to track information on your website:
https://developers.google.com/analytics/devguides/collection/analyticsjs/cookie-usage#gajs

Your browser doesn't support cookies (or has them disabled); that's why you see this in the URL.
Google Analytics normally uses the cookies __utma, __utmb, __utmc, and __utmz to track information.
When cookies are disabled, the browser passes this information through the URL as GET parameters.

They are URL parameters; they pass information back to the web server. The general form of a URL is:
protocol://username:password@server:port/path?parameterList#anchorName
Example:
http://stackoverflow.com:80/page?param1=value1&param2=value2
The #anchorName will skip you to a certain part of an HTML page
The parameterList portion is also called the query
The protocol portion is also called the scheme
The username:password part can be omitted
The port will default to 80 if the protocol is HTTP and the port is not specified
If you don't specify the protocol in a web browser, it will default to HTTP.
You will often want to have a single page do multiple things. This is accomplished by accepting different parameters. These parameters typically pass information to the server, which modifies how the next page is displayed or how another action is performed on the server.
Sometimes URL parameters are replaced with nicer-looking URL paths. This is accomplished with newer web frameworks such as ASP.NET MVC, Django, Ruby on Rails, etc.
There is a much more detailed description than what I gave in RFC 3986: Uniform Resource Identifier (URI): Generic Syntax.
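To make the pieces concrete, here is a minimal Ruby sketch (Ruby only because it is the language already used elsewhere on this page; the idea is language-agnostic) that splits the example URL apart with the standard library:

require 'uri'

uri = URI.parse("http://stackoverflow.com:80/page?param1=value1&param2=value2")
uri.scheme                            # => "http"  (the protocol)
uri.host                              # => "stackoverflow.com"
uri.port                              # => 80
uri.path                              # => "/page"
URI.decode_www_form(uri.query).to_h   # => {"param1"=>"value1", "param2"=>"value2"}
uri.fragment                          # => nil     (no #anchorName in this example)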

The __utma Cookie
This cookie is what’s called a “persistent” cookie, as in, it never expires (technically, it does expire…in the year 2038…but for the sake of explanation, let’s pretend that it never expires, ever). This cookie keeps track of the number of times a visitor has been to the site pertaining to the cookie, when their first visit was, and when their last visit occurred. Google Analytics uses the information from this cookie to calculate things like Days and Visits to purchase.
The __utmb and __utmc Cookies
The B and C cookies are brothers, working together to calculate how long a visit takes. __utmb takes a timestamp of the exact moment in time when a visitor enters a site, while __utmc takes a timestamp of the exact moment in time when a visitor leaves a site. __utmb expires at the end of the session. __utmc waits 30 minutes, and then it expires. You see, __utmc has no way of knowing when a user closes their browser or leaves a website, so it waits 30 minutes for another pageview to happen, and if it doesn’t, it expires.
[By Joe Teixeira]
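To connect this back to the string in the question: each __utma value is a dot-separated record. Here is a small Ruby sketch, assuming the field layout documented for ga.js (domain hash, visitor id, timestamps of the first, previous, and current visits, and a session counter); parse_utma is just an illustrative helper name:

def parse_utma(value)
  # Split the dot-separated __utma value into its six documented fields.
  hash, visitor_id, first, previous, current, sessions = value.split('.')
  {
    domain_hash:    hash,
    visitor_id:     visitor_id,
    first_visit:    Time.at(first.to_i),    # Unix timestamps
    previous_visit: Time.at(previous.to_i),
    current_visit:  Time.at(current.to_i),
    session_count:  sessions.to_i
  }
end

parse_utma("1.32168570.1258672608.1258672608.1259628772.2")
# => a hash of the six fields as Ruby values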

It is related to Google Analytics... it's used for their tracking. Although I suspect Brian's answer answers what you were really asking...

Related

Google script origin request url

I'm developing a Google Sheets add-on. The add-on calls an API. In the API configuration, a url like https://longString-script.googleusercontent.com had to be added to the list of urls allowed to make requests from another domain.
Today, I noticed that this url changed to https://sameLongString-0lu-script.googleusercontent.com.
The url changed about 3 months after development start.
I'm wondering what makes the url change, because it also means a change in the configuration of our back-end every time.
EDIT: Thanks for both your responses so far. Helped me understand better how this works but I still don't know if/when/how/why the url is going to change.
Quick update, the changing part of the url was "-1lu" for another user today (but not for me when I was testing). It's quite annoying since we can't use wildcards in the google dev console redirect uri field. Am I supposed to paste a lot of "-xlu" uris with x from 1 to like 10 so I don't have to touch this for a while?
For people coming across this now, we've also just encountered this issue while developing a Google Add-on. We've needed to add multiple origin urls to our oauth client for sign-in, following the longString-#lu-script.googleusercontent.com pattern mentioned by OP.
This is annoying as each url has to be entered separately in the authorized urls field (subdomain or wildcard matching isn't allowed). Also this is pretty fragile since it breaks if Google changes the urls they're hosting our add-on from. Furthermore I wasn't able to find any documentation from Google confirming that these are the script origins.
URLs are managed by the host in various ways. At the most basic level, when you build a web server you decide what to call it and what to call any pages on it. Google and other large content providers with farms of servers and redundant data centers and everything are going to manage it a bit differently, but for your purposes, it will be effectively the same in that ... you need to ask them since they are the hosting provider of your cloud content.
Something that MIGHT be related is that Google recently rolled out (or at least scheduled) some changes dealing with the googleusercontent.com domain and Picasa images. So the Google support forums will be the way to go with this question for the freshest answers, since the cause of a URL change is usually specific to that moment in time and not something you necessarily need to worry about changing repeatedly. But again, they will need to confirm whether it was related to those recently planned changes... or not. :-)
When you find something out you can update this question in case it is of use to others. Especially, if they tell you that it wasn't a one time thing dealing with a change on their end.
This is more likely related to Changing origin in Same-origin Policy. As discussed:
A page may change its own origin with some limitations. A script can set the value of document.domain to its current domain or a superdomain of its current domain. If it sets it to a superdomain of its current domain, the shorter domain is used for subsequent origin checks.
For example, assume a script in the document at http://store.company.com/dir/other.html executes the following statement:
document.domain = "company.com";
After that statement executes, the page can pass the origin check with http://company.com/dir/page.html
So, as noted:
When using document.domain to allow a subdomain to access its parent securely, you need to set document.domain to the same value in both the parent domain and the subdomain. This is necessary even if doing so is simply setting the parent domain back to its original value. Failure to do this may result in permission errors.

Is it possible to ensure that requests come from a specific domain?

I'm making a Rails polling site, which should have results that are very accurate. Users vote using POST links. I've taken pains to make sure users only vote once, and know exactly what they're voting for.
But it occurred to me that third parties with an interest in the results could put up POST links on their own websites, that point to my voting paths. They could skew my results this way, for example by adding a misleading description.
Is there any way of making sure that the requests can only come from my domain? So a link coming from a different domain wouldn't run any of the code in my controller.
There are various things that you'll need to check. First is request.referer, which will tell you the page that referred the link to your site. If it's not your site, you should reject it.
if URI(request.referer).host != my_host   # my_host: your site's canonical hostname
  raise ArgumentError, "Invalid request from external domain"
end
However, this only protects you from web clients (browsers) that accurately populate the HTTP referer header. And that's assuming that it came from a web page at all. For instance, someone could send a link by email, and an email client is unlikely to provide a referer at all.
In the case of no referer, you can check for that, as well:
if request.referer.blank?
  raise ArgumentError, "Invalid request from unknown domain"
elsif URI(request.referer).host != my_host
  raise ArgumentError, "Invalid request from external domain"
end
It's also very easy with simple scripting to spoof the HTTP 'referer', so even if you do get a valid domain, you'll need other checks to ensure that it's a legitimate POST. Script kiddies do this sort of thing all the time, and with a dozen or so lines of Ruby, python, perl, curl, or even VBA, you can simulate interaction by a "real user".
You may want to use something like a request/response key mechanism. In this approach, the link served from your site includes a unique key (that you track) for each visit to the page, and that only someone with that key can vote.
How you identify voters is important, as well. Passive identification techniques are good for non-critical activities, such as serving advertisements or making recommendations. However, this approach regularly fails a measurable percentage of the time when used across the general population. When you also consider the fact that people actually want to corrupt voting activities, it's very easy to suddenly become a target for everyone with a good concept to "beat the system" and some spare time on their hands.
Build in as much security as possible early on, because you'll need far more than you expect. During the 2012 Presidential Election, I was asked to pre-test 41 online voting sites, and was able to break 39 of them within the first 24 hours (6 of them within 1 hour). Be overly cautious. Know how attackers can get in, not just using "normal" mechanisms. Don't publish information about which technologies you're using, even in the code. Seeing "Rails-isms" anywhere in the HTML or Javascript code (or even the URL pathnames) will immediately give the attacker an enormous edge in defeating your safety mechanisms. Use obscurity to your advantage, and use security everywhere that you can.
NOTE: Checking the request.referer is like putting a padlock on a bank vault: it'll keep out those that are easily dissuaded, but won't even slow down the determined individual.
What you are trying to prevent here is basically cross-site request forgery. As Michael correctly pointed out, checking the Referer header will buy you nothing.
A popular counter-measure is to give each user an individual one-time token that is sent with each form and stored in the user's session. If, on submit, the submitted value and the stored value do not match, the request is discarded. Luckily for you, RoR seems to ship such a feature, and it looks like a one-liner indeed.
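For reference, the Rails feature in question is protect_from_forgery, which embeds a per-session authenticity token in every form and verifies it on each non-GET request. A minimal sketch of enabling it (exact options depend on your Rails version):

# app/controllers/application_controller.rb
class ApplicationController < ActionController::Base
  # Reject (here: raise on) any POST/PUT/DELETE whose authenticity token
  # does not match the one stored in the session.
  protect_from_forgery with: :exception
end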

How does Rails handle multiple incoming requests? [closed]

How does Rails handle multiple requests from different users without them colliding? What is the logic?
E.g., user1 logs in and browses the site. At the same time, user2, user3, ... log in and browse. How does Rails manage this situation without any data conflicts between users?
One thing to bear in mind here is that even though users are using the site simultaneously in their browsers, the server may still only be handling a single request at a time. Request processing may take less than a second, so requests can be queued up and processed without causing significant delays for the users. Each response starts from a blank slate, taking only information from the request and using it to look up data from the database; nothing is carried over from one request to the next. This is called the "stateless" paradigm.
If the load increases, more rails servers can be added. Because each response starts from scratch anyway, adding more servers doesn't create any problems to do with "sharing of information", since all information is either sent in the request or loaded from the database. It just means that more requests can be handled per second.
When there is a feeling of "continuity" for the user, for example staying logged into a website, this is done via cookies, which are stored on their machine and sent through as part of the request. The server can read this cookie information from the request and, for example, NOT redirect someone to the login page, because the cookie says they have already logged in as user 123 or whatever.
If your question is about how Rails distinguishes users, the answer is that it uses cookies to store the session. You can read more about it here.
Also, data does not conflict, since you get a fresh instance of the controller for each request.
RailsGuides:
When your application receives a request, the routing will determine which controller and action to run, then Rails creates an instance of that controller and runs the method with the same name as the action.
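To illustrate the "fresh controller instance per request" point, here is a minimal sketch (PollsController and Poll are hypothetical names, not from the question):

# config/routes.rb
Rails.application.routes.draw do
  get "/polls/:id", to: "polls#show"
end

# app/controllers/polls_controller.rb
class PollsController < ApplicationController
  # A brand-new PollsController instance is created for every request,
  # so instance variables like @poll can never leak between users.
  def show
    @poll = Poll.find(params[:id])
  end
end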
That is guaranteed not by Rails but by the database that the webservice uses. The property you mentioned is called isolation. This is among several properties that a practical database has to satisfy, known as ACID.
This is achieved using a "session": a bunch of data specific to the given client, available server-side.
There are plenty of ways for a server to store a session; typically Rails uses a cookie: a small dataset (limited to about 4 kB) that is stored in the user's browser and sent with every request. For that reason you don't want to store too much in there. However, you usually don't need much: just enough to identify the user while still making it hard to impersonate them.
Because of that, Rails stores the session itself in the cookie (as this guide says). It's simple and requires no setup. Some consider the cookie store unreliable and use persistence mechanisms instead: databases, key-value stores, and the like.
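For example, the session store is configurable; a sketch of the relevant setting (the key name "_myapp_session" is just a placeholder):

# config/initializers/session_store.rb
# Default: the whole (signed/encrypted) session lives inside the cookie itself.
Rails.application.config.session_store :cookie_store, key: "_myapp_session"

# Alternative: keep session data server-side and hand the browser only an id,
# e.g. via the cache store.
# Rails.application.config.session_store :cache_store, key: "_myapp_session"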
Typically the workflow is as follows:
A session id is stored in a cookie when the server decides to initialize a session
A server receives a request from the user, fetches session by its id
If the session says that it represents user X, Rails acts as if it's actually him
Since different users send different session ids, Rails treats them as different users and outputs data relevant to the detected one, on a per-request basis.
Before you ask: yes, it is possible to steal the other person's session id and act in that person's name. It's called session hijacking and it's only one of all the possible security issues you might run into unless you're careful. That same page offers some more insight on how to prevent your users from suffering.
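A short, hypothetical sketch of how this usually looks in a controller (SessionsController, User, and the parameter names are illustrative, not from the question):

# app/controllers/sessions_controller.rb
class SessionsController < ApplicationController
  def create
    user = User.find_by(email: params[:email])
    if user&.authenticate(params[:password])
      reset_session                 # new session id, guards against session fixation
      session[:user_id] = user.id   # stored in the signed/encrypted session cookie
      redirect_to root_path
    else
      render :new, status: :unprocessable_entity
    end
  end
end

# On later requests, each controller looks the user up fresh:
#   current_user = User.find_by(id: session[:user_id])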
As an additional note, you could use a multithreaded server such as Puma to handle several requests concurrently.

How does one set the auto-logout time in mochiweb?

I'm looking at the source code for mochiweb and seeing numbers that test cookie expiration times, which look nothing like the behavior of the server I've inherited. mochiweb has 111 and 86417 (a day plus 17 seconds) in its source, but it appears to use those only for cookie expiration, and only in test code (see mochiweb_cookies.erl).
The server that I'm looking at is timing out users in about 10-15 minutes, but nowhere do I see any code that is setting the cookie value, nor do I see any code path through the mochiweb source that would even allow me to set it.
Any ideas?
There are really two questions here: "How is my application doing session expiration?" and "How do I set a cookie with mochiweb_cookies?" Only the second one can be reasonably answered without further information.
Req:ok("text/plain",
       [mochiweb_cookies:cookie("session", "my-session-id", [{max_age, 86417}])],
       "you're logged in!")
mochiweb_cookies:cookie/3 returns a {"Set-Cookie", "headervalue"} pair which is appropriate as a value in the ResponseHeaders arguments of mochiweb_request:respond and mochiweb_request:ok.
It is of course possible to set cookies in mochiweb without using the mochiweb_cookies module, they're just headers after all. Your application may be setting the cookie by handcrafting the header, or a proxy or another application service hosted under the same domain may be setting the cookie.
That being said, if at all possible you should avoid relying on cookie expiration to log out users. The max-age is really just a hint to the browser to stop sending the cookie after that time has passed. A browser or an attacker can always misbehave and send the cookie indefinitely.

What is the point of www in web urls? [closed]

I've been trying to collect analytics for my website and realized that Google analytics was not setup to capture data for visitors to www.example.com (it was only setup for example.com). I noticed that many sites will redirect me to www.example.com when I type only example.com. However, stackoverflow does exactly the opposite (redirects www.stackoverflow.com to just stackoverflow.com).
So, I've decided that in order to get accurate analytics, I should have my web server redirect all users to either www.example.com, or example.com. Is there a reason to do one or the other? Is it purely personal preference? What's the deal with www? I never type it in when I type domains in my browser.
History lesson.
There was a time when the Web did not dominate the Internet. An organisation with a domain (e.g. my university, aston.ac.uk) would typically have several hostnames set up for various services: gopher.aston.ac.uk (Gopher is a precursor to the World-wide Web), news.aston.ac.uk (for NNTP Usenet), ftp.aston.ac.uk (FTP - including anonymous FTP archives). They were just the obvious names for accessing those services.
When HTTP came along, the convention became to give the web server the hostname "www". The convention was so widespread, that some people came to believe that the "www" part actually told the client what protocol to use.
That convention remains popular today, and it does make some amount of sense. However it's not technically required.
I think Slashdot was one of the first web sites to decide to use a www-less URL. Their head man Rob Malda refers to "TCWWW" - "The Cursed WWW" - when press articles include "www" in his URL. I guess that for a site like Slashdot which is primarily a web site to a strong degree, "www" in the URL is redundant.
You may choose whichever you like as the canonical address. But do be consistent. Redirecting from other forms to the canonical form is good practice.
Also, skipping the “www.” saves you four bytes on each request. :)
It's important to be aware that if you don't use www (or some other subdomain), then all cookies will be submitted to every subdomain, and you won't be able to have a cookie-less subdomain for serving static content, which would reduce the amount of data sent back and forth between the browser and the server. That is something you might later come to regret.
(On the other hand, authenticating users across subdomains becomes harder.)
It's just a subdomain based on tradition, really. There's no point to it if you don't like it, and it wastes typing time as well. I like http://somedomain.com more than http://www.somedomain.com for my sites.
It's primarily a matter of establishing indirection for hostnames. If you want to be able to change where www.example.com points without affecting where example.com points, this matters. This was more likely to be useful when the web was younger, and the "www" helped make it clear why the box existed. These days, many, many domains exist largely to serve web content, and the example.com record all but has to point to the HTTP server anyway, since people will blindly omit the www. (Just this week I was horrified when I tried going to a site someone had mentioned, only to find that it didn't work when I omitted the www, or when I accidentally added a trailing dot after the TLD.)
Omitting the "www" is very Web 2.0 Adoptr Gamma... but with good reason. If people only go to your site for the web content, why keep re-adding the www? I general, I'd drop it.
http://no-www.org/
Google Analytics should work just fine with or without a www subdomain, though. Plenty of sites use GA successfully without forcing either one.
It is the third-level domain (see Domain name). There was a time when it designated a physical server: some sites used URLs like www1.foo.com, www3.foo.com, and so on.
Now it is more virtual (different third-level domains pointing to the same server, or the same URL handled by different servers), but it is often used to handle sub-domains, and with some tricks you can even handle an infinite number of them: see, precisely, Wikipedia, which uses this level for the language (en.wikipedia.org, fr.wikipedia.org, and so on), or other sites that use it to give friendly URLs to their users (e.g. my page http://PhiLho.deviantART.com).
So the www. isn't just there for decoration; it has a purpose, even if the vast majority of sites just stick to this default and supply it automatically when it isn't provided. I have seen sites that forgot to redirect, giving an error if you omitted the www even though they advertised the www-less URL: they expected users to supply it automatically!
Besides, the URL already specifies which protocol is to be used, so "www." is really of no use.
As far as I remember, in former times services like www and ftp were located on different machines, so using the natural DNS feature of subdomains was necessary at the time (more or less).
