Where is the difference between locating and identifying a resource? [duplicate] - url

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
What's the difference between a URI and a URL?
Is there any difference? I'm talking about URI for identifying, but URL for locating. Aren't both the same thing?

They can look the same, but they're not the same thing. A URL identifies something that can be transferred over some protocol (often http). A URI can be used to identify a namespace (for example), but there might not be any content at the address.

Where is the difference between locating a resource and identifying a resource?
Knowing who I am doesn't tell you anything about where I am.

A URI identifies a resource either by location, or a name, or both. More often than not, we use URIs that define the location of a resource.
A URL is a specialization of URI that defines the network location of a specific resource.
Generally, if the URL describes both the location and name of a resource, the term to use is URI.

This article might help:
URI vs. URL
Excerpt:
"...a URL is a type of URI that identifies a resource via a representation of its primary access mechanism (e.g., its network "location"), rather than by some other attributes it may have. Thus as we noted, "http:" is a URI scheme. An http URI is a URL. The phrase "URL scheme" is now used infrequently, usually to refer to some subclass of URI schemes..."

An identifier is a unique name for something, so we can be sure that we are talking about the same thing. For example, the Atom namespace is 'http://www.w3.org/2005/Atom'. This is a URI. This doesn't mean that you can put this URI in a browser and find a document there (well, in the case of Atom, yes, there is a document, but it's a simple presentation of Atom for convenience, not the Atom namespace itself).
A URL is the location of a document. This is what you can put in your browser. It is confusing that both use the same format (http://...) but that is mostly anecdotal ...

A URL is a URI which is not a URN. (see)
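To make the rule of thumb in that last answer concrete, here is a minimal sketch in Python. The function name kind is made up for illustration, and the rule "scheme urn means URN, anything else means URL" is only the simplification stated above, not something the standards enforce.

```python
from urllib.parse import urlsplit

def kind(uri: str) -> str:
    """Classify a string using the rough rule 'a URL is a URI which is not a URN'."""
    scheme = urlsplit(uri).scheme  # urlsplit lowercases the scheme
    if not scheme:
        return "not an absolute URI"
    return "URN" if scheme == "urn" else "URL"

print(kind("urn:isbn:0451450523"))          # names a book, says nothing about where it is
print(kind("http://www.w3.org/2005/Atom"))  # syntactically a URL, even when used only as a name
print(kind("/relative/path"))               # no scheme at all
```

Note that the Atom namespace comes out as a URL here: the syntax alone cannot tell you whether a string is meant as a name or as a location.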

Related

How can I use the at symbol (@) in a URL?

For example, I came across this type of URL: http://username:token@example.com/protected/files .
I searched on the web for this but I don't find what I expected.
Wikipedia explains the syntax of a URI quite well:
scheme:[//[user[:password]@]host[:port]][/path][?query][#fragment]
Your username:token@example.com/protected/files may look like a URL, but in fact it is not, because it does not include the protocol to access the data. It is a URI.
Browsers (I suppose you mean web browsers) work with URLs, which are a subtype of URI that includes the protocol. Note that web browsers don't work with all existing protocols, only with some of them (http, https, ftp, file...), and username is not a protocol.
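As a rough illustration of the syntax above, Python's standard urllib.parse splits the user and password out of the part before the @. The variable names are arbitrary. The second half shows why the scheme-less string is a problem: a generic parser reads everything before the first colon as the scheme.

```python
from urllib.parse import urlsplit

# With a scheme, the userinfo before "@" is split out of the authority.
full = urlsplit("http://username:token@example.com/protected/files")
print(full.scheme)    # http
print(full.username)  # username
print(full.password)  # token
print(full.hostname)  # example.com
print(full.path)      # /protected/files

# Without a scheme, "username" is taken to be the scheme and no host is found.
bare = urlsplit("username:token@example.com/protected/files")
print(bare.scheme)    # username
print(bare.hostname)  # None
```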

Are there any URL shortening services that support protocol relative urls?

I need to shorten a protocol relative url such as //www.example.com/longurl into another protocol relative url //short/url so an iframe can be embedded on a page independent of the used protocol. Are there any free URL shortening services that allow this? I've tried isgd, tinyurl, googl, bitly, owly among others, but they all either don't accept the url or prepend it with http:// so it turns into http:////www.example.com/longurl.
It's undocumented, but it turns out that goo.gl supports this; you just have to modify the URL it gives you, although it seems the URLs I tried were longer than what they accept.
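For anyone unsure what "protocol relative" means in practice, here is a small sketch using Python's urljoin, which resolves a reference the way a browser would. The base page addresses (host.example) are placeholders, not anything from the question.

```python
from urllib.parse import urljoin

relative = "//www.example.com/longurl"

# The reference has a host but no scheme, so it inherits the scheme of the page it is on.
over_http = urljoin("http://host.example/page.html", relative)
over_https = urljoin("https://host.example/page.html", relative)

print(over_http)   # http://www.example.com/longurl
print(over_https)  # https://www.example.com/longurl
```

This is why prepending http:// to the whole string, as some shorteners do, defeats the purpose: the scheme is then fixed instead of inherited.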

Is a "file://" path a URL?

I sometimes see people refer to file system paths (POSIX/Windows) as both URIs and URLs. I'm no file system buff, but I have yet to find a file system path that conflicts with my understanding of the URL format. That is, of course, given that it includes the scheme name (e.g. file://localhost/path/to/file.txt).
File system paths are most definitely URIs - I mean, what's not - so everyone referring to file system paths as URIs is inside the safe zone. But is it safe to call them URLs?
If the URL was defined by a single (non-obsolete) RFC, rather than being comprised of half a dozen specialized ones, I wouldn't have to ask this question.
file is a registered URI scheme (for "Host-specific file names").
It links to RFC 1738, which is called "Uniform Resource Locators (URL)", in which file is specified:
A file URL takes the form:
file://<host>/<path>
So yes, file URIs are URLs.
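As a quick check that a file URL fits the generic URL shape, here is a sketch using Python's urlsplit on the example from the question, plus the common form with an empty host. This only shows how one parser reads the strings; it is not a statement about what the RFC requires.

```python
from urllib.parse import urlsplit

# Form from RFC 1738: file://<host>/<path>
with_host = urlsplit("file://localhost/path/to/file.txt")
print(with_host.scheme)  # file
print(with_host.netloc)  # localhost
print(with_host.path)    # /path/to/file.txt

# Same shape with an empty host, which is where the three slashes come from.
empty_host = urlsplit("file:///path/to/file.txt")
print(repr(empty_host.netloc))  # ''
print(empty_host.path)          # /path/to/file.txt
```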
However, the subdivision of URIs into URLs, URNs and "Other" (like data) is not that useful anyway. FWIW, the WHATWG URL spec tries to standardize on the term "URL" for all kinds of URIs (even those that aren't URLs today, following the RFC). The W3C Note "URIs, URLs, and URNs: Clarifications and Recommendations 1.0" tries to summarize the confusion about the terms:
The body of documents (RFCs, etc) covering URI architecture, syntax, registration, etc., spans both the classical and contemporary periods. People who are well-versed in URI matters tend to use "URL" and "URI" in ways that seem to be interchangable. Among these experts, this isn't a problem. But among the Internet community at large, it is. People are not convinced that URI and URL mean the same thing, in documents where they (apparently) do. […]

Validating URL domain in Rails

I want to validate a URL, so I searched and found this
Brian Ray said in his post that
"@Tate's answer is good for a full URL, but if you want to validate a domain column, you don't want to allow the extra URL bits his regex allows (e.g. you definitely don't want to allow a URL with a path to a file).
So I removed the protocol, port, file path, and query string parts of the regex, resulting in this:"
I don't understand what he said at all. How can a URL be a path to a file? What is a "domain column"?
A URL consists of several parts. If you have a very elaborate URL, like:
http://www.example.com:1234/path/to/file.html?key1=value1&key2=value2
The parts are:
protocol: http://
host name: www
domain name: example.com
port: 1234
file path: path/to/file.html
query string: key1=value1&key2=value2
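The same breakdown can be reproduced with Python's standard library; this is only a sketch to show the parts, not the regex from the linked answer. Note that urlsplit does not separate "host name" from "domain name" as the list above does; it reports www.example.com as a single hostname.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.example.com:1234/path/to/file.html?key1=value1&key2=value2"
parts = urlsplit(url)

print(parts.scheme)           # http
print(parts.hostname)         # www.example.com (host and domain together)
print(parts.port)             # 1234 (as an integer)
print(parts.path)             # /path/to/file.html
print(parts.query)            # key1=value1&key2=value2
print(parse_qs(parts.query))  # {'key1': ['value1'], 'key2': ['value2']}
```

A "domain column" in the quoted answer would then hold only the hostname part, with nothing before or after it.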
The only parts that may not be omitted are the protocol (but many programs allow defaulting to http://) and host name. Each part has its own requirements for what are legal characters in it. And what's worse, not all web servers agree on what those requirements are. So the only thing you can check without making an actual connection and seeing if it fails, is the part which is needed to contact the web server. This is only the protocol, host and domain name, and port. These are all case insensitive (the rest may not be). I'm not sure what are valid characters in a host or domain name, but this is also something where name servers may not agree with the specification.
In short, the only way to check if a URL is valid is to try to make a connection to it. If your program uses some magic to reject URLs (or email addresses), some people are going to hate you and/or their internet provider for it (because even if your check follows the specification, some host or domain names don't).
As to your question how a URL can refer to a local file, there is a special protocol for that: file://. Since the path must start with a / as well, this results in URLs like file:///home/user/file.html, so with three slashes at the start.

Is there a script or other method for obtaining the correct variation of a URL for a web page?

I'm assuming there is a single correct variation of a URL for every page. Please correct me if I'm wrong.
Given an input of an equivalent URL, I need to get the correction of a URL. For example, most browsers accept slight variations from the exact URL but then correct it to take you to the right page? (Or perhaps this is done at the DNS level?)
The task I'm working on is getting the correct MD5 hash of a URL that will be recognized by an API service that returns information about a URL. For example, if I hash 'http://stackoverflow.com', I get an empty response. In order to get a valid response I need to hash 'https://stackoverflow.com/', (with a trailing slash).
EDIT: The API service I'm using is the Delicious API. In case that resonates with anyone's experience.
"I'm assuming there is a single correct variation of a URL for every page. Please correct me if I'm wrong."
There is only a single "correct" one if the author decides that there should be, then they will likely use a combination of canonical and HTTP redirects to push people in that direction.
"For example, most browsers accept slight variations from the exact URL but then correct it to take you to the right page?"
Host names are case insensitive, and the root doesn't need a slash (so http://example.com and http://EXAMPLE.cOM/ are identical).
Beyond that, the rest of the URL (except for a fragment identifier if there is one) is handled entirely by the HTTP server. It might treat it as case sensitive, it might not. It might require things in a certain order, it might not.
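Since the hash is taken over the exact string, the two equivalences mentioned above (host case and the root slash) can be applied before hashing. The following is a minimal sketch; normalize and md5_hex are made-up helper names, and whether the result matches what the Delicious API expects is an assumption that would need checking against the service.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit

def normalize(url: str) -> str:
    """Lowercase the host and give an empty path a root slash; leave the rest alone."""
    parts = urlsplit(url)  # the scheme is lowercased by urlsplit
    # Lowercase only the host[:port] part, not any user:password before "@".
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    path = parts.path or "/"
    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))

def md5_hex(url: str) -> str:
    return hashlib.md5(url.encode("utf-8")).hexdigest()

print(normalize("http://EXAMPLE.cOM"))   # http://example.com/
print(normalize("http://example.com/"))  # http://example.com/

# The raw strings hash differently; the normalized ones hash the same.
print(md5_hex("http://example.com") == md5_hex("http://example.com/"))
print(md5_hex(normalize("http://EXAMPLE.cOM")) == md5_hex(normalize("http://example.com/")))
```

This deliberately does not touch the path, query, or scheme choice (http vs. https), because those are decided by the server; finding the form the server itself prefers means following its redirects or reading its canonical link.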

Resources