Relative urls vs Protocol-relative URLs - url

I am just wondering if I use a relative URL as follows:
"/myfolder"
It will change to
mydomain/myfolder
But does it also preserve whether the page was loaded over HTTP or HTTPS, similar to the "//" approach?
i.e. if the page loading my relative URL /myfolder has HTTPS will this change to
"https://mydomain/myfolder"

tl;dr: Yes.
Relative references are always resolved against a base URI.
In HTML5, the document base URL is, in the common case (i.e., no base element, no iframe-srcdoc document, no about:blank), the document's address.
So if you have a document at http://example.com/foo, a link with the relative reference /bar will link to the URL http://example.com/bar. And if the document is at https://example.com/foo, it will link to https://example.com/bar.
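As a rough illustration of the resolution described above, here is a minimal sketch using Python's standard urllib.parse.urljoin. Browsers use their own URL parser, so this only approximates what they do, but the result for this simple case is the same.

```python
from urllib.parse import urljoin

# The same relative reference, resolved against an HTTP and an HTTPS document.
http_link = urljoin("http://example.com/foo", "/bar")
https_link = urljoin("https://example.com/foo", "/bar")

print(http_link)   # scheme is taken from the HTTP base
print(https_link)  # scheme is taken from the HTTPS base
```

The scheme of the result always comes from the base, which is why /bar stays on HTTPS when the page itself is on HTTPS.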

Related

What is the relative URL for https://example.com//path?

What is the relative (to https://example.com) URL for "https://example.com//path"?
It's not "//path" as that's an absolute URL.
the absolute URL is /path
https://example.com//path is the same as https://example.com/path
The absolute URL is indeed https://example.com//path. It's called absolute because you don't need anything else to navigate to that URL, unlike relative URLs, which are resolved against the current document, like /path in your case.
A relative URL like this will always refer back to the domain it's on,
so if you're on http://example.com then /path will resolve to http://example.com/path
and if you move your app to http://example2.com then the URL will resolve to http://example2.com/path
The double slash is a mistake, but most of the time the web servers fix that error and redirect you back to the correct url, like for example
https://stackoverflow.com/////a//////28782401/////2149092 which will redirect you back to this same answer.
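To see why "//path" cannot serve as the relative URL here, the following sketch resolves both candidates against https://example.com/ with Python's urllib.parse. It only demonstrates how the references are parsed; whether a server treats //path and /path as the same resource is up to that server.

```python
from urllib.parse import urljoin, urlsplit

base = "https://example.com/"

# "//path" is read as a network-path reference, so "path" becomes the host.
as_network_path = urljoin(base, "//path")

# "/path" keeps the host of the base and replaces only the path.
as_absolute_path = urljoin(base, "/path")

# In the full URL, the double slash is simply part of the path component.
double_slash_path = urlsplit("https://example.com//path").path

print(as_network_path, as_absolute_path, double_slash_path)
```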

Are protocol-relative URLs relative URLs?

So consider a protocol-relative URL like so;
//www.example.com/file.jpg
The idea I've had in my head for as long as I can remember is that protocol-relative URLs are in fact absolute URLs. They behave exactly like absolute URLs, and never do they work like relative URLs. I wouldn't expect this to make the browser go find something at
http://www.example.com///www.example.com/file.jpg
The URL defines the host and the path (like an absolute URL does), and the scheme is inherited from whatever the page used, and therefore it makes a complete unambiguous URL, i.e. an absolute URL.
Right?
Now, upon further research into this, I came upon this answer, which states;
A URL is called an absolute URL if it begins with the scheme and scheme specific part (here // after http:). Anything else is a relative URL.
Neither the question nor the answer specifically discuss protocol-relative URLs, so I'm mindful that it can just be an oversight in wording.
However, I'm now also running into an issue in my development, where a system that only accepts absolute URLs doesn't function with protocol-relative URLs, and I don't know if that's by design or due to a bug.
The RFC3986 section which is often linked to in relation to protocol-relative URLs also splashes the word "relative" around a lot. 4.3 then goes on to say that absolute URIs define a scheme.
All this evidence against my initial assumption led me to the question;
Are protocol-relative URLs relative or absolute?
Every relative URL is an unambiguous URL given the URL it is relative to. So if your page is http://mypage.com/some/folder/ then you know the relative URL this/that corresponds to http://mypage.com/some/folder/this/that and you know the relative URL //otherpage.com/ resolves to http://otherpage.com/. Importantly, it cannot be resolved without knowing the page URL it is relative to.
A relative URL is any URL that is relative to something and cannot be resolved by itself. An absolute URL does not require any context whatsoever to resolve.
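The two resolutions mentioned above can be checked with a short sketch, again assuming Python's urllib.parse.urljoin as a stand-in for the browser's resolver:

```python
from urllib.parse import urljoin

page = "http://mypage.com/some/folder/"

# A path-relative reference is appended to the directory of the page.
path_relative = urljoin(page, "this/that")

# A scheme-relative reference replaces the host but inherits the scheme.
scheme_relative = urljoin(page, "//otherpage.com/")

print(path_relative)
print(scheme_relative)
```

Neither reference can be turned into a full URL without the page variable, which is the sense in which both are relative.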
What you are calling a “protocol-relative URL” WHATWG calls a “scheme-relative URL” in the URL Standard document, and it is not an absolute URL, but a relative URL.
Granted, most sites available on HTTPS show the same content on the corresponding HTTP URLs, but that is not necessarily the case, so it makes sense that a URL that does not include the scheme cannot be considered absolute.
From the document:
An absolute URL must be a scheme, followed by ":", followed by either a scheme-relative URL, if scheme is a relative scheme, or scheme data otherwise, optionally followed by "?" and a query.
Specifically answering your question, we have:
A relative URL must be either a scheme-relative URL, an absolute-path-relative URL, or a path-relative URL that does not start with a scheme and ":", optionally followed by a "?" and a query.
At the point where a relative URL is parsed, a base URL must be in scope.
Examples (brackets indicate optional)
path-relative URL [path segment][/[path segment]]…
about
about/staff.html
about/staff.html?
about/staff.html?parameters
absolute-path-relative URL: /[path-relative URL]
/
/about
/about/staff.html
/about/staff.html?
/about/staff.html?parameters
scheme-relative URL: //[userinfo@]host[:port][absolute-path-relative URL]
//username:password@example.com:8888
//username@example.com
//example.com
//example.com/
//example.com/about
//example.com/about/staff.html
//example.com/about/staff.html?
//example.com/about/staff.html?parameters
absolute URL: scheme:[scheme-relative URL][?parameters]
https://username:password@example.com:8888
https://username@example.com
https://example.com
https://example.com/
https://example.com/about
https://example.com/about/staff.html
https://example.com/about/staff.html?
https://example.com/about/staff.html?parameters
relative URL:
Anything from scheme-relative URL list
Anything from absolute-path-relative URL list
Anything from path-relative URL list
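The categories above can be told apart mechanically. The classify function below is a hypothetical helper written for illustration, not something from the URL Standard; it is a minimal sketch that relies on Python's urlsplit to detect a scheme and on simple prefix checks for the rest.

```python
from urllib.parse import urlsplit

def classify(reference: str) -> str:
    """Name the category of a URL string, following the lists above."""
    if urlsplit(reference).scheme:
        return "absolute"                 # starts with scheme and ":"
    if reference.startswith("//"):
        return "scheme-relative"          # host given, scheme inherited
    if reference.startswith("/"):
        return "absolute-path-relative"   # path from the root of the host
    return "path-relative"                # path relative to the base path

for example in ("https://example.com/about", "//example.com/about",
                "/about/staff.html", "about/staff.html?parameters"):
    print(example, "->", classify(example))
```

Everything except the first category counts as a relative URL and needs a base URL in scope.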
Note: This answer does not disagree with the first answer, but it was only somewhat clear to me that post answered the question after reading it several times and doing further research. Hopefully this answer spells it out better for others stumbling on this.

Base URL Setting Improperly

In my app, I scan a webpage, extract certain parts, and build an HTML String to load in a webview. Because of this, I have to set a base URL for links that can be clicked on. I currently use:
[webView loadHTMLString:self.html baseURL:[NSURL URLWithString:@"http://www.ocacademy.org/ocacademy"]];
The issue is that subsequent links only have http://www.ocacademy.org/ in front of them instead of with the subdirectory ocacademy. Any thoughts as to what is messing up here?
You need to either append a / or more explicitly /name_of_html_file.html to the base URL.
Any trailing filename in the base URL will be stripped off when constructing a relative URL (ocacademy in this case).
So given a base URL of http://www.ocacademy.org/ocacademy and a relative reference to image.png, the ocacademy is stripped off to give a parent directory http://www.ocacademy.org/ and the resulting URL will be http://www.ocacademy.org/image.png.
If the base URL is http://www.ocacademy.org/ocacademy/.* then the .* is stripped off before constructing the URL and you will get http://www.ocacademy.org/ocacademy/image.png (what you want).
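The effect of the trailing slash can be reproduced outside the app. This sketch uses Python's urllib.parse.urljoin rather than NSURL, on the assumption that both follow the standard rule of dropping the last path segment of the base before merging:

```python
from urllib.parse import urljoin

# Without a trailing slash, "ocacademy" is treated as a file name and dropped.
without_slash = urljoin("http://www.ocacademy.org/ocacademy", "image.png")

# With a trailing slash, "ocacademy/" is kept as the directory.
with_slash = urljoin("http://www.ocacademy.org/ocacademy/", "image.png")

print(without_slash)
print(with_slash)
```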

Is the Scheme Optional in URIs?

I was recently asked to add some Woopra JavaScript to a website and noticed that the URL started with a double slash (i.e. omitted the scheme). I've never seen this before, so I went trying to find out more about it, but the only thing I could really find was an item on the Woopra FAQ:
The Woopra JavaScript in the Setup does not include http in the URL call for the script. This is correct. The JavaScript has been optimized to run very fast and efficiently on your site.
However, some validation and site testing/debugging services and tools do not recognize the code as correct. It is correct and valid. If the warnings annoy you, just add the http to the script’s URL. It will not impact the script.
(For clarification, the URL is "//static.woopra.com/js/woopra.v2.js"—the colon is omitted in addition to the "http".)
Is there any more information about this practice? If this is indeed valid, there must be a spec that talks about it, and I'd very much like to see it.
Thanks in advance for satisfying my curiosity!
This is a valid URL. It's called a "network-path reference" as defined in RFC 3986. When you don't specify a scheme/protocol, it will fall back to the current scheme. So if you are viewing a page via https:// all network path references will also use https.
For an example, here's a link to the RFC 3986 document again but with a network path reference. If you were viewing this page over https (although it looks like you can't use https with StackOverflow) the link will reflect your current URI scheme, unlike the first link.
See RFC 3986, section 3:
The generic URI syntax consists of a hierarchical sequence of components referred to as the scheme, authority, path, query, and fragment.
   URI       = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
   hier-part = "//" authority path-abempty
             / path-absolute
             / path-rootless
             / path-empty
The scheme and path components are required, though the path may be empty (no characters).

Absolute urls, relative urls, and...?

I am writing some documentation and I have a little vocabulary problem:
http://www.example.com/en/public/img/logo.gif is called an "absolute" url, right?
../../public/img/logo.gif is called a "relative" url, right?
so how do you call this: /en/public/img/logo.gif ?
Is it also considered an "absolute url", although without the protocol and domain parts?
Or is it considered a relative url, but relative to the root of the domain?
I googled a bit and some people categorize this as absolute, and others as relative.
What should I call it? A "semi-absolute url"? Or "semi-relative"? Is there another word?
Here are the URL components:
http://www.example.com/en/public/img/logo.gif
\__/   \_____________/\_____________________/
 #1          #2                 #3
#1: scheme/protocol
#2: host
#3: path
A URL is called an absolute URL if it begins with the scheme and scheme specific part (here // after http:). Anything else is a relative URL.
A URL path is called an absolute URL path if it begins with a /. Any other URL path is called a relative URL path.
Thus:
http://www.example.com/en/public/img/logo.gif is an absolute URL,
../../public/img/logo.gif is a relative URL with a relative URL path and
/en/public/img/logo.gif is a relative URL with an absolute URL path.
Note: The current definition of URI (RFC 3986) is different from the old URL definition (RFC 1738 and RFC 1808).
The three examples with URI terms:
http://www.example.com/en/public/img/logo.gif is a URI,
../../public/img/logo.gif is a relative reference with just a relative path and
/en/public/img/logo.gif is a relative reference with just an absolute path.
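The two independent questions (does the reference have a scheme, and does its path start with a slash) can be answered separately. The describe function here is a hypothetical helper for illustration, built on Python's urlsplit:

```python
from urllib.parse import urlsplit

def describe(reference: str) -> tuple[str, str]:
    """Return (kind of reference, kind of path) for a URL string."""
    parts = urlsplit(reference)
    kind = "absolute URL" if parts.scheme else "relative reference"
    path_kind = "absolute path" if parts.path.startswith("/") else "relative path"
    return kind, path_kind

for example in ("http://www.example.com/en/public/img/logo.gif",
                "../../public/img/logo.gif",
                "/en/public/img/logo.gif"):
    print(example, "->", describe(example))
```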
I have seen it called a root relative URL.
From Microsoft's documentation about Absolute and Relative URLs:
A URL specifies the location of a target stored on a local or networked computer. The target can be a file, directory, HTML page, image, program, and so on.
An absolute URL contains all the information necessary to locate a resource.
A relative URL locates a resource using an absolute URL as a starting point. In effect, the "complete URL" of the target is specified by concatenating the absolute and relative URLs.
An absolute URL uses the following format: scheme://server/path/resource
A relative URL typically consists only of the path, and optionally, the resource, but no scheme or server. The following tables define the individual parts of the complete URL format.
scheme - Specifies how the resource is to be accessed.
server - Specifies the name of the computer where the resource is located.
path - Specifies the sequence of directories leading to the target. If resource is omitted, the target is the last directory in path.
resource - If included, resource is the target, and is typically the name of a file. It may be a simple file, containing a single binary stream of bytes, or a structured document, containing one or more storages and binary streams of bytes.
It is sometimes called a virtual url, for example in SSI:
<!--#include virtual = "/lib/functions.js" -->
Keep in mind just how many segments of the URL can be omitted, making them relative (note: it's all of them, just about). These are all valid URLs:
http://example.com/bar?baz
?qoo=qalue
/bar2
dat/sly
//auth.example.com (most people are surprised by this one! Will use http or https, depending on the current resource)
#anchor
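To make the list above concrete, this sketch resolves each of the shorter forms against the first URL, taken as the current resource. It assumes Python's urllib.parse.urljoin behaves like a browser for these simple cases.

```python
from urllib.parse import urljoin

current = "http://example.com/bar?baz"

# Each reference omits a different set of leading components.
resolved = {ref: urljoin(current, ref)
            for ref in ("?qoo=qalue", "/bar2", "dat/sly",
                        "//auth.example.com", "#anchor")}

for ref, url in resolved.items():
    print(f"{ref:22} -> {url}")
```

Whatever the reference leaves out is filled in from the current resource, from left to right: scheme, then host, then path, then query.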
