How can I use the at symbol (@) in a URL? - url

For example, I came across this type of URL: http://username:token@example.com/protected/files.
I searched the web for this but did not find what I expected.

Wikipedia explains the syntax of a URI quite well:
scheme:[//[user[:password]@]host[:port]][/path][?query][#fragment]

Your username:token@example.com/protected/files may look like a URL, but in fact it is not, because it does not include the protocol used to access the data. It is a URI.
Browsers (I suppose you mean web browsers) work with URLs, which are a subtype of URI that includes the protocol. Please note that web browsers don't work with all existing protocols, only with some of them (http, https, ftp, file...), and username is not a protocol.
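To make the syntax above concrete, here is a minimal sketch using Python's standard urllib.parse module (the choice of Python is an assumption; the question does not name a language). It shows how the part before the @ is read as user information when the scheme is present, and how the same string without http:// ends up with "username" in the scheme position.

```python
from urllib.parse import urlsplit

# With a scheme and "//", the text before "@" is parsed as user info.
full = urlsplit("http://username:token@example.com/protected/files")
print(full.scheme)    # http
print(full.username)  # username
print(full.password)  # token
print(full.hostname)  # example.com
print(full.path)      # /protected/files

# Without "http://", everything before the first ":" is read as the scheme,
# and there is no authority part because the string has no "//".
bare = urlsplit("username:token@example.com/protected/files")
print(bare.scheme)    # username
print(repr(bare.netloc))
```

Note that many browsers and HTTP clients restrict or warn about credentials embedded in URLs, so this form is best treated as something to parse rather than something to publish.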

Related

Is a protocol (eg. http or https) required for a URL to be valid?

Recently I came across a lot of code from analytics plugins where the URL is specified as //fonts.googleapis.com or //www.google.com.
Basically, it starts with two forward slashes followed by the domain or subdomain. These links work fine in browsers. I have read the following documents, but I am still not sure whether the above can be called valid URLs (basically, whether they should be reported as broken URLs or not).
https://developer.mozilla.org/en-US/docs/Web/API/URL and
https://url.spec.whatwg.org/
Is there a standard specification that I can refer to?
They're both valid scheme-relative-URL strings, although they need to be in the context of a Base URL to be meaningful. When used within a web page, the web page will provide the Base URL context.
Although there are other, earlier standards for URLs, the whatwg document represents the most up-to-date, web compatible definition.
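For a link checker, one way to avoid flagging these strings as broken is to recognise them explicitly. The following is only a sketch in Python; the function name is_scheme_relative is hypothetical, and urllib.parse follows the older RFC-style rules rather than the WHATWG algorithm, so edge cases may differ from browser behaviour.

```python
from urllib.parse import urlsplit

def is_scheme_relative(url: str) -> bool:
    """Return True for strings like //host/path: an authority but no scheme."""
    parts = urlsplit(url)
    return url.startswith("//") and parts.scheme == "" and parts.netloc != ""

print(is_scheme_relative("//fonts.googleapis.com"))        # True
print(is_scheme_relative("//www.google.com"))              # True
print(is_scheme_relative("https://url.spec.whatwg.org/"))  # False
print(is_scheme_relative("/local/path"))                   # False
```

A string that passes this check still needs a base URL before it can be fetched, which is why the checker would have to know the page it was found on.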

How can http:// be changed to abc:// and can wireshark capture this traffic?

I've seen this before, but never knew how it is accomplished.
What is the http for? Does it direct my request? Is this related to MIME Types? How is it like saying ftp:// ?
http://, ftp://, file://, etc. are some of the many URI schemes.
You're not mentioning any specific application, so it's hard to answer your questions. Basically, the URI scheme tells the application that handles the URI what the URI is for and which protocol should be used.
For example, web browsers support many protocols, including HTTP, FTP, direct local file access, etc. You can tell your browser to open file:///path/to/local/file.html and it'll access the file from disk. You can also tell it to open ftp://server/path/to/file.html and it'll load the file from an FTP server.
You are allowed to use any scheme you like in your application. For example, a lot of mobile applications handle their own URI schemes, like fb:// for Facebook or instagram:// for Instagram.
Wireshark can capture any network traffic regardless of the URI scheme used. It works at a low network layer and can capture even 'raw' Wi-Fi or Ethernet traffic (that's a huge simplification; please refer to the course mentioned in my profile bio).
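As a rough illustration of how an application might use the scheme to decide what to do, here is a small Python sketch. The HANDLERS table and the describe function are hypothetical names made up for this example; a real application would call actual protocol code instead of returning a description.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from URI scheme to what the application would do.
HANDLERS = {
    "http": "fetch over HTTP",
    "https": "fetch over HTTP with TLS",
    "ftp": "fetch from an FTP server",
    "file": "read from the local disk",
}

def describe(uri: str) -> str:
    """Look at the scheme only and report which handler would be chosen."""
    scheme = urlsplit(uri).scheme
    return HANDLERS.get(scheme, f"no handler registered for '{scheme}'")

print(describe("ftp://server/path/to/file.html"))   # fetch from an FTP server
print(describe("file:///path/to/local/file.html"))  # read from the local disk
print(describe("abc://example.com/x"))              # no handler registered for 'abc'
```

Changing http:// to abc:// therefore only works if some application has registered a handler for abc; the scheme itself does not change what Wireshark is able to see on the wire.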

How to implement Schema.org on HTTPS pages?

Is it correct to statically set up Microdata’s itemtype attribute with HTTP value (http://schema.org/WebPage) on HTTPS pages or do I need to use HTTPS value (https://schema.org/WebPage) on all pages?
Since both HTTP and HTTPS versions of the site are available, can I set it up to //schema.org/WebPage or not?
tl;dr: Use http URIs.
In this answer on Webmasters SE I explained why you should favor http over https Schema.org URIs: The http URIs seem to be canonical, as the actual definition of the Schema.org vocabulary only defines http, not https. In addition: all examples (even on HTTPS) use the HTTP variant, the authors mentioned that they prefer to see the use of the HTTP variant, and RDFa’s Initial Context defines the HTTP variant only (so most of the RDF world will use HTTP).
In this answer on Webmasters SE I explained why you should not use protocol-relative URIs for vocabularies: Vocabulary URIs typically don’t get dereferenced, and nothing will ever be embedded from a vocabulary, so there is absolutely no need to use HTTPS for these just because your page uses HTTPS (it’s similar to simply linking to an external page, which might not even be accessible via HTTPS). On top of that, your Schema.org markup would no longer work if the document is accessed via a protocol other than HTTP/HTTPS, and it’s likely that some parsers won’t be able to recognize that you are using the Schema.org vocabulary because they might look for full URIs without applying URI resolution for the itemtype attribute.
There's been an update to that answer on Webmasters SE (dated November 2015), with a link to the schema.org FAQ about https:
Q: Should we write https://schema.org or http://schema.org in our markup?
The short of it is that schema.org will be moving to https, and you can use https URLs now, but there's no rush to switch.
Regarding protocol-relative URLs… please don't use them as they're a hack. Favor use of absolute or root-relative URLs whenever hyperlinking documents on the Web.
Is it correct to statically set up Microdata’s itemtype attribute with HTTP value [...]?
Either HTTP or HTTPS is fine in your itemtype according to the Schema.org FAQ. Your examples containing HTTP and HTTPS schemes are both correct for pages served with and without TLS.
If you've got a mix of absolute URLs pointing to different schemes, it's more likely that a person will notice it and wonder why things aren't consistent. So when you update, refactor your existing itemtypes to use one scheme consistently.
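The concern about parsers that compare full URIs without resolving them can be sketched as follows. This is a hypothetical consumer written in Python, not the behaviour of any particular Microdata parser: it accepts both absolute forms mentioned in the FAQ and, because it only does string matching, fails to recognise the protocol-relative form.

```python
# Assumed set of accepted prefixes; both schemes are treated as equivalent.
SCHEMA_PREFIXES = ("http://schema.org/", "https://schema.org/")

def schema_org_type(itemtype: str):
    """Return the type name (e.g. 'WebPage'), or None if not recognised."""
    for prefix in SCHEMA_PREFIXES:
        if itemtype.startswith(prefix):
            return itemtype[len(prefix):]
    return None

print(schema_org_type("http://schema.org/WebPage"))   # WebPage
print(schema_org_type("https://schema.org/WebPage"))  # WebPage
print(schema_org_type("//schema.org/WebPage"))        # None
```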

Are there any URL shortening services that support protocol relative urls?

I need to shorten a protocol relative url such as //www.example.com/longurl into another protocol relative url //short/url so an iframe can be embedded on a page independent of the used protocol. Are there any free URL shortening services that allow this? I've tried isgd, tinyurl, googl, bitly, owly among others, but they all either don't accept the url or prepend it with http:// so it turns into http:////www.example.com/longurl.
It's undocumented, but it turns out that goo.gl supports this; you just have to modify the URL it gives you, although it seems the URLs I tried were longer than what they accept.

Why can protocol be omitted from absolute paths on a webpage?

I recently ran across a website that had some interesting styling on a select element. I went to investigate and found this (names changed to protect the innocent):
<script type="text/javascript" src="//www.domain.tld/file.js"></script>
It works despite HTTP: being omitted. What is the purpose of leaving off the protocol?
It will use the protocol you're already using. Useful for sites with both https and http versions.
So if you're on https://www.domain.tld/ the script will be https://www.domain.tld/file.js.
If you're on http://www.domain.tld/ the script will be http://www.domain.tld/file.js.
I believe this is shorthand for a path that is relative to the protocol, so it should use the same protocol as is being used for that session. E.g. if you grabbed that page with http, then this URL is resolved using the http protocol.
The purpose is that the scheme (i.e. http or https) can be determined relative to the containing page. This is useful if you have a common piece of code included in multiple pages that can be served via http or https.
The purpose is to "use the same protocol as in the current URL" -- presumably (?) useful if the page can be reached both as http: and https: (I have a hard time thinking of other protocols yet that it might be useful for, and even this one is not a clear-cut use case).
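The resolution described in these answers can be reproduced outside a browser. Here is a minimal Python sketch using urllib.parse.urljoin; the helper name script_url is hypothetical, and the hosts are the placeholder names from the question.

```python
from urllib.parse import urljoin

def script_url(page: str, src: str = "//www.domain.tld/file.js") -> str:
    """Resolve a protocol-relative src against the URL of the containing page."""
    return urljoin(page, src)

print(script_url("http://www.domain.tld/"))   # http://www.domain.tld/file.js
print(script_url("https://www.domain.tld/"))  # https://www.domain.tld/file.js
```

The host and path come from the src attribute, and only the scheme is taken from the page.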

Resources