Maximum length of URL fragments (hash)

Is there a length limit for the fragment part of a URL (also known as the hash)?

The hash is client-side only, so the rules for HTTP may not apply to it.

It depends on the browser.
I found that in Safari, Chrome, and Firefox, a URL with a very long hash is legal, but if the same data is sent to the server as part of the request, the server responds with a 414 or 413 error.
For example:
A URL like http://www.stackoverflow.com/?abc#{hash value with 100 thousand characters} will be fine, and you can read the hash value with location.hash in JavaScript. But a URL like http://www.stackoverflow.com/?abc&{query with 100 thousand characters} will not: if you paste that link into the address bar, you get a 413 error with the message "the client issued a request that was too long", and if you follow it as a link in a web page, Nginx (on my machine) responds with a 414 error.
I don't know the situation in IE.
So I think the length limits on URLs exist for transmission and for the HTTP server; the browser checks them sometimes, but not every time, and an arbitrarily long hash is always allowed.
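The reason is that the fragment never appears in the HTTP request at all, which you can confirm outside the browser. A minimal Ruby sketch (example.com and the 100,000-character fragment are stand-ins, not from the original answer):

require 'uri'
require 'net/http'

# Build a URL with a very long fragment.
url = URI("http://example.com/page?abc#" + "x" * 100_000)

puts url.fragment.length   # => 100000, fully available client-side
puts url.request_uri       # => "/page?abc" -- the fragment is omitted

# The request that actually goes over the wire contains only "/page?abc",
# so no server-side 413/414 limit can ever be triggered by the fragment.
Net::HTTP.get_response(url)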

There is definitely a limit on the length of the whole URL.
See:
RFC 2616 - Hypertext Transfer Protocol
Maximum URL length is 2,083 characters in Internet Explorer

Related

HAProxy URL length limit

I have an application that makes GET requests with URLs 18k characters long. If such a request goes through HAProxy, I immediately get a 400. If I hit my service directly, everything is fine. Is there a parameter in HAProxy that sets the maximum length of the request URL?
Thanks in advance
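If I remember the HAProxy documentation correctly, HAProxy reads the request line and headers into a single buffer whose size is controlled by the global tune.bufsize setting (16,384 bytes by default) and answers 400 when they do not fit, which would explain an 18k-character URL failing. A minimal sketch of the global section, under that assumption:

global
    # Assumption: raising the buffer above the 16 kB default lets the
    # 18k-character request line through; check tune.maxrewrite headroom
    # in the docs before using this in production.
    tune.bufsize 65536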

Request URI too long via POST request

I'm sending a POST request via Net::HTTP like this:
require 'net/http'

http = Net::HTTP.new(mixpanel_endpoint.host, mixpanel_endpoint.port)
request = Net::HTTP::Post.new(mixpanel_endpoint.request_uri)
http.request(request)
The issue is that the request_uri is over the max length limit. It's a Base64-encoded string.
Does anybody know what to do about this?
<Net::HTTPRequestURITooLong 414 Request URI Too Long readbody=true>
Net::HTTPRequestURITooLong is a 414 HTTP code from the server; you will need to change the request to conform to what the endpoint allows.
10.4.15 414 Request-URI Too Long
The server is refusing to service the request because the Request-URI
is longer than the server is willing to interpret. This rare condition
is only likely to occur when a client has improperly converted a POST
request to a GET request with long query information, when the client
has descended into a URI "black hole" of redirection (e.g., a
redirected URI prefix that points to a suffix of itself), or when the
server is under attack by a client attempting to exploit security
holes present in some servers using fixed-length buffers for reading
or manipulating the Request-URI.
reference: https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
Are you adding the data directly to the URL?
Try splitting the data out of the endpoint URL and sending it in the request body instead (note that the second argument of Net::HTTP::Post.new is a header hash, not a body, so the body has to be set separately). For example:
request = Net::HTTP::Post.new(request_endpoint)
request.set_form_data("whatever_param_value" => base64_encoded_data)
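Putting the pieces together, a minimal runnable sketch under the assumption of a generic HTTPS endpoint (the URL and the data parameter name are placeholders, not Mixpanel's actual API):

require 'net/http'
require 'uri'

mixpanel_endpoint = URI("https://example.com/track")  # placeholder endpoint
base64_encoded_data = "..."                           # your Base64 payload

http = Net::HTTP.new(mixpanel_endpoint.host, mixpanel_endpoint.port)
http.use_ssl = (mixpanel_endpoint.scheme == "https")

# Keep the request URI short; carry the large payload in the POST body.
request = Net::HTTP::Post.new(mixpanel_endpoint.request_uri)
request.set_form_data("data" => base64_encoded_data)

response = http.request(request)
puts response.code  # no more 414, since the request line stays small

Unlike the request line, the POST body has no comparable length limit, so the 414 disappears.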

400 error code when URL contains % symbol? (NGINX)

How do I prevent the server from returning a 400 error code when the URL contains a % symbol, using Nginx?
Nginx configuration for my website:
....
rewrite ^/download/(.+)$ /download.php?id=$1 last;
....
When I try to access this URL:
http://mywebsite.net/download/some-string-100%-for-example
I get this error:
400 Bad Request
With this URL:
http://mywebsite.net/download/some-string-%25-for-example
it works fine!
That's because the URL needs to be URL-encoded first.
This will explain:
http://www.w3schools.com/tags/ref_urlencode.asp
URLs can only be sent over the Internet using the ASCII character-set.
Since URLs often contain characters outside the ASCII set, the URL has to be converted into a valid ASCII format.
URL encoding replaces unsafe ASCII characters with a "%" followed by two hexadecimal digits.
URLs cannot contain spaces. URL encoding normally replaces a space with a plus (+) sign or with %20.
The URL parser is confused by a % that is not followed by two hexadecimal digits.
Why would you try to solve this by changing the Nginx configuration?
It can't be solved from the server side; it's a problem on the client side.
https://headteacherofgreenfield.wordpress.com/2016/03/23/100-celebrations/
In that URL, the title is 100% Celebrations! but the permalink is autogenerated as 100-celebrations. That's because they know that putting 100% in the path would cause a URL-encoding problem.
If even WordPress doesn't do it your way, why should you?
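The client-side fix is to percent-encode each path segment before it becomes part of a URL. A minimal Ruby sketch (the path is the hypothetical one from the question; ERB::Util.url_encode is used because CGI.escape would turn spaces into +, which is wrong inside a path):

require 'erb'

title = "some-string-100%-for-example"

# Percent-encode the path segment: "%" becomes "%25", spaces "%20", etc.
encoded = ERB::Util.url_encode(title)
puts "http://mywebsite.net/download/#{encoded}"
# => http://mywebsite.net/download/some-string-100%25-for-example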

Google docs API: can't download a file, downloading documents works

I'm experimenting with HTTP requests to download a PDF file from Google Docs using the Google Documents List API and OAuth 1.0. I'm not using any external library for OAuth or Google Docs.
Following the documentation, I obtained the download URL for the PDF, which works fine when pasted into a browser.
According to documentation I should send a request that looks like this:
GET https://doc-04-20-docs.googleusercontent.com/docs/secure/m7an0emtau/WJm12345/YzI2Y2ExYWVm?h=16655626&e=download&gd=true
However, the download URL has something funny going on with the parameters; it looks like this:
https://doc-00-00-docs.googleusercontent.com/docs/securesc/5ud8e...tMzQ?h=15287211447292764666&amp\;e=download&amp\;gd=true
(in the URL, '&amp\;' actually appears without the '\'; I added the '\' in this post to keep it from being rendered as '&').
So what is the case here: do I have 3 parameters (h, e, gd), or do I have one parameter h with the value 15287211447292764666&amp\;e=download&amp\;gd=true, or maybe I have the following 3 parameter-value pairs: h = 15287211447292764666, amp;e = download, amp;gd = true (which I think is the case, and it seems like a bug)?
In order to form a proper HTTP request I need to know exactly what the parameter names and values are, but the download URL I have is confusing. Moreover, if the parameter names are h, amp;e, and amp;gd, is a request containing those parameters valid for obtaining the file content (if not, it seems like a bug)?
I didn't have problems downloading and uploading documents (MS Word docs), and my scope for downloading a file is correct.
I experimented with different requests a lot. When I treat the 3 parameters (h, e, gd) separately, I get 401 Unauthorized. If I assume I have only one parameter, h, with the value 15287211447292764666&amp\;e=download&amp\;gd=true, I get 500 Internal Server Error (the Google API states: 'An unexpected error has occurred in the API.', 'If the problem persists, please post in the forum.').
If I don't pass any parameters at all, or I pass the 3 parameters h, amp;e, and amp;gd, I get 302 Found. I tried following the redirects with further requests, but I still couldn't get the actual PDF content. I also experimented in the OAuth Playground, and it doesn't seem to work as it's supposed to either: sending a GET request there with the download URL responds with 302 Found instead of the PDF content.
What is going on here? How can I obtain the PDF content in a response? Please help.
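One thing worth checking (my assumption, not something the question confirms): &amp; is simply how a literal & is escaped in HTML/XML, so if the download URL was extracted from an API response or page markup, decoding the entities first would leave the ordinary three parameters h, e, and gd. In Ruby that is one stdlib call (the URL below is abbreviated):

require 'cgi'

raw = "https://doc-00-00-docs.googleusercontent.com/docs/securesc/...?h=15287211447292764666&amp;e=download&amp;gd=true"
url = CGI.unescapeHTML(raw)  # turns "&amp;" back into "&"
puts url
# => ...?h=15287211447292764666&e=download&gd=true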
I ran into the same issue with OAuth 2.0 (error 401).
I solved it by putting the OAuth 2.0 token in the request header instead of in the URL.
I replaced &access_token=<token> in the URL with setRequestHeader("Authorization", "Bearer <token>").
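The same fix with Ruby's Net::HTTP, to match the other examples in this thread (the host, path, and token are placeholders):

require 'net/http'
require 'uri'

uri = URI("https://doc-00-00-docs.googleusercontent.com/docs/...")  # download URL without the token
token = "ya29.placeholder"  # hypothetical OAuth 2.0 access token

request = Net::HTTP::Get.new(uri.request_uri)
# Send the token in the Authorization header, not the query string.
request["Authorization"] = "Bearer #{token}"

response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
  http.request(request)
end
puts response.code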

What URL encoding irregularities do browsers generally tolerate?

Web browsers tend to do their best to recover malformed URLs.
Let's start with a baseline google query.
http://www.google.com/search?q=myquery
Which results in my browser (a recent-ish build of Chrome) requesting:
GET http://www.google.com/search?q=myquery HTTP/1.1
Fully expected behavior obviously.
Let's try putting an unescaped space into the mix.
http://www.google.com/search?q=my query
GET http://www.google.com/search?q=my%20query HTTP/1.1
What if we use the % character? Because it's not followed by two valid hexadecimal digits, the browser should escape it to %25:
http://www.google.com/search?q=i always give 100%
GET http://www.google.com/search?q=i%20always%20give%20100% HTTP/1.1
Chrome didn't escape the %!
Is space substitution the only URL transformation an average browser will perform or is expected to perform? Are there libraries for performing these kinds of URL "salvaging" transformations?
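For the salvaging itself, a minimal Ruby sketch of the two transformations observed above; real browsers implement the much larger WHATWG URL parsing algorithm, so treat this as an approximation (and note it is stricter than the Chrome capture above, which left the trailing % alone):

# Approximate browser-style URL salvaging: escape raw spaces, and escape
# any "%" that is not already introducing a valid two-hex-digit escape.
def salvage(url)
  url.gsub(' ', '%20')
     .gsub(/%(?![0-9A-Fa-f]{2})/, '%25')
end

puts salvage('http://www.google.com/search?q=i always give 100%')
# => http://www.google.com/search?q=i%20always%20give%20100%25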
