Our image server allows us to pass url parameters that trigger certain image manipulation operations. For instance the url https://www.example.com/i/example/10570250?layer0=[w=600&h=1000&bg=rgba(228,228,228,125)&cm=multi] scales the image and applies a blending mode.
The problem is that as soon as feed that string into an URL object, for example with URL(string: "https://www.example.com/i/example/10570250?layer0=[w=600&h=1000&bg=rgba(228,228,228,125)&cm=multi]") the brackets get escaped and the image server stops applying the operations.
The url becomes: https://www.example.com/i/example/10570250?layer0=%5Bw=600&h=1000&bg=rgba(228,228,228,125)&cm=multi%5D
Can I prevent URL from escaping the brackets? Or ist there some other way of escaping the url, before feeding it to URL(string:), that the image server might accept?
You´re not allowed to include [ ] in your URL. Consider changing your API call to make this work for you.
A host identified by an Internet Protocol literal address, version 6
[RFC3513] or later, is distinguished by enclosing the IP literal
within square brackets ("[" and "]"). This is the only place where
square bracket characters are allowed in the URI syntax. In
anticipation of future, as-yet-undefined IP literal address formats,
an implementation may use an optional version flag to indicate such a
format explicitly rather than rely on heuristic determination.
Reference.
Related
Colleagues from work have created API endpoint which uses language specific characters in url. This api url looks like
http://somedomain.com/someapi/somemethod/zażółć/gęślą/jaźń
Is this OK or is it a bad approach?
Technically, that's not a valid URL but web browsers and other clients finesse it. The script that characters are from is not an issue but structural characters like "/?#" could be. You'll have to consider what to do when they show up in data that you are "pasting" into your URLs.
An HTTP URL is:
an ASCII-encoded scheme (in this case the protocol "http")
a punycode-encoded, ASCII-encoded domain
a %-encoded, ASCII-encoded, server-defined sequence of octets for the path, optional query, and optional hash.
See RFC 3986
The assumption that everyone makes—quite reasonably because it is the predominant practice—is that the path, query, and hash are text. There is no text but encoded text. So, some character encoding is involved. Where %-encoding is needed outside of structural characters, browsers are going to assume UTF-8. If you don't want browsers to do the %-encoding, use valid URLs by doing it yourself with the character encoding that you are using.
As the world is standardizing on UTF-8 (where applicable), the HTML DOM has also with the encodeURIComponent function. Clients using JavaScript in a web browser are likely to use this function, either directly or through some library.
UTF-8 encoded, %-encoded (and, then on the wire, ASCII-encoded) version of your URL that my browser created:
http://somedomain.com/someapi/somemethod/za%C5%BC%C3%B3%C5%82%C4%87/g%C4%99%C5%9Bl%C4%85/ja%C5%BA%C5%84
(You can see this yourself using your browser's dev tools [F12 key, network tab] or a packet sniffer [e.g., Wireshark or Fiddler]. What you gave as a URL is never seen on the wire.)
Your server application probably understands that just fine. In any case, it is your server's rules that the client complies with. If your API uses UTF-8 encoded, %-encoded URLs then just document that. (But phrase it in a way that doesn't confuse people who do that already without knowing.)
Is there a way to make a GET request in Siesta, while providing parameter, like http://example.com/api/list.json?myparam=1?
I tried with
myAPI.resource("list.json?myparam=1")
but the question mark gets escaped.
Then I tried with
myAPI.resource("list.json").request(.GET, urlEncoded:["myparam": "1"])
but it always fails with "The network connection was lost.", but all other requests succeed, so the message is wrong.
You are looking for withParam:
myAPI.resource("list.json").withParam("myparam", "1")
The Service.resource(_:) method you are trying to use in your first example specifically avoids interpreting special characters as params (or anything except a path). From the docs:
The path parameter is simply appended to baseURL’s path, and is never interpreted as a URL. Strings such as .., //, ?, and https: have no special meaning; they go directly into the resulting resource’s path, with escaping if necessary.
This is a security feature, meant to prevent user-submitted strings from bleeding into other parts of the URL.
The Resource.request(_:urlEncoded:) method in your second example is for passing parameters in a request body (i.e. with a POST or PUT), not for parameters in the query string.
Note that you can always use Service.resource(absoluteURL:) to construct a URL yourself if you want to bypass Siesta’s URL component isolation and escaping features.
Just say I have the following url that has a query string parameter that's an url:
http://www.someSite.com?next=http://www.anotherSite.com?test=1&test=2
Should I url encode the next parameter? If I do, who's responsible for decoding it - the web browser, or my web app?
The reason I ask is I see lots of big sites that do things like the following
http://www.someSite.com?next=http://www.anotherSite.com/another/url
In the above, they don't bother encoding the next parameter because I'm guessing, they know it doesn't have any query string parameters itself. Is this ok to do if my next url doesn't include any query string parameters as well?
RFC 2396 sec. 2.2 says that you should URL-encode those symbols anywhere where they're not used for their explicit meanings; i.e. you should always form targetUrl + '?next=' + urlencode(nextURL).
The web browser does not 'decode' those parameters at all; the browser doesn't know anything about the parameters but just passes along the string. A query string of the form http://www.example.com/path/to/query?param1=value¶m2=value2 is GET-requested by the browser as:
GET /path/to/query?param1=value¶m2=value2 HTTP/1.1
Host: www.example.com
(other headers follow)
On the backend, you'll need to parse the results. I think PHP's $_REQUEST array will have already done this for you; in other languages you'll want to split over the first ? character, then split over the & characters, then split over the first = character, then urldecode both the name and the value.
According to RFC 3986:
The query component is indicated by the first question mark ("?")
character and terminated by a number sign ("#") character or by the
end of the URI.
So the following URI is valid:
http://www.example.com?next=http://www.example.com
The following excerpt from the RFC makes this clear:
... as query components are often used to carry identifying
information in the form of "key=value" pairs and one frequently used
value is a reference to another URI, it is sometimes better for
usability to avoid percent-encoding those characters.
It is worth noting that RFC 3986 makes RFC 2396 obsolete.
I have an orchard site and have the following problem:
If I use the URL: http://asiahotelct.com/tours/ct---chau-%C4%91oc---ha-tien-3n2%C4%91, it's okay. But when I change url the / to %2f (like so: http://asiahotelct.com/tours%2fct---chau-%C4%91oc---ha-tien-3n2%C4%91), it no longer works.
Why can / not be replaced by %2f?
Any url is a kind of complete address to some resource(file) in network. But according to the rules of how it must be actually (to work as you expect), its expected that a few characters must have some specific meaning; just like in this case: "/" means a separator that separates the individual elements of your address(url).
But in case you need such specific characters to be a part of any such element of address(url), we must encode it. List of codes
URL encoding converts characters into a format that can be transmitted
over the Internet.
- w3Schools
So, "/" is actually a seperator, but "%2f" becomes an ordinary character that simply represents "/" character in element of your url.
I am reworking on the URL formats of my project. The basic format of our search URLs is this:-
www.projectname/module/search/<search keyword>/<exam filter>/<subject filter>/... other params ...
On searching with no search keyword and exam filter, the URL will be :-
www.projectname/module/search///<subject filter>/... other params ...
My question is why don't we see such URLs with back to back slashes (3 slashes after www.projectname/module/search)? Please note that I am not using .htaccess rewrite rules in my project anymore. This URL works perfect functionally. So, should I use this format?
For more details on why we chose this format, please check my other question:-
Suggest best URL style
Web servers will typically remove multiple slashes before the application gets to see the request,for a mix of compatibility and security reasons. When serving plain files, it is usual to allow any number of slashes between path segments to behave as one slash.
Blank URL path segments are not invalid in URLs but they are typically avoided because relative URLs with blank segments may parse unexpectedly. For example in /module/search, a link to //subject/param is not relative to the file, but a link to the server subject with path /param.
Whether you can see the multiple-slash sequences from the original URL depends on your server and application framework. In CGI, for example (and other gateway standards based on it), the PATH_INFO variable that is typically used to implement routing will usually omit multiple slashes. But on Apache there is a non-standard environment variable REQUEST_URI which gives the original form of the request without having elided slashes or done any %-unescaping like PATH_INFO does. So if you want to allow empty path segments, you can, but it'll cut down on your deployment options.
There are other strings than the empty string that don't make good path segments either. Using an encoded / (%2F), \ (%5C) or null byte (%00) is blocked by default by many servers. So you can't put any old string in a segment; it'll have to be processed to remove some characters (often ‘slug’-ified to remove all but letters and numbers). Whilst you are doing this you may as well replace the empty string with _.
Probably because it's not clearly defined whether or not the extra / should be ignored or not.
For instance: http://news.bbc.co.uk/sport and http://news.bbc.co.uk//////////sport both display the same page in Firefox and Chrome. The server is treating the two urls as the same thing, whereas your server obviously does not.
I'm not sure whether this behaviour is defined somewhere or not, but it does seem to make sense (at least for the BBC website - if I type an extra /, it does what I meant it to do.)