http://xyz.com/packagesearch?cadu1=2&drtn1=05/08/2012&qryt=8&sort=10&drid1=1639&dlvl&rdct=1&star=30&subm=1&subm=1&inttkn=Dul0p4RNrlTnd61R&dsct&cmbt=2?dnam&tdpt1=362&ffst=0&rtmx&trtn1=362&tair1=IST&dcty=PAR&mcicid=174390028&rtmn&ddpt1=02/08/2012?stop_mobi=yes
What exactly does the '?' do? Can I use it multiple times, or is '&' the only way to pass multiple parameters once '?' has already been used?
Note: the occurrences in question (the second and third '?' in the URL above) were marked in bold.
The ? character in a URL marks the start of the "request parameters", or "query string". Each additional parameter after the first is separated from the previous one by &. You can develop your own way of handling query strings, but most programming and scripting languages I know of already have built-in ways of dealing with them, so it is generally easier to use the existing tools.
From http://en.wikipedia.org/wiki/Query_string:
When a server receives a request for such a page, it runs a program
(if configured to do so), passing the query_string unchanged to the
program. The question mark is used as a separator and is not part of
the query string.
As a result, ? should only be used once.
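As a rough illustration of using existing tools instead of assembling the string by hand, here is a minimal Python sketch built on the standard library's urllib.parse.urlencode. The parameters are a small subset of the URL in the question, and the variable names are only illustrative.

```python
from urllib.parse import urlencode

# Base address and a few parameters taken from the question's URL.
base = "http://xyz.com/packagesearch"
params = [
    ("cadu1", "2"),
    ("drtn1", "05/08/2012"),
    ("sort", "10"),
    ("stop_mobi", "yes"),
]

# urlencode joins the pairs with "&" and percent-encodes reserved characters.
query = urlencode(params)

# A single "?" separates the path from the query string.
url = base + "?" + query
print(url)
```

Here the only ? is the one added explicitly as the separator; every later parameter, including stop_mobi, is joined with &.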
Given a link such as https://www.example.com/?Test=Im+A+Test&Data=2+Plus+2
what is the last part called, i.e. the "?Test=" and "&Data=" pieces?
Anything after the ? but before the (optional) # is known, collectively, as the "query string".
Within that, there are the individual names and values of the query parameters, in the format name=value.
You can find out more about URLs in many places online.
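To make the terminology concrete, here is a small Python sketch that pulls the query string out of the example link and splits it into name=value pairs. It uses only urllib.parse from the standard library; the variable names are illustrative.

```python
from urllib.parse import urlsplit, parse_qsl

url = "https://www.example.com/?Test=Im+A+Test&Data=2+Plus+2"

# Everything after "?" and before an optional "#" is the query string.
query_string = urlsplit(url).query

# parse_qsl splits on "&" and "=", and decodes "+" as a space.
pairs = parse_qsl(query_string)

print(query_string)
print(pairs)
```

Each tuple in pairs is one query parameter: the first element is its name and the second its decoded value.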
When I submit a form using the GET method, the + character that I typed into the text field is changed to %2B in the URL. Why does the URL do this? Other characters, such as * and %, are changed as well.
I also wonder whether this is done for security or for other reasons, and if so, what those reasons are.
Check out what W3Schools says about URL encoding. I think it will help you out.
http://www.w3schools.com/tags/ref_urlencode.asp
Here is an excerpt:
URLs can only be sent over the Internet using the ASCII character-set.
Since URLs often contain characters outside the ASCII set, the URL has
to be converted into a valid ASCII format.
URL encoding replaces unsafe ASCII characters with a "%" followed by
two hexadecimal digits. URLs cannot contain spaces. URL encoding
normally replaces a space with a plus (+) sign or with %20.
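The following Python sketch shows why a literal + has to become %2B in form data: in a query string, an unencoded + is read back as a space. It uses quote_plus, quote, and unquote_plus from the standard library's urllib.parse; the sample inputs are made up for illustration.

```python
from urllib.parse import quote, quote_plus, unquote_plus

# A literal "+" is percent-encoded so it is not confused with a space.
encoded_plus = quote_plus("1+1")

# Form-style encoding writes a space as "+"; plain quote() uses "%20".
encoded_space_form = quote_plus("a b")
encoded_space_path = quote("a b")

# The percent sign itself must be encoded, since it introduces an escape.
encoded_percent = quote_plus("100%")

# Decoding shows the ambiguity that encoding avoids.
decoded_escaped = unquote_plus("1%2B1")
decoded_raw = unquote_plus("1+1")

print(encoded_plus, encoded_space_form, encoded_space_path, encoded_percent)
print(decoded_escaped, "|", decoded_raw)
```

Decoding "1%2B1" restores the original "1+1", whereas decoding an unescaped "1+1" yields "1 1". So the encoding is about preserving the data unambiguously, not about security.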
Hashtags sometimes combine two or more words, such as:
content marketing => #contentmarketing
Suppose I have a bunch of hashtags assigned to an article, and the words behind each hashtag appear in that article, e.g. content marketing. How can I take a hashtag and detect the word(s) that make it up?
If the hashtag is a single word, it's trivial: simply look for that word in the article. But what if the hashtag is two or more words? I could simply split the hashtag at every possible index and check whether the two words produced are in the article.
So for #contentmarketing, I'd check for the words:
c ontentmarketing
co ntentmarketing
con tentmarketing
...
content marketing <= THIS IS THE ANSWER!
...
However, this fails if there are three or more words in the hashtag, unless I split it recursively, but that seems very inelegant.
Again, this assumes the words in the hashtag are in the article.
You can use a regex with an optional space between each character to do this:
your_article =~ /#{hashtag.chars.to_a.join(' ?')}/
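For comparison, here is a minimal Python sketch of the same idea: build a pattern that allows an optional space between every pair of characters and search the article for it. The function name find_hashtag_phrase is hypothetical, and stripping the leading # and escaping each character are additions that the one-liner above does not make.

```python
import re
from typing import Optional


def find_hashtag_phrase(article: str, hashtag: str) -> Optional[str]:
    """Return the text in the article that spells out the hashtag, if any."""
    letters = hashtag.lstrip("#")
    # Allow an optional space between every pair of adjacent characters.
    pattern = " ?".join(re.escape(ch) for ch in letters)
    match = re.search(pattern, article, flags=re.IGNORECASE)
    return match.group(0) if match else None


phrase = find_hashtag_phrase("We love content marketing here.", "#contentmarketing")
print(phrase)
print(phrase.split() if phrase else None)
```

Splitting the matched text on spaces gives the individual words. Note that the pattern is not anchored to word boundaries, so it can also match inside longer words.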
I can think of two possible solutions depending on the requirements for the hashtags:
1. Assuming hashtags must be made up of words and can't be non-words like "#abfgtest":
Do the test similar to your answer above but only test the first part of the string. If the test fails then add another character and try again until you have a word. Then repeat this process on the remaining string until you have found each word. So using your example it would first test:
- c
- co
- ...
- content <- Found a word, start over with rest
- m
- ma
- ...
- marketing <- Found a word, no more string so exit
2. If you can have garbage, then you will need to do the same thing as option 1, with an additional step. Whenever you reach the end of the string without finding a word, go back to the beginning + 1. Using the #abfgtest example, first you'd run the above function on "abfgtest", then "bfgtest", then "fgtest", etc.
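A sketch of the first option in Python might look like the following. It assumes the "dictionary" is simply the set of words that occur in the article, as the question suggests. It also adds backtracking with memoization, which goes slightly beyond the description above: without it, a short word such as "con" found first would block the longer "content". The function name segment_hashtag is hypothetical, and the garbage-tolerant second option is not covered.

```python
import re
from functools import lru_cache
from typing import List, Optional


def segment_hashtag(hashtag: str, article: str) -> Optional[List[str]]:
    """Split a hashtag into words that all occur in the article, or return None."""
    vocabulary = {w.lower() for w in re.findall(r"[A-Za-z0-9']+", article)}
    text = hashtag.lstrip("#").lower()

    @lru_cache(maxsize=None)
    def split_from(start: int):
        # Reaching the end means everything before it was split successfully.
        if start == len(text):
            return ()
        # Grow the candidate one character at a time, as in option 1.
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if word in vocabulary:
                rest = split_from(end)
                if rest is not None:
                    return (word,) + rest
        # No prefix starting here leads to a full split.
        return None

    result = split_from(0)
    return list(result) if result is not None else None


print(segment_hashtag("#contentmarketing", "Good content marketing takes time."))
```

Because each start position is solved at most once, hashtags with three or more words are handled without trying every combination of split points.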
As already pointed out in the topic, I got the following error:
Character #\u009C cannot be represented in the character set CHARSET:CP1252
This happens when I try to print a string returned by drakma:http-request. As far as I understand the error message, the problem is that the Windows encoding (CP1252) does not support this character.
Therefore, to be able to process the string, I probably need to convert it as a whole.
My question is: which package or library supports converting strings to a given character set efficiently?
A similar question is this one, but just ignoring the error would not help in my case.
Drakma already does the job of "converting strings": after all, when it reads from some random webserver, it just gets a stream of bytes. It then has to convert that to a lisp string. You probably want to bind *drakma-default-external-format* to something else, although I can't remember off-hand what the allowable values are. Maybe something like :utf-8?
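To illustrate the general mechanism (in Python rather than Lisp, and only as one plausible way such a character can appear, not a diagnosis of this particular response): if UTF-8 bytes are decoded as Latin-1, each byte becomes its own character, and some of those land in the U+0080 to U+009F control range, which CP1252 cannot represent. The sample text and the helper can_encode are made up for the example.

```python
def can_encode(text: str, charset: str) -> bool:
    """Return True if every character in text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Bytes as a web server might send them: curly quotes encoded as UTF-8.
raw = "\u201chello\u201d".encode("utf-8")

# Decoding with the wrong external format turns each byte into one character.
wrong = raw.decode("latin-1")
# Decoding with the matching external format restores the original text.
right = raw.decode("utf-8")

print("\x9c" in wrong)              # the stray control character is present
print(can_encode(wrong, "cp1252"))  # printing to a CP1252 console would fail
print(can_encode(right, "cp1252"))  # the correctly decoded text is printable
```

Under that assumption, the fix is to choose the right encoding when decoding the bytes, which matches the suggestion of changing the external format rather than converting the string afterwards.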
I came across the following URL today:
http://www.sfgate.com/cgi-bin/blogs/inmarin/detail??blogid=122&entry_id=64497
Notice the doubled question mark at the beginning of the query string:
??blogid=122&entry_id=64497
My browser didn't seem to have any trouble with it, and running a quick bookmarklet:
javascript:alert(document.location.search);
just gave me the query string shown above.
Is this a valid URL? The reason I'm being so pedantic (assuming that I am) is because I need to parse URLs like this for query parameters, and supporting doubled question marks would require some changes to my code. Obviously if they're in the wild, I'll need to support them; I'm mainly curious if it's my fault for not adhering to URL standards exactly, or if it's in fact a non-standard URL.
Yes, it is valid. Only the first ? in a URL has significance; any after it are treated as literal question marks:
The query component is indicated by
the first question mark ("?")
character and terminated by a number
sign ("#") character or by the end of
the URI.
...
The characters slash ("/") and
question mark ("?") may represent data
within the query component. Beware
that some older, erroneous
implementations may not handle such
data correctly when it is used as the
base URI for relative references
(Section 5.1), apparently because they
fail to distinguish query data from
path data when looking for
hierarchical separators. However, as
query components are often used to
carry identifying information in the
form of "key=value" pairs and one
frequently used value is a reference
to another URI, it is sometimes better
for usability to avoid
percent-encoding those characters.
https://www.rfc-editor.org/rfc/rfc3986#section-3.4
As a tangentially related answer, foo?spam=1?&eggs=3 gives the parameter spam the value 1?
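Both observations can be checked with Python's standard urllib.parse, shown here as a minimal sketch. Other parsers may behave differently, so treat it as an illustration of one implementation rather than of the standard itself.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.sfgate.com/cgi-bin/blogs/inmarin/detail??blogid=122&entry_id=64497"

# Only the first "?" separates the path from the query; the second is data.
parts = urlsplit(url)
print(parts.path)
print(parts.query)

# A naive key=value parse therefore sees the key "?blogid".
doubled = parse_qs(parts.query)
print(doubled)

# The tangential example: the stray "?" ends up in the value of spam.
stray = parse_qs(urlsplit("foo?spam=1?&eggs=3").query)
print(stray)
```

A parser that needs to tolerate such URLs could strip leading ? characters from the query string before splitting it into pairs.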