I've been having some character encoding problems with Twitter's text query string parameter.
a) http://www.twitter.com/share?url=http://www.example.com&text=touché
b) http://twitter.com/share?url=http://www.example.com&text=touché
a) seems to apply an extra round of encoding, and the tweet comes out wrong.
b) (note the lack of www) works fine.
Both redirect to:
http://twitter.com/intent/tweet?text=touch%C3%A9&url=http%3A%2F%2Fwww.example.com
Is there any point in using http://twitter.com/share rather than simply http://twitter.com/intent?
There is more information about the issue here and here. Use web intents without the www.
Twitter was double-encoding characters in certain situations. Adding www to the share URL triggered one of those bugs, and it also happened with some of the other features.
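To make "double encoding" concrete, here is a minimal Python sketch of the effect. It only illustrates what a second round of percent-encoding does to an already-encoded value; it is not Twitter's actual code.

```python
from urllib.parse import quote, unquote

text = "touché"

# Correct: percent-encode the UTF-8 bytes exactly once.
encoded_once = quote(text, safe="")

# The bug: encoding an already-encoded value turns every "%" into "%25".
encoded_twice = quote(encoded_once, safe="")

print(encoded_once)            # touch%C3%A9
print(encoded_twice)           # touch%25C3%25A9

# A single decode on the receiving side then leaves the escapes visible.
print(unquote(encoded_twice))  # touch%C3%A9
```

The last line is the kind of text that ends up in the tweet box when the extra encoding round happens.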
After researching on Google and SO, there seem to be conflicting opinions on this.
We have run into a problem with Google Chrome replacing the | separator with %7C, whereas Firefox and Safari do not.
Here's an example:
http://www.example.com/page1|sub-page2|sub-page-3
Are there any strict rules to follow when choosing a separator character for semantic URLs and are there any strong arguments against (or workarounds when) using |?
| is not a valid character in a URL. Modern browsers will silently encode it to %7C when sending, and may or may not display this change in the address bar. Similarly, servers will silently decode the character for you.
This would have been a problem in the last millennium, when browsers would crash just because you didn't specify http://, but today you can use whatever you want and the browser will take care of it. However, automatic parsers such as Markdown may not accept something like http://example.com/test|fish as a valid URL. In this case, it looks like it does, but try that on my forums and it will complain at you.
Internet Explorer and Chrome use URL encoding when displaying the URL in the address bar after a page request has been made; %7C is the safe way of displaying a pipe ('|'), so it's not a problem that Chrome is doing this.
As a cheeky fix to make all browsers behave the same way, why not use %7C as your separator from the get-go instead of a pipe? All browsers should then interpret it as a pipe behind the scenes, but display it as %7C in the address bar.
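As a small illustration of the encoding the answers above describe, this Python sketch percent-encodes the example path and decodes it again. It uses only the standard library's urllib.parse; the path is the one from the question.

```python
from urllib.parse import quote, unquote

path = "/page1|sub-page2|sub-page-3"

# quote() leaves "/" alone by default and percent-encodes "|".
encoded = quote(path)
print(encoded)  # /page1%7Csub-page2%7Csub-page-3

# Decoding restores the original, so both spellings name the same path.
decoded = unquote(encoded)
print(decoded == path)  # True
```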
Google stopped crawling my webpage because my robots.txt file was inadvertently moved. It said I should make sure the file is there by going to the address http://www.site.com//robots.txt. It had two slashes, just like that. But it still works. It also works with three. What's up with that? Even if I can sort of see why the extra slash could be ignored (I'm not specifying any directory between the two), why would it be preferable to display a URL like this, as the Google Webmasters' page does?
Most (all?) servers seem to allow several slashes directly after the hostname (not in other positions, though), see for example:
http://www.google.com//////////robots.txt
https://stackoverflow.com/////robots.txt
http://en.wikipedia.org////////////////////////robots.txt
(Related question: How to avoid multiple slashes after domain name in url using htaccess?)
However, when Google Webmaster Tools displays the URL with two slashes, you probably have set your domain in the GWT preferences with a trailing slash (http://example.com/ instead of http://example.com). See this question for Google Analytics (I guess it should be similar for GWT).
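If you would rather normalize such URLs yourself before comparing or logging them, a minimal Python sketch could look like the following. The helper name collapse_slashes is hypothetical, and the sketch assumes that your server treats repeated slashes in the path as equivalent to one, which is common but not guaranteed.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def collapse_slashes(url: str) -> str:
    """Collapse runs of "/" in the path component only (hypothetical helper)."""
    parts = urlsplit(url)
    # The "//" after the scheme is not part of the path, so it is untouched.
    path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit(parts._replace(path=path))


print(collapse_slashes("http://www.site.com//robots.txt"))
# http://www.site.com/robots.txt
```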
I have gone through https://dev.twitter.com/docs/streaming-apis/parameters
Per the documentation, it should be able to track URLs such as example.com/foobarbaz, but I can't seem to get it to track such URLs. It just doesn't return any results when I tweet this URL and track it using the Streaming API. Am I missing something?
Pretty late, but I found this via Google, so this might help someone...
There are a few answers to this. The main answer being that Twitter treats URLs differently than anything else.
First, make sure you do NOT include the "www".
Twitter currently canonicalizes the domain “www.example.com” to “example.com” before the match is performed, so omit the “www” from URL track terms.
For me, sending the track parameter as "example.com/foobarz" and then tweeting "a test, please ignore: http://example.com/foobarz" worked perfectly.
You can NOT, in general, ask for substrings of URLs:
URLs are considered words for the purposes of matches which means that the entire domain and path must be included in the track query for a Tweet containing an URL to match.
But if you are willing to take every tweet from the whole domain (plus a few more edge cases), Twitter will accommodate:
Finally, to address a common use case where you may want to track all mentions of a particular domain name (i.e., regardless of subdomain or path), you should use “example com” as the track parameter for “example.com” (notice the lack of period between “example” and “com” in the track parameter).
All quotes are from the Twitter docs: https://dev.twitter.com/streaming/overview/request-parameters#track
They have more information, including examples.
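The rules quoted above can be sketched as two small Python helpers that turn a URL or domain into a track term. The function names are hypothetical and nothing here calls the Streaming API; the sketch only builds the strings you would pass as the track parameter. Stripping "www." in the domain-wide helper is my assumption, based on the canonicalization quote.

```python
from urllib.parse import urlsplit


def url_track_term(url: str) -> str:
    """Track term for one exact URL: host without "www." plus the full path."""
    # Allow input with or without a scheme.
    parts = urlsplit(url if "://" in url else "//" + url)
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return host + parts.path


def domain_track_term(domain: str) -> str:
    """Track term for all mentions of a domain: dots replaced by spaces."""
    host = domain.lower()
    if host.startswith("www."):  # assumption: drop "www." here as well
        host = host[len("www."):]
    return host.replace(".", " ")


print(url_track_term("http://www.example.com/foobarz"))  # example.com/foobarz
print(domain_track_term("example.com"))                  # example com
```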
Good luck!
I'm going to allow users to set an image with a link on my site, e.g. a profile picture and a profile link.
I will not let them upload the image, but will let them give a URL that I will insert into an img src.
I want to do basic checks for the best-known XSS patterns someone may use my site for, but the thing is, I have no list of samples to check that my functions work. As it is, even if I write a fully RFC-compliant parser to check every aspect of the URL, I will still not know what I should guard against.
I would do the following
Check that the URIs start with http:// or https://
URI encode the URI before printing it on the img src.
The first check makes sure that no javascript:, vbscript:, etc. URLs are allowed. The second escapes any characters that can cause damage (such as ", ', <, and >).
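The two checks could be sketched in Python roughly as follows. The function name safe_img_tag and the ALLOWED_SCHEMES set are hypothetical, and note one deliberate swap: the sketch HTML-escapes the value for the attribute context with html.escape rather than percent-encoding it, since the characters listed above are the ones that break out of an attribute. Treat it as a starting point, not a complete XSS defense.

```python
from html import escape
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumption: nothing else is needed


def safe_img_tag(user_url: str) -> str:
    """Return an <img> tag for user_url, or raise ValueError if it is rejected."""
    url = user_url.strip()
    parts = urlsplit(url)
    # Check 1: only absolute http(s) URLs, so javascript:, vbscript:, data:
    # and relative URLs are refused.
    if parts.scheme not in ALLOWED_SCHEMES or not parts.netloc:
        raise ValueError("only absolute http:// and https:// URLs are accepted")
    # Check 2: escape for an HTML attribute so quotes and angle brackets
    # cannot terminate the attribute or the tag.
    return '<img src="{}">'.format(escape(url, quote=True))


print(safe_img_tag("https://example.com/me.png"))
# <img src="https://example.com/me.png">
```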
Still a good resource for patterns, though a bit dated: http://ha.ckers.org/xss.html
Another great resource: http://html5sec.org
Scan it using the OWASP Zed Attack Proxy (ZAP).
ZAP is a free, open source security tool, and it's very good at finding XSS vulnerabilities.
Simon (ZAP Project Lead)
I'm using Twitter's custom share button, 'Build your own tweet button' (https://dev.twitter.com/docs/tweet-button).
The documentation says that I have to use query parameters to pass the values.
PROBLEM: Twitter encodes the text param incorrectly when I pass it a URL-encoded string. A comma (,) is displayed as %2C in the tweet message (it becomes %252C in the redirected URL). Other characters are also encoded incorrectly.
I use PHP's urlencode (http://php.net/manual/en/function.urlencode.php) to prepare the string for the call.
$text = urlencode("I just backed ". $project->getTitle().", an amazing new mobile app, on appbackr, where anyone can back mobile apps");
Then I build the twitter link:
'http://www.twitter.com/share?url='.urlencode($projectUrl).'&via='.$via.'&text='.$text.'&related='.$user->getTwitterProfileName()
The final twitter url call is:
http://www.twitter.com/share?url=http%3A%2F%2Flocalhost%2Fapp%2Fbig-top-ballet&via=appbackr&text=I+just+backed+Big+Top+Ballet%2C+an+amazing+new+mobile+app%2C+on+appbackr%2C+where+anyone+can+back+mobile+apps&related=philippberner
As soon as the page opens in the browser (Chrome and Firefox) twitter redirects the URL to:
https://twitter.com/intent/tweet?related=philippberner&text=I+just+backed+Big+Top+Ballet%252C+an+amazing+new+mobile+app%252C+on+appbackr%252C+where+anyone+can+back+mobile+apps&url=http%253A%252F%252Flocalhost%252Fapp%252Fbig-top-ballet&via=appbackr
This displays the following message in the tweet box:
I just backed Big Top Ballet%2C an amazing new mobile app%2C on appbackr%2C where anyone can back mobile apps via #appbackr
It converts Top+Ballet%2C+an+amazing to Top+Ballet%252C+an+amazing. The comma is displayed properly when I manually change %252C to %2C in the twitter URL.
Actually, this has little to do with URL encoding.
It works with urlencode, with rawurlencode, or even without any URL encoding.
Try opening the following URLs in a new tab.
With urlEncode: http://twitter.com/share?url=http%3A%2F%2Fwww.appbackr.com%2Fapp%2Fglass-ceiling&via=appbackr&text=I+just+backed+Glass+Ceiling%2C+an+amazing+new+mobile+app%2C+on+appbackr%2C+where+anyone+can+back+mobile+apps&related=
With rawurlEncode: http://twitter.com/share?url=http%3A%2F%2Fwww.appbackr.com%2Fapp%2Fglass-ceiling&via=appbackr&text=I%20just%20backed%20Glass%20Ceiling%2C%20an%20amazing%20new%20mobile%20app%2C%20on%20appbackr%2C%20where%20anyone%20can%20back%20mobile%20apps&related=
Without urlEncode: http://twitter.com/share?url=http%3A%2F%2Fwww.appbackr.com%2Fapp%2Fglass-ceiling&via=appbackr&text=I just backed Glass Ceiling, an amazing new mobile app, on appbackr, where anyone can back mobile apps&related=
The trick lies in using twitter.com instead of www.twitter.com. I'm not sure why there is a difference; it does not seem to be documented anywhere in Twitter's documentation or in Google's search results. To be fair, though, Twitter's documentation does point to twitter.com and not www.twitter.com.
It is still best practice to URL-encode the text, even though it works without encoding.
Use:
$text = rawurlencode("I just backed ". $project->getTitle().", an amazing new mobile app, on appbackr, where anyone can back mobile apps");
Then, use this URL:
'https://twitter.com/intent/tweet?url='.urlencode($projectUrl).'&via='.$via.'&text='.$text.'&related='.$user->getTwitterProfileName()
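For comparison, here is the same link construction as a minimal Python sketch. The function name tweet_intent_url is hypothetical. Unlike the PHP snippet above, it encodes every parameter exactly once (including via and related), using %20 for spaces in the style of rawurlencode.

```python
from urllib.parse import quote, urlencode


def tweet_intent_url(text: str, url: str, via: str = "", related: str = "") -> str:
    """Build a twitter.com (no www) intent link with single-pass encoding."""
    params = {"url": url, "via": via, "text": text, "related": related}
    # Leave out parameters that were not supplied.
    params = {key: value for key, value in params.items() if value}
    # quote_via=quote gives %20 for spaces instead of "+".
    return "https://twitter.com/intent/tweet?" + urlencode(params, quote_via=quote)


link = tweet_intent_url(
    "Big Top Ballet, an app",
    "http://localhost/app/big-top-ballet",
    via="appbackr",
)
print(link)
```

Pass the raw, unencoded strings to the helper; encoding them beforehand would reintroduce the %252C problem.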