Strange encoding of non-ASCII character in URL - url

My mother recently received the following scam message on Whatsapp (I have added multiple extra m's to the end of the link below in case of accidental clicks. Obviously the original link itself ended only with .com):
British Airways is giving free 5000 tickets to celebrate its birthday. Get your free ticket at : http://www.briṭishairways.commmmmmmm/
It seems to be a legitimate link to the British Airways URL, especially since Whatsapp doesn't allow link obfuscation (i.e. sending someone to SO by clicking http://www.google.com isn't possible).
However, a careful look will show that the t is in fact a ṭ (Latin T with dot below). Also, if one hovers the mouse over the link in Chrome, the URL that appears at the bottom-left corner of the screen is in fact http://www.xn--briishairways-rt1g.commmmmmmm/. This is also the output from doing a right-click > Copy Link Address. (Try it yourself!)
Also, if I edit the body of the link, the rt1g part changes, as if it's a counter for where to put the dot. For example:
briṭishairways = xn--briishairways-rt1g
briṭishairway = xn--briishairway-vk5f
riṭishairway = xn--riishairway-yb9e
rṭishairay = xn--rishairay-5s6d
What's especially odd is that the Wikipedia link I used for Latin T with dot below also uses the same character (well, the capitalized version of it), and the URL shown on mouse-hover does not have this effect. (Try it yourself!)
What is going on here?
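The `xn--` form shown on hover looks like Punycode, the ASCII encoding that internationalized domain names (IDNA) use for non-ASCII labels: the ASCII characters are kept, and the suffix after the last hyphen encodes which character was removed and where it belongs. That would explain why the suffix changes whenever the label is edited. Only the host name is encoded this way; non-ASCII characters in the path of a URL are percent-encoded instead, which is presumably why the Wikipedia link behaves differently. A minimal sketch using Python's built-in `punycode` codec (the helper name `to_ascii_label` is made up for illustration, and it skips the normalization steps of full IDNA):

```python
def to_ascii_label(label: str) -> str:
    """Encode one domain label roughly the way browsers display IDN labels."""
    if label.isascii():
        return label  # pure-ASCII labels are left untouched
    # Illustrative only: real IDNA also normalizes and validates the label.
    return "xn--" + label.encode("punycode").decode("ascii")


spoofed = "bri\u1e6dishairways"  # U+1E6D is the "t with dot below" character
print(to_ascii_label(spoofed))           # xn--briishairways-rt1g
print(to_ascii_label("britishairways"))  # britishairways

# Decoding goes the other way: strip "xn--" and decode the remainder.
print(b"briishairways-rt1g".decode("punycode") == spoofed)  # True
```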

Related

Do URL specifications say that the #anchor must appear after the ?parameters

I've noticed that
https://someurl.com/page.html#anchor?utm_campaign=xyz
does not correctly find the anchor in the page, but
https://someurl.com/page.html?utm_campaign=xyz#anchor
does work.
I've looked through https://datatracker.ietf.org/doc/html/rfc3986 and as far as I can see, there's nothing specific about where the #anchor needs to appear, only that the # symbol denotes the anchor in the same way that ? denotes the start of URL params.
So, where would I find in the specifications that it must appear at the end? Or are my browsers secretly non-conformant? I'm asking so that I can provide evidence to a customer one way or another.
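For what it's worth, RFC 3986 section 3 gives the generic syntax as `scheme ":" hier-part [ "?" query ] [ "#" fragment ]`, so the fragment is everything after the first `#`, and a `?` appearing after that point is simply part of the fragment. A small sketch with Python's standard `urllib.parse` shows how a parser following that grammar splits the two URLs from the question:

```python
from urllib.parse import urlsplit

wrong = urlsplit("https://someurl.com/page.html#anchor?utm_campaign=xyz")
right = urlsplit("https://someurl.com/page.html?utm_campaign=xyz#anchor")

# "#" comes first here, so everything after it is treated as the fragment.
print(repr(wrong.query), repr(wrong.fragment))
# '' 'anchor?utm_campaign=xyz'

# Query first, fragment last: both parts are recognized.
print(repr(right.query), repr(right.fragment))
# 'utm_campaign=xyz' 'anchor'
```

In the first form the browser would be looking for an element whose id is literally `anchor?utm_campaign=xyz`, which matches the behaviour described above.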

UITextView, with spell checking, how to use `ignoreWord`?

Regarding the spell checking in iOS, it's possible to tell the checker to ignore a word (or learn a word),
https://developer.apple.com/documentation/uikit/uitextchecker
func ignoreWord(String)
Tells the receiver to ignore the specified word when spell-checking.
- Apple doco
Say I have a UITextView which opens. I want spell checking On.
I know the user may type "fattie" which would get the red underline.
How do I tell that text view in that instance, to, ignore "fattie" ?
An obvious use case ...
User is typing in "#tag" type friends; in our data of course we know what all the tags are, it's absurd they get marked as spelling errors.
It seems incredible one can't just say "don't underline these words - - list".
Code example ....
So we have
var t: UITextView
and then, there must be "some way" to:
yourTextView.something->something.textChecker.ignoreWord("fattie")
.. some way to get to the text view's textChecker instance! How ?!?!
Partial answer: I just stumbled onto the fact that, bizarrely, you can just call
UITextChecker.learnWord("fattie")
UITextChecker.learnWord("blahdee")
from, apparently, just anywhere in an app.
However this raises many issues,
• How to call the 'ignore' one, which seems better
• That one still makes the user tap the annoying, stupid, "in quotes" OK box in the suggestions bar - it seems to have not really "learned" anything
• Disturbingly, I think this goes for the "WHOLE PHONE". I only want it in that instance of the user using that text view.
A mystery!

Automatically decode URLs in Notepad++

I am working with a lot of URL links which I need to decode.
I want to write a macro (or use any other method, really, whatever is easiest) attached to a keyboard shortcut which will automatically decode the urls into readable text.
For example, I want to press CTRL+A and have the result be that all %20 instances are replaced with a space, all & are replaced with an ENTER (\n, going down one row), all %27 replaced with ', etc.
Is there a way to accomplish this in Notepad++?
So far I have been manually changing at least 2-3 codes each time, but it's maddening as I have hundreds of such URLs to work with.
The URLs are sent to me automatically one at a time by a "report broken link" function and arrive as an openURL, example attached below.
Thank you!
URL examples:
ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&url_ver=Z39.88-2004&rfr_id=info%3Asid%2FElsevier%3ASD&svc_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Asch_svc&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.aulast=GILABERTE&rft.auinit=Y&rft.date=2014&rft.issn=00017310&rft.volume=105&rft.issue=3&rft.spage=253&rft.epage=262&rft.title=Actas%20Dermo-Sifiliogr%C3%A1ficas&rft.atitle=Realidades%20y%20retos%20de%20la%20fotoprotecci%C3%B3n%20en%20la%20infancia&rft_id=info%3Adoi%2F10.1016%2Fj.ad.2013.05.004
And this is what it looks like after changing & and %20:
ctx_ver=Z39.88-2004
ctx_enc=info%3Aofi%2Fenc%3AUTF-8
url_ver=Z39.88-2004
rfr_id=info%3Asid%2FElsevier%3ASD
svc_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Asch_svc
rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal
rft.aulast=GILABERTE
rft.auinit=Y
rft.date=2014
rft.issn=00017310
rft.volume=105
rft.issue=3
rft.spage=253
rft.epage=262
rft.title=Actas Dermo-Sifiliogr%C3%A1ficas
rft.atitle=Realidades y retos de la fotoprotecci%C3%B3n en la infancia
rft_id=info%3Adoi%2F10.1016%2Fj.ad.2013.05.004
As you can see this is much more readable for extracting details, but there are still many more codes in there that need changing.
Let's say you have a single URL, or several, one per line in a text file. You'd go through the following steps:
Ctrl+A
Plugins > MIME Tools > URL Decode
Ctrl+H
Find what: &
Replace with: \n
Search mode: Regular expression
Click on Replace All
If you really need a shortcut like e.g. Ctrl+Alt+A you may use:
Macro > Start Recording the steps from above and Save Currently Recorded Macro when finished.
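If a script outside Notepad++ is an option, the same two steps (split on `&`, then percent-decode) can be sketched in Python with the standard `urllib.parse.unquote`. The function name `decode_openurl` and the shortened sample string are made up for illustration:

```python
from urllib.parse import unquote


def decode_openurl(raw: str) -> str:
    """Put each key=value pair on its own line and percent-decode it."""
    # Split on "&" before decoding, so an encoded "%26" inside a value
    # is not mistaken for a separator.
    pairs = raw.strip().split("&")
    return "\n".join(unquote(pair) for pair in pairs)


sample = (
    "rft.issn=00017310&rft.title=Actas%20Dermo-Sifiliogr%C3%A1ficas"
    "&rft_id=info%3Adoi%2F10.1016%2Fj.ad.2013.05.004"
)
print(decode_openurl(sample))
# rft.issn=00017310
# rft.title=Actas Dermo-Sifiliográficas
# rft_id=info:doi/10.1016/j.ad.2013.05.004
```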

How to make web site iPad ready? [duplicate]

How does the Reader function of Mobile Safari in iOS 5 work? How do I enable it on my site? How do I tell it what content on my page is an article to trigger this function?
A lot of the answers posted here contain false information. Here are some corrections/clarifications:
The <article> element works fine as a wrapper; Safari Reader recognizes it. My site is an example. It doesn’t matter which wrapper element you choose, as long as there is one, other than <body> or <p>. You can use <article>, <div>, <section>; or elements that are semantically incorrect for this purpose, like <nav>, <aside>, <footer>, <header>; or even inline elements like <span> (!).
No headings are required for Reader to work. Here’s an example of a document without any <h*> elements on which Reader works fine: http://mathiasbynens.be/demo/safari-reader-test-3
I posted some more details regarding my findings here: http://mathiasbynens.be/notes/safari-reader
I've tested 100 or so variations of this on my iPhone in order to figure out what triggers this elusive Reader state. My conclusions are as follows:
Here is what I found had an impact:
Having around 200 or more words (or 1000 characters including whitespace) in the article you want to trigger the "Reader" seems necessary
The reader was NEVER triggered when I had less than 170 words; although it was sometimes triggered when I had 180 or 190 words.
Text inside certain elements such as <ol> or <ul> (that are not typically used to contain a story) will not count towards the 200 words (they will however be displayed in the reader if the reader is triggered for other reasons)
Wrapping the 200 words in a block element such as a <div> or <article> seems necessary (that said, I'd be surprised if there were any websites where that was not already the case)
For full disclosure, here is what I found did NOT have an impact:
Whether using a header or not
Whether wrapping the text in a <p> or letting it flow freely
Punctuation (i.e. removing all periods, commas, etc. did not have an impact)
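The findings above can be turned into a rough pre-flight check. This is only a sketch of the empirical rule of thumb reported here (about 200 words, with text inside `<ul>`/`<ol>` not counted), not Safari's actual algorithm; the names `WordCounter`, `countable_words` and `might_trigger_reader` are invented for illustration:

```python
from html.parser import HTMLParser

WORD_THRESHOLD = 200         # approximate figure from the tests above, not an Apple-documented value
IGNORED_TAGS = {"ul", "ol"}  # list text reportedly does not count towards the total


class WordCounter(HTMLParser):
    """Counts words in text nodes, skipping anything inside <ul> or <ol>."""

    def __init__(self):
        super().__init__()
        self.ignored_depth = 0
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag in IGNORED_TAGS:
            self.ignored_depth += 1

    def handle_endtag(self, tag):
        if tag in IGNORED_TAGS and self.ignored_depth > 0:
            self.ignored_depth -= 1

    def handle_data(self, data):
        if self.ignored_depth == 0:
            self.words += len(data.split())


def countable_words(html: str) -> int:
    parser = WordCounter()
    parser.feed(html)
    parser.close()
    return parser.words


def might_trigger_reader(html: str) -> bool:
    # Heuristic guess only: passing this check does not guarantee Reader appears.
    return countable_words(html) >= WORD_THRESHOLD


page = "<div><p>one two three</p><ul><li>four five</li></ul></div>"
print(countable_words(page))       # 3
print(might_trigger_reader(page))  # False
```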
It seems the algorithm it is based on is looking for p-Tags and it counts delimiters like "." in the innerText. The section (div) with the most points gets the focus.
see:
http://lab.arc90.com/experiments/readability/
Seems to be the base for the Reader-mode, at least Safari attributes it in the Acknowledgements, see:
file:///C:/Program%20Files/Safari/Safari.resources/Help/Acknowledgments.html
Arc90 ( Readability )
Copyright © Arc90 Inc.
Readability is licensed under the Apache License, Version 2.0.
This question (How to disable Safari Reader in a web page) has more details. Copied here:
I'm curious to know more about what triggers the Reader option in Safari and what does not. I wouldn't plan to implement anything that would disable it, but curious as a technical exercise.
Here is what I've learned so far with some basic playing around:
You need at least one H tag
It does not go by character count alone but by the number of P tags and length
Probably looks for sentence breaks '.' and other criteria
Safari will provide the 'Reader' if, with a H tag, and the following:
1 P tag, 2417 chars
4 P tags, 1527 chars
5 P tags, 1150 chars
6 P tags, 862 chars
If you subtract 1 character from any of the above, the 'Reader' option is not available.
I should note that the character count of the H tag plays a part but sadly did not realize this when I determined the results above. Assume 20+ characters for H tag and fixed throughout the results above.
Some other interesting things:
Setting display to none for P tags removes them from the count
Setting display to none, and then showing them 230ms later with Javascript avoided the Reader option too
I'd be interested if anyone can determine this in full.
Both Firefox and Chrome have a similar plugin named iReader. Here is its project with source code.
http://code.google.com/p/ireader-extension/
Read the code to get more.
I was struggling with this. I finally took out the <ul> markings in my story, and voilà! It started working.
I didn't put any wrapper around the body, but may have done it by accident.
HTML5 article tag doesn't trigger it on my tests. It also doesn't seem to work on offline content (i.e. pages saved on your local machine).
What does seem to trigger it is a div block with a lot of p's with a lot of text.
The p tag theory sounds good. I think it also detects other elements as well. One of our pages with 6 paragraphs didn't trigger the Reader, but one with 4 paragraphs and an img tag did.
It's also smart enough to detect multi-page articles. Try it out on a multi-page article on nytimes.com or nymag.com. Would be interested to know how it detects that as well.
Surprising though it may be, it indeed does not pay any attention to the HTML5 article tag, particularly disappointing given that Safari 5 has complete support for article, section, nav, etc in CSS--they can be styled just like a div now, and behave the same as any block level element.
I had specifically set up a site with an article tag and several inner section tags, in prep for semantic HTML5 labeling for exactly such a purpose, so I was really hoping that Safari 5 would use that for Reader. No such luck--probably should file a bug on this, as it would make a great deal of sense. It in fact completely ignores most of the h2 level subheads on the page, each marked as a section, only displaying the single div that adheres to the criteria mentioned previously.
Ironically, the old version of the same site, which has neither article, section, nor separating div tags, recognizes the whole body for display in Reader.
See Article Publishing Guidelines.
Here are APIs for reading and parsing: Readability Developer APIs. There's already a project you can refer to: ruby-readability.
A brief history:
The Safari Reader feature, available since Apple's Safari 5 browser, embedded a codebase named Readability. Readability started off as a simple, JavaScript-based reading tool that turned any web page into a customizable reading view. It was released in early 2009 by Arc90, a New York City-based design and technology shop, as an Arc90 Lab experiment. It's also embedded in Amazon Kindle and popular iPad applications like Flipboard and Reeder.
I am working on algorithms for cleaning web sites of information "waste", similar to the Safari Reader feature. It's not as good as Readability but has some cool stuff.
You can learn more at smartbrowser.codeplex.com project page.

special character coming when i am using & and p

I don't know exactly what to put in the title for this; I tried my best.
Anyway, coming to the topic:
I am making an account checker. For that purpose, I am sending the user name and password from my bsskinedit1 and bsskinedit2.
Here is my code:
s:='http:\\site.com..premlogin='+bsskinedit.text+'&password='+bsskinedit2.text
But it gives an error. I then used ShowMessage to see what was wrong with it, and got a strange result.
See below:
Observe that after 4, the & and p combine and appear as some new symbol :(
Can anyone tell me why it comes out like this?
Your code (where you build the URL) is most likely correct (I guess the above has some typos?!), but when you display the URL in a label for instance, the & character is treated as indicator for an accelerator key.
By Windows design, accelerator keys enable the user to access a menu command from the keyboard by pressing Alt along with the appropriate letter, indicated in your code by the preceding ampersand (&). The character after the ampersand (&) appears underlined in the menu.
If you want to display the & character itself, you have to set your string variable to &&.
By placing two ampersands together, you state that the character following the first one is not used as the accelerator; rather, you actually want it (only one) displayed.
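The doubling rule is just a string replacement applied to the display copy of the text, never to the URL that is actually sent. A minimal sketch of the idea, written in Python for brevity (the same replacement can be done with Delphi's own string routines); the function name `escape_for_label` and the sample URL are hypothetical:

```python
def escape_for_label(text: str) -> str:
    """Double every ampersand so a label shows it literally."""
    return text.replace("&", "&&")


url = "http://site.com/?user=me&password=secret"  # made-up example URL
print(escape_for_label(url))
# http://site.com/?user=me&&password=secret
```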
Just use your debugger if you want to see the real value that your string variables have, don't output them to a message box or the like... It may have side effects, as you can see.
Regarding the URL you build: I can't possibly know what the correct form is, but at least you should use the right slashes!
s := 'http://site.com...'
(All quotes from delphi.about.com)
In addition to what Mef said, you can use OutputDebugString to add your string to the event log in its raw form, so you don't need to modify it before displaying it. Delphi should capture those strings automatically if you're running from the debugger. If you aren't running it from Delphi you can use DebugView instead, which captures the messages from any running applications.
