I am unlucky enough to be in charge of maintaining an old Yahoo! Store built on their RTML-based platform.
Recently I've noticed that HTML code generated by some RTML functions is sprinkled all over with "padding images" (or whatever is the conventional name for those 1x1 pixel images used to enforce layout). I have nothing against using such images, but... all those images are supplied with an ALT attribute like this:
<img src="http://.../image1x1.gif" alt="pad">
With all due respect to the original authors of RTML, they must have been smoking something when they came up with this "accessibility enhancement"... :-(
Anyway, here are my questions:
Does anybody know a list of all RTML functions that generate HTML with all these "pad" images?
Is there any way to get rid of all those alt="pad" attributes without rewriting a lot of RTML code?
NB: This may sound a little cynical, but improved accessibility is not the main goal here. The main goal is to stop exposing those moronic alt="pad" attributes to Google and other smart search engines. So client-side scripting is not going to help, as far as I know.
Thank you!
P.S. Most of you are probably lucky enough never to have heard of RTML. If somebody established a prize for software products based on the
commercial success
------------------
usability
ratio, this RTML-based "platform" would probably win first place.
P.P.S. Apparently someone from Yahoo! finally listened, because I can no longer find those silly "pad" tags in the HTML generated for our store. Nevertheless, one of the ideas offered in response to my original question does provide a very practical solution, not just to the original problem but to any similar problem with the RTML platform. See the winning answer - it's really good.
The only way I see is to put your own website front-end in front of the RTML site and have it filter out whatever you want.
For example, if your RTML site is at http://rtmlusglysite.yahoo.com/store/XYZ01134 , you could host a simple PHP front-end at http://www.example.com that acts as a "filtering" HTTP proxy, so that http://rtmlusglysite.yahoo.com/store/XYZ01134/item1234.rtml would be accessed as http://www.example.com/item1234.html
It's not an ideal solution, but it should work, and it would let you do some fancier things as well.
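The answer suggests PHP for the front-end; the rewriting step itself is small in any language. Here is a minimal sketch of it in JavaScript, assuming the URL mapping from the example above. The helper names toUpstreamUrl and stripPadAlt are made up for illustration, and fetching and serving the page is left out.

```javascript
// Sketch of the rewriting a filtering front-end would perform.
// UPSTREAM_BASE is the example store URL from the answer above;
// toUpstreamUrl and stripPadAlt are hypothetical helper names.
const UPSTREAM_BASE = 'http://rtmlusglysite.yahoo.com/store/XYZ01134';

// Map a public path such as /item1234.html to the upstream .rtml page.
function toUpstreamUrl(publicPath) {
  return UPSTREAM_BASE + publicPath.replace(/\.html$/, '.rtml');
}

// Remove every alt="pad" (or alt='pad') attribute, together with the
// whitespace in front of it, from a string of HTML.
function stripPadAlt(html) {
  return html.replace(/\s+alt=(["'])pad\1/gi, '');
}
```

The front-end would fetch toUpstreamUrl(path), pass the response body through stripPadAlt, and return the result to the visitor (or the search engine crawler).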
Nice try from the other posters, but there is a very simple RTML command that will do it. . .
TEXT PAT-SUBST s GRAB
MULTI
HEAD
BODY
TEXT #var-with-alt-tag-equals-pad-in-it
frompat "alt=\"pad\""
topat ""
The above RTML will find all instances of alt="pad" and replace it with nothing.
Well you're right on RTML being relatively untraveled :)
Do you have a way to add your own attributes to these image tags? If so, would it be possible to override the alt attribute? If you specify alt="", I would think that would override Yahoo's... Otherwise consider putting a useful alt text in there for the blind and dialup types.
It's the first time I'm hearing about this platform, but here is an idea: if you can add javascript to the pages, you could write a function that will run after the page has loaded and remove all the alt="pad" attributes from the page.
Unfortunately this solution works only in browsers that support scripting, so lynx and other text-based browsers would still see the attributes.
I have shared a link to the official RTML guide from Yahoo. Hope it helps. Thanks!
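A minimal sketch of that idea, assuming you can add a script to the page. The function name removePadAlts is made up; it uses only the standard querySelectorAll and removeAttribute DOM calls.

```javascript
// removePadAlts is a hypothetical helper; root is any object that offers
// querySelectorAll, normally the page's document.
function removePadAlts(root) {
  const images = Array.from(root.querySelectorAll('img[alt="pad"]'));
  images.forEach((img) => img.removeAttribute('alt'));
  return images.length; // number of images that were changed
}

// In a browser, run once the page has finished loading.
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  window.addEventListener('load', () => removePadAlts(document));
}
```

As the question points out, this changes only what a scripting browser shows; the HTML served to crawlers still contains alt="pad".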
List of available RTML books and resources
I have been using Platypus & Reportlab for several weeks using Python, and would caveat this by saying that I am definitely a beginner, and my code isn't "good" code, but ...
I tend to work by looking at an example, testing it, and then adapting it to my needs...
With this approach, I managed to get a table of contents to work.
I also wanted "Page x of y" numbering. Again I found code and, after a lot of hassle, got it to work together with my Table of Contents, at which point I thought I understood the library better, but ...
Before that, I had seen links working, and not working, independently of the ToC.
However, when I merged my samples for the ToC and Page x of y, I got a wonderful ToC with links for each topic I wanted, but the links all go to the top of the document.
I have looked at other examples I have tried, and find some where links using <a href="#MYANCHOR"... and <a name="MYANCHOR"... have the same issue.
I have also added into my main code, a link using the <a href=... but using one of the link destinations that a ToC would use - and this again jumps to the top of the document.
I put all the elements that form the document into a list called e.g. element so I would have code such as element.append(PageBreak()) and then I can print out all the element list to see what is there, and compare it to examples where it doesn't work, and I can see no significant difference.
If I provide an external link to a website (e.g. that excellent stackoverflow.com) those links work, but internal ones don't - which I accept are handled differently, but I hope it indicates where my failures lie!
I would love to understand why the links are so fickle, as I would like to get links to work in a table, and from a drawing, which to my mind should be possible, ... which may just highlight my ignorance - for which I apologise...
Any help would be really appreciated...
Many thanks,
How could I access a website and turn components of it into strings? For example, taking information from Facebook posts. I have done a little searching but can't find any good tutorials or anything useful.
Try looking at this tutorial. It should get you more familiar on the subject and start you off on the right track.
As it states at the beginning of the tutorial...
How to Parse HTML on iOS
Let’s say you want to find some information inside a web page and
display it in a custom way in your app. This technique is called
“scraping.” Let’s also assume you’ve thought through alternatives to
scraping web pages from inside your app, and are pretty sure that’s
what you want to do. Well then you get to the question – how can you
programmatically dig through the HTML and find the part you’re looking
for, in the most robust way possible? Believe it or not, regular
expressions won’t cut it! Well, in this tutorial you’ll find out how!
You’ll get hands-on experience with parsing HTML into an Objective-C
data model that your apps can use.
http://www.raywenderlich.com/14172/how-to-parse-html-on-ios
I wonder how Google manages to open external links in a new window/tab without defining target="_blank".
For example, in Google Plus all external links open in a new window.
I think it's some JavaScript voodoo, but the .js code is obfuscated, so I can't really look into it.
edit: oh and followup question: why?
Using a framework makes this easy. Just have JavaScript look for links marked rel="external", or another identifier that shows them to be an external link, and dynamically add target="_blank". Here's an example using Prototype:
$$('a[rel="external"]').each(function(a) {
  a.setAttribute('target', '_blank');
});
It's not beyond reasoning for them to add the target attribute by javascript before allowing the anchor link event to return true.
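A minimal framework-free sketch of that approach. This is not a claim about what Google actually does; the names isExternalLink and openExternalInNewTab are made up, and "external" is assumed to mean an http(s) link whose host differs from the page's host.

```javascript
// isExternalLink and openExternalInNewTab are hypothetical names.
// A link counts as external when it is http(s) and its host differs
// from the host of the page it appears on.
function isExternalLink(href, pageUrl) {
  const target = new URL(href, pageUrl); // resolves relative links too
  return /^https?:$/.test(target.protocol) && target.host !== new URL(pageUrl).host;
}

// Set the target attribute just before the browser follows the link.
function openExternalInNewTab(event) {
  const link = event.target.closest('a[href]');
  if (link && isExternalLink(link.getAttribute('href'), window.location.href)) {
    link.setAttribute('target', '_blank');
  }
}

if (typeof document !== 'undefined') {
  document.addEventListener('click', openExternalInNewTab);
}
```

Because the handler is attached to the document, it also covers links added to the page after it has loaded.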
It's Javascript. You can say:
window.open('http://example.org', '_blank').focus();
But please, don't. Opening links in new windows is almost always the wrong thing to do. Seriously, good uses of this are vanishingly few. If users want a link opened in a new window, they are quite capable of doing that themselves.
Jakob Nielsen was telling people this twelve years ago. Others have taken up the cudgels. The W3C left the target attribute out of HTML 4 Strict because it was such a bad idea. I honestly don't understand how this usage persists. Don't you find it incredibly annoying when a website does this to you? Why would you want to write a website which does this to someone else?
Which brings me to your followup question. Why did Google decide to do this? I have no answer to that, and i am completely and utterly baffled how one of the very biggest, brightest, web companies could make such an elementary mistake. But then, a lot of the Google Plus interface has very poor usability (as in, mostly worse than Facebook poor); i suspect there is an interesting story behind it. Was the project under-resourced, and thus built cheaply on top of a rapid development framework such as GWT? Was it built as a spare time project by a lone wolf with a blind spot for web architecture? Was it driven by strategy wonks who didn't care about getting the technology right? Mystery.
I have the following problem: I have a lot of papers in PDF format, and I have to extract information from the first page of each one and save it into a database.
I just need to extract the title, the abstract, keywords, authors list, universities list, and emails. I want to write a script that gets a string for each of those fields, for each paper.
How can I do that? Has anyone already done this? What languages and tools would you recommend?
Also, is there a paper repository that already does this kind of database feeding?
The PDFs may use different encodings, so I have to deal with that problem too. Any help with this would be great.
An example of a paper is here
Greetings!
http://pdfbox.apache.org/
You have to check about the security of the pdf, that it's really text and not an image. Check the command line application of pdfbox if it works extracting the text, then you can use the jar and use http://pdfbox.apache.org/apidocs/org/apache/pdfbox/examples/util/ExtractTextByArea.html
Hope it helps....
By the way it's java...
edit.
I have not used this as a jar library http://www.qoppa.com/pdftext/, but I used the example application and it works, but I decided to go with pdfbox...
You need an API to read your PDF.
Seems fine (I have never tried it, though)
You can probably find others with this link :-)
I have a website for exchanging links, files... to put it briefly, it's my 'version' of twitter+megaupload.
Users add links all the time and so on, but I would like a user to be able to sync the bookmarks from his browser with the ones he has in his profile on my website.
Where should I look?
Basically I need to be able to:
- Access the bookmarks file (1)
- Send the URLs to my service (2)
- Maybe add a login feature (in the future)
I was googling this for ages a few weeks ago and kind of gave up, because I'm OK with PHP and JS, but I'm very lost with these plugin languages. So I decided to post here, which always brings positive answers.
(1) -> I don't even know where to start
(2) -> I was thinking of having a website.com/auto_import_no_confirm.php?url=[URL] and calling it in a for each.
How many different languages and file types do I have to work with? I really need any kind of tip on point (1).
feel like?
-edit-
Just found This -> https://developer.mozilla.org/En/Code_snippets/Bookmarks
which really looks like what I need, but where do I place this code?
thanks!
Might not be a bad question, but there are too many subtopics raised to answer that. (And there is too much tagspam as well. Break up your question into PHP- and Javascript-specific tasks, when you have devised the general application scheme.)
But to get started, download similar Firefox extensions (.xpi) and unzip them to inspect the general structure. You'll find example code for bookmark handling and invoking remote APIs pretty quickly. And basically you only need JavaScript for the extension itself. (It sounds like your extension does not need much UI.)
And there are many tutorials on designing Firefox addons: http://roachfiend.com/archives/2004/12/08/how-to-create-firefox-extensions/ or http://www.google.com/search?q=firefox+develop+an+xpi
The good news first: you won't need much more than JavaScript if you just want to access bookmarks and send them to a server, on either Firefox or Chrome.
But you'll still have to familiarize yourself with the browsers' APIs and learn how to develop extensions.
Both Mozilla and Google provide all the necessary information on their developer sites.
For Chrome, this is a good place to start; you'll find the API for bookmark access here.
The corresponding site for Firefox can be found here, with information on bookmark access here.
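To give an idea of how little JavaScript is involved, here is a rough sketch for a Chrome extension, assuming the "bookmarks" permission and the import endpoint from the question. The http:// scheme and the helper names collectUrls and buildImportUrls are assumptions made for illustration; chrome.bookmarks.getTree is the Chrome call that returns the bookmark tree.

```javascript
// IMPORT_ENDPOINT uses the path from the question; the http:// scheme is an assumption.
const IMPORT_ENDPOINT = 'http://website.com/auto_import_no_confirm.php';

// Walk a bookmark tree (nodes with an optional url and optional children)
// and collect every bookmark URL. Folders have children but no url.
function collectUrls(nodes) {
  const urls = [];
  for (const node of nodes) {
    if (node.url) urls.push(node.url);
    if (node.children) urls.push(...collectUrls(node.children));
  }
  return urls;
}

// Build one import request URL per bookmark, escaping the bookmark URL
// so that its own query string survives as a single parameter.
function buildImportUrls(bookmarkUrls) {
  return bookmarkUrls.map((u) => IMPORT_ENDPOINT + '?url=' + encodeURIComponent(u));
}

// Inside a Chrome extension: read the whole tree and call the endpoint
// once per bookmark. Skipped when the chrome API is not available.
if (typeof chrome !== 'undefined' && chrome.bookmarks) {
  chrome.bookmarks.getTree((tree) => {
    buildImportUrls(collectUrls(tree)).forEach((requestUrl) => fetch(requestUrl));
  });
}
```

One request per bookmark mirrors the "for each" idea from the question; posting the whole list in a single request would be kinder to the server, but needs a different endpoint.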