I want to read the content (HTML) of a certain URL. I send a GET request, receive the HTML, and do what I want with it; no problem there.
The problem arises when a site loads more content into itself as you scroll down (as Facebook does), so what I receive is only the initial content, without the parts that have yet to be loaded. Any idea how to retrieve the next parts automatically, so I can read more and more of such a site's content?
I saw something about PhantomJS, but I'm developing a desktop application (using Delphi), so I can't use it. Thanks.
Related
Many iOS apps contain a FAQ or "How to use" page that, when loaded, displays some FAQs in text format. The content is loaded from a remote server rather than built in, so it can be updated at any time.
What's the best way to implement this type of page? My app has no server-side functionality apart from this page, so I am really looking for a cheap (or free) way to do it instead of renting a server just for this. Also, my FAQs will be static HTML.
Thank you for any suggestions.
The simplest way to display an HTML page is to use a UIWebView as the main view of your view controller and set its URL or file-path using the loadRequest() method of your view.
You can use this method both for remote FAQs (in which case you'll set a URL) and for local, static ones, for which you'll use the path to a file included in your project.
I have a web application in RoR which calculates some energy values and investment costs. I use AJAX to send the data from the web browser to the server. It goes something like this: Browser-Server-Browser-Server-Browser
This web application is already integrated in TYPO3, and I want to implement a PDF button that sends the results by email (in other words, a snapshot of the page with the results).
I have heard an option would be to generate some links in RoR to be used in typo3 (when clicking on it, it would open exactly the web-application with the results already calculated). But as a newbie, I do not really know which would be the best approach.
Any recommendation?
A screenshot of the page can be done client-side:
http://html2canvas.hertzen.com/
You could even have a second page with the same results that you use only for rendering a clean screenshot (you might not want the footer, menu and other elements on that page, only the results).
Once you have your screenshot, you can upload it to your server where you can use it to create a PDF of that image and then send it with any mail API you prefer to use.
Info about TYPO3's mail API can be found here:
https://docs.typo3.org/typo3cms/CoreApiReference/ApiOverview/Mail/Index.html
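As a rough, language-neutral illustration of the last step (mailing the generated PDF), here is a minimal Python sketch. It is not the TYPO3 mail API and not RoR code; the function name `build_results_email`, the subject line and the addresses are assumptions made up for the example, and the PDF bytes are assumed to have been produced already from the uploaded screenshot.

```python
from email.message import EmailMessage


def build_results_email(pdf_bytes: bytes, sender: str, recipient: str,
                        filename: str = "results.pdf") -> EmailMessage:
    """Build (but do not send) an email carrying the results PDF as an attachment.

    Hypothetical helper: pdf_bytes is assumed to be a PDF that was already
    generated on the server from the uploaded screenshot.
    """
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Your calculation results"  # assumed subject line
    message.set_content("The calculated results are attached as a PDF.")
    # Attach the PDF bytes with the proper MIME type.
    message.add_attachment(pdf_bytes, maintype="application",
                           subtype="pdf", filename=filename)
    return message
```

The resulting message could then be handed to whatever mail transport the server uses; in Python that would typically be `smtplib.SMTP(...).send_message(message)`.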
Could anyone please tell me how the site http://www.outsharked.com/imagemapster/default.aspx?what.html works the way it does, modifying the URL without loading or reloading the page? I don't think this is done with HTML5, because it works in IE6, which doesn't support HTML5.
I created that site. The commenter is correct, it uses Javascript to change the URL. There's nothing about how that navigation works that is different for IE6 - that browser supports the necessary client-side functionality to do this kind of thing. The basic functionality involves:
capturing click events on the nav and loading the inner content via AJAX
updating the URL to reflect a working direct URL to the target.
The links are also valid anchor links that, in the absence of Javascript, would go to the same page (but load the whole thing). This is your basic AJAX web site setup with one minor difference. It's common practice to use URLs like this in AJAX/single-page web sites:
http://mysite.com/home#somepage
or even just
http://mysite.com/#somepage
Where the hash (fragment) part represents the actual page a user has navigated to. If someone accessed that URL directly, e.g. from outside the site, the site would use Javascript to load the correct content based on the hash after the page had loaded. This means there might be a short delay before the inner content reflects the correct page, since the browser has to make another request via AJAX after the initial page has loaded.
I was trying to avoid that by creating a setup that worked completely with and without Javascript. If you go directly to a URL within the site such as http://www.outsharked.com/imagemapster/default.aspx?faq.html you will notice it loads the content directly. This URL will work even if Javascript is disabled. You can't actually do this using hashes, since the part of a URL after the hash is not sent to the server; only the client knows what it is. That's why I was using query strings to represent inner pages.
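The difference between the two URL styles can be seen by splitting them into their components. The sketch below is only an illustration in Python, not code from the site; `request_target` is a hypothetical helper that rebuilds the part of a URL a browser would put on the HTTP request line.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path-and-query that is actually sent to the server."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # parts.fragment (everything after '#') is deliberately left out:
    # browsers keep it on the client and never send it.
    return target


hash_url = "http://mysite.com/home#somepage"
query_url = "http://www.outsharked.com/imagemapster/default.aspx?faq.html"

print(request_target(hash_url))     # /home
print(request_target(query_url))    # /imagemapster/default.aspx?faq.html
print(urlsplit(hash_url).fragment)  # somepage (known to the client only)
```

With the hash style the server sees the same request for every inner page, so it cannot choose the content itself; with the query-string style the inner page name arrives with the request.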
This site architecture was sort of an experiment at the time. It works pretty well but the code isn't fantastic, I didn't really do anything else with it, and I'm sure there are other better-fleshed-out/tested/full-featured frameworks out there to do much the same thing.
But it might not be a bad example of the nuts and bolts of creating a basic AJAX navigation setup, as a learning tool, since it's pretty concise, and also does HTML5 history navigation (e.g. so the back button works on modern browsers).
I'm trying to write a Firefox extension that speeds up browsing page sequences by preloading sequence items, preprocessing them, and showing on request.
Is there any way to load and process the DOM of an arbitrary web page (on the same site as the currently opened one) in the background from privileged extension code?
Ideally, the document's JavaScript should work as it would in a normal browser window. I suspect a hidden window would be required for this. The context of that JavaScript should not be privileged, then.
Loading should allow the user to continue normal browsing in all visible browser windows.
I don't like the idea of injecting iframes into the currently opened document and making them optionally visible (the principle used by the Webcomic Reader userscript).
From the add-on SDK, the page-worker module might be close to what you need:
"The page-worker module provides a way to create a permanent, invisible page and access its DOM."
That said, I have no idea whether it's possible to load that invisible page into a (current or new) tab / window. You might be able to replace a current tab's document.body by the page-worker's one. Possibly. If it's legal.
You could use a lightweight browser extension to collect all links on a page on load and add a link tag for each one to prefetch its content; the browser will then load those pages in the background: https://developer.mozilla.org/en/Link_prefetching_FAQ
OR
If you need to preload a page and have access to its DOM from extension land, you could use the Page Worker API from the Add-on SDK: https://addons.mozilla.org/en-US/developers/docs/sdk/1.0/packages/addon-kit/docs/page-worker.html
I believe so. Assuming your JavaScript is already running,
var doc = gBrowser.selectedBrowser.contentDocument;
will get you the document of the loaded tab; you can then process it and do with it what you want. Doing it in the background and keeping the app responsive is a different story :)
I'm looking to get the title of a webpage, a common feature of many IRC bots that I want to incorporate into an IRC client I'm writing for fun.
The method I currently have working basically connects, sends a GET request for the entire webpage, then seeks out the <title> tags and reads what is in between them. For larger webpages this can be slower than I'd like. An additional problem I've noticed is that webpages with dynamic titles (such as some phpBB forums) will not return the title as it would show in a browser, because I don't execute any JavaScript, etc.
It seems one way to get an accurate title is to dump the HTML into a browser control (such as the IE COM control) and pull the title, but this is just going to make it even more time-consuming.
Is there a simple method I am unaware of?
In a word, no, not really.
I guess rather than downloading the whole document you could stream the HTTP response into your application and just stop downloading when you reach </title>. That would save you waiting for the whole HTML document to download.
However that doesn't help the situation if you need to read the title after it's been changed by some client-side javascript. As you say, the only way I can think of doing that is by using a browser control.
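The streaming idea above can be sketched roughly as follows. This is only an illustration in Python under a few assumptions: the names `title_from_chunks` and `fetch_title`, the 64 KiB cut-off and the UTF-8 decoding are all choices made for the example, and a regular expression stands in for a real HTML parser. It does nothing for titles changed by client-side JavaScript.

```python
import re
from typing import Iterable, Optional
from urllib.request import urlopen

# Matches the first <title>...</title> pair, case-insensitively, across newlines.
TITLE_RE = re.compile(rb"<title\b[^>]*>(.*?)</title\s*>", re.IGNORECASE | re.DOTALL)


def title_from_chunks(chunks: Iterable[bytes], max_bytes: int = 65536) -> Optional[str]:
    """Consume chunks only until a complete <title> element has been seen."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        match = TITLE_RE.search(buffer)
        if match:
            # Assumption: the page is UTF-8; undecodable bytes are replaced.
            text = match.group(1).decode("utf-8", errors="replace")
            return " ".join(text.split())  # collapse whitespace and newlines
        if len(buffer) >= max_bytes:
            break  # give up rather than download a huge page
    return None


def fetch_title(url: str, chunk_size: int = 1024) -> Optional[str]:
    """Stream a URL in small reads and stop as soon as the title is found."""
    with urlopen(url, timeout=10) as response:
        return title_from_chunks(iter(lambda: response.read(chunk_size), b""))
```

Calling `fetch_title` on a page URL returns the title text, or `None` if no title appears within the first `max_bytes` bytes; leaving the `with` block closes the connection, so the rest of the document is not read.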