Download entire website - ios

I want to be able to download the entire contents of a website and use the data in my app. I've used NSURLConnection to download files in the past, but I don't believe it is capable of downloading all files from an entire website. I'm aware of the app SiteSucker, but I don't think there is a way to integrate its functionality into my app. I looked into AFNetworking and ASIHTTPRequest, but didn't see anything useful to me. Any ideas or thoughts? Thanks.

I doubt there is anything out of the box that you can use, but the existing libraries that you mentioned (AFNetworking and ASIHTTPRequest) will get you pretty far.
The way this works is: you load the main page, then go through its source and find every resource the page uses to display its contents, along with its links to other pages. You then recursively download those resources and pages, as well as the resources they reference in turn.
As you can imagine, there are a few caveats to this approach:
You will only be able to download files that are mentioned in the source code. Hidden files, or files that aren't used by any page, will not be downloaded because the app doesn't know they exist.
Be aware of relative and absolute paths: ./image.jpg, /image.jpg, http://website.com/image.jpg, www.website.com/image.jpg, etc. could all link to the same image.
Keep in mind that page1.html could link to page2.html and vice versa. If you don't put any checks in place, this could lead to an infinite loop.
Check for pages that link to external websites. You probably don't want to download those, since many websites link to the outside and you would end up downloading the entire Internet to an iPhone with 8GB of storage.
Any dynamic pages (the ones that use a server side scripting language, such as PHP) will become static because they lose their server backend to provide them with dynamic data.
Those are the ones I could come up with, but I'm sure that there are more.
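To make the caveats above concrete, here is a minimal sketch of the crawl loop. It is written in Python with only the standard library rather than Objective-C, purely to show the logic; it is not an AFNetworking or ASIHTTPRequest API. The `fetch` callable, `max_pages`, and all other names are hypothetical. It resolves relative links, keeps a set of visited URLs to avoid the page1/page2 loop, and skips other hosts. It treats every resource as text, and it does not try to repair scheme-less links such as www.website.com/image.jpg.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlsplit


class LinkExtractor(HTMLParser):
    """Collects the href/src attribute values found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def normalize(base_url, link):
    """Resolve a link against the page it appears on and drop any #fragment."""
    absolute, _fragment = urldefrag(urljoin(base_url, link))
    return absolute


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl restricted to the host of start_url.

    fetch is a hypothetical callable (url -> text) supplied by the caller.
    Returns a dict mapping each visited URL to the text that fetch returned.
    """
    host = urlsplit(start_url).netloc
    start = normalize(start_url, "")
    seen = {start}            # guards against page1 <-> page2 loops
    queue = deque([start])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        text = fetch(url)
        pages[url] = text
        parser = LinkExtractor()
        parser.feed(text)
        for link in parser.links:
            target = normalize(url, link)
            if urlsplit(target).netloc != host:
                continue      # external site (or mailto: etc.): skip it
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return pages
```

In a real app, `fetch` would be whatever download call you already use, and you would only parse responses that are actually HTML (and probably CSS) instead of feeding every resource to the parser.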

Related

Create an offline version of web application

Question :
Where do I start in writing an application that can work without an internet connection? Exactly like this
Explanation :
Say we have a web application which is already deployed. Since internet connectivity is not great in India, I would like to create an offline version of the same web application which users can access without an internet connection. I want them to have an experience similar to the web interface, without many changes.
One idea that came to my mind is to create a tarball of the application's contents and ship it to the users. Users would have to install and configure it on their machine from that tarball in order to use it. What the tarball should contain is also debatable: Apache, the technology stack, etc.
I will be happy to write more in case I have not been precise. My question is not tied to any technology stack, but it might be of interest to everyone. Since I am not sure which tag is the right one to add here, can anybody from the Stack Overflow team help with tagging? :)
My application is actually in RoR, so I am tagging the Ruby on Rails community. Maybe they can help here?
As long as your web application contains only flat files (HTML, CSS, JS, text data, etc.) and does not depend on any components that need to be installed, then you can simply distribute those files in an archive (.zip will be more cross-platform-friendly) and the user could open the application by opening the front page in a browser. To make it better for the user, a small application which invokes the user's browser with the local URI should also be included.
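As an illustration of the "small application which invokes the user's browser" idea, here is a minimal sketch in Python using only the standard library. The function names and the `index.html` default are assumptions for the example, not part of the original application.

```python
import webbrowser
from pathlib import Path


def front_page_uri(app_dir, front_page="index.html"):
    """Build the file:// URI of the front page inside the unpacked archive."""
    page = Path(app_dir).resolve() / front_page   # as_uri() needs an absolute path
    return page.as_uri()


def launch(app_dir):
    """Open the front page in the user's default browser."""
    return webbrowser.open(front_page_uri(app_dir))


# Example use, assuming this script is shipped next to the unpacked files:
#     launch(Path(__file__).parent)
```

This only helps for the flat-file case described above; anything that needs a server-side component still has to be installed separately.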

Alternatives to Appcache

I am developing a site using PHP, and I was a bit misled by how Appcache works; it turns out that it also caches the current page. Which, in the case of a PHP app, is a problem. :)
I'd still like to cache my JavaScript, CSS and images on the client, but not my actual generated page. What is a good alternative for that? Just the plain old cache headers? The problem I see with them is that they still produce requests. I am trying to minimize the number of requests a client needs to make, and this includes 304s.
As you might have found out by now, appCache is in the process of being deprecated and will disappear at some point. It was a good solution for offline applications (static pages with variable data), but not as a cache for static files in dynamic pages.
You could try to include a blank page with a manifest in a hidden iframe in your dynamic pages, but still only pages present in the appCache would use the static resources downloaded from the manifest; the other pages would check the live static resources from the server anyway (the only part of the manifest which is valid everywhere is the "fallback" part).
So your best option is to check your cache headers as suggested by Marged, as it IS possible to avoid server access indefinitely for a static resource.
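For what it's worth, the cache-header approach usually comes down to a long `max-age` on static files, so the browser can reuse them without asking the server (no request, hence no 304), and `no-cache` on the generated page. The sketch below is in Python rather than PHP and only computes the header values; the suffix list, the one-year lifetime, and the function names are assumptions. The content-hash file name is a common companion technique (not something mentioned in this thread) so that a changed file gets a new URL.

```python
import hashlib
from pathlib import PurePosixPath

ONE_YEAR = 365 * 24 * 60 * 60  # lifetime in seconds; an arbitrary choice for this sketch
STATIC_SUFFIXES = {".js", ".css", ".png", ".jpg", ".gif"}


def cache_headers(path):
    """Pick a Cache-Control header for a request path.

    Static files may be reused without contacting the server until max-age
    expires; everything else must be revalidated on every use.
    """
    if PurePosixPath(path).suffix.lower() in STATIC_SUFFIXES:
        return {"Cache-Control": f"public, max-age={ONE_YEAR}"}
    return {"Cache-Control": "no-cache"}


def versioned_name(path, content):
    """Embed a short content hash in the file name, e.g. site.<hash>.css."""
    digest = hashlib.sha1(content).hexdigest()[:8]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

With something like this, the generated PHP page is always fetched fresh, while the files it references are served from the browser cache until their names change.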
You could dig into what the ServiceWorkers cache does, but I'm not an expert in the field (for now).

How to stop embedded google drive on my website from linking to Google when clicked on

After having embedded google drive files on my website (awesome feature), I found a minor drawback.
When clicking on one of the folders in the list, the viewer is redirected from my page to the Google Drive site. However, I want to keep the viewer on my page and have the folder open within my own website.
Also I want other folders within these folders to open within the borders of my website, and so on and so forth.
The used code is simple:
The used website is Typo3 based.
Does anyone have a solution for this problem?
Thank you very much in advance; all replies and suggestions are highly appreciated!
After a quick search it seems to me this is more a hack than an official google feature, so probably there's no easy way for altering the behaviour of the stuff inside the iframe. I would rather recommend setting an outbound link and accepting the fact that you're hosting the files at Google.
In the future, there might (or might not) be a File Abstraction Layer Adapter for Drive coming up: http://wiki.typo3.org/FAL_Adapters. Well, probably not so soon. But for Dropbox!

Google Drive as a video hosting/streaming platform?

I'm developing an iOS app that generates video files and has a social gallery for users to display their clips. After a lot of research I found that Google Drive would be a perfect fit for my needs, so I did some testing and successfully made the app upload the file to Google Drive.
Now I need to stream the uploaded file in an MPMoviePlayerViewController. For that I would need some kind of direct link, am I right? In my initial tests I used the WebContentLink value as the source URL and it worked flawlessly. I was really happy with the result, but now it doesn't work anymore. I don't know what happened, and I suspect the method I used is not reliable. I tried all the other possible links and none of them seems to work.
Can someone give some guidance on whether this is really supported by Google Drive, and on the best way to achieve it reliably?
Thank you very much !
I too encountered the same error when I downloaded the same 24 MB file 28 times (for testing).
However, I realised that if I download using the content owner ID, it does allow downloading after the 28th time:
https://docs.google.com/a/onwardsct.com/uc?id=0ByvXJAlpPqQPYWNqY0V3MGs0Ujg&export=download
Sorry, you can't view or download this file at this time.
Too many users have viewed or downloaded this file recently. Please try accessing the file again later. If the file you are trying to access is particularly large or is shared with many people, it may take up to 24 hours to be able to view or download the file. If you still can't access a file after 24 hours, contact your domain administrator.
The experience for streaming files natively is not ideal right now, sorry. It is something Google are working on.
You are doing this correctly though. The webContentLink should use the user's quota, and that should be enough for most cases. If you can give some specific numbers, we can look at it.
The embed link is the best way to show it on a mobile device, but as you say won't work everywhere.
Yes, Google Drive can be used for hosting and streaming videos as you like. It can also be used as a demo server for web projects. Here is how to host a website on Google Drive.

HTML contents preloaded into an app and then updated from the Internet

The title says it (almost) all: I would like to distribute an app with some HTML pages preloaded into the local Documents folder (they mirror the content of a mini mobile site available on the internet). Then, when the contents of the pages are updated, the local HTML files in the app should be updated too, so that the user can browse the updated information even when not connected to the internet.
The app has to work from the first launch, thanks to the preloaded pages, and then update itself periodically (I don't need to check the modification date/time of the individual files; it's enough to check and update them when the local copies are older than x days).
The problem: I think I can do it all myself, but I was wondering whether there is some framework/class that does it automatically, because it sounds like a pain :)
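The "older than x days" check described in the question is small enough to sketch. The version below is in Python with only the standard library, just to show the logic; the function names are hypothetical, and an iOS app would read the same modification date through its own file APIs.

```python
import os
import time

SECONDS_PER_DAY = 24 * 60 * 60


def is_stale(path, max_age_days, now=None):
    """Return True if the file is missing or was modified more than max_age_days ago."""
    if now is None:
        now = time.time()
    try:
        modified = os.path.getmtime(path)
    except OSError:
        return True  # no local copy yet: treat it as needing a download
    return now - modified > max_age_days * SECONDS_PER_DAY


def stale_files(folder, max_age_days, now=None):
    """List the names of the files in folder whose local copy is too old."""
    return sorted(
        name
        for name in os.listdir(folder)
        if is_stale(os.path.join(folder, name), max_age_days, now)
    )
```

On each launch (or on a timer) the app would re-download whatever `stale_files` returns and overwrite the local copies, which also resets their modification time.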
Consider using ASIHTTPRequest. Check out this SO question.
Specifically, you might want to look into ASIWebPageRequest:
download complete webpages, including external resources like images
and stylesheets. Pages of any size can be indefinitely cached, and
displayed in a UIWebview / WebView even when you have no network
connection.
I've also used AFNetworking for my own personal projects and it's made my life 10x easier. On the AFNetworking FAQ page, there's a question regarding caching mechanisms for offline viewing. It mentions that NSURLCache in iOS 5 introduced support for caching to disk for offline use - but only for http. If you need to cache https, consider using SDURLCache.
Here's a short additional resource in regards to network caching for iOS.
Read the section titled iOS network caching
If you are looking at pre-populating your iOS app with the equivalent of a browser cache, then https://github.com/rs/SDURLCache might be something to look into.
It hooks in with existing NSURLConnection frameworks such as AFNetworking and you just need to set the correct cache policy in your NSURLRequest.
Given that it's open source, you should be able to figure out where to place your data so that it loads without fetching from the server the first time; then just specify when you want the cache to purge itself so that it fetches from the server again.
