Get main content from url - ios

I'm trying to build an iOS app like Pocket or Instapaper for practice. So, I need to fetch data from a URL and strip the HTML out of it. I created the code below to do this.
NSURL *url = [NSURL URLWithString:self.link];
NSString *webData= [NSString stringWithContentsOfURL:url];
NSLog(@"webData is: %@", webData);
NSString *finalhtmlstring = [NSString stringWithFormat:@"%@", webData];
finalhtmlstring = [finalhtmlstring stringByConvertingHTMLToPlainText];
NSLog(@"FinalHTMLString is: %@", finalhtmlstring);
How would I fetch the body of the page? I can't get the NSString between @"<body>" and @"</body>", because some websites add attributes to the <body> tag.

It sounds like you need to parse an XML or HTML page.
Fortunately, there are open-source libraries such as Hpple that make it easy to pull content out of the markup.
It wraps libxml2 nicely using Objective-C objects.
Here is a tutorial about how to use this library.
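If pulling in a library is not an option, a rough approximation is a regular expression that tolerates attributes on the <body> tag. This is a different technique from the Hpple approach recommended above, and only a sketch: BodyHTMLFromPage is a hypothetical helper name, and the pattern assumes no attribute value contains a ">" character. Malformed real-world pages may still need a real parser.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the markup between <body ...> and </body>,
// or nil when no body element is found.
static NSString *BodyHTMLFromPage(NSString *html) {
    // Case-insensitive, and "." is allowed to span line breaks.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<body\\b[^>]*>(.*?)</body>"
                                                  options:(NSRegularExpressionCaseInsensitive |
                                                           NSRegularExpressionDotMatchesLineSeparators)
                                                    error:NULL];
    NSTextCheckingResult *match = [regex firstMatchInString:html
                                                    options:0
                                                      range:NSMakeRange(0, html.length)];
    if (match == nil) {
        return nil;
    }
    // Capture group 1 is everything between the opening and closing tags.
    return [html substringWithRange:[match rangeAtIndex:1]];
}
```

The returned string is still HTML, so it could then be passed to stringByConvertingHTMLToPlainText as in the question.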

Related

How to pass GET request for PayTm transaction in ios Objective C

I am integrating PayTm with my app and I want to pass the parameters using GET method.
My code is as follows:
NSString *urlString = [NSString stringWithFormat:@"https://secure.paytm.in/oltp/HANDLER_INTERNAL/TXNSTATUS?JsonData={%22MID%22:%22%@%22,%22ORDERID%22:%22a84afd6c-0e54-42df-b29a-2b057f9e7c53%22}",MIDValue];
where MIDValue is a string.
When I use this code, I get an error message.
Please give a suggestion to remove the error.
Thank You
Maybe your variable MIDValue contains a space and/or an &.
You first have to percent-encode it and then pass it as a parameter. Also note that a literal % inside a format string has to be written as %%, otherwise stringWithFormat: treats it as the start of a format specifier.
Visit this for details; I guess this is what you are looking for:
NSString *newParam = [MIDValue stringByAddingPercentEncodingWithAllowedCharacters:[NSCharacterSet URLHostAllowedCharacterSet]];
NSString *urlString = [NSString stringWithFormat:@"https://secure.paytm.in/oltp/HANDLER_INTERNAL/TXNSTATUS?JsonData={%%22MID%%22:%%22%@%%22,%%22ORDERID%%22:%%22a84afd6c-0e54-42df-b29a-2b057f9e7c53%%22}", newParam];
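An alternative that avoids hand-written percent escapes is to let Foundation build both the JSON and the query string. The sketch below assumes the endpoint accepts the same JsonData parameter when it is encoded by NSURLComponents, which has not been verified against PayTm. TxnStatusURL is a hypothetical helper name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: builds the transaction-status URL from a merchant ID
// and an order ID. Returns nil if the JSON payload cannot be created.
static NSURL *TxnStatusURL(NSString *mid, NSString *orderID) {
    NSDictionary *payload = @{ @"MID": mid, @"ORDERID": orderID };
    NSData *json = [NSJSONSerialization dataWithJSONObject:payload options:0 error:NULL];
    if (json == nil) {
        return nil;
    }
    NSString *jsonString = [[NSString alloc] initWithData:json encoding:NSUTF8StringEncoding];

    NSURLComponents *components =
        [NSURLComponents componentsWithString:@"https://secure.paytm.in/oltp/HANDLER_INTERNAL/TXNSTATUS"];
    // Setting queryItems percent-encodes the value where necessary.
    components.queryItems = @[ [NSURLQueryItem queryItemWithName:@"JsonData" value:jsonString] ];
    return components.URL;
}
```

Calling TxnStatusURL(MIDValue, orderID) then yields an NSURL that can be handed to NSURLSession directly, with no format string involved.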

Make a local NSURL from an NSString containing HTML?

I have an NSString containing some HTML. I’m trying to pre-load that HTML, including any external image links inside it. So, to clarify: I have just HTML, and I need to basically load it, grab its images, and cache that data.
I can do this with a live website using the following code:
NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *filePath = [NSString stringWithFormat:@"%@/%@", [paths objectAtIndex:0], @"index.html"];
NSURL *url = [NSURL URLWithString:@"http://www.google.co.uk"];
NSData *urlData = [NSData dataWithContentsOfURL:url];
[urlData writeToFile:filePath atomically:YES];
But this won’t work without an NSURL. I’m wondering if I can make an NSURL somehow from this string of HTML, and substitute it in the URLWithString: method here.
So, question: can I take some local HTML inside an NSString and turn that into an NSURL, so that I can feed it into the code above and save both the HTML and any images it links to?
The question you are asking makes absolutely no sense.
A URL is a pointer to the location of a resource, in this context an HTML file. You cannot turn one into the other.
I suggest you create a UIWebView, load the string into that, have it render and cache the result.
[webView loadHTMLString:@"data" baseURL:nil];
I believe it will need to actually be placed on the screen for it to render, so make it an invisible 1 x 1 pixel square and it should be fine. Then, when the webViewDidFinishLoad: delegate callback fires, cache the result.
Yes, you can. You would have to parse the links out of HTML yourself. Real world HTML is not XML compliant, so a quick and dirty way to achieve your goals would be to treat HTML as text and parse links out using a regular expression library.
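A minimal sketch of that quick-and-dirty approach, using NSRegularExpression from Foundation. ImageSourcesInHTML is a hypothetical helper name, and the pattern only handles quoted src values, so it is an approximation rather than a full HTML parser.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: collects the src attribute of every <img> tag in the
// given HTML string, in document order.
static NSArray<NSString *> *ImageSourcesInHTML(NSString *html) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<img\\b[^>]*?\\bsrc\\s*=\\s*[\"']([^\"']+)[\"']"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    NSMutableArray<NSString *> *sources = [NSMutableArray array];
    [regex enumerateMatchesInString:html
                            options:0
                              range:NSMakeRange(0, html.length)
                         usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop) {
        // Capture group 1 holds the attribute value without its quotes.
        [sources addObject:[html substringWithRange:[match rangeAtIndex:1]]];
    }];
    return sources;
}
```

Each returned string can be turned into an NSURL and downloaded with the dataWithContentsOfURL: code from the question, while the HTML itself can be saved with writeToFile:atomically:encoding:error:.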

Get html code and edit then load on to web view

I want to load a website that is not under my control in a UIWebView and make certain UI changes to it (adding or editing some text, images, etc.). Can I do this within my iOS source code? I can't change the hosted HTML contents since they are not under my control.
If this is not doable within the iOS source code, please advise me on the correct way to achieve this.
Load the webpage into an NSString, make any modifications and then put the HTML into the UIWebView.
NSURL *url = [NSURL URLWithString:@"http://example.com/"];
NSString *page = [NSString stringWithContentsOfURL:url usedEncoding:nil error:nil];
/* Make changes to page here */
[self.webView loadHTMLString:page baseURL:url]; // passing the page URL lets relative links and images resolve
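As one possible way to fill in the "make changes" step, the sketch below inserts a <style> element just before the closing </head> tag. HTMLByInjectingCSS is a hypothetical helper name, and the approach assumes the page actually contains a </head> tag; otherwise the HTML is returned unchanged.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns a copy of the page with the given CSS wrapped
// in a <style> element and inserted right before </head>.
static NSString *HTMLByInjectingCSS(NSString *html, NSString *css) {
    NSRange head = [html rangeOfString:@"</head>" options:NSCaseInsensitiveSearch];
    if (head.location == NSNotFound) {
        return html; // nothing to anchor on, leave the page untouched
    }
    NSString *styleTag = [NSString stringWithFormat:@"<style>%@</style>", css];
    // A zero-length range at the tag's location turns the replace into an insert.
    return [html stringByReplacingCharactersInRange:NSMakeRange(head.location, 0)
                                         withString:styleTag];
}
```

For example, page = HTMLByInjectingCSS(page, @"body { background: #888; }"); could be called before the page is handed to loadHTMLString:baseURL:.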
I'd get the DOM with JavaScript, manipulate it, then inject the changes back with JavaScript.
See stringByEvaluatingJavaScriptFromString:.
You can write your own full-featured, minified JavaScript, then pass it in using this method.
// Change body color of any HTML content inside a UIWebView.
NSString *javaScript = @"document.body.style.backgroundColor = '#888';";
[webView stringByEvaluatingJavaScriptFromString:javaScript];

Getting Image URLs in a Web Directory

I want to get the URLs of all images, or let's say JPEG files, in a web directory (www.abcde.com/images). I just want their URLs in an array, but I couldn't manage that. Could you please help me with this?
Thanks in advance..
Assuming you have access to an index file, you could simply load the whole HTML file via NSURL and cut out the link lines. This, however, will not work (or will hardly work) when you want to search ("spider" or "crawl") for links in more complex documents. On iOS I would suggest you use the simple yet quite powerful "hpple" framework (https://github.com/topfunky/hpple). It is used to parse HTML. You can search with it for certain HTML elements, such as <a href...> constructs.
A sample with hpple could look like this:
NSURL *url = [NSURL URLWithString:@"http://whatver.com/images"]; // placeholder address
NSData *data = [NSData dataWithContentsOfURL:url];
TFHpple *hppleParser = [TFHpple hppleWithHTMLData:data];
NSString *images = @"//img"; // grabs all image tags
NSArray *nodes = [hppleParser searchWithXPathQuery:images];
find a bigger example at http://www.raywenderlich.com/14172/how-to-parse-html-on-ios
Create a server-side script (e.g. PHP) that returns a list of all images in that directory as XML or JSON. From iOS, send a request to that script, parse the XML or JSON response, and use the image URLs.
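The client side of that idea might look like the sketch below. It assumes the script returns a flat JSON array of file names such as ["a.jpg", "b.jpg"]; that response shape and the helper name ImageURLsFromJSON are both assumptions made for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: turns a JSON array of file names into absolute URLs
// resolved against the directory URL. Non-string entries are skipped.
static NSArray<NSURL *> *ImageURLsFromJSON(NSData *json, NSURL *directoryURL) {
    if (json == nil) {
        return @[];
    }
    id parsed = [NSJSONSerialization JSONObjectWithData:json options:0 error:NULL];
    if (![parsed isKindOfClass:[NSArray class]]) {
        return @[];
    }
    NSMutableArray<NSURL *> *urls = [NSMutableArray array];
    for (id name in (NSArray *)parsed) {
        if (![name isKindOfClass:[NSString class]]) {
            continue;
        }
        NSURL *resolved = [NSURL URLWithString:name relativeToURL:directoryURL];
        if (resolved != nil) {
            [urls addObject:resolved.absoluteURL];
        }
    }
    return urls;
}
```

The directory URL should end with a slash (for example http://www.abcde.com/images/) so that each file name is resolved inside the directory rather than replacing its last path component.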

UIWebView loadHtmlString not working on device

I have a web view which I want to load using the loadHTMLString method. The problem is that I want to be able to replace the img src values with images that I have previously downloaded. I also use Google Analytics in the HTML, so I need to set the baseURL to the actual URL for it to work. Here comes the problem. If I set the baseURL, the images will not load. If I don't set the baseURL, they load. How can I get around this, so that I can both use Google Analytics and have the images stored locally in my application? I would prefer not having to integrate the Google Analytics SDK into my project.
A strange thing is that if I run it in the simulator and do not put the "http://" prefix in front of my baseURL, it works fine. However, when I run it on a device, I receive the following error and it doesn't work:
Domain=WebKitErrorDomain Code=101 "The URL can’t be shown"
Thanks
EDIT
If I do this, it works:
[appsWebView loadHTMLString:htmlString baseURL:nil];
However, I must provide a baseURL in order to have Google Analytics working. I have two further cases:
This one gives the above-mentioned error (it works OK in the simulator but gives the error when running on a device):
[appsWebView loadHTMLString:htmlString baseURL:[NSURL URLWithString:@"test.com"]];
This one simply doesn't show anything (it loads neither the HTML string nor the URL):
[appsWebView loadHTMLString:htmlString baseURL:[NSURL URLWithString:@"http://test.com"]];
I incorrectly assumed that the problem was that the local image was not fully specifying the full path, but that does not appear to be the problem here. But, you are quite right that it appears (somewhat surprisingly) that you cannot specify some web-based baseURL and also reference a local image in your HTML string. No simple solutions are leaping out at me, but at the very least, it appears that you might have a couple of (not very good) options:
First, you could base64 encode the local image using some base64 library like Matt Gallagher's NSData+Base64 category, e.g.:
NSData *imageData = [NSData dataWithContentsOfFile:imagePath];
NSString *imageDataBase64 = [imageData base64EncodedString];
NSString *imageHtml = [NSString stringWithFormat:@"<img src='data:image/png;base64,%@'>", imageDataBase64];
This slows the initial rendering, but maybe it's better than nothing.
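On iOS 7 and later, NSData can produce the base64 string itself through base64EncodedStringWithOptions:, so the third-party category is optional. The sketch below shows the same data-URI idea with only Foundation; ImgTagWithInlineData is a hypothetical helper name, and the caller is assumed to know the image's MIME type.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: wraps raw image bytes in an <img> tag whose src is a
// base64 data URI, so the image needs no separate request.
static NSString *ImgTagWithInlineData(NSData *imageData, NSString *mimeType) {
    // Option 0 produces a single line without inserted line breaks.
    NSString *base64 = [imageData base64EncodedStringWithOptions:0];
    return [NSString stringWithFormat:@"<img src='data:%@;base64,%@'>", mimeType, base64];
}
```

A call such as ImgTagWithInlineData([NSData dataWithContentsOfFile:imagePath], @"image/png") would replace the three lines above.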
Second, you could always try leaving the baseURL as nil, removing the JavaScript that does the Google Analytics from the HTML string, and then injecting that JavaScript via stringByEvaluatingJavaScriptFromString:. This approach may or may not work depending upon the complexity of the Google Analytics JavaScript (e.g. what further web-based references it might have), but there's an outside chance you might be able to do something that way.
My apologies for assuming the problem was a trivial img URL. Clearly you had identified a more fundamental issue.
Original answer:
Create your image URLs in your HTML string to be fully qualified file URLs within your local file system:
The image is either in Documents:
NSString *documentsPath = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES)[0];
NSString *imagePath = [documentsPath stringByAppendingPathComponent:imageName];
Or in the bundle:
NSString *imagePath = [[NSBundle mainBundle] pathForResource:imageName ofType:nil];
But, once you have a fully qualified path, you should be able to use that:
NSURL *imageUrl = [NSURL fileURLWithPath:imagePath];
NSString *imageHtml = [NSString stringWithFormat:@"<img src='%@'>", imageUrl];
I would bet it's a casing issue. Take into account that the Device is case sensitive whereas the Simulator is not. Check the URL and make sure it contains the right characters.

Resources