I am new to multithreading and was wondering how I could run this function in the background. The function simply returns an NSURL that is used for XML parsing, and it is called from another function. Or is it even worth running it in the background, since the function that calls it does not continue until this function returns its NSURL? Basically, I am just trying to figure out how to speed this up, because it takes a little while to finish!
+ (NSURL *)parserURL {
    NSURL *theURL = [NSURL URLWithString:@"http://www.wccca.com/PITS/"];
    NSData *data = [[NSData alloc] initWithContentsOfURL:theURL];
    TFHpple *xpathParser = [[TFHpple alloc] initWithHTMLData:data];
    NSArray *elements = [xpathParser searchWithXPathQuery:@"//input[@id='hidXMLID']//@value"];
    if (elements.count >= 1) {
        TFHppleElement *element = [elements objectAtIndex:0];
        TFHppleElement *child = [element.children objectAtIndex:0];
        NSString *idValue = [child content];
        NSString *stg = [NSString stringWithFormat:@"http://www.wccca.com/PITS/xml/fire_data_%@.xml", idValue];
        NSURL *url = [NSURL URLWithString:stg];
        return url;
    }
    return nil;
}
The main issue with your code is that it uses a blocking operation to fetch the data from the website. You definitely want to run this on a background thread. That said, I would recommend having a look at networking frameworks that make this kind of operation much easier, e.g. AFNetworking.
In any case, the strategy I would follow to multithread this operation, or a similar one, is the following:
It comes down to dispatching the work with GCD and then executing a completion block back on the main thread with the results.
Description
First, declare your method so that it takes a block. That block is executed at the end, once the data has been retrieved and parsed. Next, the code asks GCD to run a block of code on a background queue. When that work is done, the completion block that was passed in as a parameter is executed on the main thread, and the resulting URL is supplied to it.
Here is the code:
+ (void)parserURL:(NSURL *)theURL completion:(void (^)(NSURL *finalURL))completionBlock {
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        NSData *data = [[NSData alloc] initWithContentsOfURL:theURL];
        TFHpple *xpathParser = [[TFHpple alloc] initWithHTMLData:data];
        NSArray *elements = [xpathParser searchWithXPathQuery:@"//input[@id='hidXMLID']//@value"];
        NSURL *url;
        if (elements.count >= 1) {
            TFHppleElement *element = [elements objectAtIndex:0];
            TFHppleElement *child = [element.children objectAtIndex:0];
            NSString *idValue = [child content];
            NSString *stg = [NSString stringWithFormat:@"http://www.wccca.com/PITS/xml/fire_data_%@.xml", idValue];
            url = [NSURL URLWithString:stg];
        } else {
            url = nil;
        }
        dispatch_async(dispatch_get_main_queue(), ^{
            completionBlock(url);
        });
    });
}
You call the method the following way:
[URLParser parserURL:[NSURL URLWithString:@"http://www.wccca.com/PITS/"] completion:^(NSURL *finalURL) {
    NSLog(@"Parsed string %@", [finalURL absoluteString]);
}];
Related
I'm wrapping a function in a for loop so that it converts multiple HTML files to multiple PDFs:
for (int i=0; i<= 47; i++) {
    NSString *inputHTMLfileName = [NSString stringWithFormat:@"wkhtml_tempfile_%d",j];
    NSString *outputPDFfileName = [NSString stringWithFormat:@"~/Documents/%d_delegateDemo%d.pdf",loop,j];
    NSURL *htmlFileUrl = [[NSBundle mainBundle]
        URLForResource:inputHTMLfileName withExtension:@"html"];
    // Check for existing pdf file and remove it
    NSError *pdfDeleteError;
    if ([[NSFileManager defaultManager] fileExistsAtPath:outputPDFfileName]){
        //removing file
        if (![[NSFileManager defaultManager] removeItemAtPath:outputPDFfileName error:&pdfDeleteError]){
            NSString * errorMessage = [NSString stringWithFormat:@"wk %d Could not remove old pdf files. Error:%@",j, pdfDeleteError];
            NSLog(@"%@",errorMessage);
        }
    }
    self.PDFCreator = [NDHTMLtoPDF createPDFWithURL:htmlFileUrl pathForPDF:[outputPDFfileName stringByExpandingTildeInPath] pageSize:kPaperSizeA4 margins:UIEdgeInsetsMake(10, 5, 10, 5) successBlock:^(NDHTMLtoPDF *htmlToPDF) {
        NSString *result = [NSString stringWithFormat:@"HTMLtoPDF did succeed (%@ / %@)", htmlToPDF, htmlToPDF.PDFpath];
        NSLog(@"%@",result);
        self.resultLabel.text = result;
    } errorBlock:^(NDHTMLtoPDF *htmlToPDF) {
        NSString *result = [NSString stringWithFormat:@"HTMLtoPDF did fail (%@)", htmlToPDF];
        NSLog(@"%@",result);
        self.resultLabel.text = result;
    }];
}
However, it crashes with
Thread 1: EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
Link to github:
https://github.com/iclems/iOS-htmltopdf
However, if I replace the for loop with a button press (triggering this function about once per second), the app doesn't crash.
I think your for loop crashes because you are using a block inside the loop.
What happens is that the next iteration starts a new call before the block from the previous call has completed.
What you can do here is use the dispatch_group feature of iOS to run each iteration's call separately and keep track of it, instead of firing the calls one after another.
To achieve this, create a method that starts the conversion and enters and leaves the group around the block callbacks, something like this:
// serviceGroup is assumed to be an ivar created once with dispatch_group_create().
- (void)blockTask:(NSURL *)htmlFileUrl outputPath:(NSString *)outputPDFfileName
{
    dispatch_group_enter(serviceGroup);
    self.PDFCreator = [NDHTMLtoPDF createPDFWithURL:htmlFileUrl pathForPDF:[outputPDFfileName stringByExpandingTildeInPath] pageSize:kPaperSizeA4 margins:UIEdgeInsetsMake(10, 5, 10, 5) successBlock:^(NDHTMLtoPDF *htmlToPDF) {
        NSString *result = [NSString stringWithFormat:@"HTMLtoPDF did succeed (%@ / %@)", htmlToPDF, htmlToPDF.PDFpath];
        NSLog(@"%@",result);
        self.resultLabel.text = result;
        dispatch_group_leave(serviceGroup); // balance the enter on success
    } errorBlock:^(NDHTMLtoPDF *htmlToPDF) {
        NSString *result = [NSString stringWithFormat:@"HTMLtoPDF did fail (%@)", htmlToPDF];
        NSLog(@"%@",result);
        self.resultLabel.text = result;
        dispatch_group_leave(serviceGroup); // balance the enter on failure
    }];
}
Note: this is just pseudocode and may contain errors.
For a dispatch_group tutorial, check here.
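The enter/leave bookkeeping described above can be tried out in isolation. What follows is only a minimal sketch, assuming ARC and a plain Foundation program: SimulatedConversion and RunSimulatedConversions are hypothetical names invented for this illustration, the "work" is just a short sleep, and nothing here touches NDHTMLtoPDF. It shows the dispatch_group pattern only and is not a confirmed fix for the crash.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for one asynchronous job (e.g. one HTML-to-PDF conversion).
// It calls `completion` on a background queue once the simulated work is done.
static void SimulatedConversion(NSUInteger index, void (^completion)(NSUInteger index)) {
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        [NSThread sleepForTimeInterval:0.01]; // pretend to do some work
        completion(index);
    });
}

// Starts `count` jobs, blocks until all have finished, and returns how many completed.
static NSUInteger RunSimulatedConversions(NSUInteger count) {
    dispatch_group_t serviceGroup = dispatch_group_create();
    dispatch_queue_t resultQueue = dispatch_queue_create("example.results", DISPATCH_QUEUE_SERIAL);
    __block NSUInteger finished = 0;

    for (NSUInteger i = 0; i < count; i++) {
        dispatch_group_enter(serviceGroup);               // one enter per job
        SimulatedConversion(i, ^(NSUInteger index) {
            dispatch_sync(resultQueue, ^{ finished++; }); // serialize access to the counter
            dispatch_group_leave(serviceGroup);           // exactly one leave per enter
        });
    }

    // A blocking wait keeps this sketch usable from a command-line tool.
    // In an app, dispatch_group_notify on the main queue avoids blocking the UI.
    dispatch_group_wait(serviceGroup, DISPATCH_TIME_FOREVER);
    return finished;
}
```

Every dispatch_group_enter must be matched by exactly one dispatch_group_leave on every code path (success and failure), otherwise the wait or notify never fires.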
The NSArray declaration brings up an error: "No visible @interface for 'NSURL' declares the selector 'componentsSeparatedByString:'".
NSURL *MyURL = [[NSBundle mainBundle]
    URLForResource: @"artList" withExtension:@"txt"];
NSArray *lines = [MyURL componentsSeparatedByString:@"\n"]; // each line, adjust character for line endings
for (int i = 0; i < 10; i++) {
    NSString *line;
    //in lines;
    NSLog(@"%@", [NSString stringWithFormat:@"line: %@", line]);
    _wordDefBox.text = [NSString stringWithFormat:@"%@%@",_wordDefBox.text, lines];
}
You missed a step. Once you have the URL, you need to load the file into an NSString. Then call componentsSeparatedByString on the NSString.
NSURL *myURL = [[NSBundle mainBundle]
    URLForResource: @"artList" withExtension:@"txt"];
NSError *error = nil;
// Use the appropriate encoding for your file
NSString *string = [NSString stringWithContentsOfURL:myURL encoding:NSUTF8StringEncoding error:&error];
if (string) {
    NSArray *lines = [string componentsSeparatedByString:@"\n"];
    // and the rest
} else {
    NSLog(@"Unable to load string from %@: %@", myURL, error);
}
In general, when you see such an error, it means class X (here NSURL) doesn't have any method named Y (here componentsSeparatedByString:), or at least doesn't declare it in its interface; that is, it is not a public method, though it may be a private one available only to the implementation. Always try to make sense of what the compiler is telling you. To find out more, you can Cmd + click on any class name in Xcode to jump to its interface and see which public methods it has. Try that on NSString and NSURL.
Here specifically: NSURL doesn't have that method. It doesn't belong to NSURL; it belongs to NSString.
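Once the file contents are in an NSString, the remaining step is to split the text and walk over the lines. A small sketch of that step, assuming the text may use either "\n" or "\r\n" line endings; NonEmptyLines is a hypothetical helper name, not part of Foundation.

```objc
#import <Foundation/Foundation.h>

// Splits text into lines and drops empty entries (for example the one left by a
// trailing newline, or the blank produced when "\r\n" is split as two characters).
static NSArray<NSString *> *NonEmptyLines(NSString *text) {
    NSArray<NSString *> *rawLines =
        [text componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]];
    NSMutableArray<NSString *> *lines = [NSMutableArray array];
    for (NSString *line in rawLines) {
        if (line.length > 0) {
            [lines addObject:line];
        }
    }
    return lines;
}
```

Each element of the returned array is an NSString, so it can be logged with NSLog(@"line: %@", line) or appended to a text view inside a for-in loop.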
I am currently using a dispatch_group to get notified when all concurrent tasks are done. I am offloading some heavy tasks onto a concurrent queue inside the [TWReaderDocument documentFileURL:url withCompletionBlock:] class method.
I have implemented the following code but never receive any notification. I don't see what I am potentially doing wrong in the code below:
dispatch_group_t readingGroup = dispatch_group_create();
NSFileManager* manager = [NSFileManager defaultManager];
NSString *docsDir = [[[NSBundle mainBundle] resourcePath] stringByAppendingPathComponent:@"Data"];
NSDirectoryEnumerator *dirEnumerator = [manager enumeratorAtURL:[NSURL fileURLWithPath:docsDir]
    includingPropertiesForKeys:[NSArray arrayWithObjects:NSURLNameKey,
                                   NSURLIsDirectoryKey,nil]
    options:NSDirectoryEnumerationSkipsHiddenFiles
    errorHandler:nil];
// An array to store the all the enumerated file names in
NSMutableArray *arrayFiles;
// Enumerate the dirEnumerator results, each value is stored in allURLs
for (NSURL *url in dirEnumerator) {
    // Retrieve the file name. From NSURLNameKey, cached during the enumeration.
    NSString *fileName;
    [url getResourceValue:&fileName forKey:NSURLNameKey error:NULL];
    // Retrieve whether a directory. From NSURLIsDirectoryKey, also cached during the enumeration.
    NSNumber *isDirectory;
    [url getResourceValue:&isDirectory forKey:NSURLIsDirectoryKey error:NULL];
    if (![isDirectory boolValue]) {
        dispatch_group_enter(readingGroup);
        TWReaderDocument* doc = [TWReaderDocument documentFileURL:url withCompletionBlock:^(BOOL success) {
            dispatch_group_leave(readingGroup);
        }];
        [arrayFiles addObject:doc];
    }
    else if ([[[fileName componentsSeparatedByString:@"_" ] objectAtIndex:0] isEqualToString:@"XXXXXX"]) {
        TreeItem* treeItem = [[TreeItem alloc] init];
        arrayFiles = [NSMutableArray arrayWithCapacity:10];
        treeItem.child = arrayFiles;
        treeItem.nodeName = [[fileName componentsSeparatedByString:@"_" ] lastObject];
        [self addItem:treeItem];
    }
}
dispatch_group_notify(readingGroup, dispatch_get_main_queue(), ^{ // 4
    NSLog(@"All concurrent tasks completed");
});
Do dispatch_group_enter and dispatch_group_leave have to be executed on the same thread?
EDIT
The code snippet of my factory method might help as well:
+ (TWReaderDocument *)documentFileURL:(NSURL *)url withCompletionBlock:(readingCompletionBlock)completionBlock{
    TWReaderDocument * twDoc = [[TWReaderDocument alloc] init];
    twDoc.status = ReaderDocCreated;
    twDoc.doc = [ReaderDocument withDocumentFilePath:[url path] withURL:url withLoadingCompletionBLock:^(BOOL completed) {
        twDoc.status = completed ? ReaderDocReady : ReaderDocFailed;
        completionBlock(completed);
    }];
    return twDoc;
}
TWReaderDocument is a wrapper class that internally calls the following methods of a third-party library (a PDF reader):
+ (ReaderDocument *)withDocumentFilePath:(NSString *)filePath withURL:(NSURL*)url withLoadingCompletionBLock:(readingCompletionBlock)completionBlock{
    ReaderDocument *document = [[ReaderDocument alloc] initWithFilePath:filePath withURL:url withLoadingCompletionBLock:[completionBlock copy]];
    return document;
}
- (id)initWithFilePath:(NSString *)fullFilePath withURL:(NSURL*)url withLoadingCompletionBLock:(readingCompletionBlock)completionBlock {
    id object = nil; // ReaderDocument object;
    if ([ReaderDocument isPDF:fullFilePath] == YES) // File must exist
    {
        if ((self = [super init])) // Initialize superclass object first
        {
            _fileName = [ReaderDocument relativeApplicationFilePath:fullFilePath]; // File name
            dispatch_async([ReaderDocument concurrentLoadingQueue], ^{
                self.guid = [ReaderDocument GUID]; // Create a document GUID
                self.password = nil; // Keep copy of any document password
                self.bookmarks = [NSMutableIndexSet indexSet]; // Bookmarked pages index set
                self.pageNumber = [NSNumber numberWithInteger:1]; // Start on page 1
                CFURLRef docURLRef = (__bridge CFURLRef)url;// CFURLRef from NSURL
                self.fileURL = url;
                CGPDFDocumentRef thePDFDocRef = CGPDFDocumentCreateX(docURLRef, self.password);
                BOOL success;
                if (thePDFDocRef != NULL) // Get the number of pages in the document
                {
                    NSInteger pageCount = CGPDFDocumentGetNumberOfPages(thePDFDocRef);
                    self.pageCount = [NSNumber numberWithInteger:pageCount];
                    CGPDFDocumentRelease(thePDFDocRef); // Cleanup
                    success = YES;
                }
                else // Cupertino, we have a problem with the document
                {
                    // NSAssert(NO, @"CGPDFDocumentRef == NULL");
                    success = NO;
                }
                NSFileManager *fileManager = [NSFileManager new]; // File manager instance
                self.lastOpen = [NSDate dateWithTimeIntervalSinceReferenceDate:0.0]; // Last opened
                NSDictionary *fileAttributes = [fileManager attributesOfItemAtPath:fullFilePath error:NULL];
                self.fileDate = [fileAttributes objectForKey:NSFileModificationDate]; // File date
                self.fileSize = [fileAttributes objectForKey:NSFileSize]; // File size (bytes)
                completionBlock(success);
            });
            //[self saveReaderDocument]; // Save the ReaderDocument object
            object = self; // Return initialized ReaderDocument object
        }
    }
    return object;
}
It's hard to say what's going on here without knowing more about TWReaderDocument, but I have a suspicion...
First off, no, dispatch_group_enter and dispatch_group_leave do not have to be executed on the same thread. Definitely not.
My best guess based on the info here would be that for some input, [TWReaderDocument documentFileURL:withCompletionBlock:] is returning nil. You might try this instead:
if (![isDirectory boolValue]) {
    dispatch_group_enter(readingGroup);
    TWReaderDocument* doc = [TWReaderDocument documentFileURL:url withCompletionBlock:^(BOOL success) {
        dispatch_group_leave(readingGroup);
    }];
    if (nil == doc) {
        // If the doc wasn't created, leave might never be called.
        dispatch_group_leave(readingGroup);
    } else {
        // Only add non-nil objects; -addObject: raises an exception on nil.
        [arrayFiles addObject:doc];
    }
}
Give that a try.
EDIT:
It's exactly as I expected. There are cases in which this factory method will not call the completion. For instance:
if ([ReaderDocument isPDF:fullFilePath] == YES) // File must exist
If +isPDF: returns NO, the completionBlock will never be called, and the returned value will be nil.
Incidentally, you should never compare something with == YES. Anything non-zero counts as true in a condition, but YES is defined as 1, so a non-zero value other than 1 fails the comparison. Just write if ([ReaderDocument isPDF:fullFilePath]). It expresses the same intent and is safer.
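To make the == YES pitfall concrete, here is a small sketch. The option flags and function names are hypothetical and exist only for this illustration; the point is that a bit-mask result such as 4 is "true" in a condition but is not equal to YES.

```objc
#import <Foundation/Foundation.h>

// Hypothetical option flags, used only for this illustration.
enum { ExampleOptionBold = 1 << 0, ExampleOptionItalic = 1 << 2 };

// Fragile: (options & ExampleOptionItalic) evaluates to 4 when the flag is set,
// and 4 == YES is false because YES is 1.
static BOOL HasItalicFragile(int options) {
    return (options & ExampleOptionItalic) == YES;
}

// Robust: any non-zero result counts as true.
static BOOL HasItalicRobust(int options) {
    return (options & ExampleOptionItalic) != 0;
}
```

With the italic flag set, the fragile version reports NO while the robust one reports YES, which is exactly the kind of silent mismatch the advice above guards against.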
Is there any way to parse Google Shopping results using TFHpple without the (deprecated) Google API, simply by using a URL such as this one: https://www.google.com/search?hl=en&tbm=shop&q=AudiR8 ?
I've tried many types of tags:
...
myCar = @"Audi R8";
myURL = [NSString stringWithFormat:@"https://www.google.com/search?hl=en&tbm=shop&q=%@",myCar];
NSData *htmlData = [[NSData alloc] initWithContentsOfURL:[NSURL URLWithString:myURL]];
TFHpple *xpath = [[TFHpple alloc] initWithHTMLData:htmlData];
//use xpath to search element
NSArray *elements = [NSArray new];
elements = [xpath searchWithXPathQuery:@"//html//body"]; // <-- tags
...
but nothing works; I always get the same console message: UNABLE TO PARSE.
I found several problems and have finally solved them all.
First of all, the URL needs to be percent-encoded, by adding:
myURL = [myURL stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
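The method used in the line above, stringByAddingPercentEscapesUsingEncoding:, has since been deprecated by Apple. A minimal sketch of the same step with its replacement, stringByAddingPercentEncodingWithAllowedCharacters:, follows. ShoppingSearchURLString is a hypothetical helper name, and only the search term is encoded here rather than the whole URL. Note that the query character set leaves characters such as & and = untouched, so this sketch assumes the search term does not contain them.

```objc
#import <Foundation/Foundation.h>

// Builds the search URL string with the search term percent-encoded.
// Assumes `searchTerm` contains no '&' or '=' characters (see note above).
static NSString *ShoppingSearchURLString(NSString *searchTerm) {
    NSString *encodedTerm = [searchTerm
        stringByAddingPercentEncodingWithAllowedCharacters:[NSCharacterSet URLQueryAllowedCharacterSet]];
    return [NSString stringWithFormat:
        @"https://www.google.com/search?hl=en&tbm=shop&q=%@", encodedTerm];
}
```

The space in "Audi R8" is what makes URLWithString: return nil in the original code, which is why encoding the term first matters.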
Then, inside the original (and current) TFHpple code, specifically XPathQuery.m, the parsing phase crashes whenever nodeContent or raw is nil.
So, to fix this crash, I changed
[resultForNode setObject:currentNodeContent forKey:@"nodeContent"];
to (note that both [resultForNode ...] lines need a guard):
if (currentNodeContent != nil)
    [resultForNode setObject:currentNodeContent forKey:@"nodeContent"];
and:
[resultForNode setObject:rawContent forKey:@"raw"];
to:
if (rawContent != nil)
    [resultForNode setObject:rawContent forKey:@"raw"];
I also want to mention that, because of the complicated HTML that Google uses, I decided to use these XPath queries:
...
NSArray *elementsImages = [NSArray new];
NSArray *elementsPrices = [NSArray new];
elementsImages = [xpath searchWithXPathQuery:@"//html//*[@class=\"psliimg\"]"];
elementsPrices = [xpath searchWithXPathQuery:@"//html//*[@class=\"psliprice\"]"];
...
Another problem appears when you use a for or while loop to retrieve several HTML pages. If you call
NSData *htmlData = [[NSData alloc] initWithContentsOfURL:[NSURL URLWithString:myURL]];
many times inside the loop, initWithContentsOfURL: sometimes fails to fetch the page correctly (and the debug console prints the familiar UNABLE TO PARSE), so I decided to replace it with:
// Send a synchronous request
NSURLRequest * urlRequest = [NSURLRequest requestWithURL:[NSURL URLWithString:myURL]];
NSURLResponse * response = nil;
NSError * error = nil;
NSData * data = [NSURLConnection sendSynchronousRequest:urlRequest
returningResponse:&response
error:&error];
if (error == nil)
{
// Parse data here
}
And if you don't want to wait for this loop to finish (it is made of synchronous NSURLRequests), call the parent method like this, so that your view controller doesn't freeze while waiting for the parser:
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_async(queue, ^{
    // now I call my Google Shopping parser loop
    [self GShoppingParser];
});
Can you try changing the below line
NSData *htmlData = [[NSData alloc] initWithContentsOfURL:[NSURL URLWithString:myURL]];
to
NSData *data = [[NSData alloc] initWithContentsOfURL:[NSURL URLWithString:myURL]];
and also the below line
TFHpple *xpath = [[TFHpple alloc] initWithHTMLData:htmlData];
to
TFHpple *xpathParser = [[TFHpple alloc] initWithHTMLData:data];
Let me know if this helps else there is one more line that you may need to change in your code.
happy coding!
I'm trying to learn objective-c (I'm very new to that) and I have issues with memory management...
I'm developing an iPad app that uses TouchXML.
I've created my class that extends CXMLDocument and does some initialisation by reading some contents and saving into properties.
Here is my code (SimpleManifest.h):
@interface SimpleManifest : CXMLDocument {
    CXMLNode *_defaultOrganization;
    NSString *_title;
    NSDictionary *dictionary;
}
@property (readonly) CXMLNode *defaultOrganization;
@property (readonly) NSString* title;
- (id) initWithPath:(NSString *)path options:(NSUInteger)options error:(NSError **)error;
@end
(SimpleManifest.m):
#import "SimpleManifest.h"
#import "CXMLNode_XPathExtensions.h"
@implementation SimpleManifest
- (id) initWithPath:(NSString *)path options:(NSUInteger)options error:(NSError **)error
{
    /*
    NSURL *theURL = [[[NSURL alloc] initFileURLWithPath:path] autorelease];
    self = [self initWithContentsOfURL:theURL options:options error:error];
    */
    NSData *data = [NSData dataWithContentsOfFile:path];
    NSString *s = [[[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding] autorelease];
    self = [self initWithXMLString:s options:options error:error];
    if (self==nil) return nil;
    // load main props
    dictionary = [NSDictionary dictionaryWithObjectsAndKeys:
                  @"http://www.imsglobal.org/xsd/imscp_v1p1", @"imscp",
                  @"http://ltsc.ieee.org/xsd/LOM", @"lom", nil];
    // default organization
    @try {
        CXMLNode *orgsElem = [[[self childAtIndex:0] nodesForXPath:@"//imscp:organizations" namespaceMappings:dictionary error:nil] objectAtIndex:0];
        NSString *xpath = [NSString stringWithFormat:@"//imscp:organization[@identifier='%@']", [[orgsElem attributeForName:@"default"] stringValue]];
        _defaultOrganization = [[[self childAtIndex:0] nodesForXPath:xpath namespaceMappings:dictionary error:nil] objectAtIndex:0];
        /*
        NSArray *nodes = [[self childAtIndex:0] nodesForXPath:@"//imscp:organizations" namespaceMappings:dictionary error:nil];
        NSString *xpath = [NSString stringWithFormat:@"//imscp:organization[@identifier='%@']", [[[nodes objectAtIndex:0] attributeForName:@"default"] stringValue]];
        _defaultOrganization = [[[self childAtIndex:0] nodesForXPath:xpath namespaceMappings:dictionary error:nil] objectAtIndex:0];
        */
        CXMLNode *titleElem = [[[self childAtIndex:0]
                                nodesForXPath:@"//lom:general/lom:title/lom:string"
                                namespaceMappings:dictionary
                                error:nil] objectAtIndex:0];
        _title = [[titleElem stringValue] copy];
    } @catch (NSException * e){
        self = nil;
        return nil;
    }
    return self;
}
@end
Later on in another class I do:
- (BOOL) isValidSCORMLesson:(NSString*) path {
    NSString *manifPath = [path stringByAppendingPathComponent:@"imsmanifest.xml"];
    if (![[NSFileManager defaultManager] fileExistsAtPath: manifPath isDirectory: NO])
        return NO;
    SimpleManifest *manifest = [[[SimpleManifest alloc] initWithPath:manifPath options:0 error:nil] autorelease];
    NSLog(@"%@", manifest.defaultOrganization);
    NSLog(@"%@", manifest.title);
    return (manifest!=nil);
}
It gives me tons of "pointer being freed was not allocated" errors...
The thing changes if I comment out the NSLog calls above or just log the manifest.title property.
Project is not using ARC, so I'm sure I'm doing something wrong with memory management.
Can someone please help me understand what I'm doing wrong? Thanks!
There isn't anything obviously wrong with that code that would cause malloc errors. Best guess is that there is a bug in the CXMLDocument class/library or some mistake in the way you are using it.
Note that "pointer being freed was not allocated" means that someone called free() (or dealloc, effectively) on a pointer to a piece of memory that was not allocated in the first place. The message usually names a function you can set a breakpoint on (malloc_error_break); breaking there gives you a backtrace of exactly where it happened.
Some comments:
(1) Do not @try/@catch in that fashion. Just don't catch at all. The pattern you are using will hide any errors. Exceptions are not meant to be recoverable in iOS/Cocoa.
(2) You can create an NSString instance directly from a file; no need to load via NSData first.
(3) You should use ARC.
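As an illustration of comment (2), here is a minimal sketch of reading a text file straight into an NSString, assuming the file is UTF-8 encoded. LoadManifestText is a hypothetical helper name chosen for this example; the Foundation call it wraps is stringWithContentsOfFile:encoding:error:.

```objc
#import <Foundation/Foundation.h>

// Reads a UTF-8 text file directly into an NSString.
// Returns nil (and fills `error`, if provided) when the file is missing
// or its contents are not valid UTF-8.
static NSString *LoadManifestText(NSString *path, NSError **error) {
    return [NSString stringWithContentsOfFile:path
                                     encoding:NSUTF8StringEncoding
                                        error:error];
}
```

The returned string could then be handed to initWithXMLString:options:error: in place of the NSData round trip, and passing a real NSError pointer instead of nil makes load failures visible.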