A clean and robust way to parse URL strings in Objective C - ios

I have a requirement to take a string that represents a URL that can be in many formats and standardise it so it conforms with the URL spec.
If the URL string does not have a scheme, or it has a scheme that is not 'http' or 'https', it should use a default scheme.
I wanted to use NSURLComponents, but if no scheme is provided it parses the host as a path:
NSURLComponents *components = [NSURLComponents componentsWithString:@"www.google.com.au"];
components.scheme = @"http";
NSLog(@"1: %@", components.path);
NSLog(@"2: %@", components.host);
NSLog(@"3: %@", components.string);
testtest[2619:869020] 1: www.google.com.au
testtest[2619:869020] 2: ((null))
testtest[2619:869020] 3: http:www.google.com.au <-- Invalid
Therefore I ended up with this category on NSString:
#define DEFAULT_SCHEME @"http"
@implementation NSString (standardiseUrlFormat)
- (NSString *)standardiseUrlFormat {
    NSURLComponents *components = [NSURLComponents componentsWithString:self];
    BOOL hasScheme = components.scheme != nil;
    // If no scheme or an invalid scheme is provided, default to http
    if (!hasScheme) {
        // We have to use string concatenation here because NSURLComponents will
        // put the hostname as the path if there is no scheme
        return [NSString stringWithFormat:@"%@://%@", DEFAULT_SCHEME, self];
    }
    // Now we know that a scheme exists, check if it is a correct scheme
    if (![components.scheme isEqualToString:@"http"] &&
        ![components.scheme isEqualToString:@"https"]) {
        // Overwrite scheme if not supported
        components.scheme = DEFAULT_SCHEME;
    }
    return [components string];
}
@end
With the following output:
NSLog(@"1: %@", [@"http://www.google.com" standardiseUrlFormat]);
NSLog(@"2: %@", [@"www.google.com" standardiseUrlFormat]);
NSLog(@"3: %@", [@"https://www.google.com" standardiseUrlFormat]);
NSLog(@"4: %@", [@"https://www.google.com/some_path" standardiseUrlFormat]);
NSLog(@"5: %@", [@"www.google.com/some_path" standardiseUrlFormat]);
testtest[7411:944022] 1: http://www.google.com
testtest[7411:944022] 2: http://www.google.com
testtest[7411:944022] 3: https://www.google.com
testtest[7411:944022] 4: https://www.google.com/some_path
testtest[7411:944022] 5: http://www.google.com/some_path
Can anyone suggest a cleaner solution that doesn't use two methods (NSURLComponents and string concatenation) to construct the string?

Don't use string concatenation at all. Use NSURLComponents to form the desired NSURL; that's what it's for. For example, if you don't like what the scheme is, set the scheme to what you do want.
EDIT: I guess I was thinking that, having detected that this is a hostless URL, you would rejigger it by hand, e.g.
let s = "www.apple.com/whatever" as NSString
let arr = s.pathComponents
let c = NSURLComponents()
c.scheme = "http"
c.host = arr[0]
c.path = "/" + arr.dropFirst().joined(separator: "/")
But perhaps this can't be done, and the problem really is that a URL without a scheme is more or less not a URL.
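The by-hand split described above can be sketched in Objective-C without any string concatenation. This is only a sketch under stated assumptions: the function name `StandardisedURLString` and the `defaultScheme` parameter are hypothetical, and the input is assumed to be a web address with no port or user info (a string such as `www.example.com:8080` would have its host parsed as a scheme, which this sketch does not handle).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not part of Foundation.
// Moves a host that was parsed as the first path segment into the host
// component, then forces the scheme to http/https or the given default.
static NSString *StandardisedURLString(NSString *input, NSString *defaultScheme) {
    NSURLComponents *components = [NSURLComponents componentsWithString:input];
    if (components == nil) {
        return nil; // the string could not be parsed at all
    }

    if (components.host == nil) {
        // No authority was found, so the host is sitting at the front of the path.
        NSString *path = components.path ?: @"";
        NSRange slash = [path rangeOfString:@"/"];
        NSString *hostPart = (slash.location == NSNotFound)
            ? path
            : [path substringToIndex:slash.location];
        NSString *pathPart = (slash.location == NSNotFound)
            ? @""
            : [path substringFromIndex:slash.location];
        if (hostPart.length == 0) {
            return nil; // nothing usable as a host
        }
        components.path = pathPart;
        components.host = hostPart;
    }

    NSString *scheme = components.scheme.lowercaseString;
    if (![scheme isEqualToString:@"http"] && ![scheme isEqualToString:@"https"]) {
        components.scheme = defaultScheme; // missing or unsupported scheme
    }
    return components.string;
}
```

Called as `StandardisedURLString(@"www.google.com/some_path", @"http")`, this should produce the same strings as the category in the question, while keeping all URL assembly inside NSURLComponents.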

Related

NSURLComponents componentsWithString - Rule

I'm writing unit tests to test out a URL generator class.
I'm using [NSURLComponents componentsWithString:] to generate the final URL object.
Is there a rule regarding how componentsWithString escapes forward slashes (/)?
Case 1:
NSURLComponents *urlComponents = [NSURLComponents componentsWithString:@"/foo"];
urlComponents.scheme = @"http";
urlComponents.host = [NSString stringWithFormat:@"www.bar.com"];
// [urlComponents URL] = http://www.bar.com/foo - Seems okay
Case 2:
NSURLComponents *urlComponents = [NSURLComponents componentsWithString:@"////foo"];
urlComponents.scheme = @"http";
urlComponents.host = [NSString stringWithFormat:@"www.bar.com"];
// [urlComponents URL] = http://www.bar.com//foo
Case 3:
NSURLComponents *urlComponents = [NSURLComponents componentsWithString:@"//////foo"];
urlComponents.scheme = @"http";
urlComponents.host = [NSString stringWithFormat:@"www.bar.com"];
// [urlComponents URL] = http://www.bar.com////foo
Why do Case 2 and 3 reduce the number of slashes to 2 and 4 respectively?
Your cases 2 and 3 don't conform to the RFC 3986 path format that the NSURLComponents documentation refers to: https://developer.apple.com/documentation/foundation/nsurlcomponents?language=objc
The NSURLComponents class is a class that is designed to parse URLs based on RFC 3986 and to construct URLs from their constituent parts.
The path section of RFC 3986 (https://www.rfc-editor.org/rfc/rfc3986#section-3.3) says that a path can't begin with // unless there is an authority component, so a leading // in your string is read as the start of an authority rather than as part of the path:
If a URI
does not contain an authority component, then the path cannot begin
with two slash characters ("//").
If you adjust your cases 2 and 3 to have at least one character between the slashes, like this:
NSURLComponents *urlComponents = [NSURLComponents componentsWithString:@"/a/////foo"];
I believe it should output the correct number of slashes.
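To see where the slashes go, it can help to look at the parsed path before setting the scheme and host. The following is a minimal sketch; `ParsedPath` is a hypothetical helper name, and the values in the comments are inferred from the URLs reported in the question rather than from separate measurements.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the path that NSURLComponents parses out of a string.
static NSString *ParsedPath(NSString *string) {
    return [NSURLComponents componentsWithString:string].path;
}

// ParsedPath(@"/foo")      -> "/foo"     (case 1)
// ParsedPath(@"////foo")   -> "//foo"    (case 2: the first "//" introduces an empty authority)
// ParsedPath(@"//////foo") -> "////foo"  (case 3)
```

If the extra slashes really are wanted in the final URL, one option is to start from an empty NSURLComponents and assign the `path` property directly instead of parsing the string.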

How to get Values from a NSString

I have an NSString. It is a URL I am getting when using Universal Links. I want to get the id value from it. Are there any direct methods in the SDK, or do we need to split the string with componentsSeparatedByString:?
Below is the NSString/URL:
https://some.com/cc/test.html?id=3039#value=test
I want to get two things: "test" from test.html and "id" value.
Use NSURLComponents created from an NSURL or NSString of your URL.
From there you can use path to get the /cc/test.html part. Then use lastPathComponent to get test.html and finally use stringByDeletingPathExtension to get test.
To get the "id" value start with the components' queryItems value. Iterate that array finding the NSURLQueryItem with the name of "id" and then get its value.
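The two steps above can be sketched as a pair of small functions. The names `PageNameFromComponents` and `QueryValueForName` are hypothetical, chosen only for this illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: "/cc/test.html" -> "test.html" -> "test".
static NSString *PageNameFromComponents(NSURLComponents *components) {
    return components.path.lastPathComponent.stringByDeletingPathExtension;
}

// Hypothetical helper: returns the value of the first query item with the
// given name, or nil if there is no such item.
static NSString *QueryValueForName(NSURLComponents *components, NSString *name) {
    for (NSURLQueryItem *item in components.queryItems) {
        if ([item.name isEqualToString:name]) {
            return item.value;
        }
    }
    return nil;
}
```

For example, with `NSURLComponents *c = [NSURLComponents componentsWithString:@"https://some.com/cc/test.html?id=3039&value=test"];`, `PageNameFromComponents(c)` gives `test` and `QueryValueForName(c, @"id")` gives `3039`.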
You could create NSURLComponents from the URL string and get the parameters from its queryItems property, which returns an array of NSURLQueryItem objects:
NSURLComponents *components = [NSURLComponents componentsWithString:@"https://some.com/cc/test.html?id=3039&value=test"];
NSArray *array = [components queryItems];
for (NSURLQueryItem *item in array) {
    NSLog(@"Name: %@, Value: %@", item.name, item.value);
}
NSLog(@"Items: %@", array);
We can write an extension:
extension URL {
    func getParamValue(parameterName: String) -> String? {
        guard let components = URLComponents(string: absoluteString) else { return nil }
        return components.queryItems?.first(where: { $0.name == parameterName })?.value
    }
}
Now you can call it like this:
let someURL = URL(string: "https://some.com/cc/test.html?id=3039&value=test")!
someURL.getParamValue(parameterName: "id") // 3039
someURL.getParamValue(parameterName: "value") // test

Youtube URL validation ios

I am using this method to validate youtube url but it's not working.
- (BOOL)validateUrl:(NSString *)candidate
{
    NSString *urlRegEx = @"(http://youtu.be/[a-zA-Z0-9_-]+)";
    urlTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", urlRegEx];
    return [urlTest evaluateWithObject:candidate];
}
- (BOOL)validateUrl1:(NSString *)candidate1
{
    NSString *urlRegEx1 = @"(https://(www|m){0,1}.youtube.com/(watch|playlist)[?]v=[a-zA-Z0-9_-]+)";
    urlTest1 = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", urlRegEx1];
    return [urlTest1 evaluateWithObject:candidate1];
}
Even if I edit the URL and make it y.be instead of youtu.be, these methods still return YES. Kindly tell me what's wrong in my code.
If any one has a better RegEx please share that with me.
I would also like to know how to write a RegEx.
If you want to check if String is the link to the youtube video (not the link for channel, or for embedding video):
func isYoutubeLink(checkString checkString: String) -> Bool {
let youtubeRegex = "(http(s)?:\\/\\/)?(www\\.|m\\.)?youtu(be\\.com|\\.be)(\\/watch\\?([&=a-z]{0,})(v=[\\d\\w]{1,}).+|\\/[\\d\\w]{1,})"
let youtubeCheckResult = NSPredicate(format: "SELF MATCHES %@", youtubeRegex)
return youtubeCheckResult.evaluateWithObject(checkString)
}
NSString *urlString = @"URL_STRING"; // the string to check
// Note: SELF MATCHES only succeeds if the pattern matches the entire string.
NSString *urlRegEx = @"(?:(?:\\.be/|embed/|v/|\\?v=|&v=|/videos/)|(?:[\\w+]+#\\w/\\w(?:/[\\w]+)?/\\w/))([\\w-]+)";
NSPredicate *urlPredic = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", urlRegEx];
BOOL isValidURL = [urlPredic evaluateWithObject:urlString];
if (isValidURL)
{
    // your URL is valid
}
else
{
    // show an alert message for the invalid URL
}
Please try it and let me know whether it works.
If it is regex that you are looking for, then perhaps you could try this:
((?:https?:\/\/)?(?:www\.|m\.)?(?:youtube\.[a-z]{2,}|youtu\.be)(?:\/(?:playlist|watch\?|channel\/|user\/)[\w=]+)+)
It will match:
http://, https:// or none of these
www. or m. or none of these
youtube.<anyTLD> or youtu.be
/
playlist, watch?, channel or user
one or more characters from a-z, A-Z, 0-9, _ and =
This should effectively match most urls on Youtube apart from the homepage, which if requested I could add in with a bit of tweaking.
In an Objective-C string literal you have to escape each backslash again (I haven't got much experience with Objective-C):
((?:https?:\\/\\/)?(?:www\\.|m\\.)?(?:youtube\\.[a-z]{2,}|youtu\\.be)(?:\\/(?:playlist|watch\\?|channel\\/|user\\/)[\\w=]+)+)
Here is an example of it working in Javascript, Python and PCRE:
https://regex101.com/r/sG1xP7/1
I hope this helps you
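As a rough sketch of how the pattern above could be used from Objective-C, the function below compiles it with NSRegularExpression. The function name `IsYouTubeURLString` is hypothetical, and the `^...$` anchors and the case-insensitive option are additions made here so that only a match of the whole string counts. Note that, as written, the pattern requires one of playlist, watch?, channel/ or user/ after the host, so short youtu.be/<id> links do not match it.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if the whole string matches the answer's pattern.
static BOOL IsYouTubeURLString(NSString *candidate) {
    if (candidate.length == 0) {
        return NO;
    }
    // Same pattern as above, with backslashes doubled for the string literal
    // and anchored with ^...$ (an addition for this sketch).
    NSString *pattern =
        @"^(?:https?://)?(?:www\\.|m\\.)?(?:youtube\\.[a-z]{2,}|youtu\\.be)"
        @"(?:/(?:playlist|watch\\?|channel/|user/)[\\w=]+)+$";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        return NO; // the pattern failed to compile
    }
    NSRange fullRange = NSMakeRange(0, candidate.length);
    return [regex firstMatchInString:candidate options:0 range:fullRange] != nil;
}
```

With this sketch, `IsYouTubeURLString(@"https://www.youtube.com/watch?v=abc123")` returns YES, while `IsYouTubeURLString(@"https://y.be/abc123")` returns NO.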
For Swift 5:
func isValidYouTubeLink(givenString: String) -> Bool {
let youtubeRegex = "(http(s)?:\\/\\/)?(www\\.|m\\.)?youtu(be\\.com|\\.be)(\\/watch\\?([&=a-z]{0,})(v=[\\d\\w]{1,}).+|\\/[\\d\\w]{1,})"
let youtubeCheckResult = NSPredicate(format: "SELF MATCHES %@", youtubeRegex)
return youtubeCheckResult.evaluate(with: givenString)
}

ios8 to validate urlField? [duplicate]

Help me write code like "if my string is a valid URL, do something".
Is it possible to write this in a couple of lines of code?
I will assume that by URL you are referring to a string identifying an internet resource location.
If you have an idea of the format of the input string, then why not manually check whether the string starts with http://, https:// or any other scheme you need? If you expect other protocols, you can also add them to the check list (e.g. ftp://, mailto:, etc.)
if ([myString hasPrefix:@"http://"] || [myString hasPrefix:@"https://"])
{
// do something
}
If you are looking for a more solid solution and detect any kind of URL scheme, then you should use a regular expression.
As a side note, the NSURL class is designed to express any kind of resource location (not just internet resources). That is why, strings like img/demo.jpg or file://bla/bla/bla/demo.jpg can be transformed into NSURL objects.
However, according to the documentation, +[NSURL URLWithString:] should return nil if the input string is not a valid internet resource string. In practice it doesn't.
+ (BOOL)validateUrlString:(NSString*)urlString
{
if (!urlString)
{
return NO;
}
NSDataDetector *linkDetector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink error:nil];
NSRange urlStringRange = NSMakeRange(0, [urlString length]);
NSMatchingOptions matchingOptions = 0;
if (1 != [linkDetector numberOfMatchesInString:urlString options:matchingOptions range:urlStringRange])
{
return NO;
}
NSTextCheckingResult *checkingResult = [linkDetector firstMatchInString:urlString options:matchingOptions range:urlStringRange];
return checkingResult.resultType == NSTextCheckingTypeLink
&& NSEqualRanges(checkingResult.range, urlStringRange);
}
I used this solution which is apparently a better and less complex check than a Regex check -
- (BOOL)isURL:(NSString *)inputString
{
NSURL *candidateURL = [NSURL URLWithString:inputString];
return candidateURL && candidateURL.scheme && candidateURL.host;
}
Try to create an NSURL with it, and see if it returns a non-nil result.
if ([NSURL URLWithString:text]) {
// valid URL
}
else {
// invalid URL
}

Get position of NSString in string - iOS

I am developing an iOS app and one of the things I need to do is to go over URLs and replace the first protocol section with my own custom protocol.
How can I delete the first few characters of a NSString before the "://"?
So for example I need convert the following:
http://website.com --> cstp://website.com
ftp://website.com --> oftp://website.com
https://website.com --> ctcps://website.com
The main problem I face, is that I can't just delete the first 'x' number of characters from the URL string. I have to detect how many characters there are till the "://" characters are reached.
So how can I count how many characters there are from the start of the string to the "://" characters?
Once I know this, I can then simply do the following to delete the characters:
int counter = ... number of characters ...
NSString *newAddress = [webURL substringFromIndex:counter];
Thanks for your time, Dan.
http://website.com is a URL, and http is the scheme part of the URL. Instead of string manipulation I would recommend using the NSURLComponents class, which is made exactly for this purpose: to inspect, create and modify URLs:
NSString *originalURL = @"http://website.com";
NSURLComponents *urlcomp = [[NSURLComponents alloc] initWithString:originalURL];
if ([urlcomp.scheme isEqualToString:@"http"]) {
    urlcomp.scheme = @"cstp";
} else if ([urlcomp.scheme isEqualToString:@"ftp"]) {
    urlcomp.scheme = @"oftp";
}
// ... handle remaining cases ...
NSString *modifiedURL = [urlcomp string];
NSLog(@"%@", modifiedURL); // cstp://website.com
If the number of cases grows then a dictionary mapping is easier to manage:
NSDictionary *schemesMapping = @{
    @"http" : @"cstp",
    @"ftp" : @"oftp",
    @"https" : @"ctcps" };
NSURLComponents *urlcomp = [[NSURLComponents alloc] initWithString:originalURL];
NSString *newScheme = schemesMapping[urlcomp.scheme];
if (newScheme != nil) {
    urlcomp.scheme = newScheme;
}
NSString *modifiedURL = [urlcomp string];
You can use:
NSRange range = [urlString rangeOfString:@"://"];
range.location is the index at which "://" starts (or NSNotFound if the string does not contain it, so check for that first). You can use it like this:
NSString *newAddress = [urlString substringFromIndex:range.location];
and prepend your own scheme:
NSString *finalAddress = [NSString stringWithFormat:@"%@%@", prefixString, newAddress];
