regular expression for objective c to check consecutive incremental number - ios

I want to check if the number pin user entered is too simple. 3 cases would fail, repeating numbers, like "1111"; increasing ones like "1234"; decreasing ones like "4321". Is there a regex which could check these restrictions?

Regex can match specific text pattern but it can't understand its context..Yes you want to check for increasing numbers but there's no such thing as increasing pattern in regex.
You can check for repeated numbers using ^(\d)\1+$
But to check for increasing,decreasing numbers you would have to parse the string to int and check if they are in increasing or decreasing order manually using the %,/ operations

Related

Google Sheets - Split Data

I have these data in Google Sheets
$71,675_x000d_
$80,356_x000d_
$107,361_x000d_
$123,393_x000d_
$116,878
I want them to be split into different columns.
However, when I do so using Data > Split Data into Different Columns, it separates $71 and 675_x000d_ but I need the $71,275 and remove the xoood
Please note that the last number doesn't have those extra characters.
Please help.
Your post says you want to "remove the x000d (that is, extract only the dollar amounts). That said, let's say your raw data starts in A2 (i.e., the data is in A2:A). Place the following formula into the first cell of another otherwise empty column (e.g., B1):
=ArrayFormula({"Extracted";IF(A2:A="",,REGEXEXTRACT(SUBSTITUTE(A2:A&"_",",",""),"\d+"))})
How It Works:
ArrayFormula(...) signifies that we'll be processing an entire range and not just one cell.
The outer curly brackets {...} signify that a virtual array will be formed from non-like or non-contiguous pieces.
The first piece of the virtual array is the header. Here, that is "Extracted"; but you can change it as you like.
The semicolon means "place the next information below the previous part."
IF(A2:A="",, ...) is a standard check that basically says "Don't try to process any blank cells in Column A"; or alternatively worded, "If any cell in A2:A is blank/null, do nothing."
Skipping the REGEXEXTRACT for now, A2:A&"_" appends an underscore to every entry in A2:A. This allows entries in A2:A that are just a dollar amount (e.g., from the post, $116,878) to have a consistent symbol following them if not already there. (And adding the underscore to anything that already has an underscore won't matter, because we won't be extracting that far out.)
Now that we've got the new strings, we SUBSTITUTE every comma for a null (i.e., delete all commas).
Finally, REGEXEXTRACT will take all of the virtually modified strings and extract \d+, which means only digits (\d) in an unbroken sequence of any length greater than 0 (+). Note that REGEXEXTRACT will only return the first such match it encounters as written, so 000 will not be extracted.
An IFERROR wrap is placed around the REGEXEXTRACT, just in case you have any situations in real life that don't have any sequence of numbers at all. In these cases, nothing will be returned (whereas, without the IFERROR, an error would have been returned).
Once the extraction is done, you can apply Format > Number > Currency (rounded) to the entire column.
Addendum:
After an additional comment (below), it appears that the raw data is in Column T, that all five entries are in one cell and that the OP would like all five amounts extracted across each row. That being the case, assuming that Columns U:Y are empty to start, place the following in cell U1 (not U2):
=ArrayFormula({"Va11","Val2","Val3","Val4","Val5";IF(T2:T="",,IFERROR(REGEXEXTRACT(SUBSTITUTE(T2:T&"_",",",""),REPT("\$(\d+)[^\$]*",5))))})
This works much the same way as the previous formula. The differences:
There are five headers now.
You'll see REPT(...,5) here. This is an easy way to repeat the same extraction five times.
That repeated extraction is now the following:
\$(\d+)[^\$]*
The backslash in front of the dollar signs means to treat those symbols as literals instead of as their usual meaning (i.e., end-of-string). So the extraction reads as follows:
\$ anything that starts with a dollar sign
(\d+) extract what is between the ( ), which is any group of digits [^$]*` followed by any number (including 0) characters that are not dollar signs
As I said, the REPT will repeat this five times; so five groups matching this pattern will be extracted.
Understand that if you have any groups that don't follow the pattern exactly, resulting in five matching extractions, nothing will be returned.
Be sure to format U:Y as currency rounded, or you will wind up with some of those numbers translating as raw dates and therefore being completely off.
Please use the following formula and format cells to your needs.
=ArrayFormula(IFERROR(SPLIT(REGEXREPLACE(A2:A,"\n|_x000d_","√"),"√")))
The big advantage of the above formula compared to others is that it works for any number of lines included within a single cell (as shown in the image below).
Functions used:
ArrayFormula
IFERROR
SPLIT
REGEXREPLACE
You can use SPLIT function:
=ArrayFormula(IF(LEN(A:A),SPLIT(A:A,"_x000d_",FALSE),""))

Character Replacements

I have a UniCode string UniStr.
I also have a MAP of { UniCodeChar : otherMappedStrs }
I need the 'otherMappedStrs' version of UniStr.
Eg: UniStr = 'ABC', MAP = { 'A':'233','B':'#$','C':'9ij' }, Result = '233#$9ij'
I have come up with the formula below which works;
=ArrayFormula(JOIN("",VLOOKUP(REGEXEXTRACT(A1,REPT("(.)",LEN(A1))),MapRange,2,FALSE)))
The MAP being a whole character set (40 chars) is quite large.
I need to use this function in multiple spreadsheets. How can I subsume the MAP into the formula for portability ?
Is there a better way to iterate a string other than the REGEXEXTRACT method in formula ? This method has limitation for long strings.
I also tested the below formula. Problem here is it gives 2 results (or the size of the array within SUBSTITUTE replacement). If 3 substitutions made, then it gives three results. Can this be resolved ?
=ArrayFormula(SUBSTITUTE(A1,{"s","i"},{"#","#"}))
EDIT;
#Tom 's first solution appears best for my case (1) REGEX has an upper limit on search criteria which does not hinder in your solution (2) Feels fast (did not do empirical testing) (3) This is a better way to iterate string characters, I believe (you answered my Q2 - thanks)
I digress here. I wish google would introduce Named-Formulas or Formula-Aliases. In this case, hypothetically below. I have sent feed back along those lines many times. Nothing :(
MyFormula($str) == ArrayFormula(join(,vlookup(mid($str,row(indirect("1:"&len($str))),1), { "A","233";"B","#$";"C","9ij" },2,false)))
Not sure how long you want your strings to be, but the more traditional
=ArrayFormula(join(,vlookup(mid(A1,row(indirect("1:"&len(A1))),1), { "A","233";"B","#$";"C","9ij" },2,false)))
seems a bit more robust for long strings.
For a more radical idea, supposing the maximum length of your otherMappedStrings is 3 characters, then you could try:
=ArrayFormula(join(,trim(mid("233 #$9ij",find(mid(A1,row(indirect("1:"&len(A1))),1), "ABC")*3-2,3))))
where I have put a space in before #$ to pad it out to 3 characters.
Incidentally the original VLOOKUP is not case sensitive. If you want this behaviour, use SEARCH instead of FIND.
You seem to have several different Qs, but considering only portability, perhaps something like the following would help:
=join(,switch(arrayformula(regexextract(A1&"",rept("(.)",len(A1)))),"A",233,"B","#$","C","9ij"))
extended with 37 more pairs.

Create custom formula from Mathematical numbers and operators

I need to create a customized formula generated from Numbers (1-100) & Operators (+,-,*,/). The precedence of operation has to be dynamic. User can choose the numbers/operators one-by-one. But how to arrange the precedence for where to put the Brackets and all to make it a valid expression. For Example :
(21*430)-17+((45/5)+135)+243-16
((97+74)+242-(169+13)*4)/76-21+(36*4012)
We need to take numbers/operators as inputs but the order/type can be dynamic & random. The only limitation is in an expression the total numbers used is fixed (here in 1. 8 numbers & 2. 10 numbers).
Please suggest a feasible solution to this.

Why is it best to store a telephone number as a string vs. integer?

As the question states, why is it considered best practice to store telephone numbers as strings rather than integers in the telephone_number column?
Not sure I understand the rationale for this. Please help clear this up!
Thanks!
Telephone numbers are strings of digit characters, they are not integers.
Consider for example:
Expressing a telephone number in a different base would render it meaningless
Adding or multiplying two telephone numbers together, or any math operation on a phone number, is meaningless. The result is not another telephone number (except by conicidence)
Telephone numbers are intended to be entered "as-is" into a connected device.
Telephone numbers may have leading zeroes.
Manipulations of telephone numbers, such as adding an area code, are String operations.
Storing the string version of the telephone number makes this clear and unambiguous.
History: On old pulse-encoded dial systems, the code for each digit in a telephone number was sent as the same number of pulses as the digit (or 10 pulses for "0"). That may be why we still use digits to represent the parts of a phone number. See http://en.wikipedia.org/wiki/Pulse_dialing
What Neil Slater said is correct. I would add that there are lots of edge cases where you can't express a telephone number as a number value consistently.
For example, consider these numbers:
011-123-555-1212
+11-123-555-1212
+1 (112) 355-5121 x2
These are all potentially valid phone numbers, but they mean very different things. Yet, in integer form, they are all 111235551212.
If you are going to store the number for display from input, then you must use a string.
However, while it is true that no mathematical operations can be performed on a number that have meaning. Using a number in hashsets and for indexing is quicker than using a string. So provided you can guarantee or homogenise your set of numbers, so they are all consistent, then you may see better performance operating on a number.
For example, in the Telco world, rating calls for a given customer includes a lot of searching on their CLI and in this situation it is faster and cheaper to search by integer. Generally though strings will be fine performance wise, it is only where performance matters and you have multiple searches to perform for a huge range of numbers - i.e. Rating 250 million calls across 2 million lines and 2000 tariffs. In memory rating also gets expensive, so being able to use a 64bit int or uint is cheaper when dealing with these volumes.
Consider these phone numbers for example
099-1234-56789 or +91-8907-687665.
In this case,if the phone_number attribute is of type integer,then it can't accept these values.It should be a string to hold these type of values.So string is always preferred than integer
There is several reasons for this :
Phone numbers often start with a "0" : an integer will remove all leading "0"s
Phone number can have special char : +, (, -, etc. (for exemple : +33 (0)6 12 23 34)
You cannot perform operations on phones : adding phones, for instance, would be meaningless
Phone number may be internationalised, i.e. different format for different people, thus not possible with integers
There might be other reasons, but I guess that's already a fair amount of those :)

How to get a % difference of two NSStrings

I'm thinking this may be impossible to do resonably, but I figured I would take a shot at it. So lets say I have two NSStrings. One is #"Singin' In The Rain" and the other is #"Singing In The Rain". These strings are very similar, but have a small difference. I'm trying to find a way where I could write something like the following:
NSString *stringOne = #"Singin' In The Rain";
NSString *stringTwo = #"Singing In The Rain";
float dif = [stringOne differenceFrom:stringTwo];
//dif = .9634 or something like that
One project that I did find similar to this was taken from the previous similar question on Stack Overflow: Check if two NSStrings are similar. However, this simply returns a BOOL which isn't as accurate as I need it to be. I also tried looking into the compare: documentation for NSString but it all looked too basic. Another similar thing I found is at https://gist.github.com/iloveitaly/1515464. However, this gives varying results, even saying two of the same string are different occasionally. Any advice would be much appreciated.
The question is a little vague, but I would assume that the most satisfactory results will come from using NSLinguisticTagger. If you parse each for tags with the NSLinguisticTagSchemeLexicalClass scheme then your string will be broken down into verbs, nouns, adjectives, etc. In your example, even if you weren't spotting that singin' and singing are the same, you'd spot the other three words are the same and that the thing at the end is a noun, so they're both about doing something in the same thing.
It'd probably be wise to use something like a BK-Tree to compare individual words where you suspect there may be a match (a noun obviously doesn't match an adverb but two nouns may match even if spellings differ).
Another off the wall suggestion:
The source, and hence the algorithm, for diff and similar programs is easily available. These compare input on a line-by-line basis and detect insertions, deletions and changes.
When comparing text strings for "closeness" then the insertion, deletion or changing of words seems as good a measure as any.
So:
Break each string into "words" (white space separated should be sufficient).
Compare the two lists using the diff algorithm, treating each "word" as a "line", use a re-sync length of 1 (the number of "lines" that need to be the same to treat the two inputs as back in sync)
Calculate the "closeness" as the number of insertions/deletions/changes compared to the total word count.
For the two example strings this would give 1:4 changes or 75% similar.
If you want greater granularity for each change split the two words into characters and repeat the algorithm giving you a fraction the word is similar by (as opposed to the whole word).
For the two example strings this would give 3 6/7 words out of 4, or 96% similar.
I'd recommend dynamic time warping for such comparisons:
http://en.wikipedia.org/wiki/Dynamic_time_warping
This will however return distance between two strings (so you'll get 0 for identical), but this the best starting point I can think of.

Resources