I came across a weird behaviour with the String.removeSubrange function.
This is what the documentation says:
Removes the characters in the given range.
Parameters
bounds The range of the elements to remove. The upper and lower bounds
of bounds must be valid indices of the string and not equal to the
string’s end index.
bounds The range of the elements to remove. The upper and lower bounds
of bounds must be valid indices of the string.
The documentation already states that the range cannot include the endIndex of the string, but I think that should be changed.
Let's look at an example to see why.
I have a string "12345" and I want to remove the first three characters which would result in "45".
The code for that is the following:
// Remove the characters 123
var str = "12345"
let endRemoveIndex = str.index(str.startIndex, offsetBy: 2)
str.removeSubrange(str.startIndex...endRemoveIndex)
So far so good: I just create a closed range from startIndex to startIndex advanced by 2.
Now let's say I want to remove the characters "345". I would expect the following code to work:
str = "12345"
let startRemoveIndex = endRemoveIndex
str.removeSubrange(startRemoveIndex...str.endIndex)
However, this does not work, as the documentation has already mentioned.
It results in a fatal error saying:
Can't advance past endIndex
The code that works for removing the last three characters is the following:
// Remove the characters 345
str = "12345"
let startRemoveIndex = endRemoveIndex
str.removeSubrange(startRemoveIndex..<str.endIndex)
That in my opinion is syntactically incorrect, because the half-open range operator implies that the upper bound will not be included, but in this case it is.
What do you think about that?
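Hamish pointed out that String.endIndex is a "past the end" position, which is the position one greater than the last valid subscript argument. So startRemoveIndex..<str.endIndex excludes only that past-the-end position and still covers the last character.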
I've been looking for a good way to see if a string of items are all numbers, and thought there might be a way of specifying a range from 0 to 9 and seeing if they're included in the string, but all that I've looked up online has really confused me.
def validate_pin(pin)
(pin.length == 4 || pin.length == 6) && pin.count("0-9") == pin.length
end
The code above is someone else's work and I've been trying to identify how it works. It's a pin checker - takes in a set of characters and ensures the string is either 4 or 6 digits and all numbers - but how does the range work?
When I did this problem I tried to use is_a? Integer and a bunch of other things, including ranges such as (0..9), ("0..9") and ("0".."9"), to validate that a character is an integer. When I saw ("0-9") it confused the heck out of me, and half an hour of googling and YouTube has only left me with regex tutorials (which I'm interested in, but currently I'm just trying to get the basics down).
So to sum this up, my goal is to understand a more semantic/concise way to identify if a character is an integer. Whatever is the simplest way. All and any feedback is welcome. I am a new rubyist and trying to get down my fundamentals. Thank You.
Regex really is the right way to do this. It's specifically for testing patterns in strings. This is how you'd test "do all characters in this string fall in the range of characters 0-9?":
pin.match(/\A[0-9]+\z/)
This regex says "Does this string start and end with at least one of the characters 0-9, with nothing else in between?" - the \A and \z are start-of-string and end-of-string matchers, and the [0-9]+ matches any one or more of any character in that range.
You could even do your entire check in one line of regex:
pin.match(/\A([0-9]{4}|[0-9]{6})\z/)
Which says "Does this string consist of the characters 0-9 repeated exactly 4 times, or the characters 0-9, repeated exactly 6 times?"
Ruby's String#count method does something similar to this, though it just counts the number of occurrences of the characters passed, and it uses something similar to regex ranges to allow you to specify character ranges.
The sequence c1-c2 means all characters between c1 and c2.
Thus, it expands the parameter "0-9" into the list of characters "0123456789", and then it tests how many of the characters in the string match that list of characters.
This will work to verify that a certain number of numbers exist in the string, and the length checks let you implicitly test that no other characters exist in the string. However, regexes let you assert that directly, by ensuring that the whole string matches a given pattern, including length constraints.
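To make the comparison concrete, here is a minimal sketch that puts the count-based check from the question next to the single-regex check. The method names validate_pin_count and validate_pin_regex are made up for this illustration, and match? assumes Ruby 2.4 or later.

```ruby
# Count-based check from the question: the length test is what rules out
# extra characters, because count only tallies the digits.
def validate_pin_count(pin)
  (pin.length == 4 || pin.length == 6) && pin.count("0-9") == pin.length
end

# Regex-based check: \A and \z anchor the pattern to the whole string,
# so length and content are asserted in one expression.
def validate_pin_regex(pin)
  pin.match?(/\A(?:[0-9]{4}|[0-9]{6})\z/)
end

%w[1234 123456 12345 12a4].each do |pin|
  puts "#{pin}: count=#{validate_pin_count(pin)} regex=#{validate_pin_regex(pin)}"
end
```

Both methods should agree on every input; the regex version simply states the rule in one place.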
Count everything non-digit in pin and check if this count is zero:
pin.count("^0-9").zero?
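A short sketch of how the negated set behaves; digits_only? is a hypothetical helper name used only for this example. Note that it says nothing about length, so an empty string also passes.

```ruby
def digits_only?(pin)
  # A leading "^" negates the set, so this counts every character
  # that is NOT in the range 0-9.
  pin.count("^0-9").zero?
end

puts "1234".count("^0-9")  # 0
puts "12a4".count("^0-9")  # 1
puts digits_only?("")      # true (nothing to count)
```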
Since you seem to be looking for answers outside regex and since Chris already spelled out how the count method was being implemented in the example above, I'll try to add one more idea for testing whether a string is an Integer or not:
pin.to_i.to_s == pin
What we're doing is converting the string to an integer, converting that result back to a string, and then testing to see if anything changed during the process. If the result is true, then you know nothing changed during the conversion to an integer and therefore the string is only an Integer.
EDIT:
The example above only works if the entire string is an Integer and won’t properly deal with leading zeros. If you want to check to make sure each and every character is an Integer then do something like this instead (it prefixes a "1" so leading zeros survive, without mutating pin):
"1#{pin}".to_i.to_s[1..-1] == pin
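As a rough illustration of the round-trip idea, the sketch below wraps both variants in methods. The names integer_string? and all_digits? are invented here, and the second method builds a new string with a "1" prefix rather than mutating pin.

```ruby
# Plain round trip: fails for leading zeros and accepts a leading sign.
def integer_string?(pin)
  pin.to_i.to_s == pin
end

# Prefix a "1" so leading zeros survive to_i, then drop the prefix again.
def all_digits?(pin)
  "1#{pin}".to_i.to_s[1..-1] == pin
end

puts integer_string?("0123")  # false (round trip yields "123")
puts all_digits?("0123")      # true
puts all_digits?("12a4")      # false (to_i stops at the "a")
puts integer_string?("-12")   # true (a sign counts as part of an Integer)
```

Like the count-based variants, all_digits? returns true for an empty string, so a length check is still needed for the PIN case.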
Part of the question seems to be exactly HOW the following portion of code is doing its job:
pin.count("0-9")
This piece of the code is simply returning a count of how many instances of the numbers 0 through 9 exist in the string. That's only one piece of the relevant section of code though. You need to look at the rest of the line to make sense of it:
pin.count("0-9") == pin.length
The first part counts how many instances then the second part compares that to the length of the string. If they are equal (==) then that means every character in the string is an Integer.
Sometimes negation can be used to advantage:
!pin.match?(/\D/) && [4,6].include?(pin.length)
pin.match?(/\D/) returns true if the string contains a character other than a digit (matching /\D/), in which case it would be negated to false.
One advantage of using negation here is that if the string contains a character other than a digit pin.match?(/\D/) would return true as soon as a non-digit is found, as opposed to methods that examine all the characters in the string.
I am checking the values of a string that is a unique identifier for a third-party service that has some strict rules about the identifier. If a duplicate is generated, I need to catch it and replace a character to make it unique. The rules: it must be a string; it must be <= 21 characters long; the last four characters are significant, come preset, and can't be altered; the first 15 characters are significant, come preset, and can't be altered. So I only have two characters that I can alter. Finally, another third-party system sets the string and will gladly duplicate it if the circumstances are right. They're always right. like. always... lol
At first I thought of using str.next! but that violates the last-four rule. Then I tried str.insert(-5, rand(9).to_s). That would alter one of the correct characters and make the string unique, but it violates the <= 21 characters rule.
str = "abcdefghijklmnoXX_123" (I can safely alter the XX)
str.next! (makes it unique but violates last four rule)
str.insert(-5, rand(9).to_s) (alters the correct characters and makes it unique, but violates the str.length rule)
How can I replace the correct character set without altering the string length or violating any further rules? Oh, It is also preferred that I not shorten the string length if possible.
I have assumed that the characters being replaced do not have to be random, but simply different from each other and different from all of the other characters in the string. If they are for some reason to be selected randomly, further specificity is required, specifically the collection of characters from which characters are to be drawn randomly. I have a further comment on this at the end of my answer.
REQD_BEGIN = 15
REQD_END = 4
PERMITTED_CHARS = ('a'..'z').to_a.join
#=> "abcdefghijklmnopqrstuvwxyz"
str = "abcdefrqsjklmnoXX_123"
nbr_replacements = str.size - REQD_BEGIN - REQD_END
#=> 2
available_chars =
PERMITTED_CHARS.delete(str[0,REQD_BEGIN].downcase +
str[-REQD_END, REQD_END].downcase)
#=> "ghiptuvwxyz"
str[0, REQD_BEGIN] + available_chars[0, nbr_replacements] +
str[-REQD_END, REQD_END]
#=> "abcdefrqsjklmnogh_123"
This does not modify ("mutate") str. To mutate the string, change the last line to:
str[REQD_BEGIN, nbr_replacements] = available_chars[0, nbr_replacements]
#=> "gh"
Now:
str #=> "abcdefrqsjklmnogh_123"
If the replacement characters are to be selected randomly (but satisfy the uniqueness properties set out at the onset), the constant PERMITTED_CHARS would be set equal to a string containing the characters from which a random sample would be drawn. available_chars would be computed as now, but available_chars[0, nbr_replacements] would be changed to available_chars.chars.sample(nbr_replacements).join (String has no sample method, so the string is first split into an array of characters).
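A minimal sketch of that random variant, under two assumptions that are not in the original: the helper name randomize_middle is invented for this example, and the fixed prefix and suffix are assumed not to contain characters that String#delete treats specially (such as a "-" between two characters, or a leading "^").

```ruby
REQD_BEGIN = 15
REQD_END = 4
PERMITTED_CHARS = ('a'..'z').to_a.join

# Returns a new string; the argument is not mutated.
def randomize_middle(str)
  nbr_replacements = str.size - REQD_BEGIN - REQD_END
  # Remove every permitted character that already occurs in the fixed parts.
  available_chars = PERMITTED_CHARS.delete(
    str[0, REQD_BEGIN].downcase + str[-REQD_END, REQD_END].downcase
  )
  # Array#sample(n) picks n distinct elements at random.
  replacement = available_chars.chars.sample(nbr_replacements).join
  str[0, REQD_BEGIN] + replacement + str[-REQD_END, REQD_END]
end

puts randomize_middle("abcdefrqsjklmnoXX_123")
```

The length, the first 15 characters and the last four characters are unchanged; only the two middle characters differ from run to run.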
Clearest for me would be something like:
prefix = str[0..14]
middle = str[15..16]
suffix = str[17..-1]
unique_id = prefix + middle.next + suffix
If I understand right.
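One caveat with this approach is that String#next can lengthen the string when it carries over (for example "ZZ".next is "AAA"), which would break the 21-character rule. The sketch below guards against that; bump_middle is a hypothetical name, and it assumes the layout from the question (15 fixed characters, then the alterable ones, then 4 fixed characters).

```ruby
def bump_middle(str)
  prefix = str[0, 15]     # first 15 characters, fixed
  middle = str[15..-5]    # everything between the fixed parts
  suffix = str[-4..-1]    # last four characters, fixed
  bumped = middle.next
  # A carry such as "ZZ" -> "AAA" would change the overall length.
  if bumped.length != middle.length
    raise ArgumentError, "no room left to increment #{middle.inspect}"
  end
  prefix + bumped + suffix
end

puts bump_middle("abcdefghijklmnoXX_123")  # abcdefghijklmnoXY_123
```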
I want to find the three most frequently used words in my UITextView and sort them by how often they occur.
For example:
"good good good very very good good. bad bad unfortunately bad."
It must do that:
good (5 times)
bad (3 times)
very (2 times)
How can I do this?
Thanks.
You can use String.components(separatedBy:) to get the words of textView.text, then you can use an NSCountedSet to get the count of each word.
You can of course tweak the separator characters used as an input to components(separatedBy:) to meet your exact criteria.
let textViewText = "good good good very very good good. bad bad unfortunately bad."
//separate the text into words and get rid of the "" results
let words = textViewText.components(separatedBy: [" ","."]).filter({ !$0.isEmpty })
//count the occurrence of each word
let wordCounts = NSCountedSet(array: words)
//sort the words by their counts in a descending order, then take the first three elements
let sortedWords = wordCounts.allObjects.sorted(by: {wordCounts.count(for: $0) > wordCounts.count(for: $1)})[0..<3]
for word in sortedWords {
print("\(word) \(wordCounts.count(for: word))times")
}
Output:
good 5times
bad 3times
very 2times
Here's a one liner that will give you the top 3 words in order of frequency:
let words = "good good good very very good good. bad bad unfortunately bad"
let top3words = Set(words.components(separatedBy:" "))
.map{($0,words.components(separatedBy:$0).count-1)}
.sorted{$0.1 > $1.1}[0..<3]
print(top3words) // [("good", 5), ("bad", 3), ("very", 2)]
It creates a set of the distinct words and then maps each of them to the count of its occurrences in the string (words). Finally it sorts the (word, count) tuples on the count and returns the first 3 elements.
[EDIT] The only issue with the above method is that, although it works with your example string, it assumes that no word is contained in another and that words are only separated by spaces.
To do a proper job, the words must first be isolated in an array, eliminating any special characters (i.e. non-letters). It may also be appropriate to ignore upper and lower case, but you didn't specify that and I didn't want to add to the complexity.
Here's how the same approach would be used on an array of words (produced from the same string):
let wordList = words.components(separatedBy:CharacterSet.letters.inverted)
.filter{!$0.isEmpty}
let top3words = Set(wordList)
.map{ word in (word, wordList.filter{$0==word}.count) }
.sorted{$0.1>$1.1}[0..<3]
let word = "sample string"
let firstLetter = Character(word.substringToIndex(advance(word.startIndex,1)).uppercaseString)
I got the above example from a tutorial. Does anyone know what they mean by "advance", and what the difference is between "substringToIndex" and "substringWithRange"?
This advance syntax is from Swift 1, it's different now.
Swift 2
let firstLetter = Character(word.substringToIndex(word.startIndex.advancedBy(1)).uppercaseString)
The advancedBy method moves the current index along the String.
With substringToIndex you slice a part of the String, beginning at the start of the String and ending at the index defined by advancedBy.
Here you advance by 1 in the String, so it means that substringToIndex will get the first character from the String.
Swift 3
The syntax has changed again, we now use substring and an index with an offset:
let firstLetter = Character(word.substring(to: word.index(word.startIndex, offsetBy: 1)).uppercased())
substringToIndex
Returns a new string containing the characters of the receiver up to,
but not including, the one at a given index.
Return Value A new string containing the characters of the receiver up to, but not including, the one at anIndex. If anIndex is
equal to the length of the string, returns a copy of the receiver.
substringWithRange
Returns a string object containing the characters of the receiver that
lie within a given range.
Return Value A string object containing the characters of the receiver that lie within aRange.
Special Considerations This method detects all invalid ranges (including those with negative lengths). For applications linked
against OS X v10.6 and later, this error causes an exception; for
applications linked against earlier releases, this error causes a
warning, which is displayed just once per application execution.
For more detail, see the Apple NSString Class Reference.
Your tutorial is outdated. advance was deprecated in Swift 2. Strings in Swift cannot be randomly accessed, i.e. there's no word[0] to get the first letter of the string. Instead, you need an Index object to specify the position of the character. You create that index by starting with another index, usually the startIndex or endIndex of the string, then advance it to the character you want:
let word = "sample string"
let index0 = word.startIndex // the first letter, an 's'
let index6 = word.startIndex.advancedBy(6) // the seventh letter, the whitespace
substringToIndex takes all characters from the left of string, stopping before the index you specified. These two are equivalent:
print("'\(word.substringToIndex(index6))'")
print("'\(word[index0..<index6])'")
Both print 'sample'
I do not understand why in the following code, the extended range that will be printed is
"location: 1, length: 1" . Why was the range length extended from 0 to 1?
NSString *text = @"abc";
NSRange range = NSMakeRange(1, 0);
NSRange extendedRange = [text rangeOfComposedCharacterSequencesForRange:range];
NSLog(@"extended range: location %lu, length: %lu", (unsigned long)extendedRange.location, (unsigned long)extendedRange.length);
The doc says that the result of this is:
The range in the receiver that includes the composed character
sequences in range.
with the following discussion
This method provides a convenient way to grow a range to include all composed character sequences it overlaps.
But the text @"abc" does not contain any composed character, which makes me think that the result should be the same range, unmodified, and anyway, I would think that a range of length 0 would not overlap any character.
This looks like a bug to me, but I might have missed something. Is that normal?
It's probably a bug.
The implementation of rangeOfComposedCharacterSequencesForRange: just calls rangeOfComposedCharacterSequenceAtIndex: twice, with the start and end indexes of the range, and returns the combined range.
The documentation does not explicitly state that the characters at the edges of the provided range are never included but I agree that the observed behavior feels wrong.
You should file a bug.
The documentation of rangeOfComposedCharacterSequencesForRange: says:
Return Value
The range in the receiver that includes the composed character sequences in range.
Discussion
This method provides a convenient way to grow a range to include all composed character sequences it overlaps.
Since the location is valid, the method considers that the range overlaps the character at this location.