I am trying to understand what does _rawBits really mean in Swift.String.Index(_rawBits:). If you print a String.Index, you get something like Swift.String.Index(_rawBits: 983040). But what does that really mean?
Can I (mathematically) calculate the actual index in a string using this rawBits number at all? either base 32, 16, or whatever else?
Swift Range uses String.Index as its upper and lower bounds.
Swift Range uses String.Index as its upper and lower bounds.`
No, it doesn't. Range<String.Index> does, but that's only one particular type of Range, which is otherwise generic over a type variable called Bound.
The _rawBits are an internal implementation detail of the string indices. You should treat them as an opaque type, that you can only manipulate using the corresponding index apis on String, Substring and friends.
Related
I'm new to Dart. The documentation say: "To test whether two objects x and y represent the same thing, use the == operator. (In the rare case where you need to know whether two objects are the exact same object, use the identical() function instead.)"
So, if type this code:
var foo = 'bar';
var baz = 'bar';
print(identical(foo, baz));
If i've well understood, foo and bar do not reference the same object. So identical() must return false, isn't it ?
But it's not the case, at least in DartPad.
Where is the problem.
In this case foo and bar do reference the same object.
That is because the compiler canonicalizes string literals.
The specification requires most constants to be canonicalized. If you create const Duration(seconds: 1) in two places, it will become the same object. Integers, doubles and booleans are always canonicalized, whether constant or not (the language pretends that there is only one instance per value).
Strings are special in that the specification is not entirely clear on whether they need to be canonicalized or not, but constant strings need to be canonicalized for constants to make sense, and all the compilers do that.
A literal is a constant expression, so string literals are always canonicalized. That means that 'bar' denotes the same object no matter where it occurs in your code.
For several built-in "literals", you will always get identical true for equal values.
bool
String
int
double (I think)
I have some very simple code that does a calculation and converts the resulting double to an int.
let startingAge = (Double(babyAge/2).rounded().nextDown)
print(startingAge)
for each in 0..<allQuestions.count {
if allQuestions[each] == "\(Int(startingAge))"
The first print of startingAge gives me the correct answer, for example 5.0. But when it converts to an Int, it gives me an answer of 4. When the Double is 6.0, the int is 5.
I'm feeling stupid, but can't figure out what I'm doing wrong.
When you call rounded(), you round your value to the nearest integer.
When you call .nextDown, you get the next possible value less than the existing value, which means you now have the highest value that's less than the nearest integer to your original value. This still displays as the integer when you print it, but that's just rounding; it's really slightly less than the integer. So if it's printing as "4.0", it's really something like 3.9999999999999 or some such.
When you convert the value to an Int, it keeps the integer part and discards the part to the right of the decimal. Since the floating-point value is slightly less than the integer you rounded to thanks to .nextDown, the integer part is going to be one less than that integer.
Solution: Get rid of the .nextDown.
When you cast you lose precession.
In your case the line returns a double: Assume baby age is 9 then startingAge is 3.999999
let startingAge = (Double(babyAge/2).rounded().nextDown)
and when you print it your answer becomes 3
print("\(Int(startingAge))")
To fix this use this line instead:
let startingAge = (Double(babyAge/2).rounded().nextDown).rounded()
This is what nextdown does, it does not round values, and if the number is
a floating point number it becomes slightly less. If the number was to be an int it would become 1 less I presume.
I'm creating an extension for String and I'm trying to decide what proper/expected/good behavior would be for a subscript operator. Currently, I have this:
// Will crash on 0 length strings
subscript(kIndex: Int) -> Character {
var index = kIndex
index = index < 0 ? 0 : index
index = index >= self.length ? self.length-1 : index
let i = self.startIndex.advancedBy(index)
return self.characters[i]
}
This causes all values outside the range of the string to be capped to the edge of the string. While this reduces crashing from passing a bad index to the subscript, it doesn't feel like the right thing to do. I am unable to throw an exception from a subscript and not checking the subscript causes a BAD_INSTRUCTION error if the index is out of bounds. The only other option I can think of is to return an optional, but that seems awkward. Weighing the options, what I have seems to be the most reasonable, but I don't think anybody using this would expect a bad index to return a valid result.
So, my question is: what is the "standard" expected behavior of the subscript operator and is returning a valid element from an invalid index acceptable/appropriate? Thanks.
If you're implementing a subscript on String, you might want to first think about why the standard library chooses not to.
When you call self.startIndex.advancedBy(index), you're effectively writing something like this:
var i = self.startIndex
while i < index { i = i.successor() }
This occurs because String.CharacterView.Index is not a random-access index type. See docs on advancedBy. String indices aren't random-access because each Character in a string may be any number of bytes in the string's underlying storage — you can't just get character n by jumping n * characterSize into the storage like you can with a C string.
So, if one were to use your subscript operator to iterate through the characters in a string:
for i in 0..<string.characters.count {
doSomethingWith(string[i])
}
... you'd have a loop that looks like it runs in linear time, because it looks just like an array iteration — each pass through the loop should take the same amount of time, because each one just increments i and uses a constant-time access to get string[i], right? Nope. The advancedBy call in first pass through the loop calls successor once, the next calls it twice, and so on... if your string has n characters, the last pass through the loop calls successor n times (even though that generates a result that was used in the previous pass through the loop when it called successor n-1 times). In other words, you've just made an O(n2) operation that looks like an O(n) operation, leaving a performance-cost bomb for whoever else uses your code.
This is the price of a fully Unicode-aware string library.
Anyhow, to answer your actual question — there are two schools of thought for subscripts and domain checking:
Have an optional return type: func subscript(index: Index) -> Element?
This makes sense when there's no sensible way for a client to check whether an index is valid without performing the same work as a lookup — e.g. for a dictionary, finding out if there's a value for a given key is the same as finding out what the value for a key is.
Require that the index be valid, and make a fatal error otherwise.
The usual case for this is situations where a client of your API can and should check for validity before accessing the subscript. This is what Swift arrays do, because arrays know their count and you don't need to look into an array to see if an index is valid.
The canonical test for this is precondition: e.g.
func subscript(index: Index) -> Element {
precondition(isValid(index), "index must be valid")
// ... do lookup ...
}
(Here, isValid is some operation specific to your class for validating an index — e.g. making sure it's > 0 and < count.)
In just about any use case, it's not idiomatic Swift to return a "real" value in the case of a bad index, nor is it appropriate to return a sentinel value — separating in-band values from sentinels is the reason Swift has Optionals.
Which of these is more appropriate for your use case is... well, since your use case is problematic to being with, it's sort of a wash. If you precondition that index < count, you still incur an O(n) cost just to check that (because a String has to examine its contents to figure out which sequences of bytes constitute each character before it knows how many characters it has). If you make your return type optional, and return nil after calling advancedBy or count, you've still incurred that O(n) cost.
Are the following two examples equivalent?
Example 1:
let x = String::new();
let y = &x[..];
Example 2:
let x = String::new();
let y = &*x;
Is one more efficient than the other or are they basically the same?
In the case of String and Vec, they do the same thing. In general, however, they aren't quite equivalent.
First, you have to understand Deref. This trait is implemented in cases where a type is logically "wrapping" some lower-level, simpler value. For example, all of the "smart pointer" types (Box, Rc, Arc) implement Deref to give you access to their contents.
It is also implemented for String and Vec: String "derefs" to the simpler str, Vec<T> derefs to the simpler [T].
Writing *s is just manually invoking Deref::deref to turn s into its "simpler form". It is almost always written &*s, however: although the Deref::deref signature says it returns a borrowed pointer (&Target), the compiler inserts a second automatic deref. This is so that, for example, { let x = Box::new(42i32); *x } results in an i32 rather than a &i32.
So &*s is really just shorthand for Deref::deref(&s).
s[..] is syntactic sugar for s.index(RangeFull), implemented by the Index trait. This means to slice the "whole range" of the thing being indexed; for both String and Vec, this gives you a slice of the entire contents. Again, the result is technically a borrowed pointer, but Rust auto-derefs this one as well, so it's also almost always written &s[..].
So what's the difference? Hold that thought; let's talk about Deref chaining.
To take a specific example, because you can view a String as a str, it would be really helpful to have all the methods available on strs automatically available on Strings as well. Rather than inheritance, Rust does this by Deref chaining.
The way it works is that when you ask for a particular method on a value, Rust first looks at the methods defined for that specific type. Let's say it doesn't find the method you asked for; before giving up, Rust will check for a Deref implementation. If it finds one, it invokes it and then tries again.
This means that when you call s.chars() where s is a String, what's actually happening is that you're calling s.deref().chars(), because String doesn't have a method called chars, but str does (scroll up to see that String only gets this method because it implements Deref<Target=str>).
Getting back to the original question, the difference between &*s and &s[..] is in what happens when s is not just String or Vec<T>. Let's take a few examples:
s: String; &*s: &str, &s[..]: &str.
s: &String: &*s: &String, &s[..]: &str.
s: Box<String>: &*s: &String, &s[..]: &str.
s: Box<Rc<&String>>: &*s: &Rc<&String>, &s[..]: &str.
&*s only ever peels away one layer of indirection. &s[..] peels away all of them. This is because none of Box, Rc, &, etc. implement the Index trait, so Deref chaining causes the call to s.index(RangeFull) to chain through all those intermediate layers.
Which one should you use? Whichever you want. Use &*s (or &**s, or &***s) if you want to control exactly how many layers of indirection you want to strip off. Use &s[..] if you want to strip them all off and just get at the innermost representation of the value.
Or, you can do what I do and use &*s because it reads left-to-right, whereas &s[..] reads left-to-right-to-left-again and that annoys me. :)
Addendum
There's the related concept of Deref coercions.
There's also DerefMut and IndexMut which do all of the above, but for &mut instead of &.
They are completely the same for String and Vec.
The [..] syntax results in a call to Index<RangeFull>::index() and it's not just sugar for [0..collection.len()]. The latter would introduce the cost of bound checking. Gladly this is not the case in Rust so they both are equally fast.
Relevant code:
index of String
deref of String
index of Vec (just returns self which triggers the deref coercion thus executes exactly the same code as just deref)
deref of Vec
Could someone please tell me how to write a custom function in Open Office Basic to be used in Open Office Calc and that returns an array of values. An example of one such built-in function is MINVERSE. I need to write a custom function that populates a range of cells in much the same way.
Help would be much appreciated.
Yay, I just figured it out: all you do is return an array from your macro, BUT you also have to press Ctrl+Shift+Enter when typing in the cell formula to call your function (which is also the case when working with other arrays in calc). Here's an example:
Function MakeArray
Dim ret(2,2)
ret(0,0) = 1
ret(1,0) = 2
ret(0,1) = 3
ret(1,1) = 4
MakeArray = ret
End Function
FWIW, damjan's MakeArray function returns a Variant containing an array, I think. (The type returned by MakeArray is unspecified, so it defaults to Variant. A Variant is a container with a descriptive header, apparently cast as needed by the interpreter.)
Almost, but not quite, the same thing as returning an array. According to http://www.cpearson.com/excel/passingandreturningarrays.htm, Microsoft did not introduce the ability to return an array until 2000. His example [ LoadNumbers(Low As Long, High As Long) As Long()] does not compile in OO, flagging a syntax error on the parens following Long. It appears that OO's Basic emulates the pre-2k VBA.