How to capture mutable reference into move closure contained in iterator returned from a closure - closures

I have a PRNG that I would like to allow a closure to access by mutable reference. The lifetimes of everything should theoretically be able to work out, here is what it looks like:
fn someFunction<F, I>(mut crossover_point_iter_generator: F)
where F: FnMut(usize) -> I, I: Iterator<Item=usize>;
let mut rng = Isaac64Rng::from_seed(&[1, 2, 3, 4]);
someFunction(|x| (0..3).map(move |_| rng.gen::<usize>() % x));
Here, a closure is creating an iterator that wraps PRNG generated values. This iterator contains a map with a closure that has the wrap range x cloned into it, but the problem is that it unintentionally clones rng as well, which I have verified. It is necessary to make it a move closure because the value of x must be captured, otherwise the closure will outlive x.
I attempted to add this line to force it to move the reference into the closure:
let rng = &mut rng;
However, Rust complains with this error:
error: cannot move out of captured outer variable in an `FnMut` closure
Can I mutably access the PRNG from inside the move closure, and if not, since the PRNG clearly outlives the function call, is there an alternative solution (aside from redesigning the API)?
Edit:
I have rewritten it to remove the copy issue and the call looks like this:
someFunction(|x| rng.gen_iter::<usize>().map(move |y| y % x).take(3));
This results in a new error:
error: cannot infer an appropriate lifetime for autoref due to conflicting requirements

You're got a situation that requires multiple conflicting mutable borrows, and rustc is denying this as it should. It's just up to us to understand how & why this happens!
A note that will be important:
Isaac64Rng implements Copy, which means that it implicitly copies instead of just moving. I'm assuming is is a legacy / backwards compatibility thing.
I wrote this version of the code to get it straight:
extern crate rand;
use rand::Isaac64Rng;
use rand::{Rng, SeedableRng};
fn someFunction<F, I>(crossover_point_iter_generator: F)
where F: FnMut(usize) -> I, I: Iterator<Item=usize>
{
panic!()
}
fn main() {
let mut rng = Isaac64Rng::from_seed(&[1, 2, 3, 4]);
let rng = &mut rng; /* (##) Rust does not allow. */
someFunction(|x| {
(0..3).map(move |_| rng.gen::<usize>() % x)
});
}
Let me put this in points:
someFunction wants a closure it can call, that returns an iterator each time it's called. The closure is mutable and can be called many times (FnMut).
We must assume all the returned iterators can be used at the same time, and not in sequence (one at a time).
We would like to borrow the Rng into the iterator, but mutable borrows are exclusive. So borrowing rules do not allow more than one iterator at a time.
FnOnce instead of FnMut would be one example of a closure protocol to help us here. It would make rustc see that there can be only one iterator.
In the working version, without the line (##), you have several iterators active at the same time, what's happening there? It's the implicit copying kicking in, so each iterator will use an identical copy of the original Rng (sounds undesirable).
Your second version of the code runs into essentially the same limitation.
If you want to work around the exclusivity of borrowing, you can use special containers like RefCell or Mutex to serialize access to the Rng.

Related

zig structs, pointers, field access

I was trying to implement vector algebra with generic algorithms and ended up playing with iterators. I have found two examples of not obvious and unexpected behaviour:
if I have pointer p to a struct (instance) with field fi, I can access the field as simply as p.fi (rather than p.*.fi)
if I have a "member" function fun(this: *Self) (where Self = #This()) and an instance s of the struct, I can call the function as simply as s.fun() (rather than (&s).fun())
My questions are:
is it documented (or in any way mentioned) somewhere? I've looked through both language reference and guide from ziglearn.org and didn't find anything
what is it that we observe in these examples? syntactic sugar for two particular cases or are there more general rules from which such behavior can be deduced?
are there more examples of weird pointers' behaviour?
For 1 and 2, you are correct. In Zig the dot works for both struct values and struct pointers transparently. Similarly, namespaced functions also do the right thing when invoked.
The only other similar behavior that I can think of is [] syntax used on arrays. You can use both directly on an array value and an array pointer interchangeably. This is somewhat equivalent to how the dot operates on structs.
const std = #import("std");
pub fn main() !void {
const arr = [_]u8{1,2,3};
const foo = &arr;
std.debug.print("{}", .{arr[2]});
std.debug.print("{}", .{foo[2]});
}
AFAIK these are the only three instances of this behavior. In all other cases if something asks for a pointer you have to explicitly provide it. Even when you pass an array to a function that accepts a slice, you will have to take the array's pointer explicitly.
The authoritative source of information is the language reference but checking it quickly, it doesn't seem to have a dedicated paragraph. Maybe there's some example that I missed though.
https://ziglang.org/documentation/0.8.0/
I first learned this syntax by going through the ziglings course, which is linked to on ziglang.org.
in exercise 43 (https://github.com/ratfactor/ziglings/blob/main/exercises/043_pointers5.zig)
// Note that you don't need to dereference the "pv" pointer to access
// the struct's fields:
//
// YES: pv.x
// NO: pv.*.x
//
// We can write functions that take pointer arguments:
//
// fn foo(v: *Vertex) void {
// v.x += 2;
// v.y += 3;
// v.z += 7;
// }
//
// And pass references to them:
//
// foo(&v1);
The ziglings course goes quite in-depth on a few language topics, so it's definitely work checking out if you're interested.
With regards to other syntax: as the previous answer mentioned, you don't need to dereference array pointers. I'm not sure about anything else (I thought function pointers worked the same, but I just ran some tests and they do not.)

Creating an 'add' computation expression

I'd like the example computation expression and values below to return 6. For some the numbers aren't yielding like I'd expect. What's the step I'm missing to get my result? Thanks!
type AddBuilder() =
let mutable x = 0
member _.Yield i = x <- x + i
member _.Zero() = 0
member _.Return() = x
let add = AddBuilder()
(* Compiler tells me that each of the numbers in add don't do anything
and suggests putting '|> ignore' in front of each *)
let result = add { 1; 2; 3 }
(* Currently the result is 0 *)
printfn "%i should be 6" result
Note: This is just for creating my own computation expression to expand my learning. Seq.sum would be a better approach. I'm open to the idea that this example completely misses the value of computation expressions and is no good for learning.
There is a lot wrong here.
First, let's start with mere mechanics.
In order for the Yield method to be called, the code inside the curly braces must use the yield keyword:
let result = add { yield 1; yield 2; yield 3 }
But now the compiler will complain that you also need a Combine method. See, the semantics of yield is that each of them produces a finished computation, a resulting value. And therefore, if you want to have more than one, you need some way to "glue" them together. This is what the Combine method does.
Since your computation builder doesn't actually produce any results, but instead mutates its internal variable, the ultimate result of the computation should be the value of that internal variable. So that's what Combine needs to return:
member _.Combine(a, b) = x
But now the compiler complains again: you need a Delay method. Delay is not strictly necessary, but it's required in order to mitigate performance pitfalls. When the computation consists of many "parts" (like in the case of multiple yields), it's often the case that some of them should be discarded. In these situation, it would be inefficient to evaluate all of them and then discard some. So the compiler inserts a call to Delay: it receives a function, which, when called, would evaluate a "part" of the computation, and Delay has the opportunity to put this function in some sort of deferred container, so that later Combine can decide which of those containers to discard and which to evaluate.
In your case, however, since the result of the computation doesn't matter (remember: you're not returning any results, you're just mutating the internal variable), Delay can just execute the function it receives to have it produce the side effects (which are - mutating the variable):
member _.Delay(f) = f ()
And now the computation finally compiles, and behold: its result is 6. This result comes from whatever Combine is returning. Try modifying it like this:
member _.Combine(a, b) = "foo"
Now suddenly the result of your computation becomes "foo".
And now, let's move on to semantics.
The above modifications will let your program compile and even produce expected result. However, I think you misunderstood the whole idea of the computation expressions in the first place.
The builder isn't supposed to have any internal state. Instead, its methods are supposed to manipulate complex values of some sort, some methods creating new values, some modifying existing ones. For example, the seq builder1 manipulates sequences. That's the type of values it handles. Different methods create new sequences (Yield) or transform them in some way (e.g. Combine), and the ultimate result is also a sequence.
In your case, it looks like the values that your builder needs to manipulate are numbers. And the ultimate result would also be a number.
So let's look at the methods' semantics.
The Yield method is supposed to create one of those values that you're manipulating. Since your values are numbers, that's what Yield should return:
member _.Yield x = x
The Combine method, as explained above, is supposed to combine two of such values that got created by different parts of the expression. In your case, since you want the ultimate result to be a sum, that's what Combine should do:
member _.Combine(a, b) = a + b
Finally, the Delay method should just execute the provided function. In your case, since your values are numbers, it doesn't make sense to discard any of them:
member _.Delay(f) = f()
And that's it! With these three methods, you can add numbers:
type AddBuilder() =
member _.Yield x = x
member _.Combine(a, b) = a + b
member _.Delay(f) = f ()
let add = AddBuilder()
let result = add { yield 1; yield 2; yield 3 }
I think numbers are not a very good example for learning about computation expressions, because numbers lack the inner structure that computation expressions are supposed to handle. Try instead creating a maybe builder to manipulate Option<'a> values.
Added bonus - there are already implementations you can find online and use for reference.
1 seq is not actually a computation expression. It predates computation expressions and is treated in a special way by the compiler. But good enough for examples and comparisons.

What's the difference between filter(|x|) and filter(|&x|)?

In the Rust Book there is an example of calling filter() on an iterator:
for i in (1..100).filter(|&x| x % 2 == 0) {
println!("{}", i);
}
There is an explanation below but I'm having trouble understanding it:
This will print all of the even numbers between one and a hundred. (Note that because filter doesn't consume the elements that are being iterated over, it is passed a reference to each element, and thus the filter predicate uses the &x pattern to extract the integer itself.)
This, however does not work:
for i in (1..100).filter(|&x| *x % 2 == 0) {
println!("i={}", i);
}
Why is the argument to the closure a reference |&x|, instead of |x|? And what's the difference between using |&x| and |x|?
I understand that using a reference |&x| is more efficient, but I'm puzzled by the fact that I didn't have to dereference the x pointer by using *x.
When used as a pattern match (and closure and function arguments are also pattern matches), the & binds to a reference, making the variable the dereferenced value.
fn main() {
let an_int: u8 = 42;
// Note that the `&` is on the right side of the `:`
let ref_to_int: &u8 = &an_int;
// Note that the `&` is on the left side of the `:`
let &another_int = ref_to_int;
let () = another_int;
}
Has the error:
error: mismatched types:
expected `u8`,
found `()`
If you look at your error message for your case, it indicates that you can't dereference it, because it isn't a reference:
error: type `_` cannot be dereferenced
I didn't have to dereference the x pointer by using *x.
That's because you implicitly dereferenced it in the pattern match.
I understand that using a reference |&x| is more efficient
If this were true, then there would be no reason to use anything but references! Namely, references require extra indirection to get to the real data. There's some measurable cut-off point where passing items by value is more efficient than passing references to them.
If so, why does using |x| not throw an error? From my experience with C, I would expect to receive a pointer here.
And you do, in the form of a reference. x is a reference to (in this example) an i32. However, the % operator is provided by the trait Rem, which is implemented for all pairs of reference / value:
impl Rem<i32> for i32
impl<'a> Rem<i32> for &'a i32
impl<'a> Rem<&'a i32> for i32
impl<'a, 'b> Rem<&'a i32> for &'b i32
This allows you to not need to explicitly dereference it.
Or does Rust implicitly allocate a copy of value of the original x on the stack here?
It emphatically does not do that. In fact, it would be unsafe to do that, unless the iterated items implemented Copy (or potentially Clone, in which case it could also be expensive). This is why references are used as the closure argument.

Was point free functions able to inline?

let inline myfunction x y = ...
let inline mycurried = myfunction x // error, only functions may be marked inline
It seems impossible to explicitly inline curried functions.
So whenever mycurried is called, it won't get inlined even if myfunction is inlined properly, is it correct?
So can this be regarded as one of the drawback of curried function?
I think your question is whether a point-free function can be inlined or not.
The limitation you found is not because of the curried function.
Note that in your example the curried function is on the right side, on the left side you have a point-free function.
F# only allows functions to be inline, not constants.
I principle you may think this could be considered as a bug given that type inference is smart enough to find out that is a (point-free) function, but read the notes from Tomas regarding side-effects.
Apparently when the compiler finds on the left side only an identifier it fails with this error:
let inline myfunction x y = x + y
let inline mycurried = myfunction 1
--> Only functions may be marked 'inline'
As Brian said a workaround is adding an explicit parameter on both sides:
let inline mycurried x = (myfunction 1) x
but then your function is no longer point-free, it's the same as:
let inline mycurried x = myfunction 1 x
Another way might be to add an explicit generic parameter:
let inline mycurried<'a> = myfunction 1
when generic parameters are present explicitly on the left side it compiles.
I wish they remove the error message and turn it a warning, something like:
Since only functions can be 'inline' this value will be compiled as a function.
UPDATE
Thanks Tomas for your answer (and your downvote).
My personal opinion is this should be a warning, so you are aware that the semantic of your code will eventually change, but then it's up to you to decide what to do.
You say that inline is "just an optimization" but that's not entirely true:
. Simply turning all your functions inline does not guarantee optimal code.
. You may want to use static constraints and then you have to use inline.
I would like to be able to define my (kind-of) generic constants, as F# library already does (ie: GenericZero and GenericOne). I know my code will be pure, so I don't care if it is executed each time.
I think you just need to add an explicit parameter to both sides (though I have not tried):
let inline myfunction x y = ...
let inline mycurried y = myfunction 42 y // or whatever value (42)
The compiler only allows inline on let bindings that define a function. This is essentially the same thing as what is happening with F# value restriction (and see also here). As Brian says, you can easily workaround this by adding a parameter to your function.
Why does this restriction exist? If it was not there, then adding inline would change the meaning of your programs and that would be bad!
For example, say you have a function like this (which creates mutable state and returns a counter function):
let createCounter n =
let state = ref n
(fun () -> incr state; !state)
Now, the following code:
let counter = createCounter 0
... creates a single global function that you can use multiple times (call counter()) and it will give you unique integers starting from 1. If you could mark it as inline:
let inline counter = createCounter 0
... then every time you use counter(), the compiler should replace that with createCounter 0 () and so you would get 1 every time you call the counter!

F# Functions vs. Values

This is a pretty simple question, and I just wanted to check that what I'm doing and how I'm interpreting the F# makes sense. If I have the statement
let printRandom =
x = MyApplication.getRandom()
printfn "%d" x
x
Instead of creating printRandom as a function, F# runs it once and then assigns it a value. So, now, when I call printRandom, instead of getting a new random value and printing it, I simply get whatever was returned the first time. I can get around this my defining it as such:
let printRandom() =
x = MyApplication.getRandom()
printfn "%d" x
x
Is this the proper way to draw this distinction between parameter-less functions and values? This seems less than ideal to me. Does it have consequences in currying, composition, etc?
The right way to look at this is that F# has no such thing as parameter-less functions. All functions have to take a parameter, but sometimes you don't care what it is, so you use () (the singleton value of type unit). You could also make a function like this:
let printRandom unused =
x = MyApplication.getRandom()
printfn "%d" x
x
or this:
let printRandom _ =
x = MyApplication.getRandom()
printfn "%d" x
x
But () is the default way to express that you don't use the parameter. It expresses that fact to the caller, because the type is unit -> int not 'a -> int; as well as to the reader, because the call site is printRandom () not printRandom "unused".
Currying and composition do in fact rely on the fact that all functions take one parameter and return one value.
The most common way to write calls with unit, by the way, is with a space, especially in the non .NET relatives of F# like Caml, SML and Haskell. That's because () is a singleton value, not a syntactic thing like it is in C#.
Your analysis is correct.
The first instance defines a value and not a function. I admit this caught me a few times when I started with F# as well. Coming from C# it seems very natural that an assignment expression which contains multiple statements must be a lambda and hence delay evaluated.
This is just not the case in F#. Statements can be almost arbitrarily nested (and it rocks for having locally scoped functions and values). Once you get comfortable with this you start to see it as an advantage as you can create functions and continuations which are inaccessible to the rest of the function.
The second approach is the standard way for creating a function which logically takes no arguments. I don't know the precise terminology the F# team would use for this declaration though (perhaps a function taking a single argument of type unit). So I can't really comment on how it would affect currying.
Is this the proper way to draw this
distinction between parameter-less
functions and values? This seems less
than ideal to me. Does it have
consequences in currying, composition,
etc?
Yes, what you describe is correct.
For what its worth, it has a very interesting consequence able to partially evaluate functions on declaration. Compare these two functions:
// val contains : string -> bool
let contains =
let people = set ["Juliet"; "Joe"; "Bob"; "Jack"]
fun person -> people.Contains(person)
// val contains2 : string -> bool
let contains2 person =
let people = set ["Juliet"; "Joe"; "Bob"; "Jack"]
people.Contains(person)
Both functions produce identical results, contains creates its people set on declaration and reuses it, whereas contains2 creates its people set everytime you call the function. End result: contains is slightly faster. So knowing the distinction here can help you write faster code.
Assignment bodies looking like function bodies have cought a few programmers unaware. You can make things even more interesting by having the assignment return a function:
let foo =
printfn "This runs at startup"
(fun () -> printfn "This runs every time you call foo ()")
I just wrote a blog post about it at http://blog.wezeku.com/2010/08/23/values-functions-and-a-bit-of-both/.

Resources