Why nested iterator closures won't copy values from outer scope - closures

I'm trying to use nested iterators, where the inner iterator uses a value from the outer iterator:
vec![0; 10].iter().flat_map(|&a| {
    (0..10).map(|b| {
        a + b
    })
});
error: a does not live long enough
(0..10).map(|b|{
^^^
note: reference must be valid for the method call...
This compiles if I make the inner closure a move closure (move |b| {), but I don't understand why that is necessary, given that a is an integer and could have been copied instead of moved.

Both flat_map and map are lazy. The inner map does not use a immediately but tries to “save” it for when it will be needed later, and thus it borrows a. But since a is local to the outer closure and you return map's result, that borrow would become invalid. One way around this is to consume the inner iterator:
vec![0; 10].iter().flat_map(|&a| {
    (0..10).map(|b| {
        a + b
    }).collect::<Vec<_>>()
});
Of course that's not efficient, and it would be much better for the inner closure to "keep" a. You would do this by marking the inner closure as move:
vec![0; 10].iter().flat_map(|&a| {
    (0..10).map(move |b| {
        a + b
    })
});
Normally, the compiler would not let you do this, because the flat_map closure does not own a; it merely has a reference to it. However, since the numeric types in Rust (like isize) implement the Copy trait, the compiler will copy a instead of trying to move it, giving the behavior you want. Note that this is also the reason why you are allowed to destructure the reference with |&a| in the flat_map; normally that would require owning a, not merely having a reference to it (which is what .iter() yields).
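As a quick illustration of that point (a minimal sketch, not part of the original question), a Copy value is merely copied into a move closure, and the original binding remains usable:
fn main() {
    let a: i32 = 5;
    let add = move |b: i32| a + b; // `a` is copied into the closure, not moved
    println!("{}", add(1));        // 6
    println!("{}", a);             // still usable here, because i32 implements Copy
}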


Armadillo subview issue

The following fragment produces a compilation error:
arma::Mat<double> a(10,10,arma::fill::zeros);
arma::ucolvec w = whatever1;
whatever2 = a.rows(w).each_col() + another-col-vector;
The error is that arma::subview_elem2 has no member named each_col.
In Armadillo, the standard array functions are not always available on expressions or on the results of other function calls. Clearly the rows() function does not return a Mat object but a subview_elem2 object, presumably for optimization. Another way to do this would be to declare all the array functions in interfaces/pure abstract classes that Mat and the other internal classes, such as subviews, implement. It seems it should be possible to make all Armadillo array expressions interchangeable with array objects, aside from write operations on expressions that only generate r-values.
So... I could wish for the following
a) An explanation of which methods are not available for which results.
b) Preferably, enabling all combinations of array methods that make sense.
Absent the above, how can I accomplish the desired result, which is to evaluate the expression:
a.rows(w).each_col()
??
Some prior information about Armadillo
The Armadillo library uses templates heavily and most operations return expression templates. Only when you assign the result to a variable is the actual computation performed. This is why you should not store the result of an Armadillo computation using auto.
For instance, given some matrices A, B and C, something like
auto D = A * B + C;
will not perform the computation and only the expression template is stored in D. On the other hand, using
arma::mat D = A * B + C;
will force the computation to happen and the result is stored in D.
Solution to your problem
Coming back to your question: something like a.rows(w) returns an expression template of type subview_elem2 (defined in armadillo_bits/subview_elem2_bones.hpp in the source code). This "temporary type" does not have an .each_col method, which results in the error you got. One way around this is to store the result of a.rows(w) in a variable, but since you are not interested in the variable itself you can use the .eval() method. The .eval() method forces the expression template to perform the actual computation up to that point, and thus the subsequent call to .each_col will work. That is, replace
a.rows(w).each_col() + another-col-vector;
with
a.rows(w).eval().each_col() + another-col-vector;
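Alternatively, if you prefer not to call .eval() inline, an equivalent workaround (just a sketch; v stands in for the column vector being added) is to materialize the subview into a named matrix first and call .each_col() on that:
arma::mat tmp = a.rows(w);            // forces evaluation of the subview_elem2 expression
arma::mat out = tmp.each_col() + v;   // .each_col() is available on a concrete Mat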

Dart - Pass by value for int but reference for list?

In Dart, looking at the code below, does it 'pass by reference' for lists and 'pass by value' for integers? If that's the case, which types of data are passed by reference and which by value? If that isn't the case, what causes this output?
void main() {
  var foo = ['a', 'b'];
  var bar = foo;
  bar.add('c');
  print(foo); // [a, b, c]
  print(bar); // [a, b, c]

  var a = 3;
  int b = a;
  b += 2;
  print(a); // 3
  print(b); // 5
}
The question you're asking can be answered by looking at the difference between a value type and a reference type.
Dart, like almost every other programming language, makes a distinction between the two. The reason for this is that memory is divided into the so-called stack and the heap. The stack is fast but very limited, so it cannot hold that much data. (By the way, if you store too much data on the stack you will get a stack overflow exception, which is where the name of this site comes from ;) ). The heap, on the other hand, is slower but can hold nearly unlimited data.
This is why you have value and reference types. The value types are all your primitive data types (in Dart, all the data types whose names are written in lowercase, like int, bool, double and so on). Their values are small enough to be stored directly on the stack. All the other data types may potentially be much bigger, so they cannot be stored on the stack. This is why all the other, so-called reference types are stored in the heap, and only an address (a reference) is stored on the stack.
So when you assign foo to the reference-type variable bar, you're essentially just copying the storage address from foo to bar. Therefore, if you change the data stored under that reference, it seems like you're changing both values, because both variables hold the same reference. In contrast, when you write b = a, you're transferring the actual value rather than a reference, so changing one variable does not affect the other.
I really hope this helps answer your question :)
In Dart, all types are reference types. All parameters are passed by value. The "value" of a reference type is its reference. (That's why it's possible to have two variables containing the "same object": there is only one object, but both variables contain references to that object.) You never make a copy of an object just by passing the reference around.
Dart does not have "pass by reference" where you pass a variable as an argument (so the called function can change the value bound to the variable, like C#'s ref parameters).
Dart does not have primitive types at all. However (big caveat), numbers are always (pretending to be) canonicalized, so there is only ever one 1 object in the program; you can't create a different 1 object. In a way it acts similarly to other languages' primitive types, but it isn't one. You can use int as a type argument (List<int>, unlike in Java, where you need List<Integer>), you can ask about the identity of an int (identical(1, 2)), and you can call methods on integers (1.hashCode).
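A small sketch of these points (the variable names are only illustrative):
void main() {
  var foo = ['a', 'b'];
  var bar = foo;              // copies the reference, not the list
  print(identical(foo, bar)); // true: both variables refer to the same object
  bar.add('c');
  print(foo);                 // [a, b, c]

  // ints are objects too, but canonicalized: there is only ever one `1`.
  print(identical(1, 1));     // true
  print(1.hashCode);          // you can call methods on an int
}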
If you want to clone or copy a list
var foo = ['a', 'b'];
var bar = [...foo];
bar.add('c');
print(bar); // [a, b, c]
print(foo); // [a, b]
var bar_two = []; //or init an empty list
bar_two.addAll([...bar]);
print(bar_two); // [a, b, c]
Reference link
Clone a List, Map or Set in Dart

How to capture mutable reference into move closure contained in iterator returned from a closure

I have a PRNG that I would like to allow a closure to access by mutable reference. The lifetimes of everything should theoretically be able to work out; here is what it looks like:
fn someFunction<F, I>(mut crossover_point_iter_generator: F)
    where F: FnMut(usize) -> I, I: Iterator<Item=usize>;

let mut rng = Isaac64Rng::from_seed(&[1, 2, 3, 4]);
someFunction(|x| (0..3).map(move |_| rng.gen::<usize>() % x));
Here, a closure is creating an iterator that wraps PRNG-generated values. This iterator contains a map with a closure that has the wrap value x cloned into it, but the problem is that it unintentionally clones rng as well, which I have verified. It is necessary to make it a move closure because the value of x must be captured; otherwise the closure would outlive x.
I attempted to add this line to force it to move the reference into the closure:
let rng = &mut rng;
However, Rust complains with this error:
error: cannot move out of captured outer variable in an `FnMut` closure
Can I mutably access the PRNG from inside the move closure, and if not, since the PRNG clearly outlives the function call, is there an alternative solution (aside from redesigning the API)?
Edit:
I have rewritten it to remove the copy issue and the call looks like this:
someFunction(|x| rng.gen_iter::<usize>().map(move |y| y % x).take(3));
This results in a new error:
error: cannot infer an appropriate lifetime for autoref due to conflicting requirements
You've got a situation that requires multiple conflicting mutable borrows, and rustc is denying this as it should. It's just up to us to understand how and why this happens!
A note that will be important:
Isaac64Rng implements Copy, which means that it is implicitly copied instead of just being moved. I'm assuming this is a legacy / backwards-compatibility thing.
I wrote this version of the code to get it straight:
extern crate rand;

use rand::Isaac64Rng;
use rand::{Rng, SeedableRng};

fn someFunction<F, I>(crossover_point_iter_generator: F)
    where F: FnMut(usize) -> I, I: Iterator<Item=usize>
{
    panic!()
}

fn main() {
    let mut rng = Isaac64Rng::from_seed(&[1, 2, 3, 4]);
    let rng = &mut rng; /* (##) Rust does not allow. */
    someFunction(|x| {
        (0..3).map(move |_| rng.gen::<usize>() % x)
    });
}
Let me put this in points:
someFunction wants a closure it can call, that returns an iterator each time it's called. The closure is mutable and can be called many times (FnMut).
We must assume all the returned iterators can be used at the same time, and not in sequence (one at a time).
We would like to borrow the Rng into the iterator, but mutable borrows are exclusive. So borrowing rules do not allow more than one iterator at a time.
FnOnce instead of FnMut would be one example of a closure protocol to help us here. It would make rustc see that there can be only one iterator.
In the working version, without the line (##), you have several iterators active at the same time. What's happening there? It's the implicit copying kicking in, so each iterator uses an identical copy of the original Rng (which sounds undesirable).
Your second version of the code runs into essentially the same limitation.
If you want to work around the exclusivity of borrowing, you can use special containers like RefCell or Mutex to serialize access to the Rng.
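Here is a minimal sketch of the RefCell approach. It shows the borrowing pattern in isolation: a plain counter in a RefCell stands in for the real Isaac64Rng, and a stub some_function stands in for the question's someFunction, so these names are only illustrative:
use std::cell::RefCell;

// Stub with the same shape as the question's generic function.
fn some_function<F, I>(mut gen: F)
where
    F: FnMut(usize) -> I,
    I: Iterator<Item = usize>,
{
    let v: Vec<usize> = gen(7).collect();
    println!("{:?}", v);
}

fn main() {
    // Toy "RNG": a counter in a RefCell, so several iterators can share it and
    // each take a short-lived mutable borrow only while producing one value.
    let rng = RefCell::new(0usize);
    let rng = &rng; // a shared reference is Copy, so `move` below only copies it

    some_function(|x| {
        (0..3).map(move |_| {
            let mut r = rng.borrow_mut();
            *r += 1;
            *r % x
        })
    });
}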

Why does return/redo evaluate result functions in the calling context, but block results are not evaluated?

Last night I learned about the /redo option for when you return from a function. It lets you return another function, which is then invoked at the call site, reinvoking the evaluator from the same position:
>> foo: func [a] [(print a) (return/redo (func [b] [print b + 10]))]
>> foo "Hello" 10
Hello
20
Even though foo is a function that only takes one argument, it now acts like a function that took two arguments. Something like that would otherwise require the caller to know you were returning a function, and that caller would have to manually use the do evaluator on it.
Thus without return/redo, you'd get:
>> foo: func [a] [(print a) (return (func [b] [print b + 10]))]
>> foo "Hello" 10
Hello
== 10
foo consumed its one parameter and returned a function by value (which was not invoked, thus the interpreter moved on). Then the expression evaluated to 10. If return/redo did not exist you'd have had to write:
>> do foo "Hello" 10
Hello
20
This keeps the caller from having to know (or care) if you've chosen to return a function to execute. And is cool because you can do things like tail call optimization, or writing a wrapper for the return functionality itself. Here's a variant of return that prints a message but still exits the function and provides the result:
>> myreturn: func [] [(print "Leaving...") (return/redo :return)]
>> foo: func [num] [myreturn num + 10]
>> foo 10
Leaving...
== 20
But functions aren't the only thing that have behavior in do. So if this is a general pattern for "removing the need for a DO at the callsite", then why doesn't this print anything?
>> test: func [] [return/redo [print "test"]]
>> test
== [print "test"]
It just returned the block by value, like a normal return would have. Shouldn't it have printed out "test"? That's what do would...uh, do with it:
>> do [print "test"]
test
The short answer is because it is generally unnecessary to evaluate a block at the call point, because blocks in Rebol don't take parameters so it mostly doesn't matter where they are evaluated. However, that "mostly" may need some explanation...
It comes down to two interesting features of Rebol: static binding, and how do of a function works.
Static Binding and Scopes
Rebol doesn't have scoped word bindings, it has static direct word bindings. Sometimes it seems like we have lexical scope, but we really fake that by updating the static bindings each time we're building a new "scoped" code block. We can also rebind words manually whenever we want.
What that means for us in this case though, is that once a block exists, its bindings and values are static - they're not affected by where the block is physically located, or where it is being evaluated.
However, and this is where it gets tricky, function contexts are weird. While the bindings of words bound to a function context are static, the set of values assigned to those words are dynamically scoped. It's a side effect of how code is evaluated in Rebol: What are language statements in other languages are functions in Rebol, so a call to if, for instance, actually passes a block of data to the if function which if then passes to do. That means that while a function is running, do has to look up the values of its words from the call frame of the most recent call to the function that hasn't returned yet.
This does mean that if you call a function and return a block of code with words bound to its context, evaluating that block will fail after the function returns. However, if your function calls itself and that call returns a block of code with its words bound to it, evaluating that block before your function returns will make it look up those words in the call frame of the current call of your function.
This is the same for whether you do or return/redo, and affects inner functions as well. Let me demonstrate:
Function returning code that is evaluated after the function returns, referencing a function word:
>> a: 10 do do has [a] [a: 20 [a]]
** Script error: a word is not bound to a context
** Where: do
** Near: do do has [a] [a: 20 [a]]
Same, but with return/redo and the code in a function:
>> a: 10 do has [a] [a: 20 return/redo does [a]]
** Script error: a word is not bound to a context
** Where: function!
** Near: [a: 20 return/redo does [a]]
The do version of the code, but inside an outer call to the same function:
>> do f: function [x] [a: 10 either zero? x [do f 1] [a: 20 [a]]] 0
== 10
Same, but with return/redo and the code in a function:
>> do f: function [x] [a: 10 either zero? x [f 1] [a: 20 return/redo does [a]]] 0
== 10
So in short, with blocks there is usually no advantage to doing the block elsewhere than where it is defined, and if you want to it is easier to use another call to do instead. Self-calling recursive functions that need to return code to be executed in outer calls of the same function are an exceedingly rare code pattern that I have never seen used in Rebol code at all.
It could be possible to change return/redo so it would handle blocks as well, but it probably isn't worth the increased overhead to return/redo to add a feature that is only useful in rare circumstances and already has a better way to do it.
However, that brings up an interesting point: If you don't need return/redo for blocks because do does the same job, doesn't the same apply to functions? Why do we need return/redo at all?
How DO of a Function Works
Basically, we have return/redo because it uses exactly the same code that we use to implement do of a function. You might not realize it, but do of a function is really unusual.
In most programming languages that can call a function value, you have to pass the parameters to the function as a complete set, sort of how R3's apply function works. Regular Rebol function calling causes some unknown-ahead-of-time number of additional evaluations to happen for its arguments using unknown-ahead-of-time evaluation rules. The evaluator figures out these evaluation rules at runtime and just passes the results of the evaluation to the function. The function itself doesn't handle the evaluation of its parameters, or even necessarily know how those parameters were evaluated.
However, when you do a function value explicitly, that means passing the function value to a call to another function, a regular function named do, and then that magically causes the evaluation of additional parameters that weren't even passed to the do function at all.
Well it's not magic, it's return/redo. The way do of a function works is that it returns a reference to the function in a regular shortcut-return value, with a flag in the shortcut-return value that tells the interpreter that called do to evaluate the returned function as if it were called right there in the code. This is basically what is called a trampoline.
Here's where we get to another interesting feature of Rebol: The ability to shortcut-return values from a function is built into the evaluator, but it doesn't actually use the return function to do it. All of the functions you see from Rebol code are wrappers around the internal stuff, even return and do. The return function we call just generates one of those shortcut-return values and returns it; the evaluator does the rest.
So in this case, what really happened is that all along we had code that did what return/redo does internally, but Carl decided to add an option to our return function to set that flag, even though the internal code doesn't need return to do so because the internal code calls the internal function. And then he didn't tell anyone that he was making the option externally available, or why, or what it did (I guess you can't mention everything; who has the time?). I have the suspicion, based on conversations with Carl and some bugs we've been fixing, that R2 handled do of a function differently, in a way that would have made return/redo impossible.
That does mean that the handling of return/redo is pretty thoroughly oriented towards function evaluation, since that is its entire reason for existing at all. Adding any overhead to it would add overhead to do of a function, and we use that a lot. Probably not worth extending it to blocks, given how little we'd gain and how rarely we'd get any benefit at all.
For return/redo of a function though, it seems to be getting more and more useful the more we think about it. In the last day we've come up with all sorts of tricks that this enables. Trampolines are useful.
While the question originally asked why return/redo did not evaluate blocks, there were also formulations like: "is cool because you can do things like tail call optimization", "[can write] a wrapper for the return functionality", "it seems to be getting more and more useful the more we think about it".
I do not think these are true. My first example demonstrates a case where return/redo can really be used, one that lies in return/redo's "area of expertise", so to speak. It is a variadic sum function called sumn:
use [result collect process] [
    collect: func [:value [any-type!]] [
        unless value? 'value [return process result]
        append/only result :value
        return/redo :collect
    ]
    process: func [block [block!] /local result] [
        result: 0
        foreach value reduce block [result: result + value]
        result
    ]
    sumn: func [] [
        result: copy []
        return/redo :collect
    ]
]
This is the usage example:
>> sumn 1 * 2 2 * 3 4
== 12
Variadic functions taking an "unlimited number" of arguments are not as useful in Rebol as it may look at first sight. For example, if we wanted to use the sumn function in a small script, we would have to wrap it in a paren to indicate where it should stop collecting arguments:
result: (sumn 1 * 2 2 * 3 4)
print result
This is not any better than using a more standard (non-variadic) alternative called e.g. block-sum and taking just one argument, a block. The usage would be like
result: block-sum [1 * 2 2 * 3 4]
print result
Of course, if the function can somehow detect what is its last argument without needing enclosing paren, we really gain something. In this case we could use the #[unset!] value as the sumn stopping argument, but that does not spare typing either:
result: sumn 1 * 2 2 * 3 4 #[unset!]
print result
Seeing the example of a return wrapper, I would say that return/redo is not well suited for return wrappers; return wrappers are outside of its area of expertise. To demonstrate that, here is a return wrapper written in Rebol 2 that actually is outside of return/redo's area of expertise:
myreturn: func [
    {my RETURN wrapper returning the string "indefinite" instead of #[unset!]}
    ; the [throw] attribute makes this function a RETURN wrapper in R2:
    [throw]
    value [any-type!] {the value to return}
] [
    either value? 'value [return :value] [return "indefinite"]
]
Testing in R2:
>> do does [return #[unset!]]
>> do does [myreturn #[unset!]]
== "indefinite"
>> do does [return 1]
== 1
>> do does [myreturn 1]
== 1
>> do does [return 2 3]
== 2
>> do does [myreturn 2 3]
== 2
Also, I do not think it is true that return/redo helps with tail call optimization. There are examples of how tail calls can be implemented without return/redo at the www.rebol.org site. As said, return/redo was tailor-made to support the implementation of variadic functions, and it is not flexible enough for other purposes as far as argument passing is concerned.

Some question about "Closure" in Lua

Here's my code. I'm confused about the local variable count in the returned functions (c1, c2): where is it stored in memory (on the stack?)?
function make_counter()
  local count = 0
  return function()
    count = count + 1
    return count
  end
end
c1 = make_counter()
c2 = make_counter()
print(c1())--print->1
print(c1())--print->2
print(c1())--print->3
print(c2())--print->1
print(c2())--print->2
Where is count stored in memory for the returned functions (c1, c2)?
It's stored in the closure!
c1 is not a closure; it is the function returned by make_counter(). The closure is not explicitly declared anywhere. It is the combination of the function returned by make_counter() and the "free variables" of that function. See Closures on Wikipedia, specifically the section on implementation:
Closures are typically implemented with a special data structure that contains a pointer to the function code, plus a representation of the function's lexical environment (e.g., the set of available variables and their values) at the time when the closure was created.
I'm not quite sure what you're asking exactly, but I'll try to explain how closures work.
When you do this in Lua:
function() <some Lua code> end
You are creating a value. Values are things like the number 1, the string "string", and so forth.
Values are immutable. For example, the number 1 is always the number 1. It can never be the number two. You can add 1 to 2, but that will give you a new number 3. The same goes for strings. The string "string" is a string and will always be that particular string. You can use Lua functions to take away all 'g' characters in the string, but this will create a new string "strin".
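For instance (a tiny sketch of that last point):
local s = "string"
local t = (s:gsub("g", "")) -- gsub builds and returns a new string
print(s, t)                 --> string   strin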
Functions are values, just like the number 1 and the string "string". Values can be stored in variables. You can store the number 1 in multiple variables. You can store the string "string" in multiple variables. And the same goes for all other kinds of values, including functions.
Functions are values, and therefore they are immutable. However, functions can contain values; these values are not immutable. It's much like tables.
The {} syntax creates a Lua table, which is a value. This table is different from every other table, even other empty tables. However, you can put different stuff in tables. This doesn't change the unique value of the table, but it does change what is stored within that table. Each time you execute {}, you get a new, unique table. So if you have the following function:
function CreateTable()
  return {}
end
The following will be true:
tableA = CreateTable()
tableB = CreateTable()
if (tableA == tableB) then
  print("You will never see this")
else
  print("Always printed")
end
Even though both tableA and tableB are empty tables (contain the same thing), they are different tables. They may contain the same stuff, but they are different values.
The same goes for functions. Functions in Lua are often called "closures", particularly if the function has contents. Functions are given contents based on how they use variables. If a function references a local variable that is in scope at the location where that function is created (remember: the syntax function() end creates a function every time you call it), then the function will contain a reference to that local variable.
But local variables go out of scope, while the value of the function may live on (in your case, you return it). Therefore, the function's object, the closure, must contain a reference to that local variable that will cause it to continue existing until the closure itself is discarded.
Where do the values get stored? It doesn't matter; only the closure can access them (though there is a way through the Lua C API, or through the Lua debug API). So unlike tables, where you can get at anything you want, closures can truly hide data.
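To make this concrete, here is a small sketch (with illustrative names) showing that each evaluation of function() ... end produces a distinct closure, each with its own captured variable:
local function make()
  local n = 0
  return function() n = n + 1; return n end
end

local f, g = make(), make()
print(f == g)        --> false: two distinct function values
print(f(), f(), g()) --> 1  2  1: each closure has its own n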
Lua Closures can also be used to implement prototype-based classes and objects. Closure classes and objects behave slightly differently than normal Lua classes and their method of invocation is somewhat different:
-- closure class definition
StarShip = {}

function StarShip.new(x, y, z)
  local self = {}               -- table holding this instance's methods
  local dx, dy, dz              -- destination coordinates
  local cur_x, cur_y, cur_z     -- current position
  local engine_warpnew          -- current warp speed

  cur_x = x; cur_y = y; cur_z = z

  local function setDest(x, y, z)
    dx = x; dy = y; dz = z
  end

  local function setSpeed(warp)
    engine_warpnew = warp
  end

  function self.warp(x, y, z, speed)
    print("warping to ", x, y, z, " at warp ", speed)
    setDest(x, y, z)
    setSpeed(speed)
  end

  function self.currlocation()
    return {x = cur_x, y = cur_y, z = cur_z}
  end

  return self
end

enterprise = StarShip.new(1, 3, 9)
enterprise.warp(0, 0, 0, 10)
loc = enterprise.currlocation()
print(loc.x, loc.y, loc.z)
Produces the following output:
warping to 0 0 0 at warp 10
1 3 9
Here we define a prototype object "StarShip" as an empty table.
Then we create a constructor for the StarShip in the new method. The first thing it does is create a table called self that will hold the object's methods. All methods in the closure (those defined as function self.something) are "closed over" the values accessible in the constructor; this is why it's called a closure. When the constructor is done, it returns the closure object with return self.
A lot more information on closure-based objects is available here:
http://lua-users.org/wiki/ObjectOrientationClosureApproach

Resources