Why does the closure for `take_while` take its argument by reference? - closures

Here is an example from Rust by Example:
fn is_odd(n: u32) -> bool {
n % 2 == 1
}
fn main() {
println!("Find the sum of all the squared odd numbers under 1000");
let upper = 1000;
// Functional approach
let sum_of_squared_odd_numbers: u32 =
(0..).map(|n| n * n) // All natural numbers squared
.take_while(|&n| n < upper) // Below upper limit
.filter(|n| is_odd(*n)) // That are odd
.fold(0, |sum, i| sum + i); // Sum them
println!("functional style: {}", sum_of_squared_odd_numbers);
}
Why does the closure for take_while take its argument by reference, while all the others take by value?

The implementation of Iterator::take_while is quite illuminating:
fn next(&mut self) -> Option<I::Item> {
if self.flag {
None
} else {
self.iter.next().and_then(|x| {
if (self.predicate)(&x) {
Some(x)
} else {
self.flag = true;
None
}
})
}
}
If the value returned from the underlying iterator were directly passed to the predicate, then ownership of the value would also be transferred. After the predicate was called, there would no longer be a value to return from the TakeWhile adapter if the predicate were true!

Related

How to build a numbered list parser in nom?

I'd like to parse a numbered list using nom in Rust.
For example, 1. Milk 2. Bread 3. Bacon.
I could use separated_list1 with an appropriate separator parser and element parser.
fn parser(input: &str) -> IResult<&str, Vec<&str>> {
preceded(
tag("1. "),
separated_list1(
tuple((tag(" "), digit1, tag(". "))),
take_while(is_alphabetic),
),
)(input)
}
However, this does not validate the increasing index numbers.
For example, it would happily parse invalid lists like 1. Milk 3. Bread 4. Bacon or 1. Milk 8. Bread 1. Bacon.
It seems there is no built-in nom parser that can do this. So I ventured to try to build my own first parser...
My idea was to implement a parser similar to separated_list1 but which keeps track of the index and passes it to the separator as argument. It could accept a closure as argument that can then create the separator parser based on the index argument.
fn parser(input: &str) -> IResult<&str, Vec<&str>> {
preceded(
tag("1. "),
separated_list1(
|index: i32| tuple((tag(" "), tag(&index.to_string()), tag(". "))),
take_while(is_alphabetic),
),
)(input)
}
I tried to use the implementation of separated_list1 and change the separator argument to G: FnOnce(i32) -> Parser<I, O2, E>,, create an index variable let mut index = 1;, pass it to sep(index) in the loop, and increase it at the end of the loop index += 1;.
However, Rust's type system is not happy!
How can I make this work?
Here's the full code for reproduction
use nom::{
error::{ErrorKind, ParseError},
Err, IResult, InputLength, Parser,
};
pub fn separated_numbered_list1<I, O, O2, E, F, G>(
mut sep: G,
mut f: F,
) -> impl FnMut(I) -> IResult<I, Vec<O>, E>
where
I: Clone + InputLength,
F: Parser<I, O, E>,
G: FnOnce(i32) -> Parser<I, O2, E>,
E: ParseError<I>,
{
move |mut i: I| {
let mut res = Vec::new();
let mut index = 1;
// Parse the first element
match f.parse(i.clone()) {
Err(e) => return Err(e),
Ok((i1, o)) => {
res.push(o);
i = i1;
}
}
loop {
let len = i.input_len();
match sep(index).parse(i.clone()) {
Err(Err::Error(_)) => return Ok((i, res)),
Err(e) => return Err(e),
Ok((i1, _)) => {
// infinite loop check: the parser must always consume
if i1.input_len() == len {
return Err(Err::Error(E::from_error_kind(i1, ErrorKind::SeparatedList)));
}
match f.parse(i1.clone()) {
Err(Err::Error(_)) => return Ok((i, res)),
Err(e) => return Err(e),
Ok((i2, o)) => {
res.push(o);
i = i2;
}
}
}
}
index += 1;
}
}
}
Try to manually use many1(), separated_pair(), and verify()
fn validated(input: &str) -> IResult<&str, Vec<(u32, &str)>> {
let current_index = Cell::new(1u32);
let number = map_res(digit1, |s: &str| s.parse::<u32>());
let valid = verify(number, |digit| {
let i = current_index.get();
if digit == &i {
current_index.set(i + 1);
true
} else {
false
}
});
let pair = preceded(multispace0, separated_pair(valid, tag(". "), alpha1));
//give current_index time to be used and dropped with a temporary binding. This will not compile without the temporary binding
let tmp = many1(pair)(input);
tmp
}
#[test]
fn test_success() {
let input = "1. Milk 2. Bread 3. Bacon";
assert_eq!(validated(input), Ok(("", vec![(1, "Milk"), (2, "Bread"), (3, "Bacon")])));
}
#[test]
fn test_fail() {
let input = "2. Bread 3. Bacon 1. Milk";
validated(input).unwrap_err();
}

Building expression parser with Dart petitparser, getting stuck on node visitor

I've got more of my expression parser working (Dart PetitParser to get at AST datastructure created with ExpressionBuilder). It appears to be generating accurate ASTs for floats, parens, power, multiply, divide, add, subtract, unary negative in front of both numbers and expressions. (The nodes are either literal strings, or an object that has a precedence with a List payload that gets walked and concatenated.)
I'm stuck now on visiting the nodes. I have clean access to the top node (thanks to Lukas), but I'm stuck on deciding whether or not to add a paren. For example, in 20+30*40, we don't need parens around 30*40, and the parse tree correctly has the node for this closer to the root so I'll hit it first during traversal. However, I don't seem to have enough data when looking at the 30*40 node to determine if it needs parens before going on to the 20+.. A very similar case would be (20+30)*40, which gets parsed correctly with 20+30 closer to the root, so once again, when visiting the 20+30 node I need to add parens before going on to *40.
This has to be a solved problem, but I never went to compiler school, so I know just enough about ASTs to be dangerous. What "a ha" am I missing?
// rip-common.dart:
import 'package:petitparser/petitparser.dart';
// import 'package:petitparser/debug.dart';
class Node {
int precedence;
List<dynamic> args;
Node([this.precedence = 0, this.args = const []]) {
// nodeList.add(this);
}
#override
String toString() => 'Node($precedence $args)';
String visit([int fromPrecedence = -1]) {
print('=== visiting $this ===');
var buf = StringBuffer();
var parens = (precedence > 0) &&
(fromPrecedence > 0) &&
(precedence < fromPrecedence);
print('<$fromPrecedence $precedence $parens>');
// for debugging:
var curlyOpen = '';
var curlyClose = '';
buf.write(parens ? '(' : curlyOpen);
for (var arg in args) {
if (arg is Node) {
buf.write(arg.visit(precedence));
} else if (arg is String) {
buf.write(arg);
} else {
print('not Node or String: $arg');
buf.write('$arg');
}
}
buf.write(parens ? ')' : curlyClose);
print('$buf for buf');
return '$buf';
}
}
class RIPParser {
Parser _make_parser() {
final builder = ExpressionBuilder();
var number = char('-').optional() &
digit().plus() &
(char('.') & digit().plus()).optional();
// precedence 5
builder.group()
..primitive(number.flatten().map((a) => Node(0, [a])))
..wrapper(char('('), char(')'), (l, a, r) => Node(0, [a]));
// negation is a prefix operator
// precedence 4
builder.group()..prefix(char('-').trim(), (op, a) => Node(4, [op, a]));
// power is right-associative
// precedence 3
builder.group()..right(char('^').trim(), (a, op, b) => Node(3, [a, op, b]));
// multiplication and addition are left-associative
// precedence 2
builder.group()
..left(char('*').trim(), (a, op, b) => Node(2, [a, op, b]))
..left(char('/').trim(), (a, op, b) => Node(2, [a, op, b]));
// precedence 1
builder.group()
..left(char('+').trim(), (a, op, b) => Node(1, [a, op, b]))
..left(char('-').trim(), (a, op, b) => Node(1, [a, op, b]));
final parser = builder.build().end();
return parser;
}
Result _result(String input) {
var parser = _make_parser(); // eventually cache
var result = parser.parse(input);
return result;
}
String parse(String input) {
var result = _result(input);
if (result.isFailure) {
return result.message;
} else {
print('result.value = ${result.value}');
return '$result';
}
}
String visit(String input) {
var result = _result(input);
var top_node = result.value; // result.isFailure ...
return top_node.visit();
}
}
// rip_cmd_example.dart
import 'dart:io';
import 'package:rip_common/rip_common.dart';
void main() {
print('start');
String input;
while (true) {
input = stdin.readLineSync();
if (input.isEmpty) {
break;
}
print(RIPParser().parse(input));
print(RIPParser().visit(input));
}
;
print('done');
}
As you've observed, the ExpressionBuilder already assembles the tree in the right precedence order based on the operator groups you've specified.
This also happens for the wrapping parens node created here: ..wrapper(char('('), char(')'), (l, a, r) => Node(0, [a])). If I test for this node, I get back the input string for your example expressions: var parens = precedence == 0 && args.length == 1 && args[0] is Node;.
Unless I am missing something, there should be no reason for you to track the precedence manually. I would also recommend that you create different node classes for the different operators: ValueNode, ParensNode, NegNode, PowNode, MulNode, ... A bit verbose, but much easier to understand what is going on, if each of them can just visit (print, evaluate, optimize, ...) itself.

What is the difference between 'is' and '==' in Dart?

Let's say I have:
class Test<T> {
void method() {
if (T is int) {
// T is int
}
if (T == int) {
// T is int
}
}
}
I know I can override == operator but what's the main difference between == and is in Dart if I don't override any operator.
Edit:
Say I have
extension MyIterable<T extends num> on Iterable<T> {
T sum() {
T total = T is int ? 0 : 0.0; // setting `T == int` works
for (T item in this) {
total += item;
}
return total;
}
}
And when I use my extension method with something like:
var addition = MyIterable([1, 2, 3]).sum();
I get this error:
type 'double' is not a subtype of type 'int'
identical(x, y) checks if x is the same object as y.
x == y checks whether x should be considered equal to y. The default implementation for operator == is the same as identical(), but operator == can be overridden to do deep equality checks (or in theory could be pathological and be implemented to do anything).
x is T checks whether x has type T. x is an object instance.
class MyClass {
MyClass(this.x);
int x;
#override
bool operator==(dynamic other) {
return runtimeType == other.runtimeType && x == other.x;
}
#override
int get hashCode => x.hashCode;
}
void main() {
var c1 = MyClass(42);
var c2 = MyClass(42);
var sameC = c1;
print(identical(c1, c2)); // Prints: false
print(identical(c1, sameC)); // Prints: true
print(c1 == c2); // Prints: true
print(c1 == sameC); // Prints: true
print(c1 is MyClass); // Prints: true
print(c1 is c1); // Illegal. The right-hand-side must be a type.
print(MyClass is MyClass); // Prints: false
}
Note the last case: MyClass is MyClass is false because the left-hand-side is a type, not an instance of MyClass. (MyClass is Type would be true, however.)
In your code, T is int is incorrect because both sides are types. You do want T == int in that case. Note that T == int would check for an exact type and would not be true if one is a derived type of the other (e.g. int == num would be false).
Basically, == is equality operator and "is" is the instanceof operator of Dart (If you come from Java background, if not, it basically tells you if something is of type something).
Use == for equality, when you want to check if two objects are equal. You can implement the == operator (method) in your class to define on what basis do you want to judge if two objects are equal.
Take this example:
class Car {
String model;
String brand;
Car(this.model, this.brand);
bool operator == (otherObj) {
return (otherObj is Car && other.brand == brand); //or however you want to check
//On the above line, we use "is" to check if otherObj is of type Car
}
}
Now you can check if two cars are "equal" based on the condition that you have defined.
void main() {
final Car micra = Car("Micra", "Nissan");
print(micra == Car("Micra", "Nissan")); // true
print(micra is Car("Micra", "Nissan")); // true
}
Hence, == is something you use to decide if two objects are equal, you can override and set it as per your expectations on how two objects should be considered equal.
On the other hand, "is" basically tells you if an instance is of type object (micra is of type Car here).

Conflicting lifetime requirements when storing closure capturing returned value

EDIT:
I'm trying to create a vector of closures inside a function, add a standard closure to the vector, and then return the vector from the function. I'm getting an error about conflicting lifetimes.
Code can be executed here.
fn vec_with_closure<'a, T>(f: Box<FnMut(T) + 'a>) -> Vec<Box<FnMut(T) + 'a>>
{
let mut v = Vec::<Box<FnMut(T)>>::new();
v.push(Box::new(|&mut: t: T| {
f(t);
}));
v
}
fn main() {
let v = vec_with_closure(Box::new(|t: usize| {
println!("{}", t);
}));
for c in v.iter_mut() {
c(10);
}
}
EDIT 2:
Using Rc<RefCell<...>> together with move || and the Fn() trait as opposed to the FnMut()m as suggested by Shepmaster, helped me produce a working version of the above code. Rust playpen version here.
Here's my understanding of the problem, slightly slimmed down:
fn filter<F>(&mut self, f: F) -> Keeper
where F: Fn() -> bool + 'static //'
{
let mut k = Keeper::new();
self.subscribe(|| {
if f() { k.publish() }
});
k
}
In this method, f is a value that has been passed in by-value, which means that filter owns it. Then, we create another closure that captures f by-reference. We are then trying to save that closure somewhere, so all the references in the closure need to outlive the lifetime of our struct (I picked 'static for convenience).
However, f only lives until the end of the method, so it definitely won't live long enough. We need to make the closure own f. It would be ideal if we could use the move keyword, but that causes the closure to also move in k, so we wouldn't be able to return it from the function.
Trying to solve that led to this version:
fn filter<F>(&mut self, f: F) -> Keeper
where F: Fn() -> bool + 'static //'
{
let mut k = Keeper::new();
let k2 = &mut k;
self.subscribe(move || {
if f() { k2.publish() }
});
k
}
which has a useful error message:
error: `k` does not live long enough
let k2 = &mut k;
^
note: reference must be valid for the static lifetime...
...but borrowed value is only valid for the block
Which leads to another problem: you are trying to keep a reference to k in the closure, but that reference will become invalid as soon as k is returned from the function. When items are moved by-value, their address will change, so references are no longer valid.
One potential solution is to use Rc and RefCell:
fn filter<F>(&mut self, f: F) -> Rc<RefCell<Keeper>>
where F: Fn() -> bool + 'static //'
{
let mut k = Rc::new(RefCell::new(Keeper::new()));
let k2 = k.clone();
self.subscribe(move || {
if f() { k2.borrow_mut().publish() }
});
k
}

Passing a closure to a recursive function

I'm working on a quad tree. Using a closure to determine when a region should be split seems like a good idea.
pub fn split_recursively(&mut self, f: |Rect| -> bool) {
...
self.children.as_mut().map(|children| {
children.nw.split_recursively(f);
children.ne.split_recursively(f);
children.sw.split_recursively(f);
children.se.split_recursively(f);
});
}
The above is what I have tried, but I get
error: cannot move out of captured outer variable
children.se.split_recursively(f);
For each of the four children.
I then tried wrapping the closure in a RefCell
fn split_recursively(&mut self, f: &RefCell<|Rect| -> bool>)
and calling it
let should_split = (*f.borrow_mut())(self.rect);
but Rust doesn't like that either:
error: closure invocation in a & reference
I know how to work around this by passing a function pointer and &mut to a rng (because splitting should in part depend on randomness), but a closure would be more elegant. Is there any way to get this to work with a closure? If not, would it be possible with unboxed closures?
Unboxed closures do work but they are currently feature-gated and add complexity.
#![feature(unboxed_closures)]
use std::ops::Fn;
fn main() {
let f = |&: x: uint| println!("Recurse {}!", x);
recursive(&f, 0);
}
fn recursive(f: &Fn<(uint,), ()>, depth: uint) {
if depth < 10 {
f.call((depth,)); // Cannot be invoked like f(depth) currently.
recursive(f, depth + 1);
} else {
()
}
}
(Rust-Playpen link)
Searching around, I found an easy solution. Just call the closure in a closure:
fn main() {
let f = |x: uint| println!("Recurse {}!", x);
recursive(f, 0);
}
fn recursive(f: |uint| -> (), depth: uint) {
if depth < 10 {
f(depth);
recursive(|x| f(x), depth + 1);
} else {
()
}
}
(Rust-Playpen link)
This has a performance impact compared to unboxed closures but it's definitely easier to read. Your choice.

Resources