Peeking at stdin using match - parsing

I'm trying to port a translator/parser example from an old compiler textbook from C into Rust.
I have the following code:
use std::io::Read;
fn lexan() {
    let mut input = std::io::stdin().bytes().peekable();
    loop {
        match input.peek() {
            Some(&ch) => {
                match ch {
                    _ => println!("{:?}", input.next()),
                }
            }
            None => break,
        }
    }
}
At this point I'm not actively trying to parse the input, just get my head around how match works. The aim is to add parse branches to the inner match. Unfortunately this fails to compile because I appear to fail in understanding the semantics of match:
error[E0507]: cannot move out of borrowed content
 --> src/main.rs:7:18
  |
7 |             Some(&ch) => {
  |                  ^--
  |                  ||
  |                  |hint: to prevent move, use `ref ch` or `ref mut ch`
  |                  cannot move out of borrowed content
From what I understand, this error is because I don't own the return value of the match. The thing is, I don't believe that I'm using the return value of either match. I thought perhaps input.next() may have been the issue, but the same error occurs with or without this part (or indeed, the entire println! call).
What am I missing here? It's been some time since I looked at Rust (and never in a serious level of effort), and most of the search results for things of this nature appear to be out of date.

It's got nothing to do with the return value of match, or even match itself:
use std::io::Read;
fn lexan() {
    let mut input = std::io::stdin().bytes().peekable();
    if let Some(&ch) = input.peek() {}
}
The issue is that you are attempting to bind the result of Peekable::peek while dereferencing it (that's what the & in &ch does). In this case, the return type is an Option<&Result<u8, std::io::Error>> because the Bytes iterator returns errors from the underlying stream. Since this type does not implement Copy, trying to dereference the type requires that you transfer ownership of the value. You cannot do so as you don't own the original value — thus the error message.
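A minimal sketch of that difference, assuming a current Rust toolchain. The helper names `peek_plain` and `peek_is_ok` are made up for illustration. Peeking at an iterator of plain `u8` lets the `&b` pattern copy the byte out, while the stream iterator hands back a reference to a `Result` that can only be inspected in place:

```rust
use std::io::Read;

// Hypothetical helper: the items are plain `u8`, which is `Copy`,
// so the `&b` pattern can copy the byte out of the peeked reference.
fn peek_plain(bytes: &[u8]) -> Option<u8> {
    let mut it = bytes.iter().copied().peekable();
    match it.peek() {
        Some(&b) => Some(b),
        None => None,
    }
}

// Hypothetical helper: the items are `Result<u8, std::io::Error>`, which is
// not `Copy`, so the peeked value is inspected through the reference instead.
fn peek_is_ok<R: Read>(reader: R) -> bool {
    let mut it = reader.bytes().peekable();
    // The annotation spells out the type that `peek` returns here.
    let peeked: Option<&Result<u8, std::io::Error>> = it.peek();
    matches!(peeked, Some(Ok(_)))
}
```

Calling `peek_is_ok("hi".as_bytes())` works because a byte slice implements `Read`, which makes it easy to experiment without real stdin.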
The piece that causes the inability to copy is the error type of the Result. Because of that, you can match one level deeper:
match input.peek() {
    Some(&Ok(ch)) => {
        match ch {
            _ => println!("{:?}", input.next()),
        }
    }
    Some(&Err(_)) => panic!(),
    None => break,
}
Be aware that this code is pretty close to being uncompilable though. The result of peek will be invalidated when next is called, so many small changes to this code will trigger the borrow checker to fail the code. I'm actually a bit surprised the above worked on the first go.
If you didn't care about errors at all, you could do
while let Some(&Ok(ch)) = input.peek() {
    match ch {
        _ => println!("{:?}", input.next()),
    }
}
Unfortunately, you can't split the middle, as this would cause the borrow of input to last during the call to next:
while let Some(x) = input.peek() {
    match *x {
        Ok(ch) => {
            match ch {
                _ => println!("{:?}", input.next()),
            }
        }
        Err(_) => {}
    }
    // Could still use `x` here, compiler doesn't currently see that we don't
}
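One way to keep the borrow from `peek` short, whatever the compiler version, is to copy the byte out in its own statement before calling `next`. The following is only a sketch: `lex_bytes` is a hypothetical variant of `lexan` that reads from any `Read` source and returns the bytes it consumed, and stopping quietly on an I/O error is an assumption rather than something the question asked for.

```rust
use std::io::Read;

// Hypothetical variant of `lexan`: generic over the input so it can be
// tried on a string instead of real stdin.
fn lex_bytes<R: Read>(reader: R) -> Vec<u8> {
    let mut input = reader.bytes().peekable();
    let mut seen = Vec::new();
    loop {
        // Copy the byte out; the borrow from `peek` ends with this statement.
        let ch = match input.peek() {
            Some(&Ok(ch)) => ch,
            Some(&Err(_)) => break, // assumption: stop quietly on I/O errors
            None => break,
        };
        match ch {
            // Parse branches would go here.
            _ => {
                input.next();
                seen.push(ch);
            }
        }
    }
    seen
}
```

For example, `lex_bytes("ab".as_bytes())` consumes both bytes and returns them in order.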

Related

Using `ANY` is not working when picking up patterns using the Rust PEST parser

I am using the PEST parser and I am testing a simple example to get familiar with the syntax. I am trying to get every instance of ++ throughout the string but I am running into some issues. I think it may be an issue with the ANY keyword but I am not sure. Can anyone help point me in the right direction as to what is going wrong?
Here is my grammar.pest file
incrementing = {(prefix ~ ANY+ ~ "++" ~ suffix)}
prefix = {(NEWLINE | WHITESPACE)*}
suffix = {(NEWLINE | WHITESPACE)*}
WHITESPACE = _{ " " }
Here is my test case
// Parses a file with a matching rule and returns all instances of the rule
fn parse_file_contents_for_rule(rule: Rule, file_contents: &str) -> Option<Pairs<Rule>> {
    SolgaParser::parse(rule, file_contents).ok()
}
fn parse_incrementing(file_contents: &str) {
    // Parse the file for the rule
    let targets = parse_file_contents_for_rule(Rule::incrementing, file_contents);
    // If there are matches
    if targets.is_some() {
        // Iterate through all of the matches
        for target in targets.unwrap().into_iter() {
            println!("{}", target.as_str());
        }
    }
}
#[test]
fn test_parse_incrementing() {
    let file_contents = r#"
index++;
a_thing++;
another_thing++;
should_not_match;
should_match++;
"#;
    parse_incrementing(file_contents);
}
In your example, ANY+ is probably matching till the end of the line, so the ++ pattern is never matched, and therefore the whole incrementing rule is never matched.
Try changing it to (!"+" ~ ANY)+

Deleting a node from singly linked list has the error "cannot move out of borrowed content"

I am making a singly-linked list. When you delete a node, the previous node's next should become the current node's next (prev->next = curr->next;) and return data if the index matches. Otherwise, the previous node becomes the current node and the current node becomes the next node (prev = curr; curr = curr->next;):
struct Node<T> {
    data: T,
    next: Option<Box<Node<T>>>,
}
struct LinkedList<T> {
    head: Option<Box<Node<T>>>,
}
impl LinkedList<i64> {
    fn remove(&mut self, index: usize) -> i64 {
        if self.len() == 0 {
            panic!("LinkedList is empty!");
        }
        if index >= self.len() {
            panic!("Index out of range: {}", index);
        }
        let mut count = 0;
        let mut head = &self.head;
        let mut prev: Option<Box<Node<i64>>> = None;
        loop {
            match head {
                None => {
                    panic!("LinkedList is empty!");
                }
                Some(c) => {
                    // I have borrowed here
                    if count == index {
                        match prev {
                            Some(ref p) => {
                                p.next = c.next;
                                //       ^ cannot move out of borrowed content
                            }
                            _ => continue,
                        }
                        return c.data;
                    } else {
                        count += 1;
                        head = &c.next;
                        prev = Some(*c);
                        //          ^^ cannot move out of borrowed content
                    }
                }
            }
        }
    }
    fn len(&self) -> usize {
        unimplemented!()
    }
}
fn main() {}
error[E0594]: cannot assign to field `p.next` of immutable binding
  --> src/main.rs:31:33
   |
30 |                             Some(ref p) => {
   |                                  ----- consider changing this to `ref mut p`
31 |                                 p.next = c.next;
   |                                 ^^^^^^^^^^^^^^^ cannot mutably borrow field of immutable binding
error[E0507]: cannot move out of borrowed content
  --> src/main.rs:31:42
   |
31 |                                 p.next = c.next;
   |                                          ^ cannot move out of borrowed content
error[E0507]: cannot move out of borrowed content
  --> src/main.rs:40:37
   |
40 |                         prev = Some(*c);
   |                                     ^^ cannot move out of borrowed content
Playground Link for more info.
How can I do this? Is my approach wrong?
Before you start, go read Learning Rust With Entirely Too Many Linked Lists. People think that linked lists are easy because they've been taught them in languages that either don't care if you introduce memory unsafety or completely take away that agency from the programmer.
Rust does neither, which means that you have to think about things you might never have thought of before.
There are a number of issues with your code. The one that you ask about, "cannot move out of borrowed content" is already well-covered by numerous other questions, so there's no reason to restate all those good answers:
Cannot move out of borrowed content
Cannot move out of borrowed content when trying to transfer ownership
error[E0507]: Cannot move out of borrowed content
TL;DR: You are attempting to move ownership of next from out of a reference; you cannot.
p.next = c.next;
You are attempting to modify an immutable reference:
let mut head = &self.head;
You allow for people to remove one past the end, which doesn't make sense to me:
if index >= self.len()
You iterate the entire list not once, but twice before iterating it again to perform the removal:
if self.len() == 0
if index >= self.len()
All of that pales in comparison to the fact that your algorithm is flawed in the eyes of Rust because you attempt to introduce mutable aliasing. If your code were able to compile, you'd have a mutable reference to previous as well as a mutable reference to current. However, you can get a mutable reference to current from previous. This would allow you to break Rust's memory safety guarantees!
Instead, you can only keep track of current and, when the right index is found, break it apart and move the pieces:
fn remove(&mut self, index: usize) -> T {
    self.remove_x(index)
        .unwrap_or_else(|| panic!("index {} out of range", index))
}
fn remove_x(&mut self, mut index: usize) -> Option<T> {
    let mut head = &mut self.head;
    while index > 0 {
        head = match { head }.as_mut() {
            Some(n) => &mut n.next,
            None => return None,
        };
        index -= 1;
    }
    match head.take().map(|x| *x) {
        Some(Node { data, next }) => {
            *head = next;
            Some(data)
        }
        None => None,
    }
}
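For experimenting, the same single-cursor idea can be put into a self-contained form. This is a sketch assuming a current compiler with non-lexical lifetimes. The names `new`, `push_front` and `remove_at` are made up for this example and are not part of the question's code.

```rust
struct Node<T> {
    data: T,
    next: Option<Box<Node<T>>>,
}

struct LinkedList<T> {
    head: Option<Box<Node<T>>>,
}

impl<T> LinkedList<T> {
    fn new() -> Self {
        LinkedList { head: None }
    }

    // Hypothetical helper so the list can be filled for a test.
    fn push_front(&mut self, data: T) {
        let next = self.head.take();
        self.head = Some(Box::new(Node { data, next }));
    }

    // Walks a single mutable cursor to the link that owns the target node,
    // then takes the node apart and splices its tail back in.
    fn remove_at(&mut self, mut index: usize) -> Option<T> {
        let mut cursor = &mut self.head;
        while index > 0 {
            cursor = match cursor {
                Some(node) => &mut node.next,
                None => return None,
            };
            index -= 1;
        }
        let node = cursor.take()?;
        let Node { data, next } = *node;
        *cursor = next;
        Some(data)
    }
}
```

Only one mutable reference into the list exists at any time, which is what lets the borrow checker accept it. An out-of-range index yields `None` instead of panicking.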
See also:
Cannot obtain a mutable reference when iterating a recursive structure: cannot borrow as mutable more than once at a time
How do I get an owned value out of a `Box`?
Playground Link for more info.
There are numerous problems with the rest of your code, such as the fact that the result of your insert method is unlike any I've ever seen before.
How I'd write it.

Why does returning early not finish outstanding borrows?

I'm trying to write a function which pushes an element onto the end of a sorted vector only if the element is larger than the last element already in the vector, otherwise returns an error with a ref to the largest element. This doesn't seem to violate any borrowing rules as far as I can tell, but the borrow checker doesn't like it. I don't understand why.
struct MyArray<K, V>(Vec<(K, V)>);
impl<K: Ord, V> MyArray<K, V> {
    pub fn insert_largest(&mut self, k: K, v: V) -> Result<(), &K> {
        {
            match self.0.iter().next_back() {
                None => (),
                Some(&(ref lk, _)) => {
                    if lk > &k {
                        return Err(lk);
                    }
                }
            };
        }
        self.0.push((k, v));
        Ok(())
    }
}
error[E0502]: cannot borrow `self.0` as mutable because it is also borrowed as immutable
  --> src/main.rs:15:9
   |
6  |             match self.0.iter().next_back() {
   |                   ------ immutable borrow occurs here
...
15 |         self.0.push((k, v));
   |         ^^^^^^ mutable borrow occurs here
16 |         Ok(())
17 |     }
   |     - immutable borrow ends here
Why doesn't this work?
In response to Paolo Falabella's answer.
We can translate any function with a return statement into one without a return statement as follows:
fn my_func() -> &MyType {
    'inner: {
        // Do some stuff
        return &x;
    }
    // And some more stuff
}
Into
fn my_func() -> &MyType {
    let res;
    'outer: {
        'inner: {
            // Do some stuff
            res = &x;
            break 'outer;
        }
        // And some more stuff
    }
    res
}
From this, it becomes clear that the borrow outlives the scope of 'inner.
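To make the translation concrete, here is a small runnable pair, assuming Rust 1.65 or later (where labeled blocks with `break 'label` are available). The functions `first_negative` and `first_negative_labeled` are made up for illustration. The second is the first with its `return` rewritten as an assignment plus `break 'outer`:

```rust
// Early-return form.
fn first_negative(xs: &[i32]) -> Option<&i32> {
    for x in xs {
        if *x < 0 {
            return Some(x);
        }
    }
    None
}

// Same logic with the `return` rewritten as an assignment plus `break 'outer`.
fn first_negative_labeled(xs: &[i32]) -> Option<&i32> {
    let res;
    'outer: {
        for x in xs {
            if *x < 0 {
                res = Some(x);
                break 'outer;
            }
        }
        res = None;
    }
    res
}
```

In the labeled form the reference stored in `res` visibly outlives the inner scope, which is the point made above.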
Is there any problem with instead using the following rewrite for the purpose of borrow-checking?
fn my_func() -> &MyType {
    'outer: {
        'inner: {
            // Do some stuff
            break 'outer;
        }
        // And some more stuff
    }
    panic!()
}
Considering that return statements preclude anything from happening afterwards which might otherwise violate the borrowing rules.
If we name lifetimes explicitly, the signature of insert_largest becomes fn insert_largest<'a>(&'a mut self, k: K, v: V) -> Result<(), &'a K>. So, when you create your return type &K, its lifetime will be the same as the &mut self.
And, in fact, you are taking and returning lk from inside self.
The compiler is seeing that the reference to lk escapes the scope of the match (as it is assigned to the return value of the function, so it must outlive the function itself) and it can't let the borrow end when the match is over.
I think you're saying that the compiler should be smarter and realize that the self.0.push can only ever be reached if lk was not returned. But it is not. And I'm not even sure how hard it would be to teach it that sort of analysis, as it's a bit more sophisticated than the way I understand the borrow checker reasons today.
Today, the compiler sees a reference and basically tries to answer one question ("how long does this live?"). When it sees that your return value is lk, it assigns lk the lifetime it expects for the return value from the fn's signature ('a with the explicit name we gave it above) and calls it a day.
So, in short:
should an early return end the mutable borrow on self? No. As said, the borrow should extend outside of the function and follow its return value.
is the borrow checker a bit too strict in the code that goes from the early return to the end of the function? Yes, I think so. The part after the early return and before the end of the function is only reachable if the function has NOT returned early, so I think you have a point that the borrow checker might be less strict with borrows in that specific area of code.
do I think it's feasible/desirable to change the compiler to enable that pattern? I have no clue. The borrow checker is one of the most complex pieces of the Rust compiler and I'm not qualified to give you an answer on that. This seems related to (and might even be a subset of) the discussion on non-lexical borrow scopes, so I encourage you to look into it and possibly contribute if you're interested in this topic.
For the time being I'd suggest just returning a clone instead of a reference, if possible. I assume returning an Err is not the typical case, so performance should not be a particular worry, but I'm not sure how the K:Clone bound might work with the types you're using.
impl<K, V> MyArray<K, V> where K: Clone + Ord { // 1. now K is also Clone
    pub fn insert_largest(&mut self, k: K, v: V) ->
        Result<(), K> { // 2. returning K (not &K)
        match self.0.iter().next_back() {
            None => (),
            Some(&(ref lk, _)) => {
                if lk > &k {
                    return Err(lk.clone()); // 3. returning a clone
                }
            }
        };
        self.0.push((k, v));
        Ok(())
    }
}
Why does returning early not finish outstanding borrows?
Because the current implementation of the borrow checker is overly conservative.
Your code works as-is once non-lexical lifetimes are enabled, but only with the experimental "Polonius" implementation. Polonius is what enables conditional tracking of borrows.
I've also simplified your code a bit:
#![feature(nll)]
struct MyArray<K, V>(Vec<(K, V)>);
impl<K: Ord, V> MyArray<K, V> {
    pub fn insert_largest(&mut self, k: K, v: V) -> Result<(), &K> {
        if let Some((lk, _)) = self.0.iter().next_back() {
            if lk > &k {
                return Err(lk);
            }
        }
        self.0.push((k, v));
        Ok(())
    }
}
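On a stable compiler without Polonius, one commonly used workaround is to make the decision with a short-lived borrow and to create the returned reference only inside the branch that returns. The following is a sketch under that assumption, not the only possible fix. It looks up the last element a second time in the error case, and the local name `too_small` is made up here.

```rust
pub struct MyArray<K, V>(Vec<(K, V)>);

impl<K: Ord, V> MyArray<K, V> {
    pub fn insert_largest(&mut self, k: K, v: V) -> Result<(), &K> {
        // This borrow only produces a bool, so it ends with the statement.
        let too_small = match self.0.last() {
            Some((lk, _)) => *lk > k,
            None => false,
        };
        if too_small {
            // The long-lived borrow is created on the returning path only.
            // `last()` cannot be `None` here because `too_small` is true.
            return Err(&self.0.last().unwrap().0);
        }
        self.0.push((k, v));
        Ok(())
    }
}
```

Because the borrow that escapes through the return value never reaches the `push`, the checker accepts it without cloning `K`.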

Moving and returning a mutable pointer

I am working through this Rust tutorial, and I'm trying to solve this problem:
Implement a function, incrementMut that takes as input a vector of integers and modifies the values of the original list by incrementing each value by one.
This seems like a fairly simple problem, yes?
I have been trying to get a solution to compile for a while now, and I'm beginning to lose hope. This is what I have so far:
fn main() {
    let mut p = vec![1i, 2i, 3i];
    increment_mut(p);
    for &x in p.iter() {
        print!("{} ", x);
    }
    println!("");
}
fn increment_mut(mut x: Vec<int>) {
    for &mut i in x.iter() {
        i += 1;
    }
}
This is what the compiler says when I try to compile:
Compiling tut2 v0.0.1 (file:///home/nate/git/rust/tut2)
/home/nate/git/rust/tut2/src/main.rs:5:12: 5:13 error: use of moved value: `p`
/home/nate/git/rust/tut2/src/main.rs:5 for &x in p.iter() {
^
/home/nate/git/rust/tut2/src/main.rs:3:16: 3:17 note: `p` moved here because it has type `collections::vec::Vec<int>`, which is non-copyable
/home/nate/git/rust/tut2/src/main.rs:3 increment_mut(p);
^
error: aborting due to previous error
Could not compile `tut2`.
To learn more, run the command again with --verbose.
I also tried a version with references:
fn main() {
    let mut p = vec![1i, 2i, 3i];
    increment_mut(&p);
    for &x in p.iter() {
        print!("{} ", x);
    }
    println!("");
}
fn increment_mut(x: &mut Vec<int>) {
    for &mut i in x.iter() {
        i += 1i;
    }
}
And the error:
Compiling tut2 v0.0.1 (file:///home/nate/git/rust/tut2)
/home/nate/git/rust/tut2/src/main.rs:3:16: 3:18 error: cannot borrow immutable dereference of `&`-pointer as mutable
/home/nate/git/rust/tut2/src/main.rs:3 increment_mut(&p);
^~
error: aborting due to previous error
Could not compile `tut2`.
To learn more, run the command again with --verbose.
I feel like I'm missing some core idea about memory ownership in Rust, and it's making solving trivial problems like this very difficult, could someone shed some light on this?
There are a few mistakes in your code.
increment_mut(&p), given a p that is Vec<int>, would require the function increment_mut(&Vec<int>); &-references and &mut-references are completely distinct things syntactically, and if you want a &mut-reference you must write &mut p, not &p.
You need to understand patterns and how they operate; for &mut i in x.iter() will not do what you intend it to: what it will do is take the &int that each iteration of x.iter() produces, dereference it (the &), copying the value (because int satisfies Copy; if you tried it with a non-Copy type like String, it would not compile), and place it in the mutable variable i (mut i). That is, it is equivalent to for i in x.iter() { let mut i = *i; … }. The effect of this is that i += 1 is actually just incrementing a local variable and has no effect on the vector. You can fix this by using iter_mut, which produces &mut int rather than &int, and changing the &mut i pattern to just i and the i += 1 to *i += 1, meaning “change the int inside the &mut int”.
You can also switch from using &mut Vec<int> to using &mut [int] by calling .as_mut_slice() on your vector. This is a better practice; you should practically never need a reference to a vector as that is taking two levels of indirection where only one is needed. Ditto for &String—it’s exceedingly rare, you should in such cases work with &str.
So then:
fn main() {
    let mut p = vec![1i, 2i, 3i];
    increment_mut(p.as_mut_slice());
    for &x in p.iter() {
        print!("{} ", x);
    }
    println!("");
}
fn increment_mut(x: &mut [int]) {
    for i in x.iter_mut() {
        *i += 1;
    }
}
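The code above uses pre-1.0 syntax (`int`, the `1i` literal suffix). A sketch of the same function for current Rust, assuming `i32` as the element type:

```rust
// Takes a mutable slice, so a `Vec<i32>` can be passed as `&mut v`.
fn increment_mut(xs: &mut [i32]) {
    for x in xs.iter_mut() {
        // `x` is a `&mut i32`; dereference it to change the element in place.
        *x += 1;
    }
}
```

A call site would look like `let mut p = vec![1, 2, 3]; increment_mut(&mut p);`, after which `p` can still be used because it was only borrowed.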

Instantiating a struct with stdin data in Rust

I am very, very new to Rust and trying to implement some simple things to get the feel for the language. Right now, I'm stumbling over the best way to implement a class-like struct that involves casting a string to an int. I'm using a global-namespaced function and it feels wrong to my Ruby-addled brain.
What's the Rustic way of doing this?
use std::io;
struct Person {
    name: ~str,
    age: int
}
impl Person {
    fn new(input_name: ~str) -> Person {
        Person {
            name: input_name,
            age: get_int_from_input(~"Please enter a number for age.")
        }
    }
    fn print_info(&self) {
        println(fmt!("%s is %i years old.", self.name, self.age));
    }
}
fn get_int_from_input(prompt_message: ~str) -> int {
    println(prompt_message);
    let my_input = io::stdin().read_line();
    let my_val =
        match from_str::<int>(my_input) {
            Some(number_string) => number_string,
            _ => fail!("got to put in a number.")
        };
    return my_val;
}
fn main() {
    let first_person = Person::new(~"Ohai");
    first_person.print_info();
}
This compiles and has the desired behaviour, but I am at a loss for what to do here--it's obvious I don't understand the best practices or how to implement them.
Edit: this is 0.8
Here is my version of the code, which I have made more idiomatic:
use std::io;
struct Person {
    name: ~str,
    age: int
}
impl Person {
    fn print_info(&self) {
        println!("{} is {} years old.", self.name, self.age);
    }
}
fn get_int_from_input(prompt_message: &str) -> int {
    println(prompt_message);
    let my_input = io::stdin().read_line();
    from_str::<int>(my_input).expect("got to put in a number.")
}
fn main() {
    let first_person = Person {
        name: ~"Ohai",
        age: get_int_from_input("Please enter a number for age.")
    };
    first_person.print_info();
}
fmt!/format!
First, Rust is deprecating the fmt! macro, with printf-based syntax, in favor of format!, which uses syntax similar to Python format strings. The new version, Rust 0.9, will complain about the use of fmt!. Therefore, you should replace fmt!("%s is %i years old.", self.name, self.age) with format!("{} is {} years old.", self.name, self.age). However, we have a convenience macro println!(...) that means exactly the same thing as println(format!(...)), so the most idiomatic way to write your code in Rust would be
println!("{} is {} years old.", self.name, self.age);
Initializing structs
For a simple type like Person, it is idiomatic in Rust to create instances of the type by using the struct literal syntax:
let first_person = Person {
    name: ~"Ohai",
    age: get_int_from_input("Please enter a number for age.")
};
In cases where you do want a constructor, Person::new is the idiomatic name for a 'default' constructor (by which I mean the most commonly used constructor) for a type Person. However, it would seem strange for the default constructor to require initialization from user input. Usually, I think you would have a person module, for example (with person::Person exported by the module). In this case, I think it would be most idiomatic to use a module-level function fn person::prompt_for_age(name: ~str) -> person::Person. Alternatively, you could use a static method on Person -- Person::prompt_for_age(name: ~str).
&str vs. ~str in function parameters
I've changed the signature of get_int_from_input to take a &str instead of ~str. ~str denotes a string allocated on the exchange heap -- in other words, the heap that malloc/free in C, or new/delete in C++ operate on. Unlike in C/C++, however, Rust enforces the requirement that values on the exchange heap can only be owned by one variable at a time. Therefore, taking a ~str as a function parameter means that the caller of the function can't reuse the ~str argument that it passed in -- it would have to make a copy of the ~str using the .clone method.
On the other hand, &str is a slice into the string, which is just a reference to a range of characters in the string, so it doesn't require a new copy of the string to be allocated when a function with a &str parameter is called.
The reason to use &str rather than ~str for prompt_message in get_int_from_input is that the function doesn't need to hold onto the message past the end of the function. It only uses the prompt message in order to print it (and println takes a &str, not a ~str). Once you change the function to take &str, you can call it like get_int_from_input("Prompt") instead of get_int_from_input(~"Prompt"), which avoids the unnecessary allocation of "Prompt" on the heap (and similarly, you can avoid having to clone s in the code below):
let s: ~str = ~"Prompt";
let i = get_int_from_input(s.clone());
println(s); // Would complain that `s` is no longer valid without cloning it above
            // if `get_int_from_input` takes `~str`, but not if it takes `&str`.
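In current Rust the owned string type is `String` rather than `~str`, but the trade-off is the same. A small sketch with made-up function names:

```rust
// Borrows the prompt: the caller keeps ownership and no copy is made.
fn prompt_len_borrowed(prompt: &str) -> usize {
    prompt.len()
}

// Takes ownership of the prompt: the caller must clone to keep using it.
fn prompt_len_owned(prompt: String) -> usize {
    prompt.len()
}
```

With `let s = String::from("Prompt");`, the call `prompt_len_borrowed(&s)` leaves `s` usable, whereas `prompt_len_owned(s)` would move it unless `s.clone()` is passed instead.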
Option<T>::expect
The Option<T>::expect method is the idiomatic shortcut for the match statement you have, where you want to either return x if you get Some(x) or fail with a message if you get None.
Returning without return
In Rust, it is idiomatic (following the example of functional languages like Haskell and OCaml) to return a value without explicitly writing a return statement. In fact, the return value of a function is the result of the last expression in the function, unless the expression is followed by a semicolon (in which case it returns (), a.k.a. unit, which is essentially an empty placeholder value -- () is also what is returned by functions without an explicit return type, such as main or print_info).
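The points above carry over to current Rust, although the syntax has changed a great deal since 0.8. The following is a rough modern rendering, not a drop-in replacement. It assumes `String` and `i32` in place of `~str` and `int`, and it reads from any `BufRead` source so that it can be tried without a terminal. The names `read_int` and `info` are made up for this sketch.

```rust
use std::io::BufRead;

struct Person {
    name: String,
    age: i32,
}

impl Person {
    // Returns the text instead of printing it; no `return` keyword needed.
    fn info(&self) -> String {
        format!("{} is {} years old.", self.name, self.age)
    }
}

// Reads one line and parses it, failing with a message on bad input,
// much like the `expect` call in the answer.
fn read_int<R: BufRead>(mut input: R) -> i32 {
    let mut line = String::new();
    input.read_line(&mut line).expect("failed to read a line");
    line.trim().parse::<i32>().expect("got to put in a number.")
}
```

With real input one would pass `std::io::stdin().lock()` as the reader; a byte slice such as `"42\n".as_bytes()` works for quick experiments.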
Conclusion
I'm not a great expert on Rust by any means. If you want help on anything related to Rust, you can try, in addition to Stack Overflow, the #rust IRC channel on irc.mozilla.org or the Rust subreddit.
This isn't really Rust-specific, but try to split functionality into discrete units. Don't mix the low-level tasks of putting strings on the terminal and getting strings from the terminal with the more directly relevant (and largely implementation-dependent) tasks of requesting a value and verifying it. When you do that, the design decisions you should make start to arise on their own.
For instance, you could write something like this (I haven't compiled it, and I'm new to Rust myself, so there's probably at LEAST one thing wrong with this :) ).
fn validated_input_prompt<T: FromStr>(prompt: &str) -> T {
    loop {
        println(prompt);
        // Read inside the loop so that every retry gets a fresh line
        let res = io::stdin().read_line();
        match res.len() {
            0 => { continue; }
            _ => {
                match from_str::<T>(res) {
                    Some(t) => {
                        return t
                    },
                    None => {
                        println("ERROR. Please try again.");
                    }
                }
            }
        }
    }
}
And then use it as:
validated_input_prompt::<int>("Enter a number:")
or:
validated_input_prompt::<char>("Enter a Character:")
BUT, to make the latter work, you'd need to implement FromStr for chars, because (sadly) Rust doesn't seem to do it by default. Something LIKE this, but again, I'm not really sure of the Rust syntax for this.
use std::from_str::FromStr;
impl FromStr for char {
    fn from_str(s: &str) -> Option<char> {
        // Take the first character if there is one, otherwise report failure
        if s.len() >= 1 {
            Some(s.char_at(0))
        } else {
            None
        }
    }
}
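In current Rust, `char` does implement `FromStr` in the standard library, so a generic helper along these lines needs no extra impl. A rough sketch, assuming input comes from any `BufRead` source and that running out of input should yield `None`. The name `parse_with_retry` is made up here.

```rust
use std::io::BufRead;
use std::str::FromStr;

// Keeps reading lines until one parses as `T`; gives up with `None`
// when the input ends or a read error occurs.
fn parse_with_retry<T: FromStr, R: BufRead>(mut input: R) -> Option<T> {
    let mut line = String::new();
    loop {
        line.clear();
        // `read_line` returns Ok(0) at end of input.
        match input.read_line(&mut line) {
            Ok(0) | Err(_) => return None,
            Ok(_) => {}
        }
        if let Ok(value) = line.trim().parse::<T>() {
            return Some(value);
        }
        // Otherwise fall through and read the next line.
    }
}
```

Printing the prompt and the error message is left to the caller, which keeps the parsing logic separate from terminal output, as suggested above.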
A variation of telotortium's input reading function that doesn't fail on bad input. The loop { ... } keyword is preferred over writing while true { ... }. In this case using return is fine since the function is returning early.
fn int_from_input(prompt: &str) -> int {
    println(prompt);
    loop {
        match from_str::<int>(io::stdin().read_line()) {
            Some(x) => return x,
            None => println("Oops, that was invalid input. Try again.")
        };
    }
}

Resources