Why does returning early not finish outstanding borrows?

I'm trying to write a function which pushes an element onto the end of a sorted vector only if the element is larger than the last element already in the vector, and otherwise returns an error with a reference to the largest element. This doesn't seem to violate any borrowing rules as far as I can tell, but the borrow checker doesn't like it. I don't understand why.
struct MyArray<K, V>(Vec<(K, V)>);
impl<K: Ord, V> MyArray<K, V> {
    pub fn insert_largest(&mut self, k: K, v: V) -> Result<(), &K> {
        {
            match self.0.iter().next_back() {
                None => (),
                Some(&(ref lk, _)) => {
                    if lk > &k {
                        return Err(lk);
                    }
                }
            };
        }
        self.0.push((k, v));
        Ok(())
    }
}
error[E0502]: cannot borrow `self.0` as mutable because it is also borrowed as immutable
--> src/main.rs:15:9
|
6 | match self.0.iter().next_back() {
| ------ immutable borrow occurs here
...
15 | self.0.push((k, v));
| ^^^^^^ mutable borrow occurs here
16 | Ok(())
17 | }
| - immutable borrow ends here
Why doesn't this work?
In response to Paolo Falabella's answer.
We can translate any function with a return statement into one without a return statement as follows:
fn my_func() -> &MyType {
'inner: {
// Do some stuff
return &x;
}
// And some more stuff
}
Into
fn my_func() -> &MyType {
let res;
'outer: {
'inner: {
// Do some stuff
res = &x;
break 'outer;
}
// And some more stuff
}
res
}
From this, it becomes clear that the borrow outlives the scope of 'inner.
Is there any problem with instead using the following rewrite for the purpose of borrow-checking?
fn my_func() -> &MyType {
'outer: {
'inner: {
// Do some stuff
break 'outer;
}
// And some more stuff
}
panic!()
}
Considering that return statements preclude anything from happening afterwards which might otherwise violate the borrowing rules.

If we name lifetimes explicitly, the signature of insert_largest becomes fn insert_largest<'a>(&'a mut self, k: K, v: V) -> Result<(), &'a K>. So, when you create your return type &K, its lifetime will be the same as the &mut self.
And, in fact, you are taking and returning lk from inside self.
The compiler is seeing that the reference to lk escapes the scope of the match (as it is assigned to the return value of the function, so it must outlive the function itself) and it can't let the borrow end when the match is over.
I think you're saying that the compiler should be smarter and realize that the self.0.push can only ever be reached if lk was not returned. But it is not. And I'm not even sure how hard it would be to teach it that sort of analysis, as it's a bit more sophisticated than the way I understand the borrow checker reasons today.
Today, the compiler sees a reference and basically tries to answer one question ("how long does this live?"). When it sees that your return value is lk, it assigns lk the lifetime it expects for the return value from the fn's signature ('a with the explicit name we gave it above) and calls it a day.
So, in short:
Should an early return end the mutable borrow on self? No. As said, the borrow must extend beyond the function and follow its return value.
Is the borrow checker a bit too strict in the code that goes from the early return to the end of the function? Yes, I think so. The part after the early return and before the end of the function is only reachable if the function has NOT returned early, so I think you have a point that the borrow checker could be less strict with borrows in that specific area of code.
Do I think it's feasible/desirable to change the compiler to enable that pattern? I have no clue. The borrow checker is one of the most complex pieces of the Rust compiler and I'm not qualified to give you an answer on that. This seems related to (and might even be a subset of) the discussion on non-lexical borrow scopes, so I encourage you to look into it and possibly contribute if you're interested in this topic.
For the time being I'd suggest just returning a clone instead of a reference, if possible. I assume returning an Err is not the typical case, so performance should not be a particular worry, but I'm not sure how the K:Clone bound might work with the types you're using.
impl<K, V> MyArray<K, V> where K: Clone + Ord { // 1. now K is also Clone
    pub fn insert_largest(&mut self, k: K, v: V) ->
            Result<(), K> { // 2. returning K (not &K)
        match self.0.iter().next_back() {
            None => (),
            Some(&(ref lk, _)) => {
                if lk > &k {
                    return Err(lk.clone()); // 3. returning a clone
                }
            }
        };
        self.0.push((k, v));
        Ok(())
    }
}

Why does returning early not finish outstanding borrows?
Because the current implementation of the borrow checker is overly conservative.
Your code works as-is once non-lexical lifetimes are enabled, but only with the experimental "Polonius" implementation. Polonius is what enables conditional tracking of borrows.
I've also simplified your code a bit:
#![feature(nll)]
struct MyArray<K, V>(Vec<(K, V)>);
impl<K: Ord, V> MyArray<K, V> {
    pub fn insert_largest(&mut self, k: K, v: V) -> Result<(), &K> {
        if let Some((lk, _)) = self.0.iter().next_back() {
            if lk > &k {
                return Err(lk);
            }
        }
        self.0.push((k, v));
        Ok(())
    }
}
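If you cannot rely on Polonius, one workaround that is often suggested is to make the decision first, using a borrow that ends immediately, and to re-borrow only inside the branch that returns. The following is a minimal sketch under that assumption, not the one true fix: it uses `Vec::last` and the `matches!` macro instead of `iter().next_back()`, and it looks up the last element a second time on the error path.

```rust
pub struct MyArray<K, V>(pub Vec<(K, V)>);

impl<K: Ord, V> MyArray<K, V> {
    pub fn insert_largest(&mut self, k: K, v: V) -> Result<(), &K> {
        // This shared borrow of `self.0` ends as soon as the bool is computed.
        let last_is_larger = matches!(self.0.last(), Some((lk, _)) if *lk > k);
        if last_is_larger {
            // Fresh borrow that only exists on the returning path, so it
            // never overlaps with the `push` below.
            return Err(&self.0.last().unwrap().0);
        }
        self.0.push((k, v));
        Ok(())
    }
}
```

The cost is a second `last()` call in the error case, which is cheap for a `Vec`.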

Related

Are dependency injection containers that return references to values allocated in a local function fundamentally impossible in Rust?

In my mind one of the ideal traits for a dependency injection container would look like:
pub trait ResolveOwn<T> {
fn resolve(&self) -> T;
}
I don't know how to implement this for certain T. I keep stubbing my toes on variations and cousins of this error:
error[E0515]: cannot return value referencing local variable `X`
I'm used to dependency injection in C# where returning values referencing local variables is precisely how you implement the equivalent of that resolve function.
Here's an illustration that focuses on this aspect of dependency injection:
struct ComplexThing<'a>(&'a i32);
struct Module();
impl Module {
fn resolve_foo(&self) -> i32 {
todo!()
}
pub fn resolve_complex_thing_1(&self) -> ComplexThing {
let foo = self.resolve_foo();
ComplexThing(&foo)
}
}
error[E0515]: cannot return value referencing local variable `foo`
--> src/lib.rs:12:9
|
12 | ComplexThing(&foo)
| ^^^^^^^^^^^^^----^
| | |
| | `foo` is borrowed here
| returns a value referencing data owned by the current function
See? There's that error.
My first instinct (again, coming from C#) is to give the local variable a place to live in the returned value, because the local variable is created here but it needs to live at least as long as the returned value. Hmm... that sounds sort of like returning a closure. Let's see how that goes...
pub fn resolve_complex_thing_2<'a>(&'a self) -> impl FnOnce() -> ComplexThing<'a> {
let foo = self.resolve_foo();
move || ComplexThing(&foo)
}
error[E0515]: cannot return value referencing local data `foo`
--> src/lib.rs:12:17
|
12 | move || ComplexThing(&foo)
| ^^^^^^^^^^^^^----^
| | |
| | `foo` is borrowed here
| returns a value referencing data owned by the current function
No joy. It doesn't work to package this closure up into a prettier type (like some impl of Into<ComplexThing<'a>>) because it's fundamentally about returning a value referencing local data.
My next instinct is to somehow jam the local data into some kind of weak cache inside my Module and then get a reference from there (undoubtedly unsafely). And then the weak cache will need to solve half of the hard problems in Computer Science (hint: the other hard problem is naming things). That's starting to sound an awful lot like... oh no. Garbage collection!
I also thought about inverting the flow of control. It's hideous and still doesn't work:
impl Module {
pub fn use_foo<T>(&self, f: impl FnOnce(i32) -> T) -> T {
(f)(42)
}
pub fn use_complex_thing<'a, T>(&'a self, f: impl FnOnce(ComplexThing<'a>) -> T) -> T {
self.use_foo(
|foo| (f)(ComplexThing(&foo)),
)
}
}
error[E0597]: `foo` does not live long enough
--> src/lib.rs:12:36
|
10 | pub fn use_complex_thing<'a, T>(&'a self, f: impl FnOnce(ComplexThing<'a>) -> T) -> T {
| -- lifetime `'a` defined here
11 | self.use_foo(
12 | |foo| (f)(ComplexThing(&foo)),
| -----------------^^^^--
| | | |
| | | `foo` dropped here while still borrowed
| | borrowed value does not live long enough
| argument requires that `foo` is borrowed for `'a`
My last instinct is to hack around the restriction against moving a value with active borrows, because then I could trick the compiler. My attempts at implementing that resulted in a type that's impossible to use correctly — it ended up requiring knowledge that only the compiler has and seemed to introduce undefined behavior at every turn. I won't bother reproducing that code here.
It seems like it's impossible to return any owned instances of types containing (non-singleton) references.
Assuming that's true, that means there are entire classes of types that simply cannot be created with a dependency injection container in Rust.
Surely I'm missing something?
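A note on the inverted-control attempt: as far as I can tell, it fails because the callback is declared to take a ComplexThing<'a> for the caller's lifetime 'a, which a local foo can never satisfy. Asking the callback to accept any lifetime (a higher-ranked bound) lets it borrow a local. This is a minimal sketch reusing the names from the question; the `for<'b>` bound and the generic parameter `F` are my own additions.

```rust
pub struct ComplexThing<'a>(pub &'a i32);

pub struct Module;

impl Module {
    pub fn use_foo<T>(&self, f: impl FnOnce(i32) -> T) -> T {
        f(42)
    }

    // `F` must work for *every* lifetime `'b`, so the result `T` cannot
    // hold on to the borrow of the local `foo`.
    pub fn use_complex_thing<T, F>(&self, f: F) -> T
    where
        F: for<'b> FnOnce(ComplexThing<'b>) -> T,
    {
        self.use_foo(|foo| f(ComplexThing(&foo)))
    }
}
```

The trade-off is that the ComplexThing can only be used inside the callback; it still cannot be returned to the caller.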
You can try making a drop guard along with Box::leak to leak a reference to live long enough, then have custom behavior on Drop to reclaim the leaked memory. Note that this will require you to do everything through the drop guard:
use std::marker::PhantomData;
use std::mem::ManuallyDrop;
struct ComplexThing<'a>(&'a i32);
struct Module;
pub struct DropGuard<'a, T: 'a, V: 'a> {
// do NOT make these fields pub
// direct manipulation of these is very unsafe
container: ManuallyDrop<T>,
value: *mut V,
// I'm not sure this is needed but better safe than sorry
_value: PhantomData<&'a mut V>,
}
impl<'a, T: 'a, V: 'a> DropGuard<'a, T, V> {
pub fn new<F: FnOnce(&'a mut V) -> T>(value: Box<V>, gen: F) -> Self {
// leak the value so it lives long enough
let leaked = Box::leak(value);
// get a pointer to know what to drop
let leaked_ptr: *mut _ = leaked;
DropGuard {
container: ManuallyDrop::new(gen(leaked)),
value: leaked_ptr,
_value: PhantomData,
}
}
}
// so you can actually use it
// no DerefMut since dropping the container without dropping the guard is weird
impl<'a, T: 'a, V: 'a> std::ops::Deref for DropGuard<'a, T, V> {
type Target = T;
fn deref(&self) -> &Self::Target {
&self.container
}
}
impl<'a, T: 'a, V: 'a> Drop for DropGuard<'a, T, V> {
fn drop(&mut self) {
// drop the container first
// this should be safe since self.container is never referenced again
// the value its borrowing is still valid (due to not being dropped yet)
// and there should be no references to it (due to this struct being dropped)
unsafe {
ManuallyDrop::drop(&mut self.container);
}
// now drop the pointer
// this should be safe since it was created with Box::leak
// and the container borrowing it has already been dropped
// and no more references should have survived
std::mem::drop(unsafe { Box::from_raw(self.value) });
}
}
impl Module {
pub fn resolve_foo(&self) -> i32 {
5
}
pub fn resolve_complex_thing_1(&self) -> DropGuard<ComplexThing, i32> {
DropGuard::new(Box::new(self.resolve_foo()), |i32_ref| {
ComplexThing(i32_ref)
})
}
}
fn main() {
let module = Module;
let guard = module.resolve_complex_thing_1();
println!("{:?}", guard.0);
}
Another way that also cleans up the typing is to use a trait:
use std::marker::PhantomData;
use std::mem::ManuallyDrop;
struct ComplexThing<'a>(&'a i32);
struct Module;
// not sure if this trait should be unsafe
// but again, better safe than sorry
pub unsafe trait Guardable {
type Value;
}
unsafe impl Guardable for ComplexThing<'_> {
type Value = i32;
}
pub struct DropGuard<'a, T: 'a + Guardable> {
// do NOT make these fields pub
// direct manipulation of these is very unsafe
container: ManuallyDrop<T>,
value: *mut T::Value,
// I'm not sure this is needed but better safe than sorry
_value: PhantomData<&'a mut T::Value>,
}
impl<'a, T: 'a + Guardable> DropGuard<'a, T> {
pub fn new<F: FnOnce(&'a mut T::Value) -> T>(value: Box<T::Value>, gen: F) -> Self {
// leak the value so it lives long enough
let leaked = Box::leak(value);
// get a pointer to know what to drop
let leaked_ptr: *mut _ = leaked;
DropGuard {
container: ManuallyDrop::new(gen(leaked)),
value: leaked_ptr,
_value: PhantomData,
}
}
}
// so you can actually use it
// no DerefMut since dropping the container without dropping the guard is weird
impl<'a, T: 'a + Guardable> std::ops::Deref for DropGuard<'a, T> {
type Target = T;
fn deref(&self) -> &Self::Target {
&self.container
}
}
impl<'a, T: 'a + Guardable> Drop for DropGuard<'a, T> {
fn drop(&mut self) {
// drop the container first
// this should be safe since self.container is never referenced again
// the value its borrowing is still valid (due to not being dropped yet)
// and there should be no references to it (due to this struct being dropped)
unsafe {
ManuallyDrop::drop(&mut self.container);
}
// now drop the pointer
// this should be safe since it was created with Box::leak
// and the container borrowing it has already been dropped
// and no more references should have survived
std::mem::drop(unsafe { Box::from_raw(self.value) });
}
}
impl Module {
pub fn resolve_foo(&self) -> i32 {
5
}
pub fn resolve_complex_thing_1(&self) -> DropGuard<ComplexThing> {
DropGuard::new(Box::new(self.resolve_foo()), |i32_ref| {
ComplexThing(i32_ref)
})
}
}
fn main() {
let module = Module;
let guard = module.resolve_complex_thing_1();
println!("{:?}", guard.0);
}
Since every container should only have one valid DropGuard value type, you can put that in an associated type in a trait, so now you can work with DropGuard<ComplexThing> instead of DropGuard<ComplexThing, i32>, and this also prevents you from having bogus values in the DropGuard.
Disclaimer: I'm fairly new to Rust; this answer is based on limited experience and may be un-nuanced.
As a general principle of Rust program design — not specific to dependency injection — you should plan not to use references except for things that are one of:
temporary, i.e. confined to the life of some stack frame (or technically longer than that in the case of async functions, but you get the idea, I hope)
compile-time constants or lazily initialized singletons, i.e. &'static references
The reason is that Rust does not — without various trickery — support lifetimes that are not one of those two cases.
Any structures which are needed for longer durations than that should be designed to not contain non-'static references — and instead use owned values. In other words, let your DI be like
pub trait ResolveOwn<T: 'static> {
// ^^^^^^^^^^
fn resolve(&self) -> T;
}
Don't actually add that lifetime constraint: it doesn't buy you anything, and might be inconvenient (for example, in a test that wants to inject things referring to the test — which will work fine since they live longer than the entire DI container — or if the application's actual main() has something to share, similarly). But plan as if it were there.
Given this constraint, how can you implement things that seem to want references?
In the simplest cases, just use owned values and don't worry about any extra cloning required unless it proves to be a performance issue.
Use Rc or Arc for reference-counted smart pointers that keep things alive as long as necessary.
If some T really requires references, but only into its own data, use a safe self-referential struct helper like ouroboros. (I believe this is similar to but more general than the suggestion in Aplet123's answer to this question.)
All of these strategies are independent of the DI container: they're ways to make a type satisfy the 'static lifetime bound ("contains no references except 'static ones").
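To make the Rc suggestion concrete, here is a minimal sketch of the ResolveOwn trait from the question with a ComplexThing that shares ownership of its dependency instead of borrowing it. The `Module::new` constructor and the `foo` field are hypothetical names I introduced for the illustration.

```rust
use std::rc::Rc;

pub trait ResolveOwn<T> {
    fn resolve(&self) -> T;
}

// Holds a shared, reference-counted handle rather than a `&'a i32`.
pub struct ComplexThing(pub Rc<i32>);

pub struct Module {
    foo: Rc<i32>,
}

impl Module {
    pub fn new(foo: i32) -> Self {
        Module { foo: Rc::new(foo) }
    }
}

impl ResolveOwn<Rc<i32>> for Module {
    fn resolve(&self) -> Rc<i32> {
        // Cloning an Rc only bumps the reference count.
        Rc::clone(&self.foo)
    }
}

impl ResolveOwn<ComplexThing> for Module {
    fn resolve(&self) -> ComplexThing {
        let foo: Rc<i32> = self.resolve();
        ComplexThing(foo)
    }
}
```

Because ComplexThing contains no references, it can be returned from resolve and can outlive the Module. Swap Rc for Arc if the values have to cross threads.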

With closures as parameter and return values, is Fn or FnMut more idiomatic?

Continuing from How do I write combinators for my own parsers in Rust?, I stumbled into this question concerning bounds of functions that consume and/or yield functions/closures.
From these slides, I learned that, to be convenient for consumers, you should try to take functions as FnOnce and return them as Fn where possible. This gives the caller the most freedom in what to pass and what to do with the returned function.
In my example, FnOnce is not possible because I need to call that function multiple times. While trying to make it compile I arrived at two possibilities:
pub enum Parsed<'a, T> {
Some(T, &'a str),
None(&'a str),
}
impl<'a, T> Parsed<'a, T> {
pub fn unwrap(self) -> (T, &'a str) {
match self {
Parsed::Some(head, tail) => (head, tail),
_ => panic!("Called unwrap on nothing."),
}
}
pub fn is_none(&self) -> bool {
match self {
Parsed::None(_) => true,
_ => false,
}
}
}
pub fn achar(character: char) -> impl Fn(&str) -> Parsed<char> {
move |input|
match input.chars().next() {
Some(c) if c == character => Parsed::Some(c, &input[c.len_utf8()..]), // skip the whole char, not just one byte
_ => Parsed::None(input),
}
}
pub fn some_v1<T>(parser: impl Fn(&str) -> Parsed<T>) -> impl Fn(&str) -> Parsed<Vec<T>> {
move |input| {
let mut re = Vec::new();
let mut pos = input;
loop {
match parser(pos) {
Parsed::Some(head, tail) => {
re.push(head);
pos = tail;
}
Parsed::None(_) => break,
}
}
Parsed::Some(re, pos)
}
}
pub fn some_v2<T>(mut parser: impl FnMut(&str) -> Parsed<T>) -> impl FnMut(&str) -> Parsed<Vec<T>> {
move |input| {
let mut re = Vec::new();
let mut pos = input;
loop {
match parser(pos) {
Parsed::Some(head, tail) => {
re.push(head);
pos = tail;
}
Parsed::None(_) => break,
}
}
Parsed::Some(re, pos)
}
}
#[test]
fn try_it() {
assert_eq!(some_v1(achar('#'))("##comment").unwrap(), (vec!['#', '#'], "comment"));
assert_eq!(some_v2(achar('#'))("##comment").unwrap(), (vec!['#', '#'], "comment"));
}
Now I don't know which version is to be preferred. Version 1 takes Fn which is less general, but version 2 needs its parameter mutable.
Which one is more idiomatic/should be used, and what is the rationale behind it?
Update: Thanks jplatte for the suggestion on version one. I updated the code here; I find that case even more interesting.
Comparing some_v1 and some_v2 as you wrote them I would say version 2 should definitely be preferred because it is more general. I can't think of a good example for a parsing closure that would implement FnMut but not Fn, but there's really no disadvantage to parser being mut - as noted in the first comment on your question this doesn't constrain the caller in any way.
However, there is a way in which you can make version 1 more general (not strictly more general, just partially) than version 2, and that is by returning impl Fn(&str) -> … instead of impl FnMut(&str) -> …. By doing that, you get two functions that each are less constrained than the other in some way, so it might even make sense to keep both:
Version 1 with the return type change would be more restrictive in its argument (the callable can't mutate its associated data) but less restrictive in its return type (you guarantee that the returned callable doesn't mutate its associated data)
Version 2 would be less restrictive in its argument (the callable is allowed to mutate its associated data) but more restrictive in its return type (the returned callable might mutate its associated data)
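The trade-off can be seen on a smaller pair of combinators. This is only a sketch: `twice_v1`, `twice_v2` and `compare` are hypothetical names standing in for some_v1 and some_v2, and they work on integers rather than parsers.

```rust
// Fn in, Fn out: the result can be called through a shared reference.
fn twice_v1(f: impl Fn(i32) -> i32) -> impl Fn(i32) -> i32 {
    move |x| f(f(x))
}

// FnMut in, FnMut out: accepts stateful closures, but the result needs `mut`.
fn twice_v2(mut f: impl FnMut(i32) -> i32) -> impl FnMut(i32) -> i32 {
    move |x| {
        let once = f(x);
        f(once)
    }
}

pub fn compare() -> (i32, i32, u32) {
    let plus_two = twice_v1(|x| x + 1);
    let shared = &plus_two; // no `mut` binding needed to call an Fn
    let a = shared(0);

    let mut calls = 0u32;
    // This closure mutates `calls`, so it is FnMut and twice_v1 would reject it.
    let mut counted = twice_v2(|x| {
        calls += 1;
        x + 1
    });
    let b = counted(0);
    drop(counted); // end the mutable borrow of `calls` before reading it
    (a, b, calls)
}
```

Passing the counting closure to twice_v1 does not compile, which is the restriction on the argument side; needing `let mut counted` is the restriction on the return side.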

How do I create & use a list of callback functions?

In Rust, I'm trying to create a list of callback functions to invoke later:
use std::vec::Vec;
fn add_to_vec<T: FnMut() -> ()>(v: &Vec<Box<FnMut() -> ()>>, f: T) {
v.push(Box::new(f));
}
fn call_b() {
println!("Call b.");
}
#[test]
fn it_works() {
let calls: Vec<Box<FnMut() -> ()>> = Vec::new();
add_to_vec(&calls, || { println!("Call a."); });
add_to_vec(&calls, call_b);
for c in calls.drain() {
c();
}
}
I'm mostly following the advice here on how to store a closure, however, I'm still seeing some errors:
src/lib.rs:6:12: 6:23 error: the parameter type `T` may not live long enough [E0311]
src/lib.rs:6 v.push(Box::new(f));
^~~~~~~~~~~
src/lib.rs:6:23: 6:23 help: consider adding an explicit lifetime bound for `T`
src/lib.rs:5:68: 7:2 note: the parameter type `T` must be valid for the anonymous lifetime #1 defined on the block at 5:67...
src/lib.rs:5 fn add_to_vec<T: FnMut() -> ()>(v: &Vec<Box<FnMut() -> ()>>, f: T) {
src/lib.rs:6 v.push(Box::new(f));
src/lib.rs:7 }
src/lib.rs:6:12: 6:23 note: ...so that the type `T` will meet its required lifetime bounds
src/lib.rs:6 v.push(Box::new(f));
^~~~~~~~~~~
I've tried changing the function signature to:
fn add_to_vec<'a, T: FnMut() -> ()>(v: &Vec<Box<FnMut() -> ()>>, f: &'a T) {
… but this gets me:
src/lib.rs:6:12: 6:23 error: the trait `core::ops::Fn<()>` is not implemented for the type `&T` [E0277]
src/lib.rs:6 v.push(Box::new(f));
^~~~~~~~~~~
error: aborting due to previous error
src/lib.rs:6:12: 6:23 error: the trait `core::ops::Fn<()>` is not implemented for the type `&T` [E0277]
src/lib.rs:6 v.push(Box::new(f));
^~~~~~~~~~~
src/lib.rs:18:24: 18:51 error: mismatched types:
expected `&_`,
found `[closure src/lib.rs:18:24: 18:51]`
(expected &-ptr,
found closure) [E0308]
src/lib.rs:18 add_to_vec(&calls, || { println!("Call a."); });
^~~~~~~~~~~~~~~~~~~~~~~~~~~
(The last error I can correct by adding a &; while I think this is something I should need, because add_to_vec is going to end up owning the closure, and thus needs to borrow it, I'm not entirely sure.)
There are a few problems with your code. Here’s a fully fixed version to begin with:
use std::vec::Vec;
fn add_to_vec<'a, T: FnMut() + 'a>(v: &mut Vec<Box<FnMut() + 'a>>, f: T) {
v.push(Box::new(f));
}
fn call_b() {
println!("Call b.");
}
#[test]
fn it_works() {
let mut calls: Vec<Box<FnMut()>> = Vec::new();
add_to_vec(&mut calls, || { println!("Call a."); });
add_to_vec(&mut calls, call_b);
for mut c in calls.drain() {
c();
}
}
The lifetime issue is that the boxed function objects must have a common base lifetime; if you just write the generic constraint T: FnMut(), it is assumed to only need to live as long as the function call and not any longer. Therefore two things need to be added to it all: the generic parameter T must be constrained to a specified lifetime, and in order to store it inside the vector, the trait object type must similarly be constrained, as Box<FnMut() + 'a>. That way they both match up and memory safety is ensured and so the compiler lets it through. The -> () part of FnMut() -> () is superfluous, by the way.
The remaining fixes that need to be made are the insertion of a few mut; in order to push to the vector, you naturally need a mutable reference, hence the & to &mut changes, and in order to take mutable references to calls and c the bindings must be made mut.
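The code above is written for the Rust of its time. On a current toolchain, trait objects are spelled with `dyn` and `drain` takes a range, so the same idea would look roughly like the sketch below. The `run_callbacks` wrapper and the `count` variable are my additions, there to show that the `'a` bound lets a callback borrow a local.

```rust
fn add_to_vec<'a, T: FnMut() + 'a>(v: &mut Vec<Box<dyn FnMut() + 'a>>, f: T) {
    v.push(Box::new(f));
}

fn call_b() {
    println!("Call b.");
}

pub fn run_callbacks() -> u32 {
    let mut count = 0;
    {
        // Element type is inferred from the first `add_to_vec` call.
        let mut calls = Vec::new();
        // This closure borrows `count`, so it is not `'static`.
        add_to_vec(&mut calls, || count += 1);
        add_to_vec(&mut calls, call_b);
        for mut c in calls.drain(..) {
            c();
        }
    } // `calls` is dropped here, releasing the borrow of `count`
    count
}
```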

Return a closure from a function

Note that this question pertains to a version of Rust before 1.0 was released
Do I understand correctly that it is now impossible to return a closure from a function, unless it was provided to the function in its arguments? It is very useful approach, for example, when I need the same block of code, parameterized differently, in different parts of program. Currently the compiler does not allow something like this, naturally:
fn make_adder(i: int) -> |int| -> int {
|j| i + j
}
The closure is allocated on the stack and is freed upon returning from a function, so it is impossible to return it.
Will it be possible to make this work in future? I heard that dynamically-sized types would allow this.
This can't ever work for a stack closure; it needs to either have no environment or own its environment. The DST proposals do include the possibility of reintroducing a closure type with an owned environment (~Fn), which would satisfy your need, but it is not clear yet whether that will happen or not.
In practice, there are other ways of doing this. For example, you might do this:
pub struct Adder {
    n: int,
}
impl Add<int, int> for Adder {
    #[inline]
    fn add(&self, rhs: &int) -> int {
        self.n + *rhs
    }
}
fn make_adder(i: int) -> Adder {
    Adder {
        n: i, // store the argument, not the type name
    }
}
Then, instead of make_adder(3)(4) == 7, it would be make_adder(3) + 4 == 7, or make_adder(3).add(&4) == 7. (Implementing Add<int, int>, rather than just writing impl Adder { fn add(&self, other: int) -> int { self.n + other } }, is merely to allow you the convenience of the + operator.)
This is a fairly silly example, as the Adder might just as well be an int in all probability, but it has its possibilities.
Let us say that you want to return a counter; you might wish to have it as a function which returns (0, func), the latter element being a function which will return (1, func), &c. But this can be better modelled with an iterator:
use std::num::{Zero, One};
struct Counter<T> {
value: T,
}
impl<T: Add<T, T> + Zero + One + Clone> Counter<T> {
fn new() -> Counter<T> {
Counter { value: Zero::zero() }
}
}
impl<T: Add<T, T> + Zero + One + Clone> Iterator<T> for Counter<T> {
#[inline]
fn next(&mut self) -> Option<T> {
let mut value = self.value.clone();
self.value += One::one();
Some(value)
}
// Optional, just for a modicum of efficiency in some places
#[inline]
fn size_hint(&self) -> (uint, Option<uint>) {
(uint::max_value, None)
}
}
Again, you see the notion of having an object upon which you call a method to mutate its state and return the desired value, rather than creating a new callable. And that's how it is: for the moment, where you might like to be able to call object(), you need to call object.method(). I'm sure you can live with that minor inconvenience that exists just at present.
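For readers on a post-1.0 compiler: closures that own their environment can now be returned directly, either as `impl Fn...` (stable since Rust 1.26) or boxed as a trait object. The following sketch shows today's equivalents of the examples above; `make_boxed_adder` and `make_counter` are names I made up for the illustration.

```rust
// `move` makes the closure own `i`, so it can outlive this stack frame.
fn make_adder(i: i32) -> impl Fn(i32) -> i32 {
    move |j| i + j
}

// Boxed variant, for when the concrete closure type has to be erased
// (for example to store adders of different origins in one Vec).
fn make_boxed_adder(i: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |j| i + j)
}

// The counter from above as a stateful closure: it mutates its own
// captured state, so it is FnMut rather than Fn.
fn make_counter() -> impl FnMut() -> u32 {
    let mut next = 0;
    move || {
        let current = next;
        next += 1;
        current
    }
}
```

With these, make_adder(3)(4) works as originally hoped. An iterator is arguably still the more natural model for the counter.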

Instantiating a struct with stdin data in Rust

I am very, very new to Rust and trying to implement some simple things to get the feel for the language. Right now, I'm stumbling over the best way to implement a class-like struct that involves casting a string to an int. I'm using a global-namespaced function and it feels wrong to my Ruby-addled brain.
What's the Rustic way of doing this?
use std::io;
struct Person {
name: ~str,
age: int
}
impl Person {
fn new(input_name: ~str) -> Person {
Person {
name: input_name,
age: get_int_from_input(~"Please enter a number for age.")
}
}
fn print_info(&self) {
println(fmt!("%s is %i years old.", self.name, self.age));
}
}
fn get_int_from_input(prompt_message: ~str) -> int {
println(prompt_message);
let my_input = io::stdin().read_line();
let my_val =
match from_str::<int>(my_input) {
Some(number_string) => number_string,
_ => fail!("got to put in a number.")
};
return my_val;
}
fn main() {
let first_person = Person::new(~"Ohai");
first_person.print_info();
}
This compiles and has the desired behaviour, but I am at a loss for what to do here--it's obvious I don't understand the best practices or how to implement them.
Edit: this is 0.8
Here is my version of the code, which I have made more idiomatic:
use std::io;
struct Person {
name: ~str,
age: int
}
impl Person {
fn print_info(&self) {
println!("{} is {} years old.", self.name, self.age);
}
}
fn get_int_from_input(prompt_message: &str) -> int {
println(prompt_message);
let my_input = io::stdin().read_line();
from_str::<int>(my_input).expect("got to put in a number.")
}
fn main() {
let first_person = Person {
name: ~"Ohai",
age: get_int_from_input("Please enter a number for age.")
};
first_person.print_info();
}
fmt!/format!
First, Rust is deprecating the fmt! macro, with printf-based syntax, in favor of format!, which uses syntax similar to Python format strings. The new version, Rust 0.9, will complain about the use of fmt!. Therefore, you should replace fmt!("%s is %i years old.", self.name, self.age) with format!("{} is {} years old.", self.name, self.age). However, we have a convenience macro println!(...) that means exactly the same thing as println(format!(...)), so the most idiomatic way to write your code in Rust would be
println!("{} is {} years old.", self.name, self.age);
Initializing structs
For a simple type like Person, it is idiomatic in Rust to create instances of the type by using the struct literal syntax:
let first_person = Person {
name: ~"Ohai",
age: get_int_from_input("Please enter a number for age.")
};
In cases where you do want a constructor, Person::new is the idiomatic name for a 'default' constructor (by which I mean the most commonly used constructor) for a type Person. However, it would seem strange for the default constructor to require initialization from user input. Usually, I think you would have a person module, for example (with person::Person exported by the module). In this case, I think it would be most idiomatic to use a module-level function fn person::prompt_for_age(name: ~str) -> person::Person. Alternatively, you could use a static method on Person -- Person::prompt_for_age(name: ~str).
&str vs. ~str in function parameters
I've changed the signature of get_int_from_input to take a &str instead of ~str. ~str denotes a string allocated on the exchange heap -- in other words, the heap that malloc/free in C, or new/delete in C++ operate on. Unlike in C/C++, however, Rust enforces the requirement that values on the exchange heap can only be owned by one variable at a time. Therefore, taking a ~str as a function parameter means that the caller of the function can't reuse the ~str argument that it passed in -- it would have to make a copy of the ~str using the .clone method.
On the other hand, &str is a slice into the string, which is just a reference to a range of characters in the string, so it doesn't require a new copy of the string to be allocated when a function with a &str parameter is called.
The reason to use &str rather than ~str for prompt_message in get_int_from_input is that the function doesn't need to hold onto the message past the end of the function. It only uses the prompt message in order to print it (and println takes a &str, not a ~str). Once you change the function to take &str, you can call it like get_int_from_input("Prompt") instead of get_int_from_input(~"Prompt"), which avoids the unnecessary allocation of "Prompt" on the heap (and similarly, you can avoid having to clone s in the code below):
let s: ~str = ~"Prompt";
let i = get_int_from_input(s.clone());
println(s); // Would complain that `s` is no longer valid without cloning it above
// if `get_int_from_input` takes `~str`, but not if it takes `&str`.
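The same reasoning carries over to current Rust, where the owned string type `~str` has become `String` and `&str` is unchanged. Here is a small sketch of the difference; the function names are hypothetical and only return the length so that there is something to observe.

```rust
// Borrows the text: accepts literals and `&String` alike, no allocation.
fn prompt_len_borrowed(prompt_message: &str) -> usize {
    prompt_message.len()
}

// Takes ownership: the caller gives the String away (or clones it first).
fn prompt_len_owned(prompt_message: String) -> usize {
    prompt_message.len()
}

pub fn borrow_vs_own() -> (usize, usize, String) {
    let s = String::from("Prompt");
    let a = prompt_len_borrowed(&s); // `s` is still usable afterwards
    let _ = prompt_len_borrowed("Prompt"); // a literal works directly
    let b = prompt_len_owned(s.clone()); // clone needed to keep `s`
    (a, b, s)
}
```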
Option<T>::expect
The Option<T>::expect method is the idiomatic shortcut for the match statement you have, where you want to either return x if you get Some(x) or fail with a message if you get None.
Returning without return
In Rust, it is idiomatic (following the example of functional languages like Haskell and OCaml) to return a value without explicitly writing a return statement. In fact, the return value of a function is the result of the last expression in the function, unless the expression is followed by a semicolon (in which case it returns (), a.k.a. unit, which is essentially an empty placeholder value -- () is also what is returned by functions without an explicit return type, such as main or print_info).
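Both points can be shown side by side in current Rust. Note the assumption: today `str::parse` returns a `Result` rather than an `Option`, but `Result::expect` plays the same role as `Option::expect`. The function names below are made up for the illustration.

```rust
// Idiomatic: `expect` replaces the match, and the last expression
// (no trailing semicolon) is the return value.
fn parse_age(input: &str) -> i64 {
    input
        .trim()
        .parse::<i64>()
        .expect("got to put in a number.")
}

// The longer form from the question, for comparison.
fn parse_age_verbose(input: &str) -> i64 {
    let value = match input.trim().parse::<i64>() {
        Ok(number) => number,
        Err(_) => panic!("got to put in a number."),
    };
    return value;
}
```

Both functions behave the same, including panicking with the same message on bad input.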
Conclusion
I'm not a great expert on Rust by any means. If you want help on anything related to Rust, you can try, in addition to Stack Overflow, the #rust IRC channel on irc.mozilla.org or the Rust subreddit.
This isn't really Rust-specific, but try to split functionality into discrete units. Don't mix the low-level tasks of putting strings on the terminal and getting strings from the terminal with the more directly relevant (and largely implementation-dependent) tasks of requesting a value and verifying it. When you do that, the design decisions you should make start to arise on their own.
For instance, you could write something like this (I haven't compiled it, and I'm new to Rust myself, so there's probably at LEAST one thing wrong with this :) ).
fn validated_input_prompt<T: FromStr>(prompt: &str) -> T {
    println(prompt);
    loop {
        // read a fresh line on every attempt, otherwise a bad first
        // input would be re-checked forever
        let res = io::stdin().read_line();
        match from_str::<T>(res) {
            Some(t) => {
                return t;
            },
            None => {
                println("ERROR. Please try again.");
                println(prompt);
            }
        }
    }
}
And then use it as:
validated_input_prompt::<int>("Enter a number:")
or:
validated_input_prompt::<char>("Enter a Character:")
BUT, to make the latter work, you'd need to implement FromStr for chars, because (sadly) rust doesn't seem to do it by default. Something LIKE this, but again, I'm not really sure of the rust syntax for this.
use std::from_str::*;
impl FromStr for char {
fn from_str(s: &str) -> Option<Self> {
match len(s) {
x if x >= 1 => {
Option<char>.None
},
x if x == 0 => {
None,
},
}
return s[0];
}
}
A variation of telotortium's input reading function that doesn't fail on bad input. The loop { ... } keyword is preferred over writing while true { ... }. In this case using return is fine since the function is returning early.
fn int_from_input(prompt: &str) -> int {
println(prompt);
loop {
match from_str::<int>(io::stdin().read_line()) {
Some(x) => return x,
None => println("Oops, that was invalid input. Try again.")
};
}
}
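All of the snippets above target Rust 0.8. For comparison, here is a rough sketch of the same retry loop in current Rust, generic over any type that implements `FromStr` (which nowadays includes `char`). The name `prompt_parse` and the choice to pass the reader and writer as parameters are my own; doing so keeps the function testable without a terminal.

```rust
use std::io::{self, BufRead, Write};
use std::str::FromStr;

/// Prompts on `output` and reads lines from `input` until one parses as `T`.
/// Returns `Ok(None)` if the input ends before a valid value was entered.
pub fn prompt_parse<T, R, W>(mut input: R, mut output: W, prompt: &str) -> io::Result<Option<T>>
where
    T: FromStr,
    R: BufRead,
    W: Write,
{
    let mut line = String::new();
    loop {
        writeln!(output, "{}", prompt)?;
        line.clear();
        // `read_line` returns the number of bytes read; 0 means end of input.
        if input.read_line(&mut line)? == 0 {
            return Ok(None);
        }
        match line.trim().parse::<T>() {
            Ok(value) => return Ok(Some(value)),
            Err(_) => writeln!(output, "Oops, that was invalid input. Try again.")?,
        }
    }
}
```

In a real program you would call it as `prompt_parse::<i32, _, _>(io::stdin().lock(), io::stdout(), "Please enter a number for age.")`.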

Resources