Why does Rust not have a return value in the main function, and how to return a value anyway?

In Rust the main function is defined like this:
fn main() {
}
This function does not allow for a return value though. Why would a language not allow for a return value and is there a way to return something anyway? Would I be able to safely use the C exit(int) function, or will this cause leaks and whatnot?

As of Rust 1.26, main can return a Result:
use std::fs::File;
fn main() -> Result<(), std::io::Error> {
    let f = File::open("bar.txt")?;
    Ok(())
}
If an error occurs, the process exits with code 1. With File::open("bar.txt").expect("file not found"); instead, the program panics and exits with code 101 (at least on my machine).
Also, if you want to return a more generic error, use:
use std::error::Error;
...
fn main() -> Result<(), Box<dyn Error>> {
...
}

std::process::exit(code: i32) is the way to exit with a code.
Rust does it this way so that there is a consistent explicit interface for returning a value from a program, wherever it is set from. If main starts a series of tasks then any of these can set the return value, even if main has exited.
Rust does have a way to write a main function that returns a value, however it is normally abstracted within stdlib. See the documentation on writing an executable without stdlib for details.
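On the question of whether exiting this way leaks: std::process::exit ends the process immediately and does not run destructors for values still alive on the stack. A minimal sketch of one common way around that, assuming a hypothetical helper named run (not part of the original answer): do the real work in a function that returns the code, so that its locals are dropped before exit is called.

```rust
use std::process;

/// Hypothetical helper: does the real work and returns an exit code.
/// Everything it owns is dropped when it returns, before `process::exit` runs.
fn run(input: &str) -> i32 {
    let buffer = input.trim().to_string(); // dropped at the end of `run`
    match buffer.parse::<i32>() {
        Ok(_) => 0,  // success
        Err(_) => 2, // arbitrary non-zero code chosen for this sketch
    }
}

fn main() {
    let code = run("42");
    // No locals with destructors are alive at this point, so nothing is skipped.
    process::exit(code);
}
```

Keeping main this thin means the only thing exit can skip is cleanup that main itself owns.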

As others have noted, std::process::exit(code: i32) is the way to go here.
More information about why is given in RFC 1011: Process Exit. Discussion about the RFC is in the pull request of the RFC.

The reddit thread on this has a "why" explanation:
Rust certainly could be designed to do this. It used to, in fact.
But because of the task model Rust uses, the fn main task could start a bunch of other tasks and then exit! But one of those other tasks may want to set the OS exit code after main has gone away.
Calling set_exit_status is explicit, easy, and doesn't require you to always put a 0 at the bottom of main when you otherwise don't care.

Try:
use std::process::ExitCode;
fn main() -> ExitCode {
    ExitCode::from(2)
}
See the documentation for std::process::ExitCode,
or:
use std::process::{ExitCode, Termination};
pub enum LinuxExitCode { E_OK, E_ERR(u8) }
impl Termination for LinuxExitCode {
    fn report(self) -> ExitCode {
        match self {
            LinuxExitCode::E_OK => ExitCode::SUCCESS,
            LinuxExitCode::E_ERR(v) => ExitCode::from(v)
        }
    }
}
fn main() -> LinuxExitCode {
    LinuxExitCode::E_ERR(3)
}

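In old, pre-1.0 versions of Rust you could set the return value with std::os::set_exit_status. That function no longer exists in current Rust; use std::process::exit or return an ExitCode from main instead.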

Related

Why does the `set` method defined on `Cell<T>` explicitly drop the old value? (Rust)

I'm interested in why the set method defined on Cell explicitly drops the old value on its last line.
Shouldn't it be implicitly dropped (memory freed) anyways when the function returns?
use std::mem;
use std::cell::UnsafeCell;
pub struct Cell<T> {
    value: UnsafeCell<T>
}
impl<T> Cell<T> {
    pub fn set(&self, val: T) {
        let old = self.replace(val);
        drop(old); // Is this needed?
    } // old would drop here anyways?
    pub fn replace(&self, val: T) -> T {
        mem::replace(unsafe { &mut *self.value.get() }, val)
    }
}
So why not have set do this only:
pub fn set(&self, val: T) {
    self.replace(val);
}
Or does std::ptr::read do something I don't understand?
It is not needed, but calling drop explicitly can make code easier to read in some cases. If set were written as only a call to replace, it would look like a plain wrapper around replace, and a reader might miss that it does something extra on top of that call (dropping the previous value). In the end, which version to use is somewhat subjective, and it makes no functional difference.
That being said, the real reason is that it did not always drop the previous value when set. Cell<T> previously implemented set to overwrite the existing value via unsafe pointer operations. It was later modified in rust-lang/rust#39264: Extend Cell to non-Copy types so that the previous value would always be dropped. The writer (wesleywiser) likely wanted to more explicitly show that the previous value was being dropped when a new value is written to the cell so the pull request would be easier to review.
Personally, I think this is a good usage of drop since it helps to convey what we intend to do with the result of the replace method.
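To check the claim that the explicit drop makes no functional difference, here is a small sketch. It does not use the real Cell; set_explicit, set_implicit, Noisy and trace are hypothetical names made up for this illustration. Both setters swap a value with mem::replace, and a Drop impl records when each value is destroyed.

```rust
use std::cell::RefCell;
use std::mem;
use std::rc::Rc;

/// Hypothetical type that logs its own destruction.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Mirrors the version with the explicit `drop(old)`.
fn set_explicit(slot: &mut Noisy, val: Noisy) {
    let old = mem::replace(slot, val);
    drop(old);
}

/// Mirrors the version that relies on the old value going out of scope.
fn set_implicit(slot: &mut Noisy, val: Noisy) {
    let _old = mem::replace(slot, val);
} // `_old` is dropped here

/// Runs one setter and returns the recorded sequence of events.
fn trace(setter: fn(&mut Noisy, Noisy)) -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let mut slot = Noisy { name: "old", log: Rc::clone(&log) };
    setter(&mut slot, Noisy { name: "new", log: Rc::clone(&log) });
    log.borrow_mut().push("after set".to_string());
    drop(slot);
    let events = log.borrow().clone();
    events
}

fn main() {
    println!("{:?}", trace(set_explicit));
    println!("{:?}", trace(set_implicit));
}
```

In both cases the old value is dropped before the setter returns, so the two traces are identical.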

Returning a constructed recursive data structure in Rust

I've just started learning Rust and am coming from a functional programming background. I'm trying to create a parser in Rust and have defined this recursive data structure:
enum SimpleExpression<'a> {
    Number(u64),
    FunctionCall(Function, &'a SimpleExpression<'a>, &'a SimpleExpression<'a>)
}
I initially had defined it as
enum SimpleExpression {
    Number(u64),
    FunctionCall(Function, SimpleExpression, SimpleExpression)
}
but got a complaint from the compiler saying this type had an infinite size. I'm not used to having to worry about managing memory so this confused me for a second but now it makes a lot of sense. Rust cannot allocate memory for a data structure where the size is not defined. So changing the SimpleExpression to &'a SimpleExpression<'a> makes sense.
The second problem I came across was when implementing the parsing function (excuse the verbosity):
fn parse_simple_expr(input: &str) -> Option<(SimpleExpression, usize)> {
    match parse_simple_expr(&input) {
        Some((left, consumed1)) => match parse_function_with_whitespace(&input[consumed1..]) {
            Some((func, consumed2)) => match parse_simple_expr(&input[consumed1+consumed2..]) {
                Some((right, consumed3)) => Some((
                    SimpleExpression::FunctionCall(func, &left.clone(), &right.clone()),
                    consumed1 + consumed2 + consumed3
                )),
                None => None
            },
            None => None
        },
        None => None
    }
}
What is wrong with this function is that I am creating SimpleExpression objects inside the function and then trying to return references to them. The objects are dropped when the function returns, and Rust does not allow dangling references, so I get the error "cannot return value referencing temporary value" on &left.clone() and &right.clone().
It makes sense to me why this does not work but I am wondering if there is another way to execute this pattern to be able to create a recursive object and return it from a function. Or is there some fundamental reason this will never work in which case are there any good alternatives? Since my code is confusing I've also provided a simpler example of a recursive structure but that has the same limitations in case that helps to better understand the issue.
enum List<'a> {
    End,
    Next((char, &'a List<'a>))
}
fn create_linked(input: &str) -> List {
    match input.chars().next() {
        Some(c) => List::Next((c, &create_linked(&input[1..]))),
        None => List::End
    }
}
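One common alternative is to let each node own its child through a Box instead of borrowing it. A Box has a known size (it is a pointer), so the type is no longer infinitely sized, and because the parent owns the child there is no reference that could dangle. A minimal sketch for the simplified list, under two assumptions of mine: the tuple inside Next is flattened into two fields, and the rest of the string is taken with chars.as_str() rather than &input[1..] so that multi-byte characters do not cause a panic.

```rust
#[derive(Debug, PartialEq)]
enum List {
    End,
    // The child is heap-allocated and owned by its parent.
    Next(char, Box<List>),
}

fn create_linked(input: &str) -> List {
    let mut chars = input.chars();
    match chars.next() {
        // `chars.as_str()` is the remainder of the input after the first character.
        Some(c) => List::Next(c, Box::new(create_linked(chars.as_str()))),
        None => List::End,
    }
}

fn main() {
    println!("{:?}", create_linked("ab"));
}
```

The same change should carry over to SimpleExpression: replace each &'a SimpleExpression<'a> with Box<SimpleExpression>, drop the lifetime parameter, and build the node with Box::new(left) and Box::new(right).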

Why is the value moved into the closure here rather than borrowed?

The Error Handling chapter of the Rust Book contains an example on how to use the combinators of Option and Result. A file is read and through application of a series of combinators the contents are parsed as an i32 and returned in a Result<i32, String>.
Now, I got confused when I looked at the code. There, in one closure passed to an and_then, a local String value is created and subsequently passed as a return value to another combinator.
Here is the code example:
use std::fs::File;
use std::io::Read;
use std::path::Path;
fn file_double<P: AsRef<Path>>(file_path: P) -> Result<i32, String> {
    File::open(file_path)
        .map_err(|err| err.to_string())
        .and_then(|mut file| {
            let mut contents = String::new(); // local value
            file.read_to_string(&mut contents)
                .map_err(|err| err.to_string())
                .map(|_| contents) // moved without 'move'
        })
        .and_then(|contents| {
            contents.trim().parse::<i32>()
                .map_err(|err| err.to_string())
        })
        .map(|n| 2 * n)
}
fn main() {
    match file_double("foobar") {
        Ok(n) => println!("{}", n),
        Err(err) => println!("Error: {}", err),
    }
}
The value I am referring to is contents. It is created and later referenced in the map combinator applied to the std::io::Result<usize> return value of Read::read_to_string.
The question: I thought that not marking the closure with move would borrow any referenced value by default, which would result in the borrow checker complaining that contents does not live long enough. However, this code compiles just fine. That means the String contents is moved into, and subsequently out of, the closure. Why is this done without the explicit move?
I thought that not marking the closure with move would borrow any referenced value by default,
Not quite. The compiler does a bit of inspection on the code within the closure body and tracks how the closed-over variables are used.
When the compiler sees that a method is called on a variable, then it looks to see what type the receiver is (self, &self, &mut self). When a variable is used as a parameter, the compiler also tracks if it is by value, reference, or mutable reference. Whatever the most restrictive requirement is will be what is used by default.
Occasionally, this analysis is not complete enough — even though the variable is only used as a reference, we intend for the closure to own the variable. This usually occurs when returning a closure or handing it off to another thread.
In this case, the variable is returned from the closure, which must mean that it is used by value. Thus the variable will be moved into the closure automatically.
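The three capture modes described above can be seen side by side in a small sketch. The function names are made up for illustration; none of the closures uses the move keyword, and the compiler picks the capture mode from how the variable is used in the closure body.

```rust
/// The closure only calls `len(&self)`, so `text` is captured by shared reference.
fn capture_by_ref() -> (usize, String) {
    let text = String::from("hello");
    let get_len = || text.len();
    let n = get_len();
    (n, text) // `text` is still usable after the closure is done
}

/// The closure mutates `count`, so `count` is captured by mutable reference.
fn capture_by_mut_ref() -> i32 {
    let mut count = 0;
    let mut bump = || count += 1;
    bump();
    bump();
    count
}

/// The closure returns `contents` by value, so it is moved in without `move`.
fn capture_by_value() -> String {
    let contents = String::from("42");
    let give = || contents;
    // `contents` can no longer be used here; it now lives inside `give`.
    give()
}

fn main() {
    println!("{:?}", capture_by_ref());
    println!("{}", capture_by_mut_ref());
    println!("{}", capture_by_value());
}
```

The last function has the same shape as .map(|_| contents) in the question: returning the String forces a by-value capture.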
Occasionally the move keyword is too big of a hammer as it moves all of the referenced variables in. Sometimes you may want to just force one variable to be moved in but not others. In that case, the best solution I know of is to make an explicit reference and move the reference in:
fn main() {
    let a = 1;
    let b = 2;
    {
        let b = &b;
        needs_to_own_a(move || a_function(a, b));
    }
}
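Since needs_to_own_a and a_function are placeholders, here is a self-contained variant of the same trick that compiles on its own. It is only a sketch: the function name selective_move and the two strings are assumptions for illustration. Shadowing shared with a reference means the move closure takes ownership of owned but only copies the reference to shared.

```rust
fn selective_move() -> (usize, String) {
    let owned = String::from("moved in");       // should be moved into the closure
    let shared = String::from("only borrowed"); // should stay usable afterwards
    let total = {
        // Shadow `shared` with a reference; `move` then moves the reference, not the String.
        let shared = &shared;
        let closure = move || owned.len() + shared.len();
        closure()
    };
    // `owned` was moved into the closure and is gone; `shared` is still ours.
    (total, shared)
}

fn main() {
    println!("{:?}", selective_move());
}
```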

Can I create an "unsafe closure"?

I have some code that, when simplified, looks like:
fn foo() -> Vec<u8> {
    unsafe {
        unsafe_iterator().map(|n| wrap_element(n)).collect()
    }
}
The iterator returns items that would be invalidated if the underlying data changed. Sadly, I'm unable to rely on the normal Rust mechanism of mut here (I'm doing some... odd things).
To rectify the unsafe-ness, I traverse the iterator all at once and make copies of each item (via wrap_element) and then throw it all into a Vec. This works because nothing else has a chance to come in and modify the underlying data.
The code works as-is now, but since I use this idiom a few times, I wanted to DRY up my code a bit:
fn zap<F>(f: F) -> Vec<u8>
    where F: FnOnce() -> UnsafeIter
{
    f().map(|n| wrap_element(n)).collect()
}
fn foo() -> Vec<u8> {
    zap(|| unsafe { unsafe_iterator() }) // Unsafe block
}
My problem with this solution is that the call to unsafe_iterator is unsafe, and it's the wrap_element / collect that makes it safe again. The way that the code is structured does not convey that at all.
I'd like to somehow mark my closure as being unsafe, and then it's zap's responsibility to make it safe again.
It's not possible to create an unsafe closure in the same vein as an unsafe fn, since closures are just anonymous types with implementations of the Fn, FnMut, and/or FnOnce family of traits. Since those traits do not have unsafe methods, it's not possible to create a closure which is unsafe to call.
You could create a second set of closure traits with unsafe methods, then write implementations for those, but you would lose much of the closure sugar.
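A rough sketch of what such a second trait could look like follows. Everything in it is hypothetical: UnsafeFnOnce, RawIter and RawSource are names invented here, with RawIter standing in for the question's UnsafeIter. It shows only the shape of the idea and does not settle where the safety contract should live.

```rust
/// Hypothetical counterpart of `FnOnce() -> Output` whose call is `unsafe`.
trait UnsafeFnOnce {
    type Output;
    /// # Safety
    /// The caller must uphold whatever contract the implementor documents.
    unsafe fn call_once(self) -> Self::Output;
}

/// Stand-in for the question's `UnsafeIter`: yields bytes read through a raw pointer.
struct RawIter {
    ptr: *const u8,
    remaining: usize,
}

impl Iterator for RawIter {
    type Item = u8;
    fn next(&mut self) -> Option<u8> {
        if self.remaining == 0 {
            return None;
        }
        // SAFETY: valid only while the memory behind `ptr` is alive and unchanged,
        // which is what the caller of `call_once` promised.
        let byte = unsafe {
            let b = *self.ptr;
            self.ptr = self.ptr.add(1);
            b
        };
        self.remaining -= 1;
        Some(byte)
    }
}

/// The "closure" has to be written out by hand as a struct: this is the lost sugar.
struct RawSource {
    ptr: *const u8,
    len: usize,
}

impl UnsafeFnOnce for RawSource {
    type Output = RawIter;
    unsafe fn call_once(self) -> RawIter {
        RawIter { ptr: self.ptr, remaining: self.len }
    }
}

/// Calls the unsafe "closure" and immediately copies everything out,
/// so the unsafe step and the step that makes it safe again sit together.
fn zap<F>(f: F) -> Vec<u8>
where
    F: UnsafeFnOnce,
    F::Output: Iterator<Item = u8>,
{
    // SAFETY (assumed for this sketch): nothing can touch the underlying
    // data between creating the iterator and draining it.
    let iter = unsafe { f.call_once() };
    iter.collect()
}

fn main() {
    let data = [1u8, 2, 3];
    let source = RawSource { ptr: data.as_ptr(), len: data.len() };
    println!("{:?}", zap(source));
}
```

The cost is visible at the call site: instead of writing || unsafe_iterator(), the caller has to define and construct a struct for every such "closure".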

Is it necessary to use else branch in async expressions?

I want to write the following code:
let someAsync () = async {
    if 1 > 2 then return true // Error "this expression is expected to have type unit ..."
    // I want to place much code here
    return false
}
F# for some reason thinks that I need to write it like that:
let someAsync () = async {
    if 1 > 2 then return true
    else
        // Much code here (indented!)
        return false
}
In the latter case no error message is produced. But in my view both pieces of code are equivalent. Is there any chance I could avoid unnecessary nesting and indentation?
Update: What I am asking is indeed possible! Please take a look at this example; see the section "Real world example".
I will quote the code:
let validateName(arg:string) = imperative {
    if (arg = null) then return false // <- HERE IT IS
    let idx = arg.IndexOf(" ")
    if (idx = -1) then return false // <- HERE IT IS
    // ......
    return true
}
So, it is possible, the only question is if it is possible to implement somehow in async, via an extension to module or whatever.
I think that situation is described here: Conditional Expressions: if... then...else (F#)
(...) if the type of the then branch is any type other than unit,
there must be an else branch with the same return type.
Your first code does not have else branch, which caused an error.
There is an important difference between the async computation builder and my imperative builder.
In async, you cannot create a useful computation that does not return a value. This means that Async<'T> represents a computation that will eventually produce a value of type 'T. In this case, the async.Zero method has to return unit and has a signature:
async.Zero : unit -> Async<unit>
For the imperative builder, the type Imperative<'T> represents a computation that may or may not return a value. If you look at the type declaration, it looks as follows:
type Imperative<'T> = unit -> option<'T>
This means that the Zero operation (which is used when you write if without else) can be a computation of any type. So, the imperative.Zero method returns a computation of any type:
imperative.Zero : unit -> Imperative<'T>
This is a fundamental difference which also explains why you can create if without else branch (because the Zero method can create computation of any type). This is not possible for async, because Zero can only create unit-returning values.
So the two computations have different structures. In particular, "imperative" computations have a monoidal structure and async workflows do not. A more detailed explanation is in our F# Computation Zoo paper.
