How to parse an octal string as a float in Rust?

I need to take an octal string, such as "42.1", and get a float from it (34.125). What's the best way to do this in Rust? I see there was previously a from_str_radix function for floats, but it has since been removed.

use std::fmt;
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ParseFloatError {
    _private: (),
}
impl ParseFloatError {
    fn new() -> ParseFloatError {
        ParseFloatError { _private: () }
    }
}
impl fmt::Display for ParseFloatError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "Could not parse float")
    }
}
pub fn parse_float_radix(s: &str, radix: u32) -> Result<f64, ParseFloatError> {
    // Parse all digits as one integer, ignoring the radix point.
    let s2 = s.replace(".", "");
    let i = i64::from_str_radix(&s2, radix).map_err(|_| ParseFloatError::new())?;
    // Count the digits after the radix point; more than one point is an error.
    let count = s.split('.').count();
    let fraction_len: usize;
    match count {
        0 => unreachable!(),
        1 => fraction_len = 0,
        2 => fraction_len = s.split('.').last().unwrap().len(),
        _ => return Err(ParseFloatError::new()),
    }
    // Scale the integer down by radix^fraction_len.
    let f = (i as f64) / f64::from(radix).powi(fraction_len as i32);
    Ok(f)
}
fn main() {
    println!("{}", parse_float_radix("42.1", 8).unwrap());
}
It first parses the input as an integer and then divides it by radix^number_of_fractional_digits. For "42.1" in base 8, the digits 421 parse to 273, and 273 / 8^1 = 34.125.
It doesn't support scientific notation or special values like infinity or NaN. It also fails if the intermediate integer overflows.
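The overflow limitation mentioned above can be avoided by accumulating the digits directly in an f64 instead of going through an intermediate integer. The following is only a sketch under that assumption: the function name `parse_float_radix_digits` is made up for illustration, it handles neither signs nor exponents, and it makes no claim of correctly rounded results for long inputs.

```rust
/// Hypothetical helper (not part of std): parses an unsigned string such as
/// "42.1" in the given radix by accumulating digits in an f64.
pub fn parse_float_radix_digits(s: &str, radix: u32) -> Option<f64> {
    // char::to_digit panics for radices above 36, so reject them up front.
    if !(2..=36).contains(&radix) {
        return None;
    }
    // Split at the first radix point; a second point fails digit parsing below.
    let (int_part, frac_part) = match s.split_once('.') {
        Some((i, f)) => (i, f),
        None => (s, ""),
    };
    if int_part.is_empty() && frac_part.is_empty() {
        return None;
    }
    let base = f64::from(radix);
    let mut value = 0.0;
    // Integer digits: shift the running value left by one digit each step.
    for c in int_part.chars() {
        value = value * base + f64::from(c.to_digit(radix)?);
    }
    // Fractional digits: each one is worth base^-1, base^-2, ...
    let mut scale = 1.0 / base;
    for c in frac_part.chars() {
        value += f64::from(c.to_digit(radix)?) * scale;
        scale /= base;
    }
    Some(value)
}
```

Called as parse_float_radix_digits("42.1", 8), this yields Some(34.125); invalid digits or a second radix point yield None.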

Since posting this question, a crate has appeared that solves this: lexical. Compiling with the radix feature enables a parse_radix function, which can parse strings into floats with radices from 2 to 36.

Related

Parsing an f64 variable into a usize variable in Rust

I have recently been dabbling in the Rust programming language and decided a good way to test my skills was to write an application that finds the median of any given list of numbers.
Eventually I got to the final stretch of the code and stumbled into a problem.
I need to convert an f64 variable into a usize variable.
However, I don't know how to go about doing this (Wow what a surprise!).
Take a look at the second function, calc_med(), in my code. The variable n2 is supposed to take n and convert it into a usize. The code is not finished yet, but if you can see any more problems with it, please let me know.
use std::io;
use std::sync::Mutex;
#[macro_use]
extern crate lazy_static;
lazy_static! {
    static ref v1: Mutex<Vec<f64>> = Mutex::new(Vec::new());
}
fn main() {
    loop {
        println!("Enter: ");
        let mut inp: String = String::new();
        io::stdin().read_line(&mut inp).expect("Failure");
        let upd_inp: f64 = match inp.trim().parse() {
            Ok(num) => num,
            Err(_) => if inp.trim() == String::from("q") {
                break;
            } else if inp.trim() == String::from("d"){
                break
                {
                    println!("Done!");
                    calc_med();
                }
            } else {
                continue;
            }
        };
        v1.lock().unwrap().push(upd_inp);
        v1.lock().unwrap().sort_by(|a, b| a.partial_cmp(b).unwrap());
        println!("{:?}", v1.lock().unwrap());
    }
}
fn calc_med() { // FOR STACKOVERFLOW: THIS FUNCTION
    let n: f64 = ((v1.lock().unwrap().len()) as f64 + 1.0) / 2.0;
    let n2: usize = n.to_usize().expect("Failure");
    let median: f64 = v1[n2];
    println!("{}", median)
}
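One possible way to approach the f64-to-usize conversion asked about above, as a sketch that is not taken from the original thread: an `as` cast converts an f64 to a usize (truncating toward zero and saturating at the bounds), and the median index can be computed with integer arithmetic so that no float conversion is needed at all. The names `f64_to_index` and `median` are hypothetical.

```rust
/// Hypothetical helper: converts an f64 to a usize only when it is a
/// non-negative whole number that fits; a plain `n as usize` would
/// silently truncate or saturate instead.
fn f64_to_index(n: f64) -> Option<usize> {
    // NaN and infinities fail these comparisons and end up as None.
    let in_range = n >= 0.0 && n < usize::MAX as f64;
    if in_range && n.fract() == 0.0 {
        Some(n as usize)
    } else {
        None
    }
}

/// Hypothetical helper: median of a slice, using 0-based integer indices.
fn median(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 1 {
        // Odd length: the single middle element.
        Some(sorted[mid])
    } else {
        // Even length: mean of the two middle elements.
        Some((sorted[mid - 1] + sorted[mid]) / 2.0)
    }
}
```

Note that (len + 1) / 2 in the original calc_med() is a 1-based position, while Vec indexing is 0-based, which is why the sketch uses len / 2.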

How to typecast fixed size byte array as struct?

I want to reinterpret a stack allocated byte array as a stack allocated (statically guaranteed) struct without doing any work - just to tell the compiler that "Yes, I promise they are the same size and anything". How do I do that?
I tried transmute, but it doesn't compile.
fn from_u8_fixed_size_array<T>(arr: [u8; size_of::<T>()]) -> T {
    unsafe { mem::transmute(arr) }
}
cannot transmute between types of different sizes, or dependently-sized types E0512
Note: source type: `[u8; _]` (this type does not have a fixed size)
Note: target type: `T` (this type does not have a fixed size)
There is also this variant of such a function, that compiles, but it requires T to be Copy:
fn from_u8_fixed_size_array<T: Copy>(arr: [u8; size_of::<T>()]) -> T {
    unsafe { *(&arr as *const [u8; size_of::<T>()] as *const T) }
}
With Rust 1.64 I have a compilation error on [u8; size_of::<T>()] (cannot perform const operation using T).
I tried with a const generic parameter but the problem is still the same (I cannot introduce a where clause to constrain this constant to match size_of::<T>()).
Since the array is passed by value and the result is a value, some bytes have to be copied; this implies a kind of memcpy().
I suggest using a slice instead of an array and checking the size at runtime.
If you are ready to deal with undefined behaviour, you might consider the second version which does not copy anything: it just reinterprets the storage as is.
I'm not certain I would do that, however...
Edit
The original code was compiled with nightly and a specific feature.
We can simply use transmute_copy() to get the array by value and emit a value.
Also, I think the functions themselves should be marked unsafe, rather than just some of their operations, because nothing guarantees (statically) that these conversions are correct.
#![feature(generic_const_exprs)] // nightly required
unsafe fn from_u8_slice_v1<T>(arr: &[u8]) -> T {
    let mut result = std::mem::MaybeUninit::<T>::uninit();
    let src = &arr[0] as *const u8;
    let dst = result.as_mut_ptr() as *mut u8;
    let count = std::mem::size_of::<T>();
    assert_eq!(count, arr.len());
    std::ptr::copy_nonoverlapping(src, dst, count);
    result.assume_init()
}
unsafe fn from_u8_slice_v2<T>(arr: &[u8]) -> &T {
    let size = std::mem::size_of::<T>();
    let align = std::mem::align_of::<T>();
    assert_eq!(size, arr.len());
    let addr = &arr[0] as *const _ as usize;
    assert_eq!(addr % align, 0);
    &*(addr as *const T) // probably UB
}
unsafe fn from_u8_fixed_size_array<T>(
    arr: [u8; std::mem::size_of::<T>()]
) -> T {
    std::mem::transmute_copy(&arr)
}
fn main() {
    let a = [1, 2];
    println!("{:?}", a);
    let i1 = unsafe { from_u8_slice_v1::<i16>(&a) };
    println!("{:?}", i1);
    let i2 = unsafe { from_u8_slice_v2::<i16>(&a) };
    println!("{:?}", i2);
    let i3 = unsafe { from_u8_fixed_size_array::<i16>(a) };
    println!("{:?}", i3);
}
/*
[1, 2]
513
513
513
*/
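On stable Rust, where [u8; size_of::<T>()] cannot be written in a generic signature, one workaround is to take the length as a separate const generic parameter and compare it with size_of::<T>() at run time. The sketch below assumes that trade-off is acceptable; the name `from_u8_array` is hypothetical, and the caller remains responsible for the bytes forming a valid T.

```rust
use std::mem;

/// Hypothetical helper. Safety: the N bytes must form a valid value of `T`.
unsafe fn from_u8_array<T, const N: usize>(arr: [u8; N]) -> T {
    // Stable Rust cannot express `N == size_of::<T>()` in the signature,
    // so the sizes are compared at run time instead.
    assert_eq!(N, mem::size_of::<T>(), "array length must equal size_of::<T>()");
    // transmute_copy reads a `T` out of the bytes and is documented to work
    // even when `T` has stricter alignment than the byte array.
    unsafe { mem::transmute_copy(&arr) }
}
```

For plain integers, the safe constructors such as u16::from_le_bytes or u16::from_ne_bytes do the same job without any unsafe code.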

Why do structs share the same address when created, and have different addresses from creating when dropped

I am attempting to log a struct's address when the struct is created and when it is dropped. When I run the code below, not only do both structs log the same address on creation, they also both log a different address when being dropped. Is there a correct way to do this?
struct TestStruct {
    val: i32
}
impl TestStruct {
    fn new(val: i32) -> Self {
        let x = TestStruct{val};
        println!("creating struct {:p}", &x as *const _);
        x
    }
}
impl Drop for TestStruct {
    fn drop(&mut self) {
        println!("destroying struct {:p}", &self as *const _)
    }
}
fn main() {
    let s1 = TestStruct::new(1);
    let s2 = TestStruct::new(2);
}
Output:
creating struct 0x7ffef1f96e44
creating struct 0x7ffef1f96e44
destroying struct 0x7ffef1f96e38
destroying struct 0x7ffef1f96e38
In new() you're printing the address of the local variable x. When new() returns, x is moved to the caller, so that is no longer the value's actual address; both calls happen to place x in the same stack slot, which is why you see the same address repeated.
See also "Is a returned value moved or not?".
In drop(), you are actually printing the address of the reference (&mut Self) and not of Self itself. You need to change &self as *const _ to just self, as self is already a reference. It then correctly prints two different addresses.
If you also print the addresses of s1 and s2 in main() instead of inside new(), the addresses match.
impl TestStruct {
    fn new(val: i32) -> Self {
        let x = TestStruct { val };
        x
    }
}
impl Drop for TestStruct {
    fn drop(&mut self) {
        println!("destroying struct {:p}", self);
    }
}
fn main() {
    let s1 = TestStruct::new(1);
    println!("creating struct {:p}", &s1);
    let s2 = TestStruct::new(2);
    println!("creating struct {:p}", &s2);
}
Output:
creating struct 0xb8682ff59c <- s1
creating struct 0xb8682ff5f4 <- s2
destroying struct 0xb8682ff5f4 <- s2
destroying struct 0xb8682ff59c <- s1
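The difference between self and &self described above can be checked in isolation. In this small sketch (the `addresses` helper is hypothetical and exists only for illustration), taking the address of a reference variable gives the location of the reference on the stack, not the location of the struct it points to.

```rust
#[allow(dead_code)]
struct TestStruct {
    val: i32,
}

/// Hypothetical helper: returns (address of the struct, address of the
/// local reference variable `s` that points to it).
fn addresses(s: &TestStruct) -> (usize, usize) {
    // `s` is already a reference, so casting it yields the struct's address.
    let of_struct = s as *const TestStruct as usize;
    // `&s` is a reference to the reference, i.e. the address of `s` itself.
    let of_reference = &s as *const &TestStruct as usize;
    (of_struct, of_reference)
}
```

The first value equals the address seen by the caller; the second is what the original drop() was printing.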

Rust - How to parse UTF-8 alphabetical characters in nom?

I am trying to parse sequences of alphabetical characters, including German umlauts (ä ö ü) and other alphabetical characters from the UTF-8 charset.
This is the parser I tried first:
named!(
    parse(&'a str) -> Self,
    map!(
        alpha1,
        |s| Self { chars: s.into() }
    )
);
But it only works for ASCII alphabetical characters (a-zA-Z).
I tried to perform the parsing char by char:
named!(
    parse(&str) -> Self,
    map!(
        take_while1!(nom::AsChar::is_alpha),
        |s| Self { chars: s.into() }
    )
);
But this won't even parse "hello"; instead it results in an Incomplete(Size(1)) error.
How do you parse UTF-8 alphabetical characters in nom?
A snippet from my code:
extern crate nom;
#[derive(PartialEq, Debug, Eq, Clone, Hash, Ord, PartialOrd)]
pub struct Word {
    chars: String,
}
impl From<&str> for Word {
    fn from(s: &str) -> Self {
        Self {
            chars: s.into(),
        }
    }
}
use nom::*;
impl Word {
    named!(
        parse(&str) -> Self,
        map!(
            take_while1!(nom::AsChar::is_alpha),
            |s| Self { chars: s.into() }
        )
    );
}
#[test]
fn parse_word() {
    let words = vec![
        "hello",
        "Hi",
        "aha",
        "Mathematik",
        "mathematical",
        "erfüllen"
    ];
    for word in words {
        assert_eq!(Word::parse(word).unwrap().1, Word::from(word));
    }
}
When I run this test,
cargo test parse_word
I get:
thread panicked at 'called `Result::unwrap()` on an `Err` value: Incomplete(Size(1))', ...
I know that strings are already UTF-8 encoded in Rust (thank heavens, almighty), but it seems that the nom library is not behaving as I would expect. I am using nom 5.1.0.
First, nom 5 uses functions for parsing. I advise using this form because the error messages are much better and the code is much cleaner.
Your requirement is odd; you could just take the full input, turn it into a String, and be done:
impl Word {
    fn parse(input: &str) -> IResult<&str, Self> {
        Ok((
            &input[input.len()..],
            Self {
                chars: input.to_string(),
            },
        ))
    }
}
But I guess your purpose is to parse a word, so here is an example of what you could do:
#[derive(PartialEq, Debug, Eq, Clone, Hash, Ord, PartialOrd)]
pub struct Word {
    chars: String,
}
impl From<&str> for Word {
    fn from(s: &str) -> Self {
        Self { chars: s.into() }
    }
}
use nom::{character::complete::*, combinator::*, multi::*, sequence::*, IResult};
impl Word {
    fn parse(input: &str) -> IResult<&str, Self> {
        let (input, word) =
            delimited(space0, recognize(many1_count(none_of(" \t"))), space0)(input)?;
        Ok((
            input,
            Self {
                chars: word.to_string(),
            },
        ))
    }
}
#[test]
fn parse_word() {
    let words = vec![
        "hello",
        " Hi",
        "aha ",
        " Mathematik ",
        " mathematical",
        "erfüllen ",
    ];
    for word in words {
        assert_eq!(Word::parse(word).unwrap().1, Word::from(word.trim()));
    }
}
You could also write a custom function that uses is_alphabetic() instead of none_of(" \t"), but this requires making a custom error for nom, which is currently, in my opinion, very annoying to do.
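The is_alphabetic() idea can be tried out without nom first. The following is a minimal std-only sketch, not a nom parser: the function name `alpha1_unicode` is hypothetical (it is not part of nom or nom-unicode), and it returns an Option rather than nom's IResult to sidestep the custom error type mentioned above.

```rust
/// Hypothetical helper: splits `input` into (rest, leading alphabetic run),
/// mirroring the (remaining, output) order that nom parsers return.
/// Returns None when the input does not start with an alphabetic character.
fn alpha1_unicode(input: &str) -> Option<(&str, &str)> {
    // Byte index of the first non-alphabetic char, or the whole length.
    // char::is_alphabetic is Unicode-aware, so 'ü' counts as a letter.
    let end = input
        .find(|c: char| !c.is_alphabetic())
        .unwrap_or(input.len());
    if end == 0 {
        None
    } else {
        // `end` always falls on a char boundary, so split_at cannot panic.
        let (word, rest) = input.split_at(end);
        Some((rest, word))
    }
}
```

With this, "erfüllen " splits into the word "erfüllen" and the remaining " ", while input starting with a digit or an empty string yields None.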
On this GitHub issue, a fellow contributor quickly whipped up a library (nom-unicode) to handle this nicely:
use nom_unicode::complete::{alphanumeric1};
impl Word {
    named!(
        parse(&'a str) -> Self,
        map!(
            alphanumeric1,
            |w| Self::new(w)
        )
    );
}

Is there any way to have boxed and by-move closures?

I need a closure that captures by-value and is called at most once, but I cannot have the function using the closure monomorphise on every passed closure, because the closures and functions are mutually recursive and the monomorphisation phase fails. I tried something like:
fn closure_user(closure: Box<FnOnce(usize) -> bool>) -> bool {
    closure(3)
}
fn main() {
    let big_data = vec![1, 2, 3, 4];
    closure_user(Box::new(|x| {
        let _ = big_data.into_iter();
        false
    }));
}
error[E0161]: cannot move a value of type dyn std::ops::FnOnce(usize) -> bool: the size of dyn std::ops::FnOnce(usize) -> bool cannot be statically determined
--> src/main.rs:2:5
|
2 | closure(3)
| ^^^^^^^
The unboxed version is:
fn closure_user<F>(closure: F) -> bool
where
    F: FnOnce(usize) -> bool,
{
    closure(42)
}
fn main() {
    let big_data = vec![1, 2, 3, 4];
    closure_user(|x| {
        let _ = big_data.into_iter();
        false
    });
}
It seems that it is impossible to box and unbox the closure as a FnOnce trait object. Is there any way to have boxed (no type parameter) and by-move (one call only) closures?
As of Rust 1.35, this is now possible using your original syntax:
fn closure_user(closure: Box<dyn FnOnce(usize) -> bool>) -> bool {
    closure(3)
}
fn main() {
    let big_data = vec![1, 2, 3, 4];
    closure_user(Box::new(|x| {
        let _ = big_data.into_iter();
        false
    }));
}
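Because Box<dyn FnOnce(usize) -> bool> is a single concrete type, such closures can also be stored in collections and passed around without any type parameter, which is what avoids the monomorphisation problem from the question. A minimal sketch with hypothetical names (`Task`, `run_all`, `make_tasks`):

```rust
// Hypothetical alias for a boxed, call-once closure.
type Task = Box<dyn FnOnce(usize) -> bool>;

// Calls every task exactly once, consuming it, and counts the `true` results.
fn run_all(tasks: Vec<Task>, arg: usize) -> usize {
    let mut hits = 0;
    for task in tasks {
        // Calling a Box<dyn FnOnce> moves the closure out of the box.
        if task(arg) {
            hits += 1;
        }
    }
    hits
}

// Builds two tasks that each take ownership of (and consume) their data.
fn make_tasks() -> Vec<Task> {
    let big_data: Vec<usize> = vec![1, 2, 3, 4];
    let other: Vec<usize> = vec![10, 20];
    let first: Task = Box::new(move |x: usize| big_data.into_iter().any(|v| v == x));
    let second: Task = Box::new(move |x: usize| other.into_iter().any(|v| v == x));
    vec![first, second]
}
```

Here run_all(make_tasks(), 3) counts one hit, since only the first task's data contains 3.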
In early versions of Rust it was possible, but only through the then-unstable std::thunk::Thunk:
use std::thunk::{Invoke, Thunk};
fn closure_user(closure: Thunk<usize, bool>) -> bool {
    closure.invoke(3)
}
fn main() {
    let big_data = vec![1, 2, 3, 4];
    closure_user(Thunk::with_arg(|x| {
        let _ = big_data.into_iter();
        false
    }));
}
This was due to limitations of the type system at the time: it was not possible to move out of a trait object. For more information, see the blog post Purging Proc.
