How to pass the same mutable Peekable to different functions in Rust - parsing

I want to write a parser.
It seems practical to me to have a mutable Iterator that I can pass around to different parser functions.
I've tried to illustrate a simplified approach, which compiles but is not ideal yet.
fn main() {
    let tokens = vec!["fIrSt".to_string(), "SeConD".to_string(), "tHiRd".to_string(), "FoUrTh".to_string()];
    let parsed = parse_input(tokens);
    println!("{}", parsed);
}
fn parse_input(tokens: Vec<String>) -> String {
    let mut tokens_iter = tokens.iter();
    let upps = parse_upper(&mut tokens_iter);
    let lowers = parse_lower(&mut tokens_iter);
    upps + &lowers
}
fn parse_upper(tokens_iter: &mut dyn Iterator<Item = &String>) -> String {
    let mut result = String::new();
    let token_1 = tokens_iter.next().unwrap().to_uppercase();
    let token_2 = tokens_iter.next().unwrap().to_uppercase();
    result.push_str(&token_1);
    result.push_str(&token_2);
    result
}
fn parse_lower(tokens_iter: &mut dyn Iterator<Item = &String>) -> String {
    let mut result = String::new();
    let token_1 = tokens_iter.next().unwrap().to_lowercase();
    let token_2 = tokens_iter.next().unwrap().to_lowercase();
    result.push_str(&token_1);
    result.push_str(&token_2);
    result
}
How the example works:
Let's say I have some input that has already been tokenized. Here it is represented by the tokens vector (Vec<String>).
Inside the outer parse_input function, the Vec gets transformed into an Iterator and then passed into different, specific parser functions. Here: parse_upper and parse_lower. In real life those could be "parse_if_statement" or "parse_while_loop", but which part of the Iterator gets worked on is not relevant for the question.
What is relevant is that every call to next advances the cursor on the Iterator, so that every function consumes the pieces it needs.
This example compiles and gives the output: FIRSTSECONDthirdfourth
I would like to be able to peek() into the Iterator, before I pass it to a function. This is necessary to determine which function should actually be called. But everything I have tried with using a Peekable instead of an Iterator resulted in total lifetime and borrow chaos.
Any suggestions on how to pass a Peekable instead of an Iterator in this case?
Maybe using a Peekable as function parameter is a bad idea in the first place. Or maybe my Iterator approach is already wrong. All suggestions/hints are welcome.
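One way to get peek() without lifetime trouble is to make the parser functions generic over the underlying iterator and have them take a &mut Peekable<I>. The following is only a minimal sketch under that assumption. The dispatch rule in parse_input (tokens starting with 'f' go to parse_upper) is made up purely for illustration and is not part of the original example.

```rust
use std::iter::Peekable;

// Generic over any iterator that yields `&String`, wrapped in `Peekable`.
fn parse_upper<'a, I>(tokens: &mut Peekable<I>) -> String
where
    I: Iterator<Item = &'a String>,
{
    // `by_ref` borrows the iterator, so the caller keeps ownership;
    // `take(2)` consumes at most two tokens.
    tokens.by_ref().take(2).map(|t| t.to_uppercase()).collect()
}

fn parse_lower<'a, I>(tokens: &mut Peekable<I>) -> String
where
    I: Iterator<Item = &'a String>,
{
    tokens.by_ref().take(2).map(|t| t.to_lowercase()).collect()
}

fn parse_input(tokens: &[String]) -> String {
    let mut iter = tokens.iter().peekable();
    let mut out = String::new();
    // The borrow created by `peek` ends after the last use of `next`,
    // so passing `&mut iter` to a parser function does not conflict.
    while let Some(next) = iter.peek() {
        // Illustrative dispatch rule (an assumption, not from the question).
        if next.starts_with('f') {
            out.push_str(&parse_upper(&mut iter));
        } else {
            out.push_str(&parse_lower(&mut iter));
        }
    }
    out
}
```

With the four tokens from the question, parse_input dispatches to parse_upper for the first pair and to parse_lower for the second pair.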

Related

Can't understand the logic of F# mutable variable inside function body

I'm learning F# and get stuck with the concept of mutable keyword.
Please see the below example:
let count =
    let mutable a = 1
    fun () -> a <- a + 1; a
val count: unit -> int
This increases by 1 every time it's called with (). But the next code does not:
let count =
    let mutable a = 1
    a <- a + 1
    a
val count: int
Which is always 2.
The book I'm studying says of the first example, "The initialization of mutable value a is done only once, when the function has called first time."
When I started learning FP with Haskell, the way it handled side effects like this totally burnt my brain, but F# mutable is destroying my brain again, in a different way. What's the difference between the two snippets above? And what is the true meaning and condition of the sentence above about the initialization of the mutable value?
Your second example
let count =
    let mutable a = 1
    a <- a + 1
    a
defines a mutable variable initialised to 1, then assigns a new value (a + 1) to it using the <- operator before returning the updated value on the last line. Since a has type int and is the value of the whole block, count itself has type int: it is a value, not a function.
The first example
let count =
    let mutable a = 1
    fun () -> a <- a + 1; a
also declares an int a initialised to 1. However instead of returning it directly it returns a function which closes over a. Each time this function is called, a is incremented and the updated value returned. It could be equivalently written as:
let count =
    let mutable a = 1
    let update () =
        a <- a + 1
        a
    update
fun () -> ... defines a lambda expression. This version returns a function that takes a single unit argument, which is reflected in the different type, unit -> int.
The first example of count initializes a mutable variable, and returns a closure around this variable. Every time you call that closure, the variable is increased, and its new value returned.
The second example of count is just an initialization block that sets the variable, increases it once, and returns its value. Referring to count again only returns the already computed value again.

Extracting a file extension from a given path in Rust idiomatically

I am trying to extract the extension of a file from a given String path.
The following piece of code works, but I was wondering if there is a cleaner and more idiomatic Rust way to achieve this:
use std::path::Path;
fn main() {
    fn get_extension_from_filename(filename: String) -> String {
        // Change it to a canonical file path.
        let path = Path::new(&filename).canonicalize().expect(
            "Expecting an existing filename",
        );
        let filepath = path.to_str();
        let name = filepath.unwrap().split('/');
        let names: Vec<&str> = name.collect();
        let extension = names.last().expect("File extension can not be read.");
        let extens: Vec<&str> = extension.split(".").collect();
        extens[1..(extens.len())].join(".").to_string()
    }
    assert_eq!(get_extension_from_filename("abc.tar.gz".to_string()), "tar.gz");
    assert_eq!(get_extension_from_filename("abc..gz".to_string()), ".gz");
    assert_eq!(get_extension_from_filename("abc.gz".to_string()), "gz");
}
In idiomatic Rust the return type of a function that can fail should be an Option or a Result. In general, functions should also accept slices instead of Strings and only create a new String where necessary. This reduces excessive copying and heap allocations.
You can use the provided extension() method and then convert the resulting OsStr to a &str:
use std::path::Path;
use std::ffi::OsStr;
fn get_extension_from_filename(filename: &str) -> Option<&str> {
    Path::new(filename)
        .extension()
        .and_then(OsStr::to_str)
}
assert_eq!(get_extension_from_filename("abc.gz"), Some("gz"));
Using and_then is convenient here because it means you don't have to unwrap the Option<&OsStr> returned by extension() and deal with the possibility of it being None before calling to_str. I also could have used a lambda |s| s.to_str() instead of OsStr::to_str - it might be a matter of preference or opinion as to which is more idiomatic.
Notice that both the argument &str and the return value are references to the original string slice created for the assertion. The returned slice cannot outlive the original slice that it is referencing, so you may need to create an owned String from this result if you need it to last longer.
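If the extension does need to outlive the input string, one option is to copy it into an owned String at the end of the chain. This is a small sketch of that idea; the name get_extension_owned is hypothetical and used only for illustration.

```rust
use std::ffi::OsStr;
use std::path::Path;

// Hypothetical owned variant: the result no longer borrows from `filename`.
fn get_extension_owned(filename: &str) -> Option<String> {
    Path::new(filename)
        .extension()
        .and_then(OsStr::to_str)
        .map(str::to_owned)
}
```

The cost is one heap allocation per call, so the borrowing version is still preferable when the caller only inspects the extension briefly.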
What's more idiomatic than using Rust's builtin method for it?
Path::new(&filename).extension()
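One caveat: Path::extension() returns only the part after the last dot, so "abc.tar.gz" yields "gz" rather than the "tar.gz" that the assertions in the question expect. If the multi-part form is what you want, a possible sketch is to take the file name and split it at the first dot. The helper name get_full_extension is an assumption for illustration, and note that, unlike extension(), this treats a leading-dot name such as ".bashrc" as having the extension "bashrc".

```rust
use std::path::Path;

// Hypothetical helper: everything after the first '.' in the file name.
fn get_full_extension(filename: &str) -> Option<&str> {
    // `file_name` drops any directory components; `to_str` fails on non-UTF-8.
    let name = Path::new(filename).file_name()?.to_str()?;
    name.split_once('.').map(|(_, ext)| ext)
}
```

Unlike the code in the question, this does not call canonicalize(), so it works on paths that do not exist on disk.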

Swift String from imported unsigned char 2D array

I am using a 3rd party C library in my iOS application, which I am in the process of converting from Objective-C to Swift. I hit an obstacle when attempting to read one of the structs returned by the C library in Swift.
The struct looks similar to this:
typedef unsigned int LibUint;
typedef unsigned char LibUint8;
typedef struct RequestConfiguration_ {
LibUint8 names[30][128];
LibUint numberNames;
LibUint currentName;
} RequestConfiguration;
Which is imported into Swift as a Tuple containing 30 Tuples of 128 LibUint8 values. After a long time of trial and error using nested withUnsafePointer calls, I eventually began searching for solutions to iterating a Tuple in Swift.
What I ended up using is the following functions:
/**
 * Perform iterator on every child of the type using reflection
 */
func iterateChildren<T>(reflectable: T, @noescape iterator: (String?, Any) -> Void) {
    let mirror = Mirror(reflecting: reflectable)
    for i in mirror.children {
        iterator(i.label, i.value)
    }
}
/**
 * Returns a String containing the characters within the Tuple
 */
func libUint8TupleToString<T>(tuple: T) -> String {
    var result = [CChar]()
    let mirror = Mirror(reflecting: tuple)
    for child in mirror.children {
        let char = CChar(child.value as! LibUint8)
        result.append(char)
        // Null reached, skip the rest.
        if char == 0 {
            break;
        }
    }
    // Always null terminate; faster than checking if last is null.
    result.append(CChar(0))
    return String.fromCString(result) ?? ""
}
/**
 * Returns an array of Strings by decoding characters within the Tuple
 */
func libUint8StringsInTuple<T>(tuple: T, length: Int = 0) -> [String] {
    var idx = 0
    var strings = [String]()
    iterateChildren(tuple) { (label, value) in
        guard length > 0 && idx < length else { return }
        let str = libUint8TupleToString(value)
        strings.append(str)
        idx++
    }
    return strings
}
Usage
func handleConfiguration(config: RequestConfiguration) {
    // Declaration types are added for clarity
    let names: [String] = libUint8StringsInTuple(config.names, length: Int(config.numberNames))
    let currentName: String = names[Int(config.currentName)]
}
My solution uses reflection to iterate the first Tuple, and reflection to iterate the second, because I was getting incorrect strings when using withUnsafePointer for the nested Tuples, which I assume is due to signedness. Surely there must be a way to read the C strings in the array using an UnsafePointer, like withUnsafePointer(&struct.cstring) { String.fromCString(UnsafePointer($0)) }.
To be clear, I'm looking for the fastest way to read these C strings in Swift, even if that involves using Reflection.
Here is a possible solution:
func handleConfiguration(var config: RequestConfiguration) {
    let numStrings = Int(config.numberNames)
    let lenStrings = sizeofValue(config.names.0)
    let names = (0 ..< numStrings).map { idx in
        withUnsafePointer(&config.names) {
            String.fromCString(UnsafePointer<CChar>($0) + idx * lenStrings) ?? ""
        }
    }
    let currentName = names[Int(config.currentName)]
    print(names, currentName)
}
It uses the fact that
LibUint8 names[30][128];
are 30*128 contiguous bytes in memory. withUnsafePointer(&config.names)
calls the closure with $0 as a pointer to the start of that
memory location, and
UnsafePointer<CChar>($0) + idx * lenStrings
is a pointer to the start of the idx-th subarray. The above code requires
that each subarray contains a NUL-terminated UTF-8 string.
The solution suggested by Martin R looks good to me and, as far as I can see from my limited testing, does work. However, as Martin pointed out, it requires that the strings be NUL-terminated UTF-8. Here are two more possible approaches. These follow the principle of handling the complexity of C data structures in C instead of dealing with it in Swift. Which of these approaches you choose depends on what specifically you are doing with RequestConfiguration in your app. If you are not comfortable programming in C, then a pure Swift approach, like the one suggested by Martin, might be a better choice.
For the purposes of this discussion, we will assume that the 3rd party C library has the following function for retrieving RequestConfiguration:
const RequestConfiguration * getConfig();
Approach 1: Make the RequestConfiguration object available to your Swift code, but extract names from it using the following C helper function:
const unsigned char * getNameFromConfig(const RequestConfiguration * rc, unsigned int nameIdx)
{
    return rc->names[nameIdx];
}
Both this function's signature and the RequestConfiguration type must be available to the Swift code via the bridging header. You can then do something like this in Swift:
var cfg : UnsafePointer<RequestConfiguration> = getConfig()
if let s = String.fromCString(UnsafePointer<CChar>(getNameFromConfig(cfg, cfg.memory.currentName)))
{
    print(s)
}
This approach is nice if you need the RequestConfiguration object available to Swift in order to check the number of names in multiple places, for example.
Approach 2: You just need to be able to get the name at a given position. In this case the RequestConfiguration type does not even need to be visible to Swift. You can write a helper C function like this:
const unsigned char * getNameFromConfig1(unsigned int idx)
{
    const RequestConfiguration * p = getConfig();
    return p->names[idx];
}
and use it in Swift as follows:
if let s = String.fromCString(UnsafePointer<CChar>(getNameFromConfig1(2)))
{
    print(s)
}
This will print the name at position 2 (counting from 0). Of course, with this approach you might also want to have C helpers that return the count of names as well as the current name index.
Again, with these 2 approaches it is assumed the strings are NUL-terminated UTF-8. There are other approaches possible, these are just examples.
Also please note that the above assumes that you access RequestConfiguration as read-only. If you also want to modify it and make the changes visible to the 3rd party library C code, then it's a different ballgame.

How to build WriteBuf from array

I am serializing two values into an array and I am trying to go through a BufWriter, but I am getting the following errors:
error: the trait `std::io::Write` is not implemented for the type `[_; 12]`
error: type `std::io::buffered::BufWriter<&mut [_; 12]>` does not implement any method in scope named `write_be_u32`
error: type `std::io::buffered::BufWriter<&mut [_; 12]>` does not implement any method in scope named `write_be_f64`
Here is the minimum code to generate this error:
use std::io::{ BufWriter, Write };
fn main(){
    let packed_data = [0; 12];
    let timestamp : u32 = 100;
    let value : f64 = 9.9;
    let writer = BufWriter::new(&mut packed_data);
    writer.write_be_u32(timestamp);
    writer.write_be_f64(value);
    println!("Packed data looks like {:?}", packed_data);
}
Am I not borrowing the slice correctly? Am I not using the proper module to define the Write trait for my buffer?
Here is a playpen for this code: http://is.gd/ol8qND
I see a few potential problems with your code:
packed_data isn't mutable.
You use packed_data at the end of main while writer holds a mutable reference to it.
I don't think that either of those things is causing the error. I did, however, find something that works:
use std::io::{ BufWriter, Write };
fn main() {
    let mut packed_data = [0; 12];
    {
        let packed_data_ref: &mut [u8] = &mut packed_data;
        let mut writer = BufWriter::new(packed_data_ref);
        writer.write(&[1, 2, 3, 4]).unwrap();
    } // `writer` is dropped here, which flushes it and releases the mutable borrow
    println!("Packed data looks like {:?}", packed_data);
}
So I guess the issue is that you need a &mut [u8] rather than a &mut [u8; 12]. I have no idea why. I hope this at least helps though.
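A likely explanation is that std::io::Write is implemented for mutable byte slices (&mut [u8]) but not for fixed-size arrays, so the array has to be coerced to a slice first. The write_be_u32 and write_be_f64 methods from the question are not part of today's std::io::Write either. A minimal sketch of the same packing with current std, assuming big-endian layout is what is wanted, uses to_be_bytes together with write_all. The function name pack is hypothetical.

```rust
use std::io::Write;

// Pack a u32 timestamp and an f64 value, big-endian, into 12 bytes.
fn pack(timestamp: u32, value: f64) -> std::io::Result<[u8; 12]> {
    let mut packed_data = [0u8; 12];
    // Coerce the array to a slice: `Write` is implemented for `&mut [u8]`.
    // Each write copies bytes in and advances the slice past them.
    let mut writer: &mut [u8] = &mut packed_data;
    writer.write_all(&timestamp.to_be_bytes())?;
    writer.write_all(&value.to_be_bytes())?;
    Ok(packed_data)
}
```

Writing straight into the slice also avoids the extra buffering layer of BufWriter, which adds little when the destination is already in memory.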

F# lazy eval from stream reader?

I'm running into a bug in my code that makes me think that I don't really understand some of the details about F# and lazy evaluation. I know that F# evaluates eagerly and therefore am somewhat perplexed by the following function:
// Open a file, then read from it. Close the file. return the data.
let getStringFromFile =
    File.OpenRead("c:\\eo\\raw.txt")
    |> fun s -> let r = new StreamReader(s)
                let data = r.ReadToEnd
                r.Close()
                s.Close()
                data
When I call this in FSI:
> let d = getStringFromFile();;
System.ObjectDisposedException: Cannot read from a closed TextReader.
at System.IO.__Error.ReaderClosed()
at System.IO.StreamReader.ReadToEnd()
at <StartupCode$FSI_0134>.$FSI_0134.main#()
Stopped due to error
This makes me think that getStringFromFile is being evaluated lazily--so I'm totally confused. I'm not getting something about how F# evaluates functions.
For a quick explanation of what's happening, let's start here:
let getStringFromFile =
    File.OpenRead("c:\\eo\\raw.txt")
    |> fun s -> let r = new StreamReader(s)
                let data = r.ReadToEnd
                r.Close()
                s.Close()
                data
You can re-write the first two lines of your function as:
let s = File.OpenRead(@"c:\eo\raw.txt")
Next, you've omitted the parentheses on this method:
let data = r.ReadToEnd
r.Close()
s.Close()
data
As a result, data has the type unit -> string. When you return this value from your function, the entire result is unit -> string. But look what happens in between assigning your variable and returning it: you closed your streams.
End result, when a user calls the function, the streams are already closed, resulting in the error you're seeing above.
And don't forget to dispose your objects by declaring use whatever = ... instead of let whatever = ....
With that in mind, here's a fix:
let getStringFromFile() =
    use s = File.OpenRead(@"c:\eo\raw.txt")
    use r = new StreamReader(s)
    r.ReadToEnd()
You don't read from your file. You bind method ReadToEnd of your instance of StreamReader to the value data and then call it when you call getStringFromFile(). The problem is that the stream is closed at this moment.
I think you have missed the parentheses and here's the correct version:
// Open a file, then read from it. Close the file. return the data.
let getStringFromFile =
    File.OpenRead("c:\\eo\\raw.txt")
    |> fun s -> let r = new StreamReader(s)
                let data = r.ReadToEnd()
                r.Close()
                s.Close()
                data

Resources