Why does this Rust code allocate buffers in the same memory region?

I don't understand the behaviour of this piece of code... I'm writing an RTOS and this issue is halting me. I really don't get why the code acts this way.
Here is some code I tested on the playground that shows the issue.
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=cc6cc0ec8bfe76f65e1baaa67caaf9e6
use core::fmt;
use core::fmt::Display;
struct StackPointer(*const usize);
impl Display for StackPointer {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{}", self.0 as usize)
}
}
struct Stack<const WORDS: usize> {
pub sp: StackPointer,
pub mem: [usize; WORDS],
}
impl<const WORDS: usize> Stack<WORDS> {
pub fn new() -> Self {
let mem = [0; WORDS];
let sp = StackPointer(mem.as_ptr() as *const usize);
Self {
mem,
sp,
}
}
}
struct PCB<const WORDS: usize> {
pub stack: Stack<WORDS>,
}
impl<const WORDS: usize> PCB<WORDS> {
pub fn new() -> Self {
Self {
stack: Stack::new(),
}
}
}
fn main() {
let pcb1 = PCB::<128>::new();
let pcb2 = PCB::<128>::new();
let pcb3 = PCB::<128>::new();
println!("sp1: {}, sp2: {}, sp3: {}", pcb1.stack.sp, pcb2.stack.sp, pcb3.stack.sp);
}

I don't understand the behaviour of this piece of code... I'm writing an RTOS and this issue is halting me. I really don't get why the code acts this way.
Because you're writing broken code.
let mem = [0; WORDS];
this reserves WORDS words on the stack (incidentally why is it usize?)
let sp = StackPointer(mem.as_ptr() as *const usize);
this takes a pointer to a location in the current stackframe, where you've put your array.
Self {
mem,
sp,
}
this then blissfully copies the data out of the current stackframe and into the parent stackframe, while keeping a pointer into the now-popped stackframe.
So on each call to PCB::<128>::new() you create a stackframe, allocate an array in that stackframe, take a pointer to that array (in the stackframe), then pop the stackframe.
Since all of these stackframes sit in the same location (on top of main's stackframe), the array ends up at roughly the same offset in every call, and all your StackPointers point to the same location, which will be filled with garbage as soon as you call another function.
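One way to avoid the dangling pointer, as a minimal sketch: don't store a raw pointer at construction time at all. Keep an index into the array and compute the pointer from wherever the Stack currently lives. The `top` field and the `sp()` method are names made up for this illustration; they are not part of the original code.

```rust
struct Stack<const WORDS: usize> {
    mem: [usize; WORDS],
    // Hypothetical field: index of the current top of stack within `mem`.
    top: usize,
}

impl<const WORDS: usize> Stack<WORDS> {
    fn new() -> Self {
        Self { mem: [0; WORDS], top: 0 }
    }

    // Hypothetical accessor: derive the pointer from the array's current
    // location instead of caching one that goes stale when the value moves.
    fn sp(&self) -> *const usize {
        self.mem[self.top..].as_ptr()
    }
}

fn main() {
    let s1 = Stack::<128>::new();
    let s2 = Stack::<128>::new();
    let s3 = Stack::<128>::new();
    // Each stack is a distinct live value, so the three addresses differ.
    println!("sp1: {:p}, sp2: {:p}, sp3: {:p}", s1.sp(), s2.sp(), s3.sp());
}
```

A pointer obtained this way is only meaningful while the Stack stays where it is, so for an RTOS you would probably want the stacks in a fixed place (a static, for example) before handing the pointer to anything that outlives the borrow.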

Related

Cannot free dynamic memory in async rust task

Our Rust application appeared to have a memory leak and I've distilled down the issue to the code example below. I still can't see where the problem is.
My expectation is that on the (500,000 + 1)'th message the memory of the application would return to low levels. Instead I observe the following:
before sending 500,000 messages the memory usage is 124KB
after sending 500,000 messages the memory usage climbs to 27MB
after sending 500,000 + 1 messages the memory usage drops to 15.5MB
After trying many things, I cannot find where the 15.5MB is hiding. The only way to free the memory is to kill the application. Valgrind did not detect any memory leaks. A workaround, a solution, or a pointer in the right direction would all be much appreciated.
A demo project with the code below can be found here: https://github.com/loriopatrick/mem-help
Notes
If I remove self.items.push(data); memory usage does not increase so I don't think it's an issue with Sender/Receiver
Wrapping items: Vec<String> in an Arc<Mutex<..>> made no observable memory difference
The task where the memory should be managed
struct Processor {
items: Vec<String>,
}
impl Processor {
pub fn new() -> Self {
Processor {
items: Vec::new(),
}
}
pub async fn task(mut self, mut receiver: Receiver<String>) {
while let Some(data) = receiver.next().await {
self.items.push(data);
if self.items.len() > 500000 {
{
std::mem::replace(&mut self.items, Vec::new());
}
println!("Emptied items array");
}
}
println!("Processor task closing in 5 seconds");
tokio::time::delay_for(Duration::from_secs(5)).await;
}
}
Full runnable example
use std::time::Duration;
use tokio::stream::StreamExt;
use tokio::runtime::Runtime;
use tokio::sync::mpsc::{channel, Receiver, Sender};
struct Processor {
items: Vec<String>,
}
impl Processor {
pub fn new() -> Self {
Processor {
items: Vec::new(),
}
}
pub async fn task(mut self, mut receiver: Receiver<String>) {
while let Some(data) = receiver.next().await {
self.items.push(data);
if self.items.len() > 500000 {
{
std::mem::replace(&mut self.items, Vec::new());
}
println!("Emptied items array");
}
}
println!("Processor task closing in 5 seconds");
tokio::time::delay_for(Duration::from_secs(5)).await;
}
}
pub fn main() {
{
let mut runtime: Runtime = tokio::runtime::Builder::new()
.threaded_scheduler()
.core_threads(1)
.enable_all()
.build()
.expect("Failed to build runtime");
let (mut sender, receiver) = channel(1024);
let p = Processor::new();
runtime.spawn(async move {
println!("Before send, waiting 5 seconds");
tokio::time::delay_for(Duration::from_secs(5)).await;
for i in 0..500000 {
sender.send("Hello".to_string()).await;
}
println!("Sent 500,000 items, waiting 5 seconds");
tokio::time::delay_for(Duration::from_secs(5)).await;
sender.send("Hello".to_string()).await;
println!("Send message to clear items");
tokio::time::delay_for(Duration::from_secs(3)).await;
println!("Closing sender in 5 seconds");
tokio::time::delay_for(Duration::from_secs(5)).await;
});
runtime.block_on(async move {
{
p.task(receiver).await;
}
println!("Task is done, waiting 5 seconds");
tokio::time::delay_for(Duration::from_secs(5)).await;
});
}
println!("Runtime closed, waiting 5 seconds");
std::thread::sleep(Duration::from_secs(5));
}
Cargo.toml
[package]
name = "mem-help"
version = "0.1.0"
authors = ["Patrick Lorio <dev#plorio.com>"]
edition = "2018"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
futures = "0.3.1"
tokio = { version = "0.2.6", features = ["full"] }

How can I efficiently extract the first element of a futures::Stream in a blocking manner?

I've got the following method:
pub fn load_names(&self, req: &super::MagicQueryType) -> ::grpcio::Result<::grpcio::ClientSStreamReceiver<String>> {
My goal is to get the very first element of grpcio::ClientSStreamReceiver; I don't care about the other names:
let name: String = load_names(query)?.wait().nth(0)?;
It seems inefficient to call wait() before nth(0) as I believe wait() blocks the stream until it receives all the elements.
How can I write a more efficient solution (i.e., nth(0).wait()) without triggering build errors? Rust's build errors for futures::stream::Stream look extremely confusing to me.
The Rust playground doesn't support grpcio = "0.4.4" so I cannot provide a link.
To extract the first element of a futures::Stream in a blocking manner, you should convert the Stream to an iterator by calling executor::block_on_stream and then call Iterator::next.
use futures::{executor, stream, Stream}; // 0.3.4
use std::iter;
fn example() -> impl Stream<Item = i32> {
stream::iter(iter::repeat(42))
}
fn main() {
let v = executor::block_on_stream(example()).next();
println!("{:?}", v);
}
If you are using Tokio, you can convert the Stream into a Future with StreamExt::into_future and annotate a function with #[tokio::main]:
use futures::{stream, Stream, StreamExt}; // 0.3.4
use std::iter;
use tokio; // 0.2.13
fn example() -> impl Stream<Item = i32> {
stream::iter(iter::repeat(42))
}
#[tokio::main]
async fn just_one() -> Option<i32> {
let (i, _stream) = example().into_future().await;
i
}
fn main() {
println!("{:?}", just_one());
}
See also:
How do I synchronously return a value calculated in an asynchronous Future in stable Rust?
How to select between a future and stream in Rust?

How can I check if std::io::Cursor has unconsumed data?

I am writing a low-level network app that deals with TCP sockets, where I often need to process binary data streams. When some data is available, I read it into a u8 array, wrap it in a std::io::Cursor<&[u8]>, and pass it to handlers. In a handler, I often need to know whether there is more data in the Cursor or not.
Imagine that the handle function receives data and then processes it in chunks using the handle_chunk function. For simplicity, assume that chunk size is fixed at 10 bytes; if the data size is not divisible by 10, it's an error. This simple logic can be implemented in the following way:
fn handle(mut data: Cursor<&[u8]>) {
while !data.empty() {
if let Err(err) = handle_chunk(&mut data) {
eprintln!("Error while handling data: {}", err);
}
}
}
fn handle_chunk(data: &mut Cursor<&[u8]>) -> Result<(), String> {
// Returns Err("unexpected EOF".to_string()) if chunk is incomplete
// ...
}
However, Cursor does not have an empty() method or any other method capable of telling if there is more data to process. The working solution that I could come up with is:
fn handle(data: Cursor<&[u8]>) {
let data = data.into_inner();
let len = data.len();
let mut data = Cursor::new(data);
while (data.position() as usize) < len - 1 {
if let Err(err) = handle_chunk(&mut data) {
eprintln!("Error while handling data: {}", err);
}
}
}
This looks hacky and inelegant though. Is there a better solution? Maybe there is a different tool in the Rust standard library that fits here better than Cursor?
Your code can be simplified by using Cursor::get_ref to avoid breaking up the input and putting it back together:
fn handle(mut data: Cursor<&[u8]>) {
let len = data.get_ref().len();
while (data.position() as usize) < len - 1 {
if let Err(err) = handle_chunk(&mut data) {
eprintln!("Error while handling data: {}", err);
}
}
}
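If you do keep the Cursor, the "is there more data?" check can be pulled into a small helper. This is only a sketch; `has_remaining` is a hypothetical name, not something Cursor provides. It compares the position against the full length rather than `len - 1`, which avoids the usize underflow on empty input.

```rust
use std::io::Cursor;

// Hypothetical helper: true while the cursor has not yet reached the end
// of the underlying slice.
fn has_remaining(data: &Cursor<&[u8]>) -> bool {
    (data.position() as usize) < data.get_ref().len()
}

fn main() {
    let bytes = [0u8; 20];
    let mut data = Cursor::new(&bytes[..]);
    println!("at start: {}", has_remaining(&data));
    // Move the position to the end, as reading all 20 bytes would.
    data.set_position(20);
    println!("at end: {}", has_remaining(&data));
}
```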
Now, you haven't shown any code that requires a Cursor. Many times, people think it's needed to convert a &[u8] to something that implements Read, but it's not. Read is implemented for &'a [u8]:
use std::io::Read;
fn handle(mut data: &[u8]) {
while !data.is_empty() {
if let Err(err) = handle_chunk(&mut data) {
eprintln!("Error while handling data: {}", err);
}
}
}
fn handle_chunk<R: Read>(mut data: R) -> Result<(), String> {
let mut b = [0; 10];
data.read_exact(&mut b).unwrap();
println!("Chunk: {:?}", b);
Ok(())
}
fn main() {
let d: Vec<u8> = (0..20).collect();
handle(&d)
}
By having mut data: &[u8] and using &mut data, the code will update the slice variable in place to advance it forward. We can't easily go backward though.
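To see the in-place advancing in isolation, here is a small sketch. `take_two` is a hypothetical helper written just for this illustration.

```rust
use std::io::Read;

// Hypothetical helper: reads two bytes through the `Read` impl for `&[u8]`,
// which moves the caller's slice forward past the bytes it consumed.
fn take_two(data: &mut &[u8]) -> std::io::Result<[u8; 2]> {
    let mut buf = [0u8; 2];
    data.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() {
    let bytes = [1u8, 2, 3, 4, 5];
    let mut data: &[u8] = &bytes;
    let first = take_two(&mut data).unwrap();
    // prints "read [1, 2], remaining [3, 4, 5]"
    println!("read {:?}, remaining {:?}", first, data);
}
```

The original array is untouched; only the slice variable now starts two bytes later.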
an empty() method
Rust style indicates that an empty method would be a verb — this would remove data (if it were possible). The method you want should be called is_empty, as seen on slices.

Handling streaming iterator as normal iterator by using PhantomData and unsafe

I know the code below is hacky, but could it be called safe and idiomatic Rust? Is there better way for this?
// needs to do 'rustup default nightly' to run under valgrind
// #![feature(alloc_system, global_allocator, allocator_api)]
// extern crate alloc_system;
// use alloc_system::System;
// #[global_allocator]
// static A: System = System;
struct Foo<'a> {
v: Vec<u8>,
pos: usize,
phantom: std::marker::PhantomData<&'a u8>,
}
impl<'a> Iterator for Foo<'a> {
type Item = &'a mut u8;
fn next(&mut self) -> Option<&'a mut u8> {
let r = self.v.get_mut(self.pos);
if r.is_some() {
self.pos += 1;
unsafe { Some(&mut *(r.unwrap() as *mut u8)) }
} else {
None
}
}
}
impl<'a> Foo<'a> {
fn reset(&mut self) {
self.pos = 0;
}
}
fn main() {
let mut x = Foo {
v: (1..10).collect(),
pos: 0,
phantom: std::marker::PhantomData,
};
let vp = x.v.as_ptr();
{
for i in &mut x {
println!("{}", i);
}
}
{
x.reset();
}
{
for i in &mut x {
*i *= *i;
}
}
{
x.reset();
}
{
for i in &mut x {
println!("{}", i);
}
}
assert!(vp == x.v.as_ptr());
}
I left some notes in the comments at the top of the code. Valgrind reported no leaks, and the result is as expected under Rust 1.26.0-nightly and 1.25.0.
Related:
How do I write an iterator that returns references to itself?
Iterator returning items by reference, lifetime issue
This code is not safe. The user of the type may choose any lifetime, including 'static:
fn constructor() -> Foo<'static> {
Foo {
v: vec![42; 10],
pos: 0,
phantom: std::marker::PhantomData,
}
}
fn example() -> &'static u8 {
let mut f = constructor();
f.next().unwrap()
}
fn main() {
println!("example: {}", example());
}
Here, example returns a reference to a variable that is no longer in scope, accessing invalid memory and subverting the restrictions you must uphold.
There's an example of how you could write this code with no unsafe whatsoever in another Q&A.
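I can't reproduce that other answer here, but as one possible sketch of a version without unsafe: hand out the Vec's own mutable iterator. This `Foo` is a simplified stand-in for the type in the question, with the `pos` and `phantom` fields dropped, so treat it as an illustration rather than the referenced solution.

```rust
struct Foo {
    v: Vec<u8>,
}

impl Foo {
    // Each call starts a fresh pass over the data, so no `pos` field or
    // `reset` method is needed. The borrow checker ties the yielded
    // references to the borrow of `self`.
    fn iter_mut(&mut self) -> std::slice::IterMut<'_, u8> {
        self.v.iter_mut()
    }
}

fn main() {
    let mut x = Foo { v: (1..10).collect() };
    // Square every element in place.
    for i in x.iter_mut() {
        *i *= *i;
    }
    for i in x.iter_mut() {
        println!("{}", i);
    }
}
```

With this shape the references cannot outlive `x`, so something like the `example` function above is rejected at compile time.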

"error: closure may outlive the current function" but it will not outlive it

When I try to compile the following code:
fn main() {
(...)
let mut should_end = false;
let mut input = Input::new(ctx);
input.add_handler(Box::new(|evt| {
match evt {
&Event::Quit{..} => {
should_end = true;
}
_ => {}
}
}));
while !should_end {
input.handle();
}
}
pub struct Input {
handlers: Vec<Box<FnMut(i32)>>,
}
impl Input {
pub fn new() -> Self {
Input {handlers: Vec::new()}
}
pub fn handle(&mut self) {
for a in vec![21,0,3,12,1] {
for handler in &mut self.handlers {
handler(a);
}
}
}
pub fn add_handler(&mut self, handler: Box<FnMut(i32)>) {
self.handlers.push(handler);
}
}
I get this error:
error: closure may outlive the current function, but it borrows `should_end`, which is owned by the current function
I can't simply add move to the closure, because I need to use should_end later in the main loop. I mean, I can, but since bool is Copy, it will only affect the should_end inside the closure, and thus the program loops forever.
As far as I understand, since input is created in the main function, and the closure is stored in input, it couldn't possibly outlive the current function. Is there a way to express to Rust that the closure won't outlive main? Or is there a possibility that I can't see that the closure will outlive main? In the latter case, is there a way to force it to live only as long as main?
Do I need to refactor the way I'm handling input, or is there some way I can make this work? If I need to refactor, where can I look to see a good example of this in Rust?
Here's a playpen of a simplified version. It is possible I made a mistake in it that could crash your browser. It happened to me once, so beware.
In case it is needed, the rest of my code is available. All the relevant info should be in either main.rs or input.rs.
The problem is not your closure, but the add_handler method. Fully expanded it would look like this:
fn add_handler<'a>(&'a mut self, handler: Box<dyn FnMut(i32) + 'static>)
As you can see, there's an implicit 'static bound on the trait object. Obviously we don't want that, so we introduce a second lifetime 'b:
fn add_handler<'a, 'b: 'a>(&'a mut self, handler: Box<dyn FnMut(i32) + 'b>)
Since you are adding the handler object to the Input::handlers field, that field cannot outlive the scope of the handler object. Thus we also need to limit its lifetime:
pub struct Input<'a> {
handlers: Vec<Box<dyn FnMut(i32) + 'a>>,
}
This again requires the impl to have a lifetime, which we can use in the add_handler method.
impl<'a> Input<'a> {
...
pub fn add_handler(&mut self, handler: Box<dyn FnMut(i32) + 'a>) {
self.handlers.push(handler);
}
}
Now all that's left is using a Cell to control access to your should_end flag.
Here is an example of the fixed code:
use std::cell::Cell;
fn main() {
let should_end = Cell::new(false);
let mut input = Input::new();
input.add_handler(Box::new(|a| {
match a {
1 => {
should_end.set(true);
}
_ => {
println!("{} {}", a, should_end.get())
}
}
}));
let mut fail_safe = 0;
while !should_end.get() {
if fail_safe > 20 {break;}
input.handle();
fail_safe += 1;
}
}
pub struct Input<'a> {
handlers: Vec<Box<dyn FnMut(i32) + 'a>>,
}
impl<'a> Input<'a> {
pub fn new() -> Self {
Input {handlers: Vec::new()}
}
pub fn handle(&mut self) {
for a in vec![21,0,3,12,1,2] {// it will print the 2, but it won't loop again
for handler in &mut self.handlers {
handler(a);
}
}
}
pub fn add_handler(&mut self, handler: Box<dyn FnMut(i32) + 'a>) {
self.handlers.push(handler);
}
}
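To show why the Cell matters, here is a small standalone sketch contrasting it with the `move` approach the question rules out. Both function names are hypothetical and exist only for this comparison.

```rust
use std::cell::Cell;

// With `move`, the closure gets its own copy of the bool (bool is Copy),
// so the flag owned by this function never changes.
fn flag_after_move_closure() -> bool {
    let mut should_end = false;
    let mut handler = move || {
        should_end = true;
    };
    handler();
    should_end
}

// With a Cell, the closure only needs a shared borrow, and the update is
// visible to the code that owns the flag.
fn flag_after_cell_closure() -> bool {
    let should_end = Cell::new(false);
    let handler = || should_end.set(true);
    handler();
    should_end.get()
}

fn main() {
    println!("move: {}", flag_after_move_closure());
    println!("cell: {}", flag_after_cell_closure());
}
```

The compiler will likely warn that the assignment inside the `move` closure is never read, which is exactly the problem described in the question.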
