I notice that Rust's test has a benchmark mode that will measure execution time in ns/iter, but I could not find a way to measure memory usage.
How would I implement such a benchmark? Let us assume for the moment that I only care about heap memory (though stack usage would certainly be interesting as well).
Edit: I found this issue which asks for the exact same thing.
You can use the jemalloc allocator to print the allocation statistics. For example,
Cargo.toml:
[package]
name = "stackoverflow-30869007"
version = "0.1.0"
edition = "2018"
[dependencies]
jemallocator = "0.5"
jemalloc-sys = {version = "0.5", features = ["stats"]}
libc = "0.2"
src/main.rs:
use libc::{c_char, c_void};
use std::ptr::{null, null_mut};

#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;

// Callback that jemalloc invokes with chunks of the statistics report.
extern "C" fn write_cb(_: *mut c_void, message: *const c_char) {
    print!("{}", String::from_utf8_lossy(unsafe {
        std::ffi::CStr::from_ptr(message).to_bytes()
    }));
}

fn mem_print() {
    unsafe { jemalloc_sys::malloc_stats_print(Some(write_cb), null_mut(), null()) }
}

fn main() {
    mem_print();
    let _heap = Vec::<u8>::with_capacity(1024 * 128);
    mem_print();
}
In a single-threaded program that should allow you to get a good measurement of how much memory a structure takes: just print the statistics before the structure is created and again after, and calculate the difference (look at the "allocated" total in particular).
You can also use Valgrind (Massif) to get the heap profile. It works just like with any other C program. Make sure you have debug symbols enabled in the executable (e.g. using a debug build or a custom Cargo configuration). You can use, say, http://massiftool.sourceforge.net/ to analyse the generated heap profile.
(I verified this to work on Debian Jessie; in a different setting your mileage may vary.)
(In order to use Rust with Valgrind you'll probably have to switch back to the system allocator).
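For reference, a minimal sketch of that switch (std::alloc::System is stable; with it, allocations go through malloc/free, which Valgrind can intercept):

use std::alloc::System;

#[global_allocator]
static GLOBAL: System = System;

fn main() {
    // This allocation will show up in the Massif/DHAT profile.
    let v: Vec<u8> = Vec::with_capacity(1024 * 128);
    drop(v);
}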
P.S. There is now also a much-improved DHAT (another Valgrind heap profiler).
jemalloc can be told to dump a memory profile. You can probably do this with the Rust FFI but I haven't investigated this route.
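An untested sketch of that FFI route, via jemalloc's "prof.dump" mallctl; this assumes jemalloc was built with profiling support (e.g. the jemalloc-sys "profiling" feature) and that profiling was activated through MALLOC_CONF:

use std::ffi::CString;
use std::ptr::null_mut;

fn dump_profile() {
    // Asks jemalloc to write a heap profile to a file; with no argument,
    // jemalloc picks the file name based on its prof.* options.
    let name = CString::new("prof.dump").unwrap();
    let ret = unsafe {
        jemalloc_sys::mallctl(name.as_ptr(), null_mut(), null_mut(), null_mut(), 0)
    };
    assert_eq!(ret, 0, "prof.dump failed; is profiling enabled?");
}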
As far as measuring data structure sizes is concerned, this can be done fairly easily through the use of traits and a small compiler plugin. Nicholas Nethercote, in his article Measuring data structure sizes: Firefox (C++) vs. Servo (Rust), demonstrates how it works in Servo; it boils down to adding #[derive(HeapSizeOf)] (or occasionally a manual implementation) to each type you care about. This also allows precise checking of where memory is going; it is, however, comparatively intrusive, as it requires changes to the code in the first place, where something like jemalloc's print_stats() doesn't. Still, for good and precise measurements, it's a sound approach.
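A rough sketch of the idea; the trait shape follows Servo's heapsize crate, but treat the exact names as assumptions rather than a stable API:

pub trait HeapSizeOf {
    /// Bytes of heap memory owned by `self`, excluding `size_of::<Self>()`.
    fn heap_size_of_children(&self) -> usize;
}

struct Document {
    title: String,
    children: Vec<Document>,
}

impl HeapSizeOf for Document {
    fn heap_size_of_children(&self) -> usize {
        // A String owns a heap buffer of `capacity()` bytes.
        self.title.capacity()
            // The Vec owns a buffer for the child structs themselves...
            + self.children.capacity() * std::mem::size_of::<Document>()
            // ...and each child in turn owns its own heap children.
            + self.children.iter().map(|c| c.heap_size_of_children()).sum::<usize>()
    }
}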
Currently, the only way to get allocation information from the standard library is the alloc::heap::stats_print() function (behind #![feature(alloc)]), which calls jemalloc's print_stats().
I'll update this answer with further information once I have learned what the output means.
(Note that I'm not going to accept this answer, so if someone comes up with a better solution...)
Now there is the jemalloc_ctl crate, which provides a convenient, safe, typed API. Add it to your Cargo.toml:
[dependencies]
jemalloc-ctl = "0.3"
jemallocator = "0.3"
Then configure jemalloc as the global allocator and use the methods from the jemalloc_ctl::stats module:
jemalloc_ctl::stats::allocated
jemalloc_ctl::stats::resident
Here is the official example:
use std::thread;
use std::time::Duration;
use jemalloc_ctl::{stats, epoch};

#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;

fn main() {
    loop {
        // Many statistics are cached and only updated when the epoch is advanced.
        epoch::advance().unwrap();

        let allocated = stats::allocated::read().unwrap();
        let resident = stats::resident::read().unwrap();
        println!("{} bytes allocated/{} bytes resident", allocated, resident);
        thread::sleep(Duration::from_secs(10));
    }
}
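For the original question (how much heap a single structure takes), the same API can be used to diff the counters around the construction of the value. A minimal single-threaded sketch, with a 128 KiB vector standing in for your structure:

use jemalloc_ctl::{epoch, stats};

#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;

fn main() {
    epoch::advance().unwrap();
    let before = stats::allocated::read().unwrap();

    let heap = Vec::<u8>::with_capacity(1024 * 128);

    epoch::advance().unwrap();
    let after = stats::allocated::read().unwrap();
    println!("the Vec takes ~{} bytes on the heap", after - before);
    drop(heap);
}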
There's a neat little solution someone put together here: https://github.com/discordance/trallocator/blob/master/src/lib.rs
use std::alloc::{GlobalAlloc, Layout};
use std::sync::atomic::{AtomicU64, Ordering};

/// Wraps any allocator `A` and keeps a running total of live heap bytes.
pub struct Trallocator<A: GlobalAlloc>(pub A, AtomicU64);

unsafe impl<A: GlobalAlloc> GlobalAlloc for Trallocator<A> {
    unsafe fn alloc(&self, l: Layout) -> *mut u8 {
        self.1.fetch_add(l.size() as u64, Ordering::SeqCst);
        self.0.alloc(l)
    }
    unsafe fn dealloc(&self, ptr: *mut u8, l: Layout) {
        self.0.dealloc(ptr, l);
        self.1.fetch_sub(l.size() as u64, Ordering::SeqCst);
    }
}

impl<A: GlobalAlloc> Trallocator<A> {
    pub const fn new(a: A) -> Self {
        Trallocator(a, AtomicU64::new(0))
    }

    pub fn reset(&self) {
        self.1.store(0, Ordering::SeqCst);
    }

    pub fn get(&self) -> u64 {
        self.1.load(Ordering::SeqCst)
    }
}
Usage: (from: https://www.reddit.com/r/rust/comments/8z83wc/comment/e2h4dp9)
// needed for Trallocator struct (as written, anyway)
#![feature(integer_atomics, const_fn_trait_bound)]

use std::alloc::System;

// Assumes the Trallocator type from above is in scope.
#[global_allocator]
static GLOBAL: Trallocator<System> = Trallocator::new(System);

fn main() {
    GLOBAL.reset();
    println!("memory used: {} bytes", GLOBAL.get());
    {
        let mut vec = vec![1, 2, 3, 4];
        for i in 5..20 {
            vec.push(i);
            println!("memory used: {} bytes", GLOBAL.get());
        }
        for v in vec {
            println!("{}", v);
        }
    }
    // For some reason this does not print zero =/
    println!("memory used: {} bytes", GLOBAL.get());
}
I've just started using it, and it seems to work well! Straightforward, real-time, requires no external packages, and doesn't require changing your base memory allocator.
It's also nice that, because it's intercepting the allocate/deallocate calls, you should be able to add custom logic if desired (e.g. if memory usage goes above X, print the stack trace to see what's triggering the allocations), although I haven't tried this yet.
I also haven't yet tested to see how much overhead this approach adds. If someone does a test for this, let me know!
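For what it's worth, here is a hedged sketch of that custom-logic idea: a variant of the Trallocator above that warns once when the tracked total crosses a threshold. The threshold constant and the extra AtomicBool field are made up for illustration:

use std::alloc::{GlobalAlloc, Layout};
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

const WARN_THRESHOLD: u64 = 64 * 1024 * 1024; // 64 MiB, arbitrary

pub struct Trallocator<A: GlobalAlloc>(pub A, AtomicU64, AtomicBool);

unsafe impl<A: GlobalAlloc> GlobalAlloc for Trallocator<A> {
    unsafe fn alloc(&self, l: Layout) -> *mut u8 {
        let used = self.1.fetch_add(l.size() as u64, Ordering::SeqCst) + l.size() as u64;
        // swap() returns the previous flag value, so the warning fires at most
        // once; that also stops the eprintln! below (which may itself allocate)
        // from recursing back into this branch.
        if used > WARN_THRESHOLD && !self.2.swap(true, Ordering::SeqCst) {
            eprintln!("heap usage crossed {} bytes", WARN_THRESHOLD);
        }
        self.0.alloc(l)
    }
    unsafe fn dealloc(&self, ptr: *mut u8, l: Layout) {
        self.0.dealloc(ptr, l);
        self.1.fetch_sub(l.size() as u64, Ordering::SeqCst);
    }
}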
Related
I am working on a WebAssembly application in Rust. The program relies on streaming a file and storing the data in memory. However, I run into a rust_oom error.
This issue only arises when recompiling the std library with the atomics, bulk-memory, and mutable-globals target features.
Reproducible via .cargo/config.toml:
[target.wasm32-unknown-unknown]
rustflags = ["-C", "target-feature=+atomics,+bulk-memory,+mutable-globals"]
[unstable]
build-std = ["panic_abort", "std"]
Compiling without these flags works fine.
The relevant rust code:
use futures::StreamExt;
use js_sys::Uint8Array;
use wasm_bindgen::prelude::*;
use wasm_bindgen::JsCast;
use wasm_streams::ReadableStream;
use web_sys::Response;

#[wasm_bindgen]
pub async fn start(r: JsValue) {
    // JavaScript response into web-sys Response
    let resp: Response = r.dyn_into().unwrap();
    // from the example here: https://github.com/MattiasBuelens/wasm-streams/blob/master/examples/fetch_as_stream.rs
    let raw_body = resp.body().unwrap();
    let body = ReadableStream::from_raw(raw_body.dyn_into().unwrap());
    let mut stream = body.into_stream();

    // store the data in memory
    let mut v = vec![];
    log("start streaming");
    while let Some(Ok(chunk)) = stream.next().await {
        // convert to Vec<u8>
        let mut x = chunk.dyn_ref::<Uint8Array>().unwrap().to_vec();
        v.append(&mut x);
    }
    log(&format!("{}", v.len()));
    log("done streaming");
}
The full error message provided:
at rust_oom (wasm_memtest_bg.wasm:0xfa10)
at __rg_oom (wasm_memtest_bg.wasm:0xfdd4)
at __rust_alloc_error_handler (wasm_memtest_bg.wasm:0xfd38)
at alloc::alloc::handle_alloc_error::rt_error::hf991f317b52eeff2 (wasm_memtest_bg.wasm:0xfdb3)
at core::ops::function::FnOnce::call_once::ha90352dededaa31f (wasm_memtest_bg.wasm:0xfdc9)
at core::intrinsics::const_eval_select::h04f9b6091fe1f42f (wasm_memtest_bg.wasm:0xfdbe)
at alloc::alloc::handle_alloc_error::h82c7beb21e18f5f3 (wasm_memtest_bg.wasm:0xfda8)
at alloc::raw_vec::RawVec<T,A>::reserve::do_reserve_and_handle::h65538cba507bb5c9 (wasm_memtest_bg.wasm:0xc149)
at <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::he4014043296c78b9 (wasm_memtest_bg.wasm:0x1c85)
at wasm_bindgen_futures::task::multithread::Task::run::h77f1075b3537ddf9 (wasm_memtest_bg.wasm:0x5a0e)
Here is the full project if you want to test it out.
https://github.com/KivalM/memory-wasm
EDIT: The code required to reproduce the issue is provided in the first snippet. The full wasm application is provided in the repository so that you do not have to create one from scratch. I will include the Rust code in the question.
The issue arises when loading a simple 500 MB file into memory, which should be possible via this approach. Yet Chrome (and Chromium-based browsers) tends to run out of memory, whereas when the rustflags are not present the program runs just fine.
I will move the links here as they are not part of the question, but can be relevant.
https://rustwasm.github.io/2018/10/24/multithreading-rust-and-wasm.html
https://rustwasm.github.io/wasm-bindgen/examples/raytrace.html
In C, before using the scanf or gets functions from stdio.h to read and store user input, the programmer has to manually allocate memory for the data being read. In Rust, the std::io::Stdin::read_line function can seemingly be used without the programmer having to allocate memory beforehand. All it needs is a mutable String variable to store the data it reads. How does it do this without knowing in advance how much memory will be required?
Well, if you want a detailed explanation, you can dig a bit into the read_line method, which is part of the BufRead trait. Heavily simplified, the function looks like this:
fn read_line(&mut self, target: &mut String) {
    loop {
        // This method fills the internal buffer of the reader (here stdin)
        // and returns a slice reference to whatever part of the buffer was filled.
        // That buffer is what you actually need to allocate in advance in C.
        let available = self.fill_buf();
        match memchr(b'\n', available) {
            Some(i) => {
                // A '\n' was found; we can extend the string and return.
                target.push_str(&available[..=i]);
                return;
            }
            None => {
                // No '\n' found; we just have to extend the string.
                target.push_str(available);
            }
        }
    }
}
So basically, that method extends the string for as long as it does not find a \n character in stdin.
If you want to allocate a bit of memory in advance for the String that you pass to read_line, you can create it using String::with_capacity. This will not prevent the String from reallocating if it turns out not to be large enough, though.
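A runnable illustration of that pre-allocation (the 1024 is an arbitrary guess at the line length):

use std::io::{self, BufRead};

fn main() -> io::Result<()> {
    // Pre-allocate 1 KiB; read_line will still grow the String if the line is longer.
    let mut line = String::with_capacity(1024);
    io::stdin().lock().read_line(&mut line)?;
    println!("read {} bytes into a buffer of capacity {}", line.len(), line.capacity());
    Ok(())
}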
I have a weird issue. Lua 5.3.5, compiled on an STM32F429. Free RAM is about 1 MB (memory allocation uses external SDRAM, not the more limited internal RAM on the STM32). Note that working with things like strings works fine as well. It only seems to be division causing the problem.
This script works:
a=100
b=20
c=a+b
print(c)
This script returns "memory allocation error: block too big:"
a=100
b=20
c=a/b
print(c)
Further research shows that the problem is not with the division at all. It is with tostring(), which is called by print(). For some reason, tostring() tries to allocate too much memory when dealing with the result of direct division.
In lstring.c, in luaS_newlstr(), is the following:

if (l >= (MAX_SIZE - sizeof(TString))/sizeof(char))
    luaM_toobig(L);
When the issue occurs, l == 0xd0600f56
(interestingly, that is a memory address location in the range of the external SDRAM, rather than a valid string size).
If I modify the LUA script to do the following, it works fine:
a=100
b=20
c=math.floor(a/b)
print(c)
I checked, and in both cases c is of type number.
As for the question regarding the memory allocation: we are using the dlmalloc() library, configured like this during Lua startup:
ezCmdLua = lua_newstate(ezlua_poolalloc, NULL);
int error = luaL_loadbuffer(ezCmdLua, bfr, len, "ezCmdLua");
if (!error)
{
    error = lua_pcall(ezCmdLua, 0, 0, 0);
    if (error) {
        ...
    }
}
....

static void *ezlua_poolalloc (void *ud, void *ptr, size_t osize, size_t nsize) {
    (void)ud; (void)osize;  /* not used */
    if (nsize == 0) {
        dlfree(ptr);
        return NULL;
    }
    else
        return dlrealloc(ptr, nsize);
}
I have confirmed that memory allocation is working properly, and I can do things like string manipulation and printing of strings with no problem at all. In fact, when debugging this issue, the luaS_newlstr() function is called several times before the issue occurs, and each time l (the length of the string) is a reasonable value. That is, until I try to print the result of the division. Moving the division around in the script makes no difference (i.e., adding things before it, like other print statements), so I doubt the stack is being trashed.
How can one detect the OS type using Rust? I need to specify a default path specific to the OS. Should one use conditional compilation?
For example:
#[cfg(target_os = "macos")]
static DEFAULT_PATH: &str = "path2";
#[cfg(target_os = "linux")]
static DEFAULT_PATH: &str = "path0";
#[cfg(target_os = "windows")]
static DEFAULT_PATH: &str = "path1";
It's a little late, but there's a built-in way to detect the OS using the standard library, e.g.:

use std::env;

fn main() {
    println!("{}", env::consts::OS); // Prints the current OS.
}
The possible values are described here
Hope this helps somebody in the future.
You can also use the cfg! macro:
if cfg!(windows) {
println!("this is windows");
} else if cfg!(unix) {
println!("this is unix alike");
}
To just get macos, you can do:
if cfg!(target_os = "macos") {
println!("cargo:rustc-link-lib=framework=CoreFoundation");
}
EDIT:
Since writing this answer, it seems the author of the os_type crate has removed the functionality that exposed OSes like Windows. Conditional compilation is probably your best bet here; os_type only seems to detect Linux distributions now, judging from its lib.rs.
ORIGINAL ANSWER:
You could always use the os_type crate. From the front page:
extern crate os_type;

fn foo() {
    match os_type::current_platform() {
        os_type::OSType::OSX => {
            // Do something here
        }
        _ => {}
    }
}
I am trying to use the C/C++ API of Z3 to parse fixed point constraints in the SMT-LIB2 format (specifically, files produced by SeaHorn). However, my application crashes when parsing the string (I am using the Z3_fixedpoint_from_string method). The Z3 version I'm working with is 4.5.1, 64 bit.
The SMT-LIB file I try to parse works fine with the Z3 binary, which I compiled from the sources, but it runs into a segmentation fault when calling Z3_fixedpoint_from_string. I narrowed the problem down to the point where I think the issue is related to adding relations to the fixed point context. A simple example that produces a segfault on my machine is the following:
#include "z3.h"
int main()
{
Z3_context c = Z3_mk_context(Z3_mk_config());
Z3_fixedpoint f = Z3_mk_fixedpoint(c);
Z3_fixedpoint_from_string (c, f, "(declare-rel R ())");
Z3_del_context(c);
}
Running this code under Valgrind reports a lot of invalid reads and writes. So either this is not how the API is supposed to be used, or there is a problem somewhere. Unfortunately, I could not find any examples of how to use the fixed point engine programmatically. However, calling Z3_fixedpoint_from_string(c, f, "(declare-var x Int)"); for instance works just fine.
BTW, where is Z3_del_fixedpoint()?
The fixedpoint object f is reference counted. The caller is responsible for taking a reference count immediately after it is created. It is easier to use C++ smart pointers to control this, similar to how we control it for other objects. The C++ API does not have a wrapper for fixedpoint objects, so you would have to create your own in the style of the other wrappers.
Instead of del_fixedpoint, one uses reference counting (Z3_fixedpoint_inc_ref/Z3_fixedpoint_dec_ref):
class fixedpoint : public object {
    Z3_fixedpoint m_fp;
public:
    fixedpoint(context& c): object(c) { m_fp = Z3_mk_fixedpoint(c); Z3_fixedpoint_inc_ref(c, m_fp); }
    ~fixedpoint() { Z3_fixedpoint_dec_ref(ctx(), m_fp); }
    operator Z3_fixedpoint() const { return m_fp; }
    void from_string(char const* s) {
        Z3_fixedpoint_from_string(ctx(), m_fp, s);
    }
};

int main()
{
    context c;
    fixedpoint f(c);
    f.from_string("....");
}