How do nested functions get compiled? - memory

When I have 2 functions in D like this:
void func() {
void innerFunc() {
import std.stdio;
writeln(x);
}
int x = 5;
innerFunc();
}
When I call func this will print 5. How does it work? Where in memory does the 5 get stored? How does innerFunc know it has to print 5?

I attempt to answer this in broad terms. This type of issue arises in a number of languages that permit nested function definitions (including Ada and Pascal).
Normally, a variable like "x" is allocated on the processor stack. That's the normal process in any language that permits recursion.
When a nested function is called, a descriptor for the enclosing function's stack frame gets passed as hidden argument.
funct() then knows that x is located at some offset specified by the base pointer register.
innerFunct () knows the offset of x but has to derive the base from the hidden argument. It can't use its own base pointer value because it will be different from funct(). And, if innerFunct () called itself, the base pointer value would be different in each invocation.

Related

How to write a recursive anonymous function in Dart

Lets say I wanted to write a recursive anonymous function to calculate factorial values.
print(((int a) => a == 1? 1 : a * this(a - 1))(4));
I would expect this to print 24, which is 4! (this function is obviously prone to issues with negative numbers, but that's beside the point)
The problem is that this doesn't refer to the anonymous function in order to make a recursive call.
Is this something that's possible in dart? I've seen it in python before, where a function is assigned to a variable with the walrus operator ( := ) and is also recursive.
Here is an example that creates a list of the average value on each level of a binary tree:
return (get_levels := lambda l: ([mean(node.val for node in l)] + get_levels([child for node in l for child in [node.left, node.right] if child])) if l else [])([root])
As you can see, the lambda is called get_levels. It calculates the average of the current level, then makes a recursive call on the next level of the binary tree and appends it to the list of previous level averages.
The closest that I could come up with is this:
var getLevels;
List<double> averageOfLevels(TreeNode? root) {
return root == null ? [] : (getLevels = (List<TreeNode> level) => level.isNotEmpty ? <double>[level.map((node) => node.val).fold(0, (int l, int r) => l+r) / level.length] + getLevels([for(var node in level) ...[node.left, node.right]].whereType<TreeNode>().toList()) : <double>[])([root]);
}
But, as you can see, this required an additional line where the variable is defined ahead of time.
Is it possible to achieve something more similar to the python example using callable classes?
There's a classic Lisp/Scheme problem of how to create a recursive lambda. The same technique of creating one anonymous function that takes itself as an argument and then using another anonymous function to pass the first anonymous function to itself can be applied to Dart (albeit by sacrificing some type-safety; I can't think of a way to strongly type a Function that takes its own type as an argument). For example, a recursive factorial implementation:
void main() {
var factorial = (Function f, int x) {
return f(f, x);
}((Function self, int x) {
return (x <= 1) ? 1 : x * self(self, x - 1);
}, 4);
print('4! = $factorial'); // Prints: 4! = 24
}
All that said, this seems like a pretty contrived, academic problem. In practice, just create a named function. It can be a local function if you want to avoid polluting a global namespace. It would be far more readable and maintainable.
Is it possible to achieve something more similar to the python example using callable classes?
I'm not sure where you're going with that since Dart neither allows defining anonymous classes nor local classes, so even if you made a callable class, it would violate your request for being anonymous.

zig structs, pointers, field access

I was trying to implement vector algebra with generic algorithms and ended up playing with iterators. I have found two examples of not obvious and unexpected behaviour:
if I have pointer p to a struct (instance) with field fi, I can access the field as simply as p.fi (rather than p.*.fi)
if I have a "member" function fun(this: *Self) (where Self = #This()) and an instance s of the struct, I can call the function as simply as s.fun() (rather than (&s).fun())
My questions are:
is it documented (or in any way mentioned) somewhere? I've looked through both language reference and guide from ziglearn.org and didn't find anything
what is it that we observe in these examples? syntactic sugar for two particular cases or are there more general rules from which such behavior can be deduced?
are there more examples of weird pointers' behaviour?
For 1 and 2, you are correct. In Zig the dot works for both struct values and struct pointers transparently. Similarly, namespaced functions also do the right thing when invoked.
The only other similar behavior that I can think of is [] syntax used on arrays. You can use both directly on an array value and an array pointer interchangeably. This is somewhat equivalent to how the dot operates on structs.
const std = #import("std");
pub fn main() !void {
const arr = [_]u8{1,2,3};
const foo = &arr;
std.debug.print("{}", .{arr[2]});
std.debug.print("{}", .{foo[2]});
}
AFAIK these are the only three instances of this behavior. In all other cases if something asks for a pointer you have to explicitly provide it. Even when you pass an array to a function that accepts a slice, you will have to take the array's pointer explicitly.
The authoritative source of information is the language reference but checking it quickly, it doesn't seem to have a dedicated paragraph. Maybe there's some example that I missed though.
https://ziglang.org/documentation/0.8.0/
I first learned this syntax by going through the ziglings course, which is linked to on ziglang.org.
in exercise 43 (https://github.com/ratfactor/ziglings/blob/main/exercises/043_pointers5.zig)
// Note that you don't need to dereference the "pv" pointer to access
// the struct's fields:
//
// YES: pv.x
// NO: pv.*.x
//
// We can write functions that take pointer arguments:
//
// fn foo(v: *Vertex) void {
// v.x += 2;
// v.y += 3;
// v.z += 7;
// }
//
// And pass references to them:
//
// foo(&v1);
The ziglings course goes quite in-depth on a few language topics, so it's definitely work checking out if you're interested.
With regards to other syntax: as the previous answer mentioned, you don't need to dereference array pointers. I'm not sure about anything else (I thought function pointers worked the same, but I just ran some tests and they do not.)

Memory usage of pass by value vs. pass by reference

For the past few days am trying to learn if pass by value and pass by reference impact the memory differently. Googling this query, people kept repeating themselves about a copy being created in terms of pass by value and how the original value is affected in terms pass by reference. But I was wondering if someone could zero in on the memory part.
This question actually depends heavily on the particular language as some allow you to be explicit and define when you want to pass a variable by value and when by reference and some do it always the same way for different types of variables.
A quite popular type of behavior is to use passing by value (by default) for simple times: like int, string, long, float, double, bool etc.
Let us show the memory impact on a theoretical language:
int $myVariable = 5;
at this moment you have created a one variable in memory which takes the size required to store an integer (let us say 32 bits).
Now you want to pass it to a function:
function someFunction(int parameter)
{
printOnScreen(parameter);
}
so your code would look like:
function someFunction(int $parameter)
{
printOnScreen($parameter);
}
int $myVariable = 5; //Position A
someFunction($myVariable); //Position B
...rest of the code //Position C
Since simple types are passed by value the value is copied in memory to another storage place - therefore:
during Position A you have memory occupied by ONE int (with value 5);
during Position B you have memory occupied by TWO ints (with values of 5) as your $myVariable was copied in memory
during Position C you have again memory occupied by ONE int (with value of 5) as the second one was already destroyed as it was needed only for the time of execution of the function
This has some other implications: modifications on a variable passed by value DO NOT affect the original variable - for example:
function someFunction(int $parameter)
{
$parameter = $parameter + 1;
printOnScreen($parameter);
}
int $myVariable = 5; //Position A
someFunction($myVariable); //Position B
printOnScreen($myVariable); //Position C
During position A you set value of 5 under variable $myVariable.
During position B you pass it BY VALUE to a function which adds 1 to your passed value. YET since it was a simple type, passed by value, it actually operates on a LOCAL variable, a COPY of your variable. Therefore position C will again write just 5 (your original variable as it was not modified).
Some languages allow you to be explicit and inform that you want to pass a reference and not the value itself using a special operator -for example &. So let us again follow the same example but with explicit info that we want a reference (in function's arguments
- note the &):
function someFunction(int &$parameter)
{
$parameter = $parameter + 1;
printOnScreen($parameter);
}
int $myVariable = 5; //Position A
someFunction($myVariable); //Position B
printOnScreen($myVariable); //Position C
This time operation and memory implications will be different.
During Position A an int is created (every variable is always consisted of two elements: place in memory and a pointer, an identifier which place is it. For ease of the process let us say that pointer is always one byte). So whenever you create a variable you actually create two things:
reserved place in memory for the VALUE (in this case 32 bits as it was an int)
pointer (8 bits [1 byte])
Now during position B, the function expects A POINTER to a memory place. Which means that it will locally, for itself create only a copy of the pointer (1 byte) and not copy the actual reserved place as the new pointer WILLL POINT to the same place as the original one. This means that during operation of the function you have:
TWO POINTERS to an int in memory
ONE place reserved for VALUE of the int
Both of those pointer POINT to the same VALUE
Which means that any modification of the value will affect both.
So looking at the same example position C will not print out also 6 as inside the function we have modified the value under the SAME POINTER as $myVariable.
For COMPLEX TYPES (objects) the default action in most programming environments is to pass the reference (pointer).
So for example - if you have a class:
class Person {
public string $name;
}
and create an instance of it and set a value:
$john = new Person();
$john->name = "John Malkovic";
and later pass it to a function:
function printName(Person $instanceOfPerson)
{
printOnScreen($instanceOfPerson);
}
in terms of memory it will again create only a new POINTER in memory (1 byte) which points to the same value. So having a code like this:
function printName(Person $instanceOfPerson)
{
printOnScreen($instanceOfPerson);
}
$john = new Person(); // position A
printName($john); // position B
...rest of the code // position C
during position A you have: 1 Person (which means 1 pointer [1 byte] to a place in memory which has size to store an object of class person)
during position B you have: 2 pointers [2 bytes] but STILL one place in memory to store an object of class person's value [instance]
during position C you have again situation from position A
I hope that this clarifies the topic for you - generally there is more to cover and what I have mentioned above is just a general explanation.
Pass-by-value and pass-by-reference are language semantics concepts; they don't imply anything about the implementation. Usually, languages that have pass-by-reference implement it by passing a pointer by value, and then when you read or write to the variable inside the function, the compiler translates it into reading or writing from a dereference of the pointer. So you can imagine, for example, if you have a function that takes a parameter by reference in C++:
struct Foo { int x; }
void bar(Foo &f) {
f.x = 42;
}
Foo a;
bar(a);
it is really syntactic sugar for something like:
struct Foo { int x; }
void bar(Foo *f_ptr) {
(*f_ptr).x = 42;
}
Foo a;
bar(&a);
And so passing by reference has the same cost as passing a pointer by value, which does involve a "copy", but it's the copy of a pointer, which is a few bytes, regardless of the size of the thing pointed to.
When you talk about pass-by-value doing a "copy", that doesn't really tell you much unless you know what exactly the variable or value passed represents in the language. For example, Java only has pass-by-value. But every type in Java is either a primitive type or a reference type, and the values of reference types are "reference", i.e. pointers to objects. So you can never have a value in Java (what a variable holds or what an expression evaluates to) which "is" an "object"; objects in Java can only be manipulated through these "references" (pointers to objects). So when you ask the cost of passing a object in Java, it's actually wrong because you cannot "pass" an object in Java; you can only pass references (pointers to objects), and the copy the happens for pass-by-value, is the copy of the pointer, which is a few bytes.
So the only case where you would actually copy a big structure when passing, is if you have a language where objects or structs are values directly (not behind a reference), and you do pass-by-reference of that object/struct type. So for example, in C++, you can have objects which are values directly, or you can have pointers to them, and you can pass them by value or by reference:
struct Foo { int x; }
void bar1(Foo f1) { } // pass Foo by value; this copies the entire size of Foo
void bar2(Foo *f2) { } // pass pointer by value; this copies the size of a pointer
void bar3(Foo &f3) { } // pass Foo by reference; this copies the size of a pointer
void bar4(Foo *&f4) { } // pass pointer by reference; this copies the size of a pointer
(Of course, each of those have different semantic meanings; for example, the last one allows the code inside the function to modify the pointer variable passed to point to somewhere else. But if you are concerned about the amount copied. Only the first one is different. In Java, effectively only the second one is possible.)

Why does Rust reuse memory with same value

Example code:
fn main() {
let mut y = &5; // 1
println!("{:p}", y);
{
let x = &2; // 2
println!("{:p}", x);
y = x;
}
y = &3; // 3
println!("{:p}", y);
}
If third assignment contains &3 then code output:
0x558e7da926a0
0x558e7da926a4
0x558e7da926a8
If third assignment contains &2 (same value with second assignment) then code output:
0x558e7da926a0
0x558e7da926a4
0x558e7da926a4
If third assignment contains &5 (same value with first assignment) then code output:
0x558e7da926a0
0x558e7da926a4
0x558e7da926a0
Why does rust not free memory but reuse it if the assignment value is the same or allocate a new block of memory otherwise?
Two occurrences of the same literal number are indistinguishable. You cannot expect the address of two literals to be identical, and neither can you expect them to be different.
This allows the compiler (but in fact it is free to do otherwise) to emit one 5 data in the executable code, and have all &5 refer to it. Constants may (see comment) also have a static lifetime, in which case they are not allocated/deallocated during program execution, they always are allocated.
There are lots of tricks an optimizing compiler can use to determine if a variable can be assigned a constant value. Your findings are consistent with this, no need to run duplicate code if it is not needed.

What's the difference between filter(|x|) and filter(|&x|)?

In the Rust Book there is an example of calling filter() on an iterator:
for i in (1..100).filter(|&x| x % 2 == 0) {
println!("{}", i);
}
There is an explanation below but I'm having trouble understanding it:
This will print all of the even numbers between one and a hundred. (Note that because filter doesn't consume the elements that are being iterated over, it is passed a reference to each element, and thus the filter predicate uses the &x pattern to extract the integer itself.)
This, however does not work:
for i in (1..100).filter(|&x| *x % 2 == 0) {
println!("i={}", i);
}
Why is the argument to the closure a reference |&x|, instead of |x|? And what's the difference between using |&x| and |x|?
I understand that using a reference |&x| is more efficient, but I'm puzzled by the fact that I didn't have to dereference the x pointer by using *x.
When used as a pattern match (and closure and function arguments are also pattern matches), the & binds to a reference, making the variable the dereferenced value.
fn main() {
let an_int: u8 = 42;
// Note that the `&` is on the right side of the `:`
let ref_to_int: &u8 = &an_int;
// Note that the `&` is on the left side of the `:`
let &another_int = ref_to_int;
let () = another_int;
}
Has the error:
error: mismatched types:
expected `u8`,
found `()`
If you look at your error message for your case, it indicates that you can't dereference it, because it isn't a reference:
error: type `_` cannot be dereferenced
I didn't have to dereference the x pointer by using *x.
That's because you implicitly dereferenced it in the pattern match.
I understand that using a reference |&x| is more efficient
If this were true, then there would be no reason to use anything but references! Namely, references require extra indirection to get to the real data. There's some measurable cut-off point where passing items by value is more efficient than passing references to them.
If so, why does using |x| not throw an error? From my experience with C, I would expect to receive a pointer here.
And you do, in the form of a reference. x is a reference to (in this example) an i32. However, the % operator is provided by the trait Rem, which is implemented for all pairs of reference / value:
impl Rem<i32> for i32
impl<'a> Rem<i32> for &'a i32
impl<'a> Rem<&'a i32> for i32
impl<'a, 'b> Rem<&'a i32> for &'b i32
This allows you to not need to explicitly dereference it.
Or does Rust implicitly allocate a copy of value of the original x on the stack here?
It emphatically does not do that. In fact, it would be unsafe to do that, unless the iterated items implemented Copy (or potentially Clone, in which case it could also be expensive). This is why references are used as the closure argument.

Resources