For the past few days I have been trying to learn whether pass by value and pass by reference affect memory differently. When I googled this, people kept repeating that a copy is created with pass by value and that the original value is affected with pass by reference. But I was wondering if someone could zero in on the memory part.
This question actually depends heavily on the particular language: some allow you to be explicit and define when you want to pass a variable by value and when by reference, and some always do it the same way for a given type of variable.
A quite popular behavior is to pass simple types by value (by default): int, string, long, float, double, bool etc.
Let us show the memory impact on a theoretical language:
int $myVariable = 5;
At this moment you have created one variable in memory, which takes the size required to store an integer (let us say 32 bits).
Now you want to pass it to a function:
function someFunction(int $parameter)
{
printOnScreen($parameter);
}
so your code would look like:
function someFunction(int $parameter)
{
printOnScreen($parameter);
}
int $myVariable = 5; //Position A
someFunction($myVariable); //Position B
...rest of the code //Position C
Since simple types are passed by value the value is copied in memory to another storage place - therefore:
during Position A memory is occupied by ONE int (with a value of 5);
during Position B memory is occupied by TWO ints (both with a value of 5), as your $myVariable was copied in memory;
during Position C memory is again occupied by ONE int (with a value of 5), as the second one was destroyed: it was needed only while the function was executing.
This has some other implications: modifications on a variable passed by value DO NOT affect the original variable - for example:
function someFunction(int $parameter)
{
$parameter = $parameter + 1;
printOnScreen($parameter);
}
int $myVariable = 5; //Position A
someFunction($myVariable); //Position B
printOnScreen($myVariable); //Position C
During position A you set value of 5 under variable $myVariable.
During position B you pass it BY VALUE to a function which adds 1 to your passed value. YET since it was a simple type, passed by value, it actually operates on a LOCAL variable, a COPY of your variable. Therefore position C will again write just 5 (your original variable as it was not modified).
Some languages allow you to be explicit and state that you want to pass a reference and not the value itself, using a special operator, for example &. So let us follow the same example again, but with explicit info that we want a reference (note the & in the function's arguments):
function someFunction(int &$parameter)
{
$parameter = $parameter + 1;
printOnScreen($parameter);
}
int $myVariable = 5; //Position A
someFunction($myVariable); //Position B
printOnScreen($myVariable); //Position C
This time operation and memory implications will be different.
During Position A an int is created. (Every variable always consists of two elements: a place in memory and a pointer, an identifier of which place it is. For simplicity let us say that a pointer is always one byte; real pointers are typically 4 or 8 bytes.) So whenever you create a variable you actually create two things:
reserved place in memory for the VALUE (in this case 32 bits as it was an int)
pointer (8 bits [1 byte])
Now during position B, the function expects A POINTER to a memory place. This means that it will locally create only a copy of the pointer (1 byte) and will not copy the actual reserved place, as the new pointer WILL POINT to the same place as the original one. So while the function is running you have:
TWO POINTERS to an int in memory
ONE place reserved for VALUE of the int
Both of those pointers POINT to the same VALUE
Which means that any modification of the value will affect both.
So looking at the same example, position C will now also print 6, as inside the function we have modified the value under the SAME POINTER as $myVariable.
For COMPLEX TYPES (objects) the default action in most programming environments is to pass the reference (pointer).
So for example - if you have a class:
class Person {
public string $name;
}
and create an instance of it and set a value:
$john = new Person();
$john->name = "John Malkovic";
and later pass it to a function:
function printName(Person $instanceOfPerson)
{
printOnScreen($instanceOfPerson->name);
}
then in terms of memory it will again create only a new POINTER (1 byte) which points to the same value. So with code like this:
function printName(Person $instanceOfPerson)
{
printOnScreen($instanceOfPerson->name);
}
$john = new Person(); // position A
printName($john); // position B
...rest of the code // position C
during position A you have: 1 Person (which means 1 pointer [1 byte] to a place in memory large enough to store an object of class Person)
during position B you have: 2 pointers [2 bytes] but STILL one place in memory storing the Person instance
during position C you again have the situation from position A
I hope that this clarifies the topic for you - generally there is more to cover and what I have mentioned above is just a general explanation.
Pass-by-value and pass-by-reference are language semantics concepts; they don't imply anything about the implementation. Usually, languages that have pass-by-reference implement it by passing a pointer by value, and then when you read or write to the variable inside the function, the compiler translates it into reading or writing from a dereference of the pointer. So you can imagine, for example, if you have a function that takes a parameter by reference in C++:
struct Foo { int x; };
void bar(Foo &f) {
f.x = 42;
}
Foo a;
bar(a);
it is really syntactic sugar for something like:
struct Foo { int x; };
void bar(Foo *f_ptr) {
(*f_ptr).x = 42;
}
Foo a;
bar(&a);
And so passing by reference has the same cost as passing a pointer by value, which does involve a "copy", but it's the copy of a pointer, which is a few bytes, regardless of the size of the thing pointed to.
When you talk about pass-by-value doing a "copy", that doesn't really tell you much unless you know what exactly the variable or value passed represents in the language. For example, Java only has pass-by-value. But every type in Java is either a primitive type or a reference type, and the values of reference types are "references", i.e. pointers to objects. So you can never have a value in Java (what a variable holds or what an expression evaluates to) which "is" an "object"; objects in Java can only be manipulated through these "references" (pointers to objects). So asking about the cost of passing an object in Java is actually the wrong question, because you cannot "pass" an object in Java; you can only pass references (pointers to objects), and the copy that happens for pass-by-value is the copy of the pointer, which is a few bytes.
So the only case where you would actually copy a big structure when passing is if you have a language where objects or structs are values directly (not behind a reference), and you pass that object/struct type by value. So for example, in C++, you can have objects which are values directly, or you can have pointers to them, and you can pass them by value or by reference:
struct Foo { int x; };
void bar1(Foo f1) { } // pass Foo by value; this copies the entire size of Foo
void bar2(Foo *f2) { } // pass pointer by value; this copies the size of a pointer
void bar3(Foo &f3) { } // pass Foo by reference; this copies the size of a pointer
void bar4(Foo *&f4) { } // pass pointer by reference; this copies the size of a pointer
(Of course, each of those has a different semantic meaning; for example, the last one allows the code inside the function to modify the pointer variable passed so that it points somewhere else. But if you are concerned about the amount copied, only the first one is different. In Java, effectively only the second one is possible.)
In Dart, looking at the code below, does it 'pass by reference' for list and 'pass by value' for integers? If that's the case, what type of data will be passed by reference/value? If that isn't the case, what's the issue that causes such output?
void main() {
var foo = ['a','b'];
var bar = foo;
bar.add('c');
print(foo); // [a, b, c]
print(bar); // [a, b, c]
var a = 3;
int b = a;
b += 2;
print(a); // 3
print(b); // 5
}
The question you're asking can be answered by looking at the difference between a value type and a reference type.
Dart, like almost every other programming language, makes a distinction between the two. The reason for this is that memory is divided into the so-called stack and the heap. The stack is fast but very limited, so it cannot hold that much data. (By the way, if you have too much data stored on the stack you will get a Stack Overflow exception, which is where the name of this site comes from ;) ). The heap, on the other hand, is slower but can hold far more data.
This is why you have value and reference types. The value types are all your primitive data types (in Dart, all the data types that are written in lowercase, like int, bool, double and so on). Their values are small enough to be stored directly on the stack. On the other hand you have all the other data types, which may potentially be much bigger, so they cannot be stored on the stack. This is why all the other, so-called reference types are basically stored on the heap and only an address or a reference is stored on the stack.
So when you are setting the reference type bar to foo you're essentially just copying the storage address from foo to bar. Therefore if you change the data stored under that reference it seems like you're changing both values, because both hold the same reference. In contrast, when you say b = a you're not transferring the reference but the actual value instead, so a is not affected if you make any changes to b.
I really hope I could help answering your question :)
In Dart, all types are reference types. All parameters are passed by value. The "value" of a reference type is its reference. (That's why it's possible to have two variables containing the "same object" - there is only one object, but both variables contain references to that object). You never ever make a copy of an object just by passing the reference around.
Dart does not have "pass by reference" where you pass a variable as an argument (so the called function can change the value bound to the variable, like C#'s ref parameters).
Dart does not have primitive types, at all. However (big caveat), numbers are always (pretending to be) canonicalized, so there is only ever one 1 object in the program. You can't create a different 1 object. In a way it acts similarly to other languages' primitive types, but it isn't one. You can use int as a type argument to List<int>, unlike in Java where you need to do List<Integer>, you can ask about the identity of an int like identical(1, 2), and you can call methods on integers like 1.hashCode.
If you want to clone or copy a list
var foo = ['a', 'b'];
var bar = [...foo];
bar.add('c');
print(bar); // [a, b, c]
print(foo); // [a, b]
var bar_two = []; //or init an empty list
bar_two.addAll([...bar]);
print(bar_two); // [a, b, c]
Reference link
Clone a List, Map or Set in Dart
Example code:
fn main() {
let mut y = &5; // 1
println!("{:p}", y);
{
let x = &2; // 2
println!("{:p}", x);
y = x;
}
y = &3; // 3
println!("{:p}", y);
}
If third assignment contains &3 then code output:
0x558e7da926a0
0x558e7da926a4
0x558e7da926a8
If third assignment contains &2 (same value with second assignment) then code output:
0x558e7da926a0
0x558e7da926a4
0x558e7da926a4
If third assignment contains &5 (same value with first assignment) then code output:
0x558e7da926a0
0x558e7da926a4
0x558e7da926a0
Why does Rust not free the memory, but instead reuse it if the assigned value is the same and allocate a new block of memory otherwise?
Two occurrences of the same literal number are indistinguishable. You cannot expect the address of two literals to be identical, and neither can you expect them to be different.
This allows the compiler (although it is free to do otherwise) to emit a single 5 constant in the executable's data and have every &5 refer to it. Constants may (see comment) also have a static lifetime, in which case they are not allocated/deallocated during program execution; they are always allocated.
There are lots of tricks an optimizing compiler can use to determine if a variable can be assigned a constant value. Your findings are consistent with this, no need to run duplicate code if it is not needed.
The following is based on my guess. Someone please point out the parts that I understand incorrectly.
Suppose I have a class called Class128Bits, an instance of which occupies 128 bits, and my program runs on a 64-bit computer.
First, I call let pointer = UnsafeMutablePointer<Class128Bits>.allocate(capacity: 2)
the memory layout should look like this:
000-063 064 bits chaos
064-127 064 bits chaos
128-255 128 bits chaos
256-383 128 bits chaos
If I call pointer.pointee = aClass128Bits, it crashes because the pointers in the first two slots have not been initialized yet. Accessing what they point to leads to unpredictable results.
But if I call pointer.initialize(to: aClass128Bits, count: 2), the pointers could be initialized like this:
000-063 address to offset 128
064-127 address to offset 256
128-255 a copy of aClass128Bits
256-383 a copy of aClass128Bits
Then any accesses will be safe.
However this cannot explain why UnsafeMutablePointer<Int> does not crash.
Original
The case I am facing:
The pointer to Int works fine, but the one to String crashes.
I know that I need to initialize it like this:
But I can't see the reason why I need to pass "42" twice.
In C, I might do something similar like this:
char *pointer = (char *)malloc(3 * sizeof(char));
memcpy(pointer, "42", 3);
free(pointer);
If allocate equals malloc, free equals deallocate, memcpy equals pointee{ set },
then what do initialize and deinitialize actually do?
And why does my code crash?
let pointer0 = UnsafeMutablePointer<String>.allocate(capacity: 1)
let pointer1 = UnsafeMutablePointer<Int>.allocate(capacity: 1)
Let's check the size of both:
MemoryLayout.size(ofValue: pointer0) // 8
MemoryLayout.size(ofValue: pointer1) // 8
Let's check the value of .pointee:
pointer0.pointee // CRASH!!!
while
pointer1.pointee // some random value
Why? The answer is as simple as it can be. Both pointers are 8 bytes, independently of the "associated" type. Now it is clear that 8 bytes in memory are not enough to store any String, so the underlying character data must be referenced indirectly. But the freshly allocated memory just holds some random bytes ... Treating whatever sits at the address represented by those random bytes as a String will most likely crash :-)
Why didn't it crash in the second case? An Int value is 8 bytes long, and any 8 bytes can be represented as an Int value.
Let's try it in the Playground:
import Foundation
let pointer = UnsafeMutablePointer<CFString>.allocate(capacity: 1)
let us = Unmanaged<CFString>.passRetained("hello" as CFString)
pointer.initialize(to: us.takeRetainedValue())
print(pointer.pointee)
us.release()
// if this playground crashes, try to run it again and again ... -)
print(pointer.pointee)
Look what it prints for me :-)
hello
(
"<__NSCFOutputStream: 0x7fb0bdebd120>"
)
There is no miracle behind it. pointer.pointee tries to represent whatever is in memory at the address stored in our pointer as a value of its associated type. It never crashes for Int because any 8 contiguous bytes anywhere in memory can be represented as an Int.
Swift uses ARC, but creating an Unsafe[Mutable]Pointer doesn't allocate any memory for an instance of T, and destroying it doesn't deallocate any memory for it.
Typed memory must be initialized before use and deinitialized after use. This is done using the initialize and deinitialize methods respectively. Deinitialization is only required for non-trivial types. That said, including deinitialization is a good way to future-proof your code in case you change to something non-trivial.
Why doesn't assignment to .pointee with Int value crash?
initialize stores the initial value at the allocated address.
Assignment to pointee updates the value stored at that address.
Without initializing it will most likely still crash; the probability is just lower when only 8 bytes at some random address are modified.
trying this
import Darwin
var k = Int16.max.toIntMax()
typealias MyTupple = (Int32,Int32,Int8, Int16, Int16)
var arr: [MyTupple] = []
repeat {
let p = UnsafeMutablePointer<MyTupple>.allocate(capacity: 1)
if k == 1 {
print(MemoryLayout.size(ofValue: p), MemoryLayout.alignment(ofValue: p),MemoryLayout.stride(ofValue: p))
}
arr.append(p.pointee)
k -= 1
defer {
p.deallocate(capacity: 1)
}
} while k > 0
let s = arr.reduce([:]) { (r, v) -> [String:Int] in
var r = r
let c = r["\(v.0),\(v.1),\(v.2),\(v.3)"] ?? 0
r["\(v.0),\(v.1),\(v.2),\(v.3)"] = c + 1
return r
}
print(s)
I received
8 8 8
["0,0,-95,4104": 6472, "0,0,0,0": 26295]
Program ended with exit code: 0
It doesn't look very random, does it? That explains why a crash with the typed pointer to Int is very unlikely.
One reason you need initialize(), and maybe the only one for now, is ARC.
It is easier to think in terms of local-scope variables when looking at how ARC works:
func test() {
var refVar: RefType = initValue //<-(1)
//...
refVar = newValue //<-(2)
//...
//<-(3) just before exiting the local scope
}
For a usual assignment as (2), Swift generates some code like this:
swift_retain(_newValue)
swift_release(_refVar)
_refVar = _newValue
(Assume _refVar and _newValue are unmanaged pseudo vars.)
Retain means incrementing the reference count by 1, and release means decrementing the reference count by 1.
But think about what happens for the initial value assignment at (1).
If the usual assignment code was generated, the code might crash at this line:
swift_release(_refVar)
because the newly allocated region for a var may be filled with garbage, so swift_release(_refVar) cannot be executed safely.
Filling the new region with zero (null) and having release safely ignore null could be one solution, but it's somewhat redundant and inefficient.
So, Swift generates this sort of code for initial value assignment:
(for already retained values, if you know ownership model, owned by you.)
_refVar = _initValue
(for unretained values, meaning you have no ownership yet.)
swift_retain(_initValue)
_refVar = _initValue
This is initialize.
It does not release the garbage data; it only assigns an initial value, retaining it if needed.
(The above explanation of "usual assignment" is a little bit simplified, Swift omits swift_retain(_newValue) when not needed.)
When exiting the local scope at (3), Swift just generates this sort of code:
swift_release(_refVar)
So, this is deinitialize.
Of course, you know retaining and releasing are not needed for primitive types like Int, so initialize and deinitialize may do nothing for such types.
And when you define a value type which includes some reference type properties, Swift generates initialize and deinitialize procedures specialized for the type.
The local scope example applies to regions allocated on the stack, and initialize() and deinitialize() of UnsafeMutablePointer apply to regions allocated on the heap.
And Swift is evolving so swiftly that you might find another reason for needing initialize() and deinitialize() in the future; you'd better make it a habit to initialize() and deinitialize() all allocated UnsafeMutablePointers of any Pointee type.
From the documentation it is possible to conclude that .initialize() is a method that:
Initializes memory starting at self with the elements of source.
And .deinitialize() is a method that:
De-initializes the count Pointees starting at self, returning their memory to an uninitialized state.
We should understand that when we are using UnsafeMutablePointer we have to manage memory on our own. The methods described above help us do this.
So in your case let's analyze the example that you provided:
let pointer = UnsafeMutablePointer<String>.allocate(capacity: 1)
// allocate a memory space
pointer.initialize(to: "42")
// initialise memory
pointer.pointee // "42"
// reveals what is in the pointee location
pointer.pointee = "43"
// change the contents of the memory
pointer.deinitialize()
// return the memory to an uninitialized state
pointer.deallocate(capacity: 1)
// deallocate memory
So your code crashes because you do not initialize memory and try to set value.
Previously, in Objective-C, when working with objects we always used [[MyClass alloc] init].
In this case:
alloc: allocates a part of memory to hold the object, and returns the pointer.
init: sets up the initial parameters of the object and returns it.
So basically .initialize() sets the value in the allocated memory. When you create an object with only alloc, you merely hold a reference to an empty part of memory on the heap. When you call .initialize(), you set a value in this heap allocation.
Nice article about the pointers.
Consider below struct:
typedef struct _Index {
NSInteger category;
NSInteger item;
} Index;
If I use this struct as a property:
@property (nonatomic, assign) Index aIndex;
When I access it without any initialization right after a view controller alloc init, LLDB prints it as:
(lldb) po vc.aIndex
(category = 0, item = 0)
(lldb) po &_aIndex
0x000000014e2bcf70
I am a little confused: the struct already has a valid memory address, even before I allocate one. Does Objective-C initialize structs automatically? If it is an NSObject, I have to do alloc init to get a valid object, but for a C struct, I get a valid struct even before I try to initialize it.
Could somebody explain, and is it OK to leave it like this, without manually initializing it?
To answer the subquestion, why you cannot assign to a structure component returned from a getter:
(I add this as motivation because I have read this Q several times.)
A. This has nothing to do with Objective-C. It is a behavior stated in the C standard. You can check it with simple C code:
NSMakeSize( 1.0, 2.0 ).width = 3.0; // Error
B. No, it is not an improvement of the compiler. If it were, the result would be a warning, not an error. A compiler developer does not have the liberty to decide what an error is. (There are some cases in which they have that liberty, but these are explicitly mentioned.)
C. The reason for this error is quite easy:
An assignment to the expression
NSMakeSize( 1.0, 2.0 ).width
would be legal if that expression were an l-value. A . operator's result is an l-value if the structure is an l-value:
A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member,82) and is an lvalue if the first expression is an lvalue.
ISO/IEC 9899:TC3, 6.5.2.3
Therefore it would be assignable, if the expression
NSMakeSize( 1.0, 2.0 )
is an l-value. It is not. The reason is a little bit more complex. To understand it you have to know the links between ., -> and &:
In contrast to ., the result of -> is always an l-value.
A postfix expression followed by the -> operator and an identifier designates a member of a structure or union object. The value is that of the named member of the object to which the first expression points, and is an lvalue. 83)
Therefore - that is what footnote 83 explains - ->, &, and . are linked:
If you can calculate the address of a structure S having a component C with the & operator, the expression (&S)->C is equivalent to S.C. This requires that you can calculate the address of S. But you can never do that with a return value, even if it is a simple integer …
int f(void)
{
return 1;
}
f()=5; // Error
… or a pointer …
int *f(void)
{
return NULL;
}
f()=NULL; // Error
You always get the same error: it is not assignable, because it is an r-value. This is obvious, because it is not clear
a) how the compiler returns a value, esp. whether it does so in address space at all;
b) when the lifetime of the returned value is over.
Going back to the structure, that means that the return value is an r-value. Therefore the result of the . operator on it is an r-value. You are not allowed to assign a value to an r-value.
D. The solution
There is a way to assign to a "returned structure". One might debate whether it is good or not. Since the result of -> is always an l-value, you can return a pointer to the structure. Dereferencing this pointer with the -> operator always yields an l-value, so you can assign a value to it:
// obj.aIndex returns a pointer
obj.aIndex->category = 1;
You do not need @public for that. (Which really is a bad idea.)
The semantics of the property are to copy the struct, so it doesn't need to be allocated and initialized like an Objective-C object would. It's given its own space like a primitive type is.
You will need to be careful updating it, as this won't work:
obj.aIndex.category = 1;
Instead you will need to do this:
Index index = obj.aIndex;
index.category = 1;
obj.aIndex = index;
This is because the property getter will return a copy of the struct and not a reference to it (the first snippet is like the second snippet, without the last line that assigns the copy back to the object).
So you might be better off making it a first class object, depending on how it will be used.
When I have 2 functions in D like this:
void func() {
void innerFunc() {
import std.stdio;
writeln(x);
}
int x = 5;
innerFunc();
}
When I call func this will print 5. How does it work? Where in memory does the 5 get stored? How does innerFunc know it has to print 5?
I attempt to answer this in broad terms. This type of issue arises in a number of languages that permit nested function definitions (including Ada and Pascal).
Normally, a variable like "x" is allocated on the processor stack. That's the normal process in any language that permits recursion.
When a nested function is called, a descriptor for the enclosing function's stack frame gets passed as a hidden argument.
func() then knows that x is located at some offset from its base pointer register.
innerFunc() knows the offset of x but has to derive the base from the hidden argument. It can't use its own base pointer value because it will be different from func()'s. And, if innerFunc() called itself, the base pointer value would be different in each invocation.