Where are variables stored that aren't in the stack?

So for example if I had a program which used:
example = a.top
Where is the variable example stored?
and where are the values that are popped from the stack stored?
Extra:
If I had a basic program without functions that just added some values together like (variables would be equal to the user input):
a=8
b=2
c=5
d=2
answer = a+b+c+d
Would the variables a,b,c,d,answer be stored in the stack or another area?

This question is hard to answer because it depends very much on the language you use.
In C, C++, Pascal and other traditional imperative languages, the usual answer is that variables live on the stack or on the heap (or in static storage, for globals).
It gets more complicated when you consider that compiler optimisations may keep them in CPU registers (even if you asked for a local variable) or even nowhere at all: the compiler simply removes a variable that turns out to be useless.
Another possibility is memory-mapped data (after calling mmap()): the data may then live in kernel space (some kernel page mapped directly) or come from a file stored on disk.
In interpreted languages (Python, Perl, Bash), variables are handled by the interpreter itself, probably on the heap.
With functional languages things get harder, because such concepts (heap, stack) have a really different meaning; one could say everything is on the stack (which has little to do with the C stack you may imagine).
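For the interpreted case, here is a minimal sketch in Python, assuming CPython and a plain list standing in for the stack from the question (the names stack, example and popped are made up for illustration). It suggests that reading or popping the top does not move the value anywhere: the variable is simply another name for an object the interpreter already manages.

```python
# Hypothetical illustration: a Python list used as a stack.
stack = [8, 2, 5]

# Reading the top does not copy the value: `example` is just another
# name bound to the same object that the list already refers to.
example = stack[-1]
same_object_before_pop = example is stack[-1]

# pop() removes the list's reference and hands the object back;
# the object itself stays where the interpreter allocated it.
popped = stack.pop()
same_object_after_pop = popped is example

print(same_object_before_pop, same_object_after_pop, stack)
```

Both identity checks come out true, and the list is left as [8, 2]. In a compiled language the picture is different, as described above.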

COBOL: What is the benefit of using paragraphs and sections instead of subprograms?

What is the benefit of using paragraphs and sections for executing pieces of code, instead of using a subprogram? As far as I can see, paragraphs and sections are dangerous because they have a non-intuitive control flow, it's easy to fall through and execute stuff you never meant to execute, and there is no variable (item) scoping, so it encourages a style of programming where everything is visible to everything else. It's a slippery soup.
I read a lot, but I could not find anything on the comparative benefit of paragraphs/sections vs. a subprogram. I also asked some people on a COBOL forum, but their answers were along the lines of "is this a joke" or "go learn programming"(!!!).
I do not wish to engage in a discussion of stylistic preferences; everyone writes the way their brain works. I only want to know: is there any benefit to using paragraphs/sections for flow control? As in, are there any COBOL operations that can be done only by using paragraphs/sections? Or is it just a remnant of an early way of thinking about code?
No other language I know of has mimicked this, so either it has some concrete, essential mechanical reason to exist in COBOL, or it is a stylistic preference of COBOL people. Can someone enlighten me on what is happening?
These are multiple questions... the two most important ones:
Are there any COBOL operations that can be done only by using paragraphs/sections?
Yes. A probably incomplete list:
USE statements in DECLARATIVES can only apply to a paragraph or a section. These are used for handling file errors and exceptions. Not all compilers support this COBOL standard feature in full.
Segmentation (primarily: a program that is only partially loaded in memory) is only possible with sections, but that is to be considered a "legacy feature" (at least I don't know of people actually using it this way explicitly); see the comment of Gilbert Le Blanc for more details on this
fall-through; many other languages have this feature in some kind of switch statement (COBOL's EVALUATE, which is not the same as a common switch but can be used similarly, has an implicit break and no fall-through)
GO TO DEPENDING ON (this could be recoded with EVALUATE and PERFORM to achieve something similar, but if the paragraphs are expected to fall through, which is not uncommon, that creates a lot of extra code)
GO TO in general and, especially nice, the old, obsolete ALTER statement
PERFORM statement, format 1 "out-of-line"
file state is only shared between programs when you define it as EXTERNAL, and you often want file state to be limited to a single program
up to COBOL85: EXIT statement (plain, without anything else, actually doing nothing more than a CONTINUE would)
What is the benefit of using paragraphs and sections for executing pieces of code, instead of using a subprogram?
shared data (I guess you know of programs with static data or otherwise (module-)global data that is shared between functions/methods and also between different source code files)
much less overhead than a CALL
consistency:
you know what's in your code; you don't know what another program does (or at least you cannot guarantee that it will do exactly the same some years later)
easier to extend/change: adding another variable to a CALL USING (or removing part of one, or changing its size) means that you also have to adjust the called program and all programs that call it; even when you place the complete definition in a copybook, which is very reasonable, you still have to recompile all programs that use it
a section/paragraph is always available (it is already loaded when the program runs), a CALLed program may not be available or lead to an exception, for example because it cannot be loaded as its parameters have changed
less stuff to code
Note: While not all compilers support this, you can work around nearly all of the runtime overhead and consistency issues by using one source file with multiple program definitions (possibly nested) and a static call convention. This likely gives you the "modern" view you are aiming for, with scope-limited variables: within the programs they are either persistent (like local-static) when defined in WORKING-STORAGE, always passed when in LINKAGE, or "local-temporary" when in LOCAL-STORAGE.
Should all code of an application be in one program?
[I've added this one to not lead to bad assumptions] Of course not!
Using sub-programs and also user-defined functions (possibly even nested, providing the option for "scoped" and "shared" data) is a good thing where you have a "feature boundary" (for example: access to data, user interface, ...) or, with "modern" COBOL, where you have a "language boundary" (for example: direct CALLs of C/Java/whatever), but it isn't "just for limiting a counter to a section". In that case, either define a variable whose state is not guaranteed to be available after any PERFORM, or define one for the section/paragraph; in both cases it would be reasonable to use a prefix telling you this.
Using that "separate by boundary" approach also takes care of the "bad habit of everything being seen by everyone" issue (which is in any case only true for "all sections/paragraphs in the same program").
Personal side note: I would only use paragraphs where it is a "shop/team rule" (it is better to stay consistent than to do things differently "just because they are better" [still providing an option to possibly change the common rule]) or for GO TO, which I normally do not use.
SECTIONs and EXIT SECTION + EXIT PERFORM [CYCLE] (and very rarely GOBACK/EXIT PROGRAM) make paragraphs nearly unnecessary.
Very short answer: subroutines!
Subroutines execute in the context of the calling routine. Two virtues: no parameter passing, easy to create. In some languages, subroutines are private to (and are part of) the calling (invoking) routine (see various dialects of BASIC).
Direct answer: sections and paragraphs support a different way of thinking about programming. They give higher performance than calling a subprogram, and they support overlays. The fall-through aspect can be quite useful, a feature rather than a vice. They may be necessary depending on what you are doing with a specific COBOL compiler.
See also PL/1, BAL/360, architecture 360/370/...
As a veteran COBOL dinosaur, I would say asking about the benefit is not the right question. I used paragraphs (or sections) differently from subprograms. The right question, in my opinion, is when to use each of them logically. To make an analogy: if you have a Dog class in Java, you will write Dog-appropriate methods within it. If there's a cat involved, you may need a helper class. In this case the helper class is the subprogram. You could instead code the helper class's methods inside the Dog class, but that would be bad coding.
In any other language I would recommend putting self contained functions into subroutines.
However in COBOL not so much. If the code is very likely to be used in other programs then a subroutine is a good idea. Otherwise not!
The reason is the total lack of any compile-time checks on the number, type, or existence of passed parameters. Small errors in CALL statements lead to program crashes at run time. Limiting the use of subroutines and carefully checking the calling code for errors makes for a more reliable program.
With paragraphs, any type mismatch will be flagged at compile time, or an automatic conversion will occur.

How does Rust store types at runtime?

A u32 takes 4 bytes of memory, a String takes 3 pointer-sized integers (for location, size, and reserved space) on the stack, plus some amount on the heap.
This to me implies that Rust doesn't know, when the code is executed, what type is stored at a particular location, because that knowledge would require more memory.
But at the same time, does it not need to know what type is stored at 0xfa3d2f10, in order to be able to interpret the bytes at that location? For example, to know that the next bytes form the spec of a String on the heap?
How does Rust store types at runtime?
It doesn't, generally.
Rust doesn't know, when the code is executed, what type is stored at a particular location
Correct.
does it not need to know what type is stored
No, the bytes in memory should be correct, and the rest of the code assumes as much. The offsets of fields in a struct are baked into the generated machine code.
When does Rust store something like type information?
When performing dynamic dispatch, a fat pointer is used. This is composed of a pointer to the data and a pointer to a vtable, a collection of functions that make up the interface in question. The vtable could be considered a representation of the type, but it doesn't have a lot of the information that you might think goes into "a type" (unless the trait requires it). Dynamic dispatch isn't super common in Rust as most people prefer static dispatch when it's possible, but both techniques have their benefits.
There are also concepts like TypeId, which can represent one specific type, but only for a subset of types. It also doesn't provide much capability besides "are these the same type or not".
Isn't this all terribly brittle?
Yes, it can be, which is one of the things that makes Rust so interesting.
In a language like C or C++, there's not much that safeguards the programmer from making dumb mistakes that go out and mess up those bytes floating around in memory. Making those mistakes is what leads to bugs due to memory safety. Instead of interpreting your password as a password, it's interpreted as your username and printed out to an attacker (oops!)
Rust provides safeguards against that in the form of a strong type system and tools like the borrow checker, but still all done at compile time. Unsafe Rust enables these dangerous tools with the tradeoff that the programmer is now expected to uphold all the guarantees themselves, much like if they were writing C or C++ again.
See also:
When does type binding happen in Rust?
How does Rust implement reflection?
How do I print the type of a variable in Rust?
How to introspect all available methods and members of a Rust type?

Can I somehow preserve values of local macros in Stata after the completion of the do-file?

Whenever I add new lines to the code (e.g. when computing a different estimate) I do not want to rerun the whole do-file again. However, I often need the values of certain local macros that were generated during the previous run of the do-file.
Is there a way to keep those values? Or should I switch to using more globals instead?
Yes, use global.
But note that you need to be careful with global for the exact reason you are using it: the macro remains in memory until you exit that instance of Stata, or until you reset it within the code.
Some people have very strong feelings about never using global (see p. 5 onwards here: http://faculty.chicagobooth.edu/matthew.gentzkow/research/ra_manual_coding.pdf). Once you learn the properties of globals, and how to avoid the small number of problems they can potentially cause, you should be fine.
Globals are by no means the only alternative.
First, consider using scalars. A scalar with a permanent name will survive beyond the end of a do-file.
Second, consider converting your do-file to a program and learning about saved results.
Third, you can always consider putting results in a new variable; it's just that it is usually bad style and wasteful on storage.
At a guess, the first is likely to be the most useful for you. Many Stata users are happy to use do-files with many dataset-specific statements. Jumping to writing fully-fledged and more general programs is a big jump and not (at first) trivial.

How to Assign a Variable Name in a #define (Boost related Mem Leak)?

I've run Memory Validator on an application we're developing, and I've found that a macro we've defined, O_set, is at the root of about 90% of the leaks.
Now, our macros are defined as follows:
#define O_SET_VALUE(ValueType, Value) boost::shared_ptr<ValueType>(new ValueType(Value))
.
.
#define O_set O_SET_VALUE
However, according to the Boost web site (at: http://www.boost.org/doc/libs/1_46_1/libs/smart_ptr/shared_ptr.htm):
A simple guideline that nearly eliminates the possibility of memory leaks is: always use a named smart pointer variable to hold the result of new. Every occurrence of the new keyword in the code should have the form: shared_ptr<T> p(new Y); It is, of course, acceptable to use another smart pointer in place of shared_ptr above; having T and Y be the same type, or passing arguments to Y's constructor is also OK.
If you observe this guideline, it naturally follows that you will have no explicit deletes; try/catch constructs will be rare.
This leads me to believe that this is indeed the major cause of our memory leaks. Or am I being naive or completely out of my depth here?
Question is, is there a way to work around the mentioned issue, with the above macro #defines?
Update:
I'm using them, for example, like this:
return O_set(int, 1);
_time_stamp(O_set(TO_DateTime, TO_DateTime())) (_time_stamp is a member of a certain class)
I'm working on Windows and used Memory Validator for tracking the memory leaks. According to it there are leaks, and, as I stated, the root of most of them (according to the stack traces) comes down to that macro #define.
Smart pointers are tricky. The first thing I would do is to check your code for any 'new' statement which isn't inside either macro.
Then you have to think about how the pointers are being used; if you pass a smart pointer by reference, the reference counter isn't increased, for example.
Another thing to check is all instances of '.get()', which is a big problem if you are working with a legacy code base or with other developers who don't understand the point of using smart pointers! (This has more to do with preventing random crashes than memory leaks per se, but it is worth checking.)
Also, you might want to consider why you are using a macro for all smart pointer creation. Boost supplies different smart pointers for different purposes. There isn't a one-size-fits-all solution. Good old std::auto_ptr is fine for most uses, except storing in standard containers, but you knew that already.
The most obvious and overlooked aspect is: do you really need to 'new' something? C++ isn't Java; if you can avoid creating dynamic objects, you are better off doing so.
If you are lucky enough to be working on a *NIX platform (you don't mention, sorry) then try Valgrind's leak-checking tool. It's very useful. There are similar tools available for Windows, but often using your own software skills is best.
Good luck.

Does how you name a variable have any impact on the memory usage of an application?

When declaring a variable, how much (if at all) does the length of its name affect the total memory use of the application? Is there a maximum length anyway? Or are we free to elaborate on our variables (and instances) as much as we want?
It depends on the language, actually.
If you're using C++ or C, it has no impact.
If you're using an interpreted language, you're passing the source code around, so it can have a dramatic impact.
If you're using a compiled language that compiles to an intermediate language, such as Java or any of the .NET languages, then typically the variable names, class names, method names, etc. are all part of the IL. Having longer method names will have an impact. However, if you later run the code through an obfuscator, this goes away, since the obfuscator will rename everything to (typically) very short names. This is why obfuscation often has an effect on performance.
However, I strongly suggest using long, descriptive variable/method/class names. This makes your code understandable, maintainable, and readable; in the long run, that far outweighs any slight performance benefit.
It has no impact in a compiled language.
In compiled languages, almost certainly not; everything becomes a symbol in a symbol table. In interpreted languages, the answer is also no, with a few extremely rare exceptions (in certain older versions of Python there would be a difference, for example).
MSVC++ truncates variable names to 255 characters. Variable name length has no impact on compiled code size.
As stated by others, variable names disappear in compiled languages. I believe that local variable names in .NET may be discarded. But generally speaking, even in an interpreted language, the memory consumption of variable names is negligible, especially in light of the advantages of good variable names.
Actually, in ASP.NET long variable names for controls and master pages do add to the size of the generated HTML. This will add some insignificant extra memory to buffer the output stream, but the effect will be most noticed in the extra few hundred bytes you're sending over the network.
In Python, the names appear to be collected into a number of simple tables; each name appears exactly once in each code object. The names have no impact on performance.
For statistical purposes, I looked at a 20 line function that was a solution to Project Euler problem 15. This function created a 292-byte code object. It used 7 distinct names in the name table. You'd have to use 41-character variable names to double the size of the byte-code file.
That would be the only impact -- insanely large names might slow down load time for your module.
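The name tables mentioned above can be inspected directly. This is a minimal sketch, assuming CPython; the two functions and their variable names are made up for illustration. It compares a function written with short names against the same function written with long ones.

```python
def short_names(a, b):
    c = a + b
    return c * 2


def long_names(first_operand_value, second_operand_value):
    intermediate_sum_of_operands = first_operand_value + second_operand_value
    return intermediate_sum_of_operands * 2


short_code = short_names.__code__
long_code = long_names.__code__

# Local names live in a per-function table, each stored once.
short_table = short_code.co_varnames
long_table = long_code.co_varnames

# The instructions refer to variables by index into that table, so the
# bytecode is the same size whatever the names are.
same_bytecode_size = len(short_code.co_code) == len(long_code.co_code)

print(short_table)
print(long_table)
print(same_bytecode_size)
```

The last line prints True: only the name table grows with longer names, not the instructions that get executed.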
