Can I somehow preserve values of local macros in Stata after the completion of the do-file?

Whenever I add new lines to the code (e.g. when computing a different estimate) I do not want to rerun the whole do-file again. However, I often need the values of certain local macros that were generated during the previous run of the do-file.
Is there a way to keep those values? Or should I switch to using more globals instead?

Yes, use global.
But note that you need to be careful with global for the exact reason you are using it: the macro remains in memory until you exit that instance of Stata, or until you reset it within the code.
Some people have very strong feelings about never using global (see p. 5 onward here: http://faculty.chicagobooth.edu/matthew.gentzkow/research/ra_manual_coding.pdf). Once you learn the properties of globals, and how to avoid the small number of problems they can cause, you should be fine.

Globals are by no means the only alternative.
First, consider using scalars. A scalar with a permanent name will survive beyond the end of a do-file.
Second, consider converting your do-file to a program and learning about saved results.
Third, you can always consider putting results in a new variable; it's just that it is usually bad style and wasteful on storage.
At a guess, the first is likely to be the most useful for you. Many Stata users are happy to use do-files with many dataset-specific statements. Jumping to writing fully-fledged and more general programs is a big jump and not (at first) trivial.

Related

COBOL: What is the benefit of using paragraphs and sections instead of subprograms?

What is the benefit of using paragraphs and sections for executing pieces of code instead of using a subprogram? As far as I can see, paragraphs and sections are dangerous because they have a non-intuitive control flow, it's easy to fall through and execute stuff you never meant to execute, and there is no variable (item) scoping, so they encourage a style of programming where everything is visible to everything else. It's a slippery soup.
I read a lot, but I could not find anything related to the comparative benefit of paragraphs/sections vs a subprogram. I also asked some people on a COBOL forum, but their answers were along the lines of "is this a joke" or "go learn programming"(!!!).
I do not wish to engage in a discussion of stylistic preferences; everyone writes the way their brain works. I only want to know: is there any benefit to using paragraphs/sections for flow control? As in, are there any COBOL operations that can be done only by using paragraphs/sections? Or is it just a remnant of an early way of thinking about code?
No other language I know of has mimicked that, so either it has some concrete, essential mechanical reason to exist in COBOL, or it is a stylistic preference of the COBOL people. Can someone enlighten me about what is happening?
These are multiple questions... the two most important ones:
Are there any COBOL operations that can be done only by using paragraphs/sections?
Yes. A likely not complete list:
USE statements in DECLARATIVES can only apply to a paragraph or a section. These are used for handling file errors and exceptions. Not all compilers support this COBOL standard feature in full.
Segmentation (primarily: a program that is only partially loaded in memory) is only possible with sections, but that is to be considered a "legacy feature" (at least I don't know of people actually using it this way explicitly); see the comment of Gilbert Le Blanc for more details on this.
Fall-through. Many other languages have this feature in a kind of switch statement (COBOL's EVALUATE, which is not the same as a common switch but can be used similarly, has an implicit break and no fall-through).
GO TO DEPENDING ON (could be recoded to achieve something similar with EVALUATE and PERFORM, but if the paragraphs are expected to fall through, which is not uncommon, that creates a lot of extra code).
GO TO in general and especially nice - the old obsolete ALTER statement
PERFORM statement, format 1 "out-of-line"
file state is only shared between programs when you define it as EXTERNAL, and you often want to have a file state being limited to a single program
up to COBOL85: EXIT statement (plain without anything else, actually doing nothing more than a CONTINUE would)
What is the benefit of using paragraphs and sections for executing pieces of code instead of using a subprogram?
shared data (I guess you know of programs with static data or otherwise (module)global data that is shared between functions/methods and also different source code files)
much less overhead than a CALL is
consistency:
you know what's in your code, you don't know what another program does (or at least: you cannot guarantee that it will still do exactly the same thing some years later)
easier to extend/change: adding another variable to a CALL USING (or removing part of it, or changing its size) means that you also have to adjust the called program and all programs that call it. Even when you place the complete definition in a copybook, which is very reasonable, you still have to recompile all programs that use it
a section/paragraph is always available (it is already loaded when the program runs), a CALLed program may not be available or lead to an exception, for example because it cannot be loaded as its parameters have changed
less stuff to code
Note: While not all compilers support this, you can work around nearly all of the runtime overhead and the consistency issues by using one source file with multiple (possibly nested) program definitions and a static call convention. This likely gives you the "modern" view you aim for, with scope limitation of variables within the programs: persistent (like local-static) when defined in WORKING-STORAGE, always passed when in LINKAGE, or "local-temporary" when in LOCAL-STORAGE.
Should all code of an application be in one program?
[I've added this one to not lead to bad assumptions] Of course not!
Using sub-programs and also user-defined functions (possibly even nested, providing the option for "scoped" and "shared" data) is a good thing where you have a "feature boundary" (for example: access to data, user-interface, ...) or, with "modern" COBOL, where you have a "language boundary" (for example: direct CALLs of C/Java/whatever). But it isn't "just for limiting a counter to a section". In that case, either define a variable whose state is not guaranteed to be available after any PERFORM or define one for the section/paragraph; in both cases it would be reasonable to use a prefix telling you this.
Using that "separate by boundary" approach also takes care of the "bad habit of everything being seen by everyone" issue (which is in any case only true for all sections/paragraphs in the same program).
Personal side note: I would only use paragraphs where it is a "shop/team rule" (it is better to stay consistent than to do things differently "just because they are better" [still leaving an option to possibly change the common rule]) or for GO TO, which I normally do not use.
SECTIONs and EXIT SECTION + EXIT PERFORM [CYCLE] (and very rarely GOBACK/EXIT PROGRAM) make paragraphs nearly unnecessary.
very short answer. subroutines!!
Subroutines execute in the context of the calling routine. Two virtues: no parameter passing, easy to create. In some languages, subroutines are private to (and are part of) the calling (invoking) routine (see various dialects of BASIC).
direct answer: Section and Paragraph support a different way of thinking about programming. Higher performance than call subprogram. Supports overlays. The "fall thru" aspect can be quite useful, a feature rather than a vice. They may be necessary depending on what you are doing with a specific COBOL compiler.
See also PL/1, BAL/360, architecture 360/370/...
As a veteran Cobol dinosaur, I would say asking about the benefit is not the right question. I used paragraph (or section) differently than a subprogram. The right question in my opinion is when to use them logically. If I can make an analogy, if you have a Dog java class, you will write Dog-appropriate methods within it. If there's a cat involved, you may need a helper class. In this case the helper class is the subprogram. Though, you can instead code the helper class methods inside the Dog class, but that will be bad coding.
In any other language I would recommend putting self contained functions into subroutines.
However in COBOL not so much. If the code is very likely to be used in other programs then a subroutine is a good idea. Otherwise not!
The reason is the total lack of any compile-time checks on the number, type, or existence of passed parameters. Small errors in CALL statements lead to program crashes at run time. Limiting the use of subroutines and carefully checking the calling code for errors makes for a more reliable program.
With paragraphs, any type mismatch will be flagged at compile time, or an automatic conversion will occur.

Erlang: Compute data structure literal (constant) at compile time?

This may be a naive question, and I suspect the answer is "yes," but I had no luck searching here and elsewhere on terms like "erlang compiler optimization constants" etc.
At any rate, can (will) the erlang compiler create a data structure that is constant or literal at compile time, and use that instead of creating code that creates the data structure over and over again? I will provide a simple toy example.
test() -> sets:from_list([usd, eur, yen, nzd, peso]).
Can (will) the compiler simply stick the set there at the output of the function instead of computing it every time?
The reason I ask is, I want to have a lookup table in a program I'm developing. The table is just constants that can be calculated (at least theoretically) at compile time. I'd like to just compute the table once, and not have to compute it every time. I know I could do this in other ways, such as compute the thing and store it in the process dictionary for instance (or perhaps an ets or mnesia table). But I always start simple, and to me the simplest solution is to do it like the toy example above, if the compiler optimizes it.
If that doesn't work, is there some other way to achieve what I want? (I guess I could look into parse transforms if they would work for this, but that's getting more complicated than I would like?)
THIS JUST IN. I used compile:file/2 with an 'S' option to produce the following. I'm no erlang assembly expert, but it looks like the optimization isn't performed:
{function, test, 0, 5}.
{label,4}.
{func_info,{atom,exchange},{atom,test},0}.
{label,5}.
{move,{literal,[usd,eur,yen,nzd,peso]},{x,0}}.
{call_ext_only,1,{extfunc,sets,from_list,1}}.
No, the Erlang compiler doesn't perform partial evaluation of calls to external modules, and sets is one. You can use the ct_expand module of the well-known parse_trans library to achieve this effect.
Given that a set is not a native datatype in Erlang and is (as a matter of fact) just a library written in Erlang, I don't think it's feasible for the compiler to create sets at compile time.
As you can see, calls to sets are not optimized in Erlang (nor are calls to any other library written in Erlang).
The way of solving your problem is to compute the set once and pass it as a parameter to the functions or to use ETS/Mnesia.
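The "compute once and pass it along" suggestion could look roughly like the sketch below. The module name currency_lookup and the functions new/0, is_known/2 and demo/0 are made up for illustration; only sets:from_list/1 and sets:is_element/2 come from the standard library.

```erlang
-module(currency_lookup).
-export([new/0, is_known/2, demo/0]).

%% Build the lookup set. Call this once (e.g. at startup) and keep the result.
new() ->
    sets:from_list([usd, eur, yen, nzd, peso]).

%% Membership test against a set the caller already holds,
%% so the set is not rebuilt on every lookup.
is_known(Code, Known) ->
    sets:is_element(Code, Known).

%% Hypothetical usage: the set is computed a single time and reused.
demo() ->
    Known = new(),
    [is_known(Code, Known) || Code <- [usd, gbp, peso]].
```

Here currency_lookup:demo() returns [true,false,true]. In a long-running process the set would typically live in the process state (for example a gen_server's state) rather than being threaded through every call by hand.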

Are there any downsides to passing in an Erlang record as a function argument?

There is no downside, unless the caller function and the called function were compiled with different 'versions' of the record.
Some functions from Erlang's standard library do indeed use records in their interfaces (I can't recall which ones right now, but there are a few). In my humble opinion, though, the major turnoff is that the user will have to include your header file just to use your function.
That seems un-erlangy to me (you don't ever do that normally, unless you're using said functions from the stdlib), creates weird inter-dependencies, and is harder to use from the shell (I wouldn't know off the top of my head how to load and use records from the shell; I usually just "cheat" by constructing the tuple manually...)
Also, handling records is a bit different from the stuff you usually do, since their fields by default take the atom 'undefined' as value, contrary to how it usually works with proplists, for instance (a value that wasn't set just isn't there). This might cause some confusion for people who do not normally work a lot with records.
So, all-in-all, I'd usually prefer a proplist or something similar, unless I have a very good reason to use a record. I do usually use records, though, for internal state of for example a gen_server or a gen_fsm; It's somewhat easier to update that way.
I think the biggest downside is that it's not idiomatic. Have you ever seen an API that required you to construct a record and pass it in?
Why would you want to do something that's going to feel foreign to any erlang programmer? There's a convention already in use for optional named arguments to functions. Inventing yet another way without good cause is pointless.
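The existing convention for optional named arguments is an options list (proplist). A minimal sketch of such an API follows; the module conn_opts, the function connect/2, the option names and the default values are all invented for illustration, while proplists:get_value/3 is the standard library call.

```erlang
-module(conn_opts).
-export([connect/2]).

%% Options arrive as a proplist; anything the caller leaves out
%% falls back to a default, so no header file or record is needed.
connect(Host, Opts) when is_list(Opts) ->
    Port    = proplists:get_value(port, Opts, 5432),     % assumed default
    Timeout = proplists:get_value(timeout, Opts, 5000),  % assumed default
    {ok, {Host, Port, Timeout}}.
```

Calling conn_opts:connect("db.local", [{port, 6543}]) returns {ok,{"db.local",6543,5000}}. The caller needs no header file and can type the call directly in the shell.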

With Delphi are you more likely to re-use temporary variables than with other languages?

Since Delphi makes you go all the way up to the var section of a method to declare a local variable, do you find yourself breaking "Curly's Law" (re-using variables) more often than you did in college? (Unless, of course, you programmed Pascal in college.)
If so, what do you do to break yourself of that habit, especially in functions where you need to get and/or set large numbers of properties. Is there a threshold where it is acceptable to declare TempInt : Integer and TempStr : String. (Do you use an 'e' in Temp sometimes and not other times?)
I hardly ever reuse variables. I hate to say never, but it is close to never.
Here is why:
Small methods (It's good practice to keep methods and property-getters/setters as concise as possible).
When only one thing is done, no need to reuse variables
The var section is always on the screen.
The compiler reuses the storage as necessary, so reuse is only a lazy coder crutch with no performance improvements.
Newer versions of Delphi have CTRL+SHIFT+V to declare a variable if I am feeling lazy.
Reusing variables makes debugging more difficult. More time and effort is spent on maintenance than on development (for any serious application), so always do things to make maintenance easier, even if it makes development a little harder.
Prefer user-defined types, so an Account Balance is a specific type, not just a Currency. This means variables are less reusable anyway.
For loop variables (a commonly reused variable) are used less now that we can use for in and skip the iterator altogether.
My variables have descriptive names, so it would not make sense to use them out of context.
Generally speaking, I like having all the variables at the top for the same reason I like having an interface section on my units. It is kind of like having an abstract on a paper: it gives me a general idea of what is going on without having to read the whole paper. Delphi could benefit from having the ability to declare variables at "inner scope", like within a for loop or other begin / end blocks, but I don't know how much that would detract from the cleanliness and readability of Delphi code.
This is just a matter of discipline. Yes, Delphi would probably be better served by inline variable declaration, but that's not really a big deal. Just be sure to name your variables in a descriptive way, and then it will just feel awkward to use them incorrectly. And, as Stephan Eggermont said, if your methods are really getting that long, then that's a whole different code smell.
Not really. As I make my methods really small the var section is not far away. As my method size has reduced a lot since university, I'd say I break it less often.
I definitely do tend to re-use local variables like 'Findex' (or just plain 'i') if the routine has several distinct iterative sections to it. Not really the best practice I guess, but I'd like to think it's only really obvious where I do it, and obviously the usage doesn't overlap.
It's not usually a big deal to go back to the top of the routine and key-in the new variables, though I didn't know about Ctrl-Shift-V (will be trying that later!).
It'll be interesting to see what everyone else says. :-)
I don't tend to reuse local vars as a general safety rule. I do love the new "var" live template stuff in d2007+. Just type var[tab] and the helper pops up. Also check out Ctrl-Shift-D (others mentioned Ctrl-Shift-V for local vars) to declare a field.
Declaring variables is very simple - some times they would get automatically created ('for' loop template), other times you can just use 'Declare Variable' refactoring (or 'Add Local Var' if you are using MMX - as you should).
You can develop your own style of coding that uses variables as required. I generally use unique vars (90%) with a few temp vars (10%) when required.
It depends on the nature of the var. If it is a var to help support other code (counter for loops, building SQL strings, etc.) then a temp var you can re-use is helpful. In this case your temp vars are useful as "disposable" vars in sections of code. Just add a comment to your var declarations indicating the temp vars.
i.e. //temp vars are re-used as required in this procedure --> clear/re-initialise them after/before use.
Other than that I avoid temp vars & never use them to hold critical data. A unique var should be used then to avoid confusion & make readability/maintenance of code clearer.
I think Delphi makes the exception with the overuse of temp variables. Most of the time when I'm creating a function/procedure where I know I will need loops or temp strings, the first thing I do is to create a var i,j:integer; tmp:string; and add more as needed :)
As a long-time Delphi user (since 1.0), this is the major thing I hate about Pascal. All other modern languages support definition at the point of use, yet Delphi persists with the var section, and Delphi programmers persist in ridiculous hand-waving antics to justify it.
Well Curly did have a good point. I'm a sinner in that respect occasionally. Usually just a temp string variable for convenience more than anything.
To be honest I've never really thought about it... until now. I have no issue with the VAR section being where it is as that's been a habit formed since Delphi 1.0.
To answer the question, I only re-use a temp variable, usually a string, and usually only to gain some slight performance improvements. Don't have an issue with that.
I probably would have found this to be a bigger problem if I hadn't had CTRL-SHIFT-V as a shortcut to the VAR section. I'm not writing GIGANTIC methods here, but sometimes they get a little out of hand (and I can justify this of course) and it helps a lot. I'm not sure if that shortcut comes from cnTools or GExperts, but they're both pretty useful and I'd recommend them both.

Is it good practice to use a Dynamic Array in an object field?

I am refactoring some existing Delphi code into a class.
The current code uses a global variable defined as a dynamic array (array of byte). At initialization time the code figures out the size of the array and uses SetLength to allocate it. It is convenient both as the buffer to obtain the data and as the runtime container for later processing.
I want to move this variable as one of the object attributes.
But I am not sure if it is ok to maintain its type. Is it considered good practice?
The alternative I am considering is to transform it into a dynamic container like a TList. I would keep the very same code for obtaining the data, with a local dynamic array, but move it to the container for the rest of its lifespan. Is it worth the effort? I know that elegance always pays off in the end, but I don't really see the value of the effort at this moment. Any thoughts?
Dynamic arrays are great, but really only for fixed dimensions. If they have to grow, especially in single-record increments, this can cause eventual errors from the memory manager (and possible performance issues), since the array has to be reallocated and copied to the new, bigger destination. TList does at least have a 'growing' mechanism that is called less frequently.
I know that elegance always pays off in the end,
Is that so? Note that changing working code always carries the risk of breaking something. In my opinion, it must be decided in every situation whether the gained elegance is worth the risk.
In your case, if you add and remove items during runtime, I would use a TList since it is much easier for these operations. If you just initialize the length once and the array is constant after initialization, you can just keep the dynamic array. There's definitely no "good practice" saying that you shouldn't use dynamic arrays.
