Is it good practice to use a Dynamic Array in an object field? - delphi

I am refactoring some existing Delphi code into a class.
The current code uses a global variable defined as a dynamic array (array of byte). At initialization time the code works out the size of the array and uses SetLength to allocate it. It is convenient both as the buffer that receives the data and as the runtime container for later processing.
I want to move this variable into the class as one of the object's fields.
But I am not sure whether it is OK to keep its type. Is it considered good practice?
The alternative I am considering is to transform it into a dynamic container like a TList. I would keep the very same code for obtaining the data, with a local dynamic array, but move it to the container for the rest of its lifespan. Is it worth the effort? I know that elegance always pays off in the end, but I don't really see the value of the effort at this moment. Any thoughts?

Dynamic arrays are great, but really only for fixed dimensions. If they have to grow, especially in single-record increments, this can eventually cause errors from the memory manager (and possible performance issues), since the array has to be reallocated and copied to a new, bigger destination. TList at least has a 'growing' mechanism that reallocates less frequently.
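A minimal sketch of the growth issue described above, assuming a hypothetical TByteBuffer record and AppendByte helper (neither is from the original post). It keeps a separate Count and doubles the allocated length only when the array is full, so SetLength is called a handful of times instead of once per element:

```pascal
program GrowDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TByteDynArray = array of Byte;

  // Hypothetical helper type: capacity is Length(Data), Count is bytes in use
  TByteBuffer = record
    Data: TByteDynArray;
    Count: Integer;
  end;

// Appends one byte, doubling the capacity only when the array is full
procedure AppendByte(var Buf: TByteBuffer; Value: Byte);
begin
  if Buf.Count = Length(Buf.Data) then
  begin
    if Length(Buf.Data) = 0 then
      SetLength(Buf.Data, 16)
    else
      SetLength(Buf.Data, Length(Buf.Data) * 2);
  end;
  Buf.Data[Buf.Count] := Value;
  Inc(Buf.Count);
end;

var
  Buf: TByteBuffer;
  I: Integer;
begin
  Buf.Count := 0;
  for I := 0 to 999 do
    AppendByte(Buf, Byte(I and $FF));
  // 1000 appends trigger only 7 reallocations (16, 32, ... 1024)
  WriteLn('Count = ', Buf.Count, ', capacity = ', Length(Buf.Data));
end.
```

This is roughly the kind of amortized growth a list container does for you; if the final size is known up front, a single SetLength is simpler still.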

"I know that elegance always pay off at the end"
Is that so? Note that changing working code always carries the risk of breaking something. IMHO it must be decided case by case whether the gained elegance is worth the risk.
In your case, if you add and remove items at runtime, I would use a TList, since it makes these operations much easier. If you just set the length once and the array is constant after initialization, you can simply keep the dynamic array. There is definitely no "good practice" saying that you shouldn't use dynamic arrays.
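As an illustration of the "size it once and keep it" case, here is a small sketch of a dynamic array held as a private field. The class name TDataBlock and its methods are hypothetical, chosen only for this example:

```pascal
program BufferFieldDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TByteDynArray = array of Byte;

  // Hypothetical class that owns the former global buffer
  TDataBlock = class
  private
    FData: TByteDynArray;
  public
    constructor Create(ASize: Integer);
    procedure SetByte(Index: Integer; Value: Byte);
    function Size: Integer;
    function Checksum: Integer;
  end;

constructor TDataBlock.Create(ASize: Integer);
begin
  inherited Create;
  SetLength(FData, ASize); // allocated once; new elements are zero-filled
end;

procedure TDataBlock.SetByte(Index: Integer; Value: Byte);
begin
  FData[Index] := Value;
end;

function TDataBlock.Size: Integer;
begin
  Result := Length(FData);
end;

function TDataBlock.Checksum: Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 0 to High(FData) do
    Result := Result + FData[I];
end;

var
  Block: TDataBlock;
begin
  Block := TDataBlock.Create(4);
  try
    Block.SetByte(0, 10);
    Block.SetByte(3, 32);
    WriteLn(Block.Size, ' bytes, checksum ', Block.Checksum); // 4 bytes, checksum 42
  finally
    Block.Free; // the dynamic array field is released automatically
  end;
end.
```

Because dynamic arrays are compiler-managed, the field needs no explicit cleanup in a destructor.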


Delphi object without Create? [duplicate]

In Delphi sane people use a class to define objects.
In Turbo Pascal for Windows we used object and today you can still use object to create an object.
The difference is that an object lives on the stack and a class lives on the heap.
And of course the object is deprecated.
Putting all that aside:
Is there a speed benefit to be had by using object instead of class?
I know that object is broken in Delphi 2009, but I've got a special use case 1) where speed matters, and I'm trying to find out whether using object will make my code faster without making it buggy.
This code base is in Delphi 7, but I may port it to Delphi 2007, haven't decided yet.
1) Conway's game of life
Long comment
Thanks all for pointing me in the right direction.
Let me explain a bit more. I'm trying to do a faster implementation of hashlife, see also here or here for simple source code.
The current record holder is Golly, but Golly uses a straight translation of Bill Gosper's original Lisp code (which is brilliant as an algorithm, but not optimized at the micro level at all). Hashlife enables you to calculate a generation in O(log(n)) time.
It does this by using a space/time trade-off. For this reason hashlife needs a lot of memory; gigabytes are not unheard of. In return you can calculate generation 2^128 (340282366920938463463374607431768211456) using generation 2^127 (170141183460469231731687303715884105728) in O(1) time.
Because hashlife needs to compute hashes for all sub-patterns that occur in a larger pattern, allocation of objects needs to be fast.
Here's the solution I've settled upon:
Allocation optimization
I allocate one big block of physical memory (user settable), let's say 512 MB. Inside this block I allocate what I call cheese stacks. A cheese stack is a normal stack where I push and pop, but a pop can also come from the middle of the stack. If that happens, I record the hole on a free list (itself a normal stack). When pushing, I check the free list first; if nothing is free, I push as normal. I'll be using records as advised; it looks like the solution with the least overhead.
Because of the way hashlife works, very little popping takes place and a lot of pushing. I keep separate stacks for structures of different sizes, making sure to keep memory access aligned on 4/8/16 byte boundaries.
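The cheese stack idea could be sketched roughly as follows. This is only an illustration under stated assumptions: the TNode payload, the fixed PoolCapacity and all routine names are hypothetical, and a static array stands in for the large user-settable memory block described above.

```pascal
program CheeseStackDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  PoolCapacity = 1024; // assumption: small fixed size for the demo

type
  TNode = record       // hypothetical fixed-size payload
    Hash: Cardinal;
    Level: Integer;
  end;

  TNodePool = record
    Slots: array[0..PoolCapacity - 1] of TNode;
    Top: Integer;                                    // next never-used slot
    FreeList: array[0..PoolCapacity - 1] of Integer; // stack of holes
    FreeCount: Integer;
  end;

procedure InitPool(var Pool: TNodePool);
begin
  Pool.Top := 0;
  Pool.FreeCount := 0;
end;

// Stores Node and returns its slot index, or -1 if the pool is full
function PushNode(var Pool: TNodePool; const Node: TNode): Integer;
begin
  if Pool.FreeCount > 0 then
  begin
    Dec(Pool.FreeCount);                 // reuse a hole first
    Result := Pool.FreeList[Pool.FreeCount];
  end
  else if Pool.Top < PoolCapacity then
  begin
    Result := Pool.Top;                  // ordinary push
    Inc(Pool.Top);
  end
  else
  begin
    Result := -1;
    Exit;
  end;
  Pool.Slots[Result] := Node;
end;

// Releases a slot; the caller must not release the same slot twice
procedure ReleaseNode(var Pool: TNodePool; Index: Integer);
begin
  if Index = Pool.Top - 1 then
    Dec(Pool.Top)                        // ordinary pop from the top
  else
  begin
    Pool.FreeList[Pool.FreeCount] := Index; // pop from the middle: remember the hole
    Inc(Pool.FreeCount);
  end;
end;

var
  Pool: TNodePool;
  N: TNode;
  A, B, C, D: Integer;
begin
  InitPool(Pool);
  N.Hash := 1;
  N.Level := 0;
  A := PushNode(Pool, N);
  B := PushNode(Pool, N);
  C := PushNode(Pool, N);
  ReleaseNode(Pool, B);   // leaves a hole in the middle
  D := PushNode(Pool, N); // fills the hole instead of growing the stack
  WriteLn(A, ' ', B, ' ', C, ' ', D); // 0 1 2 1
end.
```

Using slot indices rather than pointers keeps the sketch simple; a real implementation would presumably keep one such pool per structure size, as the post describes.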
Other optimizations
recursion removal
cache optimization
use of inline
precalculation of hashes (akin to rainbow tables)
detection of pathological cases and use of fall-back algorithm
use of GPU
For using normal OOP programming, you should always use the class kind. You'll have the most powerful object model in Delphi, including interface and generics (in later Delphi versions).
1. Records, pointers and objects
Records can be evil (slow hidden copy if you forgot to declare a parameter as const, record hidden slow cleanup code, a fillchar would make any string in record become a memory leak...), but they are sometimes very convenient to access a binary structure (e.g. some "smallish value"), via a pointer.
A dynamic array of tiny records (e.g. with one integer and one double field) will be much faster than a TList of small classes; with our TDynArray wrapper, you will have high-level access to the records, with serialization, sorting, hashing and such.
If using pointers, you must know what you are doing. It's definitely preferable to stick with classes, and TPersistent if you want to use the magical "VCL component ownership model".
Inheritance is not allowed for records. You'll need either to use a "variant record" (using the case keyword in its type definition), or to use nested records. When using a C-like API, you'll sometimes have to use object-oriented structures. Using nested records or variant records is IMHO much less clear than the good old "object" inheritance model.
2. When to use object
But there are some places where objects are a good way of accessing already existing data.
Even the object model is better than the new record model, because it handles simple inheritance.
In a Blog entry last summer, I posted some possibilities to still use objects:
A memory mapped file, which I want to parse very quickly: a pointer to such an object is just great, and you still have methods at hand; I use this for TFileHeader or TFileInfo which map the .zip header, in SynZip.pas;
A Win32 structure, as defined by an API call, in which I put handy methods for easy access to the data (for that you may use record, but if there is some object orientation in the struct, which is very common, you'll have to nest records, which is not very handy);
A temporary structure defined on the stack, just used during a procedure: I use this for TZStream in SynZip.pas, or for our RTTI-related classes, which map the Delphi-generated RTTI in an object-oriented way, unlike TypeInfo, which is function/procedure oriented. By mapping the RTTI memory content directly, our code is faster than using the new RTTI classes created on the heap. We don't allocate any memory, which, for an ORM framework like ours, is good for its speed. We need a lot of RTTI info, but we need it quick, we need it directly.
3. How object implementation is broken in modern Delphi
The fact that object is broken in modern Delphi is a shame, IMHO.
Normally, if you define a record on the stack, containing some reference-counted variables (like a string), it will be initialized by some compiler magic code, at the begin level of the method/function:
type TObj = object Int: integer; Str: string; end;
procedure Test;
var O: TObj;
begin // here, an _InitializeRecord(@O,TypeInfo(TObj)) call is made
O.Str := 'test';
(...)
end; // here, a _FinalizeRecord(@O,TypeInfo(TObj)) call is made
Those _InitializeRecord and _FinalizeRecord will "prepare" then "release" the O.Str variable.
With Delphi 2010, I found out that this _InitializeRecord() call was sometimes not made.
If the structure has only non-public fields, the hidden calls are sometimes not generated by the compiler.
Just build the source again, and there will be...
The only solution I found out was using the record keyword instead of object.
So here is what the resulting code looks like:
/// used to store and retrieve Words in a sorted array
// - is defined either as an object or as a record, due to a bug
// in Delphi 2010 compiler (at least): this structure is not initialized
// if defined as an object on the stack, but will be as a record
TSortedWordArray = {$ifdef UNICODE}record{$else}object{$endif}
public
Values: TWordDynArray;
Count: integer;
/// add a value into the sorted array
// - return the index of the new inserted value into the Values[] array
// - return -(foundindex+1) if this value is already in the Values[] array
function Add(aValue: Word): PtrInt;
/// return the index of the supplied value in the Values[] array
// - return -1 if not found
function IndexOf(aValue: Word): PtrInt; {$ifdef HASINLINE}inline;{$endif}
end;
The {$ifdef UNICODE}record{$else}object{$endif} is awful... but the code generation error has not occurred since.
The resulting modifications in the source code are not huge, but a bit disappointing. I found out that older versions of the IDE (e.g. Delphi 6/7) are not able to parse such a declaration, so the class hierarchy will be broken in the editor... :(
Backward compatibility should include regression tests. A lot of Delphi users stay with this product because of their existing code. Breaking features are very problematic for the future of Delphi, IMHO: if you have to rewrite a lot of code, why shouldn't you just switch the project to C# or Java?
Object was not the Delphi 1 method of setting up objects; it was the short-lived Turbo Pascal method of setting up objects, which was replaced by the Delphi TObject model in Delphi 1. It was kept around for backwards compatibility, but it should be avoided for a few reasons:
As you noted, it's broken in more recent versions. And AFAIK there are no plans to fix it.
It's a conceptually wrong object model. The entire point of Object Oriented Programming, the one thing that really distinguishes it from procedural programming, is Liskov substitution (inheritance and polymorphism), and inheritance and value types don't mix.
You lose support for a lot of features that require TObject descendants.
If you really need value types that don't need to be dynamically allocated and initialized, you can use records instead. You can't inherit from them, but you can't do that very well with object either so you're not losing anything here.
As for the rest of the question, there aren't all that many speed benefits. The TObject model is plenty fast, especially if you're using the FastMM memory manager to speed up creation and destruction of objects, and if your objects contain lots of fields they can even be faster than records in a lot of cases, because they're passed by reference and don't have to be copied around for each function call.
When given a choice between "fast and possibly broken" and "fast and correct," always choose the latter.
Old-style objects offer no speed incentive over plain old records, so wherever you might be tempted to use old-style objects, you can use records instead without the risk of having uninitialized compiler-managed types or broken virtual methods. If your version of Delphi doesn't support records with methods, then just use standalone procedures instead.
Way back, in older versions of Delphi that did not support records with methods, using object was the way to get your objects allocated on the stack. Very occasionally that would yield worthwhile performance benefits. Nowadays record is better. The only feature missing from record is the ability to inherit from another record.
You give up a lot when you change from class to record so only consider it if the performance benefits are overwhelming.

Can I somehow preserve values of local macros in Stata after the completion of the do-file?

Whenever I add new lines to the code (e.g. when computing a different estimate) I do not want to rerun the whole do-file again. However, I often need the values of certain local macros that were generated during the previous run of the do-file.
Is there a way to keep those values? Or should I switch to using more globals instead?
Yes, use global.
But note that you need to be careful with global for the exact reason you are using it: the macro remains in memory until you exit that instance of Stata, or until you reset it within the code.
Some people have very strong feelings about never using global (see p. 5 onward here: http://faculty.chicagobooth.edu/matthew.gentzkow/research/ra_manual_coding.pdf). Once you learn the properties of globals, and how to avoid the small number of problems they can cause, you should be fine.
Globals are by no means the only alternative.
First, consider using scalars. A scalar with a permanent name will survive beyond the end of a do-file.
Second, consider converting your do-file to a program and learning about saved results.
Third, you can always consider putting results in a new variable; it's just that it is usually bad style and wasteful on storage.
At a guess, the first is likely to be the most useful for you. Many Stata users are happy to use do-files with many dataset-specific statements. Jumping to writing fully-fledged and more general programs is a big jump and not (at first) trivial.

is table.remove() the same as p[#p] = nil and which is quicker?

As the title says. If I have a table p in lua, is using
table.remove(p)
the same as
p[#p] = nil
if so which is quicker - I'd guess the second, but would like some reassurance.
By the 'same as' I mean does the internal array storage shrink using assignment to nil? I can't seem to find this documented anywhere. Does setting the last element in an array to nil, or the last 10 elements in an array to nil mean the array will be shrunk, or does it always keep the storage and never shrink the array again?
I've assumed the array is contiguous, i.e. it has values stored in each array entry up to #p.
Setting the last element to nil will not be a function call. So in that way, it will certainly be faster than table.remove. How much that matters is up to you.
By the 'same as' I mean does the internal array storage shrink using assignment to nil? I can't seem to find this documented anywhere.
It isn't documented; this allows the implementation to change. All that Lua promises is that setting it to nil will decrease the size returned by subsequent calls to #p. Anything more than that is up to Lua, and is certainly subject to change without warning. It's nothing you should rely on.
However, I would respectfully suggest that if you're thinking about these kinds of micro-optimizations, you probably shouldn't be using a scripting language. A scripting language should be used for code where performance is not important enough to spend a great deal on.
p[#p] = nil will be faster, and has the same effect in the case where table.remove removes the last position.
As an extra note, table.remove(func_call()) may do unexpected things if the function call returns multiple values.
Going from the implementation of the Lua 5.1 VM as described in Lua Performance Tips by Roberto Ierusalimschy (chief architect of Lua), the table's allocated storage doesn't change until the next time the table is rehashed - and, as stated time and time again, you really shouldn't be thinking about this unless you have hard profiling data showing it's a significant problem.
As for the difference between table.remove(t) and t[#t] = nil, see my answer to What's the difference between `table.insert(t, i)` and `t[#t+1] = i`.
I think it is better to be consistent when adding and removing elements from array-type tables. table.insert and table.remove make your code consistent and easier to read.
Re: tables Lua doesn't resize the table until you have added the first new element to it after having previously removed elements.
table.remove(p) returns the value that was removed. p[#p] = nil does not return anything.

What datatype/structure to store file list info?

I have an application that searches for files on the computer (configurable path, type etc.). Currently it adds information to a database as soon as a matching file is found. Instead, I want to hold the information in memory for further manipulation before inserting it into the database. The list may contain a lot of items. I consider performance an important factor. I may need to iterate through the items, so a structure that can be coded easily is another key issue. And how can I achieve PHP-style associative arrays for this job?
If you're using Delphi 2009, you can use a TDictionary. It takes two generic parameters. The first should be a string, for the filename, and the second would be whatever data type you're associating with. It also has three built-in enumerators, one for key-value pairs, one for keys only and one for values only, which makes iterating easy.
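A brief sketch of the TDictionary approach, assuming Delphi 2009 or later (or Free Pascal 3.2+ in Delphi mode). The TFileInfo record and the file paths are made up for illustration; in newer Delphi versions the unit is usually referenced as System.Generics.Collections.

```pascal
program FileDictDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Generics.Collections;

type
  // Hypothetical value type associated with each file name
  TFileInfo = record
    Size: Int64;
    IsHidden: Boolean;
  end;

var
  Files: TDictionary<string, TFileInfo>;
  Info: TFileInfo;
  Pair: TPair<string, TFileInfo>;
  FileName: string;
begin
  Files := TDictionary<string, TFileInfo>.Create;
  try
    Info.Size := 1024;
    Info.IsHidden := False;
    Files.Add('C:\data\a.txt', Info);            // raises if the key already exists
    Info.Size := 2048;
    Files.AddOrSetValue('C:\data\b.txt', Info);  // inserts or overwrites

    // Associative lookup by file name (keys are case-sensitive by default)
    if Files.TryGetValue('C:\data\a.txt', Info) then
      WriteLn('a.txt size: ', Info.Size);

    // Enumerate key/value pairs, then keys only; order is not guaranteed
    for Pair in Files do
      WriteLn(Pair.Key, ' -> ', Pair.Value.Size);
    for FileName in Files.Keys do
      WriteLn(FileName);
  finally
    Files.Free;
  end;
end.
```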
Another solution would be to use just a standard TStringList.
As long as it's sorted and has some duplicate setting other than dupAccept, you can use IndexOf or IndexOfName to find items in the list quickly.
It also has the Objects property, which allows you to store object information attached to the name. Starting with D2009, TStringList has the OwnsObjects property, which allows you to delegate object cleanup to the TStringList. Prior to D2009 you have to handle that yourself.
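The TStringList variant might look like the sketch below, which should also work before Delphi 2009 because it frees the attached objects by hand instead of relying on OwnsObjects. TFileEntry is a hypothetical class introduced only for this example.

```pascal
program FileListDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes;

type
  // Hypothetical per-file data attached to each name
  TFileEntry = class
  public
    Size: Int64;
    constructor Create(ASize: Int64);
  end;

constructor TFileEntry.Create(ASize: Int64);
begin
  inherited Create;
  Size := ASize;
end;

var
  Files: TStringList;
  Idx, I: Integer;
begin
  Files := TStringList.Create;
  try
    Files.Sorted := True;           // IndexOf uses a binary search on sorted lists
    Files.Duplicates := dupIgnore;  // note: an ignored duplicate's object would not be stored
    Files.AddObject('b.txt', TFileEntry.Create(2048));
    Files.AddObject('a.txt', TFileEntry.Create(1024));

    Idx := Files.IndexOf('a.txt');
    if Idx >= 0 then
      WriteLn('a.txt size: ', TFileEntry(Files.Objects[Idx]).Size); // a.txt size: 1024
  finally
    // Manual cleanup of the attached objects (no OwnsObjects before D2009)
    for I := 0 to Files.Count - 1 do
      Files.Objects[I].Free;
    Files.Free;
  end;
end.
```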
Much of this will depend on how you are going to use the list and at what scale. If you are going to use it as a stack or queue, then a TList would work fine. If you need to search through the list for a specific item, then you will need something that allows faster retrieval. TDictionary (2009) or TStringList (pre 2009) would be the most likely choice.
Dynamic arrays are also a possibility, but if you use them you will want to minimize the use of SetLength, as each call may re-allocate memory. TList manages this for you, which is why I suggested using a TList. If you know in advance how many items you will deal with, then use a dynamic array and set its length at the outset.
If you have more items than will fit in memory, then your choices also change. At that point I would use either a database table or a TFileStream to store the records to be processed, then seek to the beginning of the table/stream for processing.
Try using the AVL-Tree by http://sourceforge.net/projects/alcinoe/ as your associative Array. It has an iterate-method for fast iteration. You may need to derive from his baseclass and implement your own comparator, but it's easy to use.
Examples are included.

With Delphi are you more likely to re-use temporary variables than with other languages?

Since Delphi makes you go all the way up to the var section of a method to declare a local variable, do you find yourself breaking "Curly's Law" (re-using variables) more often than you did in college?(unless of course, you programmed Pascal in college).
If so, what do you do to break yourself of that habit, especially in functions where you need to get and/or set large numbers of properties. Is there a threshold where it is acceptable to declare TempInt : Integer and TempStr : String. (Do you use an 'e' in Temp sometimes and not other times?)
I hardly ever reuse variables. I hate to say never, but it is close to never.
Here is why:
Small methods (It's good practice to keep methods and property-getters/setters as concise as possible).
When only one thing is done, no need to reuse variables
The var section is always on the screen.
The compiler reuses the storage as necessary, so reuse is only a lazy coder crutch with no performance improvements.
Newer versions of Delphi have CTRL+SHIFT+V to declare a variable if I am feeling lazy.
Reusing variables makes debugging more difficult - more time and effort is spent on maintenance than development (for any serious application) so always do things to make maintenance easier, even if it makes development a little harder.
Prefer user defined types, so an Account Balance is a specific type, not just a Currency. This means variables are less reusable anyway.
For loop variables (a common reused variable) are used less now that we can use for in and skip the iterator altogether.
My variables have descriptive names, so it would not make sense to use them out of context.
Generally speaking, I like having all the variables at the top for the same reason I like having an interface section on my units. It is kind of like having an abstract on a paper - give me a general idea of what is going on without having to read the whole paper. Delphi could benefit from having the ability to declare variables at "inner scope" like within a for loop or other begin / end blocks, but I don't know how much that would distract from the cleanliness and readability of Delphi code.
This is just a matter of discipline. Yes, Delphi would probably be better served by inline variable declaration, but that's not really a big deal. Just be sure to name your variables in a descriptive way, and then it will just feel awkward to use them incorrectly. And, as Stephan Eggermont said, if your methods are really getting that long, then that's a whole different code smell.
Not really. As I make my methods really small the var section is not far away. As my method size has reduced a lot since university, I'd say I break it less often.
I definitely do tend to re-use local variables like 'Findex' (or just plain 'i') if the routine has several distinct iterative sections to it. Not really the best practice I guess, but I'd like to think it's only really obvious where I do it, and obviously the usage doesn't overlap.
It's not usually a big deal to go back to the top of the routine and key-in the new variables, though I didn't know about Ctrl-Shift-V (will be trying that later!).
It'll be interesting to see what everyone else says. :-)
I don't tend to reuse local vars as a general safety rule. I do love the new "var" live template stuff in d2007+. Just type var[tab] and the helper pops up. Also check out Ctrl-Shift-D (others mentioned Ctrl-Shift-V for local vars) to declare a field.
Declaring variables is very simple - some times they would get automatically created ('for' loop template), other times you can just use 'Declare Variable' refactoring (or 'Add Local Var' if you are using MMX - as you should).
You can develop your own style of coding that uses variables as required. I generally use unique vars (90%) with a few temp vars (10%) when required.
It depends on the nature of the var. If it is a var to help support other code (counter for loops, building SQL strings, etc.) then a temp var you can re-use is helpful. In this case your temp vars are useful as "disposable" vars in sections of code. Just add a comment to your var declarations indicating the temp vars.
i.e. //temp vars are re-used as required in this procedure --> clear/re-initialise them after/before use.
Other than that I avoid temp vars & never use them to hold critical data. A unique var should be used then to avoid confusion & make readability/maintenance of code clearer.
I think Delphi makes the exception with the overuse of temp variables. Most of the time when I'm creating a function/procedure where I know I will need loops or temp strings, the first thing I do is to create a var i,j:integer; tmp:string; and add more as needed :)
As a long time Delphi user (since 1.0) this is the major thing I hate about Pascal. All other modern languages support definition at the point of use, yet Delphi persists with the var section, and Delphi programmers persist in ridiculous hand-waving antics to justify it.
Well Curly did have a good point. I'm a sinner in that respect occasionally. Usually just a temp string variable for convenience more than anything.
To be honest I've never really thought about it... until now. I have no issue with the VAR section being where it is as that's been a habit formed since Delphi 1.0.
To answer the question, I only re-use a temp variable, usually a string, and usually only to gain some slight performance improvements. Don't have an issue with that.
I probably would have found this to be a bigger problem if I hadn't had CTRL-SHIFT-V as a shortcut to the VAR section. I'm not writing GIGANTIC methods here, but sometimes they get a little out of hand (and I can justify this of course) and it helps a lot. I'm not sure if that shortcut comes from cnTools or GExperts, but they're both pretty useful and I'd recommend them both.
