I'm writing a small tool to do some manipulation of SWF files, using Delphi XE2. So far, I'm simply following the SWF specification, and now I've hit a small problem in implementing a data structure to represent shapes.
SWF shapes contain a number of shape records. Shape records may be edge records or non-edge records, and each of those two types have two additional subtypes.
Specifically, on page 135 of the specification, the two non-edge record types are described: EndShapeRecord and StyleChangeRecord. In the SWF file, the way to differentiate between these is to check whether all five flag bits (after TypeFlag) are 0; if they are, it's an EndShapeRecord, otherwise it's a StyleChangeRecord.
To help me process the shape records later on, I would like to unify edge and non-edge records into a single record type, using a variant record. Distinguishing between the different kinds of records is easy enough; a nested variant record allows me to easily tell the edge records apart, and for the non-edge records, I can declare the 5 flags from the StyleChangeRecord and write a function IsEndRecord.
However, in the interest of making my source code reflect the specification as closely as possible, I'd like to go one step further. The presence of the other fields in a StyleChangeRecord is predicated on the values of these 5 flags, so I would like to declare 5 variant parts, one per flag, each containing the fields added by that flag. (I realize this will not affect the memory usage in any way, but that's not the point.)
Unfortunately, Delphi doesn't seem to allow more than one variant part per "level", and attempting to define these 5 variant parts at the same level just throws a ton of syntax errors.
TShapeRecord = record
  case EdgeRecord: Boolean of
    False: (
      case StateMoveTo: Boolean of
        True: (
          MoveBits: Byte;
          MoveDeltaX: Int32;
          MoveDeltaY: Int32;
        );
      case StateLineStyle: Boolean of // << Errors start here
        True: (LineStyle: UInt16);
      //Additional flags
    );
  //Fields for edge records
end;
In slightly simpler terms, the goal is to be able to formulate a record like so:
TNonEdgeRecord = record
  case StateMoveTo: Boolean of
    True: (
      MoveBits: Byte;
      MoveDeltaX: Int32;
      MoveDeltaY: Int32;
    );
  case StateLineStyle: Boolean of
    True: (LineStyle: UInt16);
end;
...without removing the variant parts of the record, and without nesting them (since nesting would imply an incorrect relation from a syntactical point of view).
Is there some other way I can declare multiple (non-nested) variant parts in a record, or should I just go back to not using variant records for the inner part?
No. The Borland branch of Pascal only allows a single variant part per field list, and it must come at the end of the record.
Nesting is the only way.
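For completeness, here is a sketch of the nearest thing the compiler will accept for the record in the question: the second variant part is nested as the last item of the first one's branch. The nesting implies a subordination the spec doesn't have, but it is the only form Delphi's grammar allows:

```delphi
type
  TNonEdgeRecord = record
    case StateMoveTo: Boolean of
      True: (
        MoveBits: Byte;
        MoveDeltaX: Int32;
        MoveDeltaY: Int32;
        // The second variant part must be nested here, as the
        // last item of this branch's field list - it cannot be
        // a sibling of the StateMoveTo variant part.
        case StateLineStyle: Boolean of
          True: (LineStyle: UInt16)
      );
  end;
```

Each additional flag-dependent group would have to be nested one level deeper, inside the previous variant part's branch.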
For some interesting examples and observations, see this article by Rudy Velthuis:
http://rvelthuis.de/articles/articles-convert.html (search for the "union" part)
While reviewing some code in our legacy Delphi 7 program, I noticed that wherever there is a record it is marked with packed. This of course means that the record is stored byte-for-byte, not aligned for faster CPU access. The packing seems to have been done blindly, as an attempt to outsmart the compiler or something -- basically valuing a few bytes of memory over faster access.
An example record:
TFooTypeRec = packed record
  RID : Integer;
  Description : String;
  CalcInTotalIncome : Boolean;
  RequireAddress : Boolean;
end;
Should I fix this and make every record normal or "not" packed? Or with modern CPUs and memory is this negligible and probably a waste of time? Are there any problems that can result from unpacking?
There is no way to answer this question without a full understanding of how each of those packed records is used in your application code. It is the same as asking "Should I change this variable declaration from Int64 to Byte?"
Without knowing what values that variable will be expected and required to hold, the answer could be yes. Or it could be no.
Similarly in your case. If a record needs to be packed then it should be left packed. If it does not need to be packed then there is no harm in not packing it. If you are not sure or cannot tell, then the safest course is to leave them as they are.
As a guide to making this determination (should you decide to proceed), situations where record packing is required or recommended include:
persistence of record values
sharing of record values with [potentially] differently compiled code
strict compatibility with externally defined structures
deliberately overlaying a type layout over differently structured memory
This isn't necessarily an exhaustive list, and what these all have in common is:
records comprising a series of values in adjacent bytes that must and can be relied upon by any potential producer or consumer of the record without possibility of interference from the compiler or other factors
What I would recommend is that (if possible and practical) you determine what purpose packing serves in each case and add documentation to that effect to the record declaration itself so that anyone in the future with the same question doesn't have to go through that discovery process, e.g.:
type
  TSomeRecordType = packed record
    // This record must be packed as it is used for persistence
    ..
  end;

  TSomeExternType = packed record
    // This record must be packed as it is required to be compatible
    // in memory with an externally defined struct (ref: extern code docs)
    ..
  end;
The main idea of using packed records is not that you save a few bytes of memory! Instead, it is about guaranteeing that the variables are where you expect them to be in memory. Without such a guarantee, it would be impossible (or, at least, difficult) to manage memory manually on the heap and write to and read from files.
Hence, the program might malfunction if you 'unpack' the records!
If the record is stored/retrieved as packed, or transferred in any way to a receiver that expects it to be packed, then do not change it.
Update:
There is a String field declared in your example. It looks suspicious: a long String is stored as a reference, so writing the record to a binary file will store the pointer value, not the string content.
A packed record's size is exactly the sum of its members' sizes.
An unpacked record is optimized for performance: its members are aligned, so its size is consequently larger.
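To see the difference concretely, compare SizeOf for packed and unpacked versions of the same fields (the unpacked size depends on the compiler's alignment settings, so 8 is only the typical value):

```delphi
program PackDemo;

{$APPTYPE CONSOLE}

type
  TPlain = record
    B: Byte;
    I: Integer;
  end;

  TPacked = packed record
    B: Byte;
    I: Integer;
  end;

begin
  // TPacked is exactly 5 bytes (1 + 4). TPlain is typically 8 with
  // default alignment, because I is padded to a 4-byte boundary.
  Writeln('plain:  ', SizeOf(TPlain));
  Writeln('packed: ', SizeOf(TPacked));
end.
```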
An error occurred while playing the last record in the table - "At beginning of table". How do I fix it?
procedure TForm1.btnNextClick(Sender: TObject);
begin
  self.ListBox1.ItemIndex := Random(ListBox1.Items.Count) - 0;
  AddALL();
  begin
    ClientDataSet1.RecNo := Random(ClientDataSet1.RecordCount) - 0;
    PlayFile(self.exePath + '\' + self.ClientDataSet1.FieldByName('mp3').AsString, MediaPlayer1, Image2);
  end
end;
Val Marinov seems to have given you a good answer to your question.
I just want to add some points that don't directly answer your question but may help you avoid making some mistakes.
You have some code
self.ListBox1.ItemIndex := Random(ListBox1.Items.Count)
which you want to use to set the listbox's ItemIndex to a random, valid value. There are a couple of things which are asking for trouble about this:
1. Off-by-one when using Random with RecNo
The online help for the Random function says
In Delphi code, Random returns a random number within the range 0 <= X < Range. If Range is not specified, the result is a real-type random number within the range
0 <= X < 1.
With an integer Range argument, Random returns an Integer, so Random(ListBox1.Items.Count) already yields a valid ItemIndex in 0..Items.Count - 1 (and the "- 0" is redundant). RecNo, however, is 1-based, so Random(ClientDataSet1.RecordCount) can return 0, which raises exactly the "At beginning of table" exception you are seeing. Shift the result up by one:
ClientDataSet1.RecNo := Random(ClientDataSet1.RecordCount) + 1;
2. Unnecessary use of self.
Your code is liberally sprinkled with the self qualifier. Having to use self like that is usually a sign of bad or sloppy coding.
In your TForm1.AddALL, the self in the first line tells the compiler that the instance of ListBox1 you are referring to is the one which is the TListBox component on your TForm1, rather than some other ListBox1 variable which may also be in scope (e.g. a global variable called ListBox1) when the line is compiled. But the way to avoid that problem is to avoid having the other ListBox1 in scope in the first place.
I suggest you simply delete all the instances of self., because you shouldn't need to have them.
3. Avoid setting dataset RecordNumber
Finally, don't get into the habit of relying on the fact that TClientDataSet allows you to assign a value to RecNo: it is rarely a good idea, and few dataset types support it.
If you want to go to a random record, better use
DataSet.First;
DataSet.MoveBy(Random(X));
I leave it to you to work out what the argument X to Random should be, to move to a valid, random, record, based on what the online help says about Random.
Record Numbers
Client datasets support a second way of moving directly to a given
record in the dataset: setting the RecNo property of the dataset.
RecNo is a one-based number indicating the sequential number of the
current record relative to the beginning of the dataset.
You can read the RecNo property to determine the current absolute
record number, and write the RecNo property to set the current record.
There are two important things to keep in mind with respect to RecNo:
Attempting to set RecNo to a number less than one, or to a number greater than the number of records in the dataset results in an At
beginning of table, or an At end of table exception, respectively.
The record number of any given record is not guaranteed to be constant. For instance, changing the active index on a dataset alters
the record number of all records in the dataset.
NOTE
You can determine the number of records in the dataset by inspecting
the dataset's RecordCount property. When setting RecNo, never attempt
to set it to a number higher than RecordCount.
See : http://docs.embarcadero.com/products/rad_studio/delphiAndcpp2009/HelpUpdate2/EN/html/delphivclwin32/DB_TDataSet_RecNo.html
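Putting the documentation above together with the code in the question, the fix is to shift the zero-based Random result into RecNo's one-based range. A minimal corrected handler (control and field names taken from the question; AddALL and PlayFile are the asker's own routines) might look like:

```delphi
procedure TForm1.btnNextClick(Sender: TObject);
begin
  // ItemIndex is 0-based, so Random(Count) is already valid here
  ListBox1.ItemIndex := Random(ListBox1.Items.Count);
  AddALL();
  // RecNo is 1-based: shift the 0-based Random result up by one
  // to land in 1..RecordCount and avoid "At beginning of table"
  ClientDataSet1.RecNo := Random(ClientDataSet1.RecordCount) + 1;
  PlayFile(exePath + '\' + ClientDataSet1.FieldByName('mp3').AsString,
    MediaPlayer1, Image2);
end;
```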
What do you people use for generating unique account numbers?
Some use Autoinc field, others something else...
What would be the proper way, i.e. how do I get an account number before I run the insert query?
If you are using a SQL database, use a Generator. If you want to use an independent mechanism you could consider using a GUID.
You haven't told us what database system you are using, but from the sound of it, you're talking about the Paradox tables in Delphi. If so, an autoInc column can work, although if I recall correctly, you have to be very careful when moving your data around with Paradox autoInc columns because they re-generate from zero when moved.
As has been mentioned, you can use GUIDs - SysUtils declares function CreateGUID(out Guid: TGUID): HResult; - they will always be unique, but the downside of GUIDs is that ordering by these keys will not be intuitive and will probably be meaningless, so you'll need a timestamp column of some sort to maintain the order of your inserts, which can be important. Also, a GUID is a rather long character string and will not be very efficient for use as an account#, which presumably will be a primary or foreign key in many tables.
So I'd stick to autoInc if you want something automatic, but if you have to move data around and you need to maintain your original keys, load your original autoincs as integer columns in their new location or you could end up corrupting your entire database. (I believe there are other scenarios that also cause autoIncs to reset in Paradox tables - research this if it's relevant - been a long time since I've used Pdox, and it may not be a problem with other flat file databases)
If you are indeed using a database server - SQLServer, Oracle, Interbase, etc. - they all have autoInc/identity or generator functionality, sometimes in conjunction with a trigger; that is your best option.
Dorin's answer is also an excellent solution if you want to handle this yourself from within your Delphi code. Create a global, thread safe function to implement it - that will ensure a very high level of safety.
HTH
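If you go the GUID route mentioned above, a minimal sketch using the standard SysUtils routines might look like this (the error check and the function name are just illustrative):

```delphi
uses
  SysUtils;

function NewAccountNumber: string;
var
  Guid: TGUID;
begin
  // CreateGUID returns an HResult; 0 (S_OK) means success
  if CreateGUID(Guid) <> 0 then
    raise Exception.Create('CreateGUID failed');
  // e.g. '{D91E8A55-....}' - strip braces/dashes if you want a plain key
  Result := GUIDToString(Guid);
end;
```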
Depending on how long you want the number to be, you can go with Jamie's MD5 conversion or:
var
  LDateTime: TDateTime;
  LBytes: array[0..7] of Byte absolute LDateTime;
  LAccNo: string;
  Index: Integer;
begin
  LDateTime := Now;
  LAccNo := EmptyStr;
  for Index := 0 to 7 do
    LAccNo := LAccNo + IntToHex(LBytes[Index], 2);
  // now you have a code in LAccNo, use it wisely (:
end;
I use this PHP snippet to generate a decent account number:
$account_number = str_replace(array("0","O"),"D",strtoupper(substr(md5(time()),0,7)));
This will create a 7-character string that doesn't contain 0's or O's (to avoid errors on the phone or when transcribing them in e-mails, etc.). You get something like EDB6DA6 or 76337D5 or DB2E624.
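For reference, a rough Delphi equivalent of that PHP snippet might look like the following; it assumes a Delphi version that ships the System.Hash unit (XE8 or later), which the question does not specify:

```delphi
uses
  System.SysUtils, System.DateUtils, System.Hash;

function ShortAccountNumber: string;
begin
  // MD5 of the current Unix time, first 7 hex chars, uppercased,
  // with 0 and O replaced by D as in the PHP snippet
  Result := UpperCase(Copy(
    THashMD5.GetHashString(IntToStr(DateTimeToUnix(Now))), 1, 7));
  Result := StringReplace(Result, '0', 'D', [rfReplaceAll]);
  Result := StringReplace(Result, 'O', 'D', [rfReplaceAll]);
end;
```

Note that, like the PHP original, this is only collision-resistant per second of wall-clock time, so it is not safe for concurrent inserts without an additional uniqueness check.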
I have an application which may need to process billions of objects. Each object is of the TRange class type. These ranges are created at different parts of an algorithm, depending on certain conditions and other object properties. As a result, if you have 100 items, you can't directly create the 100th object without creating all the prior objects. If I create all the (billions of) objects and add them to a collection, the system throws an out-of-memory error. Now I want to iterate through each object, mainly for two purposes:
To apply an operation to each TRange object (e.g. output certain properties)
To get a cumulative sum of a certain property (e.g. each range has a weight property and I want to retrieve the total weight, i.e. the sum of all the range weights).
How do I effectively create an iterator for these objects without running out of memory?
I have handled the first case by passing a function pointer to the algorithm function. For example:
procedure createRanges(aProc: TRangeProc); // aProc is a pointer to a procedure that takes a TRange
var
  range: TRange;
  rangerec: TRangeRec;
begin
  range := TRange.Create;
  try
    while canCreateRange do begin // certain conditions needed to create a range
      rangerec := ReturnRangeRec;
      range.Update(rangerec); // don't create new, use the same object
      if Assigned(aProc) then aProc(range);
    end;
  finally
    range.Free;
  end;
end;
But the problem with this approach is that to add new functionality, say retrieving the total weight I mentioned earlier, I either have to duplicate the algorithm function or pass an optional out parameter. Please suggest some ideas.
Thank you all in advance
Pradeep
For such large amounts of data you need to keep only a portion of the data in memory; the rest should be serialized to the hard drive. I tackled such a problem like this:
I created an extended storage that can keep a custom record either in memory or on the hard drive. This storage has a maximum number of records that can live simultaneously in memory.
Then I derived the record classes from the custom record class. These classes know how to store and load themselves from the hard drive (I use streams).
Every time you need a new or already existing record, you ask the extended storage for it. If the maximum number of objects is exceeded, the storage streams some of the least recently used records back to the hard drive.
This way the records are transparent. You always access them as if they were in memory, but they may get loaded from the hard drive first. It works really well. By the way, RAM works in a very similar way: it only holds a certain subset of all the data on your hard drive. That is your working set.
I did not post any code because it is beyond the scope of the question itself and would only confuse.
Look at TgsStream64. This class can handle huge amounts of data through file mapping.
http://code.google.com/p/gedemin/source/browse/trunk/Gedemin/Common/gsMMFStream.pas
But the problem with this approach is that to add a new functionality, say to retrieve the Total weight I have mentioned earlier, either I have to duplicate the algorithm function or pass an optional out parameter.
It's usually done like this: you write an enumerator function (like you did) which receives a callback function pointer (you did that too) and an untyped pointer ("Data: pointer"). You define the callback function to have its first parameter be that same untyped pointer:
TRangeProc = procedure(Data: pointer; range: TRange);

procedure enumRanges(aProc: TRangeProc; Data: pointer);
begin
  {for each range}
  aProc(Data, range);
end;
Then if you want to, say, sum all ranges, you do it like this:
TSumRecord = record
  Sum: int64;
end;
PSumRecord = ^TSumRecord;

procedure SumProc(SumRecord: PSumRecord; range: TRange);
begin
  SumRecord.Sum := SumRecord.Sum + range.Value;
end;

function SumRanges(): int64;
var
  SumRec: TSumRecord;
begin
  SumRec.Sum := 0;
  enumRanges(TRangeProc(SumProc), @SumRec);
  Result := SumRec.Sum;
end;
Anyway, if you need to create billions of ANYTHING you're probably doing it wrong (unless you're a scientist, modelling something extremely large scale and detailed). Even more so if you need to create billions of stuff every time you want one of those. This is never good. Try to think of alternative solutions.
"Runner" has given a good answer on how to handle this!
But I would like to know whether you could apply a quick fix: make the TRange objects smaller.
Maybe you have a big ancestor class? Can you take a look at the instance size of a TRange object?
Maybe you would be better off using packed records?
This part:
As a result, if you have 100 items, you can't directly create the 100th object without creating all the prior objects.
sounds a bit like calculating Fibonacci numbers. Maybe you can reuse some of the TRange objects instead of creating redundant copies? Here is a C++ article describing this approach: it works by storing already calculated intermediate results in a hash map.
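A minimal sketch of that memoization idea in Delphi, using the standard Generics.Collections.TDictionary; ComputeRange stands in for whatever expensive derivation produces a range, and an Integer key is only an assumption about how ranges are identified:

```delphi
uses
  Generics.Collections;

var
  RangeCache: TDictionary<Integer, TRange>;

function GetRange(Index: Integer): TRange;
begin
  // Return the cached instance if this range was derived before,
  // instead of re-deriving it from all the prior ranges.
  if not RangeCache.TryGetValue(Index, Result) then
  begin
    Result := ComputeRange(Index); // hypothetical: your expensive derivation
    RangeCache.Add(Index, Result);
  end;
end;
```

With billions of ranges the cache itself would still need a size cap (e.g. evicting least recently used entries), so this only helps if a bounded working set of ranges is reused.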
Handling billions of objects is possible but you should avoid it as much as possible. Do this only if you absolutely have to...
I did create a system once that needed to handle a huge amount of data. To do so, I made my objects "streamable" so I could read/write them to disk. A larger class around it was used to decide when an object would be saved to disk and thus removed from memory. Basically, when I accessed an object, this class would check whether it was loaded or not. If not, it would re-create the object from disk, put it on top of a stack, and then move/write the bottom object of this stack to disk. As a result, my stack had a fixed (maximum) size. And it allowed me to use an unlimited number of objects, with reasonably good performance too.
Unfortunately, I don't have that code available anymore. I wrote it for a previous employer about 7 years ago. I do know that you would need to write a bit of code for the streaming support plus a bunch more for the stack controller which maintains all those objects. But it technically would allow you to create an unlimited number of objects, since you're trading RAM memory for disk space.
In a Delphi application we are working on we have a big structure of related objects. Some of the properties of these objects have values which are calculated at runtime and I am looking for a way to cache the results for the more intensive calculations. An approach which I use is saving the value in a private member the first time it is calculated. Here's a short example:
unit Unit1;

interface

type
  TMyObject = class
  private
    FObject1, FObject2: TMyOtherObject;
    FMyCalculatedValue: Integer;
    function GetMyCalculatedValue: Integer;
  public
    property MyCalculatedValue: Integer read GetMyCalculatedValue;
  end;

implementation

function TMyObject.GetMyCalculatedValue: Integer;
begin
  if FMyCalculatedValue = 0 then
  begin
    FMyCalculatedValue :=
      FObject1.OtherCalculatedValue + // This is also calculated
      FObject2.OtherValue;
  end;
  Result := FMyCalculatedValue;
end;

end.
It is not uncommon that the objects used for the calculation change and the cached value should be reset and recalculated. So far we addressed this issue by using the observer pattern: objects implement an OnChange event so that others can subscribe, get notified when they change and reset cached values. This approach works but has some downsides:
It takes a lot of memory to manage subscriptions.
It doesn't scale well when a cached value depends on lots of objects (a list for example).
The dependency is not very specific (even if a cache value depends only on one property it will be reset also when other properties change).
Managing subscriptions impacts the overall performance and is hard to maintain (objects are deleted, moved, ...).
It is not clear how to deal with calculations depending on other calculated values.
And finally the question: can you suggest other approaches for implementing cached calculated values?
If you want to avoid the Observer Pattern, you might try to use a hashing approach.
The idea would be that you 'hash' the arguments and check if the result matches the 'hash' for which the state was saved. If it does not, you recompute (and save the new hash as the key).
I know I make it sound like I just thought of it, but in fact it is used by well-known software.
For example, SCons (a Makefile alternative) does it to check whether a target needs to be rebuilt, in preference to a timestamp approach.
We have used SCons for over a year now, and we never detected any problem of target that was not rebuilt, so their hash works well!
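Applied to the getter from the question, a minimal sketch of this approach might look as follows; FLastKey is an assumed new private string field, and since the inputs are just two integers the 'hash' can simply be their values glued together:

```delphi
function TMyObject.GetMyCalculatedValue: Integer;
var
  Key: string;
begin
  // 'Hash' of the inputs; exact (collision-free) for small inputs.
  // For many or large inputs, a real hash function would be used instead.
  Key := IntToStr(FObject1.OtherCalculatedValue) + ':' +
         IntToStr(FObject2.OtherValue);
  if Key <> FLastKey then // FLastKey: assumed new private string field
  begin
    FMyCalculatedValue :=
      FObject1.OtherCalculatedValue + FObject2.OtherValue;
    FLastKey := Key;
  end;
  Result := FMyCalculatedValue;
end;
```

Note that reading FObject1.OtherCalculatedValue to build the key already forces that value to be (re)calculated, so this defers work rather than avoiding it entirely; it pays off when the expensive part is the combination, not the inputs.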
You could store local copies of the external object values which are required. The access routine then compares the local copy with the external value, and only does the recalculation on a change.
Accessing the external objects properties would likewise force a possible re-evaluation of those properties, so the system should keep itself up-to-date automatically, but only re-calculate when it needs to. I don't know if you need to take steps to avoid circular dependencies.
This increases the amount of space you need for each object, but removes the observer pattern. It also defers all calculations until they are needed, instead of performing the calculation every time a source parameter changes. I hope this is relevant for your system.
unit Unit1;

interface

type
  TMyObject = class
  private
    FObject1, FObject2: TMyOtherObject;
    FObject1Val, FObject2Val: Integer;
    FMyCalculatedValue: Integer;
    function GetMyCalculatedValue: Integer;
  public
    property MyCalculatedValue: Integer read GetMyCalculatedValue;
  end;

implementation

function TMyObject.GetMyCalculatedValue: Integer;
begin
  if (FObject1.OtherCalculatedValue <> FObject1Val)
    or (FObject2.OtherValue <> FObject2Val) then
  begin
    FMyCalculatedValue :=
      FObject1.OtherCalculatedValue + // This is also calculated
      FObject2.OtherValue;
    FObject1Val := FObject1.OtherCalculatedValue;
    FObject2Val := FObject2.OtherValue;
  end;
  Result := FMyCalculatedValue;
end;

end.
In my work I use Bold for Delphi, which can manage unlimited complex structures of cached values depending on each other. Usually each variable only holds a small part of the problem. In this framework they are called derived attributes: derived because the value is not saved in the database, it just depends on other derived attributes or persistent attributes in the database.
The code behind such an attribute is written in Delphi as a procedure, or in OCL (Object Constraint Language) in the model. If you write it as Delphi code you have to subscribe to the variables it depends on. So if attribute C depends on A and B, then whenever A or B changes, the code to recalculate C is called automatically the next time C is read. So the first time C is read, A and B are also read (maybe from the database). As long as A and B are not changed, you can read C and get very fast performance. For complex calculations this can save quite a lot of CPU time.
The downside and bad news is that Bold is not officially supported anymore and you cannot buy it either. I suppose you can get it if you ask enough people, but I don't know where you can download it. Around 2005-2006 it was downloadable for free from Borland, but not anymore.
It is not ready for D2009, as someone would have to port it to Unicode.
Another option is ECO with .NET from Capable Objects. ECO is a plugin for Visual Studio. It is a supported framework that has the same idea and author as Bold for Delphi. Many things have also been improved; for example, databinding is used for the GUI components. Both Bold and ECO use a model as a central point with classes, attributes and links. Those can be persisted in a database or an XML file. With the free version of ECO the model can have at most 12 classes, but as I remember there are no other limits.
Bold and ECO contain a lot more than derived attributes that makes you more productive and lets you think about the problem instead of technical details of the database or, in your case, how to cache values. You are welcome to ask more questions about those frameworks!
Edit:
There is actually a download link for Embarcadero registered users for Bold for Delphi for D7, quite old... I know there were updates for D2005 and D2006.