Delphi XE2: How to use sets of integers with ordinal values > 255

All I want to do is to define a set of integers that may have values above 255, but I'm not seeing any good options. For instance:
with MyObject do Visible := Tag in [100, 155, 200..225, 240]; // Works just fine
but
with MyObject do Visible := Tag in [100, 201..212, 314, 820, 7006]; // Compiler error
I've gotten by with (often lengthy) conditional statements such as:
with MyObject do Visible := (Tag in [100, 202..212]) or (Tag = 314) or (Tag = 820) or (Tag = 7006);
but that seems ridiculous, and this is just a hard-coded example. What if I want to write a procedure and pass a set of integers whose values may be above 255? There HAS to be a better, more concise way of doing this.

The base type of a Delphi set must be an ordinal type with at most 256 distinct values. Under the hood, such a variable has one bit for each possible value, so a variable of type set of Byte has size 256 bits = 32 bytes.
Suppose it were possible to create a variable of type set of Integer. There would be 2^32 = 4294967296 distinct integer values, so this variable would need 4294967296 bits = 2^29 bytes = 512 MB. That's a HUGE variable. Maybe you can put such a value on the stack in 100 years.
Consequently, if you truly need to work with (mathematical) sets of integers, you need a custom data structure; the built-in set types won't do. For instance, you could implement it as an advanced record. Then you can even overload the in operator to make it look like a true Pascal set!
Implementing such a slow and inefficient type is trivial, and that might be good enough for small sets. Implementing a general-purpose integer set data structure with efficient operations (membership test, subset tests, intersection, union, etc.) is more work. There might be third-party code available on the WWW (but StackOverflow is not the place for library recommendations).
If your needs are more modest, you can use a simple array of integers instead (TArray<Integer>). Maybe you don't need O(1) membership tests, subset tests, intersections, and unions?
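For illustration, here is a minimal sketch of the advanced-record idea, backed by a plain TArray<Integer> with a linear membership scan (TIntSet and its From constructor are invented for this example; a serious implementation would keep the items sorted or hashed):

uses
  System.SysUtils;

type
  TIntSet = record
  private
    FItems: TArray<Integer>;
  public
    class function From(const Values: array of Integer): TIntSet; static;
    class operator In(AValue: Integer; const ASet: TIntSet): Boolean;
  end;

class function TIntSet.From(const Values: array of Integer): TIntSet;
var
  I: Integer;
begin
  SetLength(Result.FItems, Length(Values));
  for I := 0 to High(Values) do
    Result.FItems[I] := Values[I];
end;

class operator TIntSet.In(AValue: Integer; const ASet: TIntSet): Boolean;
var
  I: Integer;
begin
  // O(n) scan: fine for small sets; use binary search or a hash table for large ones.
  for I := 0 to High(ASet.FItems) do
    if ASet.FItems[I] = AValue then
      Exit(True);
  Result := False;
end;

With the In operator overloaded, the original statement reads almost like a native set test (ranges such as 201..212 still have to be written out as individual values):

with MyObject do Visible := Tag in TIntSet.From([100, 314, 820, 7006]);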

I would say that such a task already calls for a database. Something small and simple like TFDMemTable + TFDLocalSQL should do.

Related

Julia: efficient memory allocation

My program is memory-hungry, so I need to save as much memory as I can.
When you assign an integer value to a variable, the type of the value will always be Int64 (on a 64-bit system), whether it's 0 or +2^63-1 or -2^63.
I couldn't find a smart way to efficiently allocate memory, so I wrote a function that looks like this (in this case for integers):
function right_int(n)
    types = [Int8, Int16, Int32, Int64, Int128]
    for t in reverse(types)   # widest first: Int128, Int64, ..., Int8
        try
            n = t(n)          # keep narrowing until the value no longer fits
        catch                 # an InexactError means t is too small
            break
        end
    end
    n
end
a = right_int(parse(Int, readline(STDIN)))
But I don't think this is a good way to do it.
I also have a related problem: what's an efficient way of operating with numbers without worrying about typemins and typemaxs? Convert each operand to BigInt and then apply right_int?
You're missing the forest for the trees. right_int is type unstable. Type stability is a key concept in reducing allocations and making Julia fast. By trying to "right-size" your integers to save space, you're actually causing more allocations and higher memory use. As a simple example, let's try making a "right-sized" array of 100 integers from 1-100. They're all small enough to fit in Int8, so that's just 100 bytes plus the array header, right?
julia> @allocated [right_int(i) for i=1:100]
26496
Whoa, 26,496 bytes! Why didn't that work? And why is there so much overhead? The key is that Julia cannot infer what the return type of right_int might be, so it has to support any type being returned:
julia> typeof([right_int(i) for i=1:100])
Array{Any,1}
This means that Julia can't pack the integers densely into the array, and instead represents them as pointers to 100 individually "boxed" integers. These boxes tell Julia how to interpret the data that they contain, and that takes quite a bit of overhead. This doesn't just affect arrays, either — any time you use the result of right_int in any function, Julia can no longer optimize that function and ends up making lots of allocations. I highly recommend you read more about type stability in this very good blog post and in the manual's performance tips.
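For contrast, here is a sketch of a type-stable variant (stable_int is just an illustrative name): because it always returns Int (Int64 on a 64-bit system), Julia can infer a concrete element type for the array and store the values inline:

julia> stable_int(n) = Int(n)   # always returns the same concrete type
stable_int (generic function with 1 method)

julia> typeof([stable_int(i) for i=1:100])
Array{Int64,1}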
As far as which integer type to use: just use Int unless you know you'll be going over 2 billion. In the cases where you know you need to support huge numbers, use BigInt. It's notable that creating a similar array of BigInt uses significantly less memory than the "right-sized" array above:
julia> @allocated [big(i) for i=1:100]
6496

Getting length of vector in SPSS

I have a .sav file with plenty of variables. What I would like to do now is create macros/routines that detect basic properties of a range of item sets, using SPSS syntax.
COMPUTE scale_vars_01 = v_28 TO v_240.
The code above is intended to define a range of items which I would like to observe in further detail. How can I get the number of elements in the "array" scale_vars_01, as an integer?
Thanks for the info. (As you can see, SPSS syntax is still kind of strange to me, and I am thinking about using Python instead, but that might be too much overhead for my relatively simple purposes.)
One way is to use COUNT, such as:
COUNT Total = v_28 TO v_240 (LO THRU HI).
This will count all of the valid values in the vector. This will not work if the vector contains mixed types (e.g. string and numeric) or if the vector has missing values. An inefficient way to get the entire count using DO REPEAT is below:
DO IF $casenum = 1.
COMPUTE Total = 0.
DO REPEAT V = v_28 TO v_240.
COMPUTE Total = Total + 1.
END REPEAT.
ELSE.
COMPUTE Total = LAG(Total).
END IF.
This will work for mixed-type variables and will count fields with missing values. (The DO IF would work the same for COUNT; it forces a data pass, but for large datasets and long variable lists it will only evaluate for the first case.)
Python is probably the most efficient way to do this though - and I see no reason not to use it if you are familiar with it.
BEGIN PROGRAM.
import spss
beg = 'v_28'
end = 'v_240'
MyVars = []
# collect every variable name in file order
for i in xrange(spss.GetVariableCount()):
    x = spss.GetVariableName(i)
    MyVars.append(x)
# count of variables from beg to end, inclusive
total = MyVars.index(end) - MyVars.index(beg) + 1
print total
END PROGRAM.
Statistics has a built-in macro facility that could be used to define sets of variables, but the Python APIs provide much more powerful ways to access and use the metadata. There is also an extension command, SPSSINC SELECT VARIABLES, that can define macros based on variable metadata such as patterns in names, measurement level, type, and other properties. It generates a macro listing these variables that can then be used in standard syntax.

Issues with a published property of a large enum set

I'm creating a component with many published properties for the IDE, and one such property is an enum set with 38 values...
type
TBigEnum = (beOne, beTwo, beThree, beFour, beFive, beSix, beSeven, beEight,
beNine, beTen, beEleven, beTwelve, beThirteen, beFourteen, beFifteen,
beSixteen, beSeventeen, beEighteen, beNineteen, beTwenty, beTwentyOne,
beTwentyTwo, beTwentyThree, beTwentyFour, beTwentyFive, beTwentySix,
beTwentySeven, beTwentyEight, beTwentyNine, beThirty, beThirtyOne,
beThirtyTwo, beThirtyThree, beThirtyFour, beThirtyFive, beThirtySix,
beThirtySeven, beThirtyEight);
TBigEnums = set of TBigEnum;
Now, I try to use this in a component as a published property...
type
TMyComponent = class(TComponent)
private
FBigEnums: TBigEnums;
published
property BigEnums: TBigEnums read FBigEnums write FBigEnums;
end;
But the compiler does not let me...
[DCC Error] MyUnit.pas(50): E2187 Size of published set 'BigEnums' is >4 bytes
I understand this limitation, but how can I get around this without splitting it into two different sets?
PS - Each of these values actually has a unique name and purpose, but for the sake of example I just used the number as their names.
I don't remember the exact syntax, but in principle:
1. If the property does not have to be editable in the Object Inspector, define 38 different constants of a 64-bit integer type with their values set to 1 shl 0, 1 shl 1, 1 shl 2, and so on, so that those constants can be combined like PropOne or PropTwo or PropThree (see the sketch after this list).
2. If the property must be editable in the Object Inspector, the TMyPersistent class proposed in Jerry's answer seems OK to me.
3. There might be a way built into the language (or a compiler directive) to make the set representation use 8 bytes of storage. Int32 and Int64 are both native data types well supported on current processors, and assembler, C++, and C# can all deal with them. Some Pascal flavor (Free Pascal?) either has this implemented or has it on the roadmap.
EDIT: option 3 seems to be misleading. What the Free Pascal compiler can do with enums is listed at http://www.freepascal.org/docs-html/prog/prog.html, especially in the $PACKENUM chapter. As of today, enums are always backed by 32-bit ordinals, so the ability to widen enums that assembler, C++, and C# offer is not likely to be available in Delphi.
I'm not even sure whether the bitwise operators and, or, not, shl, shr that other languages use to implement enums and sets are available for 8-byte integers in Delphi or Free Pascal, so option 1 might also be misleading, and the winner is option 2.
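For what it's worth, Delphi's bitwise operators do work on Int64, so option 1 is viable when Object Inspector editing is not needed. A minimal sketch (the bf* names are made up for this example):

const
  bfOne         = Int64(1) shl 0;
  bfTwo         = Int64(1) shl 1;
  // ... one constant per flag ...
  bfThirtyEight = Int64(1) shl 37; // the 38th flag still fits in 64 bits

var
  Flags: Int64;
begin
  Flags := bfOne or bfThirtyEight;       // combine flags
  if (Flags and bfThirtyEight) <> 0 then // test membership
    Writeln('bfThirtyEight is set');
  Flags := Flags and not bfOne;          // remove a flag
end;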

How to visualize the value of a pointer while debugging in Delphi?

So, I have a variable buffPtr: TPointer
It has a size of 16 bytes and contains a series of numbers, mostly starting with 0, say something like 013854351387365.
I'm sure it contains values, because the application does what it does fine.
I want to see this value while I'm debugging.
If I add "PAnsiChar(buffPtr)^" to the watches I only see the first byte.
Just type in the watch expression PAnsiChar(buffPtr)^,16 or PByte(buffPtr)^,16 if you want the ordinal/byte values.
The trick here is to append the repeat count after a comma, like ,16.
It is IMHO more convenient than changing the Watch Properties, and it also works with the Ctrl+F7 Evaluate/Modify command of the IDE.
I added a watch on
PAnsiChar(buffPtr)^
with the Watch Properties set to Repeat Count = 16, Decimal.
Did you set the watch to dump a region of memory? For some structures that helps.
If you can recompile your application, then define this:
type
T16Values = array[0..15] of Byte;
P16Values = ^T16Values;
Then cast your pointer into a P16Values, and view that.
If it is another data type than Byte, change the above code accordingly.
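A hypothetical usage of that cast (the same expression also works in the Watch List):

var
  buffPtr: Pointer;
  View: P16Values;
begin
  GetMem(buffPtr, 16);
  try
    FillChar(buffPtr^, 16, 0);  // just so the buffer has defined contents
    View := P16Values(buffPtr);
    Writeln(View^[0]);          // first byte; watch P16Values(buffPtr)^ to see all 16
  finally
    FreeMem(buffPtr);
  end;
end;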

How to judge number of buckets for TBucketList

I've been using TBucketList and TObjectBucketList for all my hashing needs, but I've never experimented with changing the number of buckets. I vaguely remember what this means from my data structures class, but could someone elaborate on the nuances of this particular class in Delphi?
The following table lists the possible values:
Value   Number of buckets
bl2     2
bl4     4
bl8     8
bl16    16
bl32    32
bl64    64
bl128   128
bl256   256
TBucketList and TObjectBucketList store pointers. The hash function they use simply masks out the upper bits of the address. How many bits get masked out depends on how many buckets the object has. If you use bl2, for example, 31 bits get masked out and only one bit of the address determines the bucket. With bl256, an entire byte of the pointer gets used. It's one of the middle two bytes. The trade-off is simply the number of buckets you'll have. A bucket only takes eight bytes, so having 256 of them isn't a huge cost.
Aside from that, TBucketList is just an ordinary hash table like you learned about in your data-structure class.
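For example, to ask for the largest table up front (TBucketList's constructor takes one of the values from the table above; the key below is arbitrary, just for the demo):

uses
  Contnrs;

var
  List: TBucketList;
  Key: Pointer;
begin
  List := TBucketList.Create(bl256); // 256 buckets: one byte of each address picks the bucket
  try
    Key := Pointer(1234);            // any pointer-sized value can serve as a key
    List.Add(Key, nil);              // associate data (here nil) with the key
    if List.Exists(Key) then
      Writeln('found');
  finally
    List.Free;
  end;
end;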
TIntegerBucketList uses the same hash function as the others. If you want a more sophisticated hash function, write a descendant of TCustomBucketList and override the BucketFor method. In your descendant class, you can also assign the protected BucketCount property to use something other than the counts provided by TBucketList. Note that the class makes no effort to redistribute items due to a change in the bucket count, so don't re-assign BucketCount after items have already been added to the list unless you plan to do the redistribution yourself.
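A minimal sketch of such a descendant (the hash below is only an illustration, not a recommendation):

uses
  Contnrs;

type
  TMyBucketList = class(TCustomBucketList)
  protected
    function BucketFor(AItem: Pointer): Integer; override;
  public
    constructor Create;
  end;

constructor TMyBucketList.Create;
begin
  inherited Create;
  BucketCount := 256; // protected; assign it before any items are added
end;

function TMyBucketList.BucketFor(AItem: Pointer): Integer;
begin
  // Mix more of the address's bits instead of masking out a single byte.
  Result := Integer((NativeUInt(AItem) xor (NativeUInt(AItem) shr 8))
    mod NativeUInt(BucketCount));
end;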
