How are value equality comparisons executed in a program? - comparison

How does the computer execute value equality comparisons? Does it compare the values bit by bit, starting from the smallest bit, and stop once two different bits are encountered? Or does it start from the highest bit? Does it go through all bits regardless of where/when two unlike bits are found?

When you write an equality comparison in a higher level language (e.g. c), it will be transformed to intermediary representation and then to the instructions of a particular platform this code will be executed upon. Compiler is free to implement the equality comparison using any of the instructions available on a target architecture. The idea is usually to make it faster.
Different architectures have different instruction sets. Different processors can have varying implementation strategies (again to make things faster), as long as they comply with the spec.
Below are a few examples
x86
CMP command is used to compare two values. Here's an excerpt from Instruction set reference.
Compares the first source operand with the second source operand and sets the status flags in the EFLAGS register according to the results. The comparison is performed by subtracting the second operand from the first operand and then setting the status flags in the same manner as the SUB instruction. When an immediate value is used as an operand, it is sign-extended to the length of the first operand.
This basically means all bits are examined. I guess it was implemented that way to allow non-equality(<,>) comparisons too.
So all bits are examined. In the simplest cases that can be done serially, but can be done faster. See wikibooks on add / subtract blocks.
ARM
TEQ command can be used to test two values for equality.
Here's an excerpt from the infocenter.arm.com
The TEQ instruction performs a bitwise Exclusive OR operation on the value in Rn and the value of Operand2. This is the same as the EORS instruction, except that it discards the result.
Use the TEQ instruction to test if two values are equal without affecting the V or C flags.
Again all bits are examined.

Related

Add to zero...What is it for?

Why such code is used in some applications instead of a MOVE?
add 16 to ZERO giving SOME-RESULT
I spotted this in professionally written code at several spots.
Sorce is on this page
Why such code is used in some applications instead of a MOVE?
add 16 to ZERO giving SOME-RESULT
Without seeing more of the code, it appears that it could be a translation of IBM Assembler to COBOL. In particular, the ZAP (Zero and Add Packed) instruction may be literally translated to the above instruction, particularly if SOME-RESULT is COMP-3. Thus, someone checking the translation could see that the ZAP instruction was faithfully translated.
Or, it could be an assembler programmer's idea of a joke.
Having seen the code, I also note the use of
subtract some-data-item from some-data-item
which is used instead of
move zero to some-data-item
This is consistent with operations used with packed decimal fields in IBM Assembly, where there are no other instructions to accomplish "flexible" moves. By flexible, I mean that the packed decimal instructions contain a length field so that specific size MVC instructions need not be used.
This particular style, being unusual, may be related to catching copyright violations.
From my experience, I'm pretty sure I know the reason why the programmer would have done this. It has something to do with the binary representation of the number.
I bet SOME-RESULT is a packed-decimal (or COMP-3) format number. Let's assume the field is defined like this
05 SOME-RESULT PIC S9(5) COMP-3.
This results in a 3-byte field with a hex representation like this
x'00016C'
The decimal number is encoded as a binary encoded decimal (BCD, one decimal digit per half-byte), and the last half-byte holds the sign.
Let's take a look at how the sign is defined:
if it is one of x'C', x'A', x'F', x'E' (café), then the number is positive
if it is one of x'B', x'D', then the number is negative
any of x'0'..'x'9' are not valid signs, so we can distinguish signed packed-decimals from unsigned.
However, a zoned number (PIC 9(5) DISPLAY) - as in the source code - looks like this:
x'F0F0F0F1F6'
As you can see, each decimal digit is an EBCDIC character with the 'zone' part (the first half-byte) always being x'F'.
Now we get closer to your question!
What happens when we use
MOVE 16 TO SOME-RESULT
If you just MOVE a number to such a field, this results in being compiled into a PACK instruction on the machine code level.
PACK SOME-RESULT,=C'16'
A pack instruction takes a zoned number and packs it by picking only the second half-byte of each byte and storing it in the half-bytes of the packed number - with one exception! When it comes to the last byte, it simply flips the two half-bytes and stores them in the last half-byte of the decimal.
This means that the zone of the last byte of the zoned decimal becomes the sign in the packed decimal:
x'00016F'
So now we have an x'F' as the sign – which is a valid positive sign.
However, what happens if use this Cobol instruction instead
ADD 16 TO ZERO GIVING SOME-RESULT
This compiles into multiple machine level instructions
PACK SOME_RESULT,=C'0'
PACK TEMP,=C'16'
AP SOME_RESULT,TEMP
(or similar - the key point is that is needs an AP somewhere)
This makes a slight difference in the result, because the AP (add packed) instruction always sets the resulting sign to either x'C' for a positive or x'D' for a negative result.
So the difference lies in the sign
x'00016C'
Finally, the question is why would one make this difference? After all, both x'F' and x'C' are valid positive signs. So why care?
There is one situation when this slight difference can cause big problems: When the packed decimal is part of an index key, then we would not get a match, even though the numbers are semantically identical!
Because this situation occurred quite often in older databases like VSAM and DL/I (later: IMS/DB), it became good practice to "normalize" packed decimals if they were part of an index key.
However, some programmers adopted the practice without knowing why, so you may come across code that uses this "normalization" even though the data are not used for index keys.
You might also wonder why a compiler does not optimize out the ADD 16 TO ZERO. I'm pretty sure it once did, but that broke a lot of applications, so this specific optimization was removed again or at least made a non-default option with warnings.
Additional useful info
Note that at least the Enterprise Cobol for z/OS compiler allows you to see exactly the machine code that is produced from your source code if use the LIST compile option (see this example output). I recommend to always compile with options LIST, MAP, OFFSET, XREF because these options enable you find the exact problem in your Cobol source even when you only have a program dump from an abend.
Anyway, good programming practice is not to care about the compiler or the machine code, but about the other programmers who will have to maintain, and thus read and understand the code. Good practice would be to always prefer simple and readable instructions, and to document the reasons (right in the code) when deviating from this rule.
Some programmers like to do things "just because they can". I have a feeling that is what you are seeing here. It makes about as much sense as doing
a := 0 + b
would in go.

How to negate a variable?

I have the below Working-storage variable in my program.
01 W-WRK.
02 W-MNTH-THRSHLD PIC S9(04).
I am using the below COMPUTE function to negate the value of W-MNTH-THRSHLD.
COMPUTE W-MNTH-THRSHLD OF W-WRK =
W-MNTH-THRSHLD OF W-WRK * -1.
I want to know if this approach is right or is there any alternative for the same?
Firstly, why are you using qualification (the OF)? That is only required if you have defined duplicate names. Why define duplicate names in the WORKING-STORAGE?
Secondly, unless you are using a very old COBOL compiler, you should only use the minimum required full-stops/periods in the PROCEDURE DIVISION. That is, one to terminate the paragraph/SECTION label, one to terminate a paragraph/SECTION. One to terminate the PROCEDURE DIVISION header. One to terminate a program (if a full-stop/period is not already there. Keeping extra full-stops/periods around makes it more difficult to copy code around. Put the full-stop/period on a line of its own, so no line of code has one, then you can't accidentally terminate a scope by copying a line of code with a full-stop/period to within a scope.
With those in mind, your code becomes:
COMPUTE W-MNTH-THRSHLD = W-MNTH-THRSHLD
* -1
Multiplication is slower than subtraction. So as Bruce Martin suggested:
COMPUTE W-MNTH-THRSHLD = 0
- W-MNTH-THRSHLD
I do it like this:
SUBTRACT W-MNTH-THRSHLD FROM 0
GIVING W-MNTH-THRSHLD-REV-SIGN
I dislike "destroying" a value just for the heck of it. If the program fails, I know what W-MNTH-THRSHLD contained, plus the meaningful name for the target field explains what the line does.
You could also DIVIDE (or / in COMPUTE), but that is even slower than MULTIPLY (or *).
Also bear in mind that conversions may be required, because you are doing arithmetic with a USAGE DISPLAY field. If you define your field as BINARY or PACKED-DECIMAL conversion is less likely for arithmetic. You won't lose by doing that, unless your compiler can deal with a USAGE DISPLAY in arithmetic without requiring conversion.
Note also, COMPUTE is not a function. COMPUTE is a verb, just a part of the language. "I am using COMPUTE" is sufficient, and not even necessary, as we can see that from the code.

Lua floating point operations

I run Lua on a CPU without dedicated floating point HW, depending on SW emulation.
From luaopt.h I can see that some macros are set to double, but it does not clearly state when floats are used and its a little hard to track it.
If my script does simple stuff like:
a=0
a=a+1
for...
Would that involve a floating point operations at any level?
If no that's fine, but what is then the benefit to change macros to long?
(I tried of course but did not work....)
All numeric operations in Lua are performed (according to the default configuration) in floating point. There is no distinction made between floating point and integer, all values are simply numbers.
The actual C type used to store a Lua number is set in luaconf.h, and it is both allowed and even practical to change that to a suitable integral type. You start by changing LUA_NUMBER from double to int, long, or perhaps ptrdiff_t. Then you will find you need to tweak the related macros that control the conversions between strings and numbers. And, of course, you will likely need to eliminate most or all of the base math library since math.sin() and its friends and neighbors are not particularly useful over integers.
The result will be a Lua interpreter where all numbers are integers. The language will still allow you to type 3.14, but it will be stored as 3. Your code will likely not be completely portable to a Lua interpreter built with the standard configuration since a huge amount of Lua code casually assumes that floating point arithmetic is permitted, and remember that your compiled byte code will definitely not be compatible since byte code will store numbers as LUA_NUMBER.
There is LNUM patch (used, for example, by OpenWrt project which relies heavily on Lua for providing Web UI on hardware without FPU) that allows dual integer/floating point representation of numbers in Lua with conversions happening behind the scenes when required. With it most integer computations will be performed without resorting to FPU. Unfortunately, it's only applicable to Lua 5.1; 5.2 is not supported.

How safe is comparing numbers in lua with equality operator?

In my engine I have a Lua VM for scripting. In the scripts, I write things like:
stage = stage + 1
if (stage == 5) then ... end
and
objnum = tonumber("5")
if (stage == objnum)
According to the Lua sources, Lua uses a simple equality operator when comparing doubles, the internal number type it uses.
I am aware of precision problems when dealing with floating point values, so I want to know if the comparison is safe, that is, will there be any problems with simply comparing these numbers using Lua's default '==' operation? If so, are there any countermeasures I can employ to make sure 1+2 always compares as equal to 3? Will converting the values to strings work?
You may be better off by converting to string and then comparing the results if you only care about equality in some cases. For example:
> print(21, 0.07*300, 21 == 0.07*300, tostring(21) == tostring(0.07*300))
21 21 false true
I learned this hard way when I gave my students an assignment with these numbers (0.07 and 300) and asked them to implement a unit test that then miserably failed complaining that 21 is not equal 21 (it was comparing actual numbers, but displaying stringified values). It was a good reason for us to have a discussion about comparing floating point values.
I can employ to make sure 1+2 always compares as equal to 3?
You needn't worry. The number type in Lua is double, which can hold many more integers exactly than a long int.
Comparison and basic operations on doubles is safe in certain situations. In particular if the numbers and their result can be expressed exactly - including all low value integers.
So 2+1 == 3 will be fine for doubles.
NOTE: I believe there's even some guarantees for certain math functions ( like pow and sqrt ) and if your compiler/library respects those then sqrt(4.0)==2.0 or 4.0 == pow(2.0,2.0) will reliably be true.
By default, Lua is compiled with c++ floats, and behind the scenes number comparisons boils down to float comparisons in c/c++, which are indeed problematic and discussed in several threads, e.g. most-effective-way-for-float-and-double-comparison.
Lua makes the situation only slightly worse by converting all numbers, including c++ integers, into floats. So you need to keep it in mind.

What is the difference between a Binary and a Bitstring in Erlang?

In the Erlang shell, I can do the following:
A = 300.
300
<<A:32>>.
<<0, 0, 1, 44>>
But when I try the following:
B = term_to_binary({300}).
<<131,104,1,98,0,0,1,44>>
<<B:32>>
** exception error: bad argument
<<B:64>>
** exception error: bad argument
In the first case, I'm taking an integer and using the bitstring syntax to put it into a 32-bit field. That works as expected. In the second case, I'm using the term_to_binary BIF to turn the tuple into a binary, from which I attempt to unpack certain bits using the bitstring syntax. Why does the first example work, but the second example fail? It seems like they're both doing very similar things.
The difference between in a binary and a bitstring is that the length of a binary is evenly divisible by 8, i.e. it contains no 'partial' bytes; a bitstring has no such restriction.
This difference is not your problem here.
The problem you're facing is that your syntax is wrong. If you would like to extract the first 32 bits from the binary, you need to write a complete matching statement - something like this:
<<B1:32, _/binary>> = B.
Note that the /binary is important, as it will match the remnant of the binary regardless of its length. If omitted, the matched length defaults to 8 (i.e. one byte).
You can read more about binaries and working with them in the Erlang Reference Manual's section on bit syntax.
EDIT
To your comment, <<A:32>> isn't just for integers, it's for values. Per the link I gave, the bit syntax allows you to specify many aspects of binary matching, including data types of bound variables - while the default type is integer, you can also say float or binary (among others). The :32 part indicates that 32 bits are required for a match - that may or may not be meaningful depending on your data type, but that doesn't mean it's only valid for integers. You could, for example, say <<Bits:10/bitstring>> to describe a 10-bit bitstring. Hope that helps!
The <<A:32>> syntax constructs a binary. To deconstruct a binary, you need to use it as a pattern, instead of using it as an expression.
A = 300.
% Converts a number to a binary.
B = <<A:32>>.
% Converts a binary to a number.
<<A:32>> = B.

Resources