I've encountered an array of floating point values that I can't make print out in a reasonable way with the old tried and true printf() function. The problem, I guess, is that the range of numbers is huge... from tiny numbers like -3.66542e-296 to +9.5543e+301 and lots of values in between.
Normally values are more related to each other and something like %23.16f will work. But with these huge numbers the f specifier doesn't work, because some numbers print out dozens to hundreds of digits (overflowing the size specification). This leaves the e format (or the g format which lets printf() switch back and forth between e and f formats).
When forced to adopt e or g specifier due to large range of values, is there any way to:
1. make the decimal points of all values align over each other.
2. make the e (of the exponent) of all values align over each other.
3. make the number of digits following the e be fixed (always the same).
For almost any purposes, #1 is the best option - alignment with the . is most often helpful. But it seems impossible to make nice neat, readable columns in any way whatsoever in this situation... unless I'm missing something.
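One way to get all three alignments at once is to let the e conversion do the rounding and then rewrite the exponent by hand. The sketch below uses Python, whose format specifications follow the same e/f/g conventions as printf(). The helper name fmt_sci and its parameters are made up for illustration, and it only handles finite values (no NaN or infinity).

```python
def fmt_sci(x, digits=5, exp_digits=3):
    """Format a finite float as [sign]d.ddddde[sign]ddd with fixed widths."""
    # The space flag reserves one column for the sign, so the decimal
    # point always lands in the same column.
    mantissa, exponent = f"{x: .{digits}e}".split("e")
    # Re-emit the exponent with an explicit sign and a fixed digit count
    # (the zero-padded width includes the sign character).
    width = exp_digits + 1
    return f"{mantissa}e{int(exponent):+0{width}d}"


for value in (-3.66542e-296, 9.5543e301, 1.5, -42.0):
    print(fmt_sci(value))
```

Because every field has a fixed width, the decimal points and the e characters line up in the same columns, and the exponent always has the same number of digits. The same split-and-rebuild idea can be done in C with snprintf() into a buffer, since the standard only guarantees at least two exponent digits.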
A simple question that turned out to be quite complex:
How do I turn a float to a String in GForth? The desired behavior would look something like this:
1.2345e fToString \ takes 1.2345e from the float stack and pushes (addr n) onto the data stack
After a lot of digging, one of my colleagues found it:
f>str-rdp ( rf +nr +nd +np -- c-addr nr )
https://www.complang.tuwien.ac.at/forth/gforth/Docs-html-history/0.6.2/Formatted-numeric-output.html
Convert rf into a string at c-addr nr. The conversion rules and the
meanings of nr +nd np are the same as for f.rdp.
And from f.rdp:
f.rdp ( rf +nr +nd +np -- )
https://www.complang.tuwien.ac.at/forth/gforth/Docs-html/Simple-numeric-output.html
Print float rf formatted. The total width of the output is nr. For
fixed-point notation, the number of digits after the decimal point is
+nd and the minimum number of significant digits is np. Set-precision has no effect on f.rdp. Fixed-point notation is used if the number of
significant digits would be at least np and if the number of digits
before the decimal point would fit. If fixed-point notation is not
used, exponential notation is used, and if that does not fit,
asterisks are printed. We recommend using nr>=7 to avoid the risk of
numbers not fitting at all. We recommend nr>=np+5 to avoid cases where
f.rdp switches to exponential notation because fixed-point notation
would have too few significant digits, yet exponential notation offers
fewer significant digits. We recommend nr>=nd+2, if you want to have
fixed-point notation for some numbers. We recommend np>nr, if you want
to have exponential notation for all numbers.
In humanly readable terms, these functions require a number on the float-stack and three numbers on the data stack.
The first number-parameter tells it how long the string should be, the second one how many digits you would like after the decimal point, and the third tells it the minimum number of significant digits (which roughly translates to precision). A lot of implicit math is performed to determine the final String format that is produced, so some tinkering is almost required to make it behave the way you want.
Testing it out (we don't want to rebuild f., but to produce a format that will be accepted as floating-point number by forth to EVALUATE it again, so the 1.2345E0 notation is on purpose):
PI 18 17 17 f>str-rdp type \ 3.14159265358979E0 ok
PI 18 17 17 f.rdp \ 3.14159265358979E0 ok
PI f. \ 3.14159265358979 ok
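A side note on the goal of reading the string back in: the outputs above carry 15 significant digits, which is not always enough to recover the original value bit for bit. The sketch below illustrates this in Python rather than Forth, on the assumption that the floats involved are 64-bit IEEE 754 doubles; it says nothing about Gforth's own parser.

```python
import math

# The 15-digit text shown above, parsed back as a double.
truncated = float("3.14159265358979E0")

# 17 significant digits (one before the point, 16 after) are enough to
# round-trip any IEEE 754 double through text.
full = f"{math.pi:.16E}"

print(truncated == math.pi)          # does the 15-digit text round-trip?
print(full, float(full) == math.pi)  # does the 17-digit text round-trip?
```

If an exact round trip matters, it may be worth widening the nr parameter so that 17 significant digits fit.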
I couldn't find the exact word for this, so I looked into Gforth sources.
Apparently, you could go with the represent word, which writes the most significant digits into a supplied buffer, but that's not exactly the final output. represent returns validity and sign flags, as well as the position of the decimal point. That word is then used in all variants of the floating-point printing words (f., fp., fe.).
Probably the easiest way would be to substitute emit with your own word (emit is a deferred word) that saves the data where you need it, use one of the available floating-point printing words, and then restore emit to its original value.
I'd like to hear the preferred solution too...
I have a sheet with values that I want to format in grams. The values range from high to low and I wish to format them with a comma as a thousand-separator, and rounded to two decimal places, but trimming both the decimal point and places where the number is a whole number. The following examples should explain better:
a) 1000 to be presented as 1,000g
b) 0.75 to be presented as 0.75g
c) 0.2 to be presented as 0.2g
d) 0.1234 to be presented as 0.12g
I've tried using a custom number format of #,##0.##g but this does not satisfy my first requirement (a) where the numbers are whole numbers and leaves an insignificant decimal point (i.e. 1,000.g), although does very well at formatting the remaining three requirements (b, c and d).
Is there a way of overcoming this?
#,##0.00_ g;(#,##0.00 g)
I'm a bit late to this article but I found the above format works for me.
Assuming your numbers are in column A with a header. You could try putting this in B2 then hide column A:
=arrayformula(if(arrayformula(TRUNC(A2:A,2)&"g")="0g","",arrayformula(TRUNC(A2:A,2)&"g")))
The best I could do with format was [=0]0;.## It gets rid of the period in 1000 but does not truncate 0.1234 and doesn't add the g. Maybe you could work with that if you must use format. I really don't think format can do it.
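For reference, the four requested outputs are easy to pin down outside of a number format. This is a small Python sketch of the intended rule (round to two places, add thousands separators, then trim trailing zeros and a dangling point). The function name format_grams is made up, and note that it rounds, whereas the TRUNC formula above truncates.

```python
def format_grams(value):
    """Format a weight as e.g. 1,000g / 0.75g / 0.2g / 0.12g."""
    text = f"{value:,.2f}"                 # thousands separator, 2 decimals
    text = text.rstrip("0").rstrip(".")    # drop trailing zeros, then a bare point
    return text + "g"


for sample in (1000, 0.75, 0.2, 0.1234):
    print(format_grams(sample))
```

The two-step trim is the part a plain format string cannot express: the decimal point has to disappear only when nothing is left after it.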
I have problem with comparison of two variables of "Real" type. One is a result of mathematical operation, stored in a dataset, second one is a value of an edit field in a form, converted by StrToFloat and stored to "Real" variable. The problem is this:
As you can see, the program is trying to tell me that 121,97 is not equal to 121,97... I have read this topic, and I am not completely sure that it is the same problem. If it were, wouldn't both numbers be stored in the variables as exactly the same closest representable number, which for 121.97 is 121.96999 99999 99998 86313 16227 83839 70260 62011 71875?
Now let's say that they are not stored as the same closest representable number. How do I find out exactly how they are stored? When I look in the "CPU" debugging window, I am completely lost. I see the addresses where those values should be, but nothing even similar to a binary, hexadecimal or whatever representation of the actual number... I admit that advanced debugging is an unknown universe to me...
Edit:
those two values really are slightly different.
OK, I don't need to understand everything. Although I am not dealing with money, there will be maximum 3 decimal places, so "currency" is the way out
BTW: The calculation is:
DATA[i].Meta.UnUsedAmount := DATA[i].AMOUNT - ObjQuery.FieldByName('USED').AsFloat;
In this case it is 3695 - 3573.03
For reasons unknown, you cannot view a float value (single/double or real48) as hexadecimal in the watch list.
However, you can still view the hexadecimal representation by viewing it as a memory dump.
Here's how:
1. Add the variable to the watch list.
2. Right click on the watch -> Edit Watch...
3. View it as memory dump
Now you can compare the two values in the debugger.
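The same byte-level comparison can be reproduced outside the debugger. The sketch below is in Python, not Delphi, and assumes the values are 8-byte IEEE 754 doubles and that the edit field held the text 121.97; the helper name double_bytes_hex is made up for illustration.

```python
import struct


def double_bytes_hex(x):
    """Return the 8 bytes of an IEEE 754 double as hex, most significant first."""
    return struct.pack(">d", x).hex()


computed = 3695 - 3573.03   # the calculation from the question
typed = float("121.97")     # what converting the edit-field text would give

print(double_bytes_hex(computed))
print(double_bytes_hex(typed))
```

The two hex strings differ only in their last few digits, which is the kind of difference the memory dump reveals. A memory dump on x86 shows the bytes in the opposite (little-endian) order.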
Never use floats for monetary amounts
You do know of course that you should not use floats to count money.
You'll get into all sorts of trouble with rounding, and comparisons will not work the way you want them to.
If you want to work with money use the currency type instead. It does not have these problems, supports 4 decimal places and can be compared using the = operator with no rounding issues.
In your database you use the money or currency datatype.
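To illustrate why a decimal fixed-point type sidesteps the problem, here is a Python sketch using the values from the question. Python's Decimal stands in for a currency-style type here; it is an analogy, not Delphi code, and the tolerance of 0.0005 is an assumption based on the "maximum 3 decimal places" mentioned in the question.

```python
from decimal import Decimal
import math

# Binary floating point: the subtraction and the typed-in value are
# two different doubles, so exact comparison fails.
computed = 3695 - 3573.03
typed = float("121.97")
print(computed == typed)

# Comparing with a tolerance suited to three decimal places works.
print(math.isclose(computed, typed, abs_tol=0.0005))

# Decimal arithmetic keeps the digits exactly, so = behaves as expected.
unused = Decimal("3695") - Decimal("3573.03")
print(unused == Decimal("121.97"))
```

If the values have to stay as floats, a tolerance comparison is the usual fallback; Delphi's Math unit offers SameValue for that purpose.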
Studying for a test right now and can't seem to wrap my head around when to use "V" for a decimal instead of an actual decimal in PIC clauses. I've done some research but can't find anything I understand. Only been learning cobol for about a week, so is there like a rule of thumb here? Thanks for your time.
You use an actual decimal-point when you want to "output" a value which has decimal places, like a report line, a position on a screen, an item in an output file which is going to a "different" system which doesn't understand the format with an implied decimal place.
That's what the V is, it is an implied decimal place. It tells the compiler where to align results from calculations, MOVEs, whatever. Computer chips, and the machine instructions they support, don't know about actual decimal points for their internal processing.
COBOL is a language with fixed-length fields. The machine instructions don't need to know where the decimal point is (effectively it can deal with everything as integer values) but the compiler does, and the compiler has to do the correct scaling and alignment of results.
Storing on your own files, use V, the implied decimal place.
For data which is to be "human readable" or read by a system which cannot understand your character set, cannot scale what looks like an integer, use an actual decimal-point, . (for computer-readable stuff, you can sometimes use a separate scaling factor, if that is more convenient for the receiving system).
Basically, V for internal, . for external, should be a rule of thumb to get you there.
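To make the idea of an implied decimal place concrete: a field such as PIC 9(3)V99 holding 123.45 stores only the digits 12345, and the position of the point lives in the definition, not in the data. The Python sketch below mimics that for unsigned values; the function names are made up for illustration, and this is not how any particular compiler implements it.

```python
from decimal import Decimal


def decode_implied(digits, scale):
    """Interpret stored digits as a number with `scale` implied decimals."""
    return Decimal(int(digits)).scaleb(-scale)


def encode_implied(value, width, scale):
    """Store an unsigned Decimal as zero-padded digits with no point.

    Digits beyond `scale` decimal places are truncated.
    """
    scaled = int(value.scaleb(scale))
    return str(scaled).zfill(width)


print(decode_implied("12345", 2))               # stored digits -> value
print(encode_implied(Decimal("1.5"), 5, 2))     # value -> stored digits
```

An edited picture with an actual "." does the opposite: the point becomes a real character in the field, which is why it suits output and not arithmetic.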
Which COBOL are you using? I'm surprised it is not covered in your documentation.
I can't seem to get this one part right. I was given a input file with a bunch of names, some of which I need to skip, with extra information on each one. I was trying use ANDs and ORs to skip over the names I did not need and I came up with this.
IF DL-CLASS-STANDING = 'First Yr' OR 'Second Yr' AND
GRAD-STAT-IN = ' ' OR 'X'
It got rid of all but one person, but when I tried to add another set of ANDs and ORs the program started acting like the stipulations were not even there.
Did I make it too complex for the compiler? Is there an easier way to skip over things?
Try adding some parentheses to group things logically:
IF (DL-CLASS-STANDING = 'First Yr' OR 'Second Yr') AND
(GRAD-STAT-IN = ' ' OR 'X')
You may want to look into fully expanding that abbreviated expression since the expansion may not be what you think when there's a lot of clauses - it's often far better to be explicit.
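As a sketch of what "fully expanding" means: in standard COBOL the abbreviated form is first expanded into full relations, and AND is then evaluated before OR. Python's and/or have the same relative precedence, so the two readings can be compared directly. The function names below are made up for illustration.

```python
def ungrouped(standing, grad):
    """The condition as written, expanded, with no parentheses added."""
    first = standing == "First Yr"
    second = standing == "Second Yr"
    blank = grad == " "
    x = grad == "X"
    # `and` binds more tightly than `or`, so this is
    # first or (second and blank) or x
    return first or second and blank or x


def grouped(standing, grad):
    """The condition with the parentheses from the suggestion above."""
    first = standing == "First Yr"
    second = standing == "Second Yr"
    blank = grad == " "
    x = grad == "X"
    return (first or second) and (blank or x)


# A first-year record with some other graduation status:
print(ungrouped("First Yr", "Y"), grouped("First Yr", "Y"))
```

The two functions disagree on records like the one printed, which is one way a condition can let an unexpected person through.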
However, what I would do is use the 88 level variables to make this more readable - these were special levels to allow conditions to be specified in the data division directly rather than using explicit conditions in the code.
In other words, put something like this in your data division:
03 DL-CLASS-STANDING PIC X(20).
88 FIRST-YEAR VALUE 'First Yr'.
88 SECOND-YEAR VALUE 'Second Yr'.
03 GRAD-STAT-IN PIC X.
88 GS-UNKNOWN VALUE ' '.
88 GS-NO VALUE 'X'.
Then you can use the 88 level variables in your expressions:
IF (FIRST-YEAR OR SECOND-YEAR) AND (GS-UNKNOWN OR GS-NO) ...
This is, in my opinion, more readable and the whole point of COBOL was to look like readable English, after all.
The first thing to note is that the code shown is the code which was working, and the amended code which did not give the desired result was never shown. As an addendum, why, if only one person were left, would more selection be necessary? To sum up that, the actual question is unclear beyond saying "I don't know how to use OR in COBOL. I don't know how to use AND in COBOL".
Beyond that, there were two actual questions:
Did I make it too complex for the compiler?
Is there an easier way to skip over things [is there a clearer way to write conditions]?
To the first, the answer is No. It is very far from difficult for the compiler. The compiler knows exactly how to handle any combinations of OR, AND (and NOT, which we will come to later). The problem is, can the human writer/reader code a condition successfully such that the compiler will know what they want, rather than just giving the result from the compiler following its rules (which don't account for multiple possible human interpretations of a line of code)?
The second question therefore becomes:
How do I write a complex condition which the compiler will understand in an identical way to my intention as author and in an identical way for any reader of the code with some experience of COBOL?
Firstly, a quick rearrangement of the (working) code in the question:
IF DL-CLASS-STANDING = 'First Yr' OR 'Second Yr'
AND GRAD-STAT-IN = ' ' OR 'X'
And of the suggested code in one of the answers:
IF (DL-CLASS-STANDING = 'First Yr' OR 'Second Yr')
AND (GRAD-STAT-IN = ' ' OR 'X')
The second version is clearer, and it is not quite identical to the first. COBOL expands the abbreviated conditions and then evaluates AND before OR, so without the parentheses the first version is read as DL-CLASS-STANDING = 'First Yr' OR (DL-CLASS-STANDING = 'Second Yr' AND GRAD-STAT-IN = ' ') OR GRAD-STAT-IN = 'X'. The parentheses are what make the two OR groups combine with a single AND.
The answer was addressing the problem of a condition having its complexity increased, and resolving it with brackets/parentheses (simplifying the condition is another possibility, but without the non-working example it is difficult to make suggestions).
The original code may give acceptable results on a particular input file, but when it needs to be more complex, the wheels start to fall off.
The suggested code works, but it does not (fully) resolve the problem of extending the complexity of the condition, because, in miniature, it repeats that problem within the parentheses.
How is this so?
A simple condition:
IF A EQUAL TO "B"
A slightly more complex condition:
IF A EQUAL TO "B" OR "C"
A slight, but not complete, simplification of that:
IF (A EQUAL TO "B" OR "C")
If the condition has to become more complex, with an AND, it can be simple for the humans (the compiler does not care, it cannot be fooled):
IF (A EQUAL TO "B" OR "C")
AND (E EQUAL TO "F")
But what of this?
IF (A EQUAL TO "B" OR "C" AND E EQUAL TO "F")
Placing the AND inside the brackets has allowed the original problem for humans to be replicated. What does that mean, and how does it work?
One answer is this:
IF (A EQUAL TO ("B" OR "C") AND E EQUAL TO "F")
Perhaps clearer, but not to everyone, and again the original problem still exists, in the minor.
So:
IF A EQUAL TO "B"
OR A EQUAL TO "C"
Simplified, for the first part, but still that problem in the minor (just add AND ...), so:
IF (A EQUAL TO "B")
OR (A EQUAL TO "C")
Leading to:
IF ((A EQUAL TO "B")
OR (A EQUAL TO "C"))
And:
IF ((A EQUAL TO "B")
OR (A EQUAL TO "C"))
Now, if someone wants to augment with AND, it is easy and clear. If done at the same level as one of the condition parts, it solely attaches to that. If done at the outermost level, it attaches to both (all).
IF (((A EQUAL TO "B")
AND (E EQUAL TO "F"))
OR (A EQUAL TO "C"))
or
IF (((A EQUAL TO "B")
OR (A EQUAL TO "C"))
AND (E EQUAL TO "F"))
What if someone wants to insert the AND inside the brackets? When what is inside the brackets is simple, people tend not to do that. When what is inside the brackets is already complicated, the AND does tend to be added there. It seems that something which is simple through being on its own tends not to be made complicated, whereas something which is already complicated (more than one thing, not on its own) tends to be made more complex without too much further thought.
COBOL is an old language. Many old programs written in COBOL are still running. Many COBOL programs have to be amended, or just read to understand something, and that many times over their lifetimes of many years.
When changing code, by adding something to a condition, it is best if the original parts of the condition do not need to be "disturbed". If complexity is left within brackets, it is more likely that code needs to be disturbed, which increases the amount of time in understanding (it is more complex) and changing (more care is needed, more testing necessary, because the code is disturbed).
Many old programs will be examples of bad practice. There is not much to do about that, except to be careful with them.
There isn't any excuse for writing new code which requires more maintenance and care in the future than is absolutely necessary.
Now, the above examples may be considered long-winded. It's COBOL, right? Lots of typing? But COBOL gives immense flexibility in data definitions. COBOL has, as part of that, the Level 88, the Condition Name.
Here are data definitions for part of the above:
01 A PIC X.
88 PARCEL-IS-OUTSIZED VALUE "B" "C".
01 E PIC X.
88 POSTAGE-IS-SUFFICIENT VALUE "F".
The condition becomes:
IF PARCEL-IS-OUTSIZED
AND POSTAGE-IS-SUFFICIENT
Instead of just literal values, all the relevant literal values now have a name, so that the coder can indicate what they actually mean, as well as the actual values which carry that meaning. If more categories should be added to PARCEL-IS-OUTSIZED, the VALUE clause on the 88-level is extended.
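For readers more used to other languages, a condition-name is roughly a named test against a fixed set of values, attached to the field it tests. The Python sketch below is only an analogy; the class, the field names size_code and postage_code, and the constant names are all hypothetical.

```python
from dataclasses import dataclass

# The value lists live in one place, like the VALUE clause of an 88-level.
OUTSIZED_CODES = frozenset({"B", "C"})
SUFFICIENT_CODES = frozenset({"F"})


@dataclass
class Parcel:
    size_code: str      # hypothetical name for the first one-character field
    postage_code: str   # hypothetical name for the second one-character field

    @property
    def is_outsized(self):
        return self.size_code in OUTSIZED_CODES

    @property
    def postage_is_sufficient(self):
        return self.postage_code in SUFFICIENT_CODES


parcel = Parcel(size_code="C", postage_code="F")
if parcel.is_outsized and parcel.postage_is_sufficient:
    print("accept outsized parcel")
```

Adding another outsized category means touching only OUTSIZED_CODES, just as only the VALUE clause changes in the COBOL version.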
If another condition is to be combined, it is much more simple to do so.
Is this all true? Well, yes. Look at it this way.
COBOL operates on the results of a condition where coded.
If condition
Simple conditions can be compounded through the use of brackets, to make a condition:
If condition = If (condition) = If ((condition1) operator (condition2))...
And so on, to the limits of the compiler.
The human just has to deal with the condition they want for the purpose at hand. For general logic-flow, look at the If condition. For verification, look at the lowest detail. For a subset, look at the part of the condition relevant to the sub-set.
Use simple conditions. Make conditions simple through brackets/parentheses. Make complex conditions, where needed, by combining simple conditions. Use condition-names for comparisons to literal values.
OR and AND have been treated so far. NOT is often seen as something to treat warily:
IF NOT A EQUAL TO B
IF A NOT EQUAL TO B
IF (NOT (A EQUAL TO B)), remembering that this is just IF condition
So NOT is not scary, if it is made simple.
Throughout, I've been editing out spaces. Because the brackets are there, I like to make them in-your-face. I like to structure and indent conditions, to emphasize the meaning I have given them.
So:
IF ( ( ( condition1 )
OR ( condition2 ) )
AND
( ( condition3 )
OR ( condition4 ) ) )
(and more sculptured than that as well). By structuring, I hope that a) I mess up less and b) when/if I do mess up, someone has a better chance of noticing it.
If conditions are not simplified, then understanding the code is more difficult. Changing the code is more difficult. For people learning COBOL, keeping things simple is a long-term benefit to all.
As a rule, I avoid the use of AND if at all possible. Nested IF's work just as well, are easier to read, and with judicious use of 88-levels, do not have to go very deep. This seems so much easier to read, at least in my experience:
05 DL-CLASS-STANDING PIC X(20) VALUE SPACE.
88 DL-CLASS-STANDING-VALID VALUE 'First Yr' 'Second Yr'.
05 GRAD-STAT-IN PIC X VALUE SPACE.
88 GRAD-STAT-IN-VALID VALUE SPACE 'X'.
Then the code is as simple as this:
IF DL-CLASS-STANDING-VALID
IF GRAD-STAT-IN-VALID
ACTION ... .