Concept of "Excel [Blank] Cell" in any programming language? - parsing

Excel has a Blank cell which has some interesting properties when it comes to calculations:
In the below we will assume cell A1 is blank.
A blank cell equals another blank cell: =A1=A1.
A blank cell equals '', 0, and FALSE: =A1="", =A1=0, =A1=FALSE.
A blank cell will coerce to the expected operand type: =A1+A1 (0), =-A1 (0), =A1&A1 ("").
I suppose the closest item I've found is the bool function in python, which covers the first two cases above:
bool(None) == bool(None)
bool(None) == bool(0), bool(None) == bool(''), bool(None) == bool(False)
But that doesn't cover the third case where it implicitly casts to the expected type. Is there anything in a language that covers that?
Here is a video showing some of the properties: https://gyazo.com/b23989ba1fd28500aff32a6b6cb6dca5.

Is there anything in a language that covers that?
Short answer:
Languages that support syntactic unification, e.g. Prolog, Datalog, Answer Set Programming, ... .
These programming languages are typically Logic programming languages.
Here are demonstrations of the items in your list using SWI-Prolog.
Note: In Prolog, = is unification, not comparison (==). In some situations it helps to think of a variable as a named pointer and of unification as setting two pointers equal, but that analogy works only in specific cases. You have been warned.
A blank cell equals another blank cell: =A1=A1.
?- A = B.
A = B.
Now if A is bound to a value then B is also bound because they are unified.
?- A = B,A=1.
A = B, B = 1.
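The pointer analogy can be made concrete with a short Python sketch. It only illustrates the analogy and is not how SWI-Prolog is implemented: Var, walk, and unify are hypothetical names, and the sketch handles only variables and atomic values (no compound terms, no backtracking).

```python
_UNBOUND = object()  # sentinel marking a variable that has no binding yet


class Var:
    """Hypothetical logic variable: unbound until unified with something."""

    def __init__(self, name):
        self.name = name
        self.ref = _UNBOUND


def walk(term):
    # Follow the chain of bindings until a value or an unbound variable is reached.
    while isinstance(term, Var) and term.ref is not _UNBOUND:
        term = term.ref
    return term


def unify(a, b):
    # Make a and b refer to the same thing, or report failure.
    a, b = walk(a), walk(b)
    if a is b:
        return True
    if isinstance(a, Var):
        a.ref = b
        return True
    if isinstance(b, Var):
        b.ref = a
        return True
    return a == b


A, B = Var("A"), Var("B")
unify(A, B)       # corresponds to  ?- A = B.
unify(A, 1)       # corresponds to  A = 1
print(walk(B))    # B is bound as well, because A and B were unified
print(unify(B, 2))  # fails: B already stands for 1
```

After the two calls, looking up either variable yields 1, which mirrors the answer A = B, B = 1 shown above.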
A blank cell equals '', 0, and FALSE: =A1="", =A1=0, =A1=FALSE.
I will take that to mean that a blank cell has a primitive data type with a default value of that type.
Since Prolog is not strongly typed, a variable has no type until a value is bound to it; think of Python with duck typing. Even then, the concept of a specific type may not be what one expects.
However when the second statement is considered with the third statement
A blank cell will coerce to the expected operand type: =A1+A1 (0), =-A1 (0), =A1&A1 ("").
then as noted the Prolog variable (Excel blank cell) acquires the type upon binding.
?- (var(B) -> write('B is variable') ; write('B is not variable')), nl,
   (integer(B) -> write('B is integer') ; write('B is not integer')), nl,
   A is 2,
   (integer(A) -> write('A is integer') ; write('A is not integer')), nl,
   A = B,
   (integer(B) -> write('B is integer') ; write('B is not integer')).
B is variable
B is not integer
A is integer
B is integer
B = A, A = 2.

A blank cell in Excel is not a data type; it is an object with multiple parameters.
Whether a blank cell equals something depends on the blank cell and on the type of the other value or cell.
Remember that an empty cell in Excel could be a date, a currency, or a Boolean.
So the closest thing to Excel's blank cell is really 'null' in most programming languages. A blank cell is a state rather than a data type.
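For comparison, the three behaviours listed in the question can be approximated in Python with a small sentinel class. This is a minimal sketch under stated assumptions: Blank is a hypothetical class, not part of any library, Python's & operator is borrowed to stand in for Excel's concatenation operator, and only the cases from the question are modelled.

```python
class Blank:
    """Hypothetical stand-in for an Excel blank cell (illustration only)."""

    def __eq__(self, other):
        # Blank equals another Blank and the "empty" value of str, int,
        # float and bool (bool is a subclass of int in Python).
        if isinstance(other, Blank):
            return True
        if isinstance(other, (str, int, float)):
            return other == type(other)()
        return NotImplemented

    def __add__(self, other):
        # Arithmetic context: behave like 0.
        return 0 if isinstance(other, Blank) else 0 + other

    __radd__ = __add__

    def __neg__(self):
        return 0

    def __and__(self, other):
        # Concatenation context (Excel's &): behave like "".
        return "" if isinstance(other, Blank) else "" + str(other)

    def __rand__(self, other):
        return str(other) + ""

    def __repr__(self):
        return "Blank()"


A1 = Blank()
print(A1 == A1, A1 == "", A1 == 0, A1 == False)  # the equality cases
print(A1 + A1, -A1, repr(A1 & A1))               # the coercion cases
```

The class decides what to become from the operator it is used with, which is the "coerce to the expected operand type" behaviour; a plain None cannot do this because it defines no arithmetic or concatenation operators.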

Related

Lua/print table error : ( attempt to concatenate a table value)

I have some simple code like:
table = {}
print(table.."hello")
and got an error like the one in the title. I know I need to use tostring(table) to fix it. Why can't a table or another type be converted to a string automatically for concatenation, the way a number is?
print(table) works, but print(table.."hello") does not.
Does Lua have some rules for this?
Thank you.
Why can't a table or another type be converted to a string automatically for concatenation, the way a number is?
This is a deliberate choice made by the Lua language designers. Strings and numbers are coerced: Every operation that expects a string will also accept a number and tostring it; every operation that expects a number will also accept a string and tonumber it.
Concatenation is an operation applied to strings. Numbers will be tostringed; any other type won't. For other primitive types like bools and nils this is somewhat questionable, since they can be converted to strings without issue. For tables it's reasonable, though, since they are a reference type.
Unlike other languages, which make such decisions for you, Lua is highly metaprogrammable: you can simply override the decision! In this case, metatables are the solution, specifically the __concat metamethod, which gets called if concatenation (..) is applied to two values of which one has the metamethod (and is neither a string nor a number):
table = setmetatable({}, {
    __concat = function(left, right)
        if type(left) == "string" then
            return left .. tostring(right)
        end
        return tostring(left) .. tostring(right)
    end
})
print(table .. "hello") -- table: 0x563eb139bea0hello (the address varies)
You could even extend this to primitive types (nils, booleans), some other reference types (functions, coroutines) using debug.setmetatable, but I'd advise against this.
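The rule described above (strings and numbers coerce, everything else needs an explicit hook) can be modelled roughly in Python. This is only an illustration of the rule, not Lua's implementation: lua_concat, concat_hook, and Chatty are hypothetical names, and number formatting follows Python's str rather than Lua's.

```python
def lua_concat(left, right):
    """Rough model of Lua's `..` operator (illustration only)."""

    def coercible(value):
        # Only strings and numbers are coerced. bool is excluded explicitly
        # because Python treats bool as a subclass of int.
        return isinstance(value, (str, int, float)) and not isinstance(value, bool)

    if coercible(left) and coercible(right):
        return str(left) + str(right)

    # Otherwise look for a user-supplied hook, left operand first,
    # loosely mirroring the lookup of the __concat metamethod.
    for operand in (left, right):
        hook = getattr(operand, "concat_hook", None)
        if callable(hook):
            return hook(left, right)

    offender = right if coercible(left) else left
    raise TypeError(f"attempt to concatenate a {type(offender).__name__} value")


class Chatty:
    """Hypothetical type that opts in to concatenation."""

    @staticmethod
    def concat_hook(left, right):
        def as_text(value):
            return "<chatty>" if isinstance(value, Chatty) else str(value)

        return as_text(left) + as_text(right)


print(lua_concat("n = ", 42))         # n = 42
print(lua_concat(Chatty(), "hello"))  # <chatty>hello
```

Calling lua_concat({}, "hello") raises a TypeError, which corresponds to the error in the question title: a plain table has no hook unless one is installed.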
Declaring table = {} overwrites the global table library.
A table is neither a string nor a number, so concatenating it with .. must fail.
Try this instead:
mytab = {}
table.insert(mytab, "hello")
print(table.concat(mytab))
For the table library functions, see: https://www.lua.org/manual/5.4/manual.html#6.6

Ada vector of enumerated type

I am trying to create a vector of an enumerated type in Ada, but the compiler seems to expect an equality function overload. How do I tell the compiler to just use the default equality function? Here's what I have:
package HoursWorkedVector is new Ada.Containers.Vectors(Natural,DAY_OF_WEEK);
--where Day of week is defined as an enumeration
When I try to compile, I get the message:
no visible subprogram matches the specification for "="
Do I need to create a comparison function to have a vector of an enumerated type? Thanks in advance.
The definition of Ada.Containers.Vectors starts like this:
generic
   type Index_Type is range <>;
   type Element_Type is private;
   with function "=" (Left, Right : Element_Type)
      return Boolean is <>;
package Ada.Containers.Vectors is
The meaning of <> in a generic formal function is defined by RM 12.6(10):
If a generic unit has a subprogram_default specified by a box, and the
corresponding actual parameter is omitted, then it is equivalent to an
explicit actual parameter that is a usage name identical to the
defining name of the formal.
So if, as you said in the comments, DAY_OF_WEEK is defined in another package, your instantiation is equivalent to
package HoursWorkedVector is new Ada.Containers.Vectors(Natural, Other_Package.DAY_OF_WEEK, "=");
which doesn't work because the "=" that compares DAY_OF_WEEK values is not visible.
You can include Other_Package."=" in the instantiation, as suggested in a comment. There are at least three ways to make "=" visible, so that your original instantiation would work:
use Other_Package; This will make "=" directly visible, but it will also make everything else defined in that package directly visible. This may not be what you want.
use type Other_Package.DAY_OF_WEEK; This makes all the operators of DAY_OF_WEEK directly visible, including "<", "<=", etc., as well as all the enumeration literals, and any other primitive subprograms of DAY_OF_WEEK that you may have declared in Other_Package. This is probably the favorite solution, unless for some reason it would be a problem to make the enumeration literals visible.
Use a renaming declaration to redefine "=":
function "=" (Left, Right : DAY_OF_WEEK) return Boolean
renames Other_Package."=";
This makes "=" directly visible.
When the enumeration type is declared in the same scope as the instantiation, the compiler automatically selects the predefined equality operator:
with Ada.Containers.Vectors;

package Solution is
   type Day_Of_Week is (Work_Day, Holiday);

   package Hours_Worked_Vector is
     new Ada.Containers.Vectors (Index_Type   => Natural,
                                 Element_Type => Day_Of_Week);
end Solution;

Is --- Cobol picture valid

I'm running some tests on Cobol pictures and wondering if --- is a valid picture. Am I right in saying that this picture accepts values in the range of -99 through to +99? If it is valid, is it possible for the picture to accept 3 spaces as a value?
For example:
12 would return 12
1 would return 1
Cheers
Yes, --- is a valid PICTURE clause. The variable corresponding to this PICTURE will accept assignments of numeric values in the range -99 through to +99. It cannot be assigned non-numerics (spaces, for example). However, if you DISPLAY this variable after assigning a numeric value to it, leading zeros will be replaced by spaces. Consequently, if you MOVE ZERO to this item it will DISPLAY only spaces. Attempting to MOVE SPACES to this item will result in a compile error (incompatible data types). This last bit may seem a little counterintuitive, but remember that this type of PICTURE clause implies a USAGE of DISPLAY: items defined in this manner are basically used to 'pretty print' numbers. About the only operations you can perform with USAGE DISPLAY items are MOVE to or from them and DISPLAY them.
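The editing behaviour described above can be imitated in Python to see what each value would look like. This is a sketch for the --- picture only, with a hypothetical function name; it raises an error for out-of-range values as a simplification, which is an assumption of the sketch and not what a COBOL MOVE does.

```python
def edit_triple_minus(value: int) -> str:
    """Imitate how a PICTURE --- item would display an integer (sketch only)."""
    if not -99 <= value <= 99:
        # Simplification: the sketch rejects values outside the stated range.
        raise ValueError("value does not fit in PICTURE ---")
    if value == 0:
        # Zero is shown as spaces only.
        return "   "
    # The minus sign floats up against the first significant digit;
    # unused leading positions become spaces.
    text = ("-" if value < 0 else "") + str(abs(value))
    return text.rjust(3)


for sample in (12, 1, -1, -99, 0):
    print(repr(edit_triple_minus(sample)))
```

The result is always three characters wide, so 12 is shown with one leading space, 1 with two, and 0 as three spaces.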
EDIT - Response to Comment
A PICTURE of ---X(2) is invalid. The chart below illustrates combinations and the order that symbols may appear in a PICTURE string. Notice that parentheses are not in the chart. Logically, you can replace them with the corresponding number of occurrences of the preceding character before reading the string. For example, X(3) is read as XXX. If you really want to parse out a PICTURE string properly, you can use this chart to construct a BNF grammar specifically for them.
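The expansion step mentioned above (X(3) is read as XXX) is easy to sketch in Python. The function name is hypothetical, and the sketch only expands repetition counts; it does not check whether the resulting string is a valid PICTURE.

```python
import re


def expand_picture(picture: str) -> str:
    """Replace each 'c(n)' with n copies of the character c."""
    return re.sub(
        r"(.)\((\d+)\)",
        lambda match: match.group(1) * int(match.group(2)),
        picture,
    )


print(expand_picture("X(3)"))      # XXX
print(expand_picture("---X(2)"))   # ---XX
print(expand_picture("S9(4)V99"))  # S9999V99
```

The expanded string is what would then be checked against the symbol-order rules.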
If this is a numeric picture, it won't accept spaces.

In Cobol, to test "null or empty" we use "NOT = SPACE [ AND/OR ] LOW-VALUE" ? Which is it?

I am now working in mainframe,
in some modules, to test
Not null or Empty
we see :
NOT = SPACE OR LOW-VALUE
The chief says that we should do :
NOT = SPACE AND LOW-VALUE
Which one is it ?
Thanks!
Chief is correct.
COBOL is supposed to read something like natural language (this turns out to be just
another bad joke).
Let's play with the following variables and values:
A = 1
B = 2
C = 3
An expression such as:
IF A NOT EQUAL B THEN...
Is fairly straightforward to understand. One is not equal to two so we will do
whatever follows the THEN. However,
IF A NOT EQUAL B AND A NOT EQUAL C THEN...
Is a whole lot harder to follow. Again one is not equal to two AND one is not
equal to three so we will do whatever follows the 'THEN'.
COBOL has a shorthand construct that IMHO should never be used. It confuses just about
everyone (including me from time to time). Shorthand expressions let you reduce the above to:
IF A NOT EQUAL B AND C THEN...
or, if you would
like to apply De Morgan's law:
IF NOT (A EQUAL B OR C) THEN...
My advice to you is to avoid NOT in expressions and NEVER use COBOL shorthand expressions.
What you really want is:
IF X = SPACE OR X = LOW-VALUE THEN...
CONTINUE
ELSE
do whatever...
END-IF
The above does nothing when the 'X' contains either spaces or low-values (nulls). It
is exactly the same as:
IF NOT (X = SPACE OR X = LOW-VALUE) THEN
do whatever...
END-IF
Which can be transformed into:
IF X NOT = SPACE AND X NOT = LOW-VALUE THEN...
And finally...
IF X NOT = SPACE AND LOW-VALUE THEN...
My advice is to stick to longer, straightforward, easy-to-understand expressions
in COBOL, and forget the shorthand crap.
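The difference between the two variants from the question can be checked mechanically. The Python sketch below assumes a one-character field and expands each shorthand condition into its long form (the NOT = operator is carried over to the second operand); the function names are hypothetical.

```python
SPACE = " "
LOW_VALUE = "\x00"


def with_and(x):
    # X NOT = SPACE AND LOW-VALUE  ->  X NOT = SPACE AND X NOT = LOW-VALUE
    return x != SPACE and x != LOW_VALUE


def with_or(x):
    # X NOT = SPACE OR LOW-VALUE  ->  X NOT = SPACE OR X NOT = LOW-VALUE
    return x != SPACE or x != LOW_VALUE


def intended(x):
    # NOT (X = SPACE OR X = LOW-VALUE)
    return not (x == SPACE or x == LOW_VALUE)


for x in (SPACE, LOW_VALUE, "A"):
    print(repr(x), with_and(x), with_or(x), intended(x))
```

The AND form agrees with the intended test for every value, while the OR form is true for every value, because no field can equal both SPACE and LOW-VALUE at once.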
In COBOL, there is no such thing as a Java null AND it is never "empty".
For example, take a field
05 FIELD-1 PIC X(5).
The field will always contain something.
MOVE LOW-VALUES TO FIELD-1.
Now it contains hexadecimal zeros: x'0000000000'
MOVE HIGH-VALUES TO FIELD-1.
Now it contains all binary ones: x'FFFFFFFFFF'
MOVE SPACES TO FIELD-1.
Now each byte is a space. x'4040404040'
Once you declare a field, it points to a certain area in memory. That memory area always contains something: unless you initialize it, it will still hold whatever garbage it had before the program was loaded.
05 FIELD-1 PIC X(6) VALUE 'BARUCH'.
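The byte patterns quoted above can be reproduced in Python. The sketch assumes an EBCDIC machine and uses Python's cp037 codec as a stand-in for the mainframe code page; on an ASCII platform a space would be x'20' instead of x'40'.

```python
FIELD_LENGTH = 5  # matches PIC X(5)

low_values = bytes(FIELD_LENGTH)               # all bits zero
high_values = b"\xff" * FIELD_LENGTH           # all bits one
spaces = (" " * FIELD_LENGTH).encode("cp037")  # EBCDIC spaces (assumed code page)

for label, content in (
    ("LOW-VALUES", low_values),
    ("HIGH-VALUES", high_values),
    ("SPACES", spaces),
):
    print(f"{label:<12} x'{content.hex().upper()}'")
```

In every case the field holds five bytes of something; there is no representation for "nothing".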
It is worth noting that the value null is not always the same as low-value; this depends on the device architecture and on the character set in use, as determined by the manufacturer. Mainframes can have an entirely different collating sequence (low to high character code and symbol order) and symbol set compared to a device running Linux or Windows, as you have no doubt seen by now. The shorthand used in Cobol for comparisons is sometimes used for boolean operations, like IF A GOTO PAR-5 and IF A OR C THEN ...., and can be combined with comparisons of two variables or of a variable and a literal value. The parser and compiler on different devices should deal with these situations in a standard (ANSI) way, but this is not always the case.
I agree with NealB. Keep it simple, avoid "short cuts", make it easy to understand without having to refer to the manual to check things out.
IF ( X EQUAL TO SPACE )
OR ( X EQUAL TO LOW-VALUES )
CONTINUE
ELSE
do whatever...
END-IF
However, why not put an 88 on X, and keep it really simple?:
88 X-HAS-A-VALUE-INDICATING-NULL-OR-EMPTY VALUE SPACE, LOW-VALUES.
IF X-HAS-A-VALUE-INDICATING-NULL-OR-EMPTY
CONTINUE
ELSE
do whatever...
END-IF
Note, in Mainframe Cobol, NULL is very restricted in meaning, and is not the meaning that you are attributing to it, Tom. "Empty" only means something in a particular coder-generated context (it means nothing to Cobol as far as a field is concerned).
We don't have "strings". Therefore, we don't have "null strings" (a string of length one including string-terminator). We don't have strings, so a field always has a value, so it can never be "empty" other than as termed by the programmer.
Oguz, I think your post illustrates how complex something that is really simple can be made, and how that can lead to errors. Can you test your conditions, please?

What can you NOT use an identifier for?

I'm trying to understand what identifiers represent and what they don't represent.
As I understand it, an identifier is a name for a method, a constant, a variable, a class, a package/module. It covers a lot. But what can you not use it for?
Every language differs in terms of what entities/abstractions can or cannot be named and reused in that language.
In most languages, you can't use an identifier for infix arithmetic operations.
For example, plus is an identifier and you can make a function named plus. But while you can write a = b + c;, there's no way to define an operator named plus to make a = b plus c; work, because the language grammar simply does not allow an identifier there.
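Python is one language where this can be checked directly. In the sketch below, parses is a hypothetical helper that asks the compiler whether a piece of source text is grammatical; nothing is executed.

```python
def plus(b, c):
    # An ordinary function may be named plus.
    return b + c


def parses(source: str) -> bool:
    """Return True if the text is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(plus(1, 2))                # 3
print(parses("a = b + c"))       # True
print(parses("a = plus(b, c)"))  # True
print(parses("a = b plus c"))    # False
```

The last line is rejected by the parser before any name lookup happens, so defining plus differently cannot make it work.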
An identifier allows you to assign a name to some data, so that you can reference it later. That is the limit of what identifiers do; you cannot "use" it for anything other than a reference to some data.
That said, there are a lot of implications that come from this, some subtle. For example, in most languages functions are, to some degree or another, considered to be data, and so a function name is an identifier. In languages where functions are values, but not "first-class" values, you can't use an identifier for a function in every place you could use an identifier for something else. In some languages, there will even be separate namespaces for functions and other data, and so what is textually the same identifier might refer to two different things, and they would be distinguished by the context in which they are used.
An example of what you usually (i.e., in most languages) cannot use an identifier for is as a reference to a language keyword. For example, this sort of thing generally can't be done:
let during = while;
during (true) { print("Hello, world."); }
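The same restriction holds in Python and can be observed without running the offending code. The helper name can_alias is hypothetical; keyword.iskeyword comes from the standard library.

```python
import keyword


def can_alias(name: str) -> bool:
    """Return True if `alias = <name>` is syntactically valid Python."""
    try:
        compile(f"alias = {name}", "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(keyword.iskeyword("while"))  # True
print(can_alias("while"))          # False: a keyword is not a value
print(can_alias("print"))          # True: print is an ordinary identifier
```

A keyword is part of the grammar and has no value of its own, so there is nothing for the new identifier to refer to.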
You could say it's used for everything that you'll want to refer to multiple times, or maybe even once (but use it to clarify the referent's purpose).
What can/can't be named differs per language, it's often quite intuitive, IMHO.
An "Anonymous" entity is something which is not named, although referred to somehow.
#!/usr/bin/perl
$subroutine = sub { return "Anonymous subroutine returning this text"; }
In Perl-speak, this is anonymous - the subroutine is not named, but it is referred to by the reference variable $subroutine.
PS: In Perl, the subroutine would be named like this:
sub NAME_HERE {
# some code...
}
Say, in Java you cannot write something like:
Object myIf = if;
myIf (a == b) {
System.out.println("True!");
}
So, you cannot name some code statement, giving it an alias. While in REBOL it is perfectly possible:
myIf: if
myIf a = b [print "True!"]
What can and what can't be named depends on language, as you see.
As its name implies, an identifier is used to identify something. So for everything that can be identified uniquely, you can use an identifier. But a literal (e.g. a string literal), for example, is not unique, so you can't use an identifier for it. However, you can create a variable and assign a string literal to it.
Making soup out of them is rather foul.
In languages such as Lisp, an identifier exists in its own right as a symbol, whereas in languages which are not introspective, identifiers don't exist at runtime.
You write a literal identifier/symbol by putting a single quote in front of it:
[1]> 'a
A
You can create a variable and assign a symbol literal to it:
[2]> (setf a 'Hello)
HELLO
[3]> a
HELLO
[4]> (print a)
HELLO
HELLO
You can set two variables to the same symbol
[10]> (setf b a)
HELLO
[11]> b
HELLO
[12]> a
HELLO
[13]> (eq b a)
T
[14]> (eq b 'Hello)
T
Note that the values bound to b and a are the same, and the value is the literal symbol 'Hello
You can bind a function to the symbol
[15]> (defun hello () (print 'hello))
HELLO
and call it:
[16]> (hello)
HELLO
HELLO
In Common Lisp, the variable binding and the function binding are distinct:
[19]> (setf hello 'goodbye)
GOODBYE
[20]> hello
GOODBYE
[21]> (hello)
HELLO
HELLO
but in Scheme or JavaScript the bindings are in the same namespace.
There are many other things you can do with identifiers if they are reified as symbols. I suspect that someone more knowledgeable than me in Lisp will be able to demonstrate that any of the things you 'can't do with identifiers' can in fact be done there.
But even Lisp cannot make identifier soup.
Sort of a left-field thought, but JSON has all those quotations in it to eliminate the danger of a JavaScript keyword messing up the parsing.
