How can I indicate an error during a parse operation? - parsing

Within the scripting language I am implementing, valid IDs can consist of a sequence of numbers, which means I have an ambiguous situation where "345" could be an integer, or could be an ID, and that's not known until runtime. Up until now, I've been handling every case as an ID and planning to handle the check for whether a variable has been declared under that name at runtime, but when I was improving my implementation of a particular bit of code, I found that there was a situation where an integer is valid, but any other sort of ID would not be. It seems like it would make sense to handle this particular case as a parsing error so that, e.g., the following bit of code that activates all picks with a spell level tag greater than 5 would be considered valid:
foreach pick in hero where spell.level? > 5
pick.activate[]
nexteach
but the following which instead compares against an ID that can't be mistaken for an integer constant would be flagged as an error during parsing:
foreach pick in hero where spell.level? > threshold
pick.activate[]
nexteach
I've considered separate tokens, ID and ID_OR_INTEGER, but that means having to handle that ambiguity everywhere I'm currently using an ID, which is a lot of places, including variable declarations, expressions, looping structures, and procedure calls.
Is there a better way to indicate a parsing error than to just print to the error log, and maybe set a flag?

I would think about it differently. If an ID is "just a number" and plain numbers are also needed, I would say any string of digits is a number, and a number might designate an ID in some circumstances.
For bare integer literals (like 345), I would have the tokenizer return maybe a NUMBER token, indicating it found an integer. In the parser, wherever you currently accept ID, change it to NUMBER, and call a lookup function to verify the "NUMBER" is a valid ID.
I might have misunderstood your question. You start by talking about "345", but your second example has no integer strings.

Related

What is the convention to document types used in Lua?

I come from the strongly typed world and I want to write some Lua code. How should I document what type things are? What do Lua natives do? Hungarian notation? Something else?
For example:
local insert = function(what, where, offset)
It's impossible to tell at a glance whether we're talking about strings or tables here.
Should I do
local sInsert = function(sWhat, sWhere, nOffset)
or
-- string what, string where, number offset, return string
local insert = function(what, where, offset)
or something else?
What about local variables? What about table entries (e.g. someThing.someProperty)?
For a reference on thoughts and opinions on Lua style in the community (or a particular community?), read this: LuaStyleGuide.
The closest one could get to an enforced style would be the format used by LuaDoc, as it's a fairly popular documentation generator used by high profile projects such as LuaFileSystem.
There are only seven types in Lua.
Here are some conventions (some of them might sound a bit obvious; sorry):
Anything that sounds like a string, should be a string: street_address, request_method. If you are not sure you can add _name (or any other suffix that makes clear it's a substantive) to it: method_name
Anything that sounds like a number, should be a number: mass, temperature, percentage. When in doubt, add number, amount, coefficient, or whatever fits : number_of_children, user_id. The names n and i are usually given to numbers. If a number must be positive or natural, make assertions at the top of the function.
Boolean parameters are either an adjective (cold, dirty) or is_<adjective> (is_wet, is_ready).
Anything that sounds like a verb should be a function: consume, check. You can add _function, _callback or _f if you need to clarify it further: update_function, post_callback. The single letter f represents a function quite often. And usually you should only have one parameter of type function (recommended to put it at the end)
Anything that sounds like a collection should be a table: children, words, dictionary. People typically don't differentiate between array-like tables and dictionary-like tables, since both can be parsed with pairs. If you need to specify that a table is an array, you could add _array or _sequence at the end of the name. The letter t typically means table.
Coroutines are not used quite often; you can follow the same rules as with functions, and you can also add _cor to their names.
Any value can be nil.
If it's an optional value, initialize it at the top of the function: options = options or {}
If it's a mandatory value, make an assertion (or return error): assert(name, "The name is mandatory")

What's the difference between luaL_checknumber and lua_tonumber?

Clarification (sorry the question was not specific): They both try to convert the item on the stack to a lua_Number. lua_tonumber will also convert a string that represents a number. How does luaL_checknumber deal with something that's not a number?
There's also luaL_checklong and luaL_checkinteger. Are they the same as (int)luaL_checknumber and (long)luaL_checknumber respectively?
The reference manual does answer this question. I'm citing the Lua 5.2 Reference Manual, but similar text is found in the 5.1 manual as well. The manual is, however, quite terse. It is rare for any single fact to be restated in more than one sentence. Furthermore, you often need to correlate facts stated in widely separated sections to understand the deeper implications of an API function.
This is not a defect, it is by design. This is the reference manual to the language, and as such its primary goal is to completely (and correctly) describe the language.
For more information about "how" and "why" the general advice is to also read Programming in Lua. The online copy is getting rather long in the tooth as it describes Lua 5.0. The current paper edition describes Lua 5.1, and a new edition describing Lua 5.2 is in process. That said, even the first edition is a good resource, as long as you also pay attention to what has changed in the language since version 5.0.
The reference manual has a fair amount to say about the luaL_check* family of functions.
Each API entry's documentation block is accompanied by a token that describes its use of the stack, and under what conditions (if any) it will throw an error. Those tokens are described at section 4.8:
Each function has an indicator like this: [-o, +p, x]
The first field, o, is how many elements the function pops from the
stack. The second field, p, is how many elements the function pushes
onto the stack. (Any function always pushes its results after popping
its arguments.) A field in the form x|y means the function can push
(or pop) x or y elements, depending on the situation; an interrogation
mark '?' means that we cannot know how many elements the function
pops/pushes by looking only at its arguments (e.g., they may depend on
what is on the stack). The third field, x, tells whether the function
may throw errors: '-' means the function never throws any error; 'e'
means the function may throw errors; 'v' means the function may throw
an error on purpose.
At the head of Chapter 5 which documents the auxiliary library as a whole (all functions in the official API whose names begin with luaL_ rather than just lua_) we find this:
Several functions in the auxiliary library are used to check C
function arguments. Because the error message is formatted for
arguments (e.g., "bad argument #1"), you should not use these
functions for other stack values.
Functions called luaL_check* always throw an error if the check is not
satisfied.
The function luaL_checknumber is documented with the token [-0,+0,v] which means that it does not disturb the stack (it pops nothing and pushes nothing) and that it might deliberately throw an error.
The other functions that have more specific numeric types differ primarily in function signature. All are described similarly to luaL_checkint() "Checks whether the function argument arg is a number and returns this number cast to an int", varying the type named in the cast as appropriate.
The function lua_tonumber() is described with the token [-0,+0,-] meaning it has no effect on the stack and does not throw any errors. It is documented to return the numeric value from the specified stack index, or 0 if the stack index does not contain something sufficiently numeric. It is documented to use the more general function lua_tonumberx() which also provides a flag indicating whether it successfully converted a number or not.
It too has siblings named with more specific numeric types that do all the same conversions but cast their results.
Finally, one can also refer to the source code, with the understanding that the manual is describing the language as it is intended to be, while the source is a particular implementation of that language and might have bugs, or might reveal implementation details that are subject to change in future versions.
The source to luaL_checknumber() is in lauxlib.c. It can be seen to be implemented in terms of lua_tonumberx() and the internal function tagerror() which calls typerror() which is implemented with luaL_argerror() to actually throw the formatted error message.
They both try to convert the item on the stack to a lua_Number. lua_tonumber will also convert a string that represents a number. luaL_checknumber throws a (Lua) error when it fails a conversion - it long jumps and never returns from the POV of the C function. lua_tonumber merely returns 0 (which can be a valid return as well.) So you could write this code which should be faster than checking with lua_isnumber first.
double r = lua_tonumber(_L, idx);
if (r == 0 && !lua_isnumber(_L, idx))
{
// Error handling code
}
return r;

fortran save integer

I came across some code today that looked somewhat like this:
subroutine foo()
real blah
integer bar,k,i,j,ll
integer :: n_called=1
save integer
...
end
It seems like the intent here was probably save n_called, but is that even a valid statment to save all integers -- or is it implicitly declaring a variable named integer and saving it?
The second interpretation is correct. Fortran has many keywords, INTEGER being one of them, but it has no reserved words, which means that keywords can be used as identifiers, though this is usually a terrible idea (but nevertheless it carries on to C# where one can prefix a keyword with # and use it as an identifier, right?)
The SAVE statement, even if it was intended for n_called is superficial. Fortran automatically saves all variables that have initialisers and that's why the code probably works as intended.
integer :: n_called=1
Here n_called is automatically SAVE. This usually comes as a really bad surprise to C/C++ programmers forced to maintain/extend/create new Fortran code :)
I agree with your 2nd interpretation, that is, the statement save integer implicitly declares a variable called integer and gives it the save attribute. Fortran, of course, has no rule against using keywords as program entity names, though most sensible software developers do have such a rule.
If I try to compile your code snippet as you have presented it, my compiler (Intel Fortran) makes no complaint. If I insert implicit none at the right place it reports the error
This name does not have a type, and must have an explicit type. [INTEGER]
The other interpretation, that it gives the save attribute to all integer variables, seems at odds with the language standards and it's not a variation that I've ever come across.

Stored procedure parameter data type that allows alphanumeric values

I have a stored procedure for SQL 2000 that has an input parameter with a data type of varchar(17) to handle a vehicle identifier (VIN) that is alphanumeric. However, whenever I enter a value for the parameter when executing that has a numerical digit in it, it gives me an error. It appears to only accept alphabetic characters. What am I doing wrong here?
Based on comments, there is a subtle "feature" of SQL Server that allows letters a-z to be used as stored proc parameters without delimiters. It's been there forever (since 6.5 at least)
I'm not sure of the full rules, but it's demonstrated in MSDN (rename SQL Server etc): there are no delimiters around the "local" parameter. And I just found this KB article on it
In this case, it could be starting with a number that breaks. I assume it works for a contained number (but as I said I'm not sure of the full rules).
Edit: confirmed by Martin as "breaks with leading number", OK for "containing number"
This doesn't help much, but somewhere, you have a bug, typo, or oversight in your code. I spent 2+ years working with VINs as parameters, and other than regretting not having made it char(17) instead ov varchar(17), we never had any problems passing in alphanumeric VIN values. Somewhere, and I'd guess it's in the application layer, something is not liking digits -- perhaps a filter looking for only alphabetical characters?

Duh? help with f# option types

I am having a brain freeze on f#'s option types. I have 3 books and read all I can but I am not getting them.
Does someone have a clear and concise explanation and maybe a real world example?
TIA
Gary
Brian's answer has been rated as the best explanation of option types, so you should probably read it :-). I'll try to write a more concise explanation using a simple F# example...
Let's say you have a database of products and you want a function that searches the database and returns product with a specified name. What should the function do when there is no such product? When using null, the code could look like this:
Product p = GetProduct(name);
if (p != null)
Console.WriteLine(p.Description);
A problem with this approach is that you are not forced to perform the check, so you can easily write code that will throw an unexpected exception when product is not found:
Product p = GetProduct(name);
Console.WriteLine(p.Description);
When using option type, you're making the possibility of missing value explicit. Types defined in F# cannot have a null value and when you want to write a function that may or may not return value, you cannot return Product - instead you need to return option<Product>, so the above code would look like this (I added type annotations, so that you can see types):
let (p:option<Product>) = GetProduct(name)
match p with
| Some prod -> Console.WriteLine(prod.Description)
| None -> () // No product found
You cannot directly access the Description property, because the reuslt of the search is not Product. To get the actual Product value, you need to use pattern matching, which forces you to handle the case when a value is missing.
Summary. To summarize, the purpose of option type is to make the aspect of "missing value" explicit in the type and to force you to check whether a value is available each time you work with values that may possibly be missing.
See,
http://msdn.microsoft.com/en-us/library/dd233245.aspx
The intuition behind the option type is that it "implements" a null-value. But in contrast to null, you have to explicitly require that a value can be null, whereas in most other languages, references can be null by default. There is a similarity to SQLs NULL/NOT NULL if you are familiar with those.
Why is this clever? It is clever because the language can assume that no output of any expression can ever be null. Hence, it can eliminate all null-pointer checks from the code, yielding a lot of extra speed. Furthermore, it unties the programmer from having to check for the null-case all the same, should he or she want to produce safe code.
For the few cases where a program does require a null value, the option type exist. As an example, consider a function which asks for a key inside an .ini file. The key returned is an integer, but the .ini file might not contain the key. In this case, it does make sense to return 'null' if the key is not to be found. None of the integer values are useful - the user might have entered exactly this integer value in the file. Hence, we need to 'lift' the domain of integers and give it a new value representing "no information", i.e., the null. So we wrap the 'int' to an 'int option'. Now, if there is no integer value we will get 'None' and if there is an integer value, we will get 'Some(N)' where N is the integer value in question.
There are two beautiful consequences of the choice. One, we can use the general pattern match features of F# to discriminate the values in e.g., a case expression. Two, the framework of algebraic datatypes used to define the option type is exposed to the programmer. That is, if there were no option type in F# we could have created it ourselves!

Resources