FPGA indexing of a non-uniformly spaced look-up table - signal-processing

We are trying to implement a fixed-point nonlinear mathematical function on an FPGA. We want to achieve very low latency (2-4 clock cycles max), have the computation pipelined so that we can produce a new answer every clock cycle (no dropped inputs, since they arrive every clock cycle), have decent accuracy, AND have reasonable FPGA resource utilization.
We performed the computation using a combination of CORDIC computers and DSP blocks, which gave a pretty good solution, except that the CORDIC computers required about 12 clock cycles to reach good accuracy.
Using a LUT without interpolation would require far too many RAMs, since our input is 32 bits wide, so we threw that out.
Our next option was a look-up table with interpolation. The latency was good because we could index the LUT directly using the upper bits of the input value. The problem was that the accuracy wasn't very good in the non-linear sections.
We are now trying a LUT with non-uniform spacing between the samples. Basically, we sample the function more densely in the non-linear portions and less where the function looks more linear. This should improve our accuracy a lot, but we now face the problem that we can't index the LUT directly with the upper bits of the input value. We investigated doing a binary search to find the index, but the latency suffered. Resource utilization wasn't great either, because in order to keep producing an answer on the output every clock cycle, we had to replicate our LUTs across the pipeline stages just to handle the binary search. We tried a few tricks like using dual-port RAMs, but the latency is still a killer.
So we are wondering if anyone has had a similar problem and knows of a good indexing solution, or if there are special/smart ways to sample our function non-uniformly and build the LUT in such a way that indexing can still be computed fairly quickly.

Let's say your function performs
y = f(x)
you could first compress x with a piece-wise linear
z = g(x)
implemented as a small LUT + linear interpolation (just taking into account the few most significant bits of x) such that you dedicate more memory to the interesting regions of your function and less to the almost constant regions.
Then you calculate y as
y = f(x)
= h(z)
= h(g(x))
where h would be a pre-processed, "warped" version of your original table:
h(z) = f(g'(z))
and
g'(g(x)) = x (that is, g' is the inverse of g)
Then you would still use only LUTs on the first few bits plus linear interpolation, but in two stages:
+--------+ +--------+
| LUT #1 | | LUT #2 |
| | | |
-- x -->| g(x) |-- z -->| h(z) |--> y
| | | |
| | | |
+--------+ +--------+
A rough sketch of what this could look like:
y = f(x)
^
| : : ..:..............:
| : : ........ : :
| : : .. : :
| : :.. : :
| : ...: : :
| :........... : : :
|..............: : : :
+-----------------------------------------------------------+->
| | | | |
z = g(x)
^
| : : ...O..............O
| : : .... : :
| : : .... : :
| : ...O... : :
| : .... : : :
| : .... : : :
O..............O... : : :
+-----------------------------------------------------------+->
| | | | |
y = h(z)
^
| : : ..:...:
| : : ......... : :
| : : ....... : :
| : :...... : :
| : ........: : :
| :.................. : : :
|...: : : :
+-----------------------------------------------------------+->
| | | | |
The interesting problem is then of course how to find the optimal g(x) such that the worst-case (or average-case) error is minimized.
But that is something you can do offline, or even analytically.
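A rough numerical model of this two-stage scheme may help. This is Python rather than HDL, and everything here is an illustrative assumption: sqrt stands in for f, sqrt is also used as the warp g (denser sampling near 0, where f bends hardest), and the table sizes are arbitrary.
import numpy as np

def f(x):
    return np.sqrt(x)  # stand-in for the nonlinear function

def stage(table, x):
    # One LUT + linear-interpolation stage on a uniform grid over [0, 1].
    # In fixed point, the segment index is just the top bits of the input.
    x = np.asarray(x, dtype=float)
    pos = x * (len(table) - 1)
    i = np.minimum(pos.astype(int), len(table) - 2)
    frac = pos - i
    return table[i] * (1.0 - frac) + table[i + 1] * frac

# Offline construction of the two tables.
N_G, N_H = 9, 256                                 # illustrative sizes
x_knots = np.linspace(0.0, 1.0, N_G)
g_table = np.sqrt(x_knots)                        # g: steep where f is steep
z_grid = np.linspace(0.0, 1.0, N_H)
x_of_z = np.interp(z_grid, g_table, x_knots)      # g', the inverse of g
h_table = f(x_of_z)                               # h(z) = f(g'(z))

def f_approx(x):
    # Runtime: two pipelined LUT stages, both directly indexed.
    return stage(h_table, stage(g_table, x))

xs = np.linspace(0.0, 1.0, 10001)
plain = stage(f(np.linspace(0.0, 1.0, N_H)), xs)  # single uniform LUT, same size
print("uniform LUT max error:", np.abs(plain - f(xs)).max())
print("two-stage   max error:", np.abs(f_approx(xs) - f(xs)).max())
Note that if g is chosen close to f itself (possible whenever f is monotone), h becomes nearly linear, which is why the warping buys accuracy.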

Related

Converting given ambiguous arithmetic expression grammar to unambiguous LL(1)

This term I have a course on Compilers, and we are currently studying syntax - different grammars and types of parsers. I came across a problem which I can't quite figure out, or at least I can't be sure I'm doing it correctly. I already made 2 attempts and counterexamples were found for both.
I am given this ambiguous grammar for arithmetic expressions:
E → E+E | E-E | E*E | E/E | E^E | -E | (E)| id | num , where ^ stands for power.
I figured out what the priorities should be. Highest priority are parentheses, followed by power, then unary minus, then multiplication and division, and finally addition and subtraction. I am asked to convert this into an equivalent LL(1) grammar. So I wrote this:
E → E+A | E-A | A
A → A*B | A/B | B
B → -C | C
C → D^C | D
D → (E) | id | num
The problem seems to be that this grammar is not equivalent to the given one, although it is unambiguous. For example, the given grammar can recognize the input --5 while mine can't. How can I make sure I'm covering all cases? How should I modify my grammar to be equivalent to the given one? Thanks in advance.
Edit: Also, I would of course eliminate left recursion and apply left factoring to make this LL(1), but first I need to figure out the main part I asked about above.
Here's one that should work for your case
E = E+A | E-A | A
A = A*C | A/C | C
C = C^B | B
B = -B | D
D = (E) | id | num
As a side note: also pay attention to the requirements of your task, since some applications might assign the unary minus operator higher priority than the binary power operator.
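Once the left recursion is eliminated (as the question's edit anticipates), this grammar maps directly onto a recursive-descent parser, with each left-recursive rule becoming a loop. A minimal Python sketch of that, where the tokenizer and the tuple-shaped parse trees are my own illustrative choices:
import re

def tokenize(src):
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/^()]", src)

class Parser:
    def __init__(self, tokens):
        self.toks, self.pos = tokens, 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def eat(self, tok=None):
        t = self.peek()
        if tok is not None and t != tok:
            raise SyntaxError(f"expected {tok!r}, got {t!r}")
        self.pos += 1
        return t

    def expr(self):                      # E = E+A | E-A | A
        node = self.term()
        while self.peek() in ("+", "-"):
            node = (self.eat(), node, self.term())
        return node

    def term(self):                      # A = A*C | A/C | C
        node = self.power()
        while self.peek() in ("*", "/"):
            node = (self.eat(), node, self.power())
        return node

    def power(self):                     # C = C^B | B (left-associative here)
        node = self.unary()
        while self.peek() == "^":
            node = (self.eat(), node, self.unary())
        return node

    def unary(self):                     # B = -B | D, so --5 now parses
        if self.peek() == "-":
            return (self.eat(), self.unary())
        return self.primary()

    def primary(self):                   # D = (E) | id | num
        if self.peek() == "(":
            self.eat("(")
            node = self.expr()
            self.eat(")")
            return node
        return self.eat()                # id or num (unchecked in this sketch)

print(Parser(tokenize("--5 + 2*3^2")).expr())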

Can a table-based LL parser handle repetition without right-recursion?

I understand how an LL recursive descent parser can handle rules of this form:
A = B*;
with a simple loop that checks whether to continue looping or not based on whether the lookahead token matches a terminal in the FIRST set of B. However, I'm curious about table based LL parsers: how can rules of this form work there? As far as I know, the only way to handle repetition like this in one is through right-recursion, but that messes up associativity in cases where a right-associative parse tree is not desired.
I'd like to know because I'm currently attempting to write an LL(1) table-based parser generator and I'm not sure how to handle a case like this without changing the intended parse tree shape.
The Grammar
Let's expand your EBNF grammar to simple BNF and assume that b is a terminal and <e> is the empty string:
A -> X
X -> BX
X -> <e>
B -> b
This grammar produces strings of terminal b's of any length.
The LL(1) Table
To construct the table, we will need to generate the first and follow sets (constructing an LL(1) parsing table).
First sets
First(α) is the set of terminals that begin strings derived from any string of grammar symbols α.
First(A) : b, <e>
First(X) : b, <e>
First(B) : b
Follow sets
Follow(A) is the set of terminals a that can appear immediately to the right of a nonterminal A.
Follow(A) : $
Follow(X) : $
Follow(B) : b, $
Table
We can now construct the table based on the sets, $ is the end of input marker.
+---+---------+----------+
| | b | $ |
+---+---------+----------+
| A | A -> X | A -> X |
| X | X -> BX | X -> <e> |
| B | B -> b | |
+---+---------+----------+
The parser action always depends on the top of the parse stack and the next input symbol.
Terminal on top of the parse stack:
Matches the input symbol: pop stack, advance to the next input symbol
No match: parse error
Nonterminal on top of the parse stack:
Parse table contains production: apply production to stack
Cell is empty: parse error
$ on top of the parse stack:
$ is the input symbol: accept input
$ is not the input symbol: parse error
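Put together, the driver is only a few lines. Here is a minimal Python sketch with the table above hard-coded (the dictionary encoding and names are illustrative, not from any particular tool):
TABLE = {
    ("A", "b"): ["X"], ("A", "$"): ["X"],
    ("X", "b"): ["B", "X"], ("X", "$"): [],   # X -> <e>
    ("B", "b"): ["b"],
}
NONTERMINALS = {"A", "X", "B"}

def parse(tokens):
    input_syms = list(tokens) + ["$"]
    stack = ["$", "A"]                    # start symbol on top
    while stack:
        top, lookahead = stack[-1], input_syms[0]
        if top == lookahead == "$":
            return True                   # accept
        if top not in NONTERMINALS:       # terminal on top of the stack
            if top != lookahead:
                raise SyntaxError(f"expected {top!r}, got {lookahead!r}")
            stack.pop()
            input_syms.pop(0)             # consume the input symbol
        else:                             # nonterminal: consult the table
            rhs = TABLE.get((top, lookahead))
            if rhs is None:
                raise SyntaxError(f"no rule for ({top}, {lookahead})")
            stack.pop()
            stack.extend(reversed(rhs))   # apply production to the stack

print(parse("bb"))    # True
print(parse(""))      # True: zero repetitions are also accepted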
Sample Parse
Let us analyze the input bb. The initial parse stack contains the start symbol and the end marker A $.
+-------+-------+-----------+
| Stack | Input | Action |
+-------+-------+-----------+
| A $ | bb$ | A -> X |
| X $ | bb$ | X -> BX |
| B X $ | bb$ | B -> b |
| b X $ | bb$ | consume b |
| X $ | b$ | X -> BX |
| B X $ | b$ | B -> b |
| b X $ | b$ | consume b |
| X $ | $ | X -> <e> |
| $ | $ | accept |
+-------+-------+-----------+
Conclusion
As you can see, rules of the form A = B* can be parsed without problems. The resulting concrete parse tree for input bb would be:
A
\-X
  |-B
  | \-b
  \-X
    |-B
    | \-b
    \-X
      \-<e>
Yes, this is definitely possible. The standard method of rewriting to BNF and constructing a parse table is useful for figuring out how the parser should work – but as far as I can tell, what you're asking is how you can avoid the recursive part, which would mean that you'd get the slanted binary tree/linked list form of AST.
If you're hand-coding the parser, you can simply use a loop, using the lookaheads from the parse table that indicate a recursive call to decide to go around the loop once more. (I.e., you could just use while with those lookaheads as the condition.) Then for each iteration, you simply append the constructed subtree as a child of the current parent. In your case, then, A would get several direct B-children.
Now, as I understand it, you're building a parser generator, and it might be easiest to follow the standard procedure, going via plain BNF. That's not really an issue, however; there is no substantive difference between iteration and recursion, after all. You simply have to have a class of "helper rules" that don't introduce new AST nodes, but rather append their result to the node of the nonterminal that triggered them. So when turning the repetition into X -> BX, rather than constructing X nodes, you have your X rule extend the child list of the A or X (whichever triggered it) with its own children. You'll still end up with A having several B children, and no X nodes in sight.
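For example, a tiny Python sketch of that idea (names and the tuple AST are illustrative): the X helper never creates a node; the A rule just loops while the lookahead is in FIRST(B) and appends each B subtree directly.
FIRST_B = {"b"}

def parse_B(toks):
    tok = toks.pop(0)                    # B -> b
    assert tok == "b"
    return ("B", tok)

def parse_A(toks):
    children = []
    while toks and toks[0] in FIRST_B:   # A = B* as a plain while loop
        children.append(parse_B(toks))   # append, don't build X nodes
    return ("A", children)

print(parse_A(list("bbb")))
# ('A', [('B', 'b'), ('B', 'b'), ('B', 'b')]) -- flat, no X in sight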

Location in syntax trees

When writing a parser, I want to remember the locations of the lexemes found, so that I can report useful error messages to the programmer, as in "if-less else on line 23" or "unexpected character on line 45, character 6" or "variable not defined" or something similar. But once I have built the syntax tree, I will transform it in several ways, optimizing or expanding some kinds of macros. The transformations produce or rearrange lexemes which do not have a meaningful location.
Therefore it seems that the type representing the syntax tree should come in two flavors: a flavor with locations decorating the lexemes, and a flavor without. Ideally we would like to work with a purely abstract syntax tree, as defined in the OCaml book:
# type unr_op = UMINUS | NOT ;;
# type bin_op = PLUS | MINUS | MULT | DIV | MOD
             | EQUAL | LESS | LESSEQ | GREAT | GREATEQ | DIFF
             | AND | OR ;;
# type expression =
    ExpInt of int
  | ExpVar of string
  | ExpStr of string
  | ExpUnr of unr_op * expression
  | ExpBin of expression * bin_op * expression ;;
# type command =
    Rem of string
  | Goto of int
  | Print of expression
  | Input of string
  | If of expression * int
  | Let of string * expression ;;
# type line = { num : int ; cmd : command } ;;
# type program = line list ;;
We should be allowed to totally forget about locations when working on that tree and have special functions to map an expression back to its location (for instance), that we could use in case of emergency.
What is the best way to define such a type in OCaml or to handle lexeme positions?
The best way is to work always with AST nodes fully annotated with the locations. For example:
type expression = {
  expr_desc : expr_desc;
  expr_loc : Lexing.position * Lexing.position; (* start and end *)
}
and expr_desc =
  | ExpInt of int
  | ExpVar of string
  | ExpStr of string
  | ExpUnr of unr_op * expression
  | ExpBin of expression * bin_op * expression
Your idea of keeping the AST free of locations and writing a function to retrieve the missing locations is not a good one, I believe. Such a function would require searching by pointer equivalence of AST nodes or something similar, which does not really scale.
I strongly recommend looking through the OCaml compiler's parser.mly, which is a full-scale example of an AST with locations.

Strange problem with context free grammar

I begin with an otherwise well-formed (and well-working) grammar for a language: variables, binary operators, function calls, lists, loops, conditionals, etc. To this grammar I'd like to add what I'm calling the object construct:
object
: object_name ARROW more_objects
;
more_objects
: object_name
| object_name ARROW more_objects
;
object_name
: IDENTIFIER
;
The point is to be able to access scalars nested in objects. For example:
car->color
monster->weapon->damage
pc->tower->motherboard->socket_type
I'm adding object as a primary_expression:
primary_expression
: id_lookup
| constant_value
| '(' expression ')'
| list_initialization
| function_call
| object
;
Now here's a sample script:
const list = [ 1, 2, 3, 4 ];
for var x in list {
send "foo " + x + "!";
}
send "Done!";
Prior to adding the nonterminal object as a primary_expression everything is sunshine and puppies. Even after I add it, Bison doesn't complain. No shift and/or reduce conflicts reported. And the generated code compiles without a sound. But when I try to run the sample script above, I get told error on line 2: Attempting to use undefined symbol '{' on line 2.
If I change the script to:
var list = 0;
for var x in [ 1, 2, 3, 4 ] {
send "foo " + x + "!";
}
send "Done!";
Then I get error on line 3: Attempting to use undefined symbol '+' on line 3.
Clearly the presence of object in the grammar is messing up how the parser behaves [SOMEhow], and I feel like I'm ignoring a rather simple principle of language theory that would fix this in a jiff, but the fact that there aren't any shift/reduce conflicts has left me bewildered.
Is there a better way (grammatically) to write these rules? What am I missing? Why aren't there any conflicts?
(And here's the full grammar file in case it helps)
UPDATE: To clarify, this language, which compiles into code being run by a virtual machine, is embedded into another system - a game, specifically. It has scalars and lists, and there are no complex data types. When I say I want to add objects to the language, that's actually a misnomer. I am not adding support for user-defined types to my language.
The objects being accessed with the object construct are actually objects from the game which I'm allowing the language processor to access through an intermediate layer which connects the VM to the game engine. This layer is designed to decouple as much as possible the language definition and the virtual machine mechanics from the implementation and details of the game engine.
So when, in my language I write:
player->name
That only gets codified by the compiler. "player" and "name" are not traditional identifiers because they are not added to the symbol table, and nothing is done with them at compile time except to translate the request for the name of the player into 3-address code.
It seems you are making a classic error by using literal character strings in the yacc source file. Since you are using a lexer, you can only use token names in the yacc source file. More on this here
So I spent a reasonable amount of time picking over the grammar (and the bison output) and can't see what is obviously wrong here. Without the means to execute it, I can't easily figure out what is going on by experimentation. Therefore, here are some concrete steps I usually go through when debugging grammars. Hopefully you can try any of these you haven't already done and then perhaps post follow-ups (or edit your question) with any results that might be revealing:
Contrive the smallest (in terms of number of tokens) possible working input, and the smallest possible non-working inputs based on the rules you expect to be applied.
Create a copy of the grammar file including only the troublesome rules and as few other supporting rules as you can get away with (i.e. you want a language that only allows construction of sequences consisting of the object and more_objects rules, joined by ARROW). Does this work as you expect?
Does the rule in which it is nested work as you expect? Try replacing object with some other very simple rule (using some tokens not occurring elsewhere) and see if you can include those tokens without breaking everything else.
Run bison with --report=all. Inspect the output to try to trace the rules you've added and the states they affect. Try removing those rules and repeat the process - what has changed? This is often extremely time-consuming and a giant pain, but it's a good last resort. I recommend a pencil and some paper.
Looking at the structure of your error output, '+' is being recognised as an identifier token, and is therefore being looked up as a symbol. It might be worth checking your lexer to see how it processes identifier tokens - you might just accidentally be grabbing too much. As a further debugging technique, you might consider turning some of those token literals (e.g. '+', '{', etc.) into real tokens so that bison's error reporting can help you out a little more.
EDIT: OK, the more I've dug into it, the more I'm convinced that the lexer is not working as it should. I would double-check that the stream of tokens you are getting from yylex() matches your expectations before proceeding any further. In particular, it looks like a bunch of symbols that you consider special (e.g. '+' and '{') are being captured by some of your regular expressions, or at least are being allowed to pass as identifiers.
You don't get shift/reduce conflicts because your rules using object_name and more_objects are right-recursive - rather than the left-recursive rules that Yacc (Bison) handles most naturally.
On classic Yacc, you would find that you can run out of stack space with deep enough nesting of the 'object->name->what->not' notation. Bison extends its stack at runtime, so you have to run out of memory, which is a lot harder these days than it was when machines had a few megabytes of memory (or less).
One result of the right-recursion is that no reductions occur until you read the last of the object names in the chain (or, more accurately, one symbol beyond that). I see that you've used right-recursion with your statement_list rule - and in a number of other places too.
I think your principal problem is that you failed to define a subtree constructor in your object subgrammar. (EDIT: the OP says he left the semantic actions for object out of his example text. That doesn't change the following answer.)
You probably have to look up the objects in the order encountered, too.
Maybe you intended:
primary_expression
: constant_value { $$ = $1; }
| '(' expression ')' { $$ = $2; }
| list_initialization { $$ = $1; }
| function_call { $$ = $1; }
| object { $$ = $1; }
;
object
: IDENTIFIER { $$ = LookupVariableOrObject( yytext ); }
| object ARROW IDENTIFIER { $$ = LookupSubobject( $1, yytext ); }
;
I assume that if one encounters an identifier X by itself, your default interpretation is that it is a variable name. But if you encounter X -> Y, then even if X is a variable name, you want the object X with subobject Y.
What LookupVariableOrObject does is look up the leftmost identifier encountered to see whether it is a variable (and return essentially the same value as id_lookup, which must produce an AST node of type AST_VAR), or whether it is a valid object name (and return an AST node marked as AST_OBJ), or complain if the identifier is neither of these.
What LookupSubobject does is check its left operand to ensure it is an AST_OBJ (or an AST_VAR whose name happens to be the same as that of an object), and complain if it is not. If it is, it looks up the yytext child object of the named AST_OBJ.
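To make that concrete, here is a small Python model of the two lookups (this is not the OP's code; the game-object dictionary and all names are invented for illustration):
GAME_OBJECTS = {"player": {"name": {}}, "pc": {"tower": {"motherboard": {}}}}
VARIABLES = {"x"}

def lookup_variable_or_object(name):
    # Leftmost identifier: variable first, then game object, else error.
    if name in VARIABLES:
        return ("AST_VAR", name)
    if name in GAME_OBJECTS:
        return ("AST_OBJ", name, GAME_OBJECTS[name])
    raise NameError(f"undefined symbol {name!r}")

def lookup_subobject(node, name):
    # Left operand must already be an object node.
    if node[0] != "AST_OBJ":
        raise TypeError(f"{node[1]!r} is not an object")
    children = node[2]
    if name not in children:
        raise NameError(f"{node[1]!r} has no subobject {name!r}")
    return ("AST_OBJ", name, children[name])

n = lookup_variable_or_object("player")   # leftmost identifier
n = lookup_subobject(n, "name")           # player->name
print(n[:2])                              # ('AST_OBJ', 'name')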
EDIT: Based on discussion comments in another answer, right-recursion in the OP's original
grammar might be problematic if the OP's semantic checks inspect global lexer state (yytext).
This solution is left-recursive and won't run afoul of that particular trap.
id_lookup
: IDENTIFIER
is formally identical to
object_name
: IDENTIFIER
and object_name would accept everything that id_lookup wouldn't, so assertLookup( yytext ); probably runs on everything that may look like an IDENTIFIER and is not accepted by another rule, just to decide between the two - and then object_name can't accept it because a single token of lookahead forbids that.
As for the twilight zone: the two characters you got errors for are not declared as tokens, which opens the zone of undefined behavior and could trip the parser into trying to treat them as potential identifiers when the grammar gets loose.
I just tried running muscle in Ubuntu 10.04 using bison 2.4.1 and I was able to run both of your examples with no syntax errors. My guess is that you have a bug in your version of bison. Let me know if I'm somehow running your parser wrong. Below is the output from the first example you gave.
./muscle < ./test1.m (this was your first test)
\-statement list
|-declaration (constant)
| |-symbol reference
| | \-list (constant)
| \-list
| |-value
| | \-1
| |-value
| | \-2
| |-value
| | \-3
| \-value
| \-4
|-loop (for-in)
| |-symbol reference
| | \-x (variable)
| |-symbol reference
| | \-list (constant)
| \-statement list
| \-send statement
| \-binary op (addition)
| |-binary op (addition)
| | |-value
| | | \-foo
| | \-symbol reference
| | \-x (variable)
| \-value
| \-!
\-send statement
\-value
\-Done!
+-----+----------+-----------------------+-----------------------+
| 1 | VALUE | 1 | |
| 2 | ELMT | #1 | |
| 3 | VALUE | 2 | |
| 4 | ELMT | #3 | |
| 5 | VALUE | 3 | |
| 6 | ELMT | #5 | |
| 7 | VALUE | 4 | |
| 8 | ELMT | #7 | |
| 9 | LIST | | |
| 10 | CONST | #10 | #9 |
| 11 | ITER_NEW | #11 | #10 |
| 12 | BRA | #14 | |
| 13 | ITER_INC | #11 | |
| 14 | ITER_END | #11 | |
| 15 | BRT | #22 | |
| 16 | VALUE | foo | |
| 17 | ADD | #16 | #11 |
| 18 | VALUE | ! | |
| 19 | ADD | #17 | #18 |
| 20 | SEND | #19 | |
| 21 | BRA | #13 | |
| 22 | VALUE | Done! | |
| 23 | SEND | #22 | |
| 24 | HALT | | |
+-----+----------+-----------------------+-----------------------+
foo 1!
foo 2!
foo 3!
foo 4!
Done!

When should I use enhanced record types in Delphi instead of classes?

Delphi 2006 introduced new capabilities for records, making them more 'object-oriented'.
In which situations is the record type more appropriate for a design than a class type?
What advantage is there to using these record types?
You have records, objects and classes.
Records have been available since Turbo Pascal 1. They are lightweight, capable of having properties and methods, but they do not support inheritance. There are some issues with functions that return records: if these records have methods, this sometimes produces internal compiler errors:
type
  TRec = record
    function Method1: Integer;
  end;

function Func: TRec;

procedure Test;
var
  x : TRec;
begin
  Func.Method1; // Sometimes crashes the compiler
  // Circumvention:
  x := Func;
  x.Method1; // Works
end;
Objects were introduced with Turbo Pascal 5, if I'm correct. They provided a way to do OO in Pascal. They are more or less deprecated since the introduction of Delphi, but you can still use them. Objects can implement interfaces.
Classes were introduced with Delphi 1 and are the most versatile. They can implement interfaces and support inheritance. But each class variable is a hidden pointer, which means that classes must be created on the heap. Luckily this process is mostly hidden.
Below is a table with the differences between the three. I added interfaces for completeness.
                 | Class | Object | Record | Interface
-----------------+-------+--------+--------+----------
Are pointers?    |   y   |   n    |   n    |     y
Inheritance      |   y   |   y    |   n    |     y
Helpers          |   y   |   n    |   y    |     n
Impl. interfaces |   y   |   y    |   n    |     -
Visibility       |   y   |   y    |   n    |     n
Methods          |   y   |   y    |   y    |     y
Fields           |   y   |   y    |   y    |     n
Properties       |   y   |   y    |   y    |     y
Consts           |   y   |   y    |   y    |     n
Types            |   y   |   y    |   y    |     n
Variants         |   n   |   n    |   y    |     n
Virtual methods  |   y   |   y    |   n    |     -
I think those features were also available in Delphi 8 and 2005.
Main guideline: if you're in doubt, use a class.
For the rest you have to understand the main difference: Class Objects are always used through a reference, and are created by calling a Constructor.
The memory management and allocation for records is the same as for the basic types (i.e. Integer, Double). That means they are passed to methods by value (unless var is used). Also, you don't need to Free records, and that's the reason they can support operator overloading. But there is no inheritance, no virtual methods, etc. The new records can have a constructor, but its use is kind of optional.
The main areas and criteria for using records:
when dealing with structs from the Win32 API
when the types don't have identity (because assignment means copying)
when the instances aren't too large (copying big records becomes expensive)
when building value types whose behaviour should mimic the numerical types. Examples are DateTime, complex numbers, vectors, etc. Operator overloading is a nice feature here, but don't make it the deciding factor.
And efficiency-wise: records pay off for smaller types that you often put in arrays - but don't overdo it.
And finally, the rules for choosing between a class and a record haven't really changed from the earlier versions of Delphi.
In addition to the other answers (operator overloading, lightweight value types), it's a good idea to make your enumerators records instead of classes. Since they're allocated on the stack, there's no need to construct and destroy them, which also removes the need for the hidden try..finally block that the compiler places around class-type enumerators.
See http://hallvards.blogspot.com/2007/10/more-fun-with-enumerators.html for more information.
You can use operator overloading (such as implicit conversions). On Delphi 2007+ or Delphi 2006 for .NET you can do this on classes too, but on Delphi 2006 for Win32 only on these records.
