GNU C preprocessing of FORTRAN source to change array indices causes recursion whilst expanding macro

I am parallelizing an existing FORTRAN application. I don't want to change parts of its code directly, so I am using preprocessor directives to accomplish my goal. This way I maintain the readability of the code and won't introduce errors into parts of the code that have already been tested. However, when I try to preprocess my source with the GNU C preprocessor, I get the following error message (gcc version 4.7.2 (Debian 4.7.2-5)):
test.f:9:0: error: detected recursion whilst expanding macro "ARR"
This simple test program demonstrates my problem:
      PROGRAM TEST
      IMPLICIT NONE
      INTEGER I,OFFSET,ARR(10)
#define ARR(I) ARR(OFFSET+I)
      DO I=1,10
         ARR(I)=I
      END DO
#undef ARR
      END PROGRAM TEST
This is the command-line output:
testing$ gfortran -cpp -E test.f
# 1 "test.f"
# 1 "<command-line>"
# 1 "test.f"
PROGRAM TEST
[...]
test.f:9:0: error: detected recursion whilst expanding macro "ARR"
DO I=1,10
ARR(OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+OFFSET+I)=I
END DO
[...]
END PROGRAM TEST
This site provides some information on the preprocessor I am using:
http://tigcc.ticalc.org/doc/cpp.html#SEC10
It seems I am using a function-like macro with macro arguments.
Why is the preprocessor detecting a recursion? [EDIT] - Maybe because I use the same name for the macro and the identifier?
Why isn't the preprocessor capable of interpreting upper-case directives (#DEFINE instead of #define)? I am asking because I haven't had this problem with the ifort preprocessor.
BTW: I am able to preprocess the original code either by using the ifort preprocessor (-fpp) or by changing the source in the following way:
      PROGRAM TEST
      IMPLICIT NONE
      INTEGER I,OFFSET,ARR(10)
#define ARR_T(I) ARR(OFFSET+I)
      DO I=1,10
         ARR_T(I)=I
      END DO
#undef ARR_T
      END PROGRAM TEST

Why is the preprocessor detecting a recursion? [EDIT] - Maybe because I use the same name for the macro and the identifier?
Exactly: the preprocessor is detecting recursion because your macro name and your array name are the same, so the macro's replacement text mentions the macro itself. An ISO C preprocessor would simply refuse to re-expand the inner ARR (self-referential macros are expanded only once), but gfortran runs cpp in traditional mode, which keeps substituting until it hits its recursion limit - hence the long OFFSET+OFFSET+... line in your output.
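You can reproduce the effect in plain C (a minimal sketch; demo.c is a hypothetical file, and the names mirror your Fortran example - Fortran indexes arrays with parentheses, so the expanded ARR(OFFSET+I) looks like yet another macro invocation):

int ARR(int i);                  /* stand-in for the Fortran array */

#define ARR(I) ARR(OFFSET + (I)) /* replacement text names the macro again */

int use(int OFFSET, int i) {
    return ARR(i);  /* ISO cpp: expands once to ARR(OFFSET + (i));
                       traditional cpp (what gfortran -cpp runs) keeps
                       substituting and reports the recursion */
}
#undef ARR

Try gcc -E demo.c versus gcc -E -traditional-cpp demo.c to see the two behaviors side by side. Renaming the macro, as in your ARR_T version, removes the self-reference and works under both modes.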
Why isn't the preprocessor capable of interpreting upper-case directives (#DEFINE instead of #define)? I am asking because I haven't had this problem with the ifort preprocessor.
When using gfortran, you are using the C preprocessor, and its directives are case-sensitive: #DEFINE is simply not a recognized directive in C; only #define is. No idea about ifort; I thought in ifort you had to prefix directives with !$MS or !$DEC.
Your change to the program to get it to work for ifort will also work for gfortran.

Related

how to make target independent IR with llvm

I want to produce target-independent IR with LLVM.
clang -emit-llvm -S source.c -o source.ll
In source.ll I get:
target datalayout = "e-m:e-i64:..."
target triple = "x86_64-pc-linux-gnu"
...
LLVM IR is said to be target-independent, but the properties of the target are specified in the actual IR file.
How can I create an LLVM IR without this Target property?
Short answer: you cannot.
Long answer: target-neutrality is a property of the input language, not of LLVM IR. While in theory it is possible to produce more or less target-neutral LLVM IR for some inputs, it is not possible for C/C++ input. I will mention only a few of the things that prevent us from having such LLVM IR:
Preprocessor. Target-specific #ifdef clauses obviously make the resulting IR target-specific.
Pointer sizes. Think about expressions like sizeof(void*). These are target-dependent compile-time constants (yes, there are ways to defer the calculation of these constants until later, but that is not something frontends are prepared to deal with, and it also hinders many optimizations). See the sketch after this list.
Struct layout. This partly depends on the previous point (think about struct { int foo; void* bar; }).
Various ABI-related things, like the support steps necessary for argument and result passing.
I will not even mention target-specific things like vector types, builtins for target-specific instruction sets, etc.
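A hedged C illustration of the preprocessor and pointer-size points (the struct and the scenario are made up for the example):

#include <stdio.h>

struct pair { int foo; void *bar; };  /* layout depends on pointer size and alignment */

int main(void) {
#ifdef __x86_64__
    puts("built for x86-64");         /* branch chosen before any IR exists */
#else
    puts("built for something else");
#endif
    /* clang folds both sizeofs to literal integer constants in the IR
       (e.g. 8 and 16 on x86-64, but 4 and 8 on typical 32-bit targets),
       so nothing target-neutral survives to be re-evaluated later. */
    printf("%zu %zu\n", sizeof(void *), sizeof(struct pair));
    return 0;
}

Compile this with clang -emit-llvm -S for two different -target triples and watch the constants in the IR change.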

What does .text.unlikely mean in ELF object files?

In my objdump -t output, I see the following two lines:
00000000000004d2 l F .text.unlikely 00000000000000ec function-signature-goes-here [clone .cold.427]
and
00000000000018e0 g F .text 0000000000000690 function-signature-goes-here
I know l means local and g means global. I also know that .text is a section (or a type of section) in an object file, containing compiled program instructions. But what is .text.unlikely? Assuming it's a different section (or type of section) from .text - what's the difference?
In my GCC v5.4.0 manpage, I found the following switch:
-freorder-functions
which says:
Reorder functions in the object file in order to improve code
locality. This is implemented by using special subsections
".text.hot" for most frequently executed functions and
".text.unlikely" for unlikely executed functions. Reordering is done
by the linker so object file format must support named sections and
linker must place them in a reasonable way.
Also profile feedback must be available to make this option effective.
See -fprofile-arcs for details.
Enabled at levels -O2, -O3, -Os.
So .text.unlikely is a subsection where the compiler places functions - or split-off cold parts of functions, which is what the [clone .cold.427] suffix in your output indicates - that it predicts will rarely execute. Grouping them away from the hot code improves spatial locality of the instructions that actually run. Your binary was evidently built with optimization (the switch is enabled at -O2 and above) or with explicit hot/cold annotations.
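You can see the effect without profile feedback by marking a function cold yourself (a small sketch; cold_demo.c is a made-up name):

#include <stdio.h>

__attribute__((cold)) void report_failure(const char *msg) {
    /* rarely-executed error path: GCC places it in .text.unlikely */
    fprintf(stderr, "fatal: %s\n", msg);
}

int main(void) {
    if (getchar() == EOF)
        report_failure("no input");
    return 0;
}

Compile with gcc -O2 -c cold_demo.c, and objdump -t cold_demo.o should list report_failure under .text.unlikely while main stays in .text.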

Flex multiple .l file arguments don't work? (eg "flex a.l b.l")

I finally have a comfortable-enough workflow for writing my flex programs, and I'll work bison into it soon (I dabbled with it before but I restarted my project entirely).
flex yy.l; flex flip.l will generate lex.yy.c and lex.flip.c correctly, since I use the prefix option. But I am curious why flex yy.l flip.l or flex *.l does not.
gcc lex* works perfectly well when all the .c files have been generated correctly, as by the first command, but trying the same shortcut with flex produces a single lex.yy.c file, which looks valid up to the point where the unprocessed flip.l file is pasted onto the end, preventing gcc from compiling it.
Is this just flex telling me my workflow is dumb and I should use more start conditions in a big file? I'd prefer not to, at least until I have a more complete program to tweak for speed.
My workflow is:
fg 1; fg 2; fg 3; fg 4; flex a.l; flex flip.l; flex rot.l; gcc -g lex*; ./a.out < in
With nano editors as jobs 1, 2, 3, 4 to fg out of the background.
I'm lexing the file in this order: flip, rot, a, rot, flip. It works, and I can even use preprocessor definitions (gcc -DALONE) to compile my .c files on their own, for testing.
I think what flex is telling you, if anything, is to learn how to use make rather than trying to put together massive build commands.
It's true that flex will only process one file per invocation. On the other hand, both gcc and clang are simply drivers which invoke the actual compiler(s) and linker(s) so that you don't have to write more complicated build recipes. You could easily write a little driver program which invoked flex multiple times, once per argument, but it would be even simpler to use make, with the additional advantage that flex would only be invoked as necessary.
In fact, most large C projects do not use gcc's ability to compile multiple files in a single invocation. Instead, they let make figure out which object files need to be rebuilt (because the corresponding source file changed), thereby considerably speeding up the debug/edit/build cycle.
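As a sketch of what that looks like (the file names come from your workflow; -P and -o are standard flex options - drop -P if your .l files already set %option prefix):

LEXERS := a flip rot
CFILES := $(patsubst %,lex.%.c,$(LEXERS))

a.out: $(CFILES)
	gcc -g $(CFILES)

# One flex run per grammar, redone only when that .l file changes.
# (Recipe lines must be indented with a tab.)
lex.%.c: %.l
	flex -P $* -o $@ $<

Then a single make rebuilds exactly the pieces whose sources changed.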

Clang AST dump doesn't show #defines

I'm dumping the AST of some headers like this:
clang -cc1 -ast-dump -fblocks header.h
However, any #defines on the header are not showing on the dump. Is there a way of adding them?
It's true: #defines are handled by the preprocessor, not the compiler, so they never make it into the AST. You need something that hooks into the preprocessing stage instead. I know of two options (plus a quick dump trick, below):
Boost Wave can preprocess the input for you, and/or give you hooks to trigger on macro definitions or uses.
The Clang tool pp-trace uses a Clang library that can do callbacks on many preprocessor events, including macro definitions.
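If all you want is to see the macros rather than to get programmatic callbacks, the preprocessor can also dump them directly; -dM with -E prints every macro defined at the end of preprocessing, including the predefined ones:

clang -E -dM header.h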

Difference between compilers and parsers?

By concept/function/implementation, what are the differences between compilers and parsers?
A compiler is often made up of several components, one of which is a parser.
A common set of components in a compiler is:
Lexer - break the program up into words.
Parser - check that the syntax of the sentences is correct (a toy lexer + parser sketch follows this list).
Semantic Analysis - check that the sentences make sense.
Optimizer - edit the sentences for brevity.
Code generator - output something with equivalent semantic meaning using another vocabulary.
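To make the first two stages concrete, here is a toy sketch in C: a lexer that chops "1+2*3" into tokens and a recursive-descent parser that checks the syntax while evaluating. (A real compiler would build a tree and generate code instead of computing the value on the spot.)

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

static const char *src;   /* input cursor */
static int tok;           /* current token: a digit char, '+', '*', or 0 at end */

static void lex(void) {                    /* lexer: fetch the next "word" */
    while (isspace((unsigned char)*src)) src++;
    tok = *src ? *src++ : 0;
}

static int factor(void) {                  /* factor := digit */
    if (!isdigit(tok)) { fprintf(stderr, "syntax error\n"); exit(1); }
    int v = tok - '0';
    lex();
    return v;
}

static int term(void) {                    /* term := factor ('*' factor)* */
    int v = factor();
    while (tok == '*') { lex(); v *= factor(); }
    return v;
}

static int expr(void) {                    /* expr := term ('+' term)* */
    int v = term();
    while (tok == '+') { lex(); v += term(); }
    return v;
}

int main(void) {
    src = "1+2*3";
    lex();
    printf("%d\n", expr());                /* prints 7 */
    return 0;
}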
To add a little bit:
As mentioned elsewhere, Small-C is a recursive descent compiler that generated code as it parsed: basically syntactic analysis, semantic analysis, and code generation in one pass. As I recall, it also lexed inside the parser.
A long time ago, I wrote a C compiler (actually several: the Introl-C family for microcontrollers) that used recursive descent and did syntax and semantic checking during the parse and produced a tree representation of the program from which code was generated.
Today, I'm working on a compiler that does source -> tokens -> AST -> IR -> code, pretty much as I described above.
A parser just reads text into an internal, more abstract representation, often a tree or graph of some sort.
A compiler translates such an internal representation into another format. Most often this means converting source code into executable programs. But the target doesn't have to be machine code. It can be another programming language as well; the compiler would still be a compiler. Obviously a compiler needs a parser to actually read its input.
A compiler always has a parser inside. The parser just processes the language and returns a tree representation of it; the compiler then generates something from that tree - actual machine code or another language.
A parser is one element of a compiler.
Are you looking for the differences between an interpreter and a compiler?
A parser takes in raw data and parses it into a tree structure. This syntax tree is then passed on to a generator, which turns it into whatever it is supposed to generate.
So, a parser is a part of a compiler.
In general, a parser is a part of the compiler, whereas a compiler is designed to convert the script it receives into machine-readable code or, sometimes, into another language.
A compiler is a special type of computer program that translates a human-readable text file into a form that the computer can more easily understand. At its most basic level, a computer can only understand two things: a 1 and a 0. At this level a human would operate very slowly and would find the information contained in the long string of 1s and 0s incomprehensible. A compiler is a computer program that bridges this gap.
A parser is a piece of software that evaluates the syntax of a script when it is executed, for example on a web server. For scripting languages used on the web, the parser works the way a compiler might work in other types of application development environments. Parsers are commonly used in script development because they can evaluate code as the script executes and do not require that the code be compiled first.
