Equivalent of -ftree-vectorizer-verbose for clang

The question is about how to make clang print information on which loops (or other parts of code) have been vectorized. GCC has a command line switch named -ftree-vectorizer-verbose=6 to do this (or -fopt-info-vec in newer versions of GCC), but I couldn't find anything similar for clang. Does clang support this, or is my only option to peek at the disassembly?

clang has the following options to print diagnostics related to vectorization:
-Rpass=loop-vectorize identifies loops that were successfully vectorized.
-Rpass-missed=loop-vectorize identifies loops that failed vectorization and indicates if vectorization was specified.
-Rpass-analysis=loop-vectorize identifies the statements that caused vectorization to fail.
Source: http://llvm.org/docs/Vectorizers.html
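
As a minimal sketch of how these flags might be tried out, the loop below is the kind of code the loop vectorizer usually considers. The file name saxpy.c and the function name are assumptions made for illustration; whether a remark is actually printed depends on the clang version, the optimization level, and the target.

```c
/* saxpy.c (hypothetical file name, used only for this illustration).
 *
 * Compile with optimization enabled and the remark flags listed above:
 *
 *   clang -O2 -c -Rpass=loop-vectorize -Rpass-missed=loop-vectorize -Rpass-analysis=loop-vectorize saxpy.c
 *
 * Any remarks are printed as ordinary diagnostics that point at the
 * source location of the loop.
 */
#include <stddef.h>

/* Computes y[i] += a * x[i]. The restrict qualifiers promise the compiler
 * that x and y do not overlap, which removes one common obstacle to
 * vectorization. */
void saxpy(float a, const float *restrict x, float *restrict y, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

The remark flags only take effect when the optimizer runs, so they are best combined with -O2 or -O3 rather than -O0.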

Looking through the clang source code, there are a couple vectorization passes in Transforms/Vectorize:
BBVectorize
LoopVectorize
SLPVectorize
LoopVectorize and SLPVectorize don't seem to have any arguments that will print things. But inside BBVectorize there are a couple of options for printing things when clang is built in debug mode:
bb-vectorize-debug-instruction-examination - When debugging is enabled, output information on the instruction-examination process
bb-vectorize-debug-candidate-selection - When debugging is enabled, output information on the candidate-selection process
bb-vectorize-debug-pair-selection - When debugging is enabled, output information on the pair-selection process
bb-vectorize-debug-cycle-check - When debugging is enabled, output information on the cycle-checking process
bb-vectorize-debug-print-after-every-pair - When debugging is enabled, dump the basic block after every pair is fused
That looks like it's about it.

Related

Why can't ProcDump record memory contents of a 32-bit process under 64-bit Windows 10?

I would like to use ProcDump's ability to create minidumps with a custom MINIDUMP_TYPE via the -mc command-line switch to include memory contents beyond MiniDumpNormal.
Unfortunately neither MiniDumpWithFullMemory, MiniDumpWithIndirectlyReferencedMemory, nor MiniDumpWithPrivateReadWriteMemory | MiniDumpWithPrivateWriteCopyMemory seem to have any effect: A nonempty minidump is created without an error being displayed, but a lot smaller than expected and querying the minidump via WinDbg's .dumpdebug functionality does not list any of the aforementioned flags even if explicitly included in the minidump type. It seems as if none of the flags mentioned above have an impact on ProcDump's behavior.
The process in question is a 32-bit process running under 64-bit Windows 10, build 2004. I have tried both procdump.exe and procdump64.exe version 9.0, albeit without the -64 command-line switch since I do not want to include SysWOW64 overhead. I have also tried copying 32-bit and 64-bit versions of dbghelp.dll provided by the most recent Debugging Tools for Windows SDK into the corresponding folders in which procdump.exe and procdump64.exe are located. Finally, I have made sure to pass the minidump type as hexadecimal numbers and any other flags that I have tried seem to be recognized without an issue and are being listed when inspecting the minidump in WinDbg afterwards.
As an example, the invocation procdump.exe -mc 51B25 <process> should create a dump with
0x51B25 = 334629 = (MiniDumpWithDataSegs
| MiniDumpWithProcessThreadData
| MiniDumpWithHandleData
| MiniDumpWithPrivateReadWriteMemory
| MiniDumpWithUnloadedModules
| MiniDumpWithFullMemoryInfo
| MiniDumpWithThreadInfo
| MiniDumpWithTokenInformation
| MiniDumpWithPrivateWriteCopyMemory)
When inspecting the dump in WinDbg, neither MiniDumpWithPrivateReadWriteMemory nor MiniDumpWithPrivateWriteCopyMemory show up in the .dumpdebug information with corresponding memory regions being unavailable. Note that when I create the dump from within the application using MiniDumpWriteDump for demonstration purposes, the flags do show up when using .dumpdebug and the resulting minidump will be significantly larger (under otherwise comparable conditions).
Can someone confirm that ProcDump is indeed ignoring memory-related flags or explain to me what I am doing wrong?
(Writing a MiniPlus dump using the -mp switch does work but does not necessarily include the memory regions of interest.)
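
The flag arithmetic in the question can be checked with a small sketch like the one below. The constants are copied from the documented MINIDUMP_TYPE enumeration so that the code compiles without the Windows SDK; the helper names requested_dump_type and missing_flags are hypothetical and are not part of ProcDump or DbgHelp.

```c
#include <stdint.h>

/* Values copied from the MINIDUMP_TYPE enumeration so this sketch builds
 * without the Windows SDK. On Windows, include <dbghelp.h> instead of
 * redefining them. */
enum {
    MiniDumpWithDataSegs               = 0x00000001,
    MiniDumpWithHandleData             = 0x00000004,
    MiniDumpWithUnloadedModules        = 0x00000020,
    MiniDumpWithProcessThreadData      = 0x00000100,
    MiniDumpWithPrivateReadWriteMemory = 0x00000200,
    MiniDumpWithFullMemoryInfo         = 0x00000800,
    MiniDumpWithThreadInfo             = 0x00001000,
    MiniDumpWithPrivateWriteCopyMemory = 0x00010000,
    MiniDumpWithTokenInformation       = 0x00040000
};

/* Hypothetical helper: the combination passed to "-mc" in the question. */
uint32_t requested_dump_type(void)
{
    return MiniDumpWithDataSegs
         | MiniDumpWithProcessThreadData
         | MiniDumpWithHandleData
         | MiniDumpWithPrivateReadWriteMemory
         | MiniDumpWithUnloadedModules
         | MiniDumpWithFullMemoryInfo
         | MiniDumpWithThreadInfo
         | MiniDumpWithTokenInformation
         | MiniDumpWithPrivateWriteCopyMemory;
}

/* Hypothetical helper: bits that were requested but are absent from the
 * flags reported for the dump (for example by .dumpdebug). */
uint32_t missing_flags(uint32_t requested, uint32_t reported)
{
    return requested & ~reported;
}
```

Here requested_dump_type() evaluates to 0x51B25, matching the value in the question. Passing that value and the flags shown by .dumpdebug to missing_flags would isolate exactly which bits were dropped.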

How to Write Out of Tree LLVM LTO Pass?

I'm aware of similar questions here and here; however, the LLVM codebase changes so quickly that I'm here to ask whether the state of things has changed since then.
So, currently I'm trying to write an out-of-tree pass that works on the whole program CFG (hence the need for the merged bitcode). I would prefer to use the legacy PassManager as opposed to the Mixin-based NPM due to some other legacy passes my current pass relies on.
clang is called with these args:
clang -flto -Xclang -O0 -Xclang -load -Xclang ./mvxaa.so -fuse-ld=gold -o ./tests/target_app $(TARGET_SOURCES)
Will this register the pass as an LTO (full) pass? The pass never runs.
static void registerGlobalCollectionPass(const PassManagerBuilder &PB,
                                         legacy::PassManagerBase &PM) {
  PM.add(new CollectGlobals());
}
static RegisterStandardPasses
    RegisterMyPass(PassManagerBuilder::EP_FullLinkTimeOptimizationEarly,
                   registerGlobalCollectionPass);
Looking deeper, it seems that PassManagerBuilder::addExtensionsToPM is called by the individual populateXPassManager functions and they will look through the GlobalExtensions list to call the respective callback functions. For other non-LTO ExtensionPointTy like EP_EnabledOnOptLevel0 there are entries, but when populateLTOPassManager is called, there are no longer any entries in the GlobalExtensions smallvector. Why is this the case?
Is it because LTO occurs at a later point after the linker runs and the -load argument given to dlopen the shared libraries only loads the shared objects at the compilation phase?

What environment variables control dyld?

There are a bunch of environment variables that control dyld launch, several of them very useful for debugging performance problems. Not all of them are documented.
These are explained in the dyld man page (at least on macOS 10.13):
DYLD_FRAMEWORK_PATH
DYLD_FALLBACK_FRAMEWORK_PATH
DYLD_VERSIONED_FRAMEWORK_PATH
DYLD_LIBRARY_PATH
DYLD_FALLBACK_LIBRARY_PATH
DYLD_VERSIONED_LIBRARY_PATH
DYLD_PRINT_TO_FILE
DYLD_SHARED_REGION
DYLD_INSERT_LIBRARIES
DYLD_FORCE_FLAT_NAMESPACE
DYLD_IMAGE_SUFFIX
DYLD_PRINT_OPTS
DYLD_PRINT_ENV
DYLD_PRINT_LIBRARIES
DYLD_BIND_AT_LAUNCH
DYLD_DISABLE_DOFS
DYLD_PRINT_APIS
DYLD_PRINT_BINDINGS
DYLD_PRINT_INITIALIZERS
DYLD_PRINT_REBASINGS
DYLD_PRINT_SEGMENTS
DYLD_PRINT_STATISTICS
DYLD_PRINT_DOFS
DYLD_PRINT_RPATHS
DYLD_SHARED_CACHE_DIR
DYLD_SHARED_CACHE_DONT_VALIDATE
This one is documented in man dyld, but isn't listed in the list at the top:
DYLD_PRINT_STATISTICS_DETAILS
These are undocumented:
DYLD_ROOT_PATH
DYLD_PATHS_ROOT
DYLD_DISABLE_PREFETCH
DYLD_PRINT_LIBRARIES_POST_LAUNCH
DYLD_NEW_LOCAL_SHARED_REGIONS
DYLD_NO_FIX_PREBINDING
DYLD_PREBIND_DEBUG
DYLD_PRINT_TO_STDERR
DYLD_PRINT_WEAK_BINDINGS
DYLD_PRINT_WARNINGS
DYLD_PRINT_CS_NOTIFICATIONS
DYLD_PRINT_INTERPOSING
DYLD_PRINT_CODE_SIGNATURES
DYLD_USE_CLOSURES
DYLD_IGNORE_PREBINDING
DYLD_SKIP_MAIN
DYLD_ROOT_PATH and DYLD_PATHS_ROOT appear to be synonyms and allow you to reset the "root" for searching for libraries/frameworks/etc. This is available on macOS/iPhoneSimulator but not iOS.
DYLD_DISABLE_PREFETCH disables the pre-fetching of the content of __DATA and __LINKEDIT segments.
DYLD_PRINT_LIBRARIES_POST_LAUNCH is the same as DYLD_PRINT_LIBRARIES but prints them right after launch has finished.
DYLD_NEW_LOCAL_SHARED_REGIONS and DYLD_NO_FIX_PREBINDING are ignored and don't do anything anymore.
DYLD_PREBIND_DEBUG prints out debug information on why prebinding was not used.
DYLD_PRINT_TO_STDERR only applies to iOS and forces output to stderr (instead of stdout) to help it show up on console logs.
DYLD_PRINT_WEAK_BINDINGS prints debug information on weak bindings.
DYLD_PRINT_WARNINGS prints a bunch of warnings (mostly with regard to closures and how they are being used).
DYLD_PRINT_CS_NOTIFICATIONS prints information about the core symbolicator.
DYLD_PRINT_INTERPOSING prints details about interposes that occur.
DYLD_PRINT_CODE_SIGNATURES prints details about code signatures (specifically successes and failures).
DYLD_USE_CLOSURES is a dyld3 feature, but doesn't appear to work for anybody non-internal (need CSR_ALLOW_APPLE_INTERNAL set).
DYLD_IGNORE_PREBINDING has three values ("all", "app", "nonsplit") with nonsplit being the default if a value is not supplied.
DYLD_SKIP_MAIN is an Apple-only feature used for testing dyld (needs CSR_ALLOW_APPLE_INTERNAL set).

lex & yacc multiple definition error

I want to make a code scanner and parser, but I can't tell why this error happens just by looking at the error log. The scanner takes the sample code and divides it into tokens, then returns what each of the tokens in the code is doing. The parser receives the values returned from the scanner and parses the code according to the rules.
It checks the validity of the grammar of the sample code.
And finally, this is my error:
lex.yy.o: In function main:
lex.yy.c:(.text+0x1d2a): multiple definition of main
y.tab.o:y.tab.c:(.text+0x861): first defined here
collect2: error: ld returned 1 exit status
You have defined main in both your files, but C only allows a single definition of main in a program, which is what the linker error is telling you.
The main in your scanner file has an invalid prototype (C hasn't allowed function definitions without a return type for almost 20 years) and also calls yylex only once, which is not going to do much. So it seems pretty well pointless. If you want to debug your scanner without using the parser, you can link the scanner with -lfl; that library includes a definition of main which repeatedly calls yylex until end of file is signalled.
Instead of scattering printf calls through your scanner, you can just build a debugging version of the scanner using the --debug flag when you generate the scanner. That will print out a trace of all scanner actions.
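
For reference, the main supplied by -lfl behaves roughly like the loop sketched below: it keeps calling the scanner until it returns 0. So that the sketch compiles on its own, the scanner is passed in as a function pointer, and demo_yylex is a hypothetical stand-in for the generated yylex; neither name comes from flex itself.

```c
#include <stddef.h>

/* Calls the scanner until it returns 0 (flex's end-of-input convention)
 * and returns how many tokens were seen. */
size_t drain_scanner(int (*next_token)(void))
{
    size_t count = 0;
    while (next_token() != 0) {
        ++count;
    }
    return count;
}

/* Hypothetical stand-in for yylex(): hands out three made-up token codes
 * and then reports end of input by returning 0. */
static const int demo_tokens[] = {258, 259, 260, 0};
static size_t demo_pos = 0;

int demo_yylex(void)
{
    int tok = demo_tokens[demo_pos];
    if (tok != 0) {
        ++demo_pos;   /* stay on the terminating 0 once it is reached */
    }
    return tok;
}
```

In a real build the loop would call yylex directly, and exactly one translation unit (the parser file or a separate driver, but not both) would define main.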

#line and jump to line

Do any editors honor C #line directives with regard to goto-line features?
Context:
I'm working on a code generator and need to jump to a line of the output, but the line is specified relative to the #line directives I'm adding.
I can drop them, but then finding the input line is an even worse pain.
If the editor is scriptable it should be possible to write a script to do the navigation. There might even be a Vim or Emacs script that already does something similar.
FWIW, when I was writing a lot of Bison/Flex I wrote a Zeus Lua macro script that attempted to do something similar (i.e. move from the input file to the corresponding line of the output file by searching for the #line marker).
For anyone who might be interested, here is that particular macro script.
#line directives are normally inserted by the preprocessor, not written into source code, so editors won't usually honor them if the file extension is .c.
However, the normal file extension for preprocessed files is .i or .gch, so you might try using that and see what happens.
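
To make the remapping concrete, here is a minimal sketch of what a #line directive does to the line and file name the compiler reports. The file name grammar.y and the function names are assumptions chosen for illustration.

```c
#include <string.h>

/* Everything after this directive is reported as if it started at
 * line 100 of a (hypothetical) input file named grammar.y. */
#line 100 "grammar.y"
int generated_line(void) { return __LINE__; }          /* logical line 100 */
const char *generated_file(void) { return __FILE__; }  /* logical line 101 */
```

Compiler diagnostics for this code name grammar.y and lines 100 onward, while an editor's goto-line feature still counts physical lines of the generated file. That mismatch is the one described in the question.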
I've used the following in a header file occasionally to produce clickable items in the VC6 and recent VS (2003+) compiler output window.
Basically, this exploits the fact that items in the compiler output are essentially being parsed for "PATH(LINENUM): message".
This relies on the Microsoft compiler's treatment of "pragma remind".
This isn't exactly what you asked, but it might be generally helpful in arriving at something you can get the compiler to emit that some editors might honor.
// The following definitions will allow you to insert
// clickable items in the output stream of the Microsoft compiler.
// The error and warning variants will be reported by the
// IDE as actual warnings and errors... which means you can make
// them occur in the task list.
// In theory, the coding standards could be checked to some extent
// in this way and reminders that show up as warnings or even
// errors inserted...
#define strify0(X) #X
#define strify(X) strify0(X)
#define remind(S) message(__FILE__ "(" strify( __LINE__ ) ") : " S)
// example usage
#pragma remind("warning: fake warning")
#pragma remind("error: fake error")
I haven't tried it in a while but it should still work.
Use sed or a similar tool to translate the #lines to something else not interpreted by the compiler, so you get C error messages on the real line, but have a reference to the original input file nearby.
