How to discover a lock declaration instruction in LLVM (clang)?

I'm new to LLVM, and was trying to find the lock declaration statement and then do some instrumentation work. The code looks like this:
#include <iostream>
#include <thread>
#include <mutex>

using namespace std;

int share = 42;
mutex m;

void f()
{
    m.lock();
    --share;
    cout << "function f -> share: " << share << '\n';
    m.unlock();
}

int main()
{
    thread thf{f};
    thf.join();
    return 0;
}
I want to find the lock declaration instruction, e.g.:
    mutex m;
The LLVM instrumentation pass looks like this:
struct SkeletonPass : public FunctionPass {
    static char ID;
    SkeletonPass() : FunctionPass(ID) {}

    virtual bool runOnFunction(Function &F) {
        // Get the function to call from our runtime library.
        LLVMContext &Ctx = F.getContext();
        Constant *logFunc = F.getParent()->getOrInsertFunction(
            "logop", Type::getVoidTy(Ctx), Type::getInt32Ty(Ctx), NULL);

        for (auto &B : F) {
            for (auto &I : B) {
                ***if ((&I) is lock declaration instruction)*** {
                    // Insert something *after* the instruction.
                    IRBuilder<> builder(&I);
                    builder.SetInsertPoint(&B, ++builder.GetInsertPoint());
                    // Insert a call to our logging function.
                    builder.CreateCall(logFunc, ConstantInt::get(Type::getInt32Ty(Ctx), 2));
                    return true;
                }
            }
        }
        return false;
    }
};
In short, could you please tell me how to discover the lock declaration instruction? Thanks!

The declaration would appear as a global, so you should write a module pass to find it, not a function pass. It should appear as something like:
@m = global %mutex zeroinitializer
In fact, using the demo at http://ellcc.org/demo/index.cgi to try this, you can indeed see that:
...
%"class.std::__1::mutex" = type { %struct.pthread_mutex_t }
%struct.pthread_mutex_t = type { %union.anon }
%union.anon = type { [5 x i8*] }
...
@m = global %"class.std::__1::mutex" zeroinitializer, align 8
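For illustration, a minimal (legacy pass manager) module pass along these lines might look like the sketch below; the pass name and the name-matching heuristic are assumptions, not something the answer prescribes verbatim:

#include <string>
#include "llvm/IR/Module.h"
#include "llvm/Pass.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

namespace {
// Sketch: walk the module's globals and report any whose value type
// name mentions "mutex". The matching heuristic is an assumption.
struct FindMutexGlobals : public ModulePass {
    static char ID;
    FindMutexGlobals() : ModulePass(ID) {}

    bool runOnModule(Module &M) override {
        for (GlobalVariable &GV : M.globals()) {
            std::string TypeName;
            raw_string_ostream OS(TypeName);
            GV.getValueType()->print(OS);
            if (OS.str().find("mutex") != std::string::npos)
                errs() << "Found mutex global: " << GV.getName() << "\n";
        }
        return false; // the module is not modified
    }
};
}

char FindMutexGlobals::ID = 0;

Checking the printed type name is only a heuristic; the struct name of std::mutex differs between standard libraries, as the %"class.std::__1::mutex" above shows.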

You can use LLVM's CppBackend to compile your code. This produces C++ code that reconstructs the module through the LLVM API, so you can easily see how the mutex m; definition is constructed. Run llc -march=cpp foo.ll (on IR emitted with clang -emit-llvm) to use the CppBackend. Alternatively, you can use this demo page to compile your code online.

Related

C++ set with customized comparator crashes on insert

An STL set can have a customized comparator. It can be provided in several ways, such as defining an operator(), using decltype on a lambda, etc. I was trying to use a static method of a class and encountered a weird crash. The crash can be demonstrated by the following code:
#include <string>
#include <set>

struct Foo {
    static bool Compare(std::string const& s1, std::string const& s2)
    {
        return s1 < s2;
    }
};

std::set<std::string, decltype(&Foo::Compare)> S;

int main()
{
    S.insert("hello");
    S.insert("world");
    return 0;
}
The crash happened on the second insert. Thank you.
You have to pass a pointer to the compare function to the set constructor; otherwise the comparator is null, which is why the code fails.
    std::set<std::string, decltype(&Foo::Compare)> S{&Foo::Compare};
With decltype(&Foo::Compare) you only specify the type of the comparator; the value still has to be provided, because a default-constructed function pointer is null.
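Putting it together, the complete fixed program would be:

#include <string>
#include <set>

struct Foo {
    static bool Compare(std::string const& s1, std::string const& s2)
    {
        return s1 < s2;
    }
};

// Pass the function pointer explicitly so the comparator is not null.
std::set<std::string, decltype(&Foo::Compare)> S{&Foo::Compare};

int main()
{
    S.insert("hello");
    S.insert("world"); // no longer crashes
    return 0;
}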
Changing your code to the following will also solve the problem.
struct Foo {
    bool operator()(std::string const& s1, std::string const& s2)
    {
        return s1 < s2;
    }
};

std::set<std::string, Foo> S;
The original program crashes because the set default-constructs its comparator of type decltype(&Foo::Compare); a value-initialized function pointer is null, so calling it is undefined behavior.

store a lambda that captures this

Using C++17, I'm looking for a way to store a lambda that captures the this pointer, without using std::function<>. The reason for not using std::function<> is that I need the guarantee that no dynamic memory allocation is used. The purpose of this is to be able to define some asynchronous program flow. Example:
class foo {
public:
    void start() {
        timer(1ms, [this](){
            set_pin(1,2);
            timer(1ms, [this](){
                set_pin(2,1);
            });
        });
    }
private:
    template < class Timeout, class Callback >
    void timer( Timeout to, Callback&& cb ) {
        cb_ = cb;
        // setup timer and call cb_ once the timeout is reached
        ...
    }

    ??? cb_;
};
Edit: Maybe it's not really clear: std::function<void()> would do the job, but I need/would like the guarantee that no dynamic allocation happens, as the project is in the embedded field. In practice std::function<void()> seems not to require dynamic memory allocation if the lambda just captures this. I guess this is due to some small-object optimization, but I would like not to rely on that.
You can write your own function_lite to store the lambda, then you can use static_assert to check the size and alignment requirements are satisfied:
#include <cstddef>
#include <new>
#include <type_traits>

class function_lite {
    static constexpr unsigned buffer_size = 16;

    using trampoline_type = void (function_lite::*)() const;
    trampoline_type trampoline;
    trampoline_type cleanup;
    alignas(std::max_align_t) char buffer[buffer_size];

    template <typename T>
    void trampoline_func() const {
        auto const obj =
            std::launder(static_cast<const T*>(static_cast<const void*>(buffer)));
        (*obj)();
    }

    template <typename T>
    void cleanup_func() const {
        auto const obj =
            std::launder(static_cast<const T*>(static_cast<const void*>(buffer)));
        obj->~T();
    }

public:
    template <typename T>
    function_lite(T t)
        : trampoline(&function_lite::trampoline_func<T>),
          cleanup(&function_lite::cleanup_func<T>) {
        static_assert(sizeof(T) <= buffer_size);
        static_assert(alignof(T) <= alignof(std::max_align_t));
        new (static_cast<void*>(buffer)) T(t);
    }

    ~function_lite() { (this->*cleanup)(); }

    function_lite(function_lite const&) = delete;
    function_lite& operator=(function_lite const&) = delete;

    void operator()() const { (this->*trampoline)(); }
};

int main() {
    int x = 0;
    function_lite f([x] {});
}
Note: this is not copyable; to add copy or move semantics you will need to add new members like trampoline and cleanup which can properly copy the stored object.
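A small usage sketch, assuming the function_lite above is in scope (the widget type here is only illustrative), could look like:

#include <iostream>

struct widget {
    int value = 42;

    void run() {
        // Store a this-capturing lambda in automatic storage; no heap allocation.
        function_lite cb([this] { std::cout << "value: " << value << '\n'; });
        cb(); // invokes the stored lambda
    }
};

int main() {
    widget w;
    w.run();
}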
There is no drop-in replacement in the language or the standard library.
Every lambda is a unique type in the type system. Technically you may have a lambda as a member, but then its type is fixed; you cannot assign other lambdas to it.
If you really want an owning function wrapper like std::function, you need to write your own. Essentially you want a std::function with a big enough small-buffer-optimization buffer.
Another approach would be to omit the this capture and pass it to the function when doing the call. Then you have a captureless lambda, which is convertible to a function pointer that you can easily store. I would take this route and adopt more complex approaches only if really necessary.
It would look like this (I trimmed down the code a bit):
#include <iostream>

class foo
{
public:
    void start()
    {
        timer(1, [](foo* instance)
        {
            instance->set_pin(1,2);
        });
    }
private:
    template < class Timeout, class Callback >
    void timer( Timeout to, Callback&& cb )
    {
        cb_ = cb;
        cb_(this); // call the callback like this
    }

    void set_pin(int, int)
    {
        std::cout << "pin set\n";
    }

    void(*cb_)(foo*);
};
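Usage would then simply be:

int main() {
    foo f;
    f.start(); // prints "pin set" via the stored function pointer
}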

Storing multiple types into class member container

I was reading this Q/A here, and as my question is similar but different, I would like to know how to do the following:
Let's say I have a basic non-template, non-inherited class called Storage.
    class Storage {};
I would like this class to have a single container (an unordered multimap is where I'm leaning) that will map a std::string name id to a variable type T. The class itself will not be a template; however, a member function to add elements would be. A member function to add might look like this:
    template<class T>
    void addElement( const std::string& name, T& t );
This function will then populate the unordered multimap. However, each time this function is called the type could be different. So my map would look something like:
    "Hotdogs", 8      // here 8 is an int
    "Price", 4.85f    // here 4.85f is a float
How would I declare such an unordered multimap using templates, variadic parameters, maybe even tuple, any or variant... without the class itself being a template? I prefer not to use boost or other libraries, only the standard library.
I tried something like this:
class Storage {
private:
    template<class T>
    typedef std::unordered_multimap<std::string, T> DataTypes;

    template<class... T>
    typedef std::unordered_multimap<std::vector<std::string>, std::tuple<T...>> DataTypes;
};
But I cannot seem to get the typedefs correct so that I can declare them like this:
    DataTypes mDataTypes;
You tagged C++17, so you could use std::any (or std::variant, if the T type can be a limited and known set of types). Storing the values is simple.
#include <any>
#include <unordered_map>

class Storage
{
private:
    using DataTypes = std::unordered_multimap<std::string, std::any>;
    DataTypes mDataTypes;

public:
    template <typename T>
    void addElement (std::string const & name, T && t)
    { mDataTypes.emplace(name, std::forward<T>(t)); }
};

int main()
{
    Storage s;
    s.addElement("Hotdogs", 8);
    s.addElement("Price", 4.85f);
    // but how to extract the values?
}
But the problem is that you now have elements with "Hotdogs" and "Price" keys in the map, yet no info about the type of the values.
So you have to save, in some way, information about the type of the value (transform the value into a std::pair of some id-type and the std::any?) to extract it when you need it.
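As a minimal sketch of the retrieval side (assuming the caller knows, or has recorded elsewhere, the stored type), std::any_cast can be used; a wrong type throws std::bad_any_cast:

#include <any>
#include <iostream>
#include <string>
#include <unordered_map>

int main()
{
    std::unordered_multimap<std::string, std::any> data;
    data.emplace("Hotdogs", 8);      // stores an int
    data.emplace("Price", 4.85f);    // stores a float

    auto it = data.find("Hotdogs");
    if (it != data.end())
        std::cout << std::any_cast<int>(it->second) << '\n'; // must match the stored type
}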
I've done something along those lines; the actual solution is very specific to your problem.
That being said, I'm doing this on a vector, but the principle applies to maps too.
If you're not building an API, and hence know all the classes that will be involved, you could use std::variant, something along the lines of this:
#include <variant>
#include <vector>
#include <iostream>

struct ex1 {};
struct ex2 {};

using storage_t = std::variant<ex1, ex2>;

struct unspecific_operation {
    void operator()(ex1 arg) { std::cout << "got ex1\n"; }
    void operator()(ex2 arg) { std::cout << "got ex2\n"; }
};

int main() {
    auto storage = std::vector<storage_t>{};
    storage.push_back(ex1{});
    storage.push_back(ex2{});

    auto op = unspecific_operation{};
    for(const auto& content : storage) {
        std::visit(op, content);
    }
    return 0;
}
which will output:
got ex1
got ex2
If I remember correctly, using std::any will pull in RTTI, which can get quite expensive; I might be wrong though.
If you provide more specifics about what you actually want to do with it, I can give you a more specific solution.
For an example with the unordered map:
#include <variant>
#include <unordered_map>
#include <string>
#include <iostream>

struct ex1 {};
struct ex2 {};

using storage_t = std::variant<ex1, ex2>;

struct unspecific_operation {
    void operator()(ex1 arg) { std::cout << "got ex1\n"; }
    void operator()(ex2 arg) { std::cout << "got ex2\n"; }
};

class Storage {
private:
    using map_t = std::unordered_multimap<std::string, storage_t>;
    map_t data;

public:
    Storage() : data{map_t{}}
    {}

    void addElement(std::string name, storage_t elem) {
        data.insert(std::make_pair(name, elem));
    }

    void doSomething() {
        auto op = unspecific_operation{};
        for(const auto& content : data) {
            std::visit(op, content.second);
        }
    }
};

int main() {
    auto storage = Storage{};
    storage.addElement("elem1", ex1{});
    storage.addElement("elem2", ex2{});
    storage.addElement("elem3", ex1{});
    storage.doSomething();
    return 0;
}

Usage of FunctionPass over ModulePass when creating LLVM passes

I've seen quite a number of examples that go over creating function passes (e.g. Brandon Holt and Adrian Sampson), but I am curious about the difficulty of creating a module pass for these very similar problems. I've tried to implement a module pass to display the global variable names, using this example and the LLVM source code to understand how you have to iterate through the members.
I am using a source-compiled version of LLVM, using the example from the above links to add the pass, and then running:
    $ clang -Xclang -load -Xclang build/Skeleton/libSkeletonPass.so something.c
which then returns this gibberish. However, if I implement a FunctionPass and just use auto to determine the type to be initialized, it's very straightforward and works. Am I just going about printing the global variables the wrong way?
This is a pastebin of the error output from the terminal. link
Skeleton.cpp
#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/IR/InstrTypes.h"
#include "llvm/Transforms/IPO/PassManagerBuilder.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/IRReader/IRReader.h"
#include "llvm/IR/LLVMContext.h"
using namespace llvm;
namespace {
// Helper method for converting the name of a LLVM type to a string
static std::string LLVMTypeAsString(const Type *T) {
std::string TypeName;
raw_string_ostream N(TypeName);
T->print(N);
return N.str();
}
struct SkeletonPass : public ModulePass {
static char ID;
SkeletonPass() : ModulePass(ID) {}
virtual bool runOnModule(Module &M) {
for (Module::const_global_iterator GI = M.global_begin(),
GE = M.global_end(); GI != GE; ++GI) {
errs() << "Found global named: " << GI->getName()
<< "\tType: " << LLVMTypeAsString(GI->getType()) << "!\n";
}
return false;
}
};
}
char SkeletonPass::ID = 0;
// Automatically enable the pass.
// http://adriansampson.net/blog/clangpass.html
static void registerSkeletonPass(const PassManagerBuilder &,
legacy::PassManagerBase &PM) {
PM.add(new SkeletonPass());
}
static RegisterStandardPasses
RegisterMyPass(PassManagerBuilder::EP_EarlyAsPossible,
registerSkeletonPass);
something.c
int value0 = 5;

int main(int argc, char const *argv[])
{
    int value = 4;
    value += 1;
    return 0;
}
I was able to figure this out after some extensive GitHub searching. Here is the answer, from the tutorial I was following, to help others who may be curious how to implement a module pass.

How to find out the real return types by VisitReturnStmt in Clang

I want to find out type information for every function using a Clang libtool.
However, VisitReturnStmt sometimes cannot find any return statements.
Also, a class-type return (e.g. ASTConsumer * in the "CreateASTConsumer" method) is reported as an "int *" type. (Another case: bool -> _Bool.)
How can I find out the real return types for every function?
Thanks in advance for your help.
The tool source, which is also the input cpp source, is as follows.
#include "clang/Driver/Options.h"
#include "clang/AST/AST.h"
#include "clang/AST/ASTContext.h"
#include "clang/AST/ASTConsumer.h"
#include "clang/AST/RecursiveASTVisitor.h"
#include "clang/Frontend/ASTConsumers.h"
#include "clang/Frontend/FrontendActions.h"
#include "clang/Frontend/CompilerInstance.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
using namespace std;
using namespace clang;
using namespace clang::driver;
using namespace clang::tooling;
using namespace llvm;
Rewriter TheRewriter;
class ExampleVisitor : public RecursiveASTVisitor<ExampleVisitor> {
private:
ASTContext *astContext; // used for getting additional AST info
public:
explicit ExampleVisitor(CompilerInstance *CI)
: astContext(&(CI->getASTContext())) // initialize private members
{
TheRewriter.setSourceMgr(astContext->getSourceManager(), astContext->getLangOpts());
}
virtual bool VisitReturnStmt(ReturnStmt *ReturnStatement) {
ReturnStatement->getRetValue()->dump(TheRewriter.getSourceMgr());
return true;
}
virtual bool VisitStmt(Stmt *S) {
S->dump(TheRewriter.getSourceMgr());
return true;
}
};
class ExampleASTConsumer : public ASTConsumer {
private:
ExampleVisitor *visitor; // doesn't have to be private
public:
// override the constructor in order to pass CI
explicit ExampleASTConsumer(CompilerInstance *CI)
: visitor(new ExampleVisitor(CI)) // initialize the visitor
{ }
// override this to call our ExampleVisitor on the entire source file
virtual void HandleTranslationUnit(ASTContext &Context) {
/* we can use ASTContext to get the TranslationUnitDecl, which is
a single Decl that collectively represents the entire source file */
visitor->TraverseDecl(Context.getTranslationUnitDecl());
}
};
class ExampleFrontendAction : public ASTFrontendAction {
public:
virtual ASTConsumer *CreateASTConsumer(CompilerInstance &CI, StringRef file) {
return new ExampleASTConsumer(&CI); // pass CI pointer to ASTConsumer
}
};
int main(int argc, const char **argv) {
// parse the command-line args passed to your code
CommonOptionsParser op(argc, argv);
// create a new Clang Tool instance (a LibTooling environment)
ClangTool Tool(op.getCompilations(), op.getSourcePathList());
// run the Clang Tool, creating a new FrontendAction (explained below)
int result = Tool.run(newFrontendActionFactory<ExampleFrontendAction>());
return result;
}
If I'm interpreting the clang docs correctly
Note that GCC allows return with no argument in a function declared to return a value, and it allows returning a value in functions declared to return void. We explicitly model this in the AST, which means you can't depend on the return type of the function and the presence of an argument.
this implies that you can't reliably infer the return type of a function from its return statement.
If you want to find the return type of a function, you could visit FunctionDecl nodes and call FunctionDecl::getReturnType() on them.
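A minimal sketch of such a visitor method, added to the ExampleVisitor class above, might be:

    virtual bool VisitFunctionDecl(FunctionDecl *FD) {
        // Report the declared return type rather than inferring it from return statements.
        llvm::errs() << FD->getNameAsString() << " returns "
                     << FD->getReturnType().getAsString() << "\n";
        return true;
    }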
