I want to get the transitive dependencies of all dependencies of a target, to provide this information to a binary (for static analysis).
I defined a rule that loops through the direct dependencies. How can I get the dependencies of each dependency, to discover the entire graph recursively? Is that possible at all? If not, is there an alternative way?
def _impl(ctx):
    for i, d in enumerate(ctx.attr.deps):
        # I need to get dependencies of d somehow here
        pass
Depending on exactly what information you need, Aspects may do what you want:
https://docs.bazel.build/versions/master/skylark/aspects.html
They allow a rule to collect additional information from transitive dependencies.
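For example, here is a minimal sketch of an aspect that propagates along deps and accumulates every label it sees (DepsInfo, collect_deps_aspect, and my_rule are illustrative names, not an existing API):

DepsInfo = provider(fields = ["transitive_labels"])

def _collect_deps_aspect_impl(target, ctx):
    # Children have already been visited, because the aspect propagates along "deps".
    transitive = [d[DepsInfo].transitive_labels
                  for d in getattr(ctx.rule.attr, "deps", [])]
    return [DepsInfo(
        transitive_labels = depset([str(target.label)], transitive = transitive),
    )]

collect_deps_aspect = aspect(
    implementation = _collect_deps_aspect_impl,
    attr_aspects = ["deps"],
)

def _impl(ctx):
    for d in ctx.attr.deps:
        # d[DepsInfo].transitive_labels now holds d and everything below it.
        print(d[DepsInfo].transitive_labels)

my_rule = rule(
    implementation = _impl,
    attrs = {
        "deps": attr.label_list(aspects = [collect_deps_aspect]),
    },
)

Each entry in ctx.attr.deps then carries the full transitive closure below it, which the rule can write out for your static-analysis binary.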
There's also the genquery rule, but this may or may not give you all the information you want: https://docs.bazel.build/versions/master/be/general.html#genquery
genquery makes bazel query results available to actions.
I need to pass the parameter --output=graph to genquery, i.e. the equivalent of:
bazel query "kind(rule, deps(//path/to/mytarget))" --output=graph
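Assuming the query engine accepts that flag through genquery's opts attribute, a sketch might look like this (the target name is illustrative):

genquery(
    name = "mytarget_graph",
    expression = "kind(rule, deps(//path/to/mytarget))",
    scope = ["//path/to/mytarget"],
    opts = ["--output=graph"],
)

The resulting file can then be consumed as an input by other rules.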
I am learning Bazel and am confused by many basic concepts.
load("//bazel/rules:build_tools.bzl", "build_tools_deps")
build_tools_deps()  # is build_tools_deps a macro or a rule?
load("@bazel_gazelle//:deps.bzl", "gazelle_dependencies")
gazelle_dependencies()  # what does the @ mean exactly? where is bazel_gazelle?
native.new_git_repository(...)  # what does native mean?
Which of these definitions are functions, and which are rules?
A macro is a regular Starlark function that wraps (and expands to) rules.
def my_macro(name = ..., ...):
    native.cc_library(...)
    android_library(...)
    native.genrule(...)
Think of macros as a way to chain and group several rules together, which allows you to pipe the output of some rules into the input of others. At this level, you don't think about how a rule is implemented, but what kinds of inputs and outputs they are associated with.
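As a sketch of that chaining (the names here are made up), a macro can wire a genrule's output into a cc_library so that callers only make a single call:

def cc_library_with_generated_header(name, template, **kwargs):
    native.genrule(
        name = name + "_gen_h",
        srcs = [template],
        outs = [name + ".h"],
        cmd = "cp $< $@",  # stand-in for a real code generator
    )
    native.cc_library(
        name = name,
        hdrs = [":" + name + "_gen_h"],
        **kwargs
    )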
On the other hand, a rule's declaration is done using the rule() function. cc_library, android_library and genrule are all rules. The rule implementation is abstracted in a regular function that accepts a single parameter for the rule context (ctx).
my_rule = rule(
    attrs = { ... },
    implementation = _my_rule_impl,
)

def _my_rule_impl(ctx):
    outfile = ctx.actions.declare_file(...)
    ctx.actions.run(...)
    return [DefaultInfo(files = depset([outfile]))]
Think of actions as a way to chain and group several command lines together, which works at the level of individual files and running your executables to transform them (ctx.actions.run with executable, args, inputs and outputs arguments). Within a rule implementation, you can extract information from rule attributes (ctx.attr), or from dependencies through providers (e.g. ctx.attr.deps[0][DefaultInfo].files).
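A minimal sketch tying those pieces together (using ctx.actions.run_shell instead of ctx.actions.run for brevity; the rule name and file suffix are illustrative):

def _concat_impl(ctx):
    # Pull files out of each dependency's DefaultInfo provider.
    dep_files = depset(transitive = [d[DefaultInfo].files for d in ctx.attr.deps])
    outfile = ctx.actions.declare_file(ctx.label.name + ".txt")
    ctx.actions.run_shell(
        inputs = dep_files,
        outputs = [outfile],
        command = "cat {} > {}".format(
            " ".join([f.path for f in dep_files.to_list()]),
            outfile.path,
        ),
    )
    return [DefaultInfo(files = depset([outfile]))]

concat = rule(
    implementation = _concat_impl,
    attrs = {"deps": attr.label_list()},
)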
Note that rules can only be called in BUILD files, not WORKSPACE files.
@ is the notation for a repository namespace. @bazel_gazelle is an external repository fetched in the WORKSPACE by a repository rule (not a regular rule), typically http_archive or git_repository. This repository rule can also be called from a macro, like my_macro above or build_tools_deps in your example.
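For illustration, @bazel_gazelle would typically be declared in the WORKSPACE roughly like this before its .bzl files can be loaded (URL and checksum omitted here):

# WORKSPACE
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "bazel_gazelle",
    urls = ["..."],    # release archive URL omitted
    sha256 = "...",    # checksum omitted
)

load("@bazel_gazelle//:deps.bzl", "gazelle_dependencies")
gazelle_dependencies()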
native.<rule name> means that the rule is implemented in Java within Bazel and built into the binary, and not in Starlark.
I am writing a clang tool, but I am quite new to it, and I came across a problem that I couldn't find in the docs (yet).
I am using the great Matchers API to find some nodes that I will later want to manipulate in the AST. The problem is that the clang tool will actually parse everything that belongs to the source file, including headers like iostream etc.
Since my manipulation will probably include some refactoring I definitely do not want to touch each and every thing the parser finds.
Right now I am dealing with this by comparing the source files of the nodes I matched against the arguments in argv, but needless to say, this feels wrong, since it still parses through ALL the iostream code - it just ignores it whilst doing so. I just can't believe there is not a way to tell the ClangTool something like:
"only match nodes which location's source file is something the user fed to this tool"
Thinking about it, this only makes sense if it's possible to create ASTs for each source file individually, but I do need them to be aware of each other or share contextual knowledge, and I haven't figured out a way to do that either.
I feel like I am missing something very obvious here.
thanks in advance :)
There are several narrowing matchers that might help: isExpansionInMainFile and isExpansionInSystemHeader. For example, one could combine the latter with unless to limit matches to AST nodes that are not in system files.
There are several examples of using these in the Code Analysis and Refactoring with Clang Tools repository. For example, see the file lib/callsite_expander.h around line 34, where unless(isExpansionInSystemHeader)) is used to exclude call expressions that are in system headers. Another example is at line 27 of lib/function_signature_expander.h, where the same is used to exclude function declarations in system headers that would otherwise match.
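For instance, a minimal sketch of registering such a matcher (the callback class and binding name are illustrative):

#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/ASTMatchers/ASTMatchers.h"
#include "llvm/Support/raw_ostream.h"

using namespace clang;
using namespace clang::ast_matchers;

// Reports function declarations, skipping anything expanded in a system header.
class FuncHandler : public MatchFinder::MatchCallback {
public:
  void run(const MatchFinder::MatchResult &Result) override {
    if (const auto *FD = Result.Nodes.getNodeAs<FunctionDecl>("func")) {
      // Only declarations outside system headers reach this point.
      llvm::outs() << FD->getNameAsString() << "\n";
    }
  }
};

void registerMatchers(MatchFinder &Finder, FuncHandler &Handler) {
  // Alternatively, isExpansionInMainFile() restricts matches to the file
  // that was actually passed to the tool.
  Finder.addMatcher(
      functionDecl(unless(isExpansionInSystemHeader())).bind("func"),
      &Handler);
}

The MatchFinder is then handed to the ClangTool via newFrontendActionFactory(&Finder) as usual.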
I'm trying to understand config_setting for detecting the underlying platform and had some doubts. Could you help me clarify them?
1. What is the difference between the x64_windows and x64_windows_(msvc|msys) cpus? If I create config_settings for all of them, will only one of them trigger? Should I just ignore x64_windows?
2. To detect Windows, what is the recommended way? Currently I'm doing:
config_setting(
    name = "windows",
    values = {"crosstool_top": "//crosstools/windows"},
)

config_setting(
    name = "windows_msvc",
    values = {
        "crosstool_top": "//crosstools/windows",
        "cpu": "x64_windows_msvc",
    },
)

config_setting(
    name = "windows_msys",
    values = {
        "crosstool_top": "//crosstools/windows",
        "cpu": "x64_windows_msys",
    },
)
By using this I want to use :windows to match all Windows versions and :windows_msvc, for example, to match only MSVC. Is this the best way to do it?
3. What is the difference between the darwin and darwin_x86_64 cpus? I know they match macOS, but do I always need to specify both when selecting something for macOS? If not, is there a better way to detect macOS with only one config_setting, like using //crosstools as with Windows?
4. How do I detect Linux? I know you can detect the operating systems you care about first and then use //conditions:default, but it'd be nice to have a way to detect specifically Linux and not leave it as the default.
5. What are k8, piii, etc.? Is there documentation somewhere describing all the possible cpu values and what they mean?
6. If I wanted to use //crosstools to detect each platform, is there somewhere I can look up all the available crosstools?
Thanks!
Great questions, all. Let me tackle them one by one:
1. --cpu=x64_windows_msys triggers the C++ toolchain that relies on MSYS/Cygwin. --cpu=x64_windows_msvc triggers the Windows-native (MSVC) toolchain. --cpu=x64_windows triggers the default, which is still MSYS but is being converted to MSVC.
Which ones you want to support is up to you, but it's probably safest to support all for generality (and if one is just an alias for the other it doesn't require very complicated logic).
Only one config_setting can trigger at a time.
2. Unless you're using a custom --crosstool_top= flag to specify Windows builds, you'll probably want to trigger on --cpu, e.g.:
config_setting(
    name = "windows",
    values = {"cpu": "x64_windows"},
)
There's not a great way right now to define "all Windows". This is a current deficiency in Bazel's ability to recognize platforms, which settings like --cpu and --crosstool_top don't quite model the right way. Ongoing work to create a first-class concept of platform will provide the best solution to what you want. But for now --cpu is probably your best option (see the select() sketch after these answers for how such a setting is consumed).
3. This would basically be the same story as Windows. But to my knowledge there's only darwin for the default crosstools, no darwin_x86_64.
4. For the time being it's probably best to use the //conditions:default approach you'd rather not use. Once first-class platforms are available, that'll give you the fidelity you want.
5. k8 and piii are pseudonyms for x86 64-bit and 32-bit CPUs, respectively. They also tend to be associated with "Linux" by convention, although this is not a guaranteed 1-1 match.
There is no definitive set of "all possible CPU values". Basically, --cpu is just a string that gets resolved in CROSSTOOL files to toolchains with identifiers that match that string. This allows you to write new CROSSTOOL files for new CPU types you want to encode yourself. So the exact set of available CPUs depends on who's using Bazel and how they have their workspace set up.
6. For the same reasons as in 5., there is no definitive list. See Bazel's GitHub tools/ directory for references to the defaults.
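As a usage sketch (library and file names are illustrative), a config_setting like :windows above is then consumed with select():

cc_library(
    name = "mylib",
    srcs = ["common.cc"] + select({
        ":windows": ["windows_impl.cc"],
        "//conditions:default": ["posix_impl.cc"],
    }),
)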
I need it for incremental solving in the context of symbolic execution (Klee).
At branching points of symbolic execution paths it is necessary to split the solver context into two parts: one with the true condition and one with the false condition. Of course, there is an expensive workaround: create an empty context and replay all constraints.
Is there a way to split Z3_context? Do you plan to add such functionality?
Note
Splitting the context can be avoided by using depth-first symbolic exploration, i.e. exploring the current execution path until it reaches its "end", so that this path won't be explored again in the future. In that case it is enough to pop until the branch point is reached and then continue exploring the other branch of the condition. But in the case of Klee many symbolic paths are explored "simultaneously" (exploration of the true and false branches is interleaved), so you need solver context switching (there is a Z3_context argument in each method) and branching (there are no methods for this; that is what I need).
Thanks!
No, the current version of Z3 (3.2) does not support this feature. We realize this is an important capability, and an equivalent feature will be available in the next release.
The idea is to separate the concepts of Context and Solver. In the next release, we will have APIs for creating (and copying) solvers. So, you will be able to use a different solver for each branch of the search. In a nutshell, the Context is used to manage/create Z3 expressions, and the Solver for checking satisfiability.
The approach I currently use for this sort of thing is to assert formulas like p => A instead of A, where p is a fresh Boolean literal. Then in my client I maintain the association between the list of guard literals that correspond to each branch, and use check_assumptions(). In my situation I happen to be able to get away with leaving all formulas allocated during each search, but YMMV. Even for depth-first explorations, I seem to get much more incremental reuse this way than by using push/pop.
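A small sketch of that pattern with the Python bindings (the variable names are illustrative; the original answer predates today's separate-Solver API):

from z3 import Bool, Implies, Int, Solver

s = Solver()
x = Int("x")

# One fresh guard literal per branch; a constraint is only "active"
# when its guard is passed as an assumption to check().
p_true, p_false = Bool("p_true"), Bool("p_false")
s.add(Implies(p_true, x > 0))    # constraints of the "true" branch
s.add(Implies(p_false, x <= 0))  # constraints of the "false" branch

print(s.check(p_true))   # explore only the "true" branch
print(s.check(p_false))  # explore only the "false" branch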
I have the following constructor for my Class
public MyClass(File f1, File f2, File f3, Class1 c1, Class2 c2, Class3 c3)
{
..........
}
As can be seen, it has 6 parameters. On seeing this code, one of my seniors said that instead of passing 6 parameters I should rather pass a configuration object.
I wrote the code this way because I recently read about "Dependency injection", which says "classes must ask for what they want". So I think that passing a configuration object would go against that principle.
Is my interpretation of "Dependency injection" correct? OR Should I take my senior's advice?
"Configuration object" is an obtuse term to apply in this situation; it frames your efforts in a purely mechanical sense. The goal is to communicate your intent to the class's consumer; let's refactor toward that.
Methods or constructors with numerous parameters indicate a loose relationship between them. The consumer generally has to make more inferences to understand the API. What is special about these 3 files together with these 3 classes? That is the information not being communicated.
This is an opportunity to create a more meaningful and intention-revealing interface by extracting an explicit concept from an implicit one. For example, if the 3 files are related because of a user, a UserFileSet parameter would clearly express that. Perhaps f1 is related to c1, f2 to c2, and f3 to c3. Declaring those associations as independent classes would halve the parameter count and increase the amount of information that can be derived from your API.
Ultimately, the refactoring will be highly dependent on your problem domain. Don't assume you should create a single object to fulfill a parameter list; try to refactor along the contours of the relationships between the parameters. This will always yield code which reflects the problem it solves more than the language used to solve it.
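As a hypothetical sketch of that direction (UserFileSet and its field names are assumptions for illustration, not taken from your actual domain):

import java.io.File;

// Groups the three related files behind one intention-revealing name.
public final class UserFileSet {
    private final File profile;
    private final File settings;
    private final File history;

    public UserFileSet(File profile, File settings, File history) {
        this.profile = profile;
        this.settings = settings;
        this.history = history;
    }
    // getters omitted
}

The constructor then shrinks to MyClass(UserFileSet files, Class1 c1, Class2 c2, Class3 c3), and the relationship between the files is stated in the type system rather than inferred by the reader.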
I don't think using a configuration object contradicts the dependency injection pattern. It is more about the form in which you inject your dependencies, and a general question of whether it's better to have a function (in this case the constructor) that takes 20 parameters or to combine those parameters into a class so that they are bundled together.
You are still free to use dependency injection, i.e. construct the configuration object by some factory or a container and inject it into the constructor when creating an instance of your class. Whether or not that's a good idea also depends on the particular case, there are no silver bullets ;)