Build the research directory and use it as an external package for a research project - tensorflow-federated

I want to perform some research regarding quantization/sparsification, and I would like to use the run_experiment.py script as a template. To do this in a clean manner (the research directory is not part of the pip package), I was wondering if it is possible to build it myself and then reuse it as a dependency, since run_experiment.py uses some functions from research. I am not sure how to do that, however. I am not familiar with Bazel; I was able to install it and run the script, but that's all. Any guidance would be highly appreciated! Or if it's not possible, it would be good to know as well! Thank you for any advice in this matter.
EDIT:
I built something using Bazel and I have it in bazel-bin. However, I don't know how to reuse it in my script as if I were doing it the plain Python way, e.g.
from research.compression import compression_process_adapter
or something similar in my script.

Using TFF for Federated Learning Research gives a rough introduction and suggestions for organizing an experiment conceptually.
From there, looking at how the "run scripts" are set up in the various sub-directories under tensorflow_federated/python/research/ might provide good examples. If there is a subdirectory that is close to what you want to accomplish, forking/copying it might be a good place to start.
For instance, tensorflow_federated/python/research/gans/experiments/emnist/run_experiments.py might be a useful example of how to set up an experiment grid. It iteratively runs tensorflow_federated/python/research/gans/experiments/emnist/train.py, which shows how to import libraries under the research/ directory. Note that all of these use Bazel, and the dependencies for the imports are declared in the tensorflow_federated/python/research/gans/experiments/emnist/BUILD file.
Finally, this script can be run with (from the git repo root directory):
bazel run -c opt tensorflow_federated/python/research/gans/experiments/emnist:run_experiments
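As a rough illustration of what your own experiment's BUILD file could look like (the target names, file names, and dependency label below are hypothetical; check the BUILD file of the research subdirectory you actually want to reuse for its real target name), you would declare a py_binary whose deps point at the research library:

# BUILD (hypothetical), living next to your run_my_experiment.py
py_binary(
    name = "run_my_experiment",
    srcs = ["run_my_experiment.py"],
    python_version = "PY3",
    deps = [
        # Label of the research library you want to reuse (illustrative).
        "//tensorflow_federated/python/research/compression:compression_process_adapter",
    ],
)

With the dependency declared, the corresponding Python import in your script follows the repository layout; the gans train.py linked above shows the exact form used there.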

Related

Analog of r-here or py-here for Julia

BACKGROUND
One of the very useful tools for reproducible work in R is the "here" library.
https://malco.io/2018/11/05/why-should-i-use-the-here-package-when-i-m-already-using-projects/
http://jenrichmond.rbind.io/post/how-to-use-the-here-package/
https://here.r-lib.org/
https://here.r-lib.org/articles/rmarkdown.html
I was hooked by the part in the first link where they said this:
The "here" library is encoded in Anaconda as "r-here"
I'm not sure which came first, but Python has a "here" library as well.
https://pypi.org/project/pyhere/
https://github.com/wildland-creative/pyhere
"Here" makes relative paths a trivial matter, which is really useful for reproducible data-science and analysis work.
QUESTION
What is the Julia equivalent for clean handling of relative paths for files?
Is there a clean way to integrate that with project packaging, like RStudio does?
Based on the description, it sounds like DrWatson.jl does what you're looking for. From the website:
[DrWatson] is a Julia package created to help people increase the consistency of their scientific projects, navigate them and share them faster and easier, manage scripts, existing simulations as well as project source code. DrWatson helps establishing reproducibility, and in general it makes managing a scientific project a simple job.
As the description implies, it's more ambitious than here seems to be, with functionality to also manage data, simulation runs, etc. But those parts are optional, and you can use only the directory-handling part if you want.
The Navigating a Project page describes the projectdir function, which works similarly to here. projectdir("foo", "bar") resolves to foo/bar under the current project's root directory, just like with here.
There are also datadir(), srcdir(), and others that directly handle common subdirectories under a project; e.g. datadir("foo", "test.jld2") resolves to data/foo/test.jld2 under the project's root directory.
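For a concrete feel, a minimal sketch (assuming a project named "MyProject" that was set up with DrWatson's project structure) might look like this:

using DrWatson
@quickactivate "MyProject"           # activate the project whose Project.toml names it "MyProject"

# All of these resolve relative to the project root, wherever the script is run from:
projectdir("plots", "fig1.png")      # <root>/plots/fig1.png
datadir("exp_raw", "run1.csv")       # <root>/data/exp_raw/run1.csv
srcdir("model.jl")                   # <root>/src/model.jl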
It doesn't exist, as far as I'm aware (Here.jl doesn't return any Google hits), but it seems like it would be simple enough for someone to implement. Maybe you!

Using large non-bazel dependencies in a bazel project

I would like to use a very large non-bazel system in a bazel project. Specifically, ROS2. This dependency provides a large number of python, C, and C++ libraries which are built using its own hand-rolled buildsystem. Obviously, I would like to avoid having to translate the entire buildsystem over to bazel.
Broadly, what's the best way of doing this? My instinct was to use a custom repository rule to download the source (since it's split across many repositories), then use a genrule to call the ROS2 build system, and then write simple cc_import and py_library rules for each of the individual components that I need.
However, I'm having trouble with the bit where I need to call the foreign build system. It seems that genrules require a list of output files to be specified, while I would like it to make an entire build directory available.
Before I spend any more time on this, I thought I'd ask whether I'm on the right track, since I'm new to Bazel. Is this a good strategy? How would you approach this problem? Are there any other projects that mainly use Bazel but call other build systems in this way that I could look at?
More recently, you can use rules_foreign_cc to build CMake- or configure/make-based projects from within Bazel.
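As a rough sketch of what that looks like (the load path and attribute names have shifted between rules_foreign_cc releases, so check the docs for the version you pin; the repository and target names below are made up, and you still need to register rules_foreign_cc_dependencies() in your WORKSPACE):

load("@rules_foreign_cc//foreign_cc:defs.bzl", "cmake")

# Hypothetical: wrap one CMake-based dependency whose sources were fetched by a
# repository rule named @some_dep_src and exposed through an "all_srcs" filegroup.
cmake(
    name = "some_dep",
    lib_source = "@some_dep_src//:all_srcs",
    out_static_libs = ["libsome_dep.a"],
)

cc_binary(
    name = "my_node",
    srcs = ["main.cpp"],
    deps = [":some_dep"],
)

For something as large as ROS2 you would still need one such target per package (or per group of packages), but it avoids hand-translating each CMakeLists.txt into cc_library rules.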

Handling complex and large dependencies

Problem
I've been developing a game in C++ in my spare time and I've opted to use Bazel as my build tool since I have never had a ton of luck (or fun) working with make or cmake. I also have dependencies in other languages (Python for some of the high-level scripting). I'm using glfw for basic window handling and high-level graphics support, and that works well enough, but now comes the problem: I'm uncertain how I should handle dependencies like glfw in a Bazel world.
For some of my dependencies (like gtest and fruit) I can just reference them in my WORKSPACE file and Bazel handles them automagically but glfw hasn't adopted Bazel. So all of this leads me to ask, what should I do about dependencies that don't use Bazel inside a Bazel project?
Current approach
For many of the simpler dependencies I have, I simply created a new_git_repository entry in my WORKSPACE file and created a BUILD file for the library. This works great until you get to really complicated libraries like glfw that have a number of dependencies on their own.
When building glfw for a Linux machine running X11, you now have a dependency on X11, which would mean adding X11 to my Bazel setup. X11 comes with its own set of dependencies (the X11 libraries like X11Cursor) and so on.
glfw also tries to provide basic joystick support, which is available by default on Linux, which is great! Except that it is provided by the kernel, which means the kernel is also a dependency of my project. While I shouldn't need anything more than the kernel headers, this still seems like a lot to bring in.
Alternative Options
The reason I took the approach I've taken so far is to make the dependencies required to spin up a machine that can successfully build my game very minimal. In theory they just need a C/C++ compiler, Java 8, and Bazel and they're off to the races. This is great since it also means I can create a Docker container that has Bazel installed and do CI/CD really easily.
I could sacrifice this ease and just say that you need to have libraries like glfw installed before attempting to compile the game, but that brings back the whole "which version is installed and how is it configured" problem that Bazel is supposed to help solve.
Surely there is a simpler solution and I'm overthinking this?
If the glfw project has no BUILD files, then you have the following options:
Build glfw inside a genrule.
If glfw supports some other build system like make, you could create a genrule that runs the tool. This approach has obvious drawbacks, like the not-to-be-underestimated impracticality of having to declare all inputs of that genrule, but it'd be the simplest way of Bazel'izing glfw.
Pre-build glfw.o and check it into your source tree.
You can create a cc_library rule for it and put the .o file in the srcs (a sketch of this appears after the list of options). Even though this solution is the least flexible of all, because you not only restrict the target platform to whatever the .o was built for but also make it harder to reproduce the whole build, the benefits are sometimes worth the costs.
I view this approach as a last resort. Even in Bazel's own source code there's one cc_library.srcs that includes a raw object file, because it was worth it, as the commit message of 92caf38 explains.
Require that glfw be installed.
You already considered this option. Some people may prefer this to the other approaches.
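For the second option, a minimal sketch of what such a rule could look like (the paths, the single pre-built object, and the linkopts here are purely illustrative; a real glfw build on X11 needs several more system libraries):

cc_library(
    name = "glfw",
    srcs = ["prebuilt/glfw.o"],                  # pre-built object checked into the tree
    hdrs = glob(["prebuilt/include/GLFW/*.h"]),
    includes = ["prebuilt/include"],
    linkopts = ["-lX11"],                        # system libraries glfw still expects at link time
    visibility = ["//visibility:public"],
)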

What exactly is "building" from source and how does it work

So I really can't understand how this works, but let me explain. First, just in case you need it, I am running Ubuntu 12.04 64-bit on a laptop.
As a build tool I am using CMake. I want to bring OpenCV, MRPT (http://www.mrpt.org/), and libfreenect into my project. All of them have "source code". What I don't understand is what it means to "build from source". How do I make a project with all of them?
Do I need to build each one individually and somehow use them in my project, OR do I download the source code and build them all together at once? As you can see, I'm really confused about what I have to do... do I run the CMakeLists.txt from each source tree and then run one CMakeLists.txt that includes all the others?
In fewer words, if I want to build two or more libraries from source, how do I do that?
I would like a general answer (how this "build from source" works) and an answer specific to the ones I mentioned (CMake, OpenCV, MRPT, libfreenect). I hope I made clear what I don't really understand.
It depends on the 'master' project. In general, in the C/C++ universe, your project must either know how to invoke the build process of each subproject/library OR know how to include and link the results after you have built each external project yourself.
You can also mix the two approaches if needed, but I think it is cleaner to stick to one if possible.
In the first case, if all the subprojects offer CMake build files (CMakeLists.txt), you may try to add_subdirectory() each one and see if there are any conflicts. For example, Google Test can easily be included this way, and it gives your project some global variables that make linking easier later.
Alternatively, if the above approach causes problems or a subproject doesn't provide a CMakeLists.txt, you can use ExternalProject_Add(). It takes more work and you have to handle include paths and linking against your project manually, but it keeps the subproject more independent, which helps, for example, when the subproject's targets conflict with yours.
The last approach involves building and installing the subprojects separately and using configuration variables in your project to point at the subproject's include and library paths. Check "CMake: How To Find Libraries" for details.
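A minimal sketch combining the first and last approaches (the paths and target names here are illustrative, and the actual target exported by libfreenect's CMakeLists.txt may differ):

cmake_minimum_required(VERSION 3.5)
project(my_project)

# Approach 1: the dependency's source tree lives inside your project and ships its
# own CMakeLists.txt, so its targets become part of your build.
add_subdirectory(third_party/libfreenect)

# Last approach: the dependency was built and installed separately; ask CMake to find it.
find_package(OpenCV REQUIRED)

add_executable(my_app src/main.cpp)
target_include_directories(my_app PRIVATE ${OpenCV_INCLUDE_DIRS})
target_link_libraries(my_app PRIVATE ${OpenCV_LIBS} freenect)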

How can I build PDF LaTeX documents with ANT (or some other build system if you prefer)?

The team I work for manages a large collection of technical documentation which is written in LaTeX.
Currently all the documentation we have is manually built by the editors and then checked into a version control system. Sometimes people forget to compile their documents so we have a situation where the PDF and .tex files are often out of step. Unfortunately when this happens our users find themselves reading old versions of our document.
I've managed to hack a simple script to build PDFs using Make - it's rather clumsy.
I was wondering if there was a better way to do it? Most people in our department use Eclipse + Pydev for a Python project which means we are all very familiar with this IDE. I know that Ant plays nicely with Eclipse, so might we be able to use this tool for our doc building?
So what's the best way of doing this? I hope I will not have to learn everything there is to know about a new build-system in order to automate the building of some quite simple docs.
There is an external Ant task for LaTeX PDF generation, though the site is in German.
To use it, download the jar to a location on your machine, then define a taskdef as follows:
<taskdef name="latex" classname="de.dokutransdata.antlatex.LaTeX"
         classpath="/path/to/ant/lib/ant_latex.jar"/>
Then to use it, define a target like this:
<target name="doLaTeX">
  <latex
      latexfile="${ltx2.file}"
      verbose="on"
      clean="on"
      pdftex="off"
      workingDir="${basedir}"
  />
</target>
Where ltx2.file is the file to process.
This is a link to the howto page listing the parameters. If you need any more options, my German is just about passable enough to explain, maybe.
There is also a maven plugin for LaTeX, but I can't find any documentation.
Haven't tried it, but I remember seeing a blog post about it.
If you know Python, this blog post might be interesting.
EDIT: Also, I would assume that you're using some kind of version control system, and I can't say for sure, but I use git to manage all my latex docs, and it might be possible to use some kind of post-commit hook to execute a script to rebuild the document. This would depend on how your repository is structured... just thinking out loud, so to speak.
I went into great detail on a large number of build systems for LaTeX in this question, but it's slightly different in your case. I think you want rubber or latexmk. The latex-makefile seems like a good idea, but it only supports building via PostScript, which might not match your build process.
In general, it's a good idea to keep generated files outside of version control for exactly this reason. A good exception is when specialist build tools are not widely available, and your situation sounds similar. You might do better with a commit hook that builds automatically upon commit.
I guess I should also point out that committing something without first building and checking it is a deadly sin, so a better solution might be to stamp that out.
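If you try the commit-hook route, a minimal sketch (assuming latexmk is installed, the .tex sources live in a doc/ directory, and you're happy rebuilding everything on each commit) could be a .git/hooks/post-commit script along these lines:

#!/bin/sh
# Rebuild every document after a commit; latexmk only re-runs LaTeX when needed.
set -e
cd "$(git rev-parse --show-toplevel)/doc"
for f in *.tex; do
    latexmk -pdf -interaction=nonstopmode "$f"
done

On a central server the same commands could run from a post-receive hook instead, which keeps PDF generation off the editors' machines entirely.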
Maven is a better alternative to Ant as a build system, so I would recommend a Maven plugin to generate PDFs from LaTeX sources. Have a look at mathan-latex-maven-plugin.

Resources