Error running Beam job with DataFlow runner (using Bazel): no module found error - google-cloud-dataflow

I am trying to run a beam job on dataflow using the python sdk.
My directory structure is:
beamjobs/
    setup.py
    main.py
    beamjobs/
        pipeline.py
When I run the job directly using python main.py, the job launches correctly. I use setup.py to package my code and I provide it to beam with the runtime option setup_file.
However if I run the same job using bazel (with a py_binary rule that includes setup.py as a data dependency), I end up getting an error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 804, in run
    work, execution_context, env=self.environment)
  File "/usr/local/lib/python3.7/site-packages/dataflow_worker/workitem.py", line 131, in get_work_items
    work_item_proto.sourceOperationTask.split)
  File "/usr/local/lib/python3.7/site-packages/dataflow_worker/workercustomsources.py", line 144, in __init__
    source_spec[names.SERIALIZED_SOURCE_KEY]['value'])
  File "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", line 290, in loads
    return dill.loads(s)
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 275, in loads
    return load(file, ignore, **kwds)
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 270, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 472, in load
    obj = StockUnpickler.load(self)
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 462, in find_class
    return StockUnpickler.find_class(self, module, name)
ModuleNotFoundError: No module named 'beamjobs'
This is surprising to me because, earlier in the logs, I can see:
Successfully installed beamjobs-0.0.1 pyyaml-5.4.1
So my package is installed successfully.
I don't understand this discrepancy between running with python or running with bazel.
In both cases, the logs seem to show that dataflow tries to use the image gcr.io/cloud-dataflow/v1beta3/python37:2.29.0
Any ideas?

Ok, so the problem was that I was sending the file setup.py as a dependency in bazel; and I could see in the logs that my package beamjobs was being installed correctly.
The issue is that the package was actually empty, because the only dependency I included in the py_binary rule was that setup.py file.
The fix was to also include all the other python files as part of the binary. I did that by creating py_library rules to add all those other files as dependencies.
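One way to catch this kind of problem before launching the job is to look inside the source archive that gets built from setup.py. The sketch below assumes the package is shipped as an sdist tarball and is named beamjobs, as in the question; the function name and the archive path in the usage note are hypothetical.

```python
import tarfile


def modules_in_sdist(sdist_path, package):
    """Return the .py files that belong to `package` inside a source tarball."""
    found = []
    with tarfile.open(sdist_path, "r:*") as archive:
        for name in archive.getnames():
            # sdist members look like "<project>-<version>/<package>/module.py"
            parts = name.split("/")
            if len(parts) >= 3 and parts[1] == package and name.endswith(".py"):
                found.append("/".join(parts[1:]))
    return sorted(found)
```

Calling modules_in_sdist("dist/beamjobs-0.0.1.tar.gz", "beamjobs") (hypothetical path) and getting an empty list would mean the archive carries only setup.py and metadata, which matches the "installed successfully but not importable" symptom.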

The wrapper script that Bazel generates (you can find its path by running bazel build on the target) probably restricts the set of modules available to your script. The proper approach is to have Bazel fetch the PyPI dependencies; look at the example.

Related

Why can't CMake find prefix_path?

I'm writing a Dockerfile for a container to simulate an environment based on Ubuntu Xenial and ROS Kinetic. As part of getting the environment up and running, I have to run a custom build script. It stops with:
-- Could not find the required component 'camera_info_manager'. The following CMake error indicates that you either need to install the package with the same name or change your environment so that it can be found.
CMake Error at /opt/ros/kinetic/share/catkin/cmake/catkinConfig.cmake:83 (find_package):
  Could not find a package configuration file provided by
  "camera_info_manager" with any of the following names:
    camera_info_managerConfig.cmake
    camera_info_manager-config.cmake
  Add the installation prefix of "camera_info_manager" to CMAKE_PREFIX_PATH
  or set "camera_info_manager_DIR" to a directory containing one of the above
  files. If "camera_info_manager" provides a separate development package or
  SDK, be sure it has been installed.
Call Stack (most recent call first):
  flir_spinnaker_driver/CMakeLists.txt:10 (find_package)
-- Configuring incomplete, errors occurred!
See also "/home/eis/build/CMakeFiles/CMakeOutput.log".
See also "/home/eis/build/CMakeFiles/CMakeError.log".
Invoking "cmake" failed
root@0573de1e074a:/home/eis# echo $CMAKE_PREFIX_PATH
/opt/ros/kinetic:/usr/share/camera_info_manager/cmake:/usr/share/camera_info_manager/cmake/camera_info_managerConfig.cmake
root@0573de1e074a:/home/eis#
I have installed the packages libcamera-info-manager-dev and libcamera-info-manager0d, found camera_info_managerConfig.cmake in /usr/share/camera_info_manager/cmake/, and modified CMAKE_PREFIX_PATH like this:
export CMAKE_PREFIX_PATH=$CMAKE_PREFIX_PATH:/usr/share/camera_info_manager/cmake:/usr/share/camera_info_manager
I've also added /usr/share/camera_info_manager and /usr/include/camera_info_manager:
echo $CMAKE_PREFIX_PATH
/opt/ros/kinetic:/usr/share/camera_info_manager/cmake:/usr/share/camera_info_manager/cmake/camera_info_managerConfig.cmake:/usr/share/camera_info_manager:/usr/include/camera_info_manager
but I'm still stuck on this.
What else should I be looking at? Thanks!
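A quick sanity check on the variable itself can at least rule out malformed entries. CMAKE_PREFIX_PATH is meant to list installation prefixes, which are directories, so the minimal sketch below splits the value on ":" and reports entries that are not existing directories, such as a path that points at the .cmake file itself. The function name is hypothetical, and passing this check does not guarantee that find_package will succeed.

```python
import os


def non_directory_entries(prefix_path, sep=":"):
    """Return the entries of a CMAKE_PREFIX_PATH-style string that are not directories."""
    entries = [entry for entry in prefix_path.split(sep) if entry]
    return [entry for entry in entries if not os.path.isdir(entry)]


if __name__ == "__main__":
    # Report suspicious entries in the current environment, if any.
    for entry in non_directory_entries(os.environ.get("CMAKE_PREFIX_PATH", "")):
        print("not a directory:", entry)
```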

Issues installing Drake locally on Ubuntu 18.04

I am following the instructions in the textbook of the course 6.832, appendix A, on how to install Drake locally on Linux.
All the installation steps have completed and seem to be successful. In addition, I have installed all the prerequisites as described. However, when I run the test in section 2.3
(python -c 'import pydrake; print(pydrake.__file__)')
I get several errors.
It seems to be trying to load older versions of several lib***.so files than the ones I have.
For example, pydrake tried to load libgfortran.so.3 when I only have libgfortran.so.4 on my computer. I tried some "hackfixes", using ln -s to make the system accept libgfortran.so.4 as libgfortran.so.3, but now I have run into another error that I don't know how to solve.
It says:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/drake/lib/python2.7/site-packages/pydrake/__init__.py", line 32, in <module>
    from . import common
  File "/opt/drake/lib/python2.7/site-packages/pydrake/common/__init__.py", line 3, in <module>
    from ._module_py import *
ImportError: /opt/drake/lib/python2.7/site-packages/pydrake/common/../../../../libdrake.so: undefined symbol: _ZN6google8protobuf2io17CodedOutputStream28WriteVarint32FallbackToArrayEjPh
How do I handle this problem?
If you followed section A.2.1 "download the binaries" verbatim, you would be downloading https://drake-packages.csail.mit.edu/drake/continuous/drake-latest-xenial.tar.gz, the package for Ubuntu 16.04 (Xenial), which links to libgfortran.so.3.
Since you are on Ubuntu 18.04 (Bionic), you would instead need to download https://drake-packages.csail.mit.edu/drake/continuous/drake-latest-bionic.tar.gz, which links to libgfortran.so.4.
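To avoid picking the wrong archive by hand, the choice can be derived from the system's codename. This is a minimal sketch, assuming /etc/os-release contains a VERSION_CODENAME entry; the function names are hypothetical, and only the two URLs quoted above are known to exist, so other codenames are rejected.

```python
DRAKE_BASE_URL = "https://drake-packages.csail.mit.edu/drake/continuous"
KNOWN_CODENAMES = ("xenial", "bionic")  # the only two mentioned in the answer


def parse_codename(os_release_text):
    """Extract VERSION_CODENAME (e.g. 'bionic') from /etc/os-release content."""
    for line in os_release_text.splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "VERSION_CODENAME":
            return value.strip().strip('"')
    return None


def drake_tarball_url(codename):
    """Build the download URL for a supported Ubuntu codename."""
    if codename not in KNOWN_CODENAMES:
        raise ValueError("no known Drake package for codename: %r" % (codename,))
    return "{}/drake-latest-{}.tar.gz".format(DRAKE_BASE_URL, codename)
```

On the machine in question this would be used as drake_tarball_url(parse_codename(open("/etc/os-release").read())).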

Bazel internal shell issue using windows

I am trying to migrate a huge project consisting of Visual Studio and Maven projects to Bazel. I need to access our in-house Maven server, which is encrypted. To get access I need to load the maven_jar Skylark extension, since the default implementation does not support encryption (I get error 401). Using the extension leads to a lot of trouble, like:
ERROR: BUILD:4:1: no such package '@org_bouncycastle_bcpkix_jdk15on//jar': Traceback (most recent call last):
    File ".../external/bazel_tools/tools/build_defs/repo/maven_rules.bzl", line 280
        _maven_artifact_impl(ctx, "jar", _maven_jar_build_file_te...)
    File ".../external/bazel_tools/tools/build_defs/repo/maven_rules.bzl", line 248, in _maven_artifact_impl
        fail(("%s: Failed to create dirs in e...))
org_bouncycastle_bcpkix_jdk15on: Failed to create dirs in execution root.
The main issue seems to be the shell that has to be provided to Bazel through the BAZEL_SH environment variable:
I am working on Windows.
I am using Bazel 0.23.2.
Bazel seems to run bash commands using "bash" directly rather than the shell provided by the environment variable.
I had an Ubuntu shell installed on Windows. Bazel was using everything from Ubuntu, especially when using Maven (settings.xml was taken from the Ubuntu ~/.m2 and not from the Windows user's).
After uninstalling Ubuntu and making sure that running bash in cmd ends in "command not found", I also removed the BAZEL_SH environment variable, and Bazel throws the message above.
After setting the BAZEL_SH variable again, it fails with the same error message.
I assume that Bazel gets a bash from somewhere or is ignoring the environment variable. My questions are:
1. How do I set up a correct shell?
2. Is BAZEL_SH needed when using the current version?
For me the doc at bazel website about setup is outdated.
Cheers
Please consider using rules_jvm_external to manage your Maven dependencies. It supports both Windows and private repositories using HTTP Basic Authentication.
Regarding "For me the doc at bazel website about setup is outdated": the Bazel team is aware of this and will be updating our docs shortly.

travis-ci path to the shared library or how to link shared library to python

CI professionals,
I cannot figure out why this code cannot find the shared library. Please see the log
https://pastebin.com/KvJP9Ms3
ImportError while importing test module '/home/travis/build/alexlib/pyptv/tests/test_pyptv_batch.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests/test_pyptv_batch.py:1: in <module>
    from pyptv import pyptv_batch
pyptv/pyptv_batch.py:20: in <module>
    from optv.calibration import Calibration
E   ImportError: liboptv.so: cannot open shared object file: No such file or directory
!!!!!!!!!!!!!!!!!!! Interrupted: 1 errors during collection !!!!!!!!!!!!!!!!!!!!
=========================== 1 error in 0.43 seconds ============================
The pull request:
https://github.com/alexlib/pyptv/pull/4
and the build: https://travis-ci.org/alexlib/pyptv/builds/342237102
We have a C library (http://github.com/openptv/openptv) that we need to compile and expose to Python through Cython bindings; our Python code then uses the library through those bindings. The tests work locally but not on Travis-CI (great service). I think it's a simple issue with paths, but I couldn't figure out how to deal with it.
Thanks in advance
Alex
The answer I found was to also set DYLD_LIBRARY_PATH, which is required on Mac OS X:
export PATH=$PATH:/usr/local/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH:/usr/local/lib
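To confirm on the CI machine that the exports actually expose the library, a small check can be run before the tests. This is a minimal sketch with a hypothetical function name; it only inspects the two environment variables, whereas the dynamic loader also consults other locations, so a None result is a hint rather than proof that the import will fail.

```python
import os


def find_on_library_path(lib_name, env=None,
                         variables=("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH")):
    """Return the first path where lib_name exists on the given search-path variables, else None."""
    env = os.environ if env is None else env
    for variable in variables:
        for directory in env.get(variable, "").split(os.pathsep):
            candidate = os.path.join(directory, lib_name)
            # Skip empty entries produced by leading or trailing separators.
            if directory and os.path.isfile(candidate):
                return candidate
    return None
```

For the build above, find_on_library_path("liboptv.so") would be expected to return a path under /usr/local/lib once the exports are in place.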

python-oauth2 on python 3x and windows

I want to set up an OAuth library for Python on my Windows desktop. I am a newbie, this is my second day with Python, and I am having a ton of trouble.
I downloaded the python-oauth2 (hudson-python-oauth2-167.zip) from github. I have extracted this to my python32 folder. When I run the setup command "python setup.py", I first got a syntax error on the print statement. I assumed it is because I am running on windows and so I changed it and then ran the setup.
I then got the following error:
Traceback (most recent call last):
  File "C:\Python32\simplegeo-python-oauth2-1920657\setup.py", line 2, in <module>
    from setuptools import setup, find_packages
ImportError: No module named setuptools
Can someone guide me with setting up python-oauth2? Am I missing something basic here?
You need to install Distribute, which is a fork of setuptools that supports Python 3.
That said, your syntax error was because python-oauth2 doesn't seem to run on Python 3 yet, so you need to either help port it, or use Python 2.7.
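Both symptoms can be checked from Python itself. The sketch below (function names are hypothetical) tests whether a snippet parses under the running Python 3 interpreter, which a Python 2 style print statement does not, and whether setuptools is importable in the current environment.

```python
import ast
import importlib.util


def parses_as_python3(source):
    """Return True if `source` is valid syntax for the running Python 3 interpreter."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def has_setuptools():
    """Return True if a setuptools module can be found in this environment."""
    return importlib.util.find_spec("setuptools") is not None
```

A setup.py that fails parses_as_python3 was most likely written for Python 2, so editing individual print statements by hand is unlikely to be enough.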
