AttributeError: module 'dask' has no attribute 'delayed' - dask

Using Pycharm Community 2018.1.4
Python 3.6
Dask 2.8.1
I am trying to implement dask.delayed on some of my methods and I am getting the error
AttributeError: module 'dask' has no attribute 'delayed'.
This is obviously not true, so I am wondering what I am doing wrong. My implementation structure is as follows:
import dask

def main():
    for i, fn in enumerate(filenames):
        data = {}
        for x in range(0, 2):
            data.update(dask.delayed(load_data)(fn, x))
        success_flag = dask.delayed(execute_analytic)(data)
        if success_flag == 1:
            print("success")
        else:
            print("fail")

def load_data(filename, selector):
    ...

def execute_analytic(data):
    ...

if __name__ == '__main__':
    dask.compute(main())
Essentially, I have a bunch of data files which are independent of each other, so I want to process them in parallel instead of sequentially through a for loop (which is what the code does if you take the dask.delayed out).
Am I fundamentally missing anything in the above implementation of dask.delayed?

I refer to the following issue: https://github.com/dask/dask/issues/1849
To install Dask with pip there are a few options, depending on which
dependencies you would like to keep up to date:
pip install dask[complete]: Install everything
pip install dask[array]: Install dask and numpy
pip install dask[bag]: Install dask and cloudpickle
pip install dask[dataframe]: Install dask, numpy, and pandas
pip install dask: Install only dask, which depends only on the standard
library. This is appropriate if you only want the task schedulers.

You probably only installed the core library, rather than the full library with normal dependencies.
conda install dask
or
pip install dask[complete]
See https://docs.dask.org/en/latest/install.html for more information

pip install "dask[delayed]" is the minimal requirement to directly answer the OP (the other answers may install unnecessary dependencies)

Related

terminate called after throwing an instance of 'std::runtime_error' what() numpy failed to initialize

Environment: Docker image based on nvidia/cuda:11.1-cudnn8-devel-ubuntu20.04, python3.8, numpy==1.19.4, opencv==3.4.3.
Error: terminate called after throwing an instance of 'std::runtime_error' what() numpy failed to initialize, RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd
This solution helped:
pip3 install --upgrade numpy
(Successfully installed numpy-1.23.0)
Description:
The python3 application runs successfully with the specified initial version of numpy on a host running Ubuntu 20.04.4.
However, when run in Docker based on the same Ubuntu release, it stops with the indicated error.
The solution was found, in the form shown above. However, the essence of the question remains unclear: why is there this difference in numpy versions between the host and the Docker container?
Question:
Why does the numpy version differ between the host (1.19.4) and the Docker container (1.23.0), if all the rest of the environment looks identical?
PyCUDA is a package that compiles when it is installed; when it compiles, it builds against numpy's libraries.
The following solution allowed me to use the required version of numpy:
pip3 uninstall pycuda -y
pip3 install --upgrade numpy==1.19.4
pip3 install pycuda==2019.1
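As a quick diagnostic (not part of the original answer), you can print which numpy the interpreter actually loads on the host and inside the container and compare the two:
# check_numpy.py - run once on the host and once inside the container
import numpy

print("numpy version:", numpy.__version__)
print("loaded from  :", numpy.__file__)
If the versions differ, the container image is picking up a different wheel than the host, which is what the reinstall order above works around.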

Drake Mathematical Program Tutorial

I am running Drake on Ubuntu 20.04 using WSL2.
I use python3.8.10 and Drake1.2.0.
I tried running the "Mathematical Program Tutorial" obtained from Deepnote on my PC, but the behavior of the IPOPT solver is unnatural and does not give the expected results.
The first error occurs in the section using the IPOPT solver:
all components of the solution are printed as "nan".
The second error, shown below, concerns "get_solver_details().status":
RuntimeError: The solver_details has not been set yet.
I can see both errors in "Demo on manually choosing a solver" in the tutorial.
The result is as follows:
SolutionResult.kUnknownError
x* = [nan nan]
Solver is IPOPT
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-12-2d1b3835c54a> in <module>
25 print("x* = ", result.GetSolution(x))
26 print("Solver is ", result.get_solver_id().name())
---> 27 print("Ipopt solver status: ", result.get_solver_details().status,
28 ", meaning ", result.get_solver_details().ConvertStatusToString())
RuntimeError: The solver_details has not been set yet.
Thank you in advance.
P.S.
I installed pydrake into a venv with the following pip commands:
python3 -m venv env
env/bin/pip install --upgrade pip
env/bin/pip install drake
sudo apt-get install --no-install-recommends \
libpython3.8 libx11-6 libsm6 libxt6 libglib2.0-0
source env/bin/activate
I just downloaded the folder "Tutorial" from Deepnote and put it under env.
Then I run it with Jupyter Notebook as
jupyter notebook
and open env/Tutorials/mathematical_program.ipynb
It turns out that the pip drake==1.2.0 version has a bug in the IpoptSolver compilation.
As a work-around, you can use SnoptSolver instead, or else use the https://drake.mit.edu/from_binary.html release (unpacking a zipped binary instead of using pip).
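For example, a minimal sketch of forcing SNOPT explicitly (the module paths match the pydrake 1.2.0-era layout mentioned in the question and may differ in newer releases; the cost and bounds are just illustrative):
from pydrake.solvers.mathematicalprogram import MathematicalProgram
from pydrake.solvers.snopt import SnoptSolver

prog = MathematicalProgram()
x = prog.NewContinuousVariables(2, "x")
prog.AddCost((x[0] - 1) ** 2 + (x[1] + 2) ** 2)      # simple quadratic cost
prog.AddBoundingBoxConstraint(-5, 5, x)               # keep the variables bounded

solver = SnoptSolver()                                 # bypass the default solver choice
result = solver.Solve(prog)
print(result.is_success())
print("x* =", result.GetSolution(x))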
It's possible that the pydrake.solvers.ipopt.IpoptSolver class (which is a wrapper around the https://coin-or.github.io/Ipopt/ library) does not run correctly under WSL2, due to using some odd libc API which doesn't work on Windows. We will need more information to reproduce the problem and try to debug.
Can you state exactly how you installed pydrake (i.e., show us the command lines you used)? Was it via pip (https://drake.mit.edu/pip.html) or just via the binary (https://drake.mit.edu/from_binary.html)?
Can you state exactly how you ran Jupyter (the command line) to launch the notebook? Was it python3 -m pydrake.tutorials or something else?
Looks like this may not be tied to WSL, but rather to the pip build (or just the binary build). I ran into this on Ubuntu 20.04 (no WSL). Per Drake Slack, I filed an issue:
https://github.com/RobotLocomotion/drake/issues/17162

XGB via Scikit learn API doesn't seem to be running in GPU although compiled to run for GPU

It appears that although XGB is compiled to run on GPU, when it is called/executed via the Scikit-learn API it doesn't seem to be running on the GPU.
Please advise if this is expected behaviour.
As far as I can tell, the Scikit learn API does not currently support GPU. You need to use the learning API (e.g. xgboost.train(...)). This also requires you to first convert your data into xgboost DMatrix.
Example:
params = {"updater":"grow_gpu"}
train = xgboost.DMatrix(x_train, label=y_train)
clf = xgboost.train(params, train, num_boost_round=10)
UPDATE:
The Scikit Learn API now supports GPU via the **kwargs argument:
http://xgboost.readthedocs.io/en/latest/python/python_api.html#id1
I couldn't get this working from the pip-installed XGBoost, but I pulled the most recent XGBoost from GitHub (git clone --recursive https://github.com/dmlc/xgboost) and compiled it with the PLUGIN_UPDATER_GPU flag, which allowed me to use the GPU with the sklearn API. This also required me to change some NVCC flags to work on my GTX 960, which was causing some build errors and then some runtime errors due to an architecture mismatch. After it built, I installed it with pip install -e python-package/ within the repo directory. To use the Scikit-learn API (using either grow_gpu or grow_hist_gpu):
import xgboost as xgb
model = xgb.XGBClassifier(
max_depth=5,
objective='binary:logistic',
**{"updater": "grow_gpu"}
)
model.fit(train_x, train_y)
If anyone is interested in the process to fix the build with the GPU flag, here is the process that I went through on Ubuntu 14.04.
i) git clone --recursive https://github.com/dmlc/xgboost
ii) cd into xgboost and run make -j4 to create the multi-threaded build, if no GPU is desired
iii) to build with GPU support, edit make/config.mk to use PLUGIN_UPDATER_GPU
iv) Edit the Makefile, in the NVCC section, to use the flag --gpu-architecture=sm_xx for your GPU version (5.2 for the GTX 960), around line 101
#CODE = $(foreach ver,$(COMPUTE),-gencode arch=compute_$(ver),code=sm_$(ver)) TO
CODE = --gpu-architecture=sm_52
v) Run ./build.sh; it should say it completed in multi-threaded mode, otherwise the NVCC build probably failed (or another error occurred; look above for the error message)
vi) In the virtualenv (if desired) in the same directory run pip install -e python-package/
These are some things that resolved nvcc errors for me:
i) Installing/updating the CUDA Toolkit by downloading the CUDA Toolkit .deb from Nvidia (version 8.0 worked for me, and may be required in some cases).
ii) Install/update cuda
sudo apt-get update
sudo apt-get install cuda
iii) Add nvcc to your path. Mine was in /usr/local/cuda/bin/
iv) A restart may be required if running nvidia-smi does not work due to some of the cuda/driver/toolkit updates.
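As a further note, more recent XGBoost releases expose GPU training directly through the scikit-learn wrapper via the tree_method parameter, so no custom build is needed. A minimal sketch (the toy data is a stand-in for train_x/train_y above, and exact parameter support depends on the installed version):
import numpy as np
import xgboost as xgb

# toy data standing in for the train_x / train_y used in the example above
train_x = np.random.rand(200, 4)
train_y = np.random.randint(0, 2, size=200)

model = xgb.XGBClassifier(
    max_depth=5,
    objective="binary:logistic",
    tree_method="gpu_hist",  # selects the GPU histogram algorithm in recent releases
)
model.fit(train_x, train_y)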

Dataflow wordcount.py example " Import by filename is not supported"

Using Ubuntu 14.04,
DataFlow Python SDK
Following the instructions at https://github.com/GoogleCloudPlatform/DataflowPythonSDK#status-of-this-release, after everything is loaded, when I try the wordcount example I get the error "Import by filename is not supported".
I suspect the issue is at line 23 of the wordcount.py example
import google.cloud.dataflow as df
Is there a workaround for this issue?
I have tried the solution posted at "Python / ImportError: Import by filename is not supported", but that does not solve the problem.
Since this fails at the first import statement, the immediate thing to check is whether the Python Dataflow package is installed at all. The way to do that is by running 'pip freeze'. Here is some output from running this in a virtual environment:
$ pip freeze
... Nothing since it is a clean virtual environment ...
$ pip install https://github.com/GoogleCloudPlatform/DataflowPythonSDK/archive/v0.2.3.tar.gz
... Output from installing packages ...
$ pip freeze
...
python-dataflow==0.2.3
...
Now you can run python and execute 'import google.cloud.dataflow as df' and it should work.
Hopefully this helps!

IPython parallel does not work for me in IPython 2.2 but did in 2.1

The skeleton code of what I do is:
from IPython import parallel
.....
rcAll = parallel.Client()
all_engines = rcAll[:]
lbvAll = rcAll.load_balanced_view()
....
for anInpt in allInpt:
    lbvAll.apply(mputil.doAll, anInpt)
lbvAll.wait()
lbvAll.get_result()
....
for ijk in range(len(list(lbvAll.results.values()))):
    out = list(lbvAll.results.values())[ijk]
    ionS = out[0]
However, all that out ever contains is import error messages.
This worked before, but something must have changed between IPython 2.1 and 2.2. At least, that is my guess.
Check the output of:
cat /usr/local/lib/python2.7/dist-packages/*.pth
Delete the following path if it exists in the "catted" output:
/usr/lib/python2.7/dist-packages
/usr/lib/python2.7/dist-packages being at the front of sys.path means that there is an easy-install.pth file with this path, which should be removed. It is caused by a bug in setuptools.
If that doesn't work, simply upgrading some of your tools might fix the problem.
pip install --upgrade ipython
pip install --upgrade setuptools pip
I found the problem. I started the ipcluster in a shell with a different PYTHONPATH than the one I was running the notebook in. As simple as that, but it took me a while to find. I apologize for the noise.
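For anyone hitting the same thing, a quick way to spot such a mismatch is to compare sys.path on the engines with the notebook's own (a diagnostic sketch using the same IPython.parallel API as above):
from IPython import parallel
import sys

rc = parallel.Client()
dview = rc[:]

def engine_sys_path():
    import sys
    return sys.path

# If the engines were started with a different PYTHONPATH,
# these lists will differ and imports can fail on the engines only.
print(dview.apply_sync(engine_sys_path))
print(sys.path)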
