Solving environment: failed
ResolvePackageNotFound:
- sphinxcontrib==1.0=py35_0
- requests==2.14.2=py35_0
- rope==0.9.4=py35_1
- pytorch==0.1.12=py35_0.1.12cu80
- pylint==1.7.2=py35_0
- nbconvert==5.2.1=py35_0
- vc==14=0
- libpng==1.6.30=vc14_1
- numpy==1.13.1=py35_0
- jsonschema==2.6.0=py35_0
- alabaster==0.7.10=py35_0
- simplegeneric==0.8.1=py35_1
- entrypoints==0.2.3=py35_0
- isort==4.2.15=py35_0
- qt==5.6.2=vc14_6
- setuptools==36.4.0=py35_1
- mkl==2017.0.3=0
- path.py==10.3.1=py35_0
- babel==2.5.0=py35_0
- icu==57.1=vc14_0
- vs2015_runtime==14.0.25420=0
- jedi==0.10.2=py35_2
- jpeg==9b=vc14_0
It's actually an error you can fix: run conda update --all, then export the environment without build strings via conda env export --no-builds > env.yml, and then install as usual with conda env create -f env.yml.
After this fix works, you might also encounter an error with the output 'CondaValueError: prefix already exists: /home/user/anaconda3'.
For this error you just need to open your .yml file and make sure the environment name is not 'base' or any other existing environment name. That's it! I hope this works!
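For reference, the whole fix as a shell sequence (a sketch of the steps above; run it on the machine where the original environment lives):

# update packages so the stale pinned builds resolve
conda update --all
# export the environment without build strings (drops the =py35_0 style suffixes)
conda env export --no-builds > env.yml
# recreate the environment from the exported file
conda env create -f env.yml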
I have the following configuration for a Serverless Lambda with IAM, and I get the error MalformedPolicyDocument: Resource must be in ARN format or "*" for the value I pass under this config:
- Effect: 'Allow'
  Action:
    - 'kafka-cluster:Connect'
    - 'kafka-cluster:DescribeTopic'
    - 'kafka-cluster:DescribeGroup'
    - 'kafka-cluster:ReadData'
    - 'kafka-cluster:AlterGroup'
    - 'kafka-cluster:DescribeClusterDynamicConfiguration'
  Resource: ${env.KAFKA_CLUSTER_ARN}
The value for KAFKA_CLUSTER_ARN is arn:aws:kafka:us-west-2:111111111111:cluster/kafka-cluster-test/6ebf68e8-ad47-47af-8c41-5801c095ab72-1, which is configured in the env config files.
Using Serverless 2.72.2
Please advise what I'm not configuring properly.
The issue occurred due to improperly referencing the dotenv variable with the env. prefix instead of env: (Serverless resolves environment variables with the ${env:VAR_NAME} syntax).
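With that fixed, the Resource line from the statement above becomes (only the prefix changes):

Resource: ${env:KAFKA_CLUSTER_ARN}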
In my Dockerfile for building the custom Docker base image, I specify the following base image:
FROM nvidia/cuda:10.1-cudnn7-devel-ubuntu16.04
The Dockerfile corresponding to the nvidia/cuda base image is found here: https://gitlab.com/nvidia/container-images/cuda/blob/master/dist/ubuntu16.04/10.1/devel/cudnn7/Dockerfile
Now when I log the device in the AzureML run:
import torch
from azureml.core import Run

run = Run.get_context()
# log the device: GPU if available, else CPU
run.log("Using device: ", torch.device('cuda' if torch.cuda.is_available() else 'cpu'))
I get
device(type='cpu')
but I would like to have a GPU and not a CPU. What am I doing wrong?
EDIT: I do not know exactly what you need.
But I can give you the following information:
azureml.core VERSION is 1.0.57.
The compute_target is defined via:
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.core.compute_target import ComputeTargetException

def compute_target(ws, cluster_name):
    try:
        # reuse the cluster if it already exists
        cluster = ComputeTarget(workspace=ws, name=cluster_name)
    except ComputeTargetException:
        compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6', min_nodes=0, max_nodes=4)
        cluster = ComputeTarget.create(ws, cluster_name, compute_config)
    return cluster
The experiment is run via:
ws = workspace(os.path.join("azure_cloud", 'config.json'))
exp = experiment(ws, name=<name>)
c_target = compute_target(ws, <name>)
est = Estimator(source_directory='.',
                script_params=script_params,
                compute_target=c_target,
                entry_script='azure_cloud/azure_training_wrapper.py',
                custom_docker_image=image_name,
                image_registry_details=img_reg_details,
                user_managed=True,
                environment_variables={"SYSTEM": "azure_cloud"})

# run the experiment / train the model
run = exp.submit(config=est)
The yaml file contains:
dependencies:
  - conda-package-handling=1.3.10
  - python=3.6.2
  - cython=0.29.10
  - scikit-learn==0.21.2
  - anaconda::cloudpickle==1.2.1
  - anaconda::cffi==1.12.3
  - anaconda::mxnet=1.5.0
  - anaconda::psutil==5.6.3
  - anaconda::pycosat==0.6.3
  - anaconda::pip==19.1.1
  - anaconda::six==1.12.0
  - anaconda::mkl==2019.4
  - anaconda::cudatoolkit==10.1.168
  - conda-forge::pycparser==2.19
  - conda-forge::openmpi=3.1.2
  - pytorch::pytorch==1.2.0
  - tensorboard==1.13.1
  - tensorflow==1.13.1
  - tensorflow-estimator==1.13.0
  - pip:
    - pytorch-transformers==1.2.0
    - azure-cli==2.0.72
    - azure-storage-nspkg==3.1.0
    - azureml-sdk==1.0.57
    - pandas==0.24.2
    - tqdm==4.32.1
    - numpy==1.16.4
    - matplotlib==3.1.0
    - requests==2.22.0
    - setuptools==41.0.1
    - ipython==7.8.0
    - boto3==1.9.220
    - botocore==1.12.220
    - cntk==2.7
    - ftfy==5.6
    - gensim==3.8.0
    - horovod==0.16.4
    - keras==2.2.5
    - langdetect==1.0.7
    - langid==1.1.6
    - nltk==3.4.5
    - ptvsd==4.3.2
    - pytest==5.1.2
    - regex==2019.08.19
    - scipy==1.3.1
    - scikit_learn==0.21.3
    - spacy==2.1.8
    - tensorpack==0.9.8
EDIT 2: I tried use_gpu = True as well as upgrading to azureml-sdk==1.0.65, but to no avail. Some people suggest additionally installing the CUDA drivers via apt-get install cuda-drivers, but this does not work and I cannot build a Docker image with that.
The output of nvcc --version on the docker image yields:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
So I think that should be OK. The Docker image itself of course has no GPU, so the command nvidia-smi is not found, and
python -i
and then
import torch
print(torch.cuda.is_available())
will print False.
In your Estimator definition, please try adding use_gpu=True:
est = Estimator(source_directory='.',
                script_params=script_params,
                compute_target=c_target,
                entry_script='azure_cloud/azure_training_wrapper.py',
                custom_docker_image=image_name,
                image_registry_details=img_reg_details,
                user_managed=True,
                environment_variables={"SYSTEM": "azure_cloud"},
                use_gpu=True)
I believe with azureml-sdk>=1.0.60 this should be inferred from the VM size used, but since you are using 1.0.57, I think this is still required.
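As a quick sanity check you can log the SDK version from inside the run (a sketch; azureml.core.VERSION is the attribute the question already quotes above):

import azureml.core
# the question reports 1.0.57; from 1.0.60 onwards use_gpu should be inferred from the VM size
print(azureml.core.VERSION)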
After I installed opencv in a conda environment, I activated myenv.
python3 is then picked up from myenv, but opencv is still the system-installed one. Why did this happen? Did I forget any post-installation step, like adding to PATH or modifying pkg-config?
command for building conda virtual env:
conda env create -n myenv -f env.yml
cv2.so is already there and the libopencv* files are also there, so the installation itself should be OK. It seems like PATH or something similar is the problem.
name: cv320_tf
channels:
  - menpo
  - defaults
dependencies:
  - cairo=1.14.8=0
  - certifi=2016.2.28=py35_0
  - cycler=0.10.0=py35_0
  - dbus=1.10.20=0
  - expat=2.1.0=0
  - fontconfig=2.12.1=3
  - freetype=2.5.5=2
  - glib=2.50.2=1
  - gst-plugins-base=1.8.0=0
  - gstreamer=1.8.0=0
  - harfbuzz=0.9.39=2
  - hdf5=1.8.17=2
  - icu=54.1=0
  - jbig=2.1=0
  - jpeg=9b=0
  - libffi=3.2.1=1
  - libgcc=5.2.0=0
  - libgfortran=3.0.0=1
  - libiconv=1.14=0
  - libpng=1.6.30=1
  - libtiff=4.0.6=3
  - libxcb=1.12=1
  - libxml2=2.9.4=0
  - matplotlib=2.0.2=np111py35_0
  - mkl=2017.0.3=0
  - numpy=1.11.3=py35_0
  - openssl=1.0.2l=0
  - pandas=0.20.1=np111py35_0
  - patsy=0.4.1=py35_0
  - pcre=8.39=1
  - pip=9.0.1=py35_1
  - pixman=0.34.0=0
  - pyparsing=2.2.0=py35_0
  - pyqt=5.6.0=py35_2
  - python=3.5.4=0
  - python-dateutil=2.6.1=py35_0
  - pytz=2017.2=py35_0
  - qt=5.6.2=5
  - readline=6.2=2
  - scipy=0.19.0=np111py35_0
  - seaborn=0.8=py35_0
  - setuptools=36.4.0=py35_1
  - sip=4.18=py35_0
  - six=1.10.0=py35_0
  - sqlite=3.13.0=0
  - statsmodels=0.8.0=np111py35_0
  - tk=8.5.18=0
  - wheel=0.29.0=py35_0
  - xz=5.2.3=0
  - zlib=1.2.11=0
  - opencv3=3.2.0=np111py35_0
  - pip:
    - bleach==1.5.0
    - enum34==1.1.6
    - html5lib==0.9999999
    - markdown==2.6.11
    - protobuf==3.5.1
    - tensorflow==1.4.1
    - tensorflow-tensorboard==0.4.0
    - werkzeug==0.14.1
Could you please try the steps below:
1. Uninstall the existing opencv.
2. Install opencv inside the conda environment using the command conda install opencv.
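To verify which build python actually picks up, a quick check from inside the activated environment (a sketch; the printed path should point into the conda environment, not /usr or /usr/local):

import cv2
print(cv2.__file__)     # expect a path inside the conda env, e.g. .../envs/myenv/...
print(cv2.__version__)  # expect 3.2.0, matching the env file above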
The documentation says:
When you define multiple variables per line in the env array (matrix variables), one build is triggered per item.
rvm:
  - 1.9.3
  - rbx
env:
  - FOO=foo BAR=bar
  - FOO=bar BAR=foo
But what if I define only 1 per line? I'm doing the following:
env:
  - FOO=1
  - BAR=2
  - BAZ=3
But it's triggering 3 builds? I expected it to trigger 1 build with those 3 env variables. Do I have to define them like this?
env:
  - FOO=1 BAR=2 BAZ=3 QUX=4 ........ =10
Or am I missing something here?
You need to define them as global variables:
env:
  global:
    - FOO=1
    - BAR=2
    - BAZ=3
See Global variables documentation for more info.
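If you later need per-build variables in addition to the shared ones, both forms can be combined; a sketch based on the same documentation (the matrix entries still expand into one build each, and every build gets the global variables):

env:
  global:
    - FOO=1
    - BAR=2
  matrix:
    - BAZ=3
    - BAZ=4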
I'm trying to set allow_failures for a complex build process, but unfortunately it isn't working.
The problem is that in my env I am setting multiple environment variables, and I cannot get Travis to recognise that I want one of these rows to be allowed to fail.
The documentation on allow_failures shows how to allow a single env to fail, along with another configuration option, but doesn't cover how to allow a multiple environment variable setup to fail.
The troublesome sections of the .travis.yml file are below:
env:
  - DJANGO_VERSION='1.8,<1.9' DB=sqlitefile SEARCH=whoosh
  - DJANGO_VERSION='1.8,<1.9' DB=postgres SEARCH=whoosh
  - DJANGO_VERSION='1.8,<1.9' DB=mysql SEARCH=whoosh
  - DJANGO_VERSION='1.8,<1.9' DB=sqlitefile SEARCH=elasticsearch
  - DJANGO_VERSION='1.8,<1.9' DB=postgres SEARCH=elasticsearch
  - DJANGO_VERSION='1.8,<1.9' DB=mysql SEARCH=elasticsearch
matrix:
  allow_failures:
    - env: DJANGO_VERSION='1.8,<1.9' DB=mysql SEARCH=elasticsearch
    - env: DJANGO_VERSION='1.8,<1.9' DB=mysql SEARCH=whoosh
How can I do this?
Fixed!
Travis allow_failures entries must match the corresponding env entry exactly, down to the whitespace!
So this won't work (note the extra space between the variables in the allow_failures entry):
env:
  - FOO='one' BAR='two'
  - FOO='three' BAR='four'
matrix:
  allow_failures:
    - env: FOO='one'  BAR='two'
But this will, because the allow_failures string is identical to the env entry:
env:
  - FOO='one' BAR='two'
  - FOO='three' BAR='four'
matrix:
  allow_failures:
    - env: FOO='one' BAR='two'