I'm having an issue deploying and linking 3 containers using docker-compose in Bluemix / IBM Containers. The compose file I am using has worked before and still works, but only inconsistently. When it fails, I get the following response:
Recreating xxxxx_1
Recreating yyyyy_1
Creating zzzzz_1
ERROR: for server 'message'
Traceback (most recent call last):
File "docker-compose", line 3, in <module>
File "compose/cli/main.py", line 64, in main
File "compose/cli/main.py", line 116, in perform_command
File "compose/cli/main.py", line 876, in up
File "compose/project.py", line 416, in up
File "compose/parallel.py", line 66, in parallel_execute
KeyError: 'message'
Failed to execute script docker-compose
Docker Compose does not surface very good error messages; when something happens that it isn't expecting, you get an opaque stack trace like this one. My guess is that it's timing: the default timeouts in Compose assume a local Docker engine (perhaps on a very fast machine), so if containers don't start very quickly, Compose sometimes gives up.
In the Bluemix cloud, the containers have SDN and other setup that can take longer than on local Docker, so "starting very quickly" is not always within what Compose expects.
Try running export COMPOSE_HTTP_TIMEOUT=300 first to raise the timeout; that should help matters.
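For example, in the same shell session where you run Compose:
export COMPOSE_HTTP_TIMEOUT=300
docker-compose up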
I am trying to replicate Machine Learning Inference using AWS Lambda and Amazon EFS. I was able to deploy the project; however, running inference failed because the machine learning model was not found. I checked CloudWatch and got the following output:
[ERROR] FileNotFoundError: Missing /mnt/ml/models/craft_mlt_25k.pth and downloads disabled
Traceback (most recent call last):
File "/var/task/app.py", line 23, in lambda_handler
model_cache[languages_key] = easyocr.Reader(language_list, model_storage_directory=model_dir, user_network_directory=network_dir, gpu=False, download_enabled=False)
File "/var/lang/lib/python3.8/site-packages/easyocr/easyocr.py", line 88, in __init__
detector_path = self.getDetectorPath(detect_network)
File "/var/lang/lib/python3.8/site-packages/easyocr/easyocr.py", line 246, in getDetectorPath
raise FileNotFoundError("Missing %s and downloads disabled" % detector_path)
Then I noticed that not even the directory that was supposed to store the models had been created in the S3 bucket.
The Dockerfile has the following command: RUN mkdir -p /mnt/ml, but in my S3 bucket this directory does not exist.
Is it possible to create the directories and upload the EasyOCR model manually? If I do, will I have to modify the original code?
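For reference, here is roughly what the relevant part of app.py looks like, reconstructed from line 23 of the traceback above; the directory names and the hardcoded language list are assumptions, not the original code:

import easyocr

# Paths on the EFS mount; /mnt/ml comes from the error message and Dockerfile
# above, but the exact subdirectory names are assumptions.
model_dir = "/mnt/ml/models"
network_dir = "/mnt/ml/user_network"
model_cache = {}

def lambda_handler(event, context):
    # The original presumably derives these from the request; hardcoded here.
    language_list = ["en"]
    languages_key = ",".join(language_list)
    if languages_key not in model_cache:
        # download_enabled=False means EasyOCR never downloads anything, so
        # the .pth files (e.g. craft_mlt_25k.pth) must already exist under
        # model_dir before the first call, or getDetectorPath raises.
        model_cache[languages_key] = easyocr.Reader(
            language_list,
            model_storage_directory=model_dir,
            user_network_directory=network_dir,
            gpu=False,
            download_enabled=False,
        )
    reader = model_cache[languages_key]
    ...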
My code runs fine when I launch it from the VS Code IDE or from cmd, but when I run it through a Jenkins Windows batch command step it fails (launching the app has no effect). My Jenkins also runs as admin.
C:\ProgramData\Jenkins\.jenkins\workspace\Agent_Automation>python test_enf_ins_gui.py
Traceback (most recent call last):
File "C:\Program Files\Python37\lib\site-packages\pywinauto\timings.py", line 436, in wait_until_passes
func_val = func(*args, **kwargs)
File "C:\Program Files\Python37\lib\site-packages\pywinauto\findwindows.py", line 87, in find_element
raise ElementNotFoundError(kwargs)
pywinauto.findwindows.ElementNotFoundError: {'class_name': 'MsiDialogCloseClass', 'backend': 'uia', 'visible_only': False}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test_enf_ins_gui.py", line 5, in <module>
r'C:\Users\txone\Downloads\Install.exe').connect(class_name="MsiDialogCloseClass",timeout=20)
File "C:\Program Files\Python37\lib\site-packages\pywinauto\application.py", line 994, in connect
*(), **kwargs
File "C:\Program Files\Python37\lib\site-packages\pywinauto\timings.py", line 458, in wait_until_passes
raise err
pywinauto.timings.TimeoutError
If you run the Jenkins agent as a service, Windows will not give you access to create or control any GUI. You have to start the Jenkins agent under a normal user, not as a service (perhaps using the Windows Task Scheduler). Many hints are collected in the Remote Execution Guide, including ways to handle an RDP or VNC desktop. Please read it carefully.
P.S. PsExec may also hang under Jenkins/Java (a known issue, probably not your case), but you can use its open-source analogue PAExec with similar options if you ever need it.
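For reference, here is the failing call reconstructed from the traceback; the chained start(...) is an assumption about what line 5 of test_enf_ins_gui.py looks like, while the installer path and search criteria are taken from the traceback. Even this minimal version can only succeed in an interactive desktop session:

from pywinauto.application import Application

# Launch the installer, then wait up to 20 s for its MSI dialog window.
# These criteria match the ElementNotFoundError shown above.
app = Application(backend="uia").start(r'C:\Users\txone\Downloads\Install.exe')
app.connect(class_name="MsiDialogCloseClass", timeout=20)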
We have a Dask pipeline in which we basically use a LocalCluster as a process pool, i.e. we start the cluster with LocalCluster(processes=True, threads_per_worker=1), like so:
dask_cluster = LocalCluster(processes=True, threads_per_worker=1)
with Client(dask_cluster) as dask_client:
exit_code = run_processing(input_file, dask_client, db_state).value
Our workflow and task parallelization work great when run locally. However, when we move the code into a Docker container (CentOS-based), the processing completes but we sometimes get the following error as the container exits:
Traceback (most recent call last):
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/multiprocessing/queues.py", line 240, in _feed
send_bytes(obj)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Furthermore, we get multiple instances of this error, which makes me think it is coming from abandoned worker processes. Our current working theory is that this is somehow related to the "Docker zombie reaping problem", but we don't know how to fix it without switching to a completely different Docker image, and we would rather not do that.
Is there a way to fix this using only Dask cluster/client cleanup methods?
You should create the cluster as a context manager too; the cluster, not the Client, is the thing that actually launches the worker processes, so it is what needs an orderly shutdown. Using the same names as your snippet:
with LocalCluster(processes=True, threads_per_worker=1) as dask_cluster:
    with Client(dask_cluster) as dask_client:
        exit_code = run_processing(input_file, dask_client, db_state).value
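If nested context managers don't fit your control flow, closing both objects explicitly in the same order should have the same effect; Client.close() and LocalCluster.close() are the cleanup methods you asked about. A sketch reusing the names from your snippet (run_processing, input_file, db_state are yours):

from dask.distributed import Client, LocalCluster

dask_cluster = LocalCluster(processes=True, threads_per_worker=1)
dask_client = Client(dask_cluster)
try:
    exit_code = run_processing(input_file, dask_client, db_state).value
finally:
    dask_client.close()   # disconnect the client first
    dask_cluster.close()  # then shut down the worker processes cleanly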
I am having difficulty creating a Graph object on Neo4j 3.4.6 using py2neo 4.1.0 with Python 3.7.0.
I created a Docker container running Neo4j, and I disabled authentication using Dockerfile entry ENV NEO4J_AUTH=none. I verified that I can browse to the Neo4j database from the host with http://localhost:7474 and that I was not required to enter a password.
I created a second Docker container for my web server. I accessed its Bash shell using docker exec -it 033f92b042c1 /bin/bash and verified that I can ping the container running Neo4j.
From the second container, I tried to create a Database object or a Graph object:
import neo4j
import py2neo
from py2neo import Graph
graph = Graph("bolt://172.17.0.3:7687")
I tried different protocols and localhost rather than the IP. In each case, Python throws this:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/py2neo/database.py", line 88, in __new__
inst = cls._instances[key]
KeyError: '5280a6d494b601f0256493eab3a08e55'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.7/site-packages/py2neo/database.py", line 305, in __new__
database = Database(uri, **settings)
File "/usr/local/lib/python3.7/site-packages/py2neo/database.py", line 92, in __new__
from py2neo.internal.http import HTTPDriver, HTTPSDriver
File "/usr/local/lib/python3.7/site-packages/py2neo/internal/http.py", line 26, in <module>
from neo4j.addressing import SocketAddress
File "/usr/local/lib/python3.7/site-packages/neo4j/addressing.py", line 26, in <module>
from neo4j.exceptions import AddressError
ImportError: cannot import name 'AddressError' from 'neo4j.exceptions' (/usr/local/lib/python3.7/site-packages/neo4j/exceptions.py)
Am I missing a dependency, or is there another way I should be connecting the Docker containers?
py2neo doesn't require the neo4j package, so it is possible that the stray neo4j installation is what's causing the problem.
In my case, to be sure of a clean installation, I removed all Neo4j-related modules and then installed py2neo together with its requirement, neo4j-driver:
pip uninstall neo4j neobolt neo4jrestclient neo4j-driver py2neo
then install:
pip install neo4j-driver py2neo
I hope that will work.
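After the clean reinstall, a quick smoke test against the bolt URI from the question might look like this (RETURN 1 is just a trivial Cypher statement to prove the connection works):

from py2neo import Graph

# Connect over bolt; if the import error is gone, this line now succeeds.
graph = Graph("bolt://172.17.0.3:7687")
print(graph.run("RETURN 1").evaluate())  # prints 1 if the connection works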
docker build builds and runs the image fine, but during docker-compose I get the following error:
> .\docker-compose-Windows-x86_64.exe -f C:\t\tea\docker-compose.yml up
Building web
Traceback (most recent call last):
File "docker-compose", line 6, in <module>
File "compose\cli\main.py", line 71, in main
File "compose\cli\main.py", line 127, in perform_command
File "compose\cli\main.py", line 1039, in up
File "compose\cli\main.py", line 1035, in up
File "compose\project.py", line 465, in up
File "compose\service.py", line 327, in ensure_image_exists
File "compose\service.py", line 999, in build
File "site-packages\docker\api\build.py", line 149, in build
File "site-packages\docker\utils\build.py", line 15, in tar
File "site-packages\docker\utils\utils.py", line 100, in create_archive
File "tarfile.py", line 1802, in gettarinfo
FileNotFoundError: [WinError 3] The system cannot find the path specified:
'C:\\t\\tea\\src\\app\\accSettings\\account-settings-main\\components\\account-settings-catalog\\components\\account-settings-copy-catalog-main\\components\\account-settings-copy-catalog-destination\\components\\account-settings-copy-destination-table\\account-settings-copy-destination-table.component.html'
[18400] Failed to execute script docker-compose
> docker -v
Docker version 18.03.0-ce-rc1, build c160c73
> docker-compose -v
docker-compose version 1.19.0, build 9e633ef3
I've enabled Win32 long paths in my local Group Policy editor, but I'm not having any luck solving this issue.
Here is the docker-compose.yml if it helps:
version: '3'
services:
  web:
    image: web
    build:
      context: .
      dockerfile: Dockerfile
This is a known issue with docker-compose under some circumstances, and it is related to the MAX_PATH limitation of 260 characters on Windows.
Excerpt from the Microsoft docs on Maximum Path Length Limitation:
In the Windows API (with some exceptions discussed in the following paragraphs), the maximum length for a path is MAX_PATH, which is defined as 260 characters.
From reading up on this, it seems that the solution depends on your docker-compose version and Windows version. Here's a summary of the solutions that I have found:
Solution #1
Upgrade to docker-compose version 1.23.0 or beyond. There is a bugfix in the 1.23.0 release described as:
Fixed an issue where paths longer than 260 characters on Windows clients would cause docker-compose build to fail.
Solution #2
Enable NTFS long paths:
Hit the Windows key, type gpedit.msc and press Enter.
Navigate to Local Computer Policy > Computer Configuration > Administrative Templates > System > Filesystem > NTFS.
Double click the Enable NTFS long paths option and enable it.
If you're using a version of Windows that does not provide access to Group Policy, you can edit the registry instead.
Hit the Windows key, type regedit and press Enter.
Navigate to HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Group Policy Objects\{48981759-12F2-42A6-A048-028B3973495F}Machine\System\CurrentControlSet\Policies
Select the LongPathsEnabled value, or create it as a DWORD (32-bit) value if it does not exist.
Set the value to 1 and close the Registry Editor.
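If you prefer to script the change rather than click through regedit, here is a sketch using Python's winreg module. Note that it targets HKLM\SYSTEM\CurrentControlSet\Control\FileSystem, the canonical location of LongPathsEnabled, rather than the per-machine Group Policy key above, and it must be run from an elevated prompt:

import winreg

# Open (or create) the filesystem key and enable Win32 long paths.
key = winreg.CreateKeyEx(
    winreg.HKEY_LOCAL_MACHINE,
    r"SYSTEM\CurrentControlSet\Control\FileSystem",
    0,
    winreg.KEY_SET_VALUE,
)
winreg.SetValueEx(key, "LongPathsEnabled", 0, winreg.REG_DWORD, 1)
winreg.CloseKey(key)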
Solution #3
Install docker-compose via pip. This seems to have solved the issue for others who have come across it.
Excerpt from the docker-compose documentation:
For alpine, the following dependency packages are needed: py-pip, python-dev, libffi-dev, openssl-dev, gcc, libc-dev, and make.
Compose can be installed from pypi using pip. If you install using pip, we recommend that you use a virtualenv because many operating systems have python system packages that conflict with docker-compose dependencies. See the virtualenv tutorial to get started.
pip install docker-compose
If you are not using virtualenv,
sudo pip install docker-compose
pip version 6.0 or greater is required.