Docker network disconnect - docker

I'm following this guide on how to set up Docker with timescale/wale for continuous archiving:
https://docs.timescale.com/timescaledb/latest/how-to-guides/backup-and-restore/docker-and-wale/#run-the-timescaledb-container-in-docker
Everything runs as expected, but when I get to the final step, I'm seeing:
written to stdout
2022-04-13 07:43:33.349 UTC [27] LOG: redo done at 0/50000F8
Connecting to wale (172.18.0.3:80)
writing to stdout
- 100% |********************************| 36 0:00:00 ETA
written to stdout
Connecting to wale (172.18.0.3:80)
wget: server returned error: HTTP/1.0 500 INTERNAL SERVER ERROR
2022-04-13 07:43:34.264 UTC [27] LOG: selected new timeline ID: 2
2022-04-13 07:43:34.282 UTC [27] LOG: archive recovery complete
Connecting to wale (172.18.0.3:80)
wget: server returned error: HTTP/1.0 500 INTERNAL SERVER ERROR
2022-04-13 07:43:34.838 UTC [27] LOG: could not open file "pg_wal/000000010000000000000006": Permission denied
2022-04-13 07:43:34.844 UTC [1] LOG: database system is ready to accept connections
It looks like the wget to wale is failing? It's connected to the same network as timescaledb_recovered, so shouldn't it work? Is there some additional config that the docs are missing? Or am I misreading these logs somehow?
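For reference, one way to double-check that both containers really are attached to the same network (a sketch using the docker Python SDK, equivalent to `docker network inspect`; "timescale-net" is a placeholder for whatever network name the guide has you create):

```python
# Sketch: confirm both containers are on the same user-defined network.
# "timescale-net" is a placeholder network name, not taken from the guide.
import docker

client = docker.from_env()
network = client.networks.get("timescale-net")
network.reload()  # refresh attrs so the container list is current
print([c["Name"] for c in network.attrs["Containers"].values()])
# Expect both the wale container and timescaledb_recovered to be listed.
```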
Some additional error output from the wale log:
['wal-e', '--terse', 'wal-push', '/var/lib/postgresql/data/pg_wal/000000010000000000000012']
Pushing wal file /var/lib/postgresql/data/pg_wal/000000010000000000000012: ['wal-e', '--terse', 'wal-push', '/var/lib/postgresql/data/pg_wal/000000010000000000000012']
172.18.0.2 - - [13/Apr/2022 14:09:17] "GET /wal-push/000000010000000000000012 HTTP/1.1" 200 -
['wal-e', '--terse', 'wal-fetch', '-p=0', '000000010000000000000011', '/var/lib/postgresql/data/pg_wal/000000010000000000000011']
Fetching wal 000000010000000000000011: ['wal-e', '--terse', 'wal-fetch', '-p=0', '000000010000000000000011', '/var/lib/postgresql/data/pg_wal/000000010000000000000011']
172.18.0.4 - - [13/Apr/2022 14:09:53] "GET /wal-fetch/000000010000000000000011 HTTP/1.1" 200 -
['wal-e', '--terse', 'wal-fetch', '-p=0', '000000010000000000000012', '/var/lib/postgresql/data/pg_wal/000000010000000000000012']
Fetching wal 000000010000000000000012: ['wal-e', '--terse', 'wal-fetch', '-p=0', '000000010000000000000012', '/var/lib/postgresql/data/pg_wal/000000010000000000000012']
172.18.0.4 - - [13/Apr/2022 14:09:54] "GET /wal-fetch/000000010000000000000012 HTTP/1.1" 200 -
['wal-e', '--terse', 'wal-fetch', '-p=0', '000000010000000000000011', '/var/lib/postgresql/data/pg_wal/000000010000000000000011']
Fetching wal 000000010000000000000011: ['wal-e', '--terse', 'wal-fetch', '-p=0', '000000010000000000000011', '/var/lib/postgresql/data/pg_wal/000000010000000000000011']
172.18.0.4 - - [13/Apr/2022 14:09:54] "GET /wal-fetch/000000010000000000000011 HTTP/1.1" 200 -
['wal-e', '--terse', 'wal-fetch', '-p=0', '00000002.history', '/var/lib/postgresql/data/pg_wal/00000002.history']
Fetching wal 00000002.history: ['wal-e', '--terse', 'wal-fetch', '-p=0', '00000002.history', '/var/lib/postgresql/data/pg_wal/00000002.history']
lzop: short read
wal_e.main CRITICAL MSG: An unprocessed exception has avoided all error handling
DETAIL: Traceback (most recent call last):
File "/usr/lib/python3.5/site-packages/wal_e/cmd.py", line 657, in main
args.prefetch)
File "/usr/lib/python3.5/site-packages/wal_e/operator/backup.py", line 353, in wal_restore
self.gpg_key_id is not None)
File "/usr/lib/python3.5/site-packages/wal_e/worker/worker_util.py", line 58, in do_lzop_get
return blobstore.do_lzop_get(creds, url, path, decrypt, do_retry=do_retry)
File "/usr/lib/python3.5/site-packages/wal_e/blobstore/file/file_util.py", line 52, in do_lzop_get
raise exc
File "/usr/lib/python3.5/site-packages/wal_e/blobstore/file/file_util.py", line 64, in write_and_return_error
key.get_contents_to_file(stream)
File "/usr/lib/python3.5/site-packages/wal_e/blobstore/file/calling_format.py", line 53, in get_contents_to_file
with open(self.path, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/backups/wal_005/00000002.history.lzo'
STRUCTURED: time=2022-04-13T14:09:55.216294-00 pid=32
Failed to fetch wal 00000002.history: None
Exception on /wal-fetch/00000002.history [GET]
Traceback (most recent call last):
File "/usr/lib/python3.5/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/usr/lib/python3.5/site-packages/flask/app.py", line 1816, in full_dispatch_request
return self.finalize_request(rv)
File "/usr/lib/python3.5/site-packages/flask/app.py", line 1831, in finalize_request
response = self.make_response(rv)
File "/usr/lib/python3.5/site-packages/flask/app.py", line 1957, in make_response
'The view function did not return a valid response. The'
TypeError: The view function did not return a valid response. The function either returned None or ended without a return statement.
172.18.0.4 - - [13/Apr/2022 14:09:55] "GET /wal-fetch/00000002.history HTTP/1.1" 500 -
['wal-e', '--terse', 'wal-fetch', '-p=0', '00000001.history', '/var/lib/postgresql/data/pg_wal/00000001.history']
Fetching wal 00000001.history: ['wal-e', '--terse', 'wal-fetch', '-p=0', '00000001.history', '/var/lib/postgresql/data/pg_wal/00000001.history']
lzop: short read
wal_e.main CRITICAL MSG: An unprocessed exception has avoided all error handling
DETAIL: Traceback (most recent call last):
File "/usr/lib/python3.5/site-packages/wal_e/cmd.py", line 657, in main
args.prefetch)
File "/usr/lib/python3.5/site-packages/wal_e/operator/backup.py", line 353, in wal_restore
self.gpg_key_id is not None)
File "/usr/lib/python3.5/site-packages/wal_e/worker/worker_util.py", line 58, in do_lzop_get
return blobstore.do_lzop_get(creds, url, path, decrypt, do_retry=do_retry)
File "/usr/lib/python3.5/site-packages/wal_e/blobstore/file/file_util.py", line 52, in do_lzop_get
raise exc
File "/usr/lib/python3.5/site-packages/wal_e/blobstore/file/file_util.py", line 64, in write_and_return_error
key.get_contents_to_file(stream)
File "/usr/lib/python3.5/site-packages/wal_e/blobstore/file/calling_format.py", line 53, in get_contents_to_file
with open(self.path, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/backups/wal_005/00000001.history.lzo'
STRUCTURED: time=2022-04-13T14:09:55.689548-00 pid=38
Failed to fetch wal 00000001.history: None
Exception on /wal-fetch/00000001.history [GET]
Traceback (most recent call last):
File "/usr/lib/python3.5/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/usr/lib/python3.5/site-packages/flask/app.py", line 1816, in full_dispatch_request
return self.finalize_request(rv)
File "/usr/lib/python3.5/site-packages/flask/app.py", line 1831, in finalize_request
response = self.make_response(rv)
File "/usr/lib/python3.5/site-packages/flask/app.py", line 1957, in make_response
'The view function did not return a valid response. The'
TypeError: The view function did not return a valid response. The function either returned None or ended without a return statement.
172.18.0.4 - - [13/Apr/2022 14:09:55] "GET /wal-fetch/00000001.history HTTP/1.1" 500 -
I've added some additional logs from the wale container showing the errors produced when timescaledb-recovered runs. I'm guessing there is some issue with the requests timescaledb-recovered is sending, because wget works until that container is started.

This is bizarre, but apparently the critical failure and 500 error are intended: they let Postgres know that no further segments need to be recovered. Incredibly frustrating.
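For context, PostgreSQL's restore_command contract is just an exit code: zero means the segment was restored, non-zero means it isn't available, which ends archive recovery. A rough, hypothetical sketch of a fetch wrapper that follows that contract (the host and route are assumptions based on the logs above, not the image's actual wget script):

```python
# Hypothetical restore_command wrapper: fetch a WAL segment from the wale
# sidecar and exit non-zero when it does not exist, so Postgres knows
# recovery has reached the end of the archive.
import sys
import urllib.request
import urllib.error

def restore(segment, dest_path):
    url = "http://wale/wal-fetch/" + segment  # assumed endpoint, per the logs
    try:
        with urllib.request.urlopen(url) as resp, open(dest_path, "wb") as out:
            out.write(resp.read())
        return 0  # restored: Postgres keeps replaying WAL
    except urllib.error.HTTPError:
        return 1  # 500/404: Postgres treats this as "end of archive"

if __name__ == "__main__":
    sys.exit(restore(sys.argv[1], sys.argv[2]))
```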

Related

macos: okta-awscli - exception - SSL validation failed for https://sts.amazonaws.com/ [Errno 2] No such file or directory

okta-awscli is not working on my macOS.
Describe the bug
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 402, in ssl_wrap_socket
context.load_verify_locations(ca_certs, ca_cert_dir, ca_cert_data)
FileNotFoundError: [Errno 2] No such file or directory
botocore.exceptions.SSLError: SSL validation failed for https://sts.amazonaws.com/ [Errno 2] No such file or directory
I have tried uninstalling and reinstalling python, aws, and okta-awscli.
I'm using macOS 12.6.1.

Appium starts application but test gives error “FAIL : No application is open”

The problem:
I use Appium + Robot Framework to test my app. When I use the keyword Open Application, it always fails with the result: No application is open, but the app is actually already open. I started the Appium server with: appium -p 4723 --session-override --no-reset
Environment:
info AppiumDoctor ### Diagnostic for necessary dependencies starting ###
info AppiumDoctor ✔ The Node.js binary was found at: C:\Program Files\nodejs\node.EXE
info AppiumDoctor ✔ Node version is 16.15.1
info AppiumDoctor ✔ ANDROID_HOME is set to: D:\Android_Sdk
info AppiumDoctor ✔ JAVA_HOME is set to: C:\Program Files\Java\jdk1.8.0_60
info AppiumDoctor Checking adb, android, emulator
info AppiumDoctor 'adb' is in D:\Android_Sdk\platform-tools\adb.exe
info AppiumDoctor 'android' is in D:\Android_Sdk\tools\android.bat
info AppiumDoctor 'emulator' is in D:\Android_Sdk\emulator\emulator.exe
info AppiumDoctor ✔ adb, android, emulator exist: D:\Android_Sdk
info AppiumDoctor ✔ 'bin' subfolder exists under 'C:\Program Files\Java\jdk1.8.0_60'
info AppiumDoctor ### Diagnostic for necessary dependencies completed, no fix needed. ###
Log:
In Robot Framework, I ran the test in debug mode and got this output:
20220802 18:05:05.399 : DEBUG : Starting new HTTP connection (1): 127.0.0.1:4723
20220802 18:05:14.770 : DEBUG : http://127.0.0.1:4723 "POST /wd/hub/session HTTP/1.1" 200 884
20220802 18:05:14.771 : DEBUG : Remote response: status=200 | data={"value":{"capabilities":{"platform":"LINUX","webStorageEnabled":false,"takesScreenshot":true,"javascriptEnabled":true,"databaseEnabled":false,"networkConnectionEnabled":true,"locationContextEnabled":false,"warnings":{},"desired":{"platformName":"Android","appPackage":"com.cmcc.myhouse.demo","appActivity":"com.cmcc.myhouse.MainActivity","appWaitDuration":60000,"noSign":true},"platformName":"Android","appPackage":"com.cmcc.myhouse.demo","appActivity":"com.cmcc.myhouse.MainActivity","appWaitDuration":60000,"noSign":true,"deviceName":"ed192f0","deviceUDID":"ed192f0","deviceApiLevel":29,"platformVersion":"10","deviceScreenSize":"1080x2160","deviceScreenDensity":380,"deviceModel":"ONEPLUS A5010","deviceManufacturer":"OnePlus","pixelRatio":2.375,"statBarHeight":57,"viewportRect":{"left":0,"top":57,"width":1080,"height":2103}},"sessionId":"312366fe-1008-47f4-9063-1cf0e4a27e0c"}} | headers=HTTPHeaderDict({'X-Powered-By': 'Express', 'Vary': 'X-HTTP-Method-Override', 'Content-Type': 'application/json; charset=utf-8', 'Content-Length': '884', 'ETag': 'W/"374-cX9IxtSKVtVV/oMPHrqcO0PP2Yg"', 'Date': 'Tue, 02 Aug 2022 10:05:14 GMT', 'Connection': 'keep-alive', 'Keep-Alive': 'timeout=600'})
20220802 18:05:14.771 : DEBUG : Finished Request
20220802 18:05:14.774 : FAIL : No application is open
20220802 18:05:14.776 : DEBUG :
Traceback (most recent call last):
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\AppiumLibrary\keywords\keywordgroup.py", line 16, in _run_on_failure_decorator
return method(*args, **kwargs)
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\AppiumLibrary\keywords\_applicationmanagement.py", line 52, in open_application
application = webdriver.Remote(str(remote_url), desired_caps)
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\appium\webdriver\webdriver.py", line 268, in __init__
AppiumConnection(command_executor, keep_alive=keep_alive), desired_capabilities, browser_profile, proxy
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 275, in __init__
self.start_session(capabilities, browser_profile)
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\appium\webdriver\webdriver.py", line 361, in start_session
self.capabilities = response.get('value')
AttributeError: can't set attribute
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\AppiumLibrary\keywords\keywordgroup.py", line 21, in _run_on_failure_decorator
raise err
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\AppiumLibrary\keywords\keywordgroup.py", line 16, in _run_on_failure_decorator
return method(*args, **kwargs)
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\AppiumLibrary\keywords\_screenshot.py", line 31, in capture_page_screenshot
if hasattr(self._current_application(), 'get_screenshot_as_file'):
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\AppiumLibrary\keywords\_applicationmanagement.py", line 367, in _current_application
raise RuntimeError('No application is open')
RuntimeError: No application is open
20220802 18:05:14.779 : WARN : Keyword 'Capture Page Screenshot' could not be run on failure: No application is open
20220802 18:05:14.780 : FAIL : AttributeError: can't set attribute
20220802 18:05:14.780 : DEBUG :
Traceback (most recent call last):
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\AppiumLibrary\keywords\keywordgroup.py", line 21, in _run_on_failure_decorator
raise err
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\AppiumLibrary\keywords\keywordgroup.py", line 16, in _run_on_failure_decorator
return method(*args, **kwargs)
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\AppiumLibrary\keywords\_applicationmanagement.py", line 52, in open_application
application = webdriver.Remote(str(remote_url), desired_caps)
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\appium\webdriver\webdriver.py", line 268, in __init__
AppiumConnection(command_executor, keep_alive=keep_alive), desired_capabilities, browser_profile, proxy
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 275, in __init__
self.start_session(capabilities, browser_profile)
File "c:\users\xiangfang\appdata\local\programs\python\python37\lib\site-packages\appium\webdriver\webdriver.py", line 361, in start_session
self.capabilities = response.get('value')
AttributeError: can't set attribute
Ending test: XiriTest.XiriBusinessTest.26MainBusinessTest.2.6CommonCMD
Appium usually opens the app itself and then makes the necessary connections, so tests can fail if the app is already open.
Are you sure the appPackage and appActivity are correct? It would be worth double-checking them.
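For example, you could try opening the session directly with the Appium Python client, using the capability values from your log, to confirm them outside Robot Framework (a sketch; adjust the server URL and capabilities to your setup):

```python
# Sketch: start the session directly with the Appium Python client, using the
# capability values that appear in the log above, to verify appPackage/appActivity.
from appium import webdriver

desired_caps = {
    "platformName": "Android",
    "deviceName": "ed192f0",
    "appPackage": "com.cmcc.myhouse.demo",
    "appActivity": "com.cmcc.myhouse.MainActivity",
    "appWaitDuration": 60000,
    "noSign": True,
}

driver = webdriver.Remote("http://127.0.0.1:4723/wd/hub", desired_caps)
print(driver.current_activity)  # should show the activity that actually launched
driver.quit()
```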

Getting handshake error while doing Oauth 2.0 flow with IdentityServer4 and Authlib

I have implemented an authorization server using IdentityServer4 and a client app using Python Flask, and I am trying to test authentication with Authlib. I managed to get past the errors one by one, but there is one I am stuck with and have no idea why I am getting it. Here is the exception thrown on the Python (client) side:
usr/lib/python3/dist-packages/urllib3/connectionpool.py:999: InsecureRequestWarning: Unverified HTTPS request is being made to host '192.168.1.90'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
warnings.warn(
ERROR:root:HTTPSConnectionPool(host='192.168.1.90', port=4443): Max retries exceeded with url: /.well-known/openid-configuration/jwks (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', '', 'certificate verify failed')])")))
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 485, in wrap_socket
cnx.do_handshake()
File "/usr/local/lib/python3.8/dist-packages/OpenSSL/SSL.py", line 1991, in do_handshake
self._raise_ssl_error(self._ssl, result)
File "/usr/local/lib/python3.8/dist-packages/OpenSSL/SSL.py", line 1700, in _raise_ssl_error
_raise_current_error()
File "/usr/local/lib/python3.8/dist-packages/OpenSSL/_util.py", line 55, in exception_from_error_queue
raise exception_type(errors)
OpenSSL.SSL.Error: [('SSL routines', '', 'certificate verify failed')]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
httplib_response = self._make_request(
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 376, in _make_request
self._validate_conn(conn)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 996, in _validate_conn
conn.connect()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 366, in connect
self.sock = ssl_wrap_socket(
File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 370, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 491, in wrap_socket
raise ssl.SSLError("bad handshake: %r" % e)
ssl.SSLError: ("bad handshake: Error([('SSL routines', '', 'certificate verify failed')])",)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 719, in urlopen
retries = retries.increment(
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 436, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='192.168.1.90', port=4443): Max retries exceeded with url: /.well-known/openid-configuration/jwks (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', '', 'certificate verify failed')])")))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/securify/SecurifyID/chrome-extension/chrome-extension-backend/app.py", line 96, in callback_handling
securify.authorize_access_token(verify=False)
File "/home/securify/.local/lib/python3.8/site-packages/authlib/integrations/flask_client/apps.py", line 107, in authorize_access_token
userinfo = self.parse_id_token(token, nonce=state_data['nonce'])
File "/home/securify/.local/lib/python3.8/site-packages/authlib/integrations/base_client/sync_openid.py", line 66, in parse_id_token
claims = _jwt.decode(
File "/home/securify/.local/lib/python3.8/site-packages/authlib/jose/rfc7519/jwt.py", line 96, in decode
data = self._jws.deserialize_compact(s, load_key, decode_payload)
File "/home/securify/.local/lib/python3.8/site-packages/authlib/jose/rfc7515/jws.py", line 101, in deserialize_compact
algorithm, key = self._prepare_algorithm_key(jws_header, payload, key)
File "/home/securify/.local/lib/python3.8/site-packages/authlib/jose/rfc7515/jws.py", line 254, in _prepare_algorithm_key
key = key(header, payload)
File "/home/securify/.local/lib/python3.8/site-packages/authlib/integrations/base_client/sync_openid.py", line 38, in load_key
jwk_set = JsonWebKey.import_key_set(self.fetch_jwk_set())
File "/home/securify/.local/lib/python3.8/site-packages/authlib/integrations/base_client/sync_openid.py", line 17, in fetch_jwk_set
resp = session.request('GET', uri, withhold_token=True)
File "/home/securify/.local/lib/python3.8/site-packages/authlib/integrations/requests_client/oauth2_session.py", line 104, in request
return super(OAuth2Session, self).request(
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 514, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='192.168.1.90', port=4443): Max retries exceeded with url: /.well-known/openid-configuration/jwks (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', '', 'certificate verify failed')])")))
2.186.124.22 - - [30/May/2022 07:21:13] "GET /callback?code=4FD8DE309058C13FF8FD0A3FC70A1793D9B2CA0F6CFF84362309BBEC56881C60&scope=openid%20profile%20email%20Roles&state=Bzlq7Ot4O6lIdmEOs0tYpSWZIj1nV8&session_state=RF3Fhyoxgg097pLXygTmXLKjWuSj1DbzIsuL_MEMURs.7038FDC84E3DD2C2F908E76BA513B2E2 HTTP/1.1" 500 -
This exception occurs in the test.authorize_access_token(verify=False) step of the Python code. I even passed verify=False, since I am using self-signed certificates on my IdentityServer (but not on the client side). I suspect it might be related to the JWKS_URI, but I'm not sure. Here are the JWKS_URI contents:
{"keys":[{"kty":"RSA","use":"sig","kid":"626D09B2DC030BE93D98473AAD272727","e":"AQAB","n":"rSEKbbU0E7GgnuGHMVAfzhYj34Z7rgGcNy5nukzY-Ci6M_U0S-sab52cpoALSKPNep46aXgBpoSTGCuonHTIyy1ZJtx5aGFNnj80t4Lu1l9R-dKmUE3zr4JgdzO8eHBN1ZQ9ybvM5-k6zB9nyYavfFTFhgCGNVvwWpCko_fVU7ma8sled-h4iKRTcupy4mtCS9JPfa9Iu2O0sm9K6cqM_HrDM9p_wiM0D7e5ZL_27XwS_O1MfaBeLsAOZQ-1ayvCRq4eGI9yMGcr_U_EGV_pKqyDL1SzNguVbZaBkUqZrBKZl4OQOl8thjPld7ontTmoF2DvN_U0hpXiQOT_ZSAgOQ","alg":"RS256"}]}
Oh and here is the error seen in the browser:
{
"message": "HTTPSConnectionPool(host='192.168.1.90', port=4443): Max retries exceeded with url: /connect/userinfo (Caused by SSLError(SSLError(\"bad handshake: Error([('SSL routines', '', 'certificate verify failed')])\")))"
}
All I needed to do was set CURL_CA_BUNDLE="".
It seems that passing verify=False to authorize_access_token does not override verification for all of the requests it makes.
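Roughly, the workaround looks like this (a sketch; the client registration is elided, and the environment variable has to be set before Authlib issues its own requests, e.g. the JWKS fetch):

```python
# Sketch of the workaround: clear CURL_CA_BUNDLE so the requests that Authlib
# makes internally (e.g. fetching /.well-known/.../jwks) skip verification,
# since verify=False passed to authorize_access_token does not reach them.
import os
os.environ["CURL_CA_BUNDLE"] = ""  # only for testing with self-signed certs

from flask import Flask
from authlib.integrations.flask_client import OAuth

app = Flask(__name__)
oauth = OAuth(app)
# ... register the "securify" client as before; then in the callback:
# token = oauth.securify.authorize_access_token()
```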

"docker-compose up" failed to build, The command '/bin/sh -c pipenv install' returned a non-zero code: 1

I cloned a project from GitHub, and the README had instructions to run
docker-compose up --build
npm run dev
So I typed docker-compose up --build in my terminal. It seemed to be working until
Installing dependencies from Pipfile.lock (68781d)...
and then I had to wait a few minutes and got the following error.
Installing dependencies from Pipfile.lock (68781d)...
An error occurred while installing tensorflow==2.4.1 --hash=sha256:36d5acd60aac48e34bd545d0ce1fb8b3fceebff6b8782436defd0f71c12203bd --hash=sha256:55368ba0bedb513ba0e36a2543a588b5276e9b2ca99fa3232a9a176601a7bab5 --hash=sha256:e1f2799cc86861680d8515167f103e2207a8cab92a4afe5471e4839330591f08 --hash=sha256:22723b8e1fa83b34f56c349b16a57aaff913b404451fcf70981f2b1d6e0c64fc --hash=sha256:efa9daa4b3701a4e439b24b74c1e4b66844aee8ae5263fb3cc12281ac9cc9f67 --hash=sha256:2357112319303da1b5459a621fd0503c2b2cd97b6c33c4903abd46b3c3e380e2 --hash=sha256:4a04081647b89a8fb602895b29ffc559e3c20aac8bde1d4c5ecd2a65adce5d35 --hash=sha256:0e427b1350be6dbe572f971947c5596fdbb152081f227808d8becd894bf40282 --hash=sha256:eedcf578afde5e6e69c75d796bed41093451cd1ab54afb438760e40fb74a09de! Will try again.
An error occurred while installing torch==1.7.1; python_full_version >= '3.6.2' --hash=sha256:f0aaf657145533824b15f2fd8fde8f8c67fe6c6281088ef588091f03fad90243 --hash=sha256:5d76c255a41484c1d41a9ff570b9c9f36cb85df9428aa15a58ae16ac7cfc2ea6 --hash=sha256:a3793dcceb12b1e2281290cca1277c5ce86ddfd5bf044f654285a4d69057aea7 --hash=sha256:e000b94be3aa58ad7f61e7d07cf379ea9366cf6c6874e68bd58ad0bdc537b3a7 --hash=sha256:af464a6f4314a875035e0c4c2b07517599704b214634f4ed3ad2e748c5ef291f --hash=sha256:de84b4166e3f7335eb868b51d3bbd909ec33828af27290b4171bce832a55be3c --hash=sha256:38d67f4fb189a92a977b2c0a38e4f6dd413e0bf55aa6d40004696df7e40a71ff --hash=sha256:d241c3f1c4d563e4ba86f84769c23e12606db167ee6f674eedff6d02901462e3 --hash=sha256:dd2fc6880c95e836960d86efbbc7f63d3287f2e1893c51d31f96dbfe02f0d73e --hash=sha256:6652a767a0572ae0feb74ad128758e507afd3b8396b6e7f147e438ba8d4c6f63 --hash=sha256:422e64e98d0e100c360993819d0307e5d56e9517b26135808ad68984d577d75a --hash=sha256:2e49cac969976be63117004ee00d0a3e3dd4ea662ad77383f671b8992825de1a! Will try again.
An error occurred while installing torchvision==0.8.2 --hash=sha256:24db8f4c3d812a032273f68563ad5dbd724f5bfbed523d0c6dce8cede26bb153 --hash=sha256:afb76a66b9b0693f758a881a2bf333ed97e3c0c3f15a413c4f49d8dd8bd21307 --hash=sha256:976750a49db2e23dc5a1ed0b5c31f7af51ed2702eee410ee09ef985c3a3e48cf --hash=sha256:1bd58acc3366ec02266aae56a7a752d43ef07de4a6ba420c4f907d0c9168bb8c --hash=sha256:cd8817e9197fc60ebae37162a445db90bbf35591314a5767ad3d1490b5d65b0f --hash=sha256:86fae370d222f76ad57c57c3bee03f78b8db727743bfb4c1559a3d395159cea8 --hash=sha256:b068f6bcbe91bdd34dda0a39e8a26392add45a3be82543f6dd523b76484fb56f --hash=sha256:951239b5fcb911dbf78c1385d677f5f48c7a1b12859e3d3ec287562821b17cf2! Will try again.
An error occurred while installing tqdm==4.56.0; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3' --hash=sha256:fe3d08dd00a526850568d542ff9de9bbc2a09a791da3c334f3213d8d0bbbca65 --hash=sha256:4621f6823bab46a9cc33d48105753ccbea671b68bab2c50a9f0be23d4065cb5a! Will try again.
An error occurred while installing transformers==4.2.2 --hash=sha256:d0999ababcc3e416a51c42823b56f5116acc5c0913e44e829e83d0db2d475021 --hash=sha256:e151ee7a56e7649de567ad6f4d6a83245c564ca93a886ef0e025f058895cf9cc! Will try again.
An error occurred while installing twitter==1.18.0 --hash=sha256:acdc85e5beea752967bb64c63bde8b915c49a31a01db1b2fecccf9f2c1d5c44d --hash=sha256:52545fd3b70d3d3807d3ce62d1a256727856d784d1630d64dedcc643aaf0b908! Will try again.
An error occurred while installing typing-extensions==3.7.4.3 --hash=sha256:7cb407020f00f7bfc3cb3e7881628838e69d8f3fcab2f64742a5e76b2f841918 --hash=sha256:99d4073b617d30288f569d3f13d2bd7548c3a7e4c8de87db09a9d29bb3a4a60c --hash=sha256:dafc7639cde7f1b6e1acc0f457842a83e722ccca8eef5270af2d74792619a89f! Will try again.
An error occurred while installing unidic-lite==1.0.7 --hash=sha256:260798392fddc8746d7d5596dc9f9a4250a10f41961771cf709ec2dc7db8260a! Will try again.
An error occurred while installing uritemplate==3.0.1; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3' --hash=sha256:07620c3f3f8eed1f12600845892b0e036a2420acf513c53f7de0abd911a5894f --hash=sha256:5af8ad10cec94f215e3f48112de2022e1d5a37ed427fbd88652fa908f2ab7cae! Will try again.
An error occurred while installing urllib3==1.26.2; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3, 3.4' and python_version < '4' --hash=sha256:d8ff90d979214d7b4f8ce956e80f4028fc6860e4431f731ea4a8c08f23f99473 --hash=sha256:19188f96923873c92ccb987120ec4acaa12f0461fa9ce5d3d0772bc965a39e08! Will try again.
An error occurred while installing werkzeug==1.0.1; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3, 3.4' --hash=sha256:6c80b1e5ad3665290ea39320b91e1be1e0d5f60652b964a3070216de83d2e47c --hash=sha256:2de2a5db0baeae7b2d2664949077c2ac63fbd16d98da0ff71837f7d1dea3fd43! Will try again.
An error occurred while installing wheel==0.36.2; python_version >= '3' --hash=sha256:78b5b185f0e5763c26ca1e324373aadd49182ca90e825f7853f4b2509215dc0e --hash=sha256:e11eefd162658ea59a60a0f6c7d493a7190ea4b9a85e335b33489d9f17e0245e! Will try again.
Installing initially failed dependencies...
An error occurred while installing wrapt==1.12.1 --hash=sha256:b62ffa81fb85f4332a4f609cab4ac40709470da05643a082ec1eb88e6d9b97d7! Will try again.
[InstallError]: File "/usr/local/lib/python3.8/site-packages/pipenv/cli/command.py", line 233, in install
[InstallError]: retcode = do_install(
[InstallError]: File "/usr/local/lib/python3.8/site-packages/pipenv/core.py", line 2052, in do_install
[InstallError]: do_init(
[InstallError]: File "/usr/local/lib/python3.8/site-packages/pipenv/core.py", line 1304, in do_init
[InstallError]: do_install_dependencies(
[InstallError]: File "/usr/local/lib/python3.8/site-packages/pipenv/core.py", line 899, in do_install_dependencies
[InstallError]: batch_install(
[InstallError]: File "/usr/local/lib/python3.8/site-packages/pipenv/core.py", line 796, in batch_install
[InstallError]: _cleanup_procs(procs, failed_deps_queue, retry=retry)
[InstallError]: File "/usr/local/lib/python3.8/site-packages/pipenv/core.py", line 703, in _cleanup_procs
[InstallError]: raise exceptions.InstallError(c.dep.name, extra=err_lines)
[pipenv.exceptions.InstallError]: WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fd8389ffc70>: Failed to establish a new connection: [Errno -2] Name or service not known')': /simple/tensorflow/
[pipenv.exceptions.InstallError]: WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fd8389ffee0>: Failed to establish a new connection: [Errno -2] Name or service not known')': /simple/tensorflow/
[pipenv.exceptions.InstallError]: WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fd8383f5070>: Failed to establish a new connection: [Errno -2] Name or service not known')': /simple/tensorflow/
[pipenv.exceptions.InstallError]: WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fd8383f51f0>: Failed to establish a new connection: [Errno -2] Name or service not known')': /simple/tensorflow/
[pipenv.exceptions.InstallError]: WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fd8383f53d0>: Failed to establish a new connection: [Errno -2] Name or service not known')': /simple/tensorflow/
[pipenv.exceptions.InstallError]: ERROR: Could not find a version that satisfies the requirement tensorflow==2.4.1
[pipenv.exceptions.InstallError]: ERROR: No matching distribution found for tensorflow==2.4.1
ERROR: Couldn't install package: tensorflow
Package installation failed...
ERROR: Service 'youtuberboard' failed to build : The command '/bin/sh -c pipenv install --system' returned a non-zero code: 1
Why am I getting these errors, and how can I fix this?
If I run docker images I can see that the image was built, but when I check docker ps -a the container does not seem to have been created.
Why is the container not created?
Edit:
I thought using the Pipfile.lock was the problem, so in the Dockerfile I changed
RUN pipenv install --system
to
RUN pipenv install --system --verbose --skip-lock
(i.e. I added the --skip-lock flag).
This was not the answer, and I still got a similar error.
.
. (deleting some lines)
.
Skipping link: none of the wheel's tags match: cp39-cp39-win_amd64: https://files.pythonhosted.org/packages/d6/c1/70f2fd464a895844a9bf4cf1d93b09eb6cd5edf8274d19a7fed2ed6c4cc3/torch-1.7.1-cp39-cp39-win_amd64.whl#sha256=6652a767a0572ae0feb74ad128758e507afd3b8396b6e7f147e438ba8d4c6f63 (from https://pypi.org/simple/torch/) (requires-python:>=3.6.2)
Skipping link: none of the wheel's tags match: cp39-none-macosx_10_9_x86_64: https://files.pythonhosted.org/packages/79/c8/7f7843dcbaf2263918d257e8022770be577a3d7587dd0ddf8171947eabb4/torch-1.7.1-cp39-none-macosx_10_9_x86_64.whl#sha256=38d67f4fb189a92a977b2c0a38e4f6dd413e0bf55aa6d40004696df7e40a71ff (from https://pypi.org/simple/torch/) (requires-python:>=3.6.2)
Given no hashes to check 1 links for project 'torch': discarding no candidates
Collecting torch==1.7.1
Created temporary directory: /tmp/pip-unpack-uv81hb8h
Looking up "https://files.pythonhosted.org/packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl" in the cache
No cache entry available
Starting new HTTPS connection (1): files.pythonhosted.org:443
https://files.pythonhosted.org:443 "GET /packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl HTTP/1.1" 200 776818711
Downloading torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl (776.8 MB)
Ignoring unknown cache-control directive: immutable
Updating cache with response from "https://files.pythonhosted.org/packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl"
Caching due to etag
sys.exit(cli())
File "/usr/local/lib/python3.8/site-packages/pipenv/vendor/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pipenv/vendor/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.8/site-packages/pipenv/vendor/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.8/site-packages/pipenv/vendor/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.8/site-packages/pipenv/vendor/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pipenv/vendor/click/decorators.py", line 73, in new_func
return ctx.invoke(f, obj, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pipenv/vendor/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pipenv/vendor/click/decorators.py", line 21, in new_func
return f(get_current_context(), *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pipenv/cli/command.py", line 233, in install
retcode = do_install(
File "/usr/local/lib/python3.8/site-packages/pipenv/core.py", line 2052, in do_install
do_init(
File "/usr/local/lib/python3.8/site-packages/pipenv/core.py", line 1304, in do_init
do_install_dependencies(
File "/usr/local/lib/python3.8/site-packages/pipenv/core.py", line 903, in do_install_dependencies
_cleanup_procs(procs, failed_deps_queue, retry=False)
File "/usr/local/lib/python3.8/site-packages/pipenv/core.py", line 703, in _cleanup_procs
raise exceptions.InstallError(c.dep.name, extra=err_lines)
pipenv.exceptions.InstallError: ERROR: Couldn't install package: facenet-pytorch
Package installation failed...
ERROR: Service 'youtuberboard' failed to build : The command '/bin/sh -c pipenv install --system --verbose --skip-lock' returned a non-zero code: 1
I needed to increase the memory available to Docker; I changed it in the Docker settings.

dask.distributed SLURM cluster Nanny Timeout

I am trying to use dask.distributed.SLURMCluster to submit batch jobs to a SLURM job scheduler on a supercomputing cluster. The jobs all submit as expected, but throw an error after one minute of running: asyncio.exceptions.TimeoutError: Nanny failed to start in 60 seconds. How do I get the nanny to connect?
Full Trace:
distributed.nanny - INFO - Start Nanny at: 'tcp://206.76.203.125:38324'
distributed.dashboard.proxy - INFO - To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy
distributed.worker - INFO - Start worker at: tcp://206.76.203.125:37609
distributed.worker - INFO - Listening to: tcp://206.76.203.125:37609
distributed.worker - INFO - dashboard at: 206.76.203.125:35505
distributed.worker - INFO - Waiting to connect to: tcp://129.114.63.43:35489
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 8
distributed.worker - INFO - Memory: 2.00 GB
distributed.worker - INFO - Local Directory: /home1/06729/tg860286/tests/dask-rsmas-presentation/dask-worker-space/worker-pu937jui
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Waiting to connect to: tcp://129.114.63.43:35489
distributed.worker - INFO - Waiting to connect to: tcp://129.114.63.43:35489
distributed.worker - INFO - Waiting to connect to: tcp://129.114.63.43:35489
distributed.worker - INFO - Waiting to connect to: tcp://129.114.63.43:35489
distributed.nanny - INFO - Closing Nanny at 'tcp://206.76.203.125:38324'
distributed.worker - INFO - Stopping worker at tcp://206.76.203.125:37609
distributed.worker - INFO - Closed worker has not yet started: None
distributed.dask_worker - INFO - End worker
Traceback (most recent call last):
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/site-packages/distributed/node.py", line 173, in wait_for
await asyncio.wait_for(future, timeout=timeout)
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/asyncio/tasks.py", line 490, in wait_for
raise exceptions.TimeoutError()
asyncio.exceptions.TimeoutError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/runpy.py", line 193, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/site-packages/distributed/cli/dask_worker.py", line 440, in <module>
go()
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/site-packages/distributed/cli/dask_worker.py", line 436, in go
main()
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/site-packages/distributed/cli/dask_worker.py", line 422, in main
loop.run_sync(run)
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/site-packages/tornado/ioloop.py", line 532, in run_sync
return future_cell[0].result()
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/site-packages/distributed/cli/dask_worker.py", line 416, in run
await asyncio.gather(*nannies)
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/asyncio/tasks.py", line 684, in _wrap_awaitable
return (yield from awaitable.__await__())
File "/home1/06729/tg860286/miniconda3/envs/daskbase/lib/python3.8/site-packages/distributed/node.py", line 176, in wait_for
raise TimeoutError(
asyncio.exceptions.TimeoutError: Nanny failed to start in 60 seconds
It looks like your workers weren't able to connect to the scheduler. My guess is that you need to specify a network interface. You should ask your system administrator which network interface you should use, and then specify that with the interface= keyword.
You might also want to read through https://blog.dask.org/2019/08/28/dask-on-summit , which gives a case study of common problems that arise.
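For example (a sketch; "ib0" is a placeholder for whatever interface your system administrator recommends):

```python
# Sketch: tell dask-jobqueue which network interface the workers and scheduler
# should use. "ib0" is a placeholder (e.g. InfiniBand); confirm with your admin.
from dask_jobqueue import SLURMCluster
from dask.distributed import Client

cluster = SLURMCluster(
    cores=8,
    memory="2GB",
    interface="ib0",
)
cluster.scale(jobs=2)     # submit two worker jobs to SLURM
client = Client(cluster)  # workers should now connect to the scheduler
```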
