Connection error while retrieving metadata from a container running an ECS task - docker

I'm trying to retrieve the region of the instance that is running my ECS task in a container. The container runs a Python script whose first task is to get the region, so that I can use boto3 methods like sqs.get_queue_by_name(), which need a region to be set. To do that, I try to get the region with
import os
import requests

meta = requests.get('http://169.254.169.254/latest/dynamic/instance-identity/document', timeout=1).json()
os.environ["AWS_DEFAULT_REGION"] = meta.get("region")
but I get a connection error.
When I build my stack by hand, there is no issue, but when the stack is deployed by CDK (with the same security groups, roles, etc.), I get the error
requests.exceptions.ConnectionError: HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /latest/dynamic/instance-identity/document (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fe3d62491f0>: Failed to establish a new connection: [Errno 22] Invalid argument'))
I can see two different avenues to solve this issue:
Set the environment variable 'AWS_DEFAULT_REGION' when deploying with CDK, but with
taskDefinition.addContainer('DSPTContainer', {
  image: ecrImage,
  memoryLimitMiB: 30000,
  environment: {
    AWS_DEFAULT_REGION: props.env?.region
  }
});
there is a compile error: Property 'AWS_DEFAULT_REGION' is incompatible with index signature, presumably because props.env?.region is typed as string | undefined while the environment map only accepts string values.
Modifying the task role (but how?) or something else (like the security group) to allow the connection. Note that from within the instance itself, I am able to establish the connection....
[EDIT]
Inside the container (I can log into the container while the instance is running), I can reach, say, google.com but not the instance metadata URI:
import requests
requests.get("https://www.google.com", timeout=1)                    # --> Response 200
requests.get("http://169.254.169.254/latest/meta-data/", timeout=1)  # --> ConnectTimeout exception
[SOLUTION]
Issue linked to duplicate?
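If the EC2 instance metadata endpoint stays unreachable from inside the container (which can happen with awsvpc networking or IMDSv2 hop limits), one hedged alternative is to derive the region from the ECS task metadata endpoint instead. This is only a sketch and assumes the ECS agent injects ECS_CONTAINER_METADATA_URI_V4 into the container, as it does on recent platform versions:
import os
import requests

# Sketch: derive the region from the ECS task metadata endpoint v4.
# Assumes ECS_CONTAINER_METADATA_URI_V4 has been injected by the agent.
metadata_uri = os.environ.get("ECS_CONTAINER_METADATA_URI_V4")
if metadata_uri:
    task = requests.get(f"{metadata_uri}/task", timeout=1).json()
    # TaskARN looks like arn:aws:ecs:<region>:<account>:task/<cluster>/<id>
    os.environ["AWS_DEFAULT_REGION"] = task["TaskARN"].split(":")[3]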

Related

How to add a Minio connection to Airflow connections?

I am trying to add a running instance of MinIO to the Airflow connections. I thought it should be as easy as this setup in the GUI (never mind the exposed credentials; this is a blocked-off environment and they will be changed afterwards):
Airflow as well as MinIO are running in Docker containers, which both use the same Docker network. Pressing the test button results in the following error:
'ClientError' error occurred while testing connection: An error occurred (InvalidClientTokenId) when calling the GetCallerIdentity operation: The security token included in the request is invalid.
I am curious about what I am missing. The idea was to set up this connection and then use a bucket for data-aware scheduling (i.e. I want to trigger a DAG as soon as someone uploads a file to the bucket).
I was also facing the problem that the endpoint URL refused the connection. Since MinIO is actually running in a Docker container, you should give the Docker host URL, for example in the connection's Extra JSON:
{
  "aws_access_key_id": "your_minio_access_key",
  "aws_secret_access_key": "your_minio_secret_key",
  "host": "http://host.docker.internal:9000"
}
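To confirm the endpoint is actually reachable from inside the Airflow container, here is a minimal check with boto3 (the credentials and endpoint are the same placeholders as in the extras above):
import boto3

# Quick connectivity check against MinIO through the Docker host URL.
s3 = boto3.client(
    "s3",
    endpoint_url="http://host.docker.internal:9000",
    aws_access_key_id="your_minio_access_key",
    aws_secret_access_key="your_minio_secret_key",
)
print(s3.list_buckets()["Buckets"])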
I was also facing this error in Airflow 2.5.0.
I found a workaround using the boto3 library, which is already built in.
First I created a connection with these parameters:
Connection Id: any label (Minio in my case)
Connection Type: Generic
Host: MinIO server IP and port
Login: MinIO access key
Password: MinIO secret key
And here's my code:
import boto3
from airflow.hooks.base import BaseHook

conn = BaseHook.get_connection('Minio')
s3 = boto3.resource(
    's3',
    endpoint_url=conn.host,
    aws_access_key_id=conn.login,
    aws_secret_access_key=conn.password
)
s3client = s3.meta.client

# You can then use boto3 methods for manipulating buckets and files, for example:
bucket = s3.Bucket('test-bucket')

# Iterates through all the objects, doing the pagination for you. Each obj
# is an ObjectSummary, so it doesn't contain the body. You'll need to call
# get() to retrieve the whole body.
for obj in bucket.objects.all():
    key = obj.key
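If you prefer the low-level client (the s3client above), a short sketch of the equivalent listing with an explicit paginator (the bucket name is the same placeholder):
# Equivalent listing with the low-level client and an explicit paginator.
paginator = s3client.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="test-bucket"):
    for obj in page.get("Contents", []):
        print(obj["Key"])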

Testcontainers work normally locally on Windows but not when Jenkins is running the tests

I have some Testcontainers running for my JUnit integration tests (Spring Boot, JUnit 5):
public static PostgreSQLContainer<?> postgresContainer = new PostgreSQLContainer<>("postgres:13")
        .withDatabaseName("test")
        .withUsername("postgres")
        .withPassword("testIntegration")
        .withExposedPorts(5432)
        .withInitScript("test.sql");
And one for another Postgres database, plus a generic one for ActiveMQ:
public static GenericContainer<?> aMQContainer = new GenericContainer<>("rmohr/activemq")
        .withExposedPorts(61616)
        .withEnv("DISABLE_SECURITY", "true")
        .withEnv("BROKER_CONFIG_GLOBAL_MAX_SIZE", "50000")
        .withEnv("BROKER_CONFIG_MAX_SIZE_BYTES", "50000")
        .withEnv("BROKER_CONFIG_MAX_DISK_USAGE", "100");

postgresContainer.start();
postgresContainer2.start();
aMQContainer.start();
Locally everything works fine, but when I run the tests in Jenkins, which is set up in a Linux environment (Raspberry Pi 4 Model B, 4 GB), I get the following error:
Caused by: org.testcontainers.containers.ContainerLaunchException: Container startup failed
Caused by: org.rnorth.ducttape.RetryCountExceededException: Retry limit hit with exception
Caused by: org.testcontainers.containers.ContainerLaunchException: Could not create/start container
Caused by: org.testcontainers.containers.ContainerLaunchException: Timed out waiting for log output matching .*database system is ready to accept connections
I tried adding wait conditions, or withStartupTimeoutSeconds(240), but to no avail.
Anyone with a similar problem?
In the end, I came up with this solution and it works stably for me:
postgreSQLContainer.setWaitStrategy(new LogMessageWaitStrategy()
        .withRegEx(".*database system is ready to accept connections.*\\s")
        .withTimes(1)
        .withStartupTimeout(Duration.of(60, SECONDS)));
The problem seems to be with PostgreSQLContainer: it can't work out whether the image is running in Docker or not. Changing PostgreSQLContainer.withStartupCheckStrategy() to new IsRunningStartupCheckStrategy() helped me.

Docker: 502/503 errors, waiting for response to warmup request for container

I have a web app in a Docker container that works perfectly if I run it locally, but when I deploy it to Azure it just won't work, whatever I try. I keep getting 502/503 errors. In the log file it says:
Traceback (most recent call last):
[ERROR] File "app.py", line 22, in <module>
[ERROR] app.run(use_reloader=use_reloader, debug=use_reloader, host=host, port=port)
[ERROR] File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 922, in run
[ERROR] run_simple(t.cast(str, host), port, self, **options)
[ERROR] File "/usr/local/lib/python3.6/site-packages/werkzeug/serving.py", line 982, in run_simple
[ERROR] s.bind(server_address)
[ERROR] socket.gaierror: [Errno -2] Name or service not known
The configuration I have:
Dockerfile: EXPOSE 80; application settings: see picture. The app runs with this Python code (this is just a snippet to show how the environment variables are used):
if __name__ == '__main__':
    # Get environment variables
    use_reloader = os.environ.get('use_reloader')
    debug = os.environ.get('debug')
    port = os.environ.get('port')
    # Run the app
    app.run(use_reloader=use_reloader, debug=use_reloader, host=host, port=port)
What am I missing here? I looked at other answers related to this, but they didn't help me. Anyone any suggestions? Thanks!
EDIT:
I tried another attempt, now with EXPOSE 8000 in the Dockerfile and, in the application settings, port 80 (see the code snippet from app.py above) and WEBSITES_PORT 8000. But now I get: Waiting for response to warmup request for container. After many of these messages it times out and restarts... I think I still don't quite understand how the port settings work together. Would someone be able to explain this to me? What I need to know: how do the environment variable 'port' in app.py, the EXPOSE in the Dockerfile, and the settings 'port' and 'WEBSITES_PORT' in the web app's application settings need to be aligned/configured? I just can't find clear information about this.
I resolved the issue myself: the reason for the errors was that I had a huge image (with a BERT model) while using a basic App Service plan. I upgraded to P1V3 and now it runs like a charm, with WEBSITES_PORT=8000 and WEBSITES_CONTAINER_START_LIMIT=1200. Please allow about 2 minutes for it to warm up.
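On the port-alignment question from the edit, a minimal sketch of how the pieces could line up, assuming the container listens on port 8000: the Dockerfile has EXPOSE 8000, the app setting WEBSITES_PORT is 8000 (it tells App Service which container port to route traffic to), and the 'port' application setting is 8000. Binding to 0.0.0.0 is also the usual fix for the "Name or service not known" error, which points to an unresolvable host value:
import os
from flask import Flask

app = Flask(__name__)

if __name__ == '__main__':
    # The app must listen on the same port that WEBSITES_PORT points to.
    port = int(os.environ.get('port', 8000))  # fall back to 8000 if the setting is missing
    app.run(host='0.0.0.0', port=port)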

Why does the jib dockerBuild plugin fail to connect?

I was trying to build the Docker image for a project I'm working on.
It's based on JHipster; after configuring the project, it tells me to run the following Maven command:
./mvnw -ntp -Pprod verify jib:dockerBuild
Unfortunately it doesn't seem to work; it returns these errors:
[WARNING] The credential helper (docker-credential-pass) has nothing for server URL: registry.hub.docker.com
...
[WARNING] The credential helper (docker-credential-pass) has nothing for server URL: index.docker.io
[WARNING]
And finally fails with:
[ERROR] Failed to execute goal com.google.cloud.tools:jib-maven-plugin:2.4.0:dockerBuild (default-cli) on project booking: (null exception message): NullPointerException -> [Help 1]
Recently I worked on a Google Cloud project, and I edited the ~/.docker/config.json configuration file. I had to remove Google's configuration entries to sort out another problem. Could that be the origin of the problem I'm facing now?
I've tried to do docker logout and docker login without success.
Some considerations
I don't know if manually editing the configuration caused the error; in fact, I'm pretty sure I deleted only Google-related entries, and nothing referring to docker.* or similar.
To solve this issue, avoid editing the Docker configuration file manually. I think that should be avoided whenever possible, to prevent configuration problems of any sort.
Instead, just follow what the error message is trying to tell you: Docker is not able to access those URLs. Excluding network problems (which you can troubleshoot with ping registry-1.docker.io, for example), it should be an authentication problem.
How to fix
I've found out that running those commands fixed it:
docker login registry.hub.docker.com
docker login registry-1.docker.io
I don't know if registry-1.docker.io is just a mirror of the first server, which the plugin tries to access after the first unsuccessful connection. You can try logging in to registry.hub.docker.com only and re-launching the command to see if that is sufficient. If it's not, log in to the second one as well and then it will work.
I ran jib via Gradle:
./gradlew jibDockerBuild
and got a similar error
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':jibDockerBuild'.
> com.google.cloud.tools.jib.plugins.common.BuildStepsExecutionException: Build to Docker daemon failed, perhaps you should make sure your credentials for 'registry-1.docker.io/library/openjdk' are set up correctly. See https://github.com/GoogleContainerTools/jib/blob/master/docs/faq.md#what-should-i-do-when-the-registry-responds-with-unauthorized for help
What ended up solving this error for me, bizarrely enough, was to log out of Docker Desktop.
I later also tried funder7's solution while logged in to Docker Desktop, and that also worked.

Stopping OrientDB service fails, ETL import not possible

My goal is to import data from CSV files into OrientDB.
I use the OrientDB 2.2.22 Docker image.
When I try to execute the /orientdb/bin/oetl.sh config.json script within Docker, I get the error: "Can not open storage it is acquired by other process".
I guess this is because the OrientDB service is still running. But if I try to stop it, I get the next error:
./orientdb.sh stop
./orientdb.sh: return: line 70: Illegal number: root
or
./orientdb.sh status
./orientdb.sh: return: line 89: Illegal number: root
The only way for me to use the ./oetl.sh script is to stop the Docker instance and restart it in interactive mode running the shell, but this is awkward, because to use OrientDB Studio I have to stop Docker again and start it in normal mode.
As Roberto Franchini mentioned above, setting the dbURL parameter in the loader to use a remote URL fixed the first issue ("Can not open storage it is acquired by other process").
The issue with ./orientdb.sh still exists, but with the remote-URL approach I don't need to shut down and restart the service anymore.
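For reference, a minimal sketch of what the loader section of the ETL config.json might look like with a remote dbURL (the database name and credentials here are placeholders, not values from the original question):
{
  "loader": {
    "orientdb": {
      "dbURL": "remote:localhost/mydb",
      "dbUser": "admin",
      "dbPassword": "admin",
      "dbType": "graph"
    }
  }
}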

Resources