MLflow: Unable to store artifacts to S3 - docker

I'm running my MLflow tracking server in a Docker container on a remote server and trying to log MLflow runs from my local computer, with the eventual goal that anyone on my team can send their run data to the same tracking server. I've set the tracking URI to http://<ip of remote server>:<port on docker container>. I'm not explicitly setting any AWS credentials on the local machine because I would like to just train locally and log to the remote server (run data to RDS and artifacts to S3).

I have no problem logging my runs to the RDS database, but I keep getting the following error when it gets to the point of logging artifacts: botocore.exceptions.NoCredentialsError: Unable to locate credentials. Do I have to have the credentials available outside of the tracking server for this to work (i.e., on my local machine where the MLflow runs are taking place)?

I know that all of my credentials are available in the Docker container that hosts the tracking server. I've been able to upload files to my S3 bucket using the AWS CLI inside that container, so I know that it has access. I'm confused by the fact that I can log to RDS but not S3. I'm not sure what I'm doing wrong at this point. TIA.

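Yes, apparently I do need to have the credentials available to the local client as well. In this setup the client uploads artifacts to S3 directly rather than through the tracking server, which is why run data reached RDS while the artifact upload failed with NoCredentialsError.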
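
A minimal sketch of what this can look like on the client side, assuming the credentials are supplied through the standard AWS environment variables. boto3 also reads ~/.aws/credentials and other sources, which this simple check ignores. The helper names are hypothetical and not part of MLflow.

```python
import os

# Environment variables that boto3 looks at for static credentials.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_env(env=None):
    """Return the names of credential variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


def log_artifact_to_remote(tracking_uri, local_path):
    """Log one file as an artifact of a new run on a remote tracking server.

    Assumes the server's artifact root is an S3 bucket and that the client
    uploads to it directly, as in the question above.
    """
    # Imported lazily so the credential check is usable without MLflow installed.
    import mlflow

    missing = missing_aws_env()
    if missing:
        # Fail early with a clearer message than botocore's NoCredentialsError.
        raise RuntimeError("Missing AWS credentials on the client: " + ", ".join(missing))

    mlflow.set_tracking_uri(tracking_uri)
    with mlflow.start_run():
        mlflow.log_artifact(local_path)
```

With the two variables exported in the local shell, a call such as log_artifact_to_remote("http://<ip of remote server>:<port on docker container>", "model.pkl") would log both the run and the file; "model.pkl" is only a placeholder file name.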

Related

Local IMAP server on docker

I want to set up a local IMAP server within my home network for archiving emails. The server does not need to be accessible via the internet, so I can do without secured access via SSL (if this makes it easier). I want to integrate the server into my current Docker setup, so it has to run within a Docker container.
I already tried the following containers:
https://hub.docker.com/r/blackflysolutions/dovecot
https://hub.docker.com/r/dovecot/dovecot
https://hub.docker.com/r/mailu/dovecot
https://hub.docker.com/r/mailcow/dovecot
https://hub.docker.com/r/eilandert/dovecot
But I could not get any of them to run, and none of them has a forum or anywhere else where I can ask a question. Two of them (mailu/dovecot and mailcow/dovecot) are part of a bigger mail server package, which I do not need; I only want an IMAP server to store some email locally. I tried them anyway.
Does anyone know how to get any of these to run, or can anyone suggest another stable Docker container solution?

Spring Cloud DataFlow: access to application.jar

Spring Cloud DataFlow (SCDF) is deployed and running in a Docker container.
The applications are deployed on localhost.
The task is to register the applications in SCDF.
When I use the URI file:///${HOME}/IdeaProjects/file_read_maven-0.0.1-SNAPSHOT, where ${HOME} is part of the absolute path on localhost, I get "Error: Unable to access jarfile." in the log.
When I move the application from localhost into the container in which SCDF is deployed, I get the same result.
What could be a possible solution?
To resolve local artifacts on your laptop from inside the running Docker container, you will have to mount the volume where the artifacts are hosted.
You can then resolve them either from the mounted file system or, alternatively, from the local Maven cache.
More details here.
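
Once a volume is mounted, the URI used for registration has to refer to the path as seen inside the container, not the path on the laptop. A small Python sketch of that translation follows, assuming a hypothetical mount of /home/me/IdeaProjects on the host to /root/apps in the container; the function name, both directories, and the jar name are illustrative only.

```python
from pathlib import PurePosixPath


def container_file_uri(host_jar, host_dir, container_dir):
    """Map a jar path on the host to the file:// URI visible inside the container.

    host_dir is the host side of the volume mount and container_dir is the
    container side. Raises ValueError if the jar is not under host_dir,
    which means it is not covered by the mount at all.
    """
    relative = PurePosixPath(host_jar).relative_to(PurePosixPath(host_dir))
    return (PurePosixPath(container_dir) / relative).as_uri()


# Hypothetical mount: -v /home/me/IdeaProjects:/root/apps
uri = container_file_uri(
    "/home/me/IdeaProjects/app-0.0.1-SNAPSHOT.jar",
    "/home/me/IdeaProjects",
    "/root/apps",
)
print(uri)
```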

Connecting to a remote ArangoDB dockerized server

I am a beginner with ArangoDB and I am trying to deploy my first project using it. The website is PHP based. I created an ArangoDB Docker container on DigitalOcean so that I can access it from the browser with the IPv4 address provided. Public access to port 8529 is enabled. Locally, I am able to modify the .config file to point to the corresponding IP and I can painlessly retrieve data.
As a hosting provider I am using one.com. When I upload the same files that run fine locally to my own domain, I get the following error:
["_database":"ArangoDBClient\Connection":private]=> string(7) "_system" } ArangoDBClient\ConnectException: cannot connect to endpoint 'tcp://xxx.xx.xxx.xxx:8529/': Connection timed out
I want to mention that I have also tried ArangoOasis. No luck with it; I get the same error. I have been at it for quite a few weeks and could very much use some guidance, even just on what to do next, as I am out of ideas and documentation to read.

Cannot access S3 bucket from WildFly running in Docker

I am trying to configure WildFly using the Docker image jboss/wildfly:10.1.0.Final to run in domain mode. I am using Docker for macOS 8.06.1-ce with aufs storage.
I followed the instructions at this link: https://octopus.com/blog/wildfly-s3-domain-discovery. It seems pretty simple, but I am getting the error:
WFLYHC0119: Cannot access S3 bucket 'wildfly-mysaga': WFLYHC0129: bucket 'wildfly-mysaga' could not be accessed (rsp=403 (Forbidden)). Maybe the bucket is owned by somebody else or the authentication failed.
But my access key, secret and bucket name are correct. I can use them to connect to S3 using the AWS CLI.
What could I be doing wrong? The tutorial seems to run it on an EC2 instance, while my test is in Docker. Maybe it is a certificate problem?
I generated access keys from an admin user and it worked.

Airflow: Could not send worker log to S3

I deployed Airflow webserver, scheduler, worker, and flower on my kubernetes cluster using Docker images.
Airflow version is 1.8.0.
Now I want to send worker logs to S3, so I did the following:
1. Created an S3 connection in Airflow from the Admin UI (just set S3_CONN as the conn id and s3 as the type; because my Kubernetes cluster is running on AWS and all nodes have S3 access roles, this should be sufficient).
2. Set the Airflow config as follows:
remote_base_log_folder = s3://aws-logs-xxxxxxxx-us-east-1/k8s-airflow
remote_log_conn_id = S3_CONN
encrypt_s3_logs = False
First I tried a DAG that just raises an exception immediately after it starts running. This works, and the log can be seen on S3.
Then I modified the DAG so that it creates an EMR cluster and waits for it to be ready (waiting status). To do this, I restarted all 4 Airflow Docker containers.
Now the DAG appears to work: a cluster is started and, once it's ready, the DAG is marked as success. But I see no logs on S3.
There is no related error in the worker or web server logs, so I cannot even see what may be causing this issue. The log is just not sent.
Does anyone know if there is some restriction on remote logging in Airflow, other than this description in the official documentation?
https://airflow.incubator.apache.org/configuration.html#logs
In the Airflow Web UI, local logs take precedence over remote logs. If
local logs can not be found or accessed, the remote logs will be
displayed. Note that logs are only sent to remote storage once a task
completes (including failure). In other words, remote logs for running
tasks are unavailable.
I wouldn't expect this, but are the logs not sent to remote storage when a task succeeds?
The boto version that is installed with Airflow is 2.46.1, and that version doesn't use IAM instance roles.
Instead, you will have to add an access key and secret for an IAM user that has access to the bucket, in the extra field of your S3_CONN connection.
Like so:
{"aws_access_key_id":"123456789","aws_secret_access_key":"secret12345"}
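
If the extra value is generated by a script rather than typed by hand, a minimal Python sketch can build it with the standard json module so the quoting is always valid. The function name is hypothetical, and the key values are the same placeholders as above, not real credentials.

```python
import json


def s3_conn_extra(access_key_id, secret_access_key):
    """Build the JSON string for the extra field of the S3 connection."""
    return json.dumps(
        {
            "aws_access_key_id": access_key_id,
            "aws_secret_access_key": secret_access_key,
        }
    )


# Placeholder values only; paste the printed string into the extra field.
extra = s3_conn_extra("123456789", "secret12345")
print(extra)
```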
