Celery worker in Docker using SQS gets cancelled

I am running Celery with AWS SQS as the broker, inside Docker.
Here is my config:
AWS_ACCESS_KEY_ID=***
AWS_SECRET_ACCESS_KEY=***
CELERY_BROKER_URL=sqs://
CELERY_BROKER_TRANSPORT_OPTIONS={
  region: "us-east-1",
}
Command to run Celery:
celery -A config.celery_app worker -l DEBUG
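For reference, the same settings expressed directly as Celery configuration in Python (a minimal sketch; the config/celery_app.py layout and app name are assumptions based on the command above):

# config/celery_app.py -- sketch of the SQS broker settings above (assumed layout)
from celery import Celery

app = Celery("config")
app.conf.broker_url = "sqs://"  # credentials are picked up from the AWS_* environment variables
app.conf.broker_transport_options = {
    "region": "us-east-1",
}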
The container exits with this output:
[2023-01-14 20:23:30,662: INFO/MainProcess] Connected to sqs://localhost//
[2023-01-14 20:23:30,676: DEBUG/MainProcess] Setting config variable for region to 'us-east-1'
[2023-01-14 20:23:30,683: DEBUG/MainProcess] Looking for credentials via: env
[2023-01-14 20:23:30,683: INFO/MainProcess] Found credentials in environment variables.
[2023-01-14 20:23:30,686: DEBUG/MainProcess] Loading JSON file: /usr/local/lib/python3.10/site-packages/botocore/data/endpoints.json
[2023-01-14 20:23:30,757: DEBUG/MainProcess] Loading JSON file: /usr/local/lib/python3.10/site-packages/botocore/data/sdk-default-configuration.json
[2023-01-14 20:23:30,770: DEBUG/MainProcess] Loading JSON file: /usr/local/lib/python3.10/site-packages/botocore/data/sqs/2012-11-05/service-2.json
[2023-01-14 20:23:30,790: DEBUG/MainProcess] Loading JSON file: /usr/local/lib/python3.10/site-packages/botocore/data/sqs/2012-11-05/endpoint-rule-set-1.json.gz
[2023-01-14 20:23:30,791: DEBUG/MainProcess] Loading JSON file: /usr/local/lib/python3.10/site-packages/botocore/data/partitions.json
[2023-01-14 20:23:30,800: DEBUG/MainProcess] Loading JSON file: /usr/local/lib/python3.10/site-packages/botocore/data/_retry.json
[2023-01-14 20:23:31,910: DEBUG/MainProcess] No retry needed.
[2023-01-14 20:23:32,187: DEBUG/MainProcess] No retry needed.
[2023-01-14 20:23:32,208: DEBUG/MainProcess] Canceling task consumer...
[2023-01-14 20:23:33,218: DEBUG/MainProcess] Canceling task consumer...
[2023-01-14 20:23:33,219: DEBUG/MainProcess] Closing consumer channel...
However, when I ran it with the same config outside Docker, it worked and the consumer was not cancelled.
Here is a similar issue I found on Stack Overflow, but it didn't help much: Celery/Kombu worker under Docker worker won't pick up messages from SQS

Related

Confluent Docker log4j logger level configurations

I am running Kafka locally using the confluentinc/cp-kafka Docker image and I am setting the following logging-related container environment variables:
KAFKA_LOG4J_ROOT_LOGLEVEL: ERROR
KAFKA_LOG4J_LOGGERS: >-
  org.apache.zookeeper=ERROR,
  org.apache.kafka=ERROR,
  kafka=ERROR,
  kafka.cluster=ERROR,
  kafka.controller=ERROR,
  kafka.coordinator=ERROR,
  kafka.log=ERROR,
  kafka.server=ERROR,
  kafka.zookeeper=ERROR,
  state.change.logger=ERROR
and I see in the Kafka logs that Kafka is starting with the following configuration:
===> ENV Variables ...
ALLOW_UNSIGNED=false
COMPONENT=kafka
CONFLUENT_DEB_VERSION=1
CONFLUENT_PLATFORM_LABEL=
CONFLUENT_VERSION=5.4.1
...
KAFKA_LOG4J_LOGGERS=org.apache.zookeeper=ERROR, org.apache.kafka=ERROR, kafka=ERROR, kafka.cluster=ERROR, kafka.controller=ERROR, kafka.coordinator=ERROR, kafka.log=ERROR, kafka.server=ERROR, kafka.zookeeper=ERROR, state.change.logger=ERROR
KAFKA_LOG4J_ROOT_LOGLEVEL=ERROR
...
Still, further down in the logs I see INFO and TRACE level messages. For example:
[2020-03-26 16:22:12,838] INFO [Controller id=1001] Ready to serve as the new controller with epoch 1 (kafka.controller.KafkaController)
[2020-03-26 16:22:12,848] INFO [Controller id=1001] Partitions undergoing preferred replica election: (kafka.controller.KafkaController)
[2020-03-26 16:22:12,849] INFO [Controller id=1001] Partitions that completed preferred replica election: (kafka.controller.KafkaController)
[2020-03-26 16:22:12,855] INFO [Controller id=1001] Skipping preferred replica election for partitions due to topic deletion: (kafka.controller.KafkaController)
How can I actually disable logs below a certain level? In the example above, I want only ERROR logs.
The approach above is the one described in the Confluent documentation.
And the Apache Kafka source code lists all sorts of loggers that I could not influence using the KAFKA_LOG4J_LOGGERS Docker environment variable.
I troubleshot the Dockerfiles and inspected the Kafka container. The cause of this behaviour was YAML multiline string folding.
Hence the provided environment variable (written as a folded YAML multiline value) is, at runtime:
KAFKA_LOG4J_LOGGERS=org.apache.zookeeper=ERROR, org.apache.kafka=ERROR, kafka=ERROR, kafka.cluster=ERROR, kafka.controller=ERROR, kafka.coordinator=ERROR, kafka.log=ERROR, kafka.server=ERROR, kafka.zookeeper=ERROR, state.change.logger=ERROR
instead of (no spaces in between):
KAFKA_LOG4J_LOGGERS=org.apache.zookeeper=ERROR,org.apache.kafka=ERROR,kafka=ERROR,kafka.cluster=ERROR,kafka.controller=ERROR,kafka.coordinator=ERROR,kafka.log=ERROR,kafka.server=ERROR,kafka.zookeeper=ERROR,state.change.logger=ERROR
And this was visible inside the container in the generated /etc/kafka/log4j.properties file:
log4j.rootLogger=ERROR, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
log4j.logger.kafka.authorizer.logger=WARN
log4j.logger.kafka.cluster=ERROR
log4j.logger.kafka.producer.async.DefaultEventHandler=DEBUG
log4j.logger.kafka.zookeeper=ERROR
log4j.logger.org.apache.kafka=ERROR
log4j.logger.kafka.coordinator=ERROR
log4j.logger.org.apache.zookeeper=ERROR
log4j.logger.kafka.log.LogCleaner=INFO
log4j.logger.kafka.controller=ERROR
log4j.logger.kafka=INFO
log4j.logger.kafka.log=ERROR
log4j.logger.state.change.logger=ERROR
log4j.logger.kafka=ERROR
log4j.logger.kafka.server=ERROR
log4j.logger.kafka.controller=TRACE
log4j.logger.kafka.network.RequestChannel$=WARN
log4j.logger.kafka.request.logger=WARN
log4j.logger.state.change.logger=TRACE
If you really need to split the long line in a YAML multiline value, you have to use a YAML syntax that does not fold the line breaks into spaces, for example a double-quoted scalar with escaped line breaks, as sketched below.
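A quick way to see the difference (a minimal sketch using PyYAML; the logger list is shortened for brevity):

# Sketch: a folded block scalar (>-) joins lines with spaces; a double-quoted
# scalar with escaped line breaks ("\" at the end of a line) does not.
import yaml

folded = yaml.safe_load("""\
KAFKA_LOG4J_LOGGERS: >-
  kafka=ERROR,
  kafka.controller=ERROR
""")
print(folded["KAFKA_LOG4J_LOGGERS"])  # kafka=ERROR, kafka.controller=ERROR  (space after the comma)

quoted = yaml.safe_load("""\
KAFKA_LOG4J_LOGGERS: "kafka=ERROR,\\
  kafka.controller=ERROR"
""")
print(quoted["KAFKA_LOG4J_LOGGERS"])  # kafka=ERROR,kafka.controller=ERROR  (no space)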
More hints from the code: the /etc/kafka/log4j.properties file is generated from a template when a Confluent container starts; Kafka's own log4j defaults determine the log levels it starts with; and the Kafka source code lists all the loggers that can be configured this way.

How to capture logs from workers from a Dask-Yarn job?

I have tried using the following in ~/.config/dask/distributed.yaml and ~/.config/dask/yarn.yaml:
logging-file-config: "/path/to/config.ini"
or
logging:
  version: 1
  disable_existing_loggers: false
  root:
    level: INFO
    handlers: [consoleHandler]
  handlers:
    consoleHandler:
      class: logging.StreamHandler
      level: INFO
      formatter: sample_formatter
      stream: ext://sys.stderr
  formatters:
    sample_formatter:
      format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
and then in my function that gets evaluated at the worker:
import logging
from distributed.worker import logger
import dask
from dask.distributed import Client
from dask_yarn import YarnCluster

log = logging.getLogger(__name__)

#dask.delayed
def worker_func(args):
    logger.info("This will show up in the worker logs")
    log.info("This does not show up in worker logs")
    return

if __name__ == "__main__":
    dag_1 = {'worker_func': (worker_func, arg_1)}
    tasks = dask.get(dag_1, 'load-1')
    log.info("This also shows up in logs, and custom formatted")
    cluster = YarnCluster()
    client = Client(cluster)
    dask.compute(tasks)
When I try to view the yarn logs using:
yarn logs -applicationId {application_id}
I do not see the log from log.info inside worker_func, but I do see the logs from distributed.worker.logger and from outside that function on the console. I also tried using client.get_worker_logs(), but that returned an empty dictionary. Is there a way to see customized logs from inside the function that gets evaluated at a worker?
There's a lot going on in this question, so I'm going to answer "How do I configure logging for dask-yarn workers" and hope everything else becomes clear by answering that.
Dask's configuration system is loaded locally on the machine you start a dask cluster from (usually the edge node). This configuration is not distributed to the workers automatically; you're responsible for doing that yourself. You have a few options here:
Have admin/IT put configuration in /etc/dask/ on every node, which will affect all users.
Bundle configuration with your packaged environment. Dask will load configuration from {prefix}/etc/dask/, where prefix is sys.prefix.
For example, if you have a conda environment at /path/to/environment, you'd do the following to bundle the configuration
# Create the configuration directory in the environment
mkdir -p /path/to/environment/etc/dask/
# Add your configuration to this directory
mv config.yaml /path/to/environment/etc/dask/config.yaml
# Package the environment
conda pack -p /path/to/environment -o environment.tar.gz
Any configuration values set in config.yaml will now be available on all the worker nodes. An example configuration file setting some logging configuration would be:
logging:
  version: 1
  root:
    level: INFO
    handlers: [consoleHandler]
  handlers:
    consoleHandler:
      class: logging.StreamHandler
      level: INFO
      formatter: sample_formatter
      stream: ext://sys.stderr
  formatters:
    sample_formatter:
      format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
Logs from completed dask-yarn applications can be retrieved using the YARN CLI:
yarn logs -applicationId <application-id>
Logs for running dask-yarn applications can be retrieved using client.get_worker_logs(). Note that these logs will only contain output written to the distributed.worker logger; you cannot write to your own logger and have its output appear in the output of client.get_worker_logs(). To write to this logger, get it via:
import logging
logger = logging.getLogger("distributed.worker")
logger.info("Writing with the worker logger")
Any logger appropriately configured to log to stdout or stderr will appear in the logs accessed via the yarn CLI, but only the distributed.worker logger output will also be available to get_worker_logs().
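For example, a task can attach its own StreamHandler so that its messages reach stderr, and therefore the YARN-aggregated logs (a minimal sketch; the logger name and message are illustrative):

import logging
import sys

def worker_func(arg):
    # Configure logging inside the worker process; the local logging config
    # is not shipped to the workers automatically.
    log = logging.getLogger("my_app")  # illustrative logger name
    if not log.handlers:
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(logging.Formatter(
            "%(asctime)s - %(name)s - %(levelname)s - %(message)s"))
        log.addHandler(handler)
        log.setLevel(logging.INFO)
    log.info("Visible via `yarn logs`, but not via client.get_worker_logs()")
    return arg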
Side note
I have tried using the following in ~/.config/dask/distributed.yaml and ~/.config/dask/yarn.yaml
The names of the config files don't matter; dask loads all YAML files in all config directories and merges their contents. For more information, please read the configuration docs.
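To check that a bundled configuration actually reached the workers, one option (a sketch, assuming a cluster like the one above is already running) is to ask each worker for its merged logging config:

import dask
from dask.distributed import Client

client = Client(cluster)  # e.g. the YarnCluster created earlier

# client.run executes a function on every worker and returns a dict
# keyed by worker address.
print(client.run(lambda: dask.config.get("logging", default={})))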

shoryuken error to `validate_queues': The specified queue(s)

I am using the Rails "Shoryuken" gem, but I am getting a queue validation error in my development environment when I start the Rails server. Below is the error:
gems/shoryuken-2.0.11/lib/shoryuken/environment_loader.rb:172:in `validate_queues': The specified queue(s) ["development_worker"] do not exist (ArgumentError)
I have used the settings below:
config/shoryuken.yml
aws:
  access_key_id: <%= ENV["SQS_IAM"] %>
  secret_access_key: <%= ENV["SQS_IAM_SECRET"] %>
  region: <%= ENV["SQS_IAM_REGION"] %>
concurrency: 25 # The number of allocated threads to process messages. Default 25
delay: 0        # The delay in seconds to pause a queue when it's empty. Default 0
queues:
  - ["<%= Rails.env %>_worker", 1]
initializers/shoryuken.rb
def parse_config(config_file)
  if File.exist?(config_file)
    YAML.load(ERB.new(IO.read(config_file)).result)
  else
    raise ArgumentError, "Config file #{config_file} does not exist"
  end
end

config = parse_config([Dir.pwd, 'config/shoryuken.yml'].join('/')).deep_symbolize_keys
Shoryuken::EnvironmentLoader.load(config)
I want the queue to be environment specific.
Ravindra, looking at the Shoryuken code, you are getting the error because you have not created an SQS queue named development_worker, is that right?
You will need to create one queue for each developer, because Shoryuken will be running on each developer's computer. If you don't, the Shoryuken processes on every computer will be polling messages from the same queue.
Imagine there are two Shoryuken processes, sh1 and sh2, running on the dev1 and dev2 machines respectively.
If dev1 sends an SQS message to the shared dev queue, the sh2 process may end up polling the message that dev1 sent.
If you wish to avoid creating the queues in AWS, you can take a look at running a local SQS-compatible mock instead.
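For example, the environment-specific queue can be created ahead of time; a minimal sketch with boto3 (the queue name matches the error above, the region is illustrative):

import boto3

# Create the environment-specific queue that Shoryuken expects to find.
sqs = boto3.client("sqs", region_name="us-east-1")
sqs.create_queue(QueueName="development_worker")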

Enqueued jobs are not processed by Sidekiq

I'm using Sidekiq with Rails on Heroku to execute asynchronous workers, but I am not able to get it working. I've tried launching those workers manually (executing the perform method) and they work; however, when I schedule them and check the Sidekiq web UI, they appear as enqueued and are never processed.
I'm using a Heroku worker dyno and Redis To Go, and this is the config:
sidekiq.rb
require 'sidekiq'

Sidekiq.configure_client do |config|
  config.redis = { :size => 1 }
end

Sidekiq.configure_server do |config|
  config.redis = { :size => 2 }
end
sidekiq.yml
:concurrency: 2
Worker launch
Sidekiq::Client.enqueue(MyWorker)
Sidekiq and Redis seem to be properly connected, because I get the following log on Heroku:
2015-04-27T21:46:51Z 3 TID-ovbydc96k INFO: Sidekiq client with redis options {:size=>1, :url=>"redis://redistogo:REDACTED#soapfish.redistogo.com:xxxx/"}
I'm missing something but I don't know what. Any help would be appreciated. Thanks!
Have you run this command so that Sidekiq knows to use the Redis-To-Go URL?
heroku config:set REDIS_PROVIDER=REDISTOGO_URL

Start RabbitMQ server failed

I start my RabbitMQ server with the command ./rabbitmq-server.
But while starting the broker, it shows "BOOT FAILED":
BOOT FAILED
Error description:
function_clause
Stack trace:
[{rabbit_networking,'-boot_tcp/0-lc$^0/1-0-',[5670],[]},
{rabbit_networking,boot_tcp,0,[]},
{rabbit_networking,boot,0,[]},
{rabbit,'-run_boot_step/1-lc$^1/1-1-',1,[]},
{rabbit,run_boot_step,1,[]},
{rabbit,'-start/2-lc$^0/1-0-',1,[]},
{rabbit,start,2,[]},
{application_master,start_it_old,4,
[{file,"application_master.erl"},{line,272}]}]
Your configuration is incorrect. The variable tcp_listeners in rabbitmq.config is set to an integer, 5670, but it needs to be set to a list of integers, such as [5670]. See the RabbitMQ configuration guide for more information.
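For reference, a minimal classic-format rabbitmq.config sketch with the listener port expressed as a list (5670 is taken from the stack trace above; adjust to your actual port):

%% rabbitmq.config (classic Erlang-term format) -- sketch
[
  {rabbit, [
    {tcp_listeners, [5670]}
  ]}
].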
