Spring Cloud Dataflow Kafka Topic creation/configuration - spring-cloud-dataflow

I created a Processor here I will call it my-transform.
And a Stream:
:my-topic > my-transform | log
And I configured the Processor not to auto create Topic with this config
and wrote the Code as a Function Bean.
spring:
cloud:
stream:
kafka:
binder:
brokers: localhost:9093
auto-create-topics: false
function:
bindings:
transform-in-0: input
transform-out-0: output
The Problem is that the Topic (my-topic) is still created.
But it should not be created.
Instead, I want to create with another application.
So it can be configured with the right retention policy.

In my Kafka installation, autoCreateTopicsEnable was still enabled.

Related

Deploying smart contract using truffle on private blockchain node on docker

I am facing problems deploying a smart contract on my private blockchain network. I created my blockchain network on three VMs (miners) using puppeth on a fourth VM (controller) by following the steps in this blog: https://medium.com/#collin.cusce/using-puppeth-to-manually-create-an-ethereum-proof-of-authority-clique-network-on-aws-ae0d7c906cce
Afterwards, I installed truffle on one of the miners VMs and i initialized truffle using the command:
truffle init
Then I wrote a simple hello world smart contract, compiled it and deployed it on truffle development blockchain and it worked. However, I tried to deploy it on my private blockchain but I can't connect to the network.
The admin.nodeInfo command in geth console returns the folowing output:
docker exec -it 954cd3955065 geth attach ipc:/root/.ethereum/geth.ipc
Welcome to the Geth JavaScript console!
instance: Geth/v1.9.25-unstable-ead81461-20201123/linux-amd64/go1.15.5
coinbase: 0xe8cc4bea2cfdfd14cddefe1141bedd109576b9a9
at block: 78558 (Tue Dec 01 2020 22:01:02 GMT+0000 (UTC))
datadir: /root/.ethereum
modules: admin:1.0 clique:1.0 debug:1.0 eth:1.0 miner:1.0 net:1.0 personal:1.0 rpc:1.0 txpool:1.0 web3:1.0
To exit, press ctrl-d
> admin.nodeInfo
{
enode: "enode://7206ca3c62f6db47e1230dcf14a765d4c9b4870a66470dbb21fcc5ed2fab2167d6bcc47eec8044c42037b3e6e0017aeb8ddfc3580471da54a6c7274a0c1fe46b#10.100.2.32:30303",
enr: "enr:-Je4QGXlVAESp8r2s1uHBJxoDLWQo8IvZsbe5sX2YRBb0un9Gdlt8nfDKQBR_j0lDPtaoCCuis4cJJlqtEHfa4tLO2EIg2V0aMfGhG5b-B6AgmlkgnY0gmlwhApkAiCJc2VjcDI1NmsxoQNyBso8YvbbR-EjDc8Up2XUybSHCmZHDbsh_MXtL6shZ4N0Y3CCdl-DdWRwgnZf",
id: "027a351994ac1b127df56180b6210310cc0164f17f1b12c167cb167c4ffaa122",
ip: "10.100.2.32",
listenAddr: "[::]:30303",
name: "Geth/v1.9.25-unstable-ead81461-20201123/linux-amd64/go1.15.5",
ports: {
discovery: 30303,
listener: 30303
},
protocols: {
eth: {
config: {
byzantiumBlock: 0,
chainId: 1515,
clique: {...},
constantinopleBlock: 0,
eip150Block: 0,
eip150Hash: "0x0000000000000000000000000000000000000000000000000000000000000000",
eip155Block: 0,
eip158Block: 0,
homesteadBlock: 0,
istanbulBlock: 0,
petersburgBlock: 0
},
difficulty: 98201,
genesis: "0x17f752387c901db617cf0594ecd2cb9811dfcd666318c2e0e7cb0239471da979",
head: "0xf8a37d0390558746901faa55463c127c553f02cf2d23ce0cb469fcd470c810f9",
network: 1515
}
}
}
I tried adding the network configuration in truffle-config.js like this:
devnet2: {
host: "localhost",
port: "30303", //port where the node is
network_id: "*",
from: 0x91cd7b879fefff34259d577a56d290b3315bf9b3 // Treats this network as if it was a public net. (default: false)
}
then, when deploying using the command truffle deploy --network devnet2 i always get this error:
Compiling your contracts...
===========================
> Everything is up to date, there is nothing to compile.
/usr/local/lib/node_modules/truffle/build/webpack:/packages/provider/index.js:56
throw new Error(errorMessage);
^
Error: There was a timeout while attempting to connect to the network.
Check to see that your provider is valid.
If you have a slow internet connection, try configuring a longer timeout in your Truffle config. Use the networks[networkName].networkCheckTimeout property to do this.
at Timeout.setTimeout (/usr/local/lib/node_modules/truffle/build/webpack:/packages/provider/index.js:56:1)
at ontimeout (timers.js:436:11)
at tryOnTimeout (timers.js:300:5)
at listOnTimeout (timers.js:263:5)
at Timer.processTimers (timers.js:223:10)
I tried extending the timeout limit but it didn't work. I also tried using Web3 Providers (HTTPProvider and IPCProvider) but without any luck (i can give more details, if needed).
Any help is well appreciated because i spent a lot of time on it without getting anywhere. Unfortunately, i couldn't find anything on deploying smart contracts to a node that is running on docker. If needed, i can gladly give more details about what i did.
I managed to run smart contracts on a private network, not using docker however. Some things come to mind. did you run a miner on your network? you will need to run a miner so that the contract gets migrated. Did you make sure that the gaslimit is met when running the contract? the miners will wait for the max gas limit to be reached before processing any request.
Did you already deploy the contract? in the migration scripts you either create a new migration script by bumping the version or use the reset flag to run all migration scripts again.

Flink consumer job is unable to connect to Kafka from Docker

I have a simple streaming Flink Scala job which connects to a Kafka topic and maps its
org.apache.avro.generic.GenericRecord messages and map into Json string.
When it is running in IntelliJ it ingests the topic well and printing out the jsons.
When I run it in docker-compose I got the following exception:
com.esotericsoftware.kryo.KryoException: Error constructing instance of class: org.apache.avro.Schema$LockableArrayList
Serialization trace:
types (org.apache.avro.Schema$UnionSchema)
schema (org.apache.avro.Schema$Field)
fieldMap (org.apache.avro.Schema$RecordSchema)
schema (org.apache.avro.generic.GenericData$Record)
at com.twitter.chill.Instantiators$$anon$1.newInstance(KryoBase.scala:136)
at com.esotericsoftware.kryo.Kryo.newInstance(Kryo.java:1061)
at com.esotericsoftware.kryo.serializers.CollectionSerializer.create(CollectionSerializer.java:89)
at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:93)
at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:22)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:679)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:528)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:679)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:528)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761)
at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:143)
at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:21)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:679)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:528)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:679)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:528)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:657)
at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.copy(KryoSerializer.java:262)
at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.copy(TupleSerializer.java:111)
at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.copy(TupleSerializer.java:37)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:635)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:612)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:592)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:727)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:705)
at org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collect(StreamSourceContexts.java:104)
at org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collectWithTimestamp(StreamSourceContexts.java:111)
at org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcher.emitRecordWithTimestamp(AbstractFetcher.java:398)
at org.apache.flink.streaming.connectors.kafka.internal.KafkaFetcher.emitRecord(KafkaFetcher.java:185)
at org.apache.flink.streaming.connectors.kafka.internal.KafkaFetcher.runFetchLoop(KafkaFetcher.java:150)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:715)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:208)
Caused by: java.lang.IllegalAccessException: Class com.twitter.chill.Instantiators$ can not access a member of class org.apache.avro.Schema$LockableArrayList with modifiers "public"
at sun.reflect.Reflection.ensureMemberAccess(Reflection.java:102)
at java.lang.reflect.AccessibleObject.slowCheckMemberAccess(AccessibleObject.java:296)
at java.lang.reflect.AccessibleObject.checkAccess(AccessibleObject.java:288)
at java.lang.reflect.Constructor.newInstance(Constructor.java:413)
at com.twitter.chill.Instantiators$.$anonfun$normalJava$1(KryoBase.scala:170)
at com.twitter.chill.Instantiators$$anon$1.newInstance(KryoBase.scala:133)
... 37 more
I tried forcing Avro serialization with:
val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
env.getConfig.disableForceKryo()
env.getConfig.enableForceAvro()
but got the same error.
Based on [this][1] I'm using all Flink related dependencies as "provided" with no good result.
What can be the difference between running the job in the IDE and in
Docker?
How can I fix the job to be able to read the Kafka topic from
Docker?
How shall I setup Docker for this?
What can I handle Kryo/Avro serialization issue?
SS
[1]: http://www.alternatestack.com/development/com-esotericsoftware-kryo-kryoexception-unusual-solution-upgrading-flink/

Confluent Docker log4j logger level configurations

I am running locally Kafka using the confluentinc/cp-kafka Docker image and I am setting the following logging container environment variables:
KAFKA_LOG4J_ROOT_LOGLEVEL: ERROR
KAFKA_LOG4J_LOGGERS: >-
org.apache.zookeeper=ERROR,
org.apache.kafka=ERROR,
kafka=ERROR,
kafka.cluster=ERROR,
kafka.controller=ERROR,
kafka.coordinator=ERROR,
kafka.log=ERROR,
kafka.server=ERROR,
kafka.zookeeper=ERROR,
state.change.logger=ERROR
and I see in the Kafka logs that Kafka is starting with the following configuration:
===> ENV Variables ...
ALLOW_UNSIGNED=false
COMPONENT=kafka
CONFLUENT_DEB_VERSION=1
CONFLUENT_PLATFORM_LABEL=
CONFLUENT_VERSION=5.4.1
...
KAFKA_LOG4J_LOGGERS=org.apache.zookeeper=ERROR, org.apache.kafka=ERROR, kafka=ERROR, kafka.cluster=ERROR, kafka.controller=ERROR, kafka.coordinator=ERROR, kafka.log=ERROR, kafka.server=ERROR, kafka.zookeeper=ERROR, state.change.logger=ERROR
KAFKA_LOG4J_ROOT_LOGLEVEL=ERROR
...
Still I see further down in the logs the INFO and TRACE log levels. For example:
[2020-03-26 16:22:12,838] INFO [Controller id=1001] Ready to serve as the new controller with epoch 1 (kafka.controller.KafkaController)
[2020-03-26 16:22:12,848] INFO [Controller id=1001] Partitions undergoing preferred replica election: (kafka.controller.KafkaController)
[2020-03-26 16:22:12,849] INFO [Controller id=1001] Partitions that completed preferred replica election: (kafka.controller.KafkaController)
[2020-03-26 16:22:12,855] INFO [Controller id=1001] Skipping preferred replica election for partitions due to topic deletion: (kafka.controller.KafkaController)
How can I really deactivate the logs below a certain level? In the example above, I really want only ERROR logs.
The approach above is the way described in the Confluent documentation.
And the Apache Kafka source code lists all sorts of loggers that I could not influence using the KAFKA_LOG4J_LOGGERS Docker environment variable.
I went and troubleshot the Dockerfile's and inspected the Kafka container. The cause of this behaviour was the YAML multiline string folding.
Hence the provided environment variable (using a YAML multiline value) is at runtime:
KAFKA_LOG4J_LOGGERS=org.apache.zookeeper=ERROR, org.apache.kafka=ERROR, kafka=ERROR, kafka.cluster=ERROR, kafka.controller=ERROR, kafka.coordinator=ERROR, kafka.log=ERROR, kafka.server=ERROR, kafka.zookeeper=ERROR, state.change.logger=ERROR
instead of (no spaces in between):
KAFKA_LOG4J_LOGGERS=org.apache.zookeeper=ERROR,org.apache.kafka=ERROR, kafka=ERROR, kafka.cluster=ERROR,kafka.controller=ERROR, kafka.coordinator=ERROR,kafka.log=ERROR,kafka.server=ERROR,kafka.zookeeper=ERROR,state.change.logger=ERROR
And this was visible inside the container in the generated /etc/kafka/log4j.properties file:
log4j.rootLogger=ERROR, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
log4j.logger.kafka.authorizer.logger=WARN
log4j.logger.kafka.cluster=ERROR
log4j.logger.kafka.producer.async.DefaultEventHandler=DEBUG
log4j.logger.kafka.zookeeper=ERROR
log4j.logger.org.apache.kafka=ERROR
log4j.logger.kafka.coordinator=ERROR
log4j.logger.org.apache.zookeeper=ERROR
log4j.logger.kafka.log.LogCleaner=INFO
log4j.logger.kafka.controller=ERROR
log4j.logger.kafka=INFO
log4j.logger.kafka.log=ERROR
log4j.logger.state.change.logger=ERROR
log4j.logger.kafka=ERROR
log4j.logger.kafka.server=ERROR
log4j.logger.kafka.controller=TRACE
log4j.logger.kafka.network.RequestChannel$=WARN
log4j.logger.kafka.request.logger=WARN
log4j.logger.state.change.logger=TRACE
If you really need to split the long line in a YAML multiline value, you would have to use this YAML syntax.
More hints from the code:
here is where the log4j.properties file is generated when a confluent container is run.
these are the default log levels that Kafka will start with.
these should be all the loggers supported by Kafka

How to capture logs from workers from a Dask-Yarn job?

I have tried using the following in ~/.config/dask/distributed.yaml and ~/.config/dask/yarn.yaml,
logging-file-config: "/path/to/config.ini"
or
logging:
version: 1
disable_existing_loggers: false
root:
level: INFO
handlers: [consoleHandler]
handlers:
consoleHandler:
class: logging.StreamHandler
level: INFO
formatter: sample_formatter
stream: ext://sys.stderr
formatters:
sample_formatter:
format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
and then in my function that gets evaluated at the worker:
import logging
from distributed.worker import logger
import dask
from dask.distributed import Client
from dask_yarn import YarnCluster
log = logging.getLogger(__name__)
#dask.delayed
def worker_func(args):
logger.info("This will show up in the worker logs")
log.info("This does not show up in worker logs")
return
if __name__ == "__main__":
dag_1 = {'worker_func': (worker_func, arg_1)}
tasks = dask.get(dag_1, 'load-1')
log.info("This also shows up in logs, and custom formatted)
cluster = YarnCluster()
client = Client(cluster)
dask.compute(tasks)
When I try to view the yarn logs using:
yarn logs -applicationId {application_id}
I do not see the log from log.info inside worker_func, but I do see the logs from distributed.worker.logger and from outside that function on the console. I also tried using client.get_worker_logs(), but that returned an empty dictionary. Is there a way to see customized logs from inside the function that gets evaluated at a worker?
There's a lot going on in this question, so I'm going to answer "How do I configure logging for dask-yarn workers" and hope everything else becomes clear by answering that.
Dask's configuration system is loaded locally on the machine you start a dask cluster from (usually the edge node). This configuration is not distributed to the workers automatically, you're responsible for doing that yourself. You have a few options here:
Have admin/IT put configuration in /etc/dask/ on every node, which will affect all users.
Bundle configuration with your packaged environment. Dask will load configuration from {prefix}/etc/dask/, where prefix is sys.prefix.
For example, if you have a conda environment at /path/to/environment, you'd do the following to bundle the configuration
# Create the configuration directory in the environment
mkdir -p /path/to/environment/etc/dask/
# Add your configuration to this directory
mv config.yaml /path/to/environment/etc/dask/config.yaml
# Package the environment
conda pack -p /path/to/environment -o environment.tar.gz
Any configuration values set in config.yaml will now be available on all the worker nodes. An example configuration file setting some logging configuration would be:
logging:
version: 1
root:
level: INFO
handlers: [consoleHandler]
handlers:
consoleHandler:
class: logging.StreamHandler
level: INFO
formatter: sample_formatter
stream: ext://sys.stderr
formatters:
sample_formatter:
format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
Logs from completed dask-yarn applications can be retrieved using the YARN cli at
yarn logs -applicationId <application-id>
Logs for running dask-yarn applications can be retrieved using client.get_worker_logs(). Note that these logs will only contain logs written to the distributed.worker logger. You cannot write to your own logger and have them appear in the output of client.get_worker_logs(). To write to this logger, get it via
import logging
logger = logging.getLogger("distributed.worker")
logger.info("Writing with the worker logger")
Any logger appropriately configured to log to stdout or stderr will appear in the logs accessed via the yarn CLI, but only the distributed.worker logger output will also be available to get_worker_logs().
Side note
I have tried using the following in ~/.config/dask/distributed.yaml and ~/.config/dask/yarn.yaml
The name of the config files doesn't matter, dask loads all yaml files in all config directories and merges their contents. For more information please read the configuration docs

how to configure jmx exporter in tomcat for prometheus

I am trying to configure jmx monitor for monitor my java metrics. but facing some issue which is described below.
My current process:
I set below parameters in my catalina.sh file.
Prometheus_JMX_OPTS="-javaagent:/home/centos/jmx_prometheus_javaagent-0.11.0.jar=7777:/home/centos/config.yml"
JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=3000 -Dcom.sun.management.jmxremote.rmi.port=3000 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
JAVA_OPTS="-Xms${JVM_MINIMUM_MEMORY} -Xmx${JVM_MAXIMUM_MEMORY} ${JAVA_OPTS} ${OPC_JVM_ARGS} ${JVM_REQUIRED_ARGS} ${DISABLE_NOTIFICATIONS} ${JVM_SUPPORT_RECOMMENDED_ARGS} ${JVM_EXTRA_ARGS} ${JIRA_HOME_MINUSD} ${JMX_OPTS} ${Prometheus_JMX_OPTS}"
I download the jmx_prometheus_javaagent-0.11.0.jar file in /home/centos path.
Create a config file with below content.
startDelaySeconds: 0
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
Open 7777 port from Security groups.
Now when I trying to access http://localhost:7777/metrics then it is showing unable to reach now.
Anyone can help me into this, I am stuck here . ☺

Resources