Kafka 2.12-2.4.0, Confluent 5.4.1.
I am trying to use Confluent's Schema Registry.
When I start Schema Registry and connect-distributed, the Connect logs do not report any errors.
connect-avro-distributed.properties
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://k2:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://k2:8081
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
plugin.path=/usr/local/tools/confluent-5.4.1/share/java,/usr/local/tools/kafka/kafka_2.12-2.4.0/plugin
I have pointed plugin.path at the Confluent jar directories so that Connect can find the converter classes.
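As a sanity check (just a sketch; the log location is an assumption, use wherever your connect-distributed output goes), the worker's startup log should show the converter being registered from plugin.path:

# Check whether Connect registered the Avro converter as a plugin at startup
grep "Added plugin 'io.confluent.connect.avro.AvroConverter'" /path/to/connect-distributed.log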
But when I POST the connector request:
{
  "name": "dbz-mysql-avro-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "tasks.max": "1",
    "database.hostname": "xx.xx.xx.xx",
    "database.port": "3306",
    "database.user": "debezium",
    "database.history.kafka.topic": "dbhistory.debezium.mysql.avro",
    "database.password": "123456",
    "database.server.id": "184124",
    "database.server.name": "debezium",
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url": "http://k2:8081",
    "value.converter.schema.registry.url": "http://k2:8081",
    "table.whitelist": "debeziumdb.hosttable",
    "database.history.kafka.bootstrap.servers": "k1:9092,k2:9092,k3:9092"
  }
}
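For reference, this JSON is submitted to the Connect REST API roughly like this (the host, port, and file name are assumptions):

# Hypothetical host/port; the JSON above is assumed to be saved as dbz-mysql-avro-connector.json
curl -s -X POST -H "Content-Type: application/json" \
     --data @dbz-mysql-avro-connector.json \
     http://localhost:8083/connectors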
The following exception is thrown:
[2020-04-23 10:37:00,064] INFO Creating task dbz-mysql-avro-connector-0 (org.apache.kafka.connect.runtime.Worker:419)
[2020-04-23 10:37:00,065] INFO ConnectorConfig values:
config.action.reload = restart
connector.class = io.debezium.connector.mysql.MySqlConnector
errors.log.enable = false
errors.log.include.messages = false
errors.retry.delay.max.ms = 60000
errors.retry.timeout = 0
errors.tolerance = none
header.converter = null
key.converter = class io.confluent.connect.avro.AvroConverter
name = dbz-mysql-avro-connector
tasks.max = 1
transforms = []
value.converter = class io.confluent.connect.avro.AvroConverter
(org.apache.kafka.connect.runtime.ConnectorConfig:347)
[2020-04-23 10:37:00,065] INFO EnrichedConnectorConfig values:
config.action.reload = restart
connector.class = io.debezium.connector.mysql.MySqlConnector
errors.log.enable = false
errors.log.include.messages = false
errors.retry.delay.max.ms = 60000
errors.retry.timeout = 0
errors.tolerance = none
header.converter = null
key.converter = class io.confluent.connect.avro.AvroConverter
name = dbz-mysql-avro-connector
tasks.max = 1
transforms = []
value.converter = class io.confluent.connect.avro.AvroConverter
(org.apache.kafka.connect.runtime.ConnectorConfig$EnrichedConnectorConfig:347)
[2020-04-23 10:37:00,067] INFO TaskConfig values:
task.class = class io.debezium.connector.mysql.MySqlConnectorTask
(org.apache.kafka.connect.runtime.TaskConfig:347)
[2020-04-23 10:37:00,067] INFO Instantiated task dbz-mysql-avro-connector-0 with version 1.1.0.Final of type io.debezium.connector.mysql.MySqlConnectorTask (org.apache.kafka.connect.runtime.Worker:434)
[2020-04-23 10:37:00,067] ERROR Failed to start task dbz-mysql-avro-connector-0 (org.apache.kafka.connect.runtime.Worker:470)
java.lang.NoClassDefFoundError: io/confluent/connect/avro/AvroConverterConfig
at io.confluent.connect.avro.AvroConverter.configure(AvroConverter.java:61)
at org.apache.kafka.connect.runtime.isolation.Plugins.newConverter(Plugins.java:293)
at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:440)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:1140)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1700(DistributedHerder.java:125)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:1155)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:1151)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2020-04-23 10:37:00,071] INFO [Worker clientId=connect-1, groupId=connect-cluster] Finished starting connectors and tasks (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1125)
All of the jars are in that plugin directory.
What can I do so that the class can be loaded? Or does this class not exist in my version of Confluent?
Thanks.
I finally solved this exception.
I am not running the full Confluent Platform; I only installed the community edition and enabled the Schema Registry component.
I then downloaded the Avro converter package from the Confluent website, put all of its jars into the Connect plugin directory, and Connect started successfully.
Confluent Avro jar address
I also executed the following statement so the jars can be found:
export CLASSPATH=/usr/local/tools/kafka/kafka_2.12-2.4.0/plugin/*
It looks like your kafka-connect-avro-converter is not compatible with the other Confluent jars. Your question also does not list the kafka-connect-avro-converter jar. Can you add the correct version of the kafka-connect-avro-converter jar to your classpath?
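One way to check which Avro and Schema Registry jars are actually visible under the plugin.path directories from the question (a sketch; adjust the paths if yours differ):

# List Avro / Schema Registry related jars under the configured plugin.path directories
find /usr/local/tools/confluent-5.4.1/share/java \
     /usr/local/tools/kafka/kafka_2.12-2.4.0/plugin \
     -name '*.jar' | grep -iE 'avro|schema-registry'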
I have a Kafka Streams application that works fine locally, but when I run it in Docker containers not all data is processed, and the logs are full of repeated errors like "Unable to records bytes produced to topic ... as the node is not recognized":
18:58:32.647 [kafka-producer-network-thread | my-app-events-processor.splitPackets-cf462b02-f1e3-4ed5-a1e7-acc1f040495b-StreamThread-1-producer] ERROR o.a.k.s.p.i.RecordCollectorImpl - stream-thread [my-app-events-processor.splitPackets-cf462b02-f1e3-4ed5-a1e7-acc1f040495b-StreamThread-1] task [0_0] Unable to records bytes produced to topic my-app.packet.surface by sink node split-server-log as the node is not recognized.
Known sink nodes are [].
18:58:49.216 [kafka-producer-network-thread | my-app-events-processor.splitPackets-cf462b02-f1e3-4ed5-a1e7-acc1f040495b-StreamThread-1-producer] ERROR o.a.k.s.p.i.RecordCollectorImpl - stream-thread [my-app-events-processor.splitPackets-cf462b02-f1e3-4ed5-a1e7-acc1f040495b-StreamThread-1] task [0_0] Unable to records bytes produced to topic my-app.packet.surface by sink node split-server-log as the node is not recognized.
Known sink nodes are [].
18:59:05.981 [kafka-producer-network-thread | my-app-events-processor.splitPackets-cf462b02-f1e3-4ed5-a1e7-acc1f040495b-StreamThread-1-producer] ERROR o.a.k.s.p.i.RecordCollectorImpl - stream-thread [my-app-events-processor.splitPackets-cf462b02-f1e3-4ed5-a1e7-acc1f040495b-StreamThread-1] task [0_0] Unable to records bytes produced to topic my-app.packet.surface by sink node split-server-log as the node is not recognized.
Known sink nodes are [].
19:00:28.484 [my-app-events-processor.splitPackets-cf462b02-f1e3-4ed5-a1e7-acc1f040495b-StreamThread-1] INFO o.a.k.s.p.internals.StreamThread - stream-thread [my-app-events-processor.splitPackets-cf462b02-f1e3-4ed5-a1e7-acc1f040495b-StreamThread-1] Processed 3 total records, ran 0 punctuators, and committed 3 total tasks since the last update
When I run the application, only some of the data is processed. Some KafkaStreams instances produce data, while others only seem to consume it. I expect each instance to consume JSON data and produce images (to be used in a Leaflet web map), but only some of the KafkaStreams instances actually do this.
I don't get this error when I run locally. What does it mean, and how can I fix it?
Application setup
I have a single application, events-processors, written in Kotlin using Kafka Streams. The application uses a Kafka Admin client to create the topics, then launches 4 separate KafkaStreams instances, each in its own Kotlin coroutine. events-processors runs in a Docker container.
The Kafka instance is using Kafka Kraft, and is running in another Docker container on the same Docker network.
I am using
Kafka 3.3.1
Kotlin 1.7.20
docker-compose version 1.29.2
Docker version 20.10.19
Debian GNU/Linux 11 (bullseye)
Kernel: Linux 5.10.0-18-amd64
Architecture: x86-64
Kafka config
Here is the config of one of the KafkaStreams instances:
18:38:25.138 [DefaultDispatcher-worker-5 #my-app-events-processor.splitPackets#5] INFO o.a.k.s.p.internals.StreamThread - stream-thread [my-app-events-processor.splitPackets-d7b897b3-3a10-48d6-95c7-e291cb1839d8-StreamThread-1] Creating restore consumer client
18:38:25.142 [DefaultDispatcher-worker-5 #my-app-events-processor.splitPackets#5] INFO o.a.k.c.consumer.ConsumerConfig - ConsumerConfig values:
allow.auto.create.topics = true
auto.commit.interval.ms = 5000
auto.offset.reset = none
bootstrap.servers = [http://kafka:29092]
check.crcs = true
client.dns.lookup = use_all_dns_ips
client.id = my-app-events-processor.splitPackets-d7b897b3-3a10-48d6-95c7-e291cb1839d8-StreamThread-1-restore-consumer
client.rack =
connections.max.idle.ms = 540000
default.api.timeout.ms = 60000
enable.auto.commit = false
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = null
group.instance.id = null
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = false
internal.throw.on.fetch.stable.offset.unsupported = true
isolation.level = read_committed
key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 300000
max.poll.records = 1000
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor, class org.apache.kafka.clients.consumer.CooperativeStickyAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.connect.timeout.ms = null
sasl.login.read.timeout.ms = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.login.retry.backoff.max.ms = 10000
sasl.login.retry.backoff.ms = 100
sasl.mechanism = GSSAPI
sasl.oauthbearer.clock.skew.seconds = 30
sasl.oauthbearer.expected.audience = null
sasl.oauthbearer.expected.issuer = null
sasl.oauthbearer.jwks.endpoint.refresh.ms = 3600000
sasl.oauthbearer.jwks.endpoint.retry.backoff.max.ms = 10000
sasl.oauthbearer.jwks.endpoint.retry.backoff.ms = 100
sasl.oauthbearer.jwks.endpoint.url = null
sasl.oauthbearer.scope.claim.name = scope
sasl.oauthbearer.sub.claim.name = sub
sasl.oauthbearer.token.endpoint.url = null
security.protocol = PLAINTEXT
security.providers = null
send.buffer.bytes = 131072
session.timeout.ms = 45000
socket.connection.setup.timeout.max.ms = 30000
socket.connection.setup.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
ssl.endpoint.identification.algorithm = https
ssl.engine.factory.class = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.certificate.chain = null
ssl.keystore.key = null
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLSv1.3
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.certificates = null
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
The server config is the Kafka Kraft config, https://github.com/apache/kafka/blob/215d4f93bd16efc8e9b7ccaa9fc99a1433a9bcfa/config/kraft/server.properties, although I have changed the advertised listeners.
advertised.listeners=PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
Docker config
The Docker config is defined in a docker-compose file.
version: "3.9"

services:
  events-processors:
    image: events-processors
    container_name: events-processors
    restart: unless-stopped
    environment:
      KAFKA_BOOTSTRAP_SERVERS: "http://kafka:29092"
    networks:
      - my-app-infra-nw
    depends_on:
      - infra-kafka
    secrets:
      - source: my-app_config
        target: /.secret.config.yml

  infra-kafka:
    image: kafka-kraft
    container_name: infra-kafka
    restart: unless-stopped
    networks:
      my-app-infra-nw:
        aliases: [ kafka ]
    volumes:
      - "./config/kafka-server.properties:/kafka/server.properties"
    ports:
      # note: other Docker containers should use 29092
      - "9092:9092"
      - "9093:9093"
I'm trying to get data from my Kafka topic into InfluxDB using the Confluent/Kafka stack. At the moment, the messages in the topic have the form {"tag1":"123","tag2":"456"} (I have fairly good control over the message format; I chose this JSON and could include a timestamp etc. if necessary).
Ideally, I would like to add many tags without needing to specify a schema/column names in the future.
I followed https://docs.confluent.io/kafka-connect-influxdb/current/influx-db-sink-connector/index.html (the "Schemaless JSON tags example"), as it matches my use case quite closely. The "key" of each message is currently just the MQTT topic name (the topic's source is an MQTT connector), so I set the key.converter to StringConverter (instead of JsonConverter as in the example).
Other examples I've seen online seem to suggest the need for a schema to be set, which I'd like to avoid. Using InfluxDB v1.8, everything on Docker/maintained on Portainer.
I cannot seem to start the connector and never get any data to move across.
Below is the config for my InfluxDBSink Connector:
{
  "name": "InfluxDBSinkKafka",
  "config": {
    "key.converter.schemas.enable": "false",
    "value.converter.schemas.enable": "false",
    "name": "InfluxDBSinkKafka",
    "connector.class": "io.confluent.influxdb.InfluxDBSinkConnector",
    "tasks.max": "1",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "topics": "KAFKATOPIC1",
    "influxdb.url": "http://URL:PORT",
    "influxdb.db": "tagdata",
    "measurement.name.format": "${topic}"
  }
}
The connector fails, and each time I click "start" (the play button) the following pops up in the connect container's logs:
[2022-03-22 15:46:52,562] INFO [Worker clientId=connect-1, groupId=compose-connect-group]
Connector InfluxDBSinkKafka target state change (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
[2022-03-22 15:46:52,562] INFO Setting connector InfluxDBSinkKafka state to STARTED (org.apache.kafka.connect.runtime.Worker)
[2022-03-22 15:46:52,562] INFO SinkConnectorConfig values:
config.action.reload = restart
connector.class = io.confluent.influxdb.InfluxDBSinkConnector
errors.deadletterqueue.context.headers.enable = false
errors.deadletterqueue.topic.name =
errors.deadletterqueue.topic.replication.factor = 3
errors.log.enable = false
errors.log.include.messages = false
errors.retry.delay.max.ms = 60000
errors.retry.timeout = 0
errors.tolerance = none
header.converter = null
key.converter = class org.apache.kafka.connect.storage.StringConverter
name = InfluxDBSinkKafka
predicates = []
tasks.max = 1
topics = [KAFKATOPIC1]
topics.regex =
transforms = []
value.converter = class org.apache.kafka.connect.json.JsonConverter
(org.apache.kafka.connect.runtime.SinkConnectorConfig)
[2022-03-22 15:46:52,563] INFO EnrichedConnectorConfig values:
config.action.reload = restart
connector.class = io.confluent.influxdb.InfluxDBSinkConnector
errors.deadletterqueue.context.headers.enable = false
errors.deadletterqueue.topic.name =
errors.deadletterqueue.topic.replication.factor = 3
errors.log.enable = false
errors.log.include.messages = false
errors.retry.delay.max.ms = 60000
errors.retry.timeout = 0
errors.tolerance = none
header.converter = null
key.converter = class org.apache.kafka.connect.storage.StringConverter
name = InfluxDBSinkKafka
predicates = []
tasks.max = 1
topics = [KAFKATOPIC1]
topics.regex =
transforms = []
value.converter = class org.apache.kafka.connect.json.JsonConverter
(org.apache.kafka.connect.runtime.ConnectorConfig$EnrichedConnectorConfig)
I am feeling a little out of my depth and would appreciate any and all help.
The trick here is getting the data into Kafka in the right format in the first place. My MQTT source connector needed its value converter set to ByteArray, with the schema url and schema = true. The InfluxDB sink then started working once I used the JsonConverter with schema = false. This is deceptive because the messages in the topic look the same regardless of which value converter the MQTT source connector uses, so it took a while to figure out that this was the problem.
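For what it's worth, a rough sketch of the converter settings described above (ByteArrayConverter is a standard Kafka Connect class; the exact schema-related settings on the source side depend on your setup and are not shown):

# MQTT source connector (sketch) -- pass the payload through as raw bytes
"value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",

# InfluxDB sink connector -- parse the payload as schemaless JSON
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "false",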
After getting this working, and realising the confluent stack was perhaps a little overkill for this task, I went with the (much) easier route of pushing MQTT directly to Telegraf and having Telegraf push into InfluxDB. I would recommend this.
I am trying to transfer a 9 GB file to HDFS using Flume from a spool directory. I have the following Flume configuration.
#initialize agent's source, channel and sink
wagent.sources = wavetronix
wagent.channels = memoryChannel2
wagent.sinks = flumeHDFS
# Setting the source to spool directory where the file exists
wagent.sources.wavetronix.type = spooldir
wagent.sources.wavetronix.spoolDir = /johir/WAVETRONIX/output/Yesterday
wagent.sources.wavetronix.fileHeader = false
wagent.sources.wavetronix.basenameHeader = true
#agent.sources.wavetronix.fileSuffix = .COMPLETED
# Setting the channel to memory
wagent.channels.memoryChannel2.type = memory
# Max number of events stored in the memory channel
wagent.channels.memoryChannel2.capacity = 50000
agent.channels.memoryChannel2.batchSize = 1000
wagent.channels.memoryChannel2.transactioncapacity = 1000
# Setting the sink to HDFS
wagent.sinks.flumeHDFS.type = hdfs
#agent.sinks.flumeHDFS.useLocalTimeStamp = true
wagent.sinks.flumeHDFS.hdfs.path =/user/root/WAVETRONIXFLUME/%Y-%m-%d/
wagent.sinks.flumeHDFS.hdfs.useLocalTimeStamp = true
wagent.sinks.flumeHDFS.hdfs.filePrefix= %{basename}
wagent.sinks.flumeHDFS.hdfs.fileType = DataStream
# Write format can be text or writable
wagent.sinks.flumeHDFS.hdfs.writeFormat = Text
# use a single csv file at a time
wagent.sinks.flumeHDFS.hdfs.maxOpenFiles = 1
wagent.sinks.flumeHDFS.hdfs.rollCount=0
wagent.sinks.flumeHDFS.hdfs.rollInterval=0
wagent.sinks.flumeHDFS.hdfs.rollSize = 6400000
wagent.sinks.flumeHDFS.hdfs.batchSize =1000
# never rollover based on the number of events
wagent.sinks.flumeHDFS.hdfs.rollCount = 0
# rollover file based on max time of 1 min
#agent.sinks.flumeHDFS.hdfs.rollInterval = 0
# agent.sinks.flumeHDFS.hdfs.idleTimeout = 600
# Connect source and sink with channel
wagent.sources.wavetronix.channels = memoryChannel2
wagent.sinks.flumeHDFS.channel = memoryChannel2
But I am getting the following exception.
Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor"
java.lang.OutOfMemoryError: Java heap space
at java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1043)
at java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:1535)
at java.lang.ClassLoader.getClassLoadingLock(ClassLoader.java:463)
at java.lang.ClassLoader.loadClass(ClassLoader.java:404)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.log4j.spi.LoggingEvent.(LoggingEvent.java:165)
at org.apache.log4j.Category.forcedLog(Category.java:391)
at org.apache.log4j.Category.log(Category.java:856)
at org.slf4j.impl.Log4jLoggerAdapter.warn(Log4jLoggerAdapter.java:479)
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:461)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
at java.lang.Thread.run(Thread.java:745)
Can anyone help me to solve this problem?
Please edit the file ${FLUME_HOME}/conf/flume-env.sh and add the following line:
export JAVA_OPTS="-Xms1000m -Xmx12000m -Dcom.sun.management.jmxremote"
You can adjust the Xmx and Xms values to suit your machine.
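After restarting the agent, you can confirm the new heap settings took effect by checking the running JVM's arguments, for example:

# jps ships with the JDK; -l shows the main class, -v shows JVM options such as -Xms/-Xmx
jps -lv | grep flume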
I have configured my Flume agent as shown below. Somehow the agent doesn't run properly; it just hangs without reporting any errors. Is there a problem with the configuration below?
FYI: I have a file named "country" with a hard-coded header called state.
#Define sources, sink and channels
foo.sources = s1
foo.channels = chn-az chn-oth
foo.sinks = sink-az sink-oth
#
### # # Define a source on agent and connect to channel memory-channel.
foo.sources.s1.type = exec
foo.sources.s1.command = cat /home/hadoop/flume/country.txt
foo.sources.s1.batchSize = 1
foo.sources.s1.channels = chn-ca chn-oth
#selector configuration
foo.sources.s1.selector.type = multiplexing
foo.sources.s1.selector.header = state
foo.sources.s1.selector.mapping.AZ = chn-az
foo.sources.s1.selector.default = chn-oth
#
#
### Define a memory channel on agent called memory-channel.
foo.channels.chn-az.type = memory
foo.channels.chn-oth.type = memory
#
#
##Define sinks that outputs to hdfs.
foo.sinks.sink-az.channel = chn-az
foo.sinks.sink-az.type = hdfs
foo.sinks.sink-az.hdfs.path = hdfs://master:9099/user/hadoop/flume
foo.sinks.sink-az.hdfs.filePrefix = statefilter
foo.sinks.sink-az.hdfs.fileType = DataStream
foo.sinks.sink-az.hdfs.writeFormat = Text
foo.sinks.sink-az.batchSize = 1
foo.sinks.sink-az.rollInterval = 0
#
foo.sinks.sink-oth.channel = chn-oth
foo.sinks.sink-oth.type = hdfs
foo.sinks.sink-oth.hdfs.path = hdfs://master:9099/user/hadoop/flume
foo.sinks.sink-oth.hdfs.filePrefix = statefilter
foo.sinks.sink-oth.hdfs.fileType = DataStream
foo.sinks.sink-oth.batchSize = 1
foo.sinks.sink-oth.rollInterval = 0
Thanks,
Vinoth
Regarding the channels list configured at the source:
foo.sources.s1.channels = chn-ca chn-oth
I think chn-ca should be chn-az.
Nevertheless, I think such a configuration will never work since the "state" header used by the selector is not created by any Flume component. You must introduce an interceptor for that, typically the Regex Extractor Interceptor.
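For illustration, a minimal sketch of such an interceptor, assuming the state code is the first comma-separated field of each line in country.txt (the regex is an assumption about your file format):

# Hypothetical regex extractor interceptor that populates the "state" header
foo.sources.s1.interceptors = i1
foo.sources.s1.interceptors.i1.type = regex_extractor
foo.sources.s1.interceptors.i1.regex = ^([A-Z]{2}),
foo.sources.s1.interceptors.i1.serializers = ser1
foo.sources.s1.interceptors.i1.serializers.ser1.name = state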
Kafka version is 0.7.2, Flume version is 1.5.0, and the Flume + Kafka plugin is https://github.com/baniuyao/flume-kafka.
error info:
2014-08-20 18:55:51,755 (conf-file-poller-0) [ERROR - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:149)] Unhandled error
java.lang.NoSuchMethodError: scala.math.LowPriorityOrderingImplicits.ordered()Lscala/math/Ordering;
at kafka.producer.ZKBrokerPartitionInfo$$anonfun$kafka$producer$ZKBrokerPartitionInfo$$getZKTopicPartitionInfo$1.apply(ZKBrokerPartitionInfo.scala:172)
flume configuration:
agent_log.sources = r1
agent_log.sinks = kafka
agent_log.channels = c1
agent_log.sources.r1.type = exec
agent_log.sources.r1.channels = c1
agent_log.sources.r1.command = tail -f /var/log/test.log
agent_log.channels.c1.type = memory
agent_log.channels.c1.capacity = 1000
agent_log.channels.c1.trasactionCapacity = 100
agent_log.sinks.kafka.type = com.vipshop.flume.sink.kafka.KafkaSink
agent_log.sinks.kafka.channel = c1
agent_log.sinks.kafka.zk.connect = XXXX:2181
agent_log.sinks.kafka.topic = my-replicated-topic
agent_log.sinks.kafka.batchsize = 200
agent_log.sinks.kafka.producer.type = async
agent_log.sinks.kafka.serializer.class = kafka.serializer.StringEncoder
What could be causing this error? Thanks.
scala.math.LowPriorityOrderingImplicits.ordered()
This method belongs to the Scala standard library, so the NoSuchMethodError suggests the scala-library jar on Flume's classpath is either missing or a different version than the one Kafka 0.7.2 was compiled against. Make sure the matching scala-library jar is in your Flume lib directory.
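For example, something along these lines (the paths are assumptions; use the scala-library jar that matches the Scala version your Kafka 0.7.2 build ships with):

# Hypothetical paths -- copy the matching Scala library into Flume's lib directory
cp /path/to/kafka-0.7.2/lib/scala-library-*.jar ${FLUME_HOME}/lib/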