Install logstash using docker
Want to be : logstash consumes data of kafka and stdout to console
Spec
Kafka 0.8.2.1
Logstash 2.3.2
Elasticsearch 2.3.3
Docker
Logstash Code
input {
kafka {
zk_connect => 'remoteZookeeperServer:2181'
topic_id => 'testTopic'
}
}
output {
stdout { codec => rubydebug }
}
Docker logs
{:timestamp=>"2017-03-31T09:13:01.120000+0000", :message=>"Pipeline main started"}
// finished
Docker logs (execute logstash with debug mode)
{:timestamp=>"2017-03-31T08:23:41.987000+0000", :message=>"Reading config file", :config_file=>"/config-dir/logstash-wcs.conf", :level=>:debug, :file=>"logstash/config/loader.rb", :line=>"69", :method=>"local_config"}
{:timestamp=>"2017-03-31T08:23:42.022000+0000", :message=>"Plugin not defined in namespace, checking for plugin file", :type=>"input", :name=>"kafka", :path=>"logstash/inputs/kafka", :level=>:debug, :file=>"logstash/plugin.rb", :line=>"76", :method=>"lookup"}
{:timestamp=>"2017-03-31T08:23:42.363000+0000", :message=>"Plugin not defined in namespace, checking for plugin file", :type=>"codec", :name=>"json", :path=>"logstash/codecs/json", :level=>:debug, :file=>"logstash/plugin.rb", :line=>"76", :method=>"lookup"}
{:timestamp=>"2017-03-31T08:23:42.370000+0000", :message=>"config LogStash::Codecs::JSON/#charset = \"UTF-8\"", :level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"153", :method=>"config_init"}
...
// cannot find warn or error
...
{:timestamp=>"2017-03-31T08:23:42.549000+0000", :message=>"Will start workers for output", :worker_count=>1, :class=>LogStash::Outputs::Stdout, :level=>:debug, :file=>"logstash/output_delegator.rb", :line=>"77", :method=>"register"}
{:timestamp=>"2017-03-31T08:23:42.553000+0000", :message=>"Starting pipeline", :id=>"main", :pipeline_workers=>4, :batch_size=>125, :batch_delay=>5, :max_inflight=>500, :level=>:info, :file=>"logstash/pipeline.rb", :line=>"188", :method=>"start_workers"}
{:timestamp=>"2017-03-31T08:23:42.561000+0000", :message=>"Pipeline main started", :file=>"logstash/agent.rb", :line=>"465", :method=>"start_pipeline"}
{:timestamp=>"2017-03-31T08:23:47.565000+0000", :message=>"Pushing flush onto pipeline", :level=>:debug, :file=>"logstash/pipeline.rb", :line=>"458", :method=>"flush"}
{:timestamp=>"2017-03-31T08:23:52.566000+0000", :message=>"Pushing flush onto pipeline", :level=>:debug, :file=>"logstash/pipeline.rb", :line=>"458", :method=>"flush"}
{:timestamp=>"2017-03-31T08:23:57.568000+0000", :message=>"Pushing flush onto pipeline", :level=>:debug, :file=>"logstash/pipeline.rb", :line=>"458", :method=>"flush"}
{:timestamp=>"2017-03-31T08:24:02.570000+0000", :message=>"Pushing flush onto pipeline", :level=>:debug, :file=>"logstash/pipeline.rb", :line=>"458", :method=>"flush"}
{:timestamp=>"2017-03-31T08:24:07.573000+0000", :message=>"Pushing flush onto pipeline", :level=>:debug, :file=>"logstash/pipeline.rb", :line=>"458", :method=>"flush"}
...
Notice
When I'd tested using kafka2.9.1-0.8.2.1 console consumer with same zookeeper server, the consumer consumed data.
Ask
I think logstash cannot connect to zookeeper server and consume data of kafka.
What's the problem?
Why does logstash make the logs about connection with zookeeper server?
Related
I have installed apache Helix 1.0.0 version. I am able to setup a cluster and add resources.
But when i try to start run-helix-controller.sh it gives below error.
Here is command : ./run-helix-controller.sh --zkSvr localhost:2181 --cluster jbpm-cluster
ERROR
[2020-05-20 06:22:29,773] [INFO ] [main] [org.apache.helix.controller.HelixControllerMain:208] - Cluster manager started, zkServer: lpwaidqu02:2181, clusterName:jbpm-cluster, controllerName:null, mode:STANDALONE
Exception in thread "main" java.lang.NoSuchFieldError: Rebalancer
at org.apache.helix.InstanceType.(InstanceType.java:39)
at org.apache.helix.controller.HelixControllerMain.startHelixController(HelixControllerMain.java:156)
at org.apache.helix.controller.HelixControllerMain.main(HelixControllerMain.java:212)
Have you tried the steps in this guide?
http://helix.apache.org/0.9.7-docs/Quickstart.html
Seems like there are a few steps (like "add cluster") before ./run-helix-controller.sh command.
I am running locally Kafka using the confluentinc/cp-kafka Docker image and I am setting the following logging container environment variables:
KAFKA_LOG4J_ROOT_LOGLEVEL: ERROR
KAFKA_LOG4J_LOGGERS: >-
org.apache.zookeeper=ERROR,
org.apache.kafka=ERROR,
kafka=ERROR,
kafka.cluster=ERROR,
kafka.controller=ERROR,
kafka.coordinator=ERROR,
kafka.log=ERROR,
kafka.server=ERROR,
kafka.zookeeper=ERROR,
state.change.logger=ERROR
and I see in the Kafka logs that Kafka is starting with the following configuration:
===> ENV Variables ...
ALLOW_UNSIGNED=false
COMPONENT=kafka
CONFLUENT_DEB_VERSION=1
CONFLUENT_PLATFORM_LABEL=
CONFLUENT_VERSION=5.4.1
...
KAFKA_LOG4J_LOGGERS=org.apache.zookeeper=ERROR, org.apache.kafka=ERROR, kafka=ERROR, kafka.cluster=ERROR, kafka.controller=ERROR, kafka.coordinator=ERROR, kafka.log=ERROR, kafka.server=ERROR, kafka.zookeeper=ERROR, state.change.logger=ERROR
KAFKA_LOG4J_ROOT_LOGLEVEL=ERROR
...
Still I see further down in the logs the INFO and TRACE log levels. For example:
[2020-03-26 16:22:12,838] INFO [Controller id=1001] Ready to serve as the new controller with epoch 1 (kafka.controller.KafkaController)
[2020-03-26 16:22:12,848] INFO [Controller id=1001] Partitions undergoing preferred replica election: (kafka.controller.KafkaController)
[2020-03-26 16:22:12,849] INFO [Controller id=1001] Partitions that completed preferred replica election: (kafka.controller.KafkaController)
[2020-03-26 16:22:12,855] INFO [Controller id=1001] Skipping preferred replica election for partitions due to topic deletion: (kafka.controller.KafkaController)
How can I really deactivate the logs below a certain level? In the example above, I really want only ERROR logs.
The approach above is the way described in the Confluent documentation.
And the Apache Kafka source code lists all sorts of loggers that I could not influence using the KAFKA_LOG4J_LOGGERS Docker environment variable.
I went and troubleshot the Dockerfile's and inspected the Kafka container. The cause of this behaviour was the YAML multiline string folding.
Hence the provided environment variable (using a YAML multiline value) is at runtime:
KAFKA_LOG4J_LOGGERS=org.apache.zookeeper=ERROR, org.apache.kafka=ERROR, kafka=ERROR, kafka.cluster=ERROR, kafka.controller=ERROR, kafka.coordinator=ERROR, kafka.log=ERROR, kafka.server=ERROR, kafka.zookeeper=ERROR, state.change.logger=ERROR
instead of (no spaces in between):
KAFKA_LOG4J_LOGGERS=org.apache.zookeeper=ERROR,org.apache.kafka=ERROR, kafka=ERROR, kafka.cluster=ERROR,kafka.controller=ERROR, kafka.coordinator=ERROR,kafka.log=ERROR,kafka.server=ERROR,kafka.zookeeper=ERROR,state.change.logger=ERROR
And this was visible inside the container in the generated /etc/kafka/log4j.properties file:
log4j.rootLogger=ERROR, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
log4j.logger.kafka.authorizer.logger=WARN
log4j.logger.kafka.cluster=ERROR
log4j.logger.kafka.producer.async.DefaultEventHandler=DEBUG
log4j.logger.kafka.zookeeper=ERROR
log4j.logger.org.apache.kafka=ERROR
log4j.logger.kafka.coordinator=ERROR
log4j.logger.org.apache.zookeeper=ERROR
log4j.logger.kafka.log.LogCleaner=INFO
log4j.logger.kafka.controller=ERROR
log4j.logger.kafka=INFO
log4j.logger.kafka.log=ERROR
log4j.logger.state.change.logger=ERROR
log4j.logger.kafka=ERROR
log4j.logger.kafka.server=ERROR
log4j.logger.kafka.controller=TRACE
log4j.logger.kafka.network.RequestChannel$=WARN
log4j.logger.kafka.request.logger=WARN
log4j.logger.state.change.logger=TRACE
If you really need to split the long line in a YAML multiline value, you would have to use this YAML syntax.
More hints from the code:
here is where the log4j.properties file is generated when a confluent container is run.
these are the default log levels that Kafka will start with.
these should be all the loggers supported by Kafka
I am trying to run a spark job with PySpark through Jupyter notebook running in Docker. Workers are located on separate machines in the same network. I am performing a take operation on RDD:
data.take(number_of_elements)
When the number_of_elements is 2000 everything works fine. When it is 20000 an exception occurs. From my point of view it breaks when the size of the result exceeds 2GB (or it seems for me so). The idea about 2GB comes from that spark can send results smaller than 2GB in one block and when the result is bigger than 2GB another mechanism starts to work and something breaks there (see here). Here is the exception from executor log:
19/11/05 10:27:14 INFO CodeGenerator: Code generated in 205.7623 ms
19/11/05 10:27:40 INFO PythonRunner: Times: total = 25421, boot = 3, init = 1751, finish = 23667
19/11/05 10:27:42 INFO MemoryStore: Block taskresult_4 stored as bytes in memory (estimated size 927.7 MB, free 6.4 GB)
19/11/05 10:27:42 INFO Executor: Finished task 0.0 in stage 3.0 (TID 4). 972788748 bytes result sent via BlockManager)
19/11/05 10:27:49 ERROR TransportRequestHandler: Error sending result ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=1585998572000, chunkIndex=0}, buffer=org.apache.spark.storage.BlockManagerManagedBuffer#4399ad49} to /10.0.0.9:56222; closing connection
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at org.apache.spark.util.io.ChunkedByteBufferFileRegion.transferTo(ChunkedByteBufferFileRegion.scala:64)
at org.apache.spark.network.protocol.MessageWithHeader.transferTo(MessageWithHeader.java:121)
at io.netty.channel.socket.nio.NioSocketChannel.doWriteFileRegion(NioSocketChannel.java:355)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:224)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:382)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:934)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:362)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:901)
at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1321)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
at io.netty.channel.DefaultChannelPipeline.flush(DefaultChannelPipeline.java:983)
at io.netty.channel.AbstractChannel.flush(AbstractChannel.java:248)
at io.netty.channel.nio.AbstractNioByteChannel$1.run(AbstractNioByteChannel.java:284)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
As we can see from the log executor tries to send result to 10.0.0.9:56222. It fails because the port is not opened in docker compose. 10.0.0.9 is an IP address of a master node but port 56222 is random though I explicitly set up all ports I can find in documentation to disable random port selection:
spark = SparkSession.builder\
.master('spark://spark.cyber.com:7077')\
.appName('My App')\
.config('spark.task.maxFailures', '16')\
.config('spark.driver.port', '20002')\
.config('spark.driver.host', 'spark.cyber.com')\
.config('spark.driver.bindAddress', '0.0.0.0')\
.config('spark.blockManager.port', '6060')\
.config('spark.driver.blockManager.port', '6060')\
.config('spark.shuffle.service.port', '7070')\
.config('spark.driver.maxResultSize', '14g')\
.getOrCreate()
I mapped these ports with docker compose:
version: "3"
services:
jupyter:
image: jupyter/pyspark-notebook:latest
ports:
- "4040-4050:4040-4050"
- "6060:6060"
- "7070:7070"
- "8888:8888"
- "20000-20010:20000-20010"
You should probably configure you spark driver memory to follow your docker container memory settings
I added
.config('spark.driver.memory', '14g')
as #ML_TN proposed and everything works now.
From my point of view it is strange that the memory setting affects the ports that spark uses.
I have been trying to use use neo4j community in a container and am getting errors. I think this might more a docker usage issues rather than neo4j usage.
I have built a container image from https://github.com/neo4j/docker-neo4j-publish 2.3.9, 3.3.3, 3.3.4 and 3.3.5 (only differences being some new ports in later versions). I have even pulled a native 3.3.3 from dockerhub.com
mkdir /tmp/data
chmod 777 /tmp/data
docker run --detach=true --name=neo4j --publish=7474:7474 --publish=7687:7687 --publish=7473:7473 --volume=/tmp/data:/data neo4j:3.3.3
docker exec -it neo4j find / -name '*.log'
and although it seems to be working with
neo4j> CREATE (n);
0 rows available after 50 ms, consumed after another 0 ms
Added 1 nodes
neo4j> CREATE (m),(o);
0 rows available after 15 ms, consumed after another 0 ms
Added 2 nodes
neo4j> MATCH (n) RETURN n;
+----+
| n |
+----+
| () |
| () |
| () |
+----+
3 rows available after 21 ms, consumed after another 8 ms
I actually get errors like this:
docker exec -it neo4j neo4j status
Neo4j is not running
Now this one looks like I am mistakenly trying to start another instance of Neo4j over a running instance:
docker exec -it neo4j neo4j console
Active database: graph.db
Directories in use:
home: /var/lib/neo4j
config: /var/lib/neo4j/conf
logs: /var/lib/neo4j/logs
plugins: /var/lib/neo4j/plugins
import: /var/lib/neo4j/import
data: /var/lib/neo4j/data
certificates: /var/lib/neo4j/certificates
run: /var/lib/neo4j/run
Starting Neo4j.
2018-04-15 06:30:13.119+0000 WARN Unknown config option: causal_clustering.discovery_listen_address
2018-04-15 06:30:13.123+0000 WARN Unknown config option: causal_clustering.raft_advertised_address
2018-04-15 06:30:13.123+0000 WARN Unknown config option: causal_clustering.raft_listen_address
2018-04-15 06:30:13.123+0000 WARN Unknown config option: ha.host.coordination
2018-04-15 06:30:13.124+0000 WARN Unknown config option: causal_clustering.transaction_advertised_address
2018-04-15 06:30:13.124+0000 WARN Unknown config option: causal_clustering.discovery_advertised_address
2018-04-15 06:30:13.124+0000 WARN Unknown config option: ha.host.data
2018-04-15 06:30:13.124+0000 WARN Unknown config option: causal_clustering.transaction_listen_address
2018-04-15 06:30:13.146+0000 INFO ======== Neo4j 3.3.3 ========
2018-04-15 06:30:13.186+0000 INFO Starting...
2018-04-15 06:30:13.997+0000 INFO Bolt enabled on 0.0.0.0:7687.
2018-04-15 06:30:14.094+0000 ERROR Failed to start Neo4j: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase#44a59da3' was successfully initialized, but failed to start. Please see the attached cause exception "Store and its lock file has been locked by another process: /var/lib/neo4j/data/databases/graph.db/store_lock. Please ensure no other process is using this database, and that the directory is writable (required even for read-only access)". Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase#44a59da3' was successfully initialized, but failed to start. Please see the attached cause exception "Store and its lock file has been locked by another process: /var/lib/neo4j/data/databases/graph.db/store_lock. Please ensure no other process is using this database, and that the directory is writable (required even for read-only access)".
org.neo4j.server.ServerStartupException: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase#44a59da3' was successfully initialized, but failed to start. Please see the attached cause exception "Store and its lock file has been locked by another process: /var/lib/neo4j/data/databases/graph.db/store_lock. Please ensure no other process is using this database, and that the directory is writable (required even for read-only access)".
Does anybody have experience with Neo4j's docker implementation? Is it a single threaded issue meaning I need to call the CLI tools differently from the container?
The neo4j status command only works if you've started neo4j with neo4j start. Start creates a neo4j.pid file that status uses to see if neo4j is running. Starting under docker uses the console option instead of the start option. This does not create the PID file, so the status doesn't work. But that hardly matters, because neo4j is just about the only process running; if neo4j dies, the container will exit. If docker ps -a says that the container is up, then neo4j is up.
I'm trying to use graylog2 to collect logs from docker containers. Docs says that only UDP GELF input is supported for this purpose.
I'm using docker-compose to run the graylog server. See gist for all files used: https://gist.github.com/olegabr/7f5190c453bb63c71dabf151d2373c2f.
And I'm using this command to test it:
sendip -p ipv4 -is 127.0.0.1 -p udp -us 5070 -ud 12201 -d '{"version": "1.1","host":"example.org","short_message":"Short message","full_message":"Backtrace here\n\nmore stuff","level":1,"_user_id":9001,"_some_info":"foo","_some_env_var":"bar"}' -v 127.0.0.1
Server receives this message, but it can not process it. I see following in the graylog2 logs:
2016-12-09 11:53:20,125 WARN : org.graylog2.bindings.providers.DefaultStreamProvider - Unable to load default stream, tried 1 times, retrying every 500ms. Processing is blocked until this succeeds.
2016-12-09 11:53:25,129 WARN : org.graylog2.bindings.providers.DefaultStreamProvider - Unable to load default stream, tried 11 times, retrying every 500ms. Processing is blocked until this succeeds.
e.t.c. many many similar lines.
The API call curl http://admin:123456#127.0.0.1:9000/api/count/total returns
{"events":0}
In the server logs I see that the default stream was initialized:
mongo_1 | 2016-12-09T11:51:12.522+0000 I INDEX [conn3] build index on: graylog.pipeline_processor_pipelines_streams properties: { v: 2, unique: true, key: { stream_id: 1 }, name: "stream_id_1", ns: "graylog.pipeline_processor_pipelines_streams" }
graylog_1 | 2016-12-09 11:51:13,408 INFO : org.graylog2.periodical.Periodicals - Starting [org.graylog.plugins.pipelineprocessor.periodical.LegacyDefaultStreamMigration] periodical, running forever.
graylog_1 | 2016-12-09 11:51:13,424 INFO : org.graylog.plugins.pipelineprocessor.periodical.LegacyDefaultStreamMigration - Legacy default stream has no connections, no migration needed.
graylog_1 | 2016-12-09 11:51:13,487 INFO : org.graylog2.migrations.V20160929120500_CreateDefaultStreamMigration - Successfully created default stream: All messages
graylog_1 | 2016-12-09 11:51:13,653 INFO : org.graylog2.migrations.V20161125142400_EmailAlarmCallbackMigration - No streams needed to be migrated.
graylog_1 | 2016-12-09 11:51:13,662 INFO : org.graylog2.migrations.V20161125161400_AlertReceiversMigration - No streams needed to be migrated.
graylog_1 | 2016-12-09 11:51:13,672 INFO : org.graylog2.migrations.V20161130141500_DefaultStreamRecalcIndexRanges - Cluster not connected yet, delaying migration until it is reachable.
So, why it can not be loaded when the message arrives? Why it is needed in the first place?
I've tried to find similar reports in web but with no success.
This has nothing to do with the UDP input per se.
Graylog 2.2.0-beta.1 is broken and shouldn't be used. Please downgrade to Graylog 2.1.2 (the latest stable version) or wait for Graylog 2.2.0-beta.2.
See https://groups.google.com/forum/#!searchin/graylog2/docker|sort:date/graylog2/gCycC3_K3vU/EL-Lz_uNDQAJ for a related post on the Graylog mailing list.
same trouble
just setup graylog and configure input gelf udp 12209 port
then test it twice by:
docker run --log-driver=gelf --log-opt gelf-address=udp://127.0.0.1:12209 busybox echo Hello Graylog
in UI i saw:
2 messages in process buffe
2 unprocessed messages are currently in the journal, in 1 segments.
0 messages have been appended in the last second, 0 messages have been read in the last second.
and still getting:
2016-12-09 12:41:23,715 INFO : org.graylog2.inputs.InputStateListener - Input [GELF UDP/584aa67308813b00010d009e] is now RUNNING
2016-12-09 12:41:43,666 WARN : org.graylog2.bindings.providers.DefaultStreamProvider - Unable to load default stream, tried 1 times, retrying every 500ms. Processing is blocked until this succeeds.
anyone have found solution ?