Flume: Command [tail -F] exited with 1

I am trying to run the following command on Linux:
bin/flume-ng agent -n a1 -c conf -f conf/flume-tail.properties -Dflume.root.logger=INFO,console
However, processing stops at:
2016-06-26 09:03:44,610 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:138)] Starting new configuration:{ sourceRunners:{r1=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource{name:r1,state:IDLE} }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor#3244eabe counterGroup:{ name:null counters:{} } }} channels:{c1=org.apache.flume.channel.MemoryChannel{name: c1}} }
2016-06-26 09:03:44,676 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:145)] Starting Channel c1
2016-06-26 09:03:44,744 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: CHANNEL, name: c1: Successfully registered new MBean.
2016-06-26 09:03:44,746 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: CHANNEL, name: c1 started
2016-06-26 09:03:44,747 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:173)] Starting Sink k1
2016-06-26 09:03:44,747 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:184)] Starting Source r1
2016-06-26 09:03:44,748 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.source.ExecSource.start(ExecSource.java:169)] Exec source starting with command:tail -F
2016-06-26 09:03:44,766 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: SOURCE, name: r1: Successfully registered new MBean.
2016-06-26 09:03:44,766 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: SOURCE, name: r1 started
2016-06-26 09:03:44,785 (pool-3-thread-1) [INFO - org.apache.flume.source.ExecSource$ExecRunnable.run(ExecSource.java:376)] Command [tail -F] exited with 1
Could anyone help me address this issue?

Check whether the user running the Flume agent has permission to run tail -F on your target file or directory.
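Note also that the startup log above shows the exec command as just tail -F, with no file argument, so it is worth double-checking the command line in conf/flume-tail.properties. A minimal exec-source stanza with the file spelled out might look like this (reusing the a1/r1/c1 names from the command and log above; the log path is a placeholder, not taken from the asker's setup):

```properties
a1.sources = r1
a1.channels = c1
# The exec source runs the given command and turns each output line into an event
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/app/app.log
a1.sources.r1.channels = c1
a1.channels.c1.type = memory
```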

Related

Why starting Artifactory fails

I want to install Artifactory on an Ubuntu Docker image manually, without using the Artifactory image from Docker Hub.
What I have done so far:
Get an Ubuntu image with JDK 11 installed.
Install Artifactory using apt-get.
But when starting the Artifactory service with service artifactory start, I get the following logs with errors:
root@f01a31f43dc0:/# service artifactory start
2021-12-15T23:57:37.545Z [shell] [INFO ] [] [artifactory:81 ] [main] - Starting Artifactory tomcat as user artifactory...
2021-12-15T23:57:37.590Z [shell] [INFO ] [] [installerCommon.sh:1519 ] [main] - Checking open files and processes limits
2021-12-15T23:57:37.637Z [shell] [INFO ] [] [installerCommon.sh:1522 ] [main] - Current max open files is 1048576
2021-12-15T23:57:37.694Z [shell] [INFO ] [] [installerCommon.sh:1533 ] [main] - Current max open processes is unlimited
.shared.security value is of wrong data type. Correct type should be !!map
.shared.node value is of wrong data type. Correct type should be !!map
.shared.database value is of wrong data type. Correct type should be !!map
yaml validation failed
2021-12-15T23:57:37.798Z [shell] [WARN ] [] [installerCommon.sh:721 ] [main] - System.yaml validation failed
Database connection check failed Could not determine database type
2021-12-15T23:57:38.172Z [shell] [INFO ] [] [installerCommon.sh:3381 ] [main] - Setting JF_SHARED_NODE_ID to f01a31f43dc0
2021-12-15T23:57:38.424Z [shell] [INFO ] [] [installerCommon.sh:3381 ] [main] - Setting JF_SHARED_NODE_IP to 172.17.0.2
2021-12-15T23:57:38.652Z [shell] [INFO ] [] [installerCommon.sh:3381 ] [main] - Setting JF_SHARED_NODE_NAME to f01a31f43dc0
2021-12-15T23:57:39.348Z [shell] [INFO ] [] [artifactoryCommon.sh:186 ] [main] - Using Tomcat template to generate : /opt/jfrog/artifactory/app/artifactory/tomcat/conf/server.xml
2021-12-15T23:57:39.711Z [shell] [INFO ] [] [artifactoryCommon.sh:1008 ] [main] - Resolved ${artifactory.port||8081} to default value : 8081
2021-12-15T23:57:39.959Z [shell] [INFO ] [] [artifactoryCommon.sh:1008 ] [main] - Resolved ${artifactory.tomcat.connector.sendReasonPhrase||false} to default value : false
2021-12-15T23:57:40.244Z [shell] [INFO ] [] [artifactoryCommon.sh:1008 ] [main] - Resolved ${artifactory.tomcat.connector.maxThreads||200} to default value : 200
2021-12-15T23:57:40.705Z [shell] [INFO ] [] [artifactoryCommon.sh:1008 ] [main] - Resolved ${artifactory.tomcat.maintenanceConnector.port||8091} to default value : 8091
2021-12-15T23:57:40.997Z [shell] [INFO ] [] [artifactoryCommon.sh:1008 ] [main] - Resolved ${artifactory.tomcat.maintenanceConnector.maxThreads||5} to default value : 5
2021-12-15T23:57:41.278Z [shell] [INFO ] [] [artifactoryCommon.sh:1008 ] [main] - Resolved ${artifactory.tomcat.maintenanceConnector.acceptCount||5} to default value : 5
2021-12-15T23:57:41.751Z [shell] [INFO ] [] [artifactoryCommon.sh:1008 ] [main] - Resolved ${access.http.port||8040} to default value : 8040
2021-12-15T23:57:42.041Z [shell] [INFO ] [] [artifactoryCommon.sh:1008 ] [main] - Resolved ${access.tomcat.connector.sendReasonPhrase||false} to default value : false
2021-12-15T23:57:42.341Z [shell] [INFO ] [] [artifactoryCommon.sh:1008 ] [main] - Resolved ${access.tomcat.connector.maxThreads||50} to default value : 50
2021-12-15T23:57:42.906Z [shell] [INFO ] [] [systemYamlHelper.sh:527 ] [main] - Resolved JF_PRODUCT_HOME (/opt/jfrog/artifactory) from environment variable
2021-12-15T23:57:43.320Z [shell] [INFO ] [] [artifactoryCommon.sh:1008 ] [main] - Resolved ${shared.tomcat.workDir||/opt/jfrog/artifactory/var/work/artifactory/tomcat} to default value : /opt/jfrog/artifactory/var/work/artifactory/tomcat
========================
JF Environment variables
========================
JF_SHARED_NODE_ID : f01a31f43dc0
JF_SHARED_NODE_IP : 172.17.0.2
JF_ARTIFACTORY_PID : /var/run/artifactory.pid
JF_SYSTEM_YAML : /opt/jfrog/artifactory/var/etc/system.yaml
JF_PRODUCT_HOME : /opt/jfrog/artifactory
JF_ROUTER_TOPOLOGY_LOCAL_REQUIREDSERVICETYPES : jfrt,jfac,jfmd,jffe,jfob
JF_SHARED_NODE_NAME : f01a31f43dc0
2021-12-15T23:57:45.827Z [shell] [ERROR] [] [installerCommon.sh:3267 ] [main] - ##############################################################################
2021-12-15T23:57:45.890Z [shell] [ERROR] [] [installerCommon.sh:3268 ] [main] - Ownership mismatch. You can try executing following instruction and do a restart
2021-12-15T23:57:45.959Z [shell] [ERROR] [] [installerCommon.sh:3269 ] [main] - Command : chown -R artifactory:artifactory /opt/jfrog/artifactory/var/log
2021-12-15T23:57:46.029Z [shell] [ERROR] [] [installerCommon.sh:3270 ] [main] - ##############################################################################
I'm not sure what I'm missing in this installation process.
The error is clear: there is a permission issue on the /opt/jfrog/artifactory/var/log folder. Run chown -R artifactory:artifactory /opt/jfrog/artifactory/var/log and restart the service to solve it.
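As a copy-paste convenience, the fix suggested by the error message itself (fix ownership, then restart; paths taken from the log output above, to be run as root inside the container):

```shell
# Give the artifactory user ownership of its log directory
chown -R artifactory:artifactory /opt/jfrog/artifactory/var/log
# Restart the service so the ownership check passes
service artifactory restart
```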

Filebeat: Failed to start crawler: starting input failed: Error while initializing input: Can only start an input when all related states are finished

I have a job that periodically starts several Docker containers, and for each container I also start a Filebeat Docker container to gather the logs and save them in Elasticsearch.
Filebeat version 7.9 is used.
The Docker containers are started from a Java application using the Spotify docker-client and terminated when the job finishes.
The Filebeat configuration is the following; it monitors only a specific Docker container:
filebeat.inputs:
- paths: ${logs_paths}
  include_lines: ['^{']
  json.message_key: log
  json.keys_under_root: true
  json.overwrite_keys: true
  json.add_error_key: true
  type: log
  scan_frequency: 10s
  ignore_older: 15m
- paths: ${logs_paths}
  exclude_lines: ['^{']
  json.message_key: log
  type: log
  json.keys_under_root: true
  json.overwrite_keys: true
  json.add_error_key: true
  scan_frequency: 10s
  ignore_older: 15m
  max_bytes: 20000000

processors:
- decode_json_fields:
    fields: ["log"]
    target: ""

output.elasticsearch:
  hosts: ${elastic_host}
  username: "something"
  password: "else"

logs_paths:
- /var/lib/docker/containers/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log
From time to time we observe that one Filebeat container crashes immediately after starting, with the following error. Although the job runs the same Docker images each time, the error may appear for any of them:
2020-12-09T16:00:15.784Z INFO instance/beat.go:640 Home path: [/usr/share/filebeat] Config path: [/usr/share/filebeat] Data path: [/usr/share/filebeat/data] Logs path: [/usr/share/filebeat/logs]
2020-12-09T16:00:15.864Z INFO instance/beat.go:648 Beat ID: 03ef7f54-2768-4d93-b7ca-c449e94b239c
2020-12-09T16:00:15.868Z INFO [seccomp] seccomp/seccomp.go:124 Syscall filter successfully installed
2020-12-09T16:00:15.868Z INFO [beat] instance/beat.go:976 Beat info {"system_info": {"beat": {"path": {"config": "/usr/share/filebeat", "data": "/usr/share/filebeat/data", "home": "/usr/share/filebeat", "logs": "/usr/share/filebeat/logs"}, "type": "filebeat", "uuid": "03ef7f54-2768-4d93-b7ca-c449e94b239c"}}}
2020-12-09T16:00:15.869Z INFO [beat] instance/beat.go:985 Build info {"system_info": {"build": {"commit": "b2ee705fc4a59c023136c046803b56bc82a16c8d", "libbeat": "7.9.0", "time": "2020-08-11T20:11:11.000Z", "version": "7.9.0"}}}
2020-12-09T16:00:15.869Z INFO [beat] instance/beat.go:988 Go runtime info {"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":4,"version":"go1.14.4"}}}
2020-12-09T16:00:15.871Z INFO [beat] instance/beat.go:992 Host info {"system_info": {"host": {"architecture":"x86_64","boot_time":"2020-10-28T10:03:29Z","containerized":true,"name":"638de114b513","ip":["someIP"],"kernel_version":"4.4.0-190-generic","mac":["someMAC"],"os":{"family":"redhat","platform":"centos","name":"CentOS Linux","version":"7 (Core)","major":7,"minor":8,"patch":2003,"codename":"Core"},"timezone":"UTC","timezone_offset_sec":0}}}
2020-12-09T16:00:15.876Z INFO [beat] instance/beat.go:1021 Process info {"system_info": {"process": {"capabilities": {"inheritable":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"permitted":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"effective":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"bounding":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"ambient":null}, "cwd": "/usr/share/filebeat", "exe": "/usr/share/filebeat/filebeat", "name": "filebeat", "pid": 1, "ppid": 0, "seccomp": {"mode":"filter"}, "start_time": "2020-12-09T16:00:14.670Z"}}}
2020-12-09T16:00:15.876Z INFO instance/beat.go:299 Setup Beat: filebeat; Version: 7.9.0
2020-12-09T16:00:15.876Z INFO [index-management] idxmgmt/std.go:184 Set output.elasticsearch.index to 'someIndex' as ILM is enabled.
2020-12-09T16:00:15.877Z INFO eslegclient/connection.go:99 elasticsearch url: someURL
2020-12-09T16:00:15.878Z INFO [publisher] pipeline/module.go:113 Beat name: 638de114b513
2020-12-09T16:00:15.885Z INFO [monitoring] log/log.go:118 Starting metrics logging every 30s
2020-12-09T16:00:15.886Z INFO instance/beat.go:450 filebeat start running.
2020-12-09T16:00:15.893Z INFO memlog/store.go:119 Loading data file of '/usr/share/filebeat/data/registry/filebeat' succeeded. Active transaction id=0
2020-12-09T16:00:15.893Z INFO memlog/store.go:124 Finished loading transaction log file for '/usr/share/filebeat/data/registry/filebeat'. Active transaction id=0
2020-12-09T16:00:15.893Z INFO [registrar] registrar/registrar.go:108 States Loaded from registrar: 0
2020-12-09T16:00:15.893Z INFO [crawler] beater/crawler.go:71 Loading Inputs: 2
2020-12-09T16:00:15.894Z INFO log/input.go:157 Configured paths: [/var/lib/docker/containers/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log]
2020-12-09T16:00:15.895Z INFO [crawler] beater/crawler.go:141 Starting input (ID: 3906827571448963007)
2020-12-09T16:00:15.895Z INFO log/harvester.go:297 Harvester started for file: /var/lib/docker/containers/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log
2020-12-09T16:00:15.902Z INFO beater/crawler.go:148 Stopping Crawler
2020-12-09T16:00:15.902Z INFO beater/crawler.go:158 Stopping 1 inputs
2020-12-09T16:00:15.902Z INFO [crawler] beater/crawler.go:163 Stopping input: 3906827571448963007
2020-12-09T16:00:15.902Z INFO input/input.go:136 input ticker stopped
2020-12-09T16:00:15.902Z INFO log/harvester.go:320 Reader was closed: /var/lib/docker/containers/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log. Closing.
2020-12-09T16:00:15.902Z INFO beater/crawler.go:178 Crawler stopped
2020-12-09T16:00:15.902Z INFO [registrar] registrar/registrar.go:131 Stopping Registrar
2020-12-09T16:00:15.902Z INFO [registrar] registrar/registrar.go:165 Ending Registrar
2020-12-09T16:00:15.903Z INFO [registrar] registrar/registrar.go:136 Registrar stopped
2020-12-09T16:00:15.912Z INFO [monitoring] log/log.go:153 Total non-zero metrics {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":80,"time":{"ms":80}},"total":{"ticks":230,"time":{"ms":232},"value":0},"user":{"ticks":150,"time":{"ms":152}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":8},"info":{"ephemeral_id":"cae44857-494c-40e7-bf6a-e06e2cf40759","uptime":{"ms":290}},"memstats":{"gc_next":16703568,"memory_alloc":8518080,"memory_total":40448184,"rss":73908224},"runtime":{"goroutines":11}},"filebeat":{"events":{"added":2,"done":2},"harvester":{"closed":1,"open_files":0,"running":0,"started":1}},"libbeat":{"config":{"module":{"running":0}},"output":{"type":"elasticsearch"},"pipeline":{"clients":0,"events":{"active":0,"filtered":2,"total":2}}},"registrar":{"states":{"current":1,"update":2},"writes":{"success":2,"total":2}},"system":{"cpu":{"cores":4},"load":{"1":1.79,"15":1.21,"5":1.54,"norm":{"1":0.4475,"15":0.3025,"5":0.385}}}}}}
2020-12-09T16:00:15.912Z INFO [monitoring] log/log.go:154 Uptime: 292.790204ms
2020-12-09T16:00:15.912Z INFO [monitoring] log/log.go:131 Stopping metrics logging.
2020-12-09T16:00:15.913Z INFO instance/beat.go:456 filebeat stopped.
2020-12-09T16:00:15.913Z ERROR instance/beat.go:951 Exiting: Failed to start crawler: starting input failed: Error while initializing input: Can only start an input when all related states are finished: {Id: native::4096794-64769, Finished: false, Fileinfo: &{40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log 0 416 {874391692 63743126415 0x608b880} {64769 4096794 1 33184 0 0 0 0 0 4096 0 {1607529615 874391692} {1607529615 874391692} {1607529615 874391692} [0 0 0]}}, Source: /var/lib/docker/containers/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log, Offset: 0, Timestamp: 2020-12-09 16:00:15.896210395 +0000 UTC m=+0.302799924, TTL: -1ns, Type: log, Meta: map[], FileStateOS: 4096794-64769}
Exiting: Failed to start crawler: starting input failed: Error while initializing input: Can only start an input when all related states are finished: {Id: native::4096794-64769, Finished: false, Fileinfo: &{40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log 0 416 {874391692 63743126415 0x608b880} {64769 4096794 1 33184 0 0 0 0 0 4096 0 {1607529615 874391692} {1607529615 874391692} {1607529615 874391692} [0 0 0]}}, Source: /var/lib/docker/containers/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5/40c453871c01f0581b832e0452659553b6be2ac4dc1ac8bfaf2b5478bca1cec5-json.log, Offset: 0, Timestamp: 2020-12-09 16:00:15.896210395 +0000 UTC m=+0.302799924, TTL: -1ns, Type: log, Meta: map[], FileStateOS: 4096794-64769}
Does anyone have an idea what might cause this?

Appending to file not showing up on logstash or elasticsearch output

I spun up Logstash and Elasticsearch Docker containers using images from elastic.co. When I append to the file which I have set as my input file, I don't see any output from Logstash or Elasticsearch. This page didn't help much, and I couldn't find my exact problem on Google or Stack Overflow.
This is how I started my containers:
docker run \
--name elasticsearch \
-p 9200:9200 \
-p 9300:9300 \
-e "discovery.type=single-node" \
docker.elastic.co/elasticsearch/elasticsearch:6.3.1
docker run \
--name logstash \
--rm -it -v $(pwd)/elk/logstash.yml:/usr/share/logstash/config/logstash.yml \
-v $(pwd)/elk/pipeline.conf:/usr/share/logstash/pipeline/pipeline.conf \
docker.elastic.co/logstash/logstash:6.3.1
This is my pipeline configuration file:
input {
  file {
    path => "/pepper/logstash-tutorial.log"
  }
}
output {
  elasticsearch {
    hosts => "http://x.x.x.x:9200/"
  }
  stdout {
    codec => "rubydebug"
  }
}
Logstash and Elasticsearch seem to have started fine.
Sample logstash startup output:
[INFO ][logstash.inputs.beats ] Beats inputs: Starting input listener {:address=>"0.0.0.0:5044"}
[INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x6dc306e7 sleep>"}
[INFO ][logstash.agent ] Pipelines running {:count=>2, :running_pipelines=>[:main, :".monitoring-logstash"], :non_running_pipelines=>[]}
[INFO ][org.logstash.beats.Server] Starting server on port: 5044
[INFO ][logstash.inputs.metrics ] Monitoring License OK
[INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
Sample elasticsearch startup output:
[INFO ][o.e.c.r.a.AllocationService] [FJImg8Z] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.monitoring-logstash-6-2018.07.10][0]] ...]).
So when I make changes to logstash-tutorial.log, I don't see any terminal output from Logstash or Elasticsearch. How do I get output, or how do I configure Logstash and Elasticsearch correctly?
The answer is on the same page that you referred to. Take a look at start_position:
Choose where Logstash starts initially reading files: at the beginning or at the end. The default behavior treats files like live streams and thus starts at the end. If you have old data you want to import, set this to beginning.
Set the start position as below:
input {
  file {
    path => "/pepper/logstash-tutorial.log"
    start_position => "beginning"
  }
}
output {
  elasticsearch {
    hosts => "http://x.x.x.x:9200/"
  }
  stdout {
    codec => "rubydebug"
  }
}

Flume agent - using tail -F

I am new to Apache Flume.
I have created my agent configuration as follows:
agent.sources=exec-source
agent.sinks=hdfs-sink
agent.channels=ch1
agent.sources.exec-source.type=exec
agent.sources.exec-source.command=tail -F /var/log/apache2/access.log
agent.sinks.hdfs-sink.type=hdfs
agent.sinks.hdfs-sink.hdfs.path=hdfs://<Host-Name of name node>/
agent.sinks.hdfs-sink.hdfs.filePrefix=apacheaccess
agent.channels.ch1.type=memory
agent.channels.ch1.capacity=1000
agent.sources.exec-source.channels=ch1
agent.sinks.hdfs-sink.channel=ch1
And the output I am getting is:
13/01/22 17:31:48 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 1
13/01/22 17:31:48 INFO node.FlumeNode: Flume node starting - agent
13/01/22 17:31:48 INFO properties.PropertiesFileConfigurationProvider: Configuration provider starting
13/01/22 17:31:48 INFO nodemanager.DefaultLogicalNodeManager: Node manager starting
13/01/22 17:31:48 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 9
13/01/22 17:31:48 INFO properties.PropertiesFileConfigurationProvider: Reloading configuration file:conf/flume_exec.conf
13/01/22 17:31:48 INFO conf.FlumeConfiguration: Added sinks: hdfs-sink Agent: agent
13/01/22 17:31:48 INFO conf.FlumeConfiguration: Processing:hdfs-sink
13/01/22 17:31:48 INFO conf.FlumeConfiguration: Processing:hdfs-sink
13/01/22 17:31:48 INFO conf.FlumeConfiguration: Processing:hdfs-sink
13/01/22 17:31:48 INFO conf.FlumeConfiguration: Processing:hdfs-sink
13/01/22 17:31:48 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [agent]
13/01/22 17:31:48 INFO properties.PropertiesFileConfigurationProvider: Creating channels
13/01/22 17:31:48 INFO properties.PropertiesFileConfigurationProvider: created channel ch1
13/01/22 17:31:48 INFO sink.DefaultSinkFactory: Creating instance of sink: hdfs-sink, type: hdfs
13/01/22 17:31:48 INFO hdfs.HDFSEventSink: Hadoop Security enabled: false
13/01/22 17:31:48 INFO nodemanager.DefaultLogicalNodeManager: Starting new configuration:{ sourceRunners:{exec-source=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource{name:exec-source,state:IDLE} }} sinkRunners:{hdfs-sink=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor#715d44 counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel{name: ch1}} }
13/01/22 17:31:48 INFO nodemanager.DefaultLogicalNodeManager: Starting Channel ch1
13/01/22 17:31:48 INFO instrumentation.MonitoredCounterGroup: Monitoried counter group for type: CHANNEL, name: ch1, registered successfully.
13/01/22 17:31:48 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: ch1 started
13/01/22 17:31:48 INFO nodemanager.DefaultLogicalNodeManager: Starting Sink hdfs-sink
13/01/22 17:31:48 INFO nodemanager.DefaultLogicalNodeManager: Starting Source exec-source
13/01/22 17:31:48 INFO source.ExecSource: Exec source starting with command:tail -F /var/log/apache2/access.log
13/01/22 17:31:48 INFO instrumentation.MonitoredCounterGroup: Monitoried counter group for type: SINK, name: hdfs-sink, registered successfully.
13/01/22 17:31:48 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: hdfs-sink started
But it's not writing logs to HDFS.
When I run cat /var/log/apache2/access.log instead of tail -F /var/log/apache2/access.log, it runs and my files are created on HDFS.
Please help me out.
"tail -F" by default prints only the last 10 lines of the file at start. It seems that 10 lines are not enough to fill an HDFS block, so you don't see anything written by Flume.
You can:
Try "tail -n $X -F" to print the last X lines at start (the value of X will vary depending on the block size in your HDFS setup).
Wait until access.log grows big enough while Flume is running (again, the time to wait depends on the block size and on how fast access.log grows; in production it should be fast enough, I think).
Add the following lines to your flume.conf. This forces Flume to roll a new file every 10 seconds regardless of the amount of data written (assuming it is not zero):
agent.sinks.hdfs-sink.hdfs.rollInterval = 10
agent.sinks.hdfs-sink.hdfs.rollSize = 0
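For completeness, a sketch of all three roll settings for this sink (using the hdfs-sink name from the question's config); to my understanding, hdfs.rollCount also defaults to rolling after a small number of events and can be disabled the same way:

```properties
# Roll a new HDFS file every 10 seconds, and disable size- and count-based rolling
agent.sinks.hdfs-sink.hdfs.rollInterval = 10
agent.sinks.hdfs-sink.hdfs.rollSize = 0
agent.sinks.hdfs-sink.hdfs.rollCount = 0
```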

How can I get logs collected on console using Flume NG?

I'm testing Flume NG (1.2.0) for collecting logs. It's a simple test in which Flume collects a log file, flume_test.log, and prints the collected logs to the console as sysout. conf/flume.conf is:
agent.sources = tail
agent.channels = memoryChannel
agent.sinks = loggerSink
agent.sources.tail.type = exec
agent.sources.tail.command = tail -f /Users/pj/work/flume_test.log
agent.sources.tail.channels = memoryChannel
agent.sinks.loggerSink.channel = memoryChannel
agent.sinks.loggerSink.type = logger
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 100
And I ran Flume as follows:
$ $FLUME_HOME/bin/flume-ng agent --conf $FLUME_HOME/conf --conf-file $FLUME_HOME/conf/flume.conf --name agent1 -Dflume.root.logger=DEBUG,console
After running Flume logs on console are:
Info: Sourcing environment configuration script /usr/local/lib/flume-ng/conf/flume-env.sh
+ exec /Library/Java/JavaVirtualMachines/jdk1.7.0_07.jdk/Contents/Home/bin/java -Xmx20m -Dflume.root.logger=DEBUG,console -cp '/usr/local/lib/flume-ng/conf:/usr/local/lib/flume-ng/lib/*' -Djava.library.path= org.apache.flume.node.Application --conf-file /usr/local/lib/flume-ng/conf/flume.conf --name agent1
2012-09-12 18:23:52,049 (main) [INFO - org.apache.flume.lifecycle.LifecycleSupervisor.start(LifecycleSupervisor.java:67)] Starting lifecycle supervisor 1
2012-09-12 18:23:52,052 (main) [INFO - org.apache.flume.node.FlumeNode.start(FlumeNode.java:54)] Flume node starting - agent1
2012-09-12 18:23:52,054 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.start(DefaultLogicalNodeManager.java:187)] Node manager starting
2012-09-12 18:23:52,056 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.lifecycle.LifecycleSupervisor.start(LifecycleSupervisor.java:67)] Starting lifecycle supervisor 9
2012-09-12 18:23:52,054 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.conf.file.AbstractFileConfigurationProvider.start(AbstractFileConfigurationProvider.java:67)] Configuration provider starting
2012-09-12 18:23:52,056 (lifecycleSupervisor-1-0) [DEBUG - org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.start(DefaultLogicalNodeManager.java:191)] Node manager started
2012-09-12 18:23:52,057 (lifecycleSupervisor-1-1) [DEBUG - org.apache.flume.conf.file.AbstractFileConfigurationProvider.start(AbstractFileConfigurationProvider.java:86)] Configuration provider started
2012-09-12 18:23:52,058 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:188)] Checking file:/usr/local/lib/flume-ng/conf/flume.conf for changes
2012-09-12 18:23:52,058 (conf-file-poller-0) [INFO - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:195)] Reloading configuration file:/usr/local/lib/flume-ng/conf/flume.conf
2012-09-12 18:23:52,063 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:902)] Added sinks: loggerSink Agent: agent
2012-09-12 18:23:52,063 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:988)] Processing:loggerSink
2012-09-12 18:23:52,063 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:992)] Created context for loggerSink: type
2012-09-12 18:23:52,063 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:988)] Processing:loggerSink
2012-09-12 18:23:52,063 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:295)] Starting validation of configuration for agent: agent, initial-configuration: AgentConfiguration[agent]
SOURCES: {tail={ parameters:{command=tail -f /Users/pj/work/flume_test.log, channels=memoryChannel, type=exec} }}
CHANNELS: {memoryChannel={ parameters:{capacity=100, type=memory} }}
SINKS: {loggerSink={ parameters:{type=logger, channel=memoryChannel} }}
2012-09-12 18:23:52,068 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateChannels(FlumeConfiguration.java:450)] Created channel memoryChannel
2012-09-12 18:23:52,082 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSinks(FlumeConfiguration.java:649)] Creating sink: loggerSink using LOGGER
2012-09-12 18:23:52,085 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:353)] Post validation configuration for agent
AgentConfiguration created without Configuration stubs for which only basic syntactical validation was performed[agent]
SOURCES: {tail={ parameters:{command=tail -f /Users/pj/work/flume_test.log, channels=memoryChannel, type=exec} }}
CHANNELS: {memoryChannel={ parameters:{capacity=100, type=memory} }}
AgentConfiguration created with Configuration stubs for which full validation was performed[agent]
SINKS: {loggerSink=ComponentConfiguration[loggerSink]
CONFIG:
CHANNEL:memoryChannel
}
2012-09-12 18:23:52,085 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:117)] Channels:memoryChannel
2012-09-12 18:23:52,085 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:118)] Sinks loggerSink
2012-09-12 18:23:52,085 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:119)] Sources tail
2012-09-12 18:23:52,085 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:122)] Post-validation flume configuration contains configuration for agents: [agent]
2012-09-12 18:23:52,085 (conf-file-poller-0) [WARN - org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:227)] No configuration found for this host:agent1
I think Flume started normally, so I continuously appended a bunch of lines to flume_test.log. But the lines added to flume_test.log are not printed on the console.
What is the problem with this test? Thanks for any comments and corrections.
The problem was a name mismatch between the agent name in flume.conf (agent) and the agent name given after --name (agent1) in the startup command.
After changing the option from --name agent1 to --name agent, the problem was solved.
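For reference, the corrected invocation (the question's command with the agent name matching the one defined in flume.conf):

```shell
$FLUME_HOME/bin/flume-ng agent --conf $FLUME_HOME/conf --conf-file $FLUME_HOME/conf/flume.conf --name agent -Dflume.root.logger=DEBUG,console
```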
Thanks to my colleague Lenny.