Capturing network traffic using Tshark and Flume

Hello everyone, I'm trying to capture network traffic using tshark, and I'm using Apache Flume to send the results to Spark.
The problem is that when I use the exec source in Flume's configuration, Flume doesn't work: it stops instantly right after starting.
My configuration:
agent.sources = tsharkSource
agent.channels = memoryChannel
agent.sinks = avroSink
# Configuring the source to pull the batches from Tshark
agent.sources.tsharkSource.type = exec
agent.sources.tsharkSource.command = tshark -T json
agent.sources.tsharkSource.channels = memoryChannel
# Configuring the sink to push logs to Spark; change hostname to 116's IP address
agent.sinks.avroSink.type = avro
agent.sinks.avroSink.channel = memoryChannel
agent.sinks.avroSink.hostname = 192.168.1.112
agent.sinks.avroSink.port= 6969
# Configuring the memory channel
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 1000
agent.channels.memoryChannel.transactionCapacity = 100
The shell output:
flume-ng agent -f conf/flume-conf.properties -n agent
Warning: No configuration directory set! Use --conf <dir> to override.
Info: Including Hive libraries found via () for Hive access
+ exec /usr/lib/jvm/java-1.11.0-openjdk-amd64//bin/java -Xmx20m -cp '/home/oshiflume/flume/apache-flume-1.8.0-bin/lib/*:/lib/*' -Djava.library.path= org.apache.flume.node.Application -f conf/flume-conf.properties -n agent
log4j:WARN No appenders could be found for logger (org.apache.flume.node.Application).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Your execution command is
flume-ng agent -f conf/flume-conf.properties -n agent
I see two mistakes here. First, you must specify the configuration directory with -c conf; also, Flume configuration files are generally named some-config.conf.
The warning on the console is "No configuration directory set! Use --conf <dir>" (-c and --conf are the same thing).
You may want to rename your configuration file from flume-conf.properties to flume.conf.
As a solution you can try this command:
flume-ng agent -c conf -f conf/flume.conf -n agent
If you want to display the logs after execution, use this command:
flume-ng agent -c conf -f conf/flume.conf -n agent -Dflume.root.logger=INFO,console
To display the logs, log4j.properties must be in your conf directory as conf/log4j.properties.
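For reference, a sketch of the expected layout after these changes (file names per the suggestions above; your paths may differ):
conf/
├── flume.conf          # renamed from flume-conf.properties
└── log4j.properties    # picked up because of -c conf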
My properties are below:
flume.root.logger=INFO,LOGFILE
flume.log.dir=./logs
flume.log.file=flume.log
log4j.logger.org.apache.flume.lifecycle = INFO
log4j.logger.org.jboss = WARN
log4j.logger.org.mortbay = INFO
log4j.logger.org.apache.avro.ipc.NettyTransceiver = WARN
log4j.logger.org.apache.hadoop = INFO
log4j.logger.org.apache.hadoop.hive = ERROR
# Define the root logger to the system property "flume.root.logger".
log4j.rootLogger=${flume.root.logger}
log4j.appender.LOGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.LOGFILE.MaxFileSize=100MB
log4j.appender.LOGFILE.MaxBackupIndex=10
log4j.appender.LOGFILE.File=${flume.log.dir}/${flume.log.file}
log4j.appender.LOGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.LOGFILE.layout.ConversionPattern=%d{dd MMM yyyy HH:mm:ss,SSS} %-5p [%t] (%C.%M:%L) %x - %m%n
log4j.appender.DAILY=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.DAILY.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.DAILY.rollingPolicy.ActiveFileName=${flume.log.dir}/${flume.log.file}
log4j.appender.DAILY.rollingPolicy.FileNamePattern=${flume.log.dir}/${flume.log.file}.%d{yyyy-MM-dd}
log4j.appender.DAILY.layout=org.apache.log4j.PatternLayout
log4j.appender.DAILY.layout.ConversionPattern=%d{dd MMM yyyy HH:mm:ss,SSS} %-5p [%t] (%C.%M:%L) %x - %m%n
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d (%t) [%p - %l] %m%n

Related

Can we print the configurations on which the MLflow server has started?

I am using the following command to start the MLflow server:
mlflow server --backend-store-uri postgresql://mlflow_user:mlflow@localhost/mlflow --artifacts-destination <S3 bucket location> --serve-artifacts -h 0.0.0.0 -p 8000
Before production deployment, we have a requirement to print or fetch the configuration the server is running under. For example, the above command uses a localhost Postgres connection and an S3 bucket.
Is there a way to achieve this?
Also, how do I set the server's environment to "production"? So finally I should see a log like this:
[LOG] Started MLflow server:
Env: production
postgres: localhost:5432
S3: <S3 bucket path>
You can wrap it in a bash script or in a Makefile target, e.g. (in a Makefile, recipe lines are prefixed with @ to suppress echoing the command itself):
start_mlflow_production_server:
	@echo "Started MLflow server:"
	@echo "Env: production"
	@echo "postgres: localhost:5432"
	@echo "S3: <S3 bucket path>"
	@mlflow server --backend-store-uri postgresql://mlflow_user:mlflow@localhost/mlflow --artifacts-destination <S3 bucket location> --serve-artifacts -h 0.0.0.0 -p 8000
Additionally, you can set and use environment variables specific to that server and print and use those in the command.
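For instance, a minimal bash wrapper along those lines (a sketch; the variable names are illustrative, only the mlflow server flags come from the question):
#!/usr/bin/env bash
# Sketch: MLFLOW_ENV, BACKEND_URI, ARTIFACTS_DEST are made-up names, not MLflow conventions.
export MLFLOW_ENV="production"
export BACKEND_URI="postgresql://mlflow_user:mlflow@localhost/mlflow"
export ARTIFACTS_DEST="<S3 bucket location>"
echo "[LOG] Started MLflow server:"
echo "  Env: $MLFLOW_ENV"
echo "  postgres: localhost:5432"
echo "  S3: $ARTIFACTS_DEST"
exec mlflow server --backend-store-uri "$BACKEND_URI" \
    --artifacts-destination "$ARTIFACTS_DEST" --serve-artifacts -h 0.0.0.0 -p 8000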

Use environment variables in wildfly datasource definition (file)

I want to repackage my WAR application as a self-contained Docker image - currently I'm still deploying it as a WAR to WildFly 19.
Since I don't want the database password and/or URL to be part of the Docker image, I want them to be configurable from outside - as environment variables.
So my current Docker image includes a WildFly datasource definition as a -ds.xml file with env placeholders, since according to
https://blog.imixs.org/2017/03/17/use-environment-variables-wildfly-docker-container/
and other sources this should be possible.
My DS file is
<datasources xmlns="http://www.jboss.org/ironjacamar/schema">
  <datasource jndi-name="java:jboss/datasources/dbtDS" pool-name="benchmarkDS">
    <driver>dbt-datasource.ear_com.mysql.jdbc.Driver_5_1</driver>
    <connection-url>${DB_CONNECTION_URL,env.DB_CONNECTION_URL}</connection-url>
    <security>
      <user-name>${DB_USERNAME,env.DB_USERNAME}</user-name>
      <password>${DB_PASSWORD,env.DB_PASSWORD}</password>
    </security>
    <pool>[...]</pool>
  </datasource>
</datasources>
But starting the Docker container always results in the environment variables not being recognized:
11:00:38,790 WARN [org.jboss.jca.core.connectionmanager.pool.strategy.OnePool] (JCA PoolFiller) IJ000610: Unable to fill pool: java:jboss/datasources/dbtDS: javax.resource.ResourceException: IJ031084: Unable to create connection
at org.jboss.ironjacamar.jdbcadapters#1.4.22.Final//org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.createLocalManagedConnection(LocalManagedConnectionFactory.java:345)
[...]
Caused by: javax.resource.ResourceException: IJ031083: Wrong driver class [com.mysql.jdbc.Driver] for this connection URL []
at org.jboss.ironjacamar.jdbcadapters#1.4.22.Final//org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.createLocalManagedConnection(LocalManagedConnectionFactory.java:323)
The last line says that DB_CONNECTION_URL seems to be empty - I tried several combinations, believe me.
Wrong driver class [com.mysql.jdbc.Driver] for this connection URL []
I'm starting my container with
docker run --name="dbt" --rm -it -p 8080:8080 -p 9990:9990 -e DB_CONNECTION_URL="jdbc:mysql://127.0.0.1:13306/dbt?serverTimezone=UTC" -e DB_USERNAME="dbt" -e DB_PASSWORD="_dbt" dbt
I even modified standalone.sh to print the environment, and DB_CONNECTION_URL IS there.
=========================================================================
JBoss Bootstrap Environment
JBOSS_HOME: /opt/jboss/wildfly
JAVA: /usr/lib/jvm/java/bin/java
DB_CONNECTION_URL: jdbc:mysql://127.0.0.1:13306/dbt?serverTimezone=UTC JAVA_OPTS: -server -Xms64m -Xmx512m -XX:MetaspaceSize=96M -XX:MaxMetaspaceSize=256m -Djava.net.preferIPv4Stack=true -Djboss.modules.system.pkgs=org.jboss.byteman -Djava.awt.headless=true --add-exports=java.base/sun.nio.ch=ALL-UNNAMED --add-exports=jdk.unsupported/sun.misc=ALL-UNNAMED --add-exports=jdk.unsupported/sun.reflect=ALL-UNNAMED
=========================================================================
11:00:34,362 INFO [org.jboss.modules] (main) JBoss Modules version 1.10.1.Final
11:00:34,854 INFO [org.jboss.msc] (main) JBoss MSC version 1.4.11.Final
11:00:34,863 INFO [org.jboss.threads] (main) JBoss Threads version 2.3.3.Final
[...]
So what am I doing wrong? What do I need to do to get WildFly to replace the placeholders in my DS file?
The placeholders seem to be processed - they evaluate to empty - but they should contain something...
Any suggestions appreciated.
Current Dockerfile
[...] building step above [...]
FROM jboss/wildfly:20.0.1.Final
USER root
RUN yum -y install zip wget && yum clean all
RUN sed -i 's/echo " JAVA_OPTS/echo " DB_CONNECTION_URL: $DB_CONNECTION_URL JAVA_OPTS/g' /opt/jboss/wildfly/bin/standalone.sh && \
cat /opt/jboss/wildfly/bin/standalone.sh
RUN sed -i 's/<spec-descriptor-property-replacement>false<\/spec-descriptor-property-replacement>/<spec-descriptor-property-replacement>true<\/spec-descriptor-property-replacement><jboss-descriptor-property-replacement>true<\/jboss-descriptor-property-replacement><annotation-property-replacement>true<\/annotation-property-replacement>/g' /opt/jboss/wildfly/standalone/configuration/standalone.xml
USER jboss
COPY --from=0 /_build/dbt-datasource.ear /opt/jboss/wildfly/standalone/deployments/
ADD target/dbt.war /opt/jboss/wildfly/standalone/deployments/
Answer to myself - perhaps good to know for others later:
Placeholders in -ds.xml files are NOT supported(!).
I added the same datasource definition to standalone.xml by patching it with sed, and now it works more or less out of the box, without further modification.
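For anyone taking the same route, a sketch of adding that definition with jboss-cli at image build time instead of sed (the datasource names come from the -ds.xml above; the driver name mysql is an assumption and must match your installed driver):
# configure-ds.cli -- jboss-cli leaves ${env.*} expressions unresolved by
# default, so WildFly evaluates them at boot from the container environment.
embed-server --server-config=standalone.xml
data-source add --name=benchmarkDS \
    --jndi-name=java:jboss/datasources/dbtDS \
    --driver-name=mysql \
    --connection-url=${env.DB_CONNECTION_URL} \
    --user-name=${env.DB_USERNAME} \
    --password=${env.DB_PASSWORD}
stop-embedded-server
Invoked from the Dockerfile with something like: RUN /opt/jboss/wildfly/bin/jboss-cli.sh --file=configure-ds.cli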

ActiveMQ within Wildfly on a Docker container gives: Invalid "host" value "0.0.0.0" detected

I have Wildfly running in a Docker container.
Within Wildfly the messaging-activemq subsystem is active.
The subsystem and extension defaults are taken from the standalone-full.xml file.
After starting WildFly, the following output is displayed:
[org.apache.activemq.artemis.jms.server] (ServerService Thread Pool -- 64)
AMQ121005: Invalid "host" value "0.0.0.0" detected for "http-connector" connector.
Switching to "eeb79399d447".
If this new address is incorrect please manually configure the connector to use the proper one.
The eeb79399d447 is the docker container id.
It's also impossible to connect to JMS from my Java client. While connecting, it gives the following error:
AMQ214016: Failed to create netty connection: java.net.UnknownHostException: eeb79399d447
When I start WildFly on my local workstation (outside Docker), the problem does not occur and I can connect to JMS and send my messages.
Here are a few options. Options 1 & 2 may be what you asked for, but in the end they didn't work for me. Option 3, however, I think will better address your intent.
Option 1) You can do this by adding some scripting to your Docker image (and not touching your standalone-full.xml). The basic idea (credit goes to GitHub user kwart) is to make a Docker entry point that determines the IPv4 address of the Docker container before calling standalone.sh.
See: https://github.com/kwart/dockerfiles/tree/master/wildfly-ext and check out the usage of WILDFLY_BIND_ADDR. I forked it.
Notes:
GetIp.java will print out the IPv4 address ( and is copied into the container )
dockerentry-point.sh calls GetIp.java as needed
WILDFLY_BIND_ADDR=${WILDFLY_BIND_ADDR:-0.0.0.0}
if [ "${WILDFLY_BIND_ADDR}" = "auto" ]; then
    WILDFLY_BIND_ADDR=`java -cp /opt/jboss GetIp`
fi
Option 2) Alternatively, using some script-fu, you may be able to do everything you need in a Dockerfile:
#CMD ["/opt/jboss/wildfly/bin/standalone.sh", "-c", "standalone-full.xml", "-b", "0.0.0.0", "-bmanagement", "0.0.0.0"]
CMD ["sh", "-c", "DOCKER_IPADDR=$(hostname --ip-address) && echo IP Address was $DOCKER_IPADDR && /opt/jboss/wildfly/bin/standalone.sh -c standalone-full.xml -b=$DOCKER_IPADDR -bmanagement=$DOCKER_IPADDR"]
Your mileage may vary.
I was working with the helloworld-jms quickstart from the WildFly docs and had to jump through some extra hoops to get the JMS queue created. Even at that point, the sample Java code wasn't able to connect with either option 1 or option 2.
Option 3) (This worked for me, btw.) Start your container bound to 0.0.0.0, expose port 8080 for your JMS client running on the host, and add an entry to your host's /etc/hosts file:
Dockerfile:
FROM jboss/wildfly
# CP foo.war /opt/jboss/wildfly/standalone/deployments/
RUN /opt/jboss/wildfly/bin/add-user.sh admin admin --silent
RUN /opt/jboss/wildfly/bin/add-user.sh -a quickstartUser quickstartPwd1! --silent
RUN echo "quickstartUser=guest" >> /opt/jboss/wildfly/standalone/configuration/application-roles.properties
# use standalone-full.xml to enable the JMS feature
CMD ["/opt/jboss/wildfly/bin/standalone.sh", "-c", "standalone-full.xml", "-b", "0.0.0.0", "-bmanagement", "0.0.0.0"]
Build & run (expose 8080 if your client is on your host machine):
docker build -t mywildfly .
docker run -it --rm --name jboss -p 127.0.0.1:8080:8080 -p 127.0.0.1:9990:9990 mywildfly
Then on the host machine (I'm running OSX; my jboss container's id was 46d04508b92b) add an entry to your /etc/hosts for the Docker host name that points to 127.0.0.1:
127.0.0.1 46d04508b92b # <-- replace with your container's id
Once the wildfly container is running, you create/configure the testQueue via scripts or in the management console. My config came from https://github.com/wildfly/quickstart.git under the helloworld-jms folder:
docker cp configure-jms.cli jboss:/tmp/
docker exec jboss /opt/jboss/wildfly/bin/jboss-cli.sh --connect --file=/tmp/configure-jms.cli
and SUCCESS from mvn clean compile exec:java on the host machine (from within the helloworld-jms folder):
Mar 28, 2018 9:03:15 PM org.jboss.as.quickstarts.jms.HelloWorldJMSClient main
INFO: Found destination "jms/queue/test" in JNDI
Mar 28, 2018 9:03:16 PM org.jboss.as.quickstarts.jms.HelloWorldJMSClient main
INFO: Sending 1 messages with content: Hello, World!
Mar 28, 2018 9:03:16 PM org.jboss.as.quickstarts.jms.HelloWorldJMSClient main
INFO: Received message with content Hello, World!
You need to edit standalone-full.xml to cope with JMS behind NAT: when you run the Docker container, pass through the IP and port that your JMS client can use to connect - which, in Docker's default config, is the IP of the machine running Docker.

Telegraf - inputs.procstat pgrep plugin issue

Telegraf v1.0.1
I'm not able to see the telegraf[._] (tree) metrics anymore after I enabled the [[inputs.procstat]] plugin.
Telegraf is installed successfully and the process is running. I'm pretty much using the normal settings for the input plugins and the output plugin.
This is what I got:
ubuntu@jenkins:/tmp/giga_aks_testing/ansible$ grep -C 2 jenkins /etc/telegraf/telegraf.d/telegraf-custom-host-services-processes.conf; echo ; ps -eAf|grep jenkins; echo; pgrep -f jenkins; echo; cat -n /var/log/telegraf/telegraf.log; echo date; echo; ps -eAf|grep telegraf; echo ; sudo service telegraf status
[[inputs.procstat]]
exe = "jenkins"
prefix = "pgrep_serviceprocess"
root 2875 3685 0 2016 pts/3 00:00:00 sudo su jenkins
root 2876 2875 0 2016 pts/3 00:00:00 su jenkins
jenkins 2877 2876 0 2016 pts/3 00:00:00 bash
jenkins 11645 1 0 2016 ? 00:00:01 /usr/bin/daemon --name=jenkins --inherit --env=JENKINS_HOME=/var/lib/jenkins --output=/var/log/jenkins/jenkins.log --pidfile=/var/run/jenkins/jenkins.pid -- /usr/bin/java -Djava.awt.headless=true -jar /usr/share/jenkins/jenkins.war --webroot=/var/cache/jenkins/war --httpPort=8080
jenkins 11647 11645 0 2016 ? 05:33:22 /usr/bin/java -Djava.awt.headless=true -jar /usr/share/jenkins/jenkins.war --webroot=/var/cache/jenkins/war --httpPort=8080
ubuntu 21973 26885 0 06:57 pts/0 00:00:00 grep --color=auto jenkins
2875
2876
11645
11647
1 2017-01-07T06:54:00Z E! Error: procstat getting process, exe: [jenkins] pidfile: [] pattern: [] user: [] Failed to execute /usr/bin/pgrep. Error: 'exit status 1'
2 2017-01-07T06:55:00Z E! Error: procstat getting process, exe: [jenkins] pidfile: [] pattern: [] user: [] Failed to execute /usr/bin/pgrep. Error: 'exit status 1'
3 2017-01-07T06:56:00Z E! Error: procstat getting process, exe: [jenkins] pidfile: [] pattern: [] user: [] Failed to execute /usr/bin/pgrep. Error: 'exit status 1'
4 2017-01-07T06:57:00Z E! Error: procstat getting process, exe: [jenkins] pidfile: [] pattern: [] user: [] Failed to execute /usr/bin/pgrep. Error: 'exit status 1'
date
telegraf 19336 1 0 05:45 pts/0 00:00:04 /usr/bin/telegraf -pidfile /var/run/telegraf/telegraf.pid -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
ubuntu 21977 26885 0 06:57 pts/0 00:00:00 grep --color=auto telegraf
telegraf Process is running [ OK ]
ubuntu@jenkins:/tmp/giga_aks_testing/ansible$
Why is the log file showing an error when the jenkins process is running and pgrep -f jenkins returns a valid result?
PS: the [[inputs.procstat]] plugin uses pgrep -f <exe_value_pattern> for its logic if the pattern = method is used, and pgrep <executable> if the exe = method is used.
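In other words, per the descriptions in this post, the selection methods boil down to these OS-level calls (a sketch, not the plugin's exact code):
pgrep jenkins       # exe = "jenkins"
pgrep -f jenkins    # pattern = "jenkins"
pgrep -u jenkins    # user = "jenkins"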
The full /etc/telegraf/telegraf.d/telegraf-custom-host-services-processes.conf file is:
[[inputs.procstat]]
exe = "jenkins"
prefix = "pgrep_serviceprocess"
[[inputs.procstat]]
exe = "telegraf"
prefix = "pgrep_serviceprocess"
[[inputs.procstat]]
exe = "sshd"
prefix = "pgrep_serviceprocess"
OK. Seems like this is an OPEN bug.
Telegraf with an [[inputs.procstat]] plugin entry won't barf if there's only one plugin in one file.
If you specify multiple entries, Telegraf will start spitting those errors out even if the exe = <executables_processes> are running. (It won't stop the Telegraf service from working, though.)
To fix the errors, this is what I did:
[[inputs.procstat]]
exe = "telegraf|.*"
prefix = "pgrep_serviceprocess"
Now, as pgrep is used for Telegraf's [[inputs.procstat]] plugin, it'll run this at the OS level: pgrep "telegraf|.*".
You can also just use exe = "." (simplest) or exe = ".*", but practically those make it hard to find out who is actually trying to grep all the processes running on the system.
NOTE: .* will match every single process running on the machine, so use it only until we get a proper fix for this.
Related Source code Github file: https://github.com/influxdata/telegraf/blob/master/plugins/inputs/procstat/procstat.go
Related issue: https://github.com/influxdata/telegraf/issues/586
I still couldn't find out why the telegraf.x.x metrics are not available after I enabled the [[inputs.procstat]] input. Is that due to a separate file? I'm not sure. But I can see the procstat.x.x metric tree, while the telegraf.x.x metric tree is not visible now.
OR better,
One can also use:
[[inputs.procstat]]
pattern = "."
prefix = "pgrep_serviceprocess"
The above will do pgrep -f "." where the pattern is . (to catch everything, aka every process/cmd/service running on the machine).
OR (but the following is not a scalable solution, as you have to know which user; on some boxes, Jenkins may be running as a user other than jenkins):
[[inputs.procstat]]
user = "jenkins"
prefix = "pgrep_serviceprocess"
The above will do pgrep -u "jenkins" where the user is jenkins (to catch every process running as the jenkins user).
To check whether jenkins or enhanceio is running, you can use the [[inputs.exec]] plugin as well. I simply used the [[inputs.filestat]] plugin, and it worked when I pointed it at the pid file of each tool. https://github.com/influxdata/telegraf/tree/master/plugins/inputs/filestat
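For example, a minimal sketch of that filestat approach (the pid-file path is an assumption; it differs per install):
[[inputs.filestat]]
  # stats on the pid file double as an "is it running" check
  files = ["/var/run/jenkins/jenkins.pid"]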

Nagios Percona Monitoring Plugin

I was reading a blog post on Percona Monitoring Plugins and how you can somehow monitor a Galera cluster using pmp-check-mysql-status plugin. Below is the link to the blog demonstrating that:
https://www.percona.com/blog/2013/10/31/percona-xtradb-cluster-galera-with-percona-monitoring-plugins/
The commands in this tutorial are run on the command line. I wish to try these commands in a Nagios .cfg file, e.g. monitor.cfg. How do I write the services for the commands used in this tutorial?
This was my attempt, and I cannot figure out the best parameters to use for check_command in the service. I suspect that's where the problem is.
So inside my /etc/nagios3/conf.d/monitor.cfg file, I have the following:
define host{
    use        generic-host
    host_name  percona-server
    alias      percona
    address    127.0.0.1
}
## Check for a Primary Cluster
define command{
    command_name  check_mysql_status
    command_line  /usr/lib/nagios/plugins/pmp-check-mysql-status -x wsrep_cluster_status -C == -T str -c non-Primary
}
define service{
    use                  generic-service
    hostgroup_name       mysql-servers
    service_description  Cluster
    check_command        pmp-check-mysql-status!wsrep_cluster_status!==!str!non-Primary
}
When I run Nagios and go to monitor it, I get this message in the Nagios dashboard:
status: UNKNOWN; /usr/lib/nagios/plugins/pmp-check-mysql-status: 31: shift: can't shift that many
Have you verified that:
/usr/lib/nagios/plugins/pmp-check-mysql-status -x wsrep_cluster_status -C == -T str -c non-Primary
works fine on the command line on the target host? I suspect there's a shell escape issue with the ==.
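If so, a hedged sketch of the usual fix (untested against this plugin version): use $ARGn$ macros in the command definition, quote the operator so the == survives the shell, and make check_command reference the same command_name:
define command{
    command_name  pmp-check-mysql-status
    command_line  /usr/lib/nagios/plugins/pmp-check-mysql-status -x $ARG1$ -C '$ARG2$' -T $ARG3$ -c $ARG4$
}
define service{
    use                  generic-service
    hostgroup_name       mysql-servers
    service_description  Cluster
    check_command        pmp-check-mysql-status!wsrep_cluster_status!==!str!non-Primary
}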
Does this work well for you? /usr/lib64/nagios/plugins/pmp-check-mysql-status -x wsrep_flow_control_paused -w 0.1 -c 0.9
