Flume: How to configure source and sink in different VMs?

I wish to configure the Avro source in VM1 and the sink in VM2.
The source should be able to connect to the channel in VM2, and the channel will send data to the sink in VM2. How should I proceed to configure this?
Please help.
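In Flume, a channel always lives inside the same agent (JVM) as the source and sink it connects, so a source in VM1 cannot be pointed directly at a channel in VM2. The usual pattern is to chain two agents over Avro: the agent on VM1 ends in an Avro sink that points at an Avro source on the agent in VM2. A minimal sketch, with placeholder hostnames, ports, and source/sink types:

# VM1 agent: local source -> memory channel -> Avro sink pointing at VM2
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = avroSink
agent1.sources.src1.type = exec
agent1.sources.src1.command = tail -F /var/log/app.log
agent1.sources.src1.channels = ch1
agent1.channels.ch1.type = memory
agent1.sinks.avroSink.type = avro
agent1.sinks.avroSink.hostname = vm2.example.com
agent1.sinks.avroSink.port = 4545
agent1.sinks.avroSink.channel = ch1

# VM2 agent: Avro source listening for VM1 -> memory channel -> final sink
agent2.sources = avroSrc
agent2.channels = ch2
agent2.sinks = finalSink
agent2.sources.avroSrc.type = avro
agent2.sources.avroSrc.bind = 0.0.0.0
agent2.sources.avroSrc.port = 4545
agent2.sources.avroSrc.channels = ch2
agent2.channels.ch2.type = memory
agent2.sinks.finalSink.type = logger
agent2.sinks.finalSink.channel = ch2

Start agent2 on VM2 first so the Avro source is listening, then start agent1 on VM1; events then hop from VM1's channel to VM2's channel over the Avro connection.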

Related

Collect metrics from a Prometheus server into Telegraf

I have a Prometheus server running on a K8s instance and Telegraf on a different cluster. Is there some way to pull metrics from the Prometheus server using Telegraf? I know there is Telegraf support for scraping metrics from Prometheus clients, but I am looking to get these metrics from the Prometheus server.
Thanks
There is a feature inside data sources called scrapers; it's a tab where you just need to put in the URL of the server.
I am trying to configure this using the CLI, but I can only do it with the GUI.
There is a Prometheus remote write parser (https://github.com/influxdata/telegraf/tree/master/plugins/parsers/prometheusremotewrite); I think it will be included in the 1.19.0 release of Telegraf. If you want to try it out now, you can use a nightly build (https://github.com/influxdata/telegraf#nightly-builds).
Configure your Prometheus remote write to point at Telegraf, and configure the input plugin to listen for traffic on the port that you configured. For convenience's sake, have the output plugin write to a file so you can see the metrics almost immediately.
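A rough sketch of what that pairing could look like (the port, path, and output file are placeholders), assuming a Telegraf build that includes the prometheusremotewrite parser:

# telegraf.conf: listen for remote write traffic and dump metrics to a file
[[inputs.http_listener_v2]]
  service_address = ":1234"
  paths = ["/receive"]
  data_format = "prometheusremotewrite"

[[outputs.file]]
  files = ["/tmp/metrics.out"]

# prometheus.yml: point remote write at the Telegraf listener
remote_write:
  - url: "http://telegraf-host:1234/receive"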

Mosquitto - subscribe to topics on a local bridge

I'm an MQTT newbie, so maybe this is obvious, but I'm not getting it.
I've got IoT devices that publish data to a cloud MQTT broker. I can't change that. I want to be able to get the messages from the cloud broker and pass them to IoT Hub in Azure. Here's what I've done so far:
Configured a VM running CentOS to host my Mosquitto server
Installed Mosquitto and configured as a bridge to IoT Hub (IoTHubBridge)
Created a separate Mosquitto config to bridge to the cloud MQTT broker (CloudBridge)
Note that both Mosquitto bridge instances are running on the same VM.
So far, so good. IoT Hub can receive test messages that pass through IoTHubBridge and CloudBridge receives messages from the cloud broker. Here's where I'm stuck - how do I get messages to pass from CloudBridge to IoTHubBridge?
Thanks!
As hashed out in the comments.
There is no need for two MQTT brokers here. You should configure both bridges in a single broker; that way, with the right topic declarations for the bridges, messages should just flow between the IoT Hub and cloud brokers.
This does assume that the topic/message structure for the cloud broker is compatible with what you need to send to IoT Hub. The bridge will allow you to add or remove a prefix from the topic, but not totally remap it. And there is no way to change the payload format.
If you need to make changes to the payload format or major changes to the topic structure, then a bridge is not the right solution. You will need to create an application that subscribes to the cloud broker and then republishes the transformed message to the IoT Hub broker. There are many ways to do this in any number of languages, but I might suggest you look at something like Node-RED if you are not already comfortable with an existing language/MQTT client combination.
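A single-broker sketch of the two bridge sections (hostnames, topics, and QoS levels are placeholders, and the TLS/authentication settings each bridge needs, such as the SAS token for IoT Hub, are omitted):

# mosquitto.conf: one broker, two bridges

# Bridge 1: pull device messages in from the cloud broker
connection cloudbridge
address cloud-broker.example.com:8883
topic sensors/# in 0

# Bridge 2: push those messages out to Azure IoT Hub, remapping the
# local sensors/ prefix onto the IoT Hub device events topic
connection iothubbridge
address myhub.azure-devices.net:8883
topic # out 1 sensors/ devices/mydevice/messages/events/

With both connections in one broker, a message arriving on sensors/temp via the cloud bridge is republished by the second bridge as devices/mydevice/messages/events/temp, which is the prefix-style remapping described above.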

Flume agent: how does a Flume agent get data from a web server located on a different physical server?

I am trying to understand Flume, referring to the official page at flume.apache.org.
In particular, referring to this section, I am a bit confused.
Do we need to run the Flume agent on the actual web server, or can we run Flume agents on a different physical server and acquire data from the web server?
If the latter is possible, how does the Flume agent get the data from the web server logs? How can the web server make its data available to the Flume agent?
Can anyone help me understand this?
A Flume agent pulls data in through a source and publishes it to a channel; a sink then drains that channel to its destination.
You can install the Flume agent in either a local or a remote configuration. But keep in mind that having it remote will add some network latency to your event processing, if you are concerned about that. You can also "multiplex" Flume agents to have one remote aggregation agent, then individual local agents on each web server.
Assuming a Flume agent is locally installed with a Spooldir or exec source, it will essentially tail a file or run a command locally. This is how it would get data from logs.
If the Flume agent is set up with a Syslog or TCP source (see the data ingestion section on network sources), then it can be on a remote machine, and you must establish a network socket in your logging application to publish messages to the other server, as illustrated below. This is a similar pattern to Apache Kafka.
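As a small illustration of the remote pattern (agent name, port, and sink are placeholders), an agent on a separate server could listen for syslog traffic that the web server forwards to it:

# flume.conf: remote agent receiving web server logs over syslog TCP
agent.sources = sysSrc
agent.channels = memCh
agent.sinks = logSink
agent.sources.sysSrc.type = syslogtcp
agent.sources.sysSrc.host = 0.0.0.0
agent.sources.sysSrc.port = 5140
agent.sources.sysSrc.channels = memCh
agent.channels.memCh.type = memory
agent.sinks.logSink.type = logger
agent.sinks.logSink.channel = memCh

The web server machine then ships its logs to that socket, for example with an rsyslog TCP forwarding rule such as *.* @@flume-host:5140.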

Kafka source and HDFS sink in Spring Cloud Data Flow

I am using Kafka as the source, and I want to write the messages from Kafka to HDFS using the HDFS sink. But while I see the file getting created on HDFS, the messages from Kafka are not written to the HDFS file. Please find the stream DSL below.
stream create --definition ":streaming > hdfs --spring.hadoop.fsUri=hdfs://127.0.0.1:50071 --hdfs.directory=/ws/output --hdfs.file-name=kafkastream --hdfs.file-extension=txt --spring.cloud.stream.bindings.input.consumer.headerMode=raw" --name mykafkastream
Please help me resolve this.
It could be that the data hasn't been written to the HDFS disk yet. You can force a flush/sync while you are testing. Try setting --hdfs.enable-sync=true --hdfs.flush-timeout=10000, so that the data is written to HDFS every 10s whether the buffer is full or not.
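Applied to the stream definition above, that would look something like this (everything else unchanged):

stream create --definition ":streaming > hdfs --spring.hadoop.fsUri=hdfs://127.0.0.1:50071 --hdfs.directory=/ws/output --hdfs.file-name=kafkastream --hdfs.file-extension=txt --hdfs.enable-sync=true --hdfs.flush-timeout=10000 --spring.cloud.stream.bindings.input.consumer.headerMode=raw" --name mykafkastream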

Can a Flume source act as a JMS consumer?

I have just started looking into Flume for writing messages to HDFS using the HDFS sink. I am wondering if a Flume source can act as a JMS consumer for my message broker.
Does Flume provide integration with message brokers?
Or do I need to write a custom JMS client that will push messages to a Flume source?
Flume 1.3 does not provide a JMS source out of the box. However, it seems that such a component will ship with the next version, Flume 1.4: https://issues.apache.org/jira/browse/FLUME-924.
Meanwhile, you can get the source code of this new component and build it yourself. AFAIK, Flume 1.4 does not break its interfaces, so the component will probably work with 1.3 as well.
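Once that component is available (or built from source), wiring it up might look roughly like this; the ActiveMQ context factory, broker URL, and queue name below are placeholders for whatever JMS broker you use:

# flume.conf: JMS source draining a queue into a channel and on to HDFS
agent.sources = jmsSrc
agent.channels = memCh
agent.sinks = hdfsSink
agent.sources.jmsSrc.type = jms
agent.sources.jmsSrc.initialContextFactory = org.apache.activemq.jndi.ActiveMQInitialContextFactory
agent.sources.jmsSrc.connectionFactory = ConnectionFactory
agent.sources.jmsSrc.providerURL = tcp://broker.example.com:61616
agent.sources.jmsSrc.destinationName = myQueue
agent.sources.jmsSrc.destinationType = QUEUE
agent.sources.jmsSrc.channels = memCh
agent.channels.memCh.type = memory
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.hdfs.path = hdfs://namenode:8020/flume/events
agent.sinks.hdfsSink.channel = memCh

You would also need your broker's JMS client jars (e.g. the ActiveMQ client) on the Flume agent's classpath.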
