Unable to read Pub/Sub messages from local emulator in Apache Beam - Docker

I am trying to run a simple Apache Beam pipeline with the DirectRunner that reads from a Pub/Sub subscription and writes the messages to disk.
The pipeline works fine when I run it against GCP; however, when I try to run it against my local Pub/Sub emulator, it doesn't seem to do anything.
I am using a custom Options class that extends the org.apache.beam.sdk.io.gcp.pubsub.PubsubOptions class.
public interface Options extends PubsubOptions {
    @Description("Pub/Sub subscription to read the input from")
    @Required
    ValueProvider<String> getInputSubscription();
    void setInputSubscription(ValueProvider<String> valueProvider);
}
The pipeline is quite simple:
pipeline
    .apply("Read Pub/Sub Messages", PubsubIO.readMessagesWithAttributes()
        .fromSubscription(options.getInputSubscription()))
    .apply("Add a fixed window", Window.into(FixedWindows.of(Duration.standardSeconds(WINDOW_SIZE))))
    .apply("Convert Pub/Sub To String", new PubSubMessageToString())
    .apply("Write Pub/Sub messages to local disk", new WriteOneFilePerWindow());
The pipeline is executed with the following options:
mvn compile exec:java \
-Dexec.mainClass=DefaultPipeline \
-Dexec.cleanupDaemonThreads=false \
-Dexec.args=" \
--project=my-project \
--inputSubscription=projects/my-project/subscriptions/my-subscription \
--pubsubRootUrl=http://127.0.0.1:8681 \
--runner=DirectRunner"
I am using this Pub/Sub emulator docker image and executing it with the following command:
docker run --rm -ti -p 8681:8681 -e PUBSUB_PROJECT1=my-project,topic:my-subscription marcelcorso/gcloud-pubsub-emulator:latest
Is there more configuration required to make this work?

It turns out that an Apache Beam pipeline is unable to read from a local Pub/Sub emulator if you have the GOOGLE_APPLICATION_CREDENTIALS environment variable set.
Once I removed this environment variable, which was pointing to a GCP service account, the pipeline worked seamlessly with the local Pub/Sub emulator.
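As a minimal sketch (not from the original answer, so double-check the class names against your Beam version), the emulator endpoint and credentials can also be configured programmatically on the options object, which avoids depending on GOOGLE_APPLICATION_CREDENTIALS at all:
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.extensions.gcp.auth.NoopCredentialFactory;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class EmulatorPipelineOptions {
    public static void main(String[] args) {
        PubsubOptions options = PipelineOptionsFactory.fromArgs(args)
                .withValidation()
                .as(PubsubOptions.class);
        // Point PubsubIO at the local emulator instead of pubsub.googleapis.com
        options.setPubsubRootUrl("http://127.0.0.1:8681");
        // Skip loading GOOGLE_APPLICATION_CREDENTIALS / service account credentials
        options.setCredentialFactoryClass(NoopCredentialFactory.class);
        Pipeline pipeline = Pipeline.create(options);
        // ... apply PubsubIO.readMessagesWithAttributes() etc. as in the question ...
        pipeline.run().waitUntilFinish();
    }
}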

You can troubleshoot the local emulator by issuing manual HTTP requests to it (via curl), like so:
$ curl -d '{"messages": [{"data": "c3Vwc3VwCg=="}]}' -H "Content-Type: application/json" -X POST localhost:8681/v1/projects/my-project/topics/topic:publish
{
  "messageIds": ["5"]
}
$
$ curl -d '{"returnImmediately":true, "maxMessages":1}' -H "Content-Type: application/json" -X POST localhost:8681/v1/projects/my-project/subscriptions/my-subscription:pull
{
  "receivedMessages": [{
    "ackId": "projects/my-project/subscriptions/my-subscription:9",
    "message": {
      "data": "c3Vwc3VwCg==",
      "messageId": "5",
      "publishTime": "2019-04-30T17:26:09Z"
    }
  }]
}
$
Or by pointing the gcloud command-line tool at it:
$ CLOUDSDK_API_ENDPOINT_OVERRIDES_PUBSUB=localhost:8681 gcloud pubsub topics list
Also, note that when the emulator comes up, it creates the topic and subscription from scratch, so there are no messages on them. If your pipeline expects to immediately pull messages on the subscription, that would explain why it seems “stuck”. Note that when you run the pipeline at GCP, the topic and subscription you use there may already have messages on them.
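If you prefer to seed the emulator from Java rather than curl, the usual Google Cloud client-library pattern for targeting an emulator looks roughly like the sketch below (an assumption based on the standard emulator setup, not part of the original answer; the project my-project, topic topic, and port 8681 are the ones used in the docker command above):
import com.google.api.gax.core.NoCredentialsProvider;
import com.google.api.gax.grpc.GrpcTransportChannel;
import com.google.api.gax.rpc.FixedTransportChannelProvider;
import com.google.cloud.pubsub.v1.Publisher;
import com.google.protobuf.ByteString;
import com.google.pubsub.v1.PubsubMessage;
import com.google.pubsub.v1.TopicName;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class SeedEmulator {
    public static void main(String[] args) throws Exception {
        // Plaintext gRPC channel to the emulator (it serves gRPC and REST on the same port)
        ManagedChannel channel = ManagedChannelBuilder.forTarget("127.0.0.1:8681").usePlaintext().build();
        Publisher publisher = Publisher.newBuilder(TopicName.of("my-project", "topic"))
                .setChannelProvider(FixedTransportChannelProvider.create(GrpcTransportChannel.create(channel)))
                .setCredentialsProvider(NoCredentialsProvider.create())
                .build();
        // Publish a test message so the subscription has something for the pipeline to pull
        publisher.publish(PubsubMessage.newBuilder().setData(ByteString.copyFromUtf8("supsup")).build()).get();
        publisher.shutdown();
        channel.shutdown();
    }
}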

Related

How to set up Apache Flume with an HTTP source and save data locally using a file_roll sink

The objective of this question is to create an Apache Flume setup that receives data from an HTTP Flume source and saves it using a file-roll Flume sink. The input data is provided by the user, and after receiving it we save it to a text file.
The following configuration file configures the Flume agent: it runs an HTTP source and saves the incoming data to files.
Http_Source.conf
# Base Config
a1.sources=r1
a1.sinks=k1
a1.channels=c1
# Configure the source
a1.sources.r1.type=http
#a1.sources.r1.bind=localhost
a1.sources.r1.port=8888
# Sink Configuration
a1.sinks.k1.type=file_roll
a1.sinks.k1.sink.rollInterval=60
a1.sinks.k1.sink.directory=/home/flumedata/
# Channel configuration
a1.channels.c1.type=memory
a1.channels.c1.capacity=1000
# Link stuff together
a1.sources.r1.channels=c1
a1.sinks.k1.channel=c1
Now run the Flume agent with the following command:
./bin/flume-ng agent --conf $FLUME_HOME/conf --conf-file $FLUME_HOME/conf/Http_Source.conf --name a1 -Dflume.root.logger=INFO,console
After the Flume agent has started, the client sends data to it:
curl --location --request POST 'http://localhost:8888' \
--header 'Content-Type: application/json' \
--data-raw '[{"body": "type here data to send flume"}]'
Flume creates the data file at the location specified in the config file (/home/flumedata/).
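For reference, the same POST can be issued from Java instead of curl. This is just a hedged sketch assuming the agent from Http_Source.conf is listening on localhost:8888:
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FlumeHttpPost {
    public static void main(String[] args) throws Exception {
        // The HTTP source's default JSONHandler expects a JSON array of events
        String events = "[{\"body\": \"type here data to send flume\"}]";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8888"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(events))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode()); // 200 means the event was accepted
    }
}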

FORBIDDEN/12/index read-only / allow delete (api) problem

When importing items into my Rails app I keep getting the above error raised by Searchkick on behalf of Elasticsearch.
I'm running Elasticsearch in a Docker container and I start my app by running docker-compose up. I've tried running the command recommended above, but I just get "No such file or directory" returned.
I do have port 9200 exposed to the outside, but nothing seems to help. Any ideas?
Indeed, running curl -XPUT -H "Content-Type: application/json" http://localhost:9200/_all/_settings -d '{"index.blocks.read_only_allow_delete": null}' as suggested by @Nishant Saini resolves the very similar issue I just ran into.
I had hit the disk watermark limits on my machine.
Use the following command on Linux:
curl -s -H 'Content-Type: application/json' -XPUT 'http://localhost:9200/_all/_settings?pretty' -d '{
  "index": {
    "blocks": {"read_only_allow_delete": "false"}
  }
}'
The same command in Kibana's Dev Tools format:
PUT _all/_settings
{
  "index": {
    "blocks": {"read_only_allow_delete": "false"}
  }
}
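If you need to clear the block from application code rather than curl or Kibana, a minimal Java sketch of the same settings update (assuming Elasticsearch is reachable on localhost:9200, and using null to remove the block as in the curl example above) could look like this:
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ClearReadOnlyBlock {
    public static void main(String[] args) throws Exception {
        String body = "{\"index\": {\"blocks\": {\"read_only_allow_delete\": null}}}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/_all/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // expect {"acknowledged":true}
    }
}
Keep in mind that Elasticsearch reapplies the block if disk usage is still above the flood-stage watermark, so the underlying disk space issue also needs to be addressed.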

Spring Cloud Data Flow HTTP POST errors

While trying a getting-started demo from the Spring Cloud Data Flow website, I debugged the Spring Cloud Data Flow local server 1.4.0.RELEASE in IDEA, ran the SCDF shell from the command line (Windows), and completed the following steps in the shell:
app register --name http --type source --uri maven://org.springframework.cloud.stream.app:http-source-rabbit:1.2.0.RELEASE
app register --name log --type sink --uri maven://org.springframework.cloud.stream.app:log-sink-rabbit:1.1.0.RELEASE
stream create --name httptest --definition "http --server.port=9000 | log" --deploy
http post --target http://localhost:9000 --data "hello world"
Steps 1-3 were fine, but when I ran step 4, I kept getting error messages like:
{"exception":"java.nio.charset.UnsupportedCharsetException",
"path":"/",
"error":"Internal Server Error",
"Message":"x-ibm1166",
"timestamp":1524899078020,
"status":500
}

Error faced following the REST persistent data store tutorial on Hyperledger Composer

https://i.imgur.com/nGh5orv.png
I am setting this up in an AWS EC2 environment. Everything worked fine until I tried multi-user mode.
I am facing this issue after setting up the MongoDB persistent data store following the tutorials.
Here is my setup in envvars.txt:
COMPOSER_CARD=admin#property-network
COMPOSER_NAMESPACES=never
COMPOSER_AUTHENTICATION=true
COMPOSER_MULTIUSER=true
COMPOSER_PROVIDERS='{
"github": {
"provider": "github",
"module": "passport-github",
"clientID": "xxxx",
"clientSecret": "xxxx
"authPath": "/auth/github",
"callbackURL": "/auth/github/callback",
"successRedirect": "/",
"failureRedirect": "/"
}
}'
COMPOSER_DATASOURCES='{
"db": {
"name": "db",
"connector": "mongodb",
"host": "mongo"
}
}'
And I had changed the connection profile of both h1lfv1 and admin#xxx-network to 0.0.0.0 as seen here.
https://github.com/hyperledger/composer/issues/1784
I tried the solution there and it didn't work.
Thank you!
Currently there's an issue with the admin identity re-enrolling (strictly an issue with the REST server) even though the admin card has a certificate; it ignores it, but this is fixed in 0.18.x.
Further, there's a hostname resolution issue you'll need to address, because Docker needs to be able to resolve the container names from within the persistent REST server container. We need to change the hostnames in the card's connection profile to Docker-resolvable hostnames, as they are currently set to localhost values. The example below uses a newly issued 'restadmin' card, created for the purpose of starting the REST server, in the standard 'Developer setup' Composer environment:
Create a REST administrator identity restadmin and an associated business network card (used to launch the REST server later):
composer participant add -c admin#property-network -d '{"$class":"org.hyperledger.composer.system.NetworkAdmin", "participantId":"restadmin"}'
Issue a 'restadmin' identity, mapped to the above participant:
composer identity issue -c admin#property-network -f restadmin.card -u restadmin -a "resource:org.hyperledger.composer.system.NetworkAdmin#restadmin"
Import and test the card:
composer card import -f restadmin.card
composer network ping -c restadmin#property-network
Run this one-liner to carry out the hostname resolution changes easily:
sed -e 's/localhost:/orderer.example.com:/' -e 's/localhost:/peer0.org1.example.com:/' -e 's/localhost:/peer0.org1.example.com:/' -e 's/localhost:/ca.org1.example.com:/' < $HOME/.composer/cards/restadmin#property-network/connection.json > /tmp/connection.json && cp -p /tmp/connection.json $HOME/.composer/cards/restadmin#property-network
Try running the REST server with the card -c restadmin#property-network. If you're following this tutorial https://hyperledger.github.io/composer/latest/integrating/deploying-the-rest-server, you will need to put this card name at the top of your envvars.txt and then run source envvars.txt to set it in your current shell environment.
If you wish to issue further identities, say kcoe below, from the REST client (given you're currently 'restadmin'), you simply do the following (the first two steps can also be done in Playground):
composer participant add -c admin#trade-network -d '{"$class":"org.acme.trading.Trader","tradeId":"trader2", "firstName":"Ken","lastName":"Coe"}'
composer identity issue -c admin#trade-network -f kcoe.card -u kcoe -a "resource:org.acme.trading.Trader#trader2"
composer card import -f kcoe.card # imported to the card store
Next - one-liner to get docker hostname resolution right, from inside the persistent dockerized REST server:
sed -e 's/localhost:/orderer.example.com:/' -e 's/localhost:/peer0.org1.example.com:/' -e 's/localhost:/peer0.org1.example.com:/' -e 's/localhost:/ca.org1.example.com:/' < $HOME/.composer/cards/kcoe#trade-network/connection.json > /tmp/connection.json && cp -p /tmp/connection.json $HOME/.composer/cards/kcoe#trade-network
Start your REST server as per the Deploy REST server doc:
docker run \
-d \
-e COMPOSER_CARD=${COMPOSER_CARD} \
-e COMPOSER_NAMESPACES=${COMPOSER_NAMESPACES} \
-e COMPOSER_AUTHENTICATION=${COMPOSER_AUTHENTICATION} \
-e COMPOSER_MULTIUSER=${COMPOSER_MULTIUSER} \
-e COMPOSER_PROVIDERS="${COMPOSER_PROVIDERS}" \
-e COMPOSER_DATASOURCES="${COMPOSER_DATASOURCES}" \
-v ~/.composer:/home/composer/.composer \
--name rest \
--network composer_default \
-p 3000:3000 \
myorg/my-composer-rest-server
From the System REST API at http://localhost:3000/explorer, go to the POST /wallet/import operation, import the card file kcoe.card with (in this case) the card name set to kcoe#trade-network, and click 'Try it Out' to import it. It should return a successful (204) response.
This card is then set as the default ID in the Wallet via the System REST API endpoint.
(If you need to set any further imported card as the default card in your REST client Wallet, go to the POST /wallet/name/setDefault/ method, choose the card name, and click 'Try it Out'. That card would then be the default.)
Test it out - try getting a list of Traders (trade-network example):
Return to the Trader methods in the REST API client and expand the GET /Trader endpoint, then click 'Try it Out'. It should confirm that we are now using a card in the business network and can interact with the REST server to get a list of the Traders that were added to your business network.
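Once a default card is set, the same check can also be scripted instead of going through the explorer UI. The sketch below is an assumption rather than part of the original walkthrough: the /api/Trader path and the access_token query parameter come from the LoopBack-generated API, and the token value is hypothetical (you would obtain a real one after the GitHub login flow).
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ListTraders {
    public static void main(String[] args) throws Exception {
        // Hypothetical LoopBack access token obtained after authenticating via /auth/github
        String accessToken = "REPLACE_WITH_YOUR_ACCESS_TOKEN";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:3000/api/Trader?access_token=" + accessToken))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode()); // 200 once authenticated with a default card
        System.out.println(response.body());       // JSON array of Trader participants
    }
}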

Nagios Percona Monitoring Plugin

I was reading a blog post on Percona Monitoring Plugins and how you can somehow monitor a Galera cluster using pmp-check-mysql-status plugin. Below is the link to the blog demonstrating that:
https://www.percona.com/blog/2013/10/31/percona-xtradb-cluster-galera-with-percona-monitoring-plugins/
The commands in this tutorial are run on the command line. I wish to use these commands in a Nagios .cfg file, e.g. monitor.cfg. How do I write the services for the commands used in this tutorial?
This was my attempt, and I cannot figure out the best parameters to use for check_command in the service definition. I suspect that is where the problem is.
So inside my /etc/nagios3/conf.d/monitor.cfg file, I have the following:
define host{
    use                  generic-host
    host_name            percona-server
    alias                percona
    address              127.0.0.1
}
## Check for a Primary Cluster
define command{
    command_name         check_mysql_status
    command_line         /usr/lib/nagios/plugins/pmp-check-mysql-status -x wsrep_cluster_status -C == -T str -c non-Primary
}
define service{
    use                  generic-service
    hostgroup_name       mysql-servers
    service_description  Cluster
    check_command        pmp-check-mysql-status!wsrep_cluster_status!==!str!non-Primary
}
When I run Nagios and check the monitoring dashboard, I get this message:
status: UNKNOWN; /usr/lib/nagios/plugins/pmp-check-mysql-status: 31: shift: can't shift that many
Have you verified that
/usr/lib/nagios/plugins/pmp-check-mysql-status -x wsrep_cluster_status -C == -T str -c non-Primary
works fine on the command line on the target host? I suspect there's a shell escaping issue with the ==.
Does this work for you? /usr/lib64/nagios/plugins/pmp-check-mysql-status -x wsrep_flow_control_paused -w 0.1 -c 0.9
