questions about composed processor and sink - spring-cloud-dataflow

I'm using Spring Cloud Data Flow 1.7.2.RELEASE and am trying to follow this blog post
to create "a combination of processor + sink into a single application: 'a new sink'".
When I structured my code like the blog's example I had problems, and I figured it was because the blog's example uses java.util.function.Function, which is akin to a Processor.
I guessed I should use java.util.function.Consumer instead, because I am trying to change my existing Sink into a Processor-Sink hybrid.
My class looks like this:
@EnableBinding(Sink.class)
public class SampleCombinedSink extends Something {

    String modifiedPayload;
    Logger log;

    Consumer<String> consumer = i -> { modifiedPayload = "STUFF ADDED BY CONSUMER i=[" + i + "]"; };

    public void accept(String s) {
        log.info("SampleCombinedSink.accept() s=" + s);
    }

    @StreamListener(Sink.INPUT)
    public void doSink(String payload) {
        consumer.accept(payload);
        log.info("SampleCombinedSink.doSink() Payload received.");
        log.info("SampleCombinedSink.doSink() payload=" + payload);
        log.info("SampleCombinedSink.doSink() modifiedPayload=" + modifiedPayload);
    }
}
My output looks like this:
SampleCombinedSink.doSink() Payload received.
SampleCombinedSink.doSink() payload=Friday 11 January 2019 19:03:53.330+0000
SampleCombinedSink.doSink() modifiedPayload=STUFF ADDED BY CONSUMER i=[Friday 11 January 2019 19:03:53.330+0000]
SampleCombinedSink.doSink() Payload received.
SampleCombinedSink.doSink() payload=Friday 11 January 2019 19:03:54.332+0000
SampleCombinedSink.doSink() modifiedPayload=STUFF ADDED BY CONSUMER i=[Friday 11 January 2019 19:03:54.332+0000]
SampleCombinedSink.doSink() Payload received.
SampleCombinedSink.doSink() payload=Friday 11 January 2019 19:03:55.333+0000
SampleCombinedSink.doSink() modifiedPayload=STUFF ADDED BY CONSUMER i=[Friday 11 January 2019 19:03:55.333+0000]
SampleCombinedSink.doSink() Payload received.
SampleCombinedSink.doSink() payload=Friday 11 January 2019 19:03:56.313+0000
SampleCombinedSink.doSink() modifiedPayload=STUFF ADDED BY CONSUMER i=[Friday 11 January 2019 19:03:56.313+0000]
My Source emits a timestamp every second.
I'm confused by my Consumer.
Consumer<String> consumer = i -> { modifiedPayload="STUFF ADDED BY CONSUMER i=["+i+"]"; };
I thought I'd be able to do something like:
Consumer<String> consumer = i -> { i="STUFF ADDED BY CONSUMER i=["+i+"]"; };
and then payload in
@StreamListener(Sink.INPUT)
public void doSink(String payload) {
would contain "STUFF ADDED BY CONSUMER i=[timestamp]"
It didn't.
I want to modify the input before it reaches doSink by adding "STUFF ADDED BY CONSUMER", so that when it arrives at doSink(String payload), payload contains "STUFF ADDED BY CONSUMER i=[timestamp]".
How do I achieve that?

In this case, you don't have to change anything in your Sink application; instead, just use the same function plumbing on the Sink application side.
For instance, to make
"a combination of processor + sink into a single application: 'a new sink'",
all you need is to have your function beans as part of the Sink application, or even in a separate artifact that is on the Sink application's classpath. Once you have this, you can set the spring.cloud.stream.function.definition property for the Sink application, as in the sketch below.
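A sketch of roughly what such a Sink application could look like (the upper and concat bean names match the composition used below; their bodies and the class name are only illustrative, not copied from the linked sample):

import java.util.function.Function;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Sink;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
@EnableBinding(Sink.class)
public class LogComposedApplication {

    private static final Logger log = LoggerFactory.getLogger(LogComposedApplication.class);

    // Function beans that can be composed in front of the sink's input via
    // spring.cloud.stream.function.definition=upper|concat
    @Bean
    public Function<String, String> upper() {
        return String::toUpperCase;
    }

    @Bean
    public Function<String, String> concat() {
        return s -> "STUFF ADDED BY FUNCTION [" + s + "]";
    }

    // The existing sink listener stays as it is; the composed functions run
    // before the payload reaches it.
    @StreamListener(Sink.INPUT)
    public void doSink(String payload) {
        log.info("doSink() payload={}", payload);
    }

    public static void main(String[] args) {
        SpringApplication.run(LogComposedApplication.class, args);
    }
}

With that in place, the deployment property spring.cloud.stream.function.definition=upper|concat (used in the stream deploy command below) runs both functions over each message before the sink's listener sees it.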
You can see a sample for this here. The app log-composed has the function beans defined.
To run the sample:
dataflow:>app register --name http-transformer --type source --uri file:///Users/igopinathan/dev/git/ilayaperumalg/sandbox/function-composition/http-transformer/target/http-transformer-2.1.0.BUILD-SNAPSHOT.jar
Successfully registered application 'source:http-transformer'
dataflow:>app register --name log-composed --type sink --uri file:///Users/igopinathan/.m2/repository/org/springframework/cloud/stream/app/log-composed/2.1.0.BUILD-SNAPSHOT/log-composed-2.1.0.BUILD-SNAPSHOT.jar
Successfully registered application 'sink:log-composed'
dataflow:>stream create helloComposed --definition "http-transformer --server.port=9001 | log-composed"
Created new stream 'helloComposed'
dataflow:>stream deploy helloComposed --properties "app.log-composed.spring.cloud.stream.function.definition=upper|concat,deployer.*.local.inheritLogging=true"
Deployment request has been sent for stream 'helloComposed'
dataflow:>http post --data "friend" --target "http://localhost:9001"
In the stream deploy command, you can see that the app used for the function composition is log-composed.
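As for why the Consumer in the question doesn't change the payload: Java passes references by value and String is immutable, so i = "STUFF ADDED BY CONSUMER i=[" + i + "]" only reassigns the lambda's local parameter and never touches the payload the caller holds. If the goal is to transform the payload, the natural shape is a Function<String, String> that returns the new value (which is why the blog post uses Function rather than Consumer). A minimal sketch:

Function<String, String> concat = s -> "STUFF ADDED BY CONSUMER i=[" + s + "]";

@StreamListener(Sink.INPUT)
public void doSink(String payload) {
    // apply() returns a new String; the original payload is left unchanged
    String modifiedPayload = concat.apply(payload);
    log.info("doSink() modifiedPayload=" + modifiedPayload);
}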

Related

How to retrieve messages from Alpakka Mqtt Streaming client?

I was following the documentation for writing an MQTT client subscriber using Alpakka:
https://doc.akka.io/docs/alpakka/3.0.4/mqtt-streaming.html?_ga=2.247958340.274298740.1642514263-524322027.1627936487
Given the code below, I'm not sure how I could retrieve/interact with the subscribed messages. Any lead?
Pair<SourceQueueWithComplete<Command>, CompletionStage<Publish>> run =
    Source.<Command>queue(3, OverflowStrategy.fail())
        .via(mqttFlow)
        .collect(
            new JavaPartialFunction<DecodeErrorOrEvent, Publish>() {
              @Override
              public Publish apply(DecodeErrorOrEvent x, boolean isCheck) {
                if (x.getEvent().isPresent() && x.getEvent().get().event() instanceof Publish)
                  return (Publish) x.getEvent().get().event();
                else throw noMatch();
              }
            })
        .toMat(Sink.head(), Keep.both())
        .run(system);

SourceQueueWithComplete<Command> commands = run.first();
commands.offer(new Command<>(new Connect(clientId, ConnectFlags.CleanSession())));
commands.offer(new Command<>(new Subscribe(topic)));

session.tell(
    new Command<>(
        new Publish(
            ControlPacketFlags.RETAIN() | ControlPacketFlags.QoSAtLeastOnceDelivery(),
            topic,
            ByteString.fromString("ohi"))));

// for shutting down properly
commands.complete();
commands.watchCompletion().thenAccept(done -> session.shutdown());
Also, the following example shows how to subscribe as a client, but nothing about how to get messages after the subscription:
https://github.com/pbernet/akka_streams_tutorial/blob/master/src/main/scala/alpakka/mqtt/MqttEcho.scala
I would be grateful if anyone knows the solution or can point to any resource that uses the same connector as an MQTT client and retrieves messages.
The code to retrieve messages for the subscriber is hidden in the client method which is used for both publisher and subscriber:
...
//Only the Publish events are interesting for the subscriber
.collect { case Right(Event(p: Publish, _)) => p }
.wireTap(event => logger.info(s"Client: $connectionId received: ${event.payload.utf8String}"))
.toMat(Sink.ignore)(Keep.both)
.run()
https://github.com/pbernet/akka_streams_tutorial/blob/3e4484c5356e55522366e65e42e1741c18830a18/src/main/scala/alpakka/mqtt/MqttEcho.scala#L136
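If you stay with the Java DSL from the question, a minimal sketch of the same idea (reusing the mqttFlow and queue setup from the question) is to swap Sink.head(), which completes after the first element, for Sink.foreach so every received Publish is processed:

Pair<SourceQueueWithComplete<Command>, CompletionStage<Done>> run =
    Source.<Command>queue(3, OverflowStrategy.fail())
        .via(mqttFlow)
        .collect(
            new JavaPartialFunction<DecodeErrorOrEvent, Publish>() {
              @Override
              public Publish apply(DecodeErrorOrEvent x, boolean isCheck) {
                if (x.getEvent().isPresent() && x.getEvent().get().event() instanceof Publish)
                  return (Publish) x.getEvent().get().event();
                else throw noMatch();
              }
            })
        // Sink.foreach keeps running, so each incoming Publish lands here.
        .toMat(
            Sink.foreach(p -> System.out.println("received: " + p.payload().utf8String())),
            Keep.both())
        .run(system);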
I was struggling with this connector myself and then tried an example based on the Eclipse Paho connector, which in the end looks better:
https://github.com/pbernet/akka_streams_tutorial/blob/3e4484c5356e55522366e65e42e1741c18830a18/src/main/scala/alpakka/mqtt/MqttPahoEcho.scala#L41
Paul

How to properly configure SQS without using SNS topics in MassTransit?

I'm having some issues configuring MassTransit with SQS. My goal is to have N consumers which create N queues and each of them accept a different message type. Since I always have a 1 to 1 consumer to message mapping, I'm not interested in having any sort of fan-out behaviour. So publishing a message of type T should publish it directly to that queue. How exactly would I configure that? This is what I have so far:
services.AddMassTransit(x =>
{
    x.AddConsumers(Assembly.GetEntryAssembly());
    x.UsingAmazonSqs((context, cfg) =>
    {
        cfg.Host("aws", h =>
        {
            h.AccessKey(mtSettings.AccessKey);
            h.SecretKey(mtSettings.SecretKey);
            h.Scope($"{mtSettings.Environment}", true);

            var sqsConfig = new AmazonSQSConfig() { RegionEndpoint = RegionEndpoint.GetBySystemName(mtSettings.Region) };
            h.Config(sqsConfig);

            var snsConfig = new AmazonSimpleNotificationServiceConfig()
                { RegionEndpoint = RegionEndpoint.GetBySystemName(mtSettings.Region) };
            h.Config(snsConfig);
        });

        cfg.ConfigureEndpoints(context, new BusEnvironmentNameFormatter(mtSettings.Environment));
    });
});
The BusEnvironmentNameFormatter class overrides KebabCaseEndpointNameFormatter and adds the environment as a prefix, and the effect is that all the queues start with 'dev', while the h.Scope($"{mtSettings.Environment}", true) line does the same for topics.
I've tried to get this working without configuring topics at all, but I couldn't get it running without errors. What am I missing?
The SQS docs are a bit thin, but is it actually possible to do a bus.Publish() without using SNS topics, or are they necessary? If it's not possible, how would I use bus.Send() without hardcoding queue names in the call?
Cheers!
Publish requires the use of topics, which in the case of SQS uses SNS.
If you want to configure the endpoints yourself, and prevent the use of topics, you'd need to:
Set ConfigureConsumeTopology = false – this prevents topics from being created and connected to the receive endpoint queue.
Set PublishFaults = false – this prevents fault topics from being created when a consumer throws an exception.
Don't call Publish, because that will obviously create a topic.
If you want to somehow establish a convention for your receive endpoint names that aligns with your ability to send messages, you could create your own endpoint name formatter that would use message types and then use those same names to call GetSendEndpoint using the queue:name short name syntax to Send messages directly to those queues.

Failing to create services on Google Cloud Run with API using Java SDK

I created a Cloud Run client; however, I couldn't find a way to list a service that is deployed with Cloud Run on GKE (for Anthos).
Create the client:
try {
    HttpTransport httpTransport = new NetHttpTransport();
    JsonFactory jsonFactory = new JacksonFactory();
    GoogleCredentials credential = GoogleCredentials.getApplicationDefault();
    credential.createScoped("https://www.googleapis.com/auth/cloud-platform");
    HttpRequestInitializer requestInitializer = new HttpCredentialsAdapter(credential);
    CloudRun.Builder builder = new CloudRun.Builder(httpTransport, jsonFactory, requestInitializer);
    return builder.setApplicationName(applicationName)
            .setRootUrl(cloudRunRootUrl)
            .build();
} catch (IOException e) {
    e.printStackTrace();
}
Then I try to list services:
services = cloudRun.namespaces().services()
.list("namespaces/default")
.execute()
.getItems();
My "hello" service is deploy on a GKE cluster under the namespace default. The above code doesn't work because the client always see "default" as project_id and complains about permission stuff. If I put the project_id rather than "default", permission errors are gone, but no services will be found.
I tried another project that does have Google fully-managed cloud run services, the same code returns result (with .list("namespaces/")).
How to access the service on GKE?
And my next question would be, how to programmatically create Cloud Run services on GKE?
Edit - for creating a service
As I couldn't figure out how to interact with Cloud Run on GKE, I took a step back and tried the fully managed one. The following code to create a service fails, and the error message just doesn't provide much useful insight. How do I make it work?
Service deployedService = null;
Service helloService = new Service();

// Map<String,String> annotations = new HashMap<>();
// annotations.put("client.knative.dev/user-image","gcr.io/cloudrun/hello");

ServiceSpec spec = new ServiceSpec();
List<Container> containers = new ArrayList<>();
containers.add(new Container().setImage("gcr.io/cloudrun/hello"));
spec.setTemplate(new RevisionTemplate().setMetadata(new ObjectMeta().setName("hello-fully-managed-v0.1.0"))
        .setSpec(new RevisionSpec().setContainerConcurrency(20)
                .setContainers(containers)
                .setTimeoutSeconds(100)
        )
);

helloService.setApiVersion("serving.knative.dev/v1")
        .setMetadata(new ObjectMeta().setName("hello-fully-managed")
                .setNamespace("data-infrastructure-test-env")
                // .setAnnotations(annotations)
        )
        .setSpec(spec)
        .setKind("Service");

try {
    deployedService = cloudRun.namespaces().services()
            .create("namespaces/data-infrastructure-test-env", helloService)
            .execute();
} catch (IOException e) {
    e.printStackTrace();
    response.add(e.toString());
    return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).body(response);
}
Error message I got:
com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
"code" : 400,
"errors" : [ {
"domain" : "global",
"message" : "The request has errors",
"reason" : "badRequest"
} ],
"message" : "The request has errors",
"status" : "INVALID_ARGUMENT"
}
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:150)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
And the base_url is: https://europe-west1-run.googleapis.com
Your question is quite detailed (and is about Java, which I am no expert in), and there are actually too many questions in there (ideally, please ask only one question here). However, I'll try to answer a few things you asked:
First, Cloud Run (managed, and on GKE) both implement the Knative Serving API. I've explained this at https://ahmet.im/blog/cloud-run-is-a-knative/ In fact, Cloud Run on GKE is just the open source Knative components installed to your cluster.
And my next question would be, how to programmatically create Cloud Run services on GKE?
You will have a very hard time (if possible at all) using the Cloud Run API client libraries (e.g. new CloudRun above) because these are designed for *.googleapis.com endpoints.
The Knative API part of "Cloud Run on GKE" is actually just your Kubernetes (GKE) master API endpoint (which runs on an IP address, with a TLS certificate that isn't trusted by root CAs, but you can find the CA cert via the GKE GetCluster API call to verify the cert). The TLS part is why it's so hard to use the API client libraries.
Knative APIs are just Kubernetes objects. So your best bet is one of these:
Check whether the Kubernetes Java client (https://github.com/kubernetes-client/java) allows dynamic objects (the Go implementation does) and try to use it to create the Knative custom resources; see the sketch after this list.
Use kubectl apply.
Ask the Knative Serving open source repository for help (they should be providing client libraries; maybe they already exist, I'm not sure).
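For the first option, a rough, untested sketch (assuming the CustomObjectsApi from kubernetes-client/java; the exact method signature varies between client versions) of creating a Knative Service as a custom object:

import io.kubernetes.client.openapi.ApiClient;
import io.kubernetes.client.openapi.apis.CustomObjectsApi;
import io.kubernetes.client.util.Config;

import java.util.List;
import java.util.Map;

public class CreateKnativeService {
    public static void main(String[] args) throws Exception {
        // Uses the local kubeconfig, e.g. after `gcloud container clusters get-credentials`.
        ApiClient client = Config.defaultClient();
        CustomObjectsApi api = new CustomObjectsApi(client);

        // A Knative Service expressed as a plain map; image and names are examples,
        // and the apiVersion must match what your Knative installation serves
        // (older installs only serve serving.knative.dev/v1alpha1).
        Map<String, Object> service = Map.of(
            "apiVersion", "serving.knative.dev/v1",
            "kind", "Service",
            "metadata", Map.of("name", "hello", "namespace", "default"),
            "spec", Map.of("template", Map.of("spec", Map.of(
                "containers", List.of(Map.of("image", "gcr.io/cloudrun/hello"))))));

        // Older client versions use this 6-argument form; newer ones add
        // dryRun/fieldManager parameters, so check the signature for your version.
        api.createNamespacedCustomObject(
            "serving.knative.dev", "v1", "default", "services", service, null);
    }
}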
To program Cloud Run (managed) with the API Client Libraries, you need to explicitly override the API endpoint to the region e.g. us-central1-run.googleapis.com. (This is documented on each API call's REST API reference documentation.)
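For example (a sketch reusing the builder from the question; the region is just an illustration):

CloudRun cloudRun = new CloudRun.Builder(httpTransport, jsonFactory, requestInitializer)
        .setApplicationName(applicationName)
        // Regional endpoint; use the region your services are deployed in.
        .setRootUrl("https://us-central1-run.googleapis.com")
        .build();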
I have written a blog post in detail (with sample code in Go) on how to create/update services on Cloud Run (managed) using the Knative Serving API here: https://ahmet.im/blog/gcloud-run-deploy/
If you want to see how gcloud run deploy works, and which APIs it calls, you can pass --log-http option to observe the request/response traffic.
As for the error you got, the message indeed isn't helpful, but it might be coming from anywhere (as you're trying to imitate the Knative API through the GCP client libraries). I recommend reading my blog posts and sample code in depth.
UPDATES: Our engineering team is looking at the issue; it appears that there's currently a bug that omits the "details" field from the error. That's being worked on.
In your case, we see the following errors from requests:
field: "spec.template.spec"
description: "Missing template spec."
This means you are not properly filling in the spec field, as shown in my blog post and sample code.
field: "metadata.name"
description: "The revision name must be prefixed by the name of the enclosing Service or Configuration with a trailing -"
Make sure the name you are specifying adheres to the patterns specified in the API docs. Try to create that name manually, perhaps in the UI or with the gcloud CLI.
field: "api_version"
description: "Unsupported API version \'serving.knative.dev/v1\'. Expected \'serving.knative.dev/v1alpha1\'"
Do not use v1alpha1 API, use v1 directly.
We'll try to get the details to the error message, however it appears that you need to study the sample code I linked in my blog post more in detail:
https://github.com/GoogleCloudPlatform/cloud-run-button/blob/a52c7fbaae33a3e06c112206c7227a0ef9649647/cmd/cloudshell_open/deploy.go#L26-L112
The Java SDK is automatically generated from the fact that the Cloud Run (fully managed) API is public. It does not support Cloud Run for Anthos.
Regarding the error "(gcloud.run.deploy) The revision name must be prefixed by the name of the enclosing Service or Configuration with a trailing -":
The revision name is the combination of the service name and a revision suffix, and it is created automatically by GCP. In an automation pipeline, keep the revision suffix short enough that the combined revision name stays within 65 characters; once it does, the problem is resolved.

Google dataflow 2.0 pubsub handler late data

I have an issue regarding Google Dataflow.
I'm writing a Dataflow pipeline which reads data from Pub/Sub and writes to BigQuery, and it works.
Now I have to handle late data. I was following some examples on the internet, but it's not working properly. Here is my code:
pipeline.apply(PubsubIO.readStrings()
.withTimestampAttribute("timestamp").fromSubscription(Constants.SUBSCRIBER))
.apply(ParDo.of(new ParseEventFn()))
.apply(Window.<Entity> into(FixedWindows.of(WINDOW_SIZE))
// processing of late data.
.triggering(
AfterWatermark
.pastEndOfWindow()
.withEarlyFirings(
AfterProcessingTime
.pastFirstElementInPane()
.plusDelayOf(DELAY_SIZE))
.withLateFirings(AfterPane.elementCountAtLeast(1)))
.withAllowedLateness(ALLOW_LATE_SIZE)
.accumulatingFiredPanes())
.apply(ParDo.of(new ParseTableRow()))
.apply("Write to BQ", BigQueryIO.<TableRow>write()...
Here is my pubsub message:
{
...,
"timestamp" : "2015-08-31T09:52:25.005Z"
}
When I manually push some messages (going to the Pub/Sub topic and publishing) with a timestamp far older than the allowed lateness (ALLOW_LATE_SIZE), these messages are still passed through.
You should specify the allowed lateness formally using the "Duration" object as: .withAllowedLateness(Duration.standardMinutes(ALLOW_LATE_SIZE)), assuming you have set the value of ALLOW_LATE_SIZE in minutes.
You may check the documentation page for "Google Cloud Dataflow SDK for Java", specifically the "Triggers" sub-chapter.
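A sketch of how that could look in the pipeline above (the window and delay values are only examples):

.apply(Window.<Entity>into(FixedWindows.of(Duration.standardMinutes(5)))
    .triggering(
        AfterWatermark.pastEndOfWindow()
            .withEarlyFirings(
                AfterProcessingTime.pastFirstElementInPane()
                    .plusDelayOf(Duration.standardMinutes(1)))
            .withLateFirings(AfterPane.elementCountAtLeast(1)))
    // org.joda.time.Duration, as used throughout the Dataflow/Beam SDK
    .withAllowedLateness(Duration.standardMinutes(30))
    .accumulatingFiredPanes())

Also note that late-data dropping is driven by the watermark the pipeline has observed for the subscription, not by wall-clock time, so messages published by hand with very old timestamps may still be accepted if the watermark has not yet advanced past them by more than the allowed lateness.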

Test Webhook at localhost in braintree

I am working on Braintree and I want to send custom email notifications to my customers. Since I am working with recurring billing, these custom notifications should be sent to all users every month. For this I have to use webhooks to retrieve the event that just occurred and then send an email notification according to the webhook's response. (I think this is the only solution in this case; if anyone knows another possible solution, please suggest it.) I want to test the webhooks on my localhost first, so I tried to create a new webhook and specified the localhost path as the destination to receive webhooks, but this shows the error "Destination is not verified".
My path is : "http://127.0.0.1:81/webhook/Accept"
These are some of the tools that can be used during development of webhooks :
1) PostCatcher,
2) RequestBin,
3) ngrok,
4) PageKite and
5) LocalTunnel
http://telerivet.com/help/api/webhook/testing
https://www.twilio.com/blog/2013/10/test-your-webhooks-locally-with-ngrok.html
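For example, with ngrok you would typically run something like ngrok http 81 (matching the local port your endpoint listens on) and then register the generated public https URL, rather than http://127.0.0.1:81/..., as the webhook destination in the Braintree control panel.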
Well, another way to test it is by creating a Web API and POSTing data to your POST method via Postman. To do this, just create a Web API in Visual Studio. In the API controller, create a POST method.
/// <summary>
/// Web API POST method for Braintree Webhook request
/// The data is passed through HTTP POST request.
/// A sample data set is present in POSTMAN HTTP Body
/// /api/webhook
/// </summary>
/// <param name="BTRequest">Data from HTTP request body</param>
/// <returns>Webhook notification object</returns>
public WebhookNotification Post([FromBody]Dictionary<String, String> BTRequest)
{
WebhookNotification webhook = gateway.WebhookNotification.Parse(BTRequest["bt_signature"], BTRequest["bt_payload"]);
return webhook;
}
In Postman, Post the following data in the Body as raw JSON.
{
"bt_signature":"Generated Data",
"bt_payload":"Very long generated data"
}
The data for the above Json dictionary has been generated through the below code:
Dictionary<String, String> sampleNotification = gateway.WebhookTesting.SampleNotification(WebhookKind.DISPUTE_OPENED, "my_Test_id");
// Your Webhook kind and your test ID
Just pick the data from the sample notification and place it in the JSON above. Run your Web API and set breakpoints. Add the localhost URL in Postman, select POST, and click Send.
Your POST method should be hit.
Also, don't forget to add your gateway details:
private BraintreeGateway gateway = new BraintreeGateway
{
Environment = Braintree.Environment.SANDBOX,
MerchantId = "Your Merchant Key",
PublicKey = "Your Public Key",
PrivateKey = "Your Private Key"
};
I hope this helps!
I work at Braintree. If you need more help, please get in touch with our support team.
In order to test webhooks, your app needs to be reachable by the Braintree gateway, and a localhost address isn't. Try using your external IP address and make sure the port on the correct computer can be reached from the internet.
Take a look at the Braintree webhook guide for more info on setting up webhooks.
You can use PutsReq to simulate the response you want and do your end-to-end test in development.
For quick 'n dirty testing:
http://requestb.in/
For more formal testing (e.g. continuous integration):
https://www.runscope.com/
If you have an online server, you can forward a port from your computer to that server:
ssh -nNT -R 9090:localhost:3000 root@yourvds.com
Then specify the webhook as http://yourvds.com:9090/webhook.
All requests will be forwarded to your machine, and you will be able to see the logs.
I know this is an old question, but according to the docs, you can use this code to test your webhook code:
Dictionary<String, String> sampleNotification = gateway.WebhookTesting.SampleNotification(
WebhookKind.SUBSCRIPTION_WENT_PAST_DUE, "my_id"
);
WebhookNotification webhookNotification = gateway.WebhookNotification.Parse(
sampleNotification["bt_signature"],
sampleNotification["bt_payload"]
);
webhookNotification.Subscription.Id;
// "my_id"
You can use the Svix CLI Listener: https://github.com/svix/svix-cli#using-the-listen-command
This will allow you to easily channel requests to your public endpoint to a local port where you can run your logic against and debug it on your localhost.
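For example, something like svix listen http://localhost:81/webhook/Accept would give you a public URL that relays incoming webhooks to your local endpoint.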
