ksqlDB REST API push query - reconnect to last consumed offset + 1

Is it possible to reconnect to the same push query after the connection was lost?
There is a queryId for terminating a query. Is there a similar mechanism that could be used for reconnections via the ksqlDB REST API?
If the subscription stopped at offset 5, then after reconnecting it should receive the new messages after offset 5 (without skipping any of them), but without replaying the whole topic from the beginning again.
The ideal would be exactly-once processing for all messages (records), but at-least-once semantics are a must-have in a lot of scenarios.
"POST" "http://<ksqldb-host-name>:8088/query-stream"
{
"sql": "SELECT * FROM Movies EMIT CHANGES;",
"properties":{"auto.offset.reset":"earliest"}
}
Header response with the queryId:
{"queryId":"f91157c7-cd12-407e-a173-5a4cbc398259","columnNames":["TITLE","ID","RELEASE_YEAR"],"columnTypes":["STRING","INTEGER","INTEGER"]}
What I'm probably missing is a programmatic way of committing the offsets.
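For reference, a minimal Java sketch of issuing this push query over the HTTP/2 API. The host name, stream name, and content type are assumptions based on the ksqlDB HTTP docs; this reproduces the setup above but does not by itself solve the offset problem:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.stream.Stream;

public class PushQueryExample {
    public static void main(String[] args) throws Exception {
        // The /query-stream endpoint requires HTTP/2.
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .build();

        String body = "{\"sql\": \"SELECT * FROM Movies EMIT CHANGES;\","
                + " \"properties\": {\"auto.offset.reset\": \"earliest\"}}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8088/query-stream"))
                // Content type per the ksqlDB HTTP API docs; adjust for your version.
                .header("Content-Type", "application/vnd.ksqlapi.delimited.v1")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        // The first line is the header object (queryId, columnNames, columnTypes);
        // each subsequent line is one row. The stream stays open until terminated.
        HttpResponse<Stream<String>> response =
                client.send(request, HttpResponse.BodyHandlers.ofLines());
        response.body().forEach(System.out::println);
    }
}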

Related

CAN bus sending data from two masters with equal balance

I have two master nodes connected to the same CAN bus; both send data to my PC.
First master ID = 0xFFA1
Second master ID = 0xFFA2
Since the first master's ID is lower than the second's, it takes control of the bus more often than the second master, and this causes some delay in the data.
Is there a way to load-balance the two nodes so that each node sends an almost equal number of messages?
I tried making the first node send data while alternating between two IDs, 0xFFA1 and 0xFFB2,
and the second node send data with ID 0xFFB1. It didn't help.
There is no such thing as "masters" in CAN, nor in higher-layer protocols like CANopen for that matter (a "master" in CANopen is just a supervisor node). Who gets to send what is defined by the CAN identifiers - CAN primarily focuses on data, not nodes. What matters is what is sent, rather than who is sending/receiving, since every message is broadcast.
It sounds as if you have 2 nodes that wildly spam the bus with identifier 0xFFA1 and 0xFFA2 messages, as fast as they are able, leading to 100% bus load. Then the node sending 0xFFA2 will "starve". Sending data "as fast as you are able" is never the correct way to use CAN.
Instead you need to define a higher layer protocol that dictates real-time characteristics. In control systems, this is most commonly done by having nodes send data at fixed intervals, such as once per 10ms or 100ms. This alone should fix your starvation problem.
If you want to prevent nodes from sending at the same time, then you could provide a means for them to synchronize. A trick used in CANopen and other protocols is to have one node send out a "sync" message at fixed time intervals.
After receiving this sync message, all nodes should act within x ms of receiving it.
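As a sketch of the fixed-interval approach: each node sends on its own schedule instead of as fast as it can. The CanBus interface below is hypothetical; real code would use your CAN driver's API:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PeriodicCanSender {
    // Hypothetical CAN driver interface; substitute your adapter's API.
    interface CanBus {
        void send(int id, byte[] data);
    }

    public static void main(String[] args) {
        CanBus bus = (id, data) -> { /* real driver call goes here */ };
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        // Send this node's data every 10 ms instead of as fast as possible.
        // With every node on a fixed period, total bus load is bounded and
        // the lower-priority node can no longer be starved.
        scheduler.scheduleAtFixedRate(
                () -> bus.send(0xFFA1, readSensorData()),
                0, 10, TimeUnit.MILLISECONDS);
    }

    static byte[] readSensorData() {
        return new byte[8]; // placeholder payload
    }
}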

Throttle Apache Storm Spout Dynamically

I have a topology where a spout reads data from Kafka and sends it to a bolt, which in turn calls REST API (A), which calls another REST API (B). Until now API B did not have throttling; they have now implemented throttling (a maximum of x calls per clock minute).
We need to implement the throttling handler.
Option A
Initially we were thinking of doing it at the REST API (A) level by adding a Thread.sleep(x millis) once a call is throttled by REST API (B), but that would hold all the REST (A) calls waiting for that long (anywhere between 1 and 59 seconds), and that may increase the load from new calls coming in.
Option B
REST API (A) sends a response back to the bolt indicating that it was throttled. The bolt notifies the spout of the processing failure in order to:
- not advance the offset for those messages, and
- tell the spout to stop reading from Kafka and stop emitting messages to the bolt.
The spout waits for some time (say a minute) and resumes from where it left off.
Option A is straightforward to implement but not a good solution in my opinion.
I am trying to figure out whether Option B is feasible with topology.max.spout.pending; however, I don't see how to dynamically tell Storm to throttle the spout. Can anyone share some thoughts on this option?
Option C
REST API (B) throttles the call from REST API (A), which does not handle it itself but passes the 429 response code back to the bolt. The bolt re-queues the message to an error topic that is part of another Storm topology. The message can carry a retry count, and if the same message gets throttled again we re-queue it with an incremented retry count.
Updating the post, as I found a solution that makes Option B feasible.
Option D
https://github.com/apache/storm/blob/master/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutRetryExponentialBackoff.java
/**
 * The time stamp of the next retry is scheduled according to the exponential backoff formula (geometric progression):
 * nextRetry = failCount == 1 ? currentTime + initialDelay : currentTime + delayPeriod^(failCount-1),
 * where failCount = 1, 2, 3, ... nextRetry = Min(nextRetry, currentTime + maxDelay).
 * <p/>
 * By specifying a value for maxRetries lower than Integer.MAX_VALUE, the user decides to sacrifice guarantee of delivery for the
 * previous polled records in favor of processing more records.
 *
 * @param initialDelay initial delay of the first retry
 * @param delayPeriod  the time interval that is the ratio of the exponential backoff formula (geometric progression)
 * @param maxRetries   maximum number of times a tuple is retried before being acked and scheduled for commit
 * @param maxDelay     maximum amount of time waiting before retrying
 */
public KafkaSpoutRetryExponentialBackoff(TimeInterval initialDelay, TimeInterval delayPeriod, int maxRetries, TimeInterval maxDelay) {
    this.initialDelay = initialDelay;
    this.delayPeriod = delayPeriod;
    this.maxRetries = maxRetries;
    this.maxDelay = maxDelay;
    LOG.debug("Instantiated {}", this.toStringImpl());
}
The steps will be as follows (see the sketch after this list):
1. Create a KafkaSpoutRetryService using the above constructor.
2. Set the retry service on the KafkaSpoutConfig using KafkaSpoutConfig.builder(kafkaBootStrapServers, topic).setRetry(kafkaSpoutRetryService).
3. Fail the tuple in the bolt when REST API (B) throttles it, using collector.fail(tuple), which signals the spout to process the tuple again based on the retry configuration set up in steps 1 and 2.
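To make steps 1 and 2 concrete, a minimal sketch using the storm-kafka-client API (the delay values and names are illustrative, not prescribed):

import org.apache.storm.kafka.spout.KafkaSpoutConfig;
import org.apache.storm.kafka.spout.KafkaSpoutRetryExponentialBackoff;
import org.apache.storm.kafka.spout.KafkaSpoutRetryExponentialBackoff.TimeInterval;
import org.apache.storm.kafka.spout.KafkaSpoutRetryService;

public class ThrottledSpoutConfig {
    public static KafkaSpoutConfig<String, String> build(String bootstrapServers, String topic) {
        // Retry failed tuples after 1 s, backing off geometrically up to 60 s,
        // for at most 10 attempts before the tuple is acked and its offset committed.
        KafkaSpoutRetryService retryService = new KafkaSpoutRetryExponentialBackoff(
                TimeInterval.seconds(1),   // initialDelay
                TimeInterval.seconds(2),   // delayPeriod (ratio of the progression)
                10,                        // maxRetries
                TimeInterval.seconds(60)); // maxDelay

        return KafkaSpoutConfig.builder(bootstrapServers, topic)
                .setRetry(retryService)
                .build();
    }
}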
Your Option D sounds fine, but in the interest of avoiding duplicate calls to API A, I think you should consider splitting your topology in two.
Have a topology that reads from your original Kafka topic (call it topic 1), calls REST API A, and writes whatever the output of the bolt is back to a Kafka topic (call it topic 2).
You then make a second topology whose only job is to read from topic 2, and call REST API B.
This will allow you to use option D while avoiding extra calls to API A when you are saturating API B. Your topologies will look like
Kafka 1 -> Bolt A -> REST API A -> Kafka 2
Kafka 2 -> Bolt B -> REST API B
If you want to make the solution a little more responsive to the throttling, you can use the topology.max.spout.pending configuration in Storm to limit how many tuples can be in-flight at the same time. You could then make your bolt B buffer in-flight tuples until the throttling expires, at which point you can make it try sending the tuples again. You can use OutputCollector.resetTupleTimeout to avoid the tuples timing out while Bolt B is waiting for the throttling to expire. You can use tick tuples to make Bolt B periodically wake up and check whether the throttling has expired.
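A sketch of such a buffering Bolt B, assuming Storm 2.x APIs and using tick tuples plus OutputCollector's resetTimeout as described above; callRestApiB is a placeholder for the real HTTP call:

import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import org.apache.storm.Config;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.utils.TupleUtils;

public class BufferingApiBolt extends BaseRichBolt {
    private OutputCollector collector;
    private final Queue<Tuple> buffered = new ArrayDeque<>();

    @Override
    public void prepare(Map<String, Object> topoConf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        // Ask Storm to deliver a tick tuple to this bolt every 5 seconds.
        Map<String, Object> conf = new HashMap<>();
        conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 5);
        return conf;
    }

    @Override
    public void execute(Tuple tuple) {
        if (TupleUtils.isTick(tuple)) {
            // On every tick, retry the buffered tuples; the throttling
            // window may have expired by now.
            int pending = buffered.size();
            for (int i = 0; i < pending; i++) {
                trySend(buffered.poll());
            }
            return;
        }
        trySend(tuple);
    }

    private void trySend(Tuple tuple) {
        if (callRestApiB(tuple) == 429) {
            // Throttled: keep the tuple from timing out and park it.
            collector.resetTimeout(tuple);
            buffered.add(tuple);
        } else {
            collector.ack(tuple);
        }
    }

    // Placeholder for the real HTTP call to REST API (B); returns the status code.
    private int callRestApiB(Tuple tuple) {
        return 200;
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Terminal bolt: no output streams.
    }
}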

Control rate of individual topic consumption in Kafka Streams 0.9.1.0-cp1?

I am trying to backprocess data in Kafka topics using a Kafka Streams application that involves a join. One of the streams to be joined has a much larger volume of data per unit of time in the corresponding topic. I would like to control the consumption from the individual topics so that I get roughly the same event timestamps from each topic in a single consumer.poll(). However, there doesn't appear to be any way to control the behavior of the KafkaConsumer backing the source stream. Is there any way around this? Any insight would be appreciated.
Currently Kafka itself cannot enforce a rate limit on either producers or consumers.
Refer to KIP-13 (Quotas):
https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas
But if you are using Apache Spark as the stream processing platform, you can limit the input rate for the Kafka receivers.
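For example, with Spark Streaming's direct Kafka integration the input rate can be capped per partition (a sketch; the value is illustrative):

import org.apache.spark.SparkConf;

public class RateLimitedSparkJob {
    public static void main(String[] args) {
        // Caps each Kafka partition at 1000 records per second for the
        // Spark Streaming direct Kafka integration.
        SparkConf conf = new SparkConf()
                .setAppName("kafka-backprocessing")
                .set("spark.streaming.kafka.maxRatePerPartition", "1000");
        // ... build a StreamingContext and the direct Kafka stream from conf
    }
}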
On the consumer side you can use the consume([num_messages=1][, timeout=-1]) function of the confluent-kafka-python consumer instead of poll.
consume([num_messages=1][, timeout=-1]):
Consumes a list of messages (possibly empty on timeout). Callbacks may be executed as a side effect of calling this method.
The application must check the returned Message object's Message.error() method to distinguish between proper messages (error() returns None) and errors for each Message in the list (see error().code() for specifics). If the enable.partition.eof configuration property is set to True, partition EOF events will also be exposed as Messages with error().code() set to _PARTITION_EOF.
- num_messages (int) - the maximum number of messages to return (default: 1).
- timeout (float) - the maximum time in seconds to block waiting for a message, event or callback (default: -1, i.e. infinite).
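The rough equivalent in the Java consumer is the max.poll.records config, which caps how many records a single poll() returns. A sketch (note that max.poll.records requires a 0.10+ client, newer than the 0.9.x version in the question; topic and group names are placeholders):

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class CappedConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "backprocessing");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Cap how many records a single poll() may return: the Java-client
        // counterpart of consume(num_messages=...) in confluent-kafka-python.
        props.put("max.poll.records", "100");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("large-volume-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                records.forEach(r -> System.out.printf("%s@%d%n", r.topic(), r.offset()));
            }
        }
    }
}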

Erlang/OTP How to notify parent process that child processes are idle and no messages in their mailbox

I would like to design a process hierarchy where there is a parent process P which acts like a gatekeeper and delegates the work (messages/events from its client processes) to its child processes C1, C2, ..., Cn, which collaborate with each other and may send the result back to P. The child processes cannot talk to any process outside, only P.
The challenge is that though P may have multiple messages from its clients, it should accept only one message, delegate the work to C1..Cn, and ONLY accept the next message from its clients when all its child processes are done (or idle) and there are no more messages circulating between C1 to Cn.
P finishes accepting messages from C1..Cn so that it can return the result to its clients.
Constraints:
Idle for me is when they are waiting in a receive (blocking) or have simply exited.
C1 to Cn are finite state machines. Some or all of them may send messages back to P. Or there may be no messages to send back to P. Even if no messages are sent back to P, P has to figure out that all of them are done with no messages between them.
If any of C1 to Cn have been pre-empted, then it is considered busy (this may be obvious but I thought I'd put it here for completeness) and P will not receive the next message.
Is there an OTP pattern or library which will do this for me (before I hack something together)? I know that process_info can tell me whether the mailbox of a process is empty, and I could keep checking the children's mailboxes from P, but it would be unnecessary polling from P.
EDIT GENERAL: I am trying to implement a reactive variant of Flow-Based Programming on the Erlang platform. This has the notion of 'hierarchical processes' or composites, which themselves may contain composite processes until we reach some boxes of actual code... I am going to do more research (looking at monitor, process_info, process_flag), but I wanted to respond to your excellent answers.
EDIT RECURSIVE PARENTS: Each of C1..Cn can themselves be parent/composite processes. If I just spawn processes and let them exit immediately, I'll have to create the chain of composites every time, as C1..Cn may themselves be composites (which spawn composites... and so on). Finally, when we reach a leaf box (which is not a composite node), they are supposed to be finite state machines, so I'm not sure about spawning them and making them exit quickly if they are FSMs.
EDIT TKOWAL: Since I am trying to create a generic parent/composite process, it does not know 'when' the task ends. All it does is relay the messages it receives from its children to its siblings, with the 'constraint' that it will not accept the next message from its clients/siblings until its children are 'done'. The children C1..Cn may send not just one but many messages. I understand from your proposal that wait_for_task_finish will stop blocking the moment it gets the first message. But more messages may also be emitted by P's children. P should wait for all messages. Also, having a task_end symbol will not work for the same reason (i.e., multiple messages are possible from the children).
Given how inexpensive it is to start up Erlang processes, your gatekeeper could start new children for each incoming task, and then wait for them all to exit normally once they complete their work.
But in general, it sounds like you're looking for a process pool. There are a few of these already available, such as poolboy and sidejob. Pools can be harder to get right than you think, so I advise using an existing proven pool implementation before attempting to write your own.
After the edits this became an entirely different question, so I am posting a new answer.
If you are trying to do Flow-Based Programming, then you are probably solving the wrong problem. FBP is effective because almost everything is asynchronous and you start processing the next request immediately after you finish with the previous one.
So, the answer is - don't wait for children to finish:
In FBP, there is no time dependencies between the components. So if I
have a chunk of data, it should be able to flow from one end of the
diagram to the other regardless of how any other pieces of data are
being handled. In order to program an FBP system, you have to minimize
your dependencies.
source
When creating the parent and children, you know all the connections between blocks, so just configure the children to send processed data directly to the next block. For example: P1 has children C1 and C2. You send a message to P1, it delegates it to C1, the packet flows a couple of times between C1 and C2, and after that C1 or C2 sends it directly to P2.
Blocks should be stateless. Their output should not depend on previous requests, so even if C1 and C2 are processing data from two different requests to P1 - that is OK. There could be situations where P1 gets data packet D1 and then D2, but outputs the answers in a different order, R2 and then R1. That is also OK. You can use an Erlang reference to tag messages and then check which response belongs to which request.
I don't think there is a ready-made library for that, but it is really easy to hack, unless I missed something. Your P process should look like this:
ready_for_next_task() ->
    receive
        {task, Task, CallerPid} ->
            send_task_to_workers(Task)
    end,
    wait_for_task_finish(CallerPid).

wait_for_task_finish(CallerPid) ->
    receive
        {task_end, Response} ->
            CallerPid ! Response
    end,
    ready_for_next_task().
In wait_for_task_finish/1 you have only one clause in the receive, so it will not accept the next task until the current one is finished. If you are waiting for multiple responses from the workers, you can simply add a second clause to the receive with some partial response and call wait_for_task_finish/1 recursively.
It is always better to have some indicator that the processing has ended, because you have no guarantees on message delivery time. Without one, you could observe that all processes are currently waiting for a message and conclude that they have finished processing, when actually they have not started yet, or one of them has sent a message to another and you caught them before the second one had it in its mailbox.
If the processes C1..Cn each handle only part of the actual work and don't know about the overall progress, then the gatekeeper P should know how many parts there were, receive all of them one by one, and then call ready_for_next_task/0, as in the sketch below.
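A minimal sketch of that counting variant (my own illustration, not from the original answer; it assumes P learns the expected number of parts when it dispatches the task):

%% Wait until all the partial responses have arrived, then reply
%% to the caller and go back to accepting the next task.
wait_for_task_finish(CallerPid, 0, Acc) ->
    CallerPid ! {result, lists:reverse(Acc)},
    ready_for_next_task();
wait_for_task_finish(CallerPid, PartsLeft, Acc) ->
    receive
        {partial, Part} ->
            wait_for_task_finish(CallerPid, PartsLeft - 1, [Part | Acc])
    end.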

Missing master heartbeat does not cause node to react in a CANopen system

I have a strange finding about the heartbeat-protocol in CANopen. Maybe somebody else has seen something like this and maybe it is supposed to work like this... Anyway, here's what it's about:
In CANopen there are two timeout-based life-guarding mechanisms: the first is node guarding, which I will not mention further, since it's considered old news.
The other one is called heartbeat. It is pretty simple: Any participant on the network sends a regular message stating its node ID and its state. The frequency is defined by object 0x1017sub0 and is called heartbeat-producer-time. If it is set to zero, no heartbeat is being sent.
Any other participant can then define a number of nodes it wants to find on the network plus the maximum time there may be between two consecutive heartbeat-messages. This information is stored in object 0x1016sub1..n as 32-bit entries for as many nodes as this particular node wants to listen to.
The entries consist of the node ID (bits 22 to 16) and the mentioned maximum time that may elapse between heartbeats, called the heartbeat-consumer-time (in bits 15..0). Again, if the entry is zero, it is ignored.
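For illustration, packing such a 0x1016 entry for node 1 with a 2000 ms consumer time yields exactly the value used below (the helper is hypothetical, not part of any particular CANopen stack):

public class HeartbeatConsumerEntry {
    // Packs a 0x1016 sub-entry: node ID in bits 22..16, time in ms in bits 15..0.
    static int pack(int nodeId, int timeMs) {
        return ((nodeId & 0x7F) << 16) | (timeMs & 0xFFFF);
    }

    public static void main(String[] args) {
        System.out.printf("0x%08X%n", pack(1, 2000)); // prints 0x000107D0
    }
}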
As you may have gathered, there is no distinction between network-master (node ID 1) and slaves (node IDs 2 to 127).
So much for the theory; now for my problem:
I configure one of the slave-nodes in my network as a heartbeat-consumer for the master, so there's an entry in object 0x1016sub1 that looks like this: 0x000107D0. Meaning that a heartbeat-message from the master is expected after at least two seconds.
I have observed that this works in two examples. If I send a master-heartbeat for a time and then stop, the node either returns to pre-operational mode or sends an appropriate emergency-message.
If I don't send any master-heartbeat-messages, I would expect that after I start the node (send it into operational mode) it takes at most two seconds for the node to either return to pre-operational mode or send an appropriate emergency-message or perhaps even both. But in the two examples I tried, nothing happened. If I never send any heartbeat, the node never expects one and just keeps on running.
The two examples are very different from each other; I am not sure whether they perhaps use the same CANopen stack library.
Is there an explanation?
If you read the CANopen User Manual, section 1.3.1.6, page 39, you will notice that the heartbeat consumer is first activated upon receiving a heartbeat from the producer. I would assume then that, since in your example the first heartbeat is never sent, the consumer is never activated.
