Telemetry data breaching the threshold should generate multiple mails until the telemetry data comes back under the threshold in ThingsBoard - thingsboard

This is my rule chain (https://i.stack.imgur.com/VVkmh.png).
What I need to do is: after the threshold is breached, the first mail needs to be generated after 30 minutes, and subsequent mails need to keep being generated until the telemetry data comes back under the threshold.
How can we set a time duration between the first mail and the following mails?

Related

send multiple (key value) in one transmission (all SAME sensor)?

I'm designing for 100s of IoT devices. Each broadcasts once a day, sending 20 data points, e.g. ALL these readings are from ONE gas meter over 24 hours. This is NOT telemetry from various sensors: one device, one sensor, 24 h of data, all 20 readings sent in one shot, all at once, in one STRING. Is there a way to split this 20-value set into TWENTY MQTT messages? We don't have the bandwidth nor the battery power to send 20 messages when one will do.
Thanks!
The payload of an MQTT message is just a collection of bytes; you can pack any data structure you want into it.
If you want to send one message a day with something like a CSV list of readings, then that is totally up to you.
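If the goal is still to end up with twenty individual messages downstream, one option is to do the splitting on the server side, where bandwidth and battery no longer matter. Below is a minimal sketch with paho-mqtt, assuming a comma-separated payload; the broker address and topic names are placeholders.

import paho.mqtt.client as mqtt

BROKER = 'broker.example.com'           # placeholder broker
RAW_TOPIC = 'meters/gas-001/daily'      # one CSV string per day arrives here
SPLIT_TOPIC = 'meters/gas-001/reading'  # individual readings are re-published here

def on_message(client, userdata, msg):
    # The daily payload is one string of 20 comma-separated readings.
    readings = msg.payload.decode('utf-8').split(',')
    for index, value in enumerate(readings):
        # Re-publish each reading as its own message.
        client.publish(SPLIT_TOPIC, '{};{}'.format(index, value), qos=1)

client = mqtt.Client()  # paho-mqtt 1.x style constructor; 2.x also takes a CallbackAPIVersion argument
client.on_message = on_message
client.connect(BROKER)
client.subscribe(RAW_TOPIC, qos=1)
client.loop_forever()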

mrtg show average when counter data missing

I'm graphing my power meter with an old laptop in my barn.
It sends data using MQTT to MRTG (Cacti).
Lately this laptop has begun to lock up when playing Spotify.
This is a separate issue.
However, when I reboot, all the power used in the meantime is shown as being used in a single time period, giving a huge spike, so the rest of the data is hardly visible.
Is it possible, when the data finally arrives, to interpolate it across all the missing data points?
The laptop sending data was down between Sat 18:00 and Sun 11:00 approx, but of course the real power meter keeps running.
I'd rather have a straight line between the two data points; it is still a loss of data, but it is more true than a spike.
Edit: Complication: as Cacti reads the data asynchronously from MQTT, it keeps getting the latest count even if the data is stale.
I guess I need to get my MQTT->Cacti interface to send NaN or U if the timestamp of the data has not changed.
You have 2 options.
Add a timestamp to the message; that way you can rebuild the data as the queued messages are delivered when the laptop reconnects to the broker.
Use a QoS 0 subscription and ensure that clean session is set to true; this will mean the missing readings are dropped. Zero data is probably easier to interpret from the graph than a large spike.
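Here is a rough sketch of the first option in Python, assuming a JSON payload and a paho-mqtt client; the function names are made up, and the Cacti-side helper simply reports 'U' (unknown) when the timestamp has not advanced.

import json
import time

# Publisher side (the barn laptop): attach a timestamp to every counter reading.
def publish_reading(client, topic, counter_value):
    payload = json.dumps({'count': counter_value, 'ts': int(time.time())})
    client.publish(topic, payload, qos=1, retain=True)

# MRTG/Cacti side: return 'U' when the timestamp has not moved, so a stale
# retained value shows up as missing data instead of one huge spike.
last_ts = None

def value_for_cacti(payload):
    global last_ts
    data = json.loads(payload)
    if last_ts is not None and data['ts'] <= last_ts:
        return 'U'
    last_ts = data['ts']
    return str(data['count'])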

Real time stream processing for IOT through Google Cloud Platform

I am looking at real-time stream processing for IoT through GCP Pub/Sub and Cloud Dataflow, performing analytics through BigQuery. I am seeking help on how to implement this.
Here is the architecture for IoT real-time stream processing
I'm assuming you mean that you want to stream some sort of data from outside the Google Cloud Platform into BigQuery.
Unless you're transforming the data somehow, I don't think that Dataflow is necessary.
Note that BigQuery has its own Streaming API, so you don't necessarily have to use Pub/Sub to get data into BigQuery.
In any case, these are the steps you should generally follow.
Method 1
Issue a service account (and download the .json file from IAM on Google Console)
Write your application to get the data you want to stream in
Inside that application, use the service account to stream directly into a BQ dataset and table
Analyse the data on the BigQuery console (https://bigquery.cloud.google.com)
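A minimal sketch of Method 1 with the google-cloud-bigquery Python client, assuming the service-account .json from the first step; the project, dataset, table and field names are placeholders.

from google.cloud import bigquery

# Authenticate with the service-account .json downloaded from IAM.
client = bigquery.Client.from_service_account_json('service-account.json')
table = client.get_table('my-project.iot_dataset.readings')  # placeholder table

rows = [
    {'device_id': 'meter-001', 'reading': 42.7, 'ts': '2017-05-05T07:45:51Z'},
]
# Streaming insert straight into the table via the BigQuery Streaming API.
errors = client.insert_rows_json(table, rows)
if errors:
    print('Insert errors:', errors)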
Method 2
Setup PubSub queue
Write an application that collects the information you want to stream in
Push to PubSub
Configure DataFlow to pull from PubSub, transform the data however you need to and push to BigQuery
Analyse the data on the BigQuery console as above.
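A minimal sketch of the Dataflow side of Method 2, using the Apache Beam Python SDK; the project, topic, table and schema are placeholders, and the transform step is just an example of massaging the data.

import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Add --runner=DataflowRunner plus project/region/temp_location options to run on Dataflow.
options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (p
     | 'ReadFromPubSub' >> beam.io.ReadFromPubSub(
           topic='projects/my-project/topics/iot-readings')
     | 'Parse' >> beam.Map(lambda msg: json.loads(msg.decode('utf-8')))
     | 'Transform' >> beam.Map(lambda row: dict(row, reading=float(row['reading'])))
     | 'WriteToBigQuery' >> beam.io.WriteToBigQuery(
           'my-project:iot_dataset.readings',
           schema='device_id:STRING,reading:FLOAT,ts:TIMESTAMP',
           write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))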
Raw Data
If you just want to put very raw data (no processing) into BQ, then I'd suggest using the first method.
Semi Processed / Processed Data
If you actually want to transform the data somehow, then I'd use the second method as it allows you to massage the data first.
Try to always use Method 1
However, I'd usually recommend using the first method, even if you want to transform the data somehow.
That way, you have a data_dump table (raw data) in your dataset and you can still use DataFlow after that to transform the data and put it back into an aggregated table.
This gives you maximum flexibility because it allows you to create potentially n transformed datasets from the single data_dump table in BQ.
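As a simpler stand-in for that later Dataflow step, the same roll-up from the raw data_dump table into an aggregated table can also be done with a plain BigQuery query job; the table and column names below are placeholders.

from google.cloud import bigquery

client = bigquery.Client.from_service_account_json('service-account.json')

# Aggregate the raw data_dump table into a derived, aggregated table.
query = """
CREATE OR REPLACE TABLE iot_dataset.readings_daily AS
SELECT device_id, DATE(ts) AS day, AVG(reading) AS avg_reading
FROM iot_dataset.data_dump
GROUP BY device_id, day
"""
client.query(query).result()  # blocks until the job finishes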

Watermark getting stuck

I am ingesting data via Pub/Sub into a Dataflow pipeline which is running in unbounded mode. The data are basically coordinates with timestamps captured from tracking devices. Those messages arrive in batches, where each batch might be 1..n messages. For a certain period there might be no messages arriving; they might be resent later on (or not). We use the timestamp (in UTC) of each coordinate as an attribute of the Pub/Sub message, and read it in the pipeline via a timestamp label:
pipeline.apply(PubsubIO.Read.topic("new").timestampLabel("timestamp"));
An example of coordinates and delay looks like:
36 points wait 0:02:24
36 points wait 0:02:55
18 points wait 0:00:45
05 points wait 0:00:01
36 points wait 0:00:33
36 points wait 0:00:43
36 points wait 0:00:34
A message might look like:
2013-07-07 09:34:11;47.798766;13.050133
After the first batch the watermark is empty; after the second batch I can see a watermark in the pipeline diagnostics, but it doesn't get updated, although new messages arrive. Also, according to Stackdriver logging, PubSub has no undelivered or unacknowledged messages.
Shouldn't the watermark move forward as messages with new event times arrive?
According to What is the watermark heuristic for PubsubIO running on GCD?, the watermark should also move forward every 2 minutes, which it doesn't:
[..] In the case that we have not seen data on the subscription in more
than two minutes (and there's no backlog), we advance the watermark to
near real time. [..]
Update to address Ben's questions:
Is there a job ID that we could look at?
Yes I just restarted the whole setup at 09:52 CET which is 07:52 UTC, with job ID 2017-05-05_00_49_11-11176509843641901704.
What version of the SDK are you using?
1.9.0
How are you publishing the messages with the timestamp labels?
We use a Python script to publish the data, which uses the Pub/Sub SDK.
A message from there might look like:
{'data': {timestamp;lat;long;ele}, 'timestamp': '2017-05-05T07:45:51Z'}
We use the timestamp attribute for the timestamplabel in dataflow.
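For reference, a minimal sketch of what such a publish call looks like with the google-cloud-pubsub Python client; the project and topic names are placeholders, and the original script may use an older SDK version.

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path('my-project', 'new')  # placeholder project / topic

data = b'2017-05-05 07:45:51;47.798766;13.050133'
# The event time goes into a message attribute named "timestamp",
# which the pipeline picks up via timestampLabel("timestamp").
future = publisher.publish(topic_path, data, timestamp='2017-05-05T07:45:51Z')
future.result()  # block until the message is actually published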
What is the watermark stuck at?
For this job the watermark is now stuck at 09:57:35 (I am posting this around 10:10), although new data is sent e.g. at
10:05:14
10:05:43
10:06:30
I can also see that we sometimes publish data to Pub/Sub with a delay of more than 10 seconds, e.g. at 10:07:47 we publish data with a highest timestamp of 10:07:26.
After a few hours the watermark catches up, but I cannot see why it is delayed / not moving in the beginning.
This is an edge case in the Pub/Sub watermark tracking logic that has two workarounds (see below). Essentially, if there is no input for 2 minutes, then the watermark will advance to the current time. But if data is arriving more often than every 2 minutes, yet still at a very low QPS, then there isn't enough data to keep the estimated watermark up to date.
As I mentioned, there are several workarounds:
If you process more data the issue will naturally be resolved.
Alternatively, if you inject extra messages (say 2 per second) it will provide enough data for the watermark to advance more quickly. These just need to have timestamps, and may be immediately filtered out of the pipeline.
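A rough sketch of the second workaround: a side process that publishes a couple of filler messages per second, stamped with the current time, which the pipeline can filter out right after reading; the project and topic names are placeholders.

import time
from datetime import datetime, timezone
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path('my-project', 'new')  # placeholder project / topic

while True:
    now = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')
    # The payload just marks the message as a heartbeat so the pipeline can drop it;
    # the timestamp attribute is what keeps the watermark estimate moving.
    publisher.publish(topic_path, b'heartbeat', timestamp=now)
    time.sleep(0.5)  # roughly 2 messages per second, as suggested above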
For the record, another thing to keep in mind about the previously mentioned edge case in a direct-runner context is the parallelism of the runner. Higher parallelism, which is the default especially on multicore machines, seems to need even more data. In my case setting --targetParallelism=1 helped; it basically transformed a stuck pipeline into a working one without any other intervention.

Mnesia database design for storing messages that need to be sent in the future

I am writing an ejabberd module where the user controls when the message is delivered to the recipient instead of it being delivered immediately (like sending birthday wishes in advance). This is done by adding a custom XML element to the message stanza, like the following:
<message xmlns="jabber:client" from="test2@ubuntu" to="test1@ubuntu/32375806281445450055240436" type="chat">
<schedule xmlns="ank" year="2015" month="10" day="19" hour="22" minute="36" second="13"/>
<body>hi</body>
</message>
Now these scheduled messages have to be stored in the Mnesia database and sent to the recipient when the time arrives.
Approach 1:
One approach is to create a table for every user: when a message is received, store it in the user's table and set a timer to process the message and delete it when done, like the following sample code:
timer:apply_after(SecondsDelay, ?MODULE, post_message_delete, [TableName, RecordUniqueKeyHash, From, To, Packet]).
The post_message_delete method, called after the timer expires, will send the message using the route method as shown below and delete the record from the Mnesia database:
ejabberd_router:route(From, To, Packet)
Creating a table for every user is not feasible due to the limitations on the number of tables in mnesia.
Approach 2:
Another approach is to store all the users' messages in one single table, set a timer (same as above) for every message as it arrives, and delete the message once it has been processed.
The whole idea of using the Mnesia database is to process the messages reliably in the case of an ejabberd server crash.
To achieve this we use a pid field in every message record, containing the pid of the process that is handling that message. Initially it is undefined (when the message arrives at the filter_packet hook), but once the message-processing method is spawned it updates the pid in the record in the Mnesia database.
So if the server crashes, on reboot the module's start method iterates over all the messages and checks whether each pid is alive (is_process_alive); if not, it spawns the processing method on the message, which will update the record with the new process pid, process the message, and delete it once done.
Drawbacks
The drawback of this method is that even if a message has to be delivered far in the future (next month or next year), a process is still running for that message, and there are as many processes running as there are messages.
Approach 3:
To overcome the drawbacks of Approach 2, scan the database every hour, accumulate only the messages that have to be delivered within the next hour, and process those.
The drawback with this approach is that the database is scanned every hour, which might impact performance.
Approach 4:
To overcome the performance impact of Approach 3, we can create a table for every year_month and spawn the message-processing function only on the current month's table.
What other approach is best suited for this use case using mnesia database?
Even though it's an old question, it may one day become an issue for somebody else.
I think Mnesia is the wrong choice for this kind of data-store use case.
Redis, from version 2.8.0, has a keyspace event notification feature that fires when certain operations are performed, including key expirations set by EXPIRE, EXPIREAT and other variants. This information can reach your code through the PUBSUB feature. See Redis Keyspace Notifications on how to start.
Generate a unique key (K), probably a UUID, for every birthday message.
Store the message to send, the entire XML, under the generated key K.
Store this message key as a value under a key called K:timer, using the SET command with a TTL set to the time difference in seconds between now and the birthday timestamp, OR use EXPIREAT to set the expiration time to the Unix timestamp of the birthday itself. When the TTL expires, pub/sub clients get notified of the event along with the expired key K:timer. Extract K from it, fetch the message with it, send your message and delete it afterwards.
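Even though the module itself is Erlang, a small Python sketch with redis-py illustrates the flow; the key names and the deliver() hook are made up, and database 0 is assumed.

import redis

r = redis.Redis(host='localhost', port=6379, db=0)

# Enable expired-key notifications (can also be set in redis.conf).
r.config_set('notify-keyspace-events', 'Ex')

def schedule(key, xml_stanza, deliver_at_unix_ts):
    # Store the full message under K and an expiring marker under K:timer.
    r.set(key, xml_stanza)
    r.set(key + ':timer', key)
    r.expireat(key + ':timer', deliver_at_unix_ts)

def deliver(stanza):
    # Placeholder: in the real module this would hand the stanza back to ejabberd_router:route.
    print('delivering', stanza)

# Listen for expirations and deliver the stored message.
p = r.pubsub()
p.psubscribe('__keyevent@0__:expired')
for event in p.listen():
    if event['type'] != 'pmessage':
        continue
    expired_key = event['data'].decode('utf-8')
    if expired_key.endswith(':timer'):
        k = expired_key[:-len(':timer')]
        stanza = r.get(k)
        if stanza is not None:
            deliver(stanza)
            r.delete(k)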
ISSUES TO CONSIDER:
1: Multiple pubsub clients may get notified of the same expiration event. This may cause the same message to be sent more than once. Implement some sort of locking to prevent this.
2: Redis PUBSUB is a fire-and-forget message-passing construct, so if a client goes down and comes up again, it may have missed event notifications during that time window. One way to improve reliability is to store the key K under several key variants, K:timer, K:timer:1, K:timer:2, K:timer:3, ... at increasing TTL offsets (1, 2, 3 minutes apart) to cover a worst-case window during which the unavailable client may become available again.
3: Redis is in-memory, so storing lots of large messages will cost you RAM. One way to solve this is to store only the message key K in Redis and store the message (XML) under the same key K in any disk-based key-value store like Riak, Cassandra, etc.
