MRTG: show average when counter data is missing (MQTT)

I'm graphing my power meter with an old laptop in my barn.
It sends data via MQTT to MRTG (Cacti).
Lately this laptop has begun to lock up when playing Spotify.
This is a separate issue.
However, when I reboot, all the power used in the meantime is shown as being used in a single time period, giving a huge spike, so the rest of the data is hardly visible.
Is it possible, when the data finally arrives, to interpolate it across all the missing datapoints?
The laptop sending data was down between Sat 18:00 and Sun 11:00 approximately, but of course the real power meter keeps running.
I'd rather have a straight line between the two datapoints; it is still a loss of data, but it is more truthful than a spike.
Edit: Complication: since Cacti reads the data asynchronously from MQTT, it keeps getting the latest count even if the data is stale.
I guess I need to get my mqtt->cacti interface to send NaN or U if the timestamp of the data has not changed.

You have two options.
Add a timestamp to the message; that way you can rebuild the data as the queued messages are delivered when the laptop reconnects to the broker.
Use a QoS 0 subscription and ensure that clean session is set to true; this means the missing readings are simply dropped. Zero data is probably easier to interpret from the graph than a large spike.
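To make the options concrete, here is a minimal sketch assuming the paho-mqtt 1.x client API; the topic name, broker address and client ID are placeholders, and the "U" output ties in with the asker's idea of reporting unknown when the timestamp has not advanced.

import json
import time
import paho.mqtt.client as mqtt

TOPIC = "home/powermeter/counter"  # placeholder topic name

# Laptop side: include the measurement time in every payload (option 1),
# so the receiver can tell a fresh reading from a stale one.
def publish_reading(client, counter_value):
    payload = json.dumps({"ts": int(time.time()), "count": counter_value})
    client.publish(TOPIC, payload, qos=0)

# Bridge side (mqtt->cacti): QoS 0 plus clean_session=True (option 2) means
# readings queued while a side was down are dropped instead of arriving in
# one burst; a non-advancing timestamp is reported as "U" (unknown).
last_ts = None

def on_message(client, userdata, msg):
    global last_ts
    reading = json.loads(msg.payload)
    if reading["ts"] == last_ts:
        print("U")          # stale data: let the graphing side record unknown
    else:
        last_ts = reading["ts"]
        print(reading["count"])

bridge = mqtt.Client(client_id="powermeter-bridge", clean_session=True)
bridge.on_message = on_message
bridge.connect("broker.local", 1883)  # placeholder broker address
bridge.subscribe(TOPIC, qos=0)
bridge.loop_forever()

Since Cacti stores data in RRDtool, the "U" value the asker mentions is recorded as unknown and shows up as a gap rather than a spike or a misleading zero.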

Related

Inhibit Time in Tx-PDO

Objects 180Nh have the following subindices:
0x00:----
0x01:----
0x02:----
0x03 (inhibit time): This subindex contains a time lock in 100 µs steps. This can be used to set a time that must elapse after the sending of a PDO before the PDO is sent another time. This time only applies for asynchronous PDOs. This is intended to prevent PDOs from being sent continuously if the mapped object constantly changes.
0x04 (compatibility entry): This subindex has no function and exists only for compatibility reasons.
0x05 (event timer): This time (in ms) can be used to trigger an Event which handles the copying of the data and the sending of the PDO.
As I understand the above, when an event occurs the TPDO is sent and the inhibit time starts, during which further transmission is blocked; if more events occur within this interval, the PDO is only sent again once the interval has elapsed.
Why does the whole interval have to elapse? Why are the second, third, and fourth events all handled only at the end of the interval?
Shouldn't the third and fourth events each trigger their own transmission?
By default, common CANopen device profiles like for example CiA 401 "generic I/O module" are configured to suit large automation networks. That is: a large network with lots of nodes where it is important to keep bus traffic low. On such networks nodes only transmit PDOs when there has been a data update (an internal event has occurred).
However, such a setup is very much unsuitable when CANopen is used for real-time control systems, for example a PLC controlling a bunch of actuator I/O modules that drive the motions of a machine, which could also be a safety-related application. In such systems it is customary to always send data repeatedly at fixed intervals, even if it has not changed, for example sending all data once every 10 ms or 100 ms.
Only the last data sent is used by the receiving node(s), so in case data goes missing or gets corrupted, new reliable data will arrive again soon. And if no data arrives at all within a certain time period, that's an indication that something is broken and the system ought to revert to a safe state. This is how mobile/automotive control systems are most commonly designed, since it is safe, deterministic and proven in use. Custom, non-standard CAN bus protocols by OEMs are often implemented exactly like this.
Now, to achieve this with CANopen, we have to configure the TPDO communication parameters: the event timer to set the interval, and the inhibit time to prevent the node from spamming extra data as soon as something has changed. If I remember correctly, we also need to set the 180Nh:02 transmission type to asynchronous (which sounds counter-intuitive).
With a setup like this, only the most recent event matters. The most up to date data will always get sent, at fixed intervals.
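If the device is configured over SDOs, the writes might look roughly like the sketch below using the python-canopen package; the node ID, EDS file, CAN interface and the chosen values are placeholders, not taken from the question.

import canopen

network = canopen.Network()
network.connect(bustype="socketcan", channel="can0")  # adapt to your interface

node = network.add_node(6, "device.eds")  # node ID and EDS file are placeholders

# 0x1800:02 transmission type: 254/255 = asynchronous (event-driven)
node.sdo[0x1800][2].raw = 254
# 0x1800:03 inhibit time, in units of 100 us: 100 -> at least 10 ms between PDOs
node.sdo[0x1800][3].raw = 100
# 0x1800:05 event timer, in ms: send the PDO every 100 ms even if data is unchanged
node.sdo[0x1800][5].raw = 100

network.disconnect()

Depending on the device, the node may need to be in pre-operational state, and some devices require the PDO to be disabled (bit 31 of 0x1800:01 set) before the communication parameters can be changed.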

Watermark getting stuck

I am ingesting data via Pub/Sub into a Dataflow pipeline which is running in unbounded mode. The data are basically coordinates with timestamps captured from tracking devices. Those messages arrive in batches, where each batch might be 1..n messages. For a certain period there might be no messages arriving, and they might be resent later on (or not). We use the timestamp (in UTC) of each coordinate as an attribute of the Pub/Sub message, and read it in the pipeline via a timestamp label:
pipeline.apply(PubsubIO.Read.topic("new").timestampLabel("timestamp"));
An example of coordinates and delay looks like:
36 points wait 0:02:24
36 points wait 0:02:55
18 points wait 0:00:45
05 points wait 0:00:01
36 points wait 0:00:33
36 points wait 0:00:43
36 points wait 0:00:34
A message might look like:
2013-07-07 09:34:11;47.798766;13.050133
After the first batch the watermark is empty; after the second batch I can see a watermark in the pipeline diagnostics, but it doesn't get updated, although new messages arrive. Also, according to Stackdriver logging, Pub/Sub has no undelivered or unacknowledged messages.
Shouldn't the watermark move forward as messages with new event times arrive?
According to "What is the watermark heuristic for PubsubIO running on GCD?", the watermark should also move forward every 2 minutes, which it doesn't?
[..] In the case that we have not seen data on the subscription in more than two minutes (and there's no backlog), we advance the watermark to near real time. [..]
Update to address Ben's questions:
Is there a job ID that we could look at?
Yes I just restarted the whole setup at 09:52 CET which is 07:52 UTC, with job ID 2017-05-05_00_49_11-11176509843641901704.
What version of the SDK are you using?
1.9.0
How are you publishing the messages with the timestamp labels?
We use a Python script to publish the data, which uses the Pub/Sub SDK.
A message from there might look like:
{'data': {timestamp;lat;long;ele}, 'timestamp': '2017-05-05T07:45:51Z'}
We use the timestamp attribute as the timestampLabel in Dataflow.
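For illustration, a publish call of that shape might look roughly like this with the google-cloud-pubsub client; the project name is a placeholder, and the original script used an older SDK.

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "new")  # placeholder project/topic

# The event time travels as a message attribute, which is what
# PubsubIO's timestampLabel("timestamp") reads on the Dataflow side.
record = "2017-05-05 07:45:51;47.798766;13.050133"
future = publisher.publish(
    topic_path,
    data=record.encode("utf-8"),
    timestamp="2017-05-05T07:45:51Z",  # attribute name matches timestampLabel
)
future.result()  # block until the message is accepted by Pub/Sub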
What is the watermark stuck at?
For this job the watermark is now stuck at 09:57:35 (I am posting this around 10:10), although new data is sent e.g. at
10:05:14
10:05:43
10:06:30
I can also see that it may happen that we publish data to Pub/Sub with a delay of more than 10 seconds, e.g. at 10:07:47 we publish data with a highest timestamp of 10:07:26.
After a few hours the watermark catches up, but I cannot see why it is delayed / not moving at the beginning.
This is an edge case in the PubSub watermark tracking logic that has two workarounds (see below). Essentially, if there is no input for 2 minutes, then the watermark will advance to the current time. But if data is arriving faster than every 2 minutes, yet still at a very low QPS, then there isn't enough data to keep the estimated watermark up to date.
As I mentioned, there are several workarounds:
If you process more data the issue will naturally be resolved.
Alternatively, if you inject extra messages (say 2 per second) it will provide enough data for the watermark to advance more quickly. These just need to have timestamps, and may be immediately filtered out of the pipeline.
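A keepalive publisher for the second workaround might look roughly like this; the project and topic names and the "heartbeat" attribute are placeholders, and the pipeline would drop these messages immediately after reading them.

import time
from datetime import datetime, timezone
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "new")  # placeholder project/topic

while True:
    # Dummy message whose only job is to carry a fresh event timestamp so the
    # watermark estimator always has recent data to work with.
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    publisher.publish(topic_path, data=b"heartbeat",
                      timestamp=now, heartbeat="true")
    time.sleep(0.5)  # roughly 2 messages per second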
For the record, another thing to keep in mind about the previously mentioned edge case in a direct-runner context is the parallelism of the runner. Higher parallelism, which is the default especially on multicore machines, seems to need even more data. In my case, setting --targetParallelism=1 helped; it basically transformed a stuck pipeline into a working one without any other intervention.

How can you replay old data into dataflow via pub/sub and maintain correct event time logic?

We're trying to use dataflow's processing-time independence to start up a new streaming job and replay all of our data into it via Pub/Sub but are running into the following problem:
The first stage of the pipeline is a groupby on a transaction ID, with a session window of 10s, discarding fired panes and no allowed lateness. So if we don't specify the timestampLabel of our replay Pub/Sub topic, then when we replay into Pub/Sub all of the event timestamps are the same, and the groupby tries to group all of our archived data into transaction IDs for all time. No good.
If we set the timestampLabel to be the actual event timestamp from the archived data, and replay say 1d at a time into the pub/sub topic then it works for the first day's worth of events, but then as soon as those are exhausted the data watermark for the replay pub/sub somehow jumps forward to the current time, and all subsequent replayed days are dropped as late data. I don't really understand why that happens, as it seems to violate the idea that dataflow logic is independent of the processing time.
If we set the timestampLabel to be the actual event timestamp from the archived data, and replay all of it into the pub/sub topic, and then start the streaming job to consume it, the data watermark never seems to advance, and nothing ever seems to come out of the groupby. I don't really understand what's going on with that either.
Your approaches #2 and #3 are suffering from different issues:
Approach #3 (write all data, then start consuming): since data is written to the Pub/Sub topic out of order, the watermark really cannot advance until all (or most) of the data is consumed, because the watermark is a soft guarantee that "further items you receive are unlikely to have an event time earlier than this", but due to out-of-order publishing there is no correspondence whatsoever between publish time and event time. So your pipeline is effectively stuck until it finishes processing all this data.
Approach #2: technically it suffers from the same problem within each particular day, but I suppose the amount of data within 1 day is not that large, so the pipeline is able to process it. However, after that, the pubsub channel stays empty for a long time, and in that case the current implementation of PubsubIO will advance the watermark to real time, that's why further days of data are declared late. The documentation explains this some more.
In general, quickly catching up with a large backlog, e.g. by using historic data to "seed" the pipeline and then continuing to stream in new data, is an important use case that we currently don't support well.
Meanwhile I have a couple of recommendations for you:
(better) Use a variation on approach #2, but try timing it against the streaming pipeline so that the Pub/Sub channel doesn't stay empty; a rough sketch of such a paced replay follows below.
Use approach #3, but with more workers and more disk per worker (your current job appears to be using autoscaling with max 8 workers - try something much larger, like 100? It will downscale after it catches up)
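As an illustration of the first recommendation, a paced replay script could publish the archived records in event-time order and throttle itself so the Pub/Sub channel never goes quiet; the file format, pacing and names below are assumptions, not taken from the job.

import time
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "replay")  # placeholder project/topic

def read_archive(path):
    # Assumed archive line format: "2013-07-07T09:34:11Z;47.798766;13.050133"
    with open(path) as f:
        for line in f:
            event_time = line.split(";", 1)[0]
            yield event_time, line.strip().encode("utf-8")

# Oldest event time first, so the watermark can advance steadily.
records = sorted(read_archive("archive.csv"))

for event_time, payload in records:
    publisher.publish(topic_path, data=payload, timestamp=event_time)
    # Throttle so the replay roughly keeps pace with the streaming pipeline
    # instead of dumping everything at once or leaving long gaps.
    time.sleep(0.01)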

Is it guaranteed that mnesia event listeners will get each state of a record, if it changes fast?

Let's say I have some record like {my_table, Id, Value}.
I constantly overwrite the value so that it holds consecutive integers like 1, 2, 3, 4, 5 etc.
In a distributed environment, is it guaranteed that my event listeners will receive all of the values? (I don't care about ordering)
I haven't verified this by reading that part of the source yet, but it appears that sending a message out is part of the update process, so messages should always come out, even on very fast changes. (The alternative would be for Mnesia to either queue messages or queue changes and run them in batches. I'm almost positive this is not what happens -- it would be too hard to predict the variability of advantageous moments to start batching jobs or queueing messages. Sending messages is generally much cheaper than making a change in the db.)
Since Erlang guarantees delivery of messages to a live destination this is as close to a promise that every Mnesia change will eventually be seen as you're likely to get. The order of messages couldn't be guaranteed on the receiving end (as it appears you expect), and of course a network failure could make a set of messages get missed (rendering the destination something other than live from the perspective of the sender).

Can I prevent an iOS user from changing the date and time?

I want to deploy managed iOS devices to employees of the company, and the app they will use will timestamp data that will be recorded locally, then forwarded. I need those timestamps to be correct, so I must prevent the user from adjusting the time on the device, recording a value, then resetting the date and time. Date and time will be configured to come from the network automatically, but the device may not have network connectivity at all times (otherwise I would just read network time every time a data value is recorded). I haven't seen an option in Apple Configurator to prevent changing the date and time, so is there some other way to do this?
You won't be able to prevent a user from either changing their clock or just hitting your API directly, as other commenters have noted. These are two separate issues and can be addressed by having a local time that you control on the device and by generating a hashed key of what you send to the server.
Local Time on Device:
To start, make an API call when you start the app which sends back a timestamp from the server; this is your 'actual time'. Now store this on the device and run a timer which uses a phone uptime function (not mach_absolute_time() or CACurrentMediaTime() - these get weird when your phone is in standby mode) and a bit of math to increase that actual time every second. I've written an article on how I did this for one of my apps (be sure to read the follow-up, as the original article used CACurrentMediaTime() but that has some bugs). You can periodically make that initial API call (i.e. if the phone goes into the background and comes back again) to make sure that everything is staying accurate, but the time should always be correct so long as you don't restart the phone (which should prompt an API call when you next open the app to update the time).
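The arithmetic behind this is small. Here is a rough sketch in Python, using time.monotonic() as a stand-in for the phone-uptime function mentioned above; the server endpoint and sample value are placeholders.

import time

def make_trusted_clock(server_time_epoch):
    # Anchor the server's timestamp to a monotonic clock at sync time.
    anchor = time.monotonic()
    def now():
        # Trusted time = server time at sync + elapsed monotonic time since sync.
        return server_time_epoch + (time.monotonic() - anchor)
    return now

# Usage: fetch server_time_epoch from your timestamp endpoint at app start,
# then call trusted_now() whenever a record needs a timestamp.
trusted_now = make_trusted_clock(server_time_epoch=1494000000.0)  # placeholder value
print(trusted_now())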
Securing the API:
You now have a guaranteed* accurate time on your device but you still have an issue in that somebody could send the wrong time to your API directly (i.e. not from your device). To counteract this, I would use some form of salt/hash with the data you are sending similar to OAuth. For example, take all of the parameters you are sending, join them together and hash them with a salt only you know and send that generated key as an extra parameter. On your server, you know the hash you are using and the salt so you can rebuild that key and check it with the one that was sent; if they don't match, somebody is trying to play with your timestamp.
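A sketch of that signing scheme, assuming an HMAC with a shared secret; the parameter names and the secret are placeholders.

import hashlib
import hmac

SECRET = b"shared-secret-known-to-app-and-server"  # placeholder

def sign(params: dict) -> str:
    # Join the parameters in a canonical order so client and server
    # build the same string, then compute a keyed hash over it.
    message = "&".join(f"{k}={params[k]}" for k in sorted(params))
    return hmac.new(SECRET, message.encode("utf-8"), hashlib.sha256).hexdigest()

def verify(params: dict, received_key: str) -> bool:
    # Server side: rebuild the key and compare in constant time.
    return hmac.compare_digest(sign(params), received_key)

payload = {"timestamp": "2017-05-05T07:45:51Z", "value": "42.7"}
key = sign(payload)        # sent as an extra parameter with the request
assert verify(payload, key)

Using hmac.compare_digest for the comparison avoids leaking information through timing when the server checks the key.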
*Caveat: A skilled attacker could hijack the connection so that any calls to example.com/api/timestamp come from a different machine they have set up, which returns the time they want, so that the phone is given the wrong time as its starting base. There are ways to prevent this (obfuscation, pairing it with other data, encryption), but that becomes a very open-ended question very quickly, so it is best asked elsewhere. A combination of the above plus a monitor to notice weird times might be the best approach.
There doesn't appear to be any way to stop the user from changing the time. But beyond that, even if you could prevent them from changing it, they could let their device battery die, then plug it in and turn it on where they don't have a network connection, and their clock will be wrong until it has a chance to set itself over a network. So even preventing them from changing the time won't guarantee accuracy.
What you could do is require a network connection to record values, so that you can verify the time on a server. If you must allow it to work without a net connection, you could at least always log the current time when the app is brought up and note if the time ever seems to go backwards. You'll know something is up if the timestamp suddenly is earlier than the previous timestamp. You could also do this check perhaps only when they try to record a value. If they record a value that has a timestamp earlier than any previous recorded value, you could reject it, or log the event so that the person can be questioned about it at a later time.
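A server-side version of that check might look roughly like this; the per-device storage and the device IDs are placeholders.

from datetime import datetime

last_seen = {}  # device_id -> latest timestamp accepted so far

def check_record(device_id: str, timestamp: datetime) -> bool:
    """Return True if the record looks plausible, False if time went backwards."""
    previous = last_seen.get(device_id)
    if previous is not None and timestamp < previous:
        # Earlier than something already accepted: reject it, or log the
        # event so the person can be questioned about it later.
        return False
    last_seen[device_id] = timestamp
    return True

print(check_record("device-1", datetime(2017, 5, 5, 10, 0, 0)))  # True
print(check_record("device-1", datetime(2017, 5, 5, 9, 59, 0)))  # False: clock went backwards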
This is also one of those cases where maybe you just have to trust the user not to do this, because there doesn't seem to be a perfect solution to this.
The first thing to note is that the user will always be able to forge messages to your server in order to create incorrect records.
But there are some useful things you can use to at least notice problems. Most of the time the best way to secure this kind of system is to focus on detection, and then publicly discipline anyone who has gone out of their way to circumvent policy. Strong locks are meaningless unless there's a cop who's eventually going to show up and stop you.
Of course you should first assume that any time mistakes are accidental. But just publicly "noticing" that someone's device seems to be "misbehaving" is often enough to make bad behaviors go away.
So what can you do? The first thing is to note the timestamps of things when they show up at the server. Timestamps should always move forward in time. So if you've already seen records from a device for Monday, you should not later receive records for the previous Sunday. The same should be true for your app. You can keep track of when you are terminated in NSUserDefaults (as well as posting this information to the server). You should not generally wake up in the past. If you do, complain to your server.
Watch for UIApplicationSignificantTimeChangeNotification. I believe you'll receive it if the time is manually changed (you'll receive it in several other cases as well, most of them benign). Watch for time moving significantly backwards. Complain to your server.
Pay attention to mach_absolute_time(). This is the time since the device was booted and is not otherwise modifiable by the user without jailbreaking. It's useful for distinguishing between reboots and other events. It's in a weird time unit, but it can be converted to human time as described in QA1398. If the mach time difference is more than an hour greater than the wall-clock time difference, something is weird (DST changes can cause 1 hour). Complain to your server.
All of these things could be benign. A human will need to investigate and make a decision.
None of these things will ensure that your records are correct if there is a dedicated and skilled attacker involved. As I said, a dedicated and skilled attacker could just send you fake messages. But these things, coupled with monitoring and disciplinary action, make it dangerous for insiders to even experiment with how to beat the system.
You cannot prevent the user from changing the time.
Even the time of a location fix is adjusted by Apple and is not a real GPS time.
You could look at Mach kernel time, which is a relative time, and compare that to the time recorded when you last had a network connection.
But none of this sounds very reliable.
