Maximum Number of Modules - azure-iot-edge

This doc states that the maximum number of modules in a deployment is 20, but I am having problems getting past 15. Nothing ever happens: there are no error messages, but the modules don't get deployed.
I would also like to know whether this is a soft limit and, if it is, what the process is to override it.

Did you find any errors in the edgeAgent log? You have probably hit the limit on twin size: the maximum size per twin section (tags, desired properties, reported properties) is 8 KB.
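If the deployment manifest is the culprit, one way to check is to measure the size of the desired-properties section that gets pushed to the $edgeAgent twin. Here is a minimal Python sketch; the file name and the modulesContent/$edgeAgent/properties.desired layout are assumptions based on the standard deployment manifest format.
import json

# Assumed file name; point this at your own exported deployment manifest.
with open("deployment.json", encoding="utf-8") as f:
    manifest = json.load(f)

# In a standard manifest, the $edgeAgent desired properties live here.
desired = manifest["modulesContent"]["$edgeAgent"]["properties.desired"]

size_bytes = len(json.dumps(desired, separators=(",", ":")).encode("utf-8"))
print(f"$edgeAgent desired properties: {size_bytes} bytes (twin section limit is 8 KB = 8192 bytes)")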

Why would SQS ApproximateNumberOfMessagesVisible & ApproximateAgeOfOldestMessage go up even when Received/Deleted metrics match Sent metrics?

In the CloudWatch Metrics graph, the purple line is ApproximateNumberOfMessagesVisible and the red line is ApproximateAgeOfOldestMessage. They are trending up even though NumberOfMessagesReceived (orange) and NumberOfMessagesDeleted (green) match NumberOfMessagesSent (blue).
How is this possible?
In my code, I process each message in a new thread, so the message is deleted from the queue almost immediately. (This is not good practice in production, but this is a load-testing script, so I don't expect or care about exceptions.)
sqsClient.receiveMessage(queueUrl).getMessages().forEach(msg -> {
    pool.execute(() -> handleSqsMessage(msg));
    sqsClient.deleteMessage(queueUrl, msg.getReceiptHandle());
});
If ApproximateAgeOfOldestMessage is increasing, it indicates that there is a poison pill: a malformed message that the consumer is unable to process.
What is your redrive policy? You will have to set maxReceiveCount to a small value (say, 3). After the message has been received 3 times by the consumer without being processed and deleted, it will be moved to the dead-letter queue, where you can then analyze the poison pill.
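As a rough illustration, here is a minimal sketch of attaching such a redrive policy using boto3 (a different SDK from the Java snippet in the question); the queue URL and dead-letter queue ARN are placeholders.
import json
import boto3

sqs = boto3.client("sqs")

# Placeholders: substitute your own queue URL and dead-letter queue ARN.
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"
dlq_arn = "arn:aws:sqs:us-east-1:123456789012:my-queue-dlq"

sqs.set_queue_attributes(
    QueueUrl=queue_url,
    Attributes={
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": "3",  # after 3 failed receives the message moves to the DLQ
        })
    },
)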
If the number of visible messages is increasing consistently, it indicates that your consumers are unable to keep up and messages are piling up in the queue. This is not necessarily a bad sign as long as the backlog doesn't grow very large; yours seems OK to me. You can increase the number of consumers to bring it down.
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-dead-letter-queues.html#sqs-dead-letter-queues-when-to-use
https://aws.amazon.com/message-queue/features/

Ever-increasing RAM usage with low series cardinality

I'm testing InfluxDB 1.3.5 for storing a small number (~30-300) of very long integer series. The worst case for a single device is 86400 sec/day * 365 days/year * 12 years = 378,432,000 points;
for 320 devices, the total number of points would be 86400 * 365 * 12 * 320 = 121,098,240,000.
The series cardinality is low; it equals the number of devices. I'm using second-precision timestamps (that mode is enabled when I commit to InfluxDB via the PHP API).
Yes, I really need to keep all the samples, so downsampling is not an option.
I'm inserting the samples as point-arrays of size 86400 per request sorted from oldest to newest. The behaviour is similar (OOM in both cases) for inmem and tsi1 indexing modes.
Despite all that, I'm not able to insert this number of points into the database without crashing it by running out of memory. The host VM has 8 GiB of RAM and 4 GiB of swap, which fill up completely. I cannot find anything in the documentation suggesting that this setup is problematic, or any notice that it should result in high RAM usage at all...
Does anyone have a hint on what could be wrong here?
Thanks and all the best!
b-
[ I asked the same question here but received no replies, that's the reason for crossposting: https://community.influxdata.com/t/ever-increasing-ram-usage-with-low-series-cardinality/2555 ]
I found out what the issue most likely was:
I had a bug in my feeder that caused timestamps not to be updated, so lots of points with distinct values were written over and over again to the same timestamp/tag combination.
If you experience something similar, try double-checking each step in your pipeline for a timestamp-related error.
Unfortunately this was not the issue; the RAM usage rises nevertheless when importing more points than before.
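For reference, a minimal sketch of the intended write pattern against the 1.x HTTP API with strictly increasing second-precision timestamps; the feeder bug described in the self-answer amounts to the timestamp never advancing, so every point silently overwrites the previous one in the same series. The URL, database, measurement, and tag names below are made up for illustration.
import requests

INFLUX_URL = "http://localhost:8086/write"   # assumed local InfluxDB 1.x instance

start = 1500000000                  # Unix timestamp of the first sample, in seconds
batch = []
for i in range(86400):              # one day of one-second samples for one device
    ts = start + i                  # must advance; a constant ts overwrites the same point
    batch.append(f"power,device=dev001 value={i}i {ts}")

resp = requests.post(
    INFLUX_URL,
    params={"db": "metrics", "precision": "s"},   # second-precision timestamps
    data="\n".join(batch).encode("utf-8"),
)
resp.raise_for_status()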

SPSS percentile issue

I am working with SPSS 18.
I am using FREQUENCIES to calculate the 95th percentile of a variable.
FREQUENCIES SdrelPromSldDeu_Acr_5_0
/FORMAT=NOTABLE
/PERCENTILES 1,5,95,99.
The result is given in a table:
Statistics
SdrelPromSldDeu_Acr_5_0
N            Valid     8881
             Missing   0
Percentiles  1         -1,001060644014
             5         -1,000541440102
             95        6619,140632636228
             99        9223372,036854776000
But if I double-click the 9223372,036854776 to copy it, another number appears: 1.0757943411193715E7.
If I use MEANS to get the maximum value, the result is 2.4329524990388575E8, so the number that appears on the double-click seems possible.
I have seen 9223372,03 in other cases as well, as if it were some kind of upper limit SPSS is able to display.
Can anybody tell me if the 9223372,03 represents anything useful? Should I trust the bigger number?
Thanks!
It appears to be a display bug in SPSS.
The number you have shown is eerily similar to
9223372036854775807
which is the largest value possible if a variable is declared as a (signed 64-bit) long integer.
see also:
https://en.wikipedia.org/wiki/9223372036854775807
Since your actual maximum is roughly 11 orders of magnitude smaller, it should come nowhere near this limit. Hence the conclusion that it must be a bug in the display software.
Do not trust it.
(The underlying number may or may not be right, but the displayed 9223372,03 is surely wrong.)
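One way to see the similarity pointed out above is a quick arithmetic check (a small Python snippet; the interpretation that the display overflows a 64-bit value is this answer's reading, not something confirmed by IBM): the displayed value matches the 64-bit maximum shifted twelve decimal places.
# 2**63 - 1 is the largest signed 64-bit ("long") integer.
print(2**63 - 1)             # 9223372036854775807
# Scaled down by 10**12, it matches the value SPSS displays.
print((2**63 - 1) / 10**12)  # 9223372.036854776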

Find maximum memory used by a PBS job

After a job is finished, how can I know the maximum resident size it required at any given point while running?
(I tried /usr/bin/time, but it is not installed on the server.)
Thank you!
The PBS MOM reports some statistics back, and they get recorded in the PBS server log.
A handy utility called tracejob parses the logs to extract all entries related to a specific job given a job ID.
For example, after job completion on PBS Pro 12.1, tracejob returns several lines, including the following:
07/11/2014 16:37:27 S Exit_status=0 resources_used.cpupercent=98
resources_used.cput=01:49:14 resources_used.mem=5368kb
resources_used.ncpus=1 resources_used.vmem=38276kb
resources_used.walltime=01:49:22
Here 5368 kB corresponds to the maximum RSS.
Similarly, on Torque 3.0.5:
07/15/2014 03:45:12 S Exit_status=0 resources_used.cput=20:44:10
resources_used.mem=704692kb
resources_used.vmem=1110224kb
resources_used.walltime=20:44:30
Here the maximum RSS was 704692 kB.
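If you want to pull that value out programmatically, here is a rough Python sketch; the job ID is a placeholder, and the regex simply matches the resources_used.mem=...kb field shown in the log lines above (exact tracejob options and output format vary between PBS Pro and Torque versions).
import re
import subprocess

job_id = "123456"   # placeholder: use your own job ID
out = subprocess.run(["tracejob", job_id], capture_output=True, text=True).stdout

match = re.search(r"resources_used\.mem=(\d+)kb", out)
if match:
    print(f"Maximum RSS: {match.group(1)} kB")
else:
    print("resources_used.mem not found in tracejob output")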

epoll_create and epoll_wait

I was wondering about the parameters of two of the epoll APIs.
epoll_create(int size) - in this API, size is described as the size of the event pool. However, it seems that having more events than size still works (I set size to 2 and forced the event pool to hold 3 events, and it still worked!?). So I was wondering what this parameter actually means, and I am curious about its maximum value.
epoll_wait(int maxevents) - for this API, the definition of maxevents is straightforward. However, there is a lack of information and advice on how to determine this parameter. I would expect it to change depending on the size of the epoll event pool. Any suggestions or advice would be great. Thank you!
1.
"man epoll_create"
DESCRIPTION
...
The size is not the maximum size of the backing store but just a hint
to the kernel about how to dimension internal structures. (Nowadays,
size is unused; see NOTES below.)
NOTES
Since Linux 2.6.8, the size argument is unused, but must be greater
than zero. (The kernel dynamically sizes the required data structures
without needing this initial hint.)
2.
Just determine a suitable number yourself, but be aware that giving it a small number may reduce efficiency a little bit: the smaller the number assigned to maxevents, the more often you may have to call epoll_wait() to consume all the events already queued on the epoll instance.
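To make that batching behaviour concrete, here is a small sketch using Python's select.epoll wrapper around the same system calls (the port number and buffer sizes are arbitrary): poll() returns at most maxevents ready descriptors per call, so a smaller value never loses events, it just means more calls to drain the same backlog.
import select
import socket

server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 8080))   # arbitrary port for the example
server.listen()
server.setblocking(False)

ep = select.epoll()                 # note: the old epoll_create size hint is not even exposed here
ep.register(server.fileno(), select.EPOLLIN)

conns = {}                          # fd -> accepted socket
MAXEVENTS = 64                      # batch size per wait; tune to your workload

while True:
    # poll() returns at most MAXEVENTS (fd, eventmask) pairs per call; events
    # beyond that are not lost, they are simply returned by the next call.
    for fd, events in ep.poll(1, MAXEVENTS):
        if fd == server.fileno():
            conn, _ = server.accept()
            conn.setblocking(False)
            conns[conn.fileno()] = conn
            ep.register(conn.fileno(), select.EPOLLIN)
        elif events & select.EPOLLIN:
            data = conns[fd].recv(4096)
            if data:
                conns[fd].sendall(data)   # trivial echo, just to exercise the loop
            else:                         # empty read: peer closed the connection
                ep.unregister(fd)
                conns.pop(fd).close()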
