MQTT Subscribe / OTA Update Deep Sleep / ESP32 / FreeRTOS - mqtt

The goal is to receive messages over MQTT in an IoT device that comes out of deep sleep periodically. The exact same considerations exist for OTA update as for any other parameter update. In my case, ultimately, I want to use this for both.
Progress
It runs
The device wakes for about 15 seconds. If I publish a bunch of messages to the relevant topic during that time, the messages arrive successfully. Inside the AWS console I can publish to:
$aws/things/<device-name>/shadow/update/delta
{
  "state": {
    "desired": {
      "output": true
    }
  }
}
And the delta callback function runs for 'output'. Great, but of no practical use to anyone, because it only works while the device happens to be awake.
IoT Job
I created a custom AWS IoT job in the console in an effort to overcome the problem. My thinking was that it might retain the message to ensure delivery. I've been running the job for the past half hour but so far nothing has come through. It had a 20 timeout but is still stuck in queued, not even in progress yet... So, there is clearly a flaw in this approach.
AWS CLI test
Just for completeness, I've attempted to fire off the MQTT message from the AWS CLI. It has the benefit that you can specify the QoS, (in theory) ensuring that the message gets delivered at least once.
aws iot-data publish --topic "$aws/things/<device-name>/shadow/update/delta" --qos 1 --payload file://Downloads/outputTrue.json --cli-binary-format raw-in-base64-out
But oddly this didn't seem to work at all. Subscribing in the console's test client, I never saw the message arrive at the broker.
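One thing worth checking here, offered as an assumption rather than a confirmed diagnosis: if the command was run in a POSIX shell, `$aws` inside double quotes is expanded as a (most likely unset) shell variable, so the topic actually sent would start with `/things/...`. The sketch below reproduces that effect with Python's subprocess module; `my-device` is a placeholder thing name and `shell_sees` is a hypothetical helper.

```python
import os
import subprocess

def shell_sees(quoted_topic: str) -> str:
    """Return the string a POSIX shell passes on for a quoted topic argument."""
    # Minimal environment, so that no variable named `aws` is defined.
    env = {"PATH": os.environ.get("PATH", "/usr/bin:/bin")}
    result = subprocess.run(
        ["sh", "-c", f"printf %s {quoted_topic}"],
        capture_output=True, text=True, check=True, env=env,
    )
    return result.stdout

TOPIC = "$aws/things/my-device/shadow/update/delta"  # placeholder thing name

double_quoted = shell_sees(f'"{TOPIC}"')  # the shell expands $aws to an empty string
single_quoted = shell_sees(f"'{TOPIC}'")  # the shell passes the text through literally

print(double_quoted)
print(single_quoted)
```

If that is what happened, single-quoting the topic in the CLI call would avoid the expansion.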

AWS IoT Core does not support retained messages, see here.
The MQTT specification provides a provision for the publisher to request that the broker retain the last message sent to a topic and send it to all future topic subscribers. AWS IoT doesn't support retained messages. If a request is made to retain messages, the connection is disconnected.
As the wake-up times are periodic, a possible approach could be to have your device publish its next wake-up slot to a separate topic that your backend listens to. Your backend then publishes the desired information to your device topic once the slot opens up.
Of course this approach is quite fragile concerning latency and network stability.
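A minimal sketch of that idea, assuming a fixed wake period and window: the device announces its next slot before sleeping, and the backend checks whether it can still publish inside it. The function names, the 300-second period and the 2-second safety margin are illustrative assumptions; the 15-second window is taken from the question.

```python
import json

def next_wake_slot(now_s: int, period_s: int, awake_s: int) -> dict:
    """Slot the device announces just before it goes back to sleep."""
    start = now_s + period_s
    return {"start": start, "end": start + awake_s}

def can_publish(slot: dict, now_s: int, margin_s: int = 2) -> bool:
    """Backend side: publish only inside the slot, leaving a margin for latency."""
    return slot["start"] + margin_s <= now_s <= slot["end"] - margin_s

slot = next_wake_slot(now_s=1_000, period_s=300, awake_s=15)
payload = json.dumps(slot)  # what the device would publish to the announcement topic
print(payload, can_publish(slot, now_s=1_305))
```

The margin is what makes the fragility visible: the larger the expected latency, the smaller the usable part of the window.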

Time to share the answer I found from piecing together numerous posts and reaching out to the very helpful AWS support team. This link is the one that really covers it:
https://docs.aws.amazon.com/iot/latest/developerguide/jobs-devices.html#jobs-workflow-device-online
My summarised pseudo code is :
1. init() & connect() to mqtt as before.
2. Subscribe to the following topics & create callback function for each:
a. Get pending.
b. Notify next.
c. Get next.
d. Update rejected.
e. Update accepted.
3. Create Publish topics:
a. Get pending.
b. Get Next.
4. The pending topics are optional, but necessary if you need to handle several jobs and select between them.
5. Call aws_iot_jobs_describe() to publish a request for the next job. It links up to the notify next callback (somehow).
6. In the callback, grab job document, execute job & report Success / Failure.
7. Done.
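To make the topic list in steps 2 and 3 more concrete, here is a small Python sketch (the device code itself would be C) that builds the reserved AWS IoT Jobs topic names and the status payload reported in step 6. The helper names, the thing name `my-device` and the job id `job-42` are illustrative assumptions; the topic layout follows the AWS IoT Jobs MQTT API as I understand it, so check it against the developer guide linked above.

```python
import json

def jobs_topics(thing_name: str) -> dict:
    """Topics used to query pending jobs and to be notified of the next one."""
    base = f"$aws/things/{thing_name}/jobs"
    return {
        "get_pending": f"{base}/get",
        "notify_next": f"{base}/notify-next",
        "get_next": f"{base}/$next/get",
    }

def update_topics(thing_name: str, job_id: str) -> dict:
    """Topics used to report the result of one job and to hear the verdict."""
    update = f"$aws/things/{thing_name}/jobs/{job_id}/update"
    return {
        "update": update,
        "update_accepted": f"{update}/accepted",
        "update_rejected": f"{update}/rejected",
    }

def status_payload(succeeded: bool) -> str:
    """Body published to the update topic after executing the job document."""
    return json.dumps({"status": "SUCCEEDED" if succeeded else "FAILED"})

topics = jobs_topics("my-device")
updates = update_topics("my-device", "job-42")
print(topics["notify_next"], updates["update"], status_payload(True))
```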
There is a helpful example in esp-aws-iot/samples/linux/jobs-samples/jobs_sample.c. You need to copy some of the constants over from the sample aws_iot_config.h.
Once you've done all of this, you are able to use AWS Jobs to manage your OTA roll out, which was the original intent.

Related

Cloud services to notify on a script not succeeding for a long time

I have a python script that resides on a VPS, reads (each hour) financial news from a public datafeed and emails me when certain keywords of interest appear. That can happen only a few times a week, but such events are very important and must not be missed. On any data fetching or parsing error, I should also be notified via email, and errors of course get recorded into the server's local log file.
But how do I know that my smtp credentials are not blocked by the mail provider, or my VPS is not shut down by my hoster? In that case, I would not be notified and would be unaware of important events (and the failure to fetch/deliver them itself) until I decided to log into VPS manually and take a look at the logs.
Even if I would use a backup notification channel, e.g., SMS or Telegram, it still would not protect against cloud provider service disruption, or my account being blocked due to temporary payment issues, as there would exist no instance of the script to deliver the message on any of the channels. That's why I suspect some 3rd party fault-tolerant service is needed. Especially if I'm a freelance coder having lots of similar scripts, running on a mixture of VPSes, serverless/Lambdas, possibly for different end clients.
What is the best practice you dear developers are using to be notified when some script has not succeeded for a long enough time? I would like something reliable and ready-to-use, maybe you can recommend some existing monitoring services. At least I was not able to find the ones that solve my particular problem straight away.
To clarify, I don't want to spend time on manual checking until it's absolutely necessary (in this case, I can tolerate up to 2 hours, and if it does not self-heal within that period, then I need to be notified), and I obviously don't want to get regular annoying reports that the service is doing fine and there was simply no interesting news detected. Plus, I of course want to keep the costs reasonable.
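The requirement in the last paragraph is essentially a dead man's switch: the script reports each successful run to something outside its own infrastructure, and that external checker alerts only when the reports stop. A minimal sketch of the checker's decision, assuming the 2-hour tolerance mentioned above; the function name and the timestamps are illustrative, and where the checker runs is left open.

```python
from datetime import datetime, timedelta, timezone

TOLERANCE = timedelta(hours=2)  # taken from the question

def needs_alert(last_success: datetime, now: datetime,
                tolerance: timedelta = TOLERANCE) -> bool:
    """True when the script has not reported a success for longer than tolerated."""
    return now - last_success > tolerance

last_ok = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
print(needs_alert(last_ok, datetime(2024, 1, 1, 10, 30, tzinfo=timezone.utc)))
print(needs_alert(last_ok, datetime(2024, 1, 1, 11, 30, tzinfo=timezone.utc)))
```

Because the alert fires on silence rather than on a message from the script, it also covers the cases where the VPS or the mail account is gone entirely.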

Best way to build my call center using Twilio

I am working on building a call center using Twilio.
Parts of the problem are tackled in existing questions, but some of the answers are old. Given that what I am trying to do is one of the most common use cases, I am trying to use this question to build a tutorial so that people know the state-of-the-art way to build this.
Use case details are below:
Call Tree:
Customers will call the Twilio number through phone.
Based on phone number identification, high priority customers will be sent to the Agent handling flow.
Other customers have a call tree which they have to navigate, which will support them. Some customers might end up on Agent handling flow.
Call Center: Agent handling flow is as follows:
Agents are handling calls using their desktop computers. They are on the support page which has a Twilio phone call widget as a pop up window.
All agents can handle all calls.
There are two types of queues. High priority and Normal.
All available agents ring at the same time. Anyone can pick up, and the other agents are then moved to the next caller if one is available.
If no agent is available, the caller waits for some time, with an IVR option to leave a voicemail.
After the wait timeout, the caller is sent to the IVR.
Following is based on my understanding. Please let me know if there is a better way.
Call Tree will work as per the following tutorial: https://www.twilio.com/docs/tutorials/walkthrough/ivr-phone-tree/node/express
Call Center Agent handling flow will work as follows:
One workspace
n Workers
2 Task Queues - High priority and Normal
One Workflow which decides based on the task priority which queue to assign to.
My current queries are as follows:
What is the cleanest way of implementing a wait for an agent for 1 minute and, if no agent is available within that minute, sending the caller to voicemail? Is this part of the workflow?
What is the best way to implement call receiving in the browser? WebRTC?
Is there an HTML widget available for implementing call receiving in the browser? This would include features like setting the agent online/offline, receiving a call, ending a call, and escalating to a supervisor.
Help with this will be really appreciated and will help avoid wild goose chases.
Andy, you should look at Twilio TaskRouter.
1. You can use the task reservation timeout to achieve your requirement 1. Create a task for an incoming call; TaskRouter can direct the call to the matching agent, and if the reservation timeout is set to 1 minute, the task can be redirected to either a different agent or an IVR as you require.
2. You can use Twilio Client, Twilio's WebRTC implementation. You can set incoming/outgoing capabilities as required, and it integrates easily with TaskRouter to handle incoming/outgoing calls.
3. You can easily build a dialler on top of Twilio Client. Here's a tutorial to help you progress: https://www.twilio.com/docs/quickstart/client/javascript (starter apps are available in C#, Java, Node.js, PHP, Python and Ruby).
Additionally, you will find this call centre blueprint useful :) https://github.com/nash-md/twilio-contact-center
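For the one-minute wait in query 1, one possibility (an assumption on my part, and a slightly different mechanism from the reservation timeout mentioned above) is a per-target `timeout` in the TaskRouter workflow configuration: a task that is not accepted from the first queue within 60 seconds moves on to the next target. The sketch below only builds the configuration JSON; the queue SIDs, the friendly names and the `priority` task attribute are placeholders, so verify the field names against the TaskRouter workflow documentation before relying on them.

```python
import json

# Placeholder queue SIDs, not real identifiers.
HIGH_QUEUE_SID = "WQ_high_placeholder"
NORMAL_QUEUE_SID = "WQ_normal_placeholder"
VOICEMAIL_QUEUE_SID = "WQ_voicemail_placeholder"

def targets_with_fallback(queue_sid: str, wait_s: int) -> list:
    """Try the agent queue first, then fall through to the voicemail queue."""
    return [
        {"queue": queue_sid, "timeout": wait_s},
        {"queue": VOICEMAIL_QUEUE_SID},
    ]

def build_workflow_config(wait_s: int = 60) -> dict:
    return {
        "task_routing": {
            "filters": [
                {
                    "filter_friendly_name": "High priority",
                    "expression": "priority == 'high'",
                    "targets": targets_with_fallback(HIGH_QUEUE_SID, wait_s),
                },
                {
                    "filter_friendly_name": "Everyone else",
                    "expression": "1 == 1",
                    "targets": targets_with_fallback(NORMAL_QUEUE_SID, wait_s),
                },
            ],
            "default_filter": {"queue": NORMAL_QUEUE_SID},
        }
    }

config = build_workflow_config()
print(json.dumps(config, indent=2))
```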

Background Tasks in Spring (AMQP)

I need to handle a time-consuming and error-prone task (e.g., invoking a SOAP endpoint that will trigger the delivery of an SMS) whenever a given endpoint of my REST API is invoked, but I'd prefer not to make my users wait for that before sending a response back. Spring AMQP is already part of my stack, so I thought about leveraging it to establish a "work queue" and have a number of worker processes consuming from the queue and taking care of the "work units". I have, however, the following requirements:
A work unit is guaranteed to be delivered, and delivered to exactly one worker.
Should a work unit fail to be completed for any reason, it must be placed back in the queue so that another worker can pick it up later.
Work units survive server reboots and crashes. This is mandatory because I won't be using a DB of any kind to store them.
I know RabbitMQ and Spring AMQP can be configured in such a way that ensures these three requirements, but I've only ever used it to achieve RPC so I don't know much about anything other than that. Is there any example I might follow? What are some of the pitfalls to watch out for?
When creating a queue, RabbitMQ gives you two options: transient or durable. A durable queue survives a broker restart, and if you also publish the messages as persistent and use manual acknowledgements, each message stays available until a consumer acknowledges it. Messages won't expire unless you give the queue a TTL. For starters, you can enable the RabbitMQ management plugin and play around a little.
But if you really want to guarantee the safety of your messages against hard resets or hardware problems, I guess you need to use a RabbitMQ cluster.
See the RabbitMQ Clustering guide; the high availability topic is linked on the right side of that page.
This guy explains how to cluster.
By the way, I like beanstalkd too. You can make it write messages to disk, and they will be safe except in the case of disk failures.
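To illustrate the second requirement from the question (a failed work unit goes back on the queue), here is a toy in-memory model of acknowledge/requeue semantics. It is not RabbitMQ or Spring AMQP code and it provides no durability; the class and method names are made up for the illustration.

```python
from collections import deque

class ToyWorkQueue:
    """In-memory stand-in for a broker queue with manual acknowledgements."""

    def __init__(self):
        self.ready = deque()   # units waiting for a worker
        self.unacked = {}      # units handed out but not yet acknowledged
        self._next_tag = 0

    def publish(self, body):
        self.ready.append(body)

    def get(self):
        """Hand one unit to a worker, or return None when the queue is empty."""
        if not self.ready:
            return None
        self._next_tag += 1
        self.unacked[self._next_tag] = self.ready.popleft()
        return self._next_tag, self.unacked[self._next_tag]

    def ack(self, tag):
        del self.unacked[tag]                      # done, forget the unit

    def nack(self, tag):
        self.ready.append(self.unacked.pop(tag))   # failed, put it back

def work_once(queue, handler):
    """Process one unit; acknowledge on success, requeue on any failure."""
    item = queue.get()
    if item is None:
        return False
    tag, body = item
    try:
        handler(body)
    except Exception:
        queue.nack(tag)
    else:
        queue.ack(tag)
    return True

attempts = []

def flaky_sms(body):
    attempts.append(body)
    if len(attempts) == 1:
        raise RuntimeError("SOAP endpoint unavailable")

q = ToyWorkQueue()
q.publish("sms-1")
work_once(q, flaky_sms)   # first attempt fails, unit is requeued
work_once(q, flaky_sms)   # second attempt succeeds, unit is acknowledged
print(attempts, list(q.ready), q.unacked)
```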

Deploying an SQS Consumer

I am looking to run a service that will be consuming messages that are placed into an SQS queue. What is the best way to structure the consumer application?
One thought would be to create a bunch of threads or processes that run this:
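import socket

def run(q, delete_on_error=False):
    while True:
        try:
            # Long-poll the queue; m is None when no message arrived in time.
            m = q.read(VISIBILITY_TIMEOUT, wait_time_seconds=MAX_WAIT_TIME_SECONDS)
            if m is not None:
                try:
                    process(m.id, m.get_body())
                except TransientError:
                    # Leave the message on the queue so that it is retried.
                    continue
                except Exception as ex:
                    log_exception(ex)
                    if not delete_on_error:
                        continue
                q.delete_message(m)
        except StopIteration:
            break
        except socket.gaierror:
            # Name resolution failed; try again on the next iteration.
            continue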
Am I missing anything else important? What other exceptions do I have to guard against in the queue read and delete calls? How do others run these consumers?
I did find this project, but it seems stalled and has some issues.
I am leaning toward separate processes rather than threads to avoid the GIL. Is there some container process that can be used to launch and monitor these separate running processes?
There are a few things:
The SQS API allows you to receive more than one message with a single API call (up to 10 messages, or up to 256k worth of messages, whichever limit is hit first). Taking advantage of this feature allows you to reduce costs, since you are charged per API call. It looks like you're using the boto library - have a look at get_messages.
In your code right now, if processing a message fails due to a transient error, the message won't be able to be processed again until the visibility timeout expires. You might want to consider returning the message to the queue straight away. You can do this by calling change_visibility with 0 on that message. The message will then be available for processing straight away. (It might seem that if you do this then the visibility timeout will be permanently changed on that message - this is actually not the case. The AWS docs state that "the visibility timeout for the message the next time it is received reverts to the original timeout value". See the docs for more information.)
If you're after an example of a robust SQS message consumer, you might want to check out NServiceBus.AmazonSQS (of which I am the author). (C# - sorry, I couldn't find any python examples.)
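A rough sketch of the first two suggestions above, using boto3 method names (`receive_message`, `change_message_visibility`, `delete_message`) rather than the older boto calls in the question. `TransientError`, `process`, `drain_once` and the queue URL are placeholders, and `StubSqs` is an in-memory stand-in so the example runs without AWS credentials; with a real queue you would pass `boto3.client("sqs")` instead.

```python
class TransientError(Exception):
    """Placeholder for errors that are worth retrying straight away."""

def drain_once(client, queue_url, process, wait_s=20):
    """Receive up to 10 messages in one API call and handle each of them."""
    response = client.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=wait_s,
    )
    handled = 0
    for message in response.get("Messages", []):
        handle = message["ReceiptHandle"]
        try:
            process(message["Body"])
        except TransientError:
            # Make the message visible again immediately instead of waiting
            # for the visibility timeout to expire.
            client.change_message_visibility(
                QueueUrl=queue_url, ReceiptHandle=handle, VisibilityTimeout=0,
            )
            continue
        client.delete_message(QueueUrl=queue_url, ReceiptHandle=handle)
        handled += 1
    return handled

class StubSqs:
    """In-memory stand-in for the SQS client, for demonstration only."""

    def __init__(self, bodies):
        self.visible = [{"ReceiptHandle": f"h{i}", "Body": b}
                        for i, b in enumerate(bodies)]
        self.deleted, self.released = [], []

    def receive_message(self, QueueUrl, MaxNumberOfMessages, WaitTimeSeconds):
        batch = self.visible[:MaxNumberOfMessages]
        self.visible = self.visible[MaxNumberOfMessages:]
        return {"Messages": batch} if batch else {}

    def change_message_visibility(self, QueueUrl, ReceiptHandle, VisibilityTimeout):
        self.released.append(ReceiptHandle)

    def delete_message(self, QueueUrl, ReceiptHandle):
        self.deleted.append(ReceiptHandle)

def process(body):
    if body == "flaky":
        raise TransientError(body)

stub = StubSqs(["ok-1", "flaky", "ok-2"])
handled = drain_once(stub, "https://example.invalid/my-queue", process)
print(handled, stub.deleted, stub.released)
```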

what would be the possible approach to go : SQS or SNS?

I am going to build a Rails application that integrates with Amazon's cloud services.
I have explored Amazon's SNS service, which offers public subscription, which I don't want. I want to notify only a particular subscriber.
For example, if I have 5 subscribers on one topic, the notification should go to one particular subscriber only.
I have also explored Amazon's SQS, for which I have to write a poller that monitors the queue for messages. SQS also has a lock mechanism, but the problem is that it is distributed, so there is a chance of another consumer getting the same message from another copy of the queue and processing it.
I want to know which approach would be the right one.
SQS sounds like what you want.
You can run multiple "worker" processes that compete over messages in the queue. Each message is normally consumed only once, although a standard queue guarantees at-least-once delivery, so your processing should tolerate an occasional duplicate. The logic behind the "lock" / timeout that you mention is as follows: if one of your workers were to die after downloading a message, but before processing it, then you want that message to eventually time out and be re-downloaded for processing on another node.
Yes, SQS is built on a polling model. For example, I have a number of use cases in which I use a minutely cron job to poll for new messages in the queue and take action on any messages found. This pattern is stupid simple to build and works wonders for a bunch of use cases -- a handy little "client" script that pushes a message into the queue, and the cron activated script that will process that message within a minute or so.
If your message pattern is extremely sparse -- eg, only a few messages a day -- it may seem wasteful to poll constantly while the queue is empty. It hardly matters.
My original calculation was that a minutely cron job would cost $0.04 (now $0.02) per month. Since then, SQS added a "Long-Polling" feature that lets you achieve sub-second latency on processing new messages by sending 1 "long-poll" message every 20 seconds to poll an idle queue. Plus, they dropped the price 50%. So per month, that's 131k messages (~$0.06), a little bit more expensive, but with near realtime request processing.
Keep in mind that a minutely cron job I described only costs ~$0.04 / month in request load (30d*24h*60m * 1c / 10k msgs). So at a minutely clip, cost shouldn't really be a concern here. Even polling every second, the price rises only to $2.59 / mo, not exactly a bank buster.
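The arithmetic behind those figures can be reproduced with a few lines. This sketch assumes a 30-day month and the prices quoted above (1 cent per 10k requests originally, half that after the price drop); the function names are illustrative. With a 30-day month the 20-second long poll comes to 129,600 requests, so the 131k figure presumably uses a slightly longer month.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 30-day month, as in the text

def monthly_requests(interval_s: int) -> int:
    """Number of polls per month when polling once every interval_s seconds."""
    return SECONDS_PER_MONTH // interval_s

def monthly_cost(requests: int, usd_per_10k: float) -> float:
    return requests * usd_per_10k / 10_000

minutely = monthly_cost(monthly_requests(60), 0.01)      # cron job every minute
every_second = monthly_cost(monthly_requests(1), 0.01)   # polling every second
long_poll = monthly_cost(monthly_requests(20), 0.005)    # one long poll per 20 s

print(f"{minutely:.4f} {every_second:.3f} {long_poll:.4f}")
```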
However, it is possible to avoid frequent polling using a webservice that takes an SNS HTTP message. Such an architecture would work as follows: client pushes message to SNS, which pushes message to SQS and routes an HTTP request to your webservice, triggering it to drain the queue. You'd still want to poll the queue hourly or daily, just in case an HTTP request was dropped. In the end though, I'm not sure I can think of any scenario which really justifies such complexity. I'd much rather pay $0.04 a month to have a dirt simple cron job polling my queue.
