Triggering a SWF Workflow based on SQS messages - amazon-sqs

Preamble: I'm trying to put together a proposal for what I assume to be a very common use-case, and I'd like to use Amazon's SWF and SQS to accomplish my goals. There may be other services that will better match what I'm trying to do, so if you have suggestions please feel free to throw them out.
Problem: The need at its most basic is for a client (mobile device, web server, etc.) to post a message that will be processed asynchronously without a response to the client - very basic.
The intended implementation is to for the client to post a message to a pre-determined SQS queue. At that point, the client is done. We would also have a defined SWF workflow responsible for picking up the message off the queue and (after some manipulation) placing it in a Dynamo DB - again, all fairly straightforward.
What I can't seem to figure out though, is how to trigger the workflow to start. From what I've been reading a workflow isn't meant to be an indefinite process. It has a start, a middle, and an end. According to the SWF documentation, a workflow can run for no longer than a year (Setting Timeout Values in SWF).
So, my question is: If I assume that a workflow represents one message-processing flow, how can I start the workflow whenever a message is posted to the SQS?
Caveat: I've looked into using SNS instead of SQS as well. This would allow me to run a server that could subscribe to SNS, and then start the workflow whenever a notification is posted. That is certainly one solution, but I'd like to avoid setting up a server for a single web service which I would then have to manage / scale according to the number of messages being processed. The reason I'm looking into using SQS/SWF in the first place is to have an auto-scaling system that I don't have to worry about.
Thank you in advance.

I would create a worker process that listens to the SQS queue. Upon receiving a message it calls into SWF API to start a workflow execution. The workflow execution id should be generated based on the message content to ensure that duplicated messages do not result in duplicated workflows.

You can use AWS Lambda for this purpose . A lambda function will be invoked by SQS event and therefore you don't have to write a queue poller explicitly . The lambda function could then make a post request to SWF to initiate the workflow

Related

AWS redrive to source queue in Java

AWs recently added a feature that allows you to send messages from a DLQ back to source queue via a lick of a button "redrive to source". I wanted to know is this possible via an API call.
I know how to extract a message from dlq queue and re send it, But with this new function i was hoping i wouldnt need to handle the messages, but rather just call a method perhaps on the queue and if its configured it would do the redelivery.
Anyone know if this is possible, as im searching in the net.
I believe currently this feature is only available via the management console UI and not as an API

Process job using workers while client waits and return response when complete

I'm building an API using Rails where requests come in and they need to be executed by a cluster of workers running on a different server (these workers call remote APIs and parse the data, etc...). I'm going to be using Sidekiq or Resque to handle the queueing/processing of that.
My issue is the client needs to wait while this is happening and the controller needs to return the response to the client once it's complete. How would I handle this in the controller? We're using a redis backend, so I was thinking something along the lines of subscribing to a pub/sub channel and waiting for the worker to publish a status message. The controller would wait for a set time period and then return a 'check back later' response to the client if it doesn't receive a message in time. What would be the best way to implement that, or is there a better solution?
Do not make your clients wait! There are a lot of issues if you make the controller block for a long running job:
Other programs may assume the request timed out (proxies, browsers, scripts, etc.)
It makes your API endpoints become a source for denial of service
It requires you to put more engineering work into web servers (since a rails process can't handle another web request while it's handling the blocking call)
Part of the reason of using Sidekiq or Resque is the avoid controllers that do heavily lifting during the http request.
Instead, background jobs should report their status to the database. Then web server should query and return to the client the latest status from the database.
If clients need more immediate feedback, you can:
make clients constantly poll
post request to the client (if the API consumer is another webserver)
use another protocol mechanism (eg - websockets).

Design considerations for heavy loaded asp.net MVC + Web API application and asynchronous message bus

I'm planning to build a quite large application (large in term of concurrent user / number of request, not in term of features).
Basically, I'll have a service somewhere, that is waiting for commands execute them, and acknowledge the completion later. This service will use a service bus to communicate, making the execution eventual, before a acknowledge message is issued.
The consumers of this service can be any kind of application (WPF, SL, ...) but my main (and first) client will be an asp.net MVC application + WebApi (.Net 4.5) or MVC only (.Net 4.0) with ajax controller actions.
The web application will be relying on Ajax call to keep a user friendly responsive application.
I'm quite new to such full blown async architecture, and I'm having some questions to avoid future headache :
my web api calls can take some amount of times. How should I design properly the api to support long running operations (some kind of async?). I've read about the new async keyword, but for the sake of knowledge, I'd like to understand what's behind.
My calls to the service will consist is publishing a message and wait for the ack message. If I wrap this in a single method, how should I write this method? Should I "block" until the ack is received (I suppose I shouldn't)? Should I return a Task object and let the consumer decide?
I'm also wondering if SignalR can help me. With signalR, I think I can use a real "fire and forget" command issuing, and route up to the client to ack message.
Am I completely out of subject, and should I take another approach?
In term of implementation details / framework, I think I'll use :
Rabbitmq as messaging system
Masstransit to abstract the messaging system
asp.MVC 4 to build the UI
Webapi to isolate command issuing out of UI controllers, and to allow other kind of client to issue commands
my web api calls can take some amount of times.
How should I design properly the api to support long
running operations (some kind of async?).
I'm not 100% sure where you're going. You ask questions about Async but also mention message queuing, by throwing in RabbitMQ and MassTransit. Message queuing is asynchronous by default.
You also mention executing commands. If you're referring to CQRS, you seperate commands and queries. But what I'm not 100% about is what you're referring to when mentioning "long running processes".
When you query data, the data should already be present. Preferably in a way that is needed for the question at hand.
When you query data, no long-running-process should be started
When you execute commands, a long-running-processes can be started. But that's why you should use message queuing. Specify a task to start the long running process, create a message for it, throw it onto the queue, forget about it altogether. Some other process in the background will pick it up.
When the command is executed, the long-running-process can be started.
When the command is executed, a database can be updated with data
This data can be used by the API if someone requests data
When using this model, it doesn't matter that the long-running-process might take up to 10 minutes to complete. I won't go into detail on actually having a single thread take up to 10 minutes to complete, including locks on database, but I hope you get the point. Your API will be free almost instantly after throwing a message onto the queue. No need for Async there.
My calls to the service will consist is publishing a message and wait for the ack message.
I don't get this. The .NET Framework and your queuing platform take care of this for you. Why would you wait on an ack?
In MassTransit
Bus.Instance.Publish(new YourMessage{Text = "Hi"});
In NServiceBus
Bus.Publish(new YourMessage{Text = "Hi"});
I'm also wondering if SignalR can help me.
I should think so! Because of the asynchronous nature of messaging, the user has to 'wait' for updates. If you can provide this data by 'pushing' updates via SignalR to the user, all the better.
Am I completely out of subject, and should I take another approach?
Perhaps, I'm still not sure where you're going.
Perhaps read up on the following resources.
Resources:
http://www.udidahan.com/2013/04/28/queries-patterns-and-search-food-for-thought/
http://www.udidahan.com/2011/10/02/why-you-should-be-using-cqrs-almost-everywhere%E2%80%A6/
http://www.udidahan.com/2011/04/22/when-to-avoid-cqrs/
http://www.udidahan.com/2012/12/10/service-oriented-api-implementations/
http://bloggingabout.net/blogs/dennis/archive/2012/04/25/what-is-messaging.aspx
http://bloggingabout.net/blogs/dennis/archive/2013/07/30/partitioning-data-through-events.aspx
http://bloggingabout.net/blogs/dennis/archive/2013/01/04/databases-and-coupling.aspx
my web api calls can take some amount of times. How should I design
properly the api to support long running operations (some kind of
async?). I've read about the new async keyword, but for the sake of
knowledge, I'd like to understand what's behind.
Regarding Async, I saw this link being recommended on another question on stackoverflow:
http://msdn.microsoft.com/en-us/library/ee728598(v=vs.100).aspx
It says that when a request is made to an ASP .NET application, a thread is assigned to process the request from a limited thread pool.
An asynchronous controller action releases the thread back to the thread pool so that it is ready to accept addtitional requests. Within the action the operation which needs to be executed asynchronously is assigned to a callback controller action.
The asynchronous controller action is named using Async as the suffix and the callback action has a Completed suffix.
public void NewsAsync(string city) {}
public ActionResult NewsCompleted(string[] headlines) {}
Regarding when to use Async:
In general, use asynchronous pipelines when the following conditions
are true:
The operations are network-bound or I/O-bound instead of CPU-bound.
Testing shows that the blocking operations are a bottleneck in site performance and that IIS can service more requests by using
asynchronous action methods for these blocking calls.
Parallelism is more important than simplicity of code.
You want to provide a mechanism that lets users cancel a long-running request.
I think developing your service using ASP .NET MVC with Web API and using Async controllers where needed would be a good approach to developing a highly available web service.
Using a message based service framework like ServiceStack looks good too:
http://www.servicestack.net/
Additional resources:
http://msdn.microsoft.com/en-us/magazine/cc163725.aspx
http://www.codethinked.com/net-40-and-systemthreadingtasks
http://dotnet.dzone.com/news/net-zone-evolution
http://www.aaronstannard.com/post/2011/01/06/asynchonrous-controllers-ASPNET-mvc.aspx
http://channel9.msdn.com/Events/TechDays/Techdays-2012-the-Netherlands/2287
http://www.dotnetcurry.com/ShowArticle.aspx?ID=948 // also shows setup of performance tests
http://www.asp.net/mvc/tutorials/mvc-4/using-asynchronous-methods-in-aspnet-mvc-4
http://visualstudiomagazine.com/articles/2013/07/23/async-actions-in-aspnet-mvc-4.aspx
http://hanselminutes.com/327/everything-net-programmers-know-about-asynchronous-programming-is-wrong
http://www.hanselman.com/blog/TheMagicOfUsingAsynchronousMethodsInASPNET45PlusAnImportantGotcha.aspx

Suggestions for how to write a service in Rails 3

I am building an application which will send status requests to users (via email & sms) on a regular basis. I want to execute the service each hour which will:
Query the database for all requests that need to be sent (based on some logic)
Send the requests through Amazon's Simple Email Service (this is already working)
Write a record of the status request notification back to the data store
I am considering wrapping up this series of operations into a single controller with an end point that can be called remotely to kick off the process within the rails app.
Longer term, I will break this process out into an app that can be run independently of my rails app, but for now I'm just trying to keep it simple.
My first inclination is to build the following:
Controller with the following elements:
A method which will orchestrate the steps outlined above (and can be called externally)
A call to the status_request model which will bring back a collection of request needing to be sent
A loop to iterate through the pending requests, which will:
Make a call to my AWS Simple Email Service module to actually send the email, and
Make a call to the status_request model to log the request back to the database
Model:
A method on my status_request model which will bring back a collection of requests that need to be sent
A method in my status_request model which will log that a notification was sent
Since this will behave as a service that gets called periodically from an outside scheduler I don't think I'll need a view for this operation. (Will, of course, need views to show users and admins what requests have been sent, but that's later...).
As someone new to Rails, I'm asking for review of this approach and any suggestions you may have.
Thanks!
Instead of a controller which Jeff pointed out exposes a security risk, you may just want to expose a rake task and use cron to invoke it on an hourly basis.
If you are still interested in building a controller, look at devise gem and its single access token, token_authenticatable, for securing the methods you are exposing.
You may also want to look at delayed_job or resque to offload the call to status_request and the loop to AWS simple service to a background worker process.
You may want a seperate controller and view for the log file so you can review progress on demand.
And if you want to get real fancy use Amazon SNS to send you alerts when the service reaches some unacceptable level of failures, backlog, etc.
Since you are trying to invoke this from an outside process, your approach should work. You could also have a worker process that processes task when they are there.
You will need routes to expose your service, and you may want to also make security decisions. How will the service that invokes your application authenticate so all others can't hit it at will?
Another consideration should be how many emails are you sending. If there are enough, we may want to look into the fact that writing this sort of loop is going to be extremely top heavy; and may affect users on the current system if it's a web application.
In the end, there are many ways to do this. I would focus on the performance/usage you expect as well as security. There's never one perfect way to solve a problem like this, and your way should just be aware of the variables it will need to be operating within.
Resque and Redis might be helpful to you in scheduling and performing operatio n .They are simple and superfast, [here](http://railscasts.com/episodes/271-resque] is a simple tut on same.

How to send many emails via ASP.NET without delaying response

Following a specific action the user takes on my website, a number of messages must be sent to different emails. Is it possible to have a separate thread or worker take care of sending multiple emails so as to avoid having the response from the server take a while to return if there are a lot of emails to send?
I would like to avoid using system process or scheduled tasks, email queues.
You can definitely spawn off a background thread in your controller to handle the emails asynchronously.
I know you want to avoid queues, but another thing i have done in the past is written a windows service that pulls email from a DB queue and processes it at certain intervals. This way you can separate the 2 applications if there is a lot of email to be sent.
This can be done in many different ways, depending on how large your application is and what kind of reliability you want. Any of these ways should help you achieve what you want (in ascending order based on complexity):
If you're using IIS SMTP Server or another mail server that supports a pickup directory option, you can go with that. With this option, instead of sending the emails directly, they are saved first in the pickup directory. Your call will immediately return after the email is saved in the pickup directory, so the user won't have to wait until the email is sent. On the other hand, the server will try to send the email as soon as it's saved in the pickup directory so it's almost immediate (just without blocking the call).
You can use a background thread like described in other answers. You'll need to be careful with this option as the thread can end unexpectedly before it finishes its job. You'll need to add some code to make sure this works reliably (personally, I'd prefer not to use this option).
Using a messaging queue server like MSMQ. This is more work and you probably should only look into this if you have a large scale application or have good reasons not to use the first option with the pickup directory.
There are a few ways you could do this.
You could store enough details about the message in the database, and write a windows service to loop through them and send the email. When the user submits the form it just inserts the required data about the message and trusts the service will pick it up. Almost an email queue which you said you didn't want, but you're going to end up in a queue situation with almost any solution.
Another option would be to drop in NServiceBus. Use that for these kinds of tasks.
I typically compile the message body and store that in a table in the db along with the from and to addresses, a subject, and a timestamp indicating when the email was sent. Then I have a background task check the table periodically and pull any that haven't been sent. This task attempts to send each email and updates the timestamp accordingly. One advantage of storing the compiled message body up front is that the background task doesn't have to do any processing of context-specific data, and therefore can be pretty darn simple.
Whenever an operation like is hingent upon an event, there is always the possibility something will go wrong.
In ASP.NET you can spawn multiple threads and have those threads do the action. Make sure you tell the thread it's a background thread, otherwise ASP.NET might way for the thread to finish before rendering your page:
myThread.IsBackground = true;
I know you said you didn't want to use system process or scheduled tasks, but a windows service would be a viable approach to this as well. The approach would be to use MS Queue, or save the actions needing to be done in a DataBase table. Then have a windows service check every minute or so and do those actions.
This way, if something fails (Email server down) those emails / actions can still be done.
They will also be recorded for audit's (which is very nice to have).
This method allows you're web site to function as a website while offloading these tasks to another service. The last thing you need is for multiple ASP.NET processes to be used up waiting for emails to send. let something else handle that.

Resources