Over the last couple of years I have done a fair amount of work with Amazon SWF, but the following points are still unclear to me and I have not been able to find any straightforward answers on any forums yet.
These are pretty basic requirements, I suppose, that others must have come across too. It would be great if someone could clarify them.
Is there a simple way to return a workflow execution result (maybe just something as simple as boolean) back to workflow starter?
Is there a way to catch an Activity timeout exception, so that we can run customised actions in such scenarios?
Why doesn't WorkflowExecutionHistory contain Activities, just Events?
Why is there no simple way of restarting a workflow from the point where it failed?
I am considering using SWF for more business processes at my workplace, but these limitations/doubts are holding me back!
FINAL WORKING SOLUTION
public class ReturnResultActivityImpl implements ReturnResultActivity {

    SettableFuture<WorkflowResult> future;

    public ReturnResultActivityImpl() {
    }

    public ReturnResultActivityImpl(SettableFuture<WorkflowResult> future) {
        this.future = future;
    }

    public void returnResult(WorkflowResult workflowResult) {
        System.out.print("Marking future as Completed");
        future.set(workflowResult);
    }
}
public class WorkflowResult {

    private boolean success;
    private String note;

    public WorkflowResult(boolean s, String n) {
        this.success = s;
        this.note = n;
    }

    public boolean isSuccess() { return success; }

    public String getNote() { return note; }
}
public class WorkflowStarter {

    @Autowired
    ReturnResultActivityClient returnResultActivityClient;
    @Autowired
    DummyWorkflowClientExternalFactory dummyWorkflowClientExternalFactory;
    @Autowired
    AmazonSimpleWorkflowClient swfClient;

    String domain = "test-domain";
    boolean isRegister = true;
    int days = 7;
    int terminationTimeoutSeconds = 5000;
    int threadPollCount = 2;
    int taskExecutorThreadCount = 4;

    public String testWorkflow() throws Exception {
        SettableFuture<WorkflowResult> workflowResultFuture = SettableFuture.create();
        String taskListName = "testTaskList-" + RandomStringUtils.randomAlphabetic(8);

        ReturnResultActivity activity = new ReturnResultActivityImpl(workflowResultFuture);
        SpringActivityWorker activityWorker = buildReturnResultActivityWorker(taskListName, Arrays.asList(activity));
        activityWorker.start(); // the worker must be polling before the workflow schedules the activity

        DummyWorkflowClientExternalFactory factory = new DummyWorkflowClientExternalFactoryImpl(swfClient, domain);
        factory.getClient().doSomething(taskListName);

        WorkflowResult result = workflowResultFuture.get(20, TimeUnit.SECONDS);
        return "Call result note - " + result.getNote();
    }

    public SpringActivityWorker buildReturnResultActivityWorker(String taskListName, List activityImplementations)
            throws Exception {
        return setupActivityWorker(swfClient, domain, taskListName, isRegister, days, activityImplementations,
                terminationTimeoutSeconds, threadPollCount, taskExecutorThreadCount);
    }
}
public class Workflow {

    @Autowired
    private DummyActivityClient dummyActivityClient;
    @Autowired
    private ReturnResultActivityClient returnResultActivityClient;

    @Override
    public void doSomething(final String resultActivityTaskListName) {
        Promise<Void> activityPromise = dummyActivityClient.dummyActivity();
        returnResult(resultActivityTaskListName, activityPromise);
    }

    @Asynchronous
    private void returnResult(final String taskListname, Promise waitFor) {
        ActivitySchedulingOptions schedulingOptions = new ActivitySchedulingOptions();
        schedulingOptions.setTaskList(taskListname);
        WorkflowResult result = new WorkflowResult(true, "All successful");
        returnResultActivityClient.returnResult(result, schedulingOptions);
    }
}
The standard pattern is to host a special activity in the workflow starter process that is used to deliver the result. Use a process-specific task list to make sure it is routed to the correct instance of the starter. Here are the steps to implement it:
Define an activity to receive the result, for example "returnResultActivity". Make the activity implementation complete the Future passed to its constructor upon execution.
When the workflow is started it receives "resultActivityTaskList" as an input argument. At the end the workflow calls this activity with a workflow result. The activity is scheduled on the passed task list.
The workflow starter creates an ActivityWorker and an instance of a Future. Then it creates an instance of "returnResultActivity" with that future as a constructor parameter.
Then it registers the activity instance with the activity worker and configures it to poll on a randomly generated task list name. Then it calls "start workflow execution" passing the generated task list name as an input argument.
Then it waits on the Future to complete. The future.get() call is going to return the workflow result.
Yes, if you are using the AWS Flow Framework, a timeout exception is thrown when an activity times out; if you are not using the Flow Framework, you are making your life 100 times harder. BTW, a workflow timeout is thrown into the parent workflow as a timeout exception as well. It is not possible to catch a workflow timeout exception from within the timing-out instance itself. In that case the recommendation is not to rely on the workflow timeout, but to create a timer that fires and notifies the workflow logic that some business event has timed out.
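As a hedged illustration (reusing the dummyActivityClient from the code above; the compensation logic is a placeholder), catching an activity timeout from Flow Framework workflow code looks roughly like this, since asynchronous error handling uses TryCatch rather than a plain try/catch:

// Sketch only: TryCatch is from com.amazonaws.services.simpleworkflow.flow.core
new TryCatch() {
    @Override
    protected void doTry() throws Throwable {
        dummyActivityClient.dummyActivity();
    }

    @Override
    protected void doCatch(Throwable e) throws Throwable {
        if (e instanceof ActivityTaskTimedOutException) {
            // run customised actions for the timeout scenario here
        } else {
            throw e;
        }
    }
};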
Because a single activity execution has multiple events associated with it. It should be pretty easy to write code that converts the history into whatever representation of activities you like; such code would just match up the events that relate to each activity. Each event always carries a reference to its related events, so it is easy to roll them up into a higher-level representation.
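A rough sketch of such a roll-up, assuming the Java SDK's history model (HistoryEvent and EventType from com.amazonaws.services.simpleworkflow.model; history is the History object returned by GetWorkflowExecutionHistory). The grouping key is the ID of the ActivityTaskScheduled event that began each activity:

// Sketch: group raw history events into one bucket per activity execution.
// Started/Completed/TimedOut events point back to the scheduled event
// via getScheduledEventId().
Map<Long, List<HistoryEvent>> eventsByActivity = new HashMap<>();
for (HistoryEvent event : history.getEvents()) {
    Long scheduledEventId = null;
    switch (EventType.fromValue(event.getEventType())) {
        case ActivityTaskScheduled:
            scheduledEventId = event.getEventId();
            break;
        case ActivityTaskStarted:
            scheduledEventId = event.getActivityTaskStartedEventAttributes().getScheduledEventId();
            break;
        case ActivityTaskCompleted:
            scheduledEventId = event.getActivityTaskCompletedEventAttributes().getScheduledEventId();
            break;
        case ActivityTaskTimedOut:
            scheduledEventId = event.getActivityTaskTimedOutEventAttributes().getScheduledEventId();
            break;
        default:
            break; // decision, timer, and marker events are ignored in this sketch
    }
    if (scheduledEventId != null) {
        eventsByActivity.computeIfAbsent(scheduledEventId, k -> new ArrayList<>()).add(event);
    }
}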
Unfortunately there is no easy answer to this one. Ideally SWF would support restarting a workflow by copying its history up to the failure point, but that is not supported. I personally believe that a workflow should be written in a way that it never fails, but always deals with failures without failing itself. Obviously that doesn't work for failures due to unexpected conditions; in that case, writing the workflow so that it can be restarted from the beginning is the simplest approach.
Related
I've encountered a dependency injection scenario which I cannot find a way through.
We currently have an Azure function.
We are using dependency injection via the FunctionsStartup attribute.
That all works fine, until I get asked to make it work for multiple environments.
The tester found it too onerous to deploy to 7 different environments, so I was asked to re-jig the function so that it runs (in a loop) for those environments.
That means 7 different IConfigurations and somehow having 7 separate compartmentalised IOC registrations of services.
I can't think of a way of doing that without significantly restructuring the way abstractions are resolved. Even if you set up registrations in a loop and inject an IEnumerable of a service, when the container goes to resolve a child dependency it just pulls the last one registered, rather than the one meant to correlate with the current item being iterated.
So, something like this (using Autofac):
Registration
foreach (var configuration in configurations)
{
    containerBuilder.Register<ICosmosDbService<AccountUsage>>(sp =>
    {
        var dBConfig = CosmosDBHelper.GetProjectDatabaseConfig(configuration.Value, Project.Jupiter);
        return CosmosClientInitializer<AccountUsage>.Initialize(dBConfig);
    }).As<ICosmosDbService<AccountUsage>>();
}
Usage
private readonly IEnumerable<IAccountUsageService> _accountUsageService;

public JobScheduler(IEnumerable<IAccountUsageService> accountUsageService)
{
    _accountUsageService = accountUsageService;
}

[FunctionName("JobScheduler")]
public async Task Run([TimerTrigger("0 */2 * * * *")] TimerInfo myTimer, ILogger log)
{
    log.LogInformation($"Job Scheduler Timer trigger function executed at: {DateTime.Now}");

    try
    {
        foreach (var usageService in _accountUsageService)
        {
            var logs = await usageService.GetCurrentAccountUsage("gfkjdsasjfa");
            // ...
        }
    }
    catch (Exception ex)
    {
        log.LogError(ex, "Job scheduler failed");
    }
}
I realise this kind of DI usage is not ideal (and does not even work).
Is there a way to structure an Azure Function such that it can execute for different configurations in a compartmentalised manner? Or is this really just fighting against the technology?
You've got a couple of ways to do this: either inject the right dependencies into the function constructor, or resolve them dynamically using a service-locator type approach with a named instance.
Let's consider the second approach and what it would mean for your implementation. As you demonstrated, you'd loop through your instances, resolve the dependency you want to use, and then invoke it:
foreach (var usageService in _accountUsageService)
{
    var logs = await usageService.GetCurrentAccountUsage("named-instance");
    logs.DoSomething();
}
This is technically possible, but now you're doing batch processing: you're doing more than one piece of work triggered by a single event (the timer), which means you have to deal with a couple of extra problems. What should you do if there's a failure with one of the instances, and what if one of the instances is running slowly?
Ideally, you want functions to do the smallest bit of work they can and complete quickly. You don't want failure or slowness in one particular instance impacting the other instances. By breaking it down to the smallest piece of work (think: one event trigger does one piece of work), you can take advantage of the functions runtime for things like retries on failures, and threading and concurrency are handled for you by the runtime.
With that in mind, there are a couple of ways you could do this: a) multiple function signatures and a service-resolver approach, e.g.
public class JobScheduler
{
    private readonly IEnumerable<IAccountUsageService> _accountUsageService;

    public JobScheduler(IEnumerable<IAccountUsageService> accountUsageService)
    {
        _accountUsageService = accountUsageService;
    }

    [FunctionName("FirstInstance")]
    public async Task FirstInstance([TimerTrigger("%MetricPoller:Schedule%")] TimerInfo myTimer)
    {
        // GetNamedInstance: a resolver (e.g. an extension method) that picks
        // the registration matching the given name
        var logs = await _accountUsageService.GetNamedInstance("instance-a");
        logs.DoSomething();
    }

    [FunctionName("SecondInstance")]
    public async Task SecondInstance([TimerTrigger("%MetricPoller:Schedule%")] TimerInfo myTimer)
    {
        var logs = await _accountUsageService.GetNamedInstance("instance-b");
        logs.DoSomething();
    }
}
or b), multiple classes with the necessary dependencies injected
public class JobSchedulerFirstInstance
{
    private readonly ILogs _logs;

    public JobSchedulerFirstInstance(ILogs logs)
    {
        _logs = logs;
    }

    [FunctionName("FirstInstance")]
    public Task FirstInstance([TimerTrigger("%MetricPoller:Schedule%")] TimerInfo myTimer)
    {
        _logs.DoSomething();
        return Task.CompletedTask;
    }
}
I'd personally lean towards the multiple-classes approach and register named instances with my container. A bit of extra wire-up work is needed, but you'll end up with lots of small, similar-looking classes that are basically just plumbing that the functions runtime executes.
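As a rough sketch of that wire-up with Autofac (hedged: the configuration loop, CosmosDBHelper, and Project.Jupiter are carried over from the question; the environment names and the per-class constructor parameter are hypothetical stand-ins for your design):

// One named registration per environment, then one function class per
// environment that resolves its own named dependency.
foreach (var configuration in configurations)
{
    var envName = configuration.Key; // hypothetical: e.g. "instance-a"
    containerBuilder.Register(ctx =>
    {
        var dbConfig = CosmosDBHelper.GetProjectDatabaseConfig(configuration.Value, Project.Jupiter);
        return CosmosClientInitializer<AccountUsage>.Initialize(dbConfig);
    }).Named<ICosmosDbService<AccountUsage>>(envName);
}

// Each small function class then receives exactly the instance meant for it
// (JobSchedulerFirstInstance stands in for your per-environment class):
containerBuilder.Register(ctx => new JobSchedulerFirstInstance(
    ctx.ResolveNamed<ICosmosDbService<AccountUsage>>("instance-a")));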
I am trying to achieve a fire-and-forget type of effect with webflux, thymeleaf and r2dbc. I have two endpoints, one to add an employee and another to list all employees. I want to simulate slow database access, so I have a thread sleep of several seconds before I call the DB.
Now, the effect I expect to see when I call /add is that my controller returns immediately and the add page is rendered at once. However, I'm not sure how to achieve this: with the current code, nap() happens before I can return a Mono. In other words, I'm trying to run a long-running job in the background without blocking the controller.
I have the following model:
@Data
public class Employee {

    @Id
    private Long id;
    private String name;
}
The annotated controller has following methods:
#GetMapping(value = "/")
public String home(Model model) {
model.addAttribute("employees", repo.findAll());
return "home";
}
#GetMapping(value = "/add")
public Mono<String> add() {
return Mono
.defer(this::getEmployee)
.doOnNext(e -> repo.save(e).subscribe())
.thenReturn("add");
}
private Mono<Employee> getEmployee() {
final var e = new Employee();
e.setName("John");
nap(); // calls thread sleep for a few sec
return Mono.just(e);
}
My question is: how can I wrap the long-running job but at the same time preserve a controller-based notation (instead of the functional one) and also render the add page immediately? I am aware of some similar questions like this and this, but I don't seem to be able to achieve the behaviour I need.
Edit:
lkatiforis' suggestion and this SO question were a push in the right direction. I had to adjust their example a bit because the employee didn't persist. The change is in add():
public String add() {
    Mono.just(employee)
            .delayElement(Duration.ofSeconds(5))
            .doOnNext(e -> repo.save(e).subscribe())
            .subscribe();
    return "add";
}
employee is just an instance of Employee with a populated name. The delayElement operator pauses for 5 seconds without blocking. Finally, I had to call subscribe() both on repo.save() and at the end of the chain for it to work; I assume that if subscribe() is only called inside doOnNext(), the main chain that starts with Mono.just() is never executed.
I guess the nap() method executes Thread.sleep or something similar, right? Thread.sleep blocks the main thread, making the application unresponsive. You can use the delayElement operator to simulate a long-running operation:
private Mono<Employee> getEmployee() {
    final var e = new Employee();
    e.setName("John");
    return Mono.just(e).delayElement(Duration.ofSeconds(5));
}
We have an Entity Framework execution strategy coded in our lower environment. How do we test this to show it's actually working? We don't want to release to Prod without something to say we aren't introducing new problems.
An easy way is to use a listener in which you can throw an exception, and subscribe that listener to the DbContext.
public class CommandListener
{
    [DiagnosticName("Microsoft.EntityFrameworkCore.Database.Command.CommandExecuting")]
    public void OnCommandExecuting(DbCommand command, DbCommandMethod executeMethod, Guid commandId, Guid connectionId, bool async, DateTimeOffset startTime)
    {
        throw new TimeoutException("Test exception");
    }

    [DiagnosticName("Microsoft.EntityFrameworkCore.Database.Command.CommandExecuted")]
    public void OnCommandExecuted(object result, bool async)
    {
    }

    [DiagnosticName("Microsoft.EntityFrameworkCore.Database.Command.CommandError")]
    public void OnCommandError(Exception exception, bool async)
    {
    }
}
Subscribe the listener to the DbContext, for instance in Startup.cs:
var context = provider.GetService<SomeDbContext>();
var listener = context.GetService<DiagnosticSource>();
(listener as DiagnosticListener).SubscribeWithAdapter(new CommandListener());
As TimeoutException is a transient exception in SqlServerRetryingExecutionStrategy.cs (if you use the default retrying strategy), you will get a TimeoutException as many times as your strategy's max retry count allows. Finally, you should get a RetryLimitExceededException as the result of the request.
You should see the TimeoutExceptions in your app logs. It is also a good idea to turn on transaction logging: "Microsoft.EntityFrameworkCore.Database.Transaction": "Debug"
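As a hedged sketch of what the resulting test could look like (assuming xUnit; ThrowTransientError is the hypothetical static toggle described below, and RetryLimitExceededException is what EF Core throws once retries are exhausted):

// Sketch: force a transient error on every execution and assert the
// strategy eventually gives up with RetryLimitExceededException.
[Fact]
public async Task ExecutionStrategy_RetriesThenThrows()
{
    CommandListener.ThrowTransientError = true; // hypothetical toggle
    await Assert.ThrowsAsync<RetryLimitExceededException>(
        () => _context.Users.ToArrayAsync());
}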
What I did to manage to throw the transient exception and debug the strategy (for testing and playing around purposes only):
I added ExecutionStrategyBase.cs and TestServerRetryingExecutionStrategy.cs. The first one is a clone of ExecutionStrategy.cs and the second one is a clone of SqlServerRetryingExecutionStrategy.cs.
In Startup.cs I set the retrying strategy:

services.AddDbContext<SomeDbContext>(options =>
{
    options.UseSqlServer(Configuration.GetConnectionString("DefaultConnection"), sqlOption =>
    {
        sqlOption.ExecutionStrategy(dependencies =>
        {
            return new TestSqlRetryingStrategy(
                dependencies,
                settings.SqlMaxRetryCount,
                settings.SqlMaxRetryDelay,
                null);
        });
    });
});
In OnCommandExecuting of CommandListener.cs I just check a static bool variable to decide whether to throw the TimeoutException, and in ExecutionStrategyBase.cs I switch that variable.
Thus, I managed to throw a transient exception on the first execution of the query and get a successful execution on the second try. Now I am thinking about starting some long-running transaction and killing that transaction's session in SSMS during its execution.
Also, I found out that for a query like
var users = context.Users.AsNoTracking().ToArrayAsync() the execution strategy is not invoked, and I am stuck on that. I have been struggling with it for a couple of days but still can't figure it out. If I remove AsNoTracking, or replace ToArrayAsync() with something like FirstAsync(), all goes well.
I'm evaluating Amazon SWF as an option to build a distributed workflow system. The main language will be Java, so the Flow Framework is an obvious choice. There's just one thing that keeps puzzling me, and I would like to get some other opinions before I can recommend it as a key component in our architecture:
The examples are all about tasks that produce a result after a deterministic, relatively short period of time (i.e. after some minutes). In our real-life business workflows the matter looks different: we have tasks that could potentially take weeks to complete. I checked the calculator already; workflows that live for 30 days or so do not lead to a cost explosion, so it seems that possibility has been accounted for.
Did anybody use SWF for some scenario like this and can share any experience? Are there any recommendations, best practices how to design a workflow like this? Is Flow the right choice here?
It seems to me Activity implementations are expected to eventually return a value synchronously, however, for long running transactions we would rather use messages to send back worker results asynchronously.
Any helpful feedback is appreciated.
From the Amazon Simple Workflow Service point of view an activity execution is a pair of API calls: PollForActivityTask and RespondActivityTaskCompleted that share a task token. There is no requirement for those calls coming from the same thread, process or even host.
By default the AWS Flow Framework executes an activity synchronously. Use the @ManualActivityCompletion annotation to indicate that the activity is not complete upon return of the activity method. Such an activity should be explicitly completed (or failed) using the provided ManualActivityCompletionClient.
Here is an example taken from the AWS Flow Framework Developer Guide:
@ManualActivityCompletion
public String getName() {
    ActivityExecutionContext executionContext = contextProvider.getActivityExecutionContext();
    String taskToken = executionContext.getTaskToken();
    sendEmail("abc@xyz.com",
            "Please provide a name for the greeting message and close task with token: " + taskToken);
    return "This will not be returned to the caller";
}

public class CompleteActivityTask {

    public void completeGetNameActivity(String taskToken) {
        AmazonSimpleWorkflow swfClient = new AmazonSimpleWorkflowClient(…); // pass in user credentials
        ManualActivityCompletionClientFactory manualCompletionClientFactory =
                new ManualActivityCompletionClientFactoryImpl(swfClient);
        ManualActivityCompletionClient manualCompletionClient =
                manualCompletionClientFactory.getClient(taskToken);
        String result = "Hello World!";
        manualCompletionClient.complete(result);
    }

    public void failGetNameActivity(String taskToken, Throwable failure) {
        AmazonSimpleWorkflow swfClient = new AmazonSimpleWorkflowClient(…); // pass in user credentials
        ManualActivityCompletionClientFactory manualCompletionClientFactory =
                new ManualActivityCompletionClientFactoryImpl(swfClient);
        ManualActivityCompletionClient manualCompletionClient =
                manualCompletionClientFactory.getClient(taskToken);
        manualCompletionClient.fail(failure);
    }
}
That an activity is implemented using @ManualActivityCompletion is an implementation detail. Workflow code calls it through the same interface and doesn't treat it any differently from an activity implemented synchronously.
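For illustration, a minimal workflow-side sketch (hedged: the class and client names follow the developer guide's greeter example, and sendGreeting is illustrative). From the workflow's perspective the manually completed activity is just another Promise, subject only to its timeout settings:

public class GreeterWorkflowImpl implements GreeterWorkflow {

    private final GreeterActivitiesClient client = new GreeterActivitiesClientImpl();

    @Override
    public void greet() {
        // getName() is the @ManualActivityCompletion activity shown above;
        // this Promise may stay unresolved for days or weeks, until an
        // external process calls ManualActivityCompletionClient.complete()
        // with the matching task token.
        Promise<String> name = client.getName();
        client.sendGreeting(name);
    }
}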
I have a few utility actions that return text output via return Content("my text","text/plain").
Sometimes these methods take a few minutes to run (i.e. log parsing, database maintenance).
I would like to modify my action method so that instead of returning all of the output at once, the text is instead streamed to the client when it is ready.
Here's a contrived example:
public ActionResult SlowText()
{
    var sb = new System.Text.StringBuilder();
    sb.AppendLine("This happens quickly...");
    sb.AppendLine("Starting a slow 10 second process...");
    System.Threading.Thread.Sleep(10000);
    sb.AppendLine("All done with 10 second process!");
    return Content(sb.ToString(), "text/plain");
}
As written, this action will return three lines of text after 10 seconds. What I want is a way to keep the response stream open, and return the first two lines immediately, and then the third line after 10 seconds.
I remember doing this 10+ years ago in Classic ASP 3.0 using the Response object. Is there an official, MVC-friendly way to accomplish this?
--
Update: using Razor .cshtml in the app; but not using any views (just ContentResult) for these actions.
Writing directly to the Response object should work, but only in some simple cases. Many MVC features depend on output writer substitution (e.g. partial views, the Razor view engine, and others), and if you write directly to the Response your output will be out of order.
However, if you don't use a view and instead write straight from the controller, you should be fine (assuming your action is not being called as a child action).
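A minimal sketch of that approach, adapted from the contrived example above (hedged: this bypasses the ActionResult pipeline entirely, so none of the usual result processing applies):

// Sketch: write and flush each chunk as it becomes available instead of
// buffering the whole response.
public void SlowText()
{
    Response.ContentType = "text/plain";
    Response.BufferOutput = false;

    Response.Write("This happens quickly...\n");
    Response.Write("Starting a slow 10 second process...\n");
    Response.Flush(); // the first two lines reach the client immediately

    System.Threading.Thread.Sleep(10000);

    Response.Write("All done with 10 second process!\n");
    Response.Flush();
}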
I would skip the MVC controller entirely, since you are going to break encapsulation anyway. In its place I'd use a bare IHttpHandler implementation, streaming directly to the aforementioned output stream.
You are exposing yourself to a browser timeout if the process takes longer than originally intended, and then you have no way to recover what happened, unless you implement a separate method that reports on the long-running process.
Given that you want the other method anyway, you can start the long-running process and return immediately, and have the browser poll that other method for the latest information on the process. The last time I had to do this, I kept it simple and just set the refresh header from the controller before returning the view.
As for starting a long running process, you can do something like this:
// in the controller class
delegate void MyLongProcess();

// ...
// in the method that starts the action
MyLongProcess processTask = new MyLongProcess(_someInstance.TheLongRunningImplementation);
processTask.BeginInvoke(new AsyncCallback(EndMyLongProcess), processTask);

// ...
public void EndMyLongProcess(IAsyncResult result)
{
    try
    {
        MyLongProcess processTask = (MyLongProcess)result.AsyncState;
        processTask.EndInvoke(result);
        // anything you needed at the end of the process
    }
    catch (Exception ex)
    {
        // an error happened, make sure to log this,
        // as it won't hit the global.asax error handler
    }
}
As for where you put the log of the actions that happened, it's up to you how long-lived you want it to be. It can be as simple as a static field/class where you add info on the ongoing process, or you can save it to a data store so that it survives an application recycle.
The above assumes this is all about a long-running process that reports the actions it has completed. Streaming is a different subject, but the above might still play a role: keep the operations in your controller, and leave only the piece responsible for streaming what becomes available to the client in the action result.
You can implement your custom ActionResult like ContentStreamingResult and use HttpContext, HttpRequest and HttpResponse in the ExecuteResult method.
public class ContentStreamingResult : ActionResult
{
    private readonly TextReader _reader;

    public ContentStreamingResult(TextReader reader)
    {
        _reader = reader;
    }

    public override void ExecuteResult(ControllerContext context)
    {
        var response = context.HttpContext.Response;
        response.ContentType = "text/plain";
        response.BufferOutput = false;
        // Read text from the reader, writing and flushing each line so it reaches the client as soon as it is ready
        string line;
        while ((line = _reader.ReadLine()) != null)
        {
            response.Write(line);
            response.Write('\n');
            response.Flush();
        }
    }
}
public class YourController : Controller
{
    public ContentStreamingResult DownloadText()
    {
        string text = "text text text";
        return new ContentStreamingResult(new System.IO.StringReader(text));
    }
}
Try Response.Flush and set BufferOutput to false. Note that rather than returning one of the standard action results, you have to write directly into the response object. You could probably use this in conjunction with AsyncController.