Run a long-running job using the fire-and-forget strategy with Thymeleaf in Reactor and r2dbc

I am trying to achieve a fire and forget type of effect with webflux, thymeleaf and r2dbc. I have two endpoints, one to add an employee and another to list all employees. I want to simulate a slow database access so I have a thread sleep of several seconds before I call the DB.
Now, the effect I expect to see when I call /add is that my controller returns immediately and the page add is rendered at once. However, I'm not sure how to achieve this. With the current code nap() happens before I can return a Mono. In other words, I'm trying to run a long running job in the background without blocking the controller.
I have the following model:
@Data
public class Employee {
    @Id
    private Long id;
    private String name;
}
The annotated controller has following methods:
@GetMapping(value = "/")
public String home(Model model) {
    model.addAttribute("employees", repo.findAll());
    return "home";
}
@GetMapping(value = "/add")
public Mono<String> add() {
    return Mono
        .defer(this::getEmployee)
        .doOnNext(e -> repo.save(e).subscribe())
        .thenReturn("add");
}
private Mono<Employee> getEmployee() {
    final var e = new Employee();
    e.setName("John");
    nap(); // calls Thread.sleep for a few seconds
    return Mono.just(e);
}
My question is how can I wrap the long running job but at the same time preserve a Controller based notation (instead of functional) and also render the add page immediately? I am aware of some similar questions like this and this, but I don't seem to be able to achieve the behaviour I need.
Edit:
lkatiforis' suggestion and this SO question were a push in the right direction. I had to adjust their example a bit because the employee didn't persist. The change is in add():
public String add() {
    Mono.just(employee)
        .delayElement(Duration.ofSeconds(5))
        .doOnNext(e -> repo.save(e).subscribe())
        .subscribe();
    return "add";
}
employee is just an instance of Employee with a populated name. The delayElement operator pauses for 5 seconds without blocking. Finally, I had to call subscribe() on repo.save() and at the end in order for it to work. I assume that if subscribe() is only called on doOnNext() then the main chain that starts with Mono.just() is never executed.

I guess the nap() method executes Thread.sleep or something similar, right? Thread.sleep blocks the thread that calls it (here, the thread handling the request), which makes the application unresponsive. You can use the delayElement operator to simulate a long-running operation instead:
private Mono<Employee> getEmployee() {
    final var e = new Employee();
    e.setName("John");
    return Mono.just(e).delayElement(Duration.ofSeconds(5));
}
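For comparison, the same fire-and-forget shape can be sketched with plain java.util.concurrent instead of Reactor. This is a swapped-in technique for illustration only, not the WebFlux solution discussed above: the class FireAndForgetSketch, its in-memory saved list (standing in for repo.save) and the lastJob handle are all assumptions.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the controller; not Reactor or Spring code.
class FireAndForgetSketch {
    // In-memory list standing in for repo.save(...) (assumption).
    final List<String> saved = new CopyOnWriteArrayList<>();
    // Handle to the background job, kept only so a demo or test can wait for it.
    volatile CompletableFuture<Void> lastJob;

    // Returns the view name at once; the "save" runs later on another thread.
    String add(String employeeName, long delayMillis) {
        Executor delayed = CompletableFuture.delayedExecutor(delayMillis, TimeUnit.MILLISECONDS);
        lastJob = CompletableFuture.runAsync(() -> saved.add(employeeName), delayed);
        return "add";
    }

    public static void main(String[] args) {
        FireAndForgetSketch sketch = new FireAndForgetSketch();
        String view = sketch.add("John", 500);
        System.out.println(view + " returned, saved so far: " + sketch.saved);
        sketch.lastJob.join(); // wait only for the purposes of the demo
        System.out.println("saved after job: " + sketch.saved);
    }
}
```

The caller gets the view name immediately, and nothing waits on the job unless it explicitly joins it. The trade-off is the same as with a bare subscribe(): a failure in the background job is not reported to the caller.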

Related

How to pass data down the reactive chain

Whenever I need to pass data down the reactive chain I end up doing something like this:
public Mono<String> doFooAndPassDtoAsMono(Dto dto) {
    return Mono.just(dto)
        .flatMap(dtoMono -> {
            Mono<String> result = // remote call returning a Mono
            return Mono.zip(Mono.just(dtoMono), result);
        })
        .flatMap(tup2 -> {
            return doSomething(tup2.getT1().getFoo(), tup2.getT2()); // do something that requires foo and result and returns a Mono
        });
}
Given the below sample Dto class:
class Dto {
    private String foo;
    public String getFoo() {
        return this.foo;
    }
}
Because it often gets tedious to zip the data all the time to pass it down the chain (especially a few levels down) I was wondering if it's ok to simply reference the dto directly like so:
public Mono<String> doFooAndReferenceParam(Dto dto) {
    Mono<String> result = // remote call returning a Mono
    // the lambda parameter must not reuse the name of the local variable "result"
    return result.flatMap(value -> {
        return doSomething(dto.getFoo(), value); // do something that requires foo and the remote value and returns a Mono
    });
}
My concern about the second approach is that assuming a subscriber subscribes to this Mono on a thread pool would I need to guarantee that Dto is thread safe (the above example is simple because it just carries a String but what if it's not)?
Also, which one is considered "best practice"?
Based on what you have shared, you can simply do the following:
public Mono<String> doFooAndPassDtoAsMono(Dto dto) {
    return Mono.just(dto.getFoo());
}
The way you are using zip in the first option serves no purpose. Similarly, the second option will not do what you expect if the Mono is empty, because the next flatMap is then never triggered.
The case is simple if
The reference data is available from the beginning (i.e. before the creation of the chain), and
The chain is created for processing at most one event (i.e. starts with a Mono), and
The reference data is immutable.
Then you can simply refer to the reference data in a parameter or local variable, just like in your second solution. This is completely okay, and there are no concurrency issues.
Using mutable data in reactive flows is strongly discouraged. If you had a mutable Dto class, you might still be able to use it (assuming proper synchronization) – but this will be very surprising to readers of your code.
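A minimal sketch of the "simple case" described above, assuming the Dto is immutable. CompletableFuture stands in for the remote Mono so that the sketch needs no Reactor dependency; ImmutableDto, remoteCall and doSomething are hypothetical names, not part of the question's code.

```java
import java.util.concurrent.CompletableFuture;

// Immutable reference data: final class, final field, no setters.
final class ImmutableDto {
    private final String foo;

    ImmutableDto(String foo) {
        this.foo = foo;
    }

    String getFoo() {
        return foo;
    }
}

class CaptureSketch {
    // Hypothetical remote call; completes on a pool thread.
    static CompletableFuture<String> remoteCall() {
        return CompletableFuture.supplyAsync(() -> "result");
    }

    // Hypothetical step that needs both foo and the remote result.
    static CompletableFuture<String> doSomething(String foo, String result) {
        return CompletableFuture.completedFuture(foo + ":" + result);
    }

    // dto is captured by the lambda; reading it from another thread is safe
    // because its state never changes after construction.
    static CompletableFuture<String> doFooAndReferenceParam(ImmutableDto dto) {
        return remoteCall().thenCompose(r -> doSomething(dto.getFoo(), r));
    }

    public static void main(String[] args) {
        // prints "foo:result"
        System.out.println(doFooAndReferenceParam(new ImmutableDto("foo")).join());
    }
}
```

Because the captured object has only final fields, no zipping or synchronization is needed to carry it down the chain.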

Amazon SWF queries

Over the last couple of years, I have done a fair amount of work on Amazon SWF, but the following points are still unclear to me and I have not been able to find straightforward answers on any forum.
These are pretty basic requirements, I suppose, and I am sure others have come across them too. It would be great if someone could clarify them.
Is there a simple way to return a workflow execution result (maybe just something as simple as a boolean) back to the workflow starter?
Is there a way to catch an activity timeout exception, so that we can run customised actions in such scenarios?
Why doesn't WorkflowExecutionHistory contain Activities, only Events?
Why is there no simple way of restarting a workflow from the point where it failed?
I am considering using SWF for more business processes at my workplace, but these limitations/doubts are holding me back!
FINAL WORKING SOLUTION
public class ReturnResultActivityImpl implements ReturnResultActivity {
    SettableFuture<WorkflowResult> future;
    public ReturnResultActivityImpl() {
    }
    public ReturnResultActivityImpl(SettableFuture<WorkflowResult> future) {
        this.future = future;
    }
    public void returnResult(WorkflowResult workflowResult) {
        System.out.print("Marking future as Completed");
        future.set(workflowResult);
    }
}
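public class WorkflowResult {
    public WorkflowResult(boolean s, String n) {
        this.success = s;
        this.note = n;
    }
    private boolean success;
    private String note;
    // getters, needed by the starter's call to result.getNote()
    public boolean isSuccess() {
        return success;
    }
    public String getNote() {
        return note;
    }
}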
public class WorkflowStarter {
    @Autowired
    ReturnResultActivityClient returnResultActivityClient;
    @Autowired
    DummyWorkflowClientExternalFactory dummyWorkflowClientExternalFactory;
    @Autowired
    AmazonSimpleWorkflowClient swfClient;
    String domain = "test-domain";
    boolean isRegister = true;
    int days = 7;
    int terminationTimeoutSeconds = 5000;
    int threadPollCount = 2;
    int taskExecutorThreadCount = 4;
    public String testWorkflow() throws Exception {
        SettableFuture<WorkflowResult> workflowResultFuture = SettableFuture.create();
        // process-specific task list, so the result is routed back to this starter
        String taskListName = "testTaskList-" + RandomStringUtils.randomAlphabetic(8);
        ReturnResultActivity activity = new ReturnResultActivityImpl(workflowResultFuture);
        SpringActivityWorker activityWorker = buildReturnResultActivityWorker(taskListName, Arrays.asList(activity));
        DummyWorkflowClientExternalFactory factory = new DummyWorkflowClientExternalFactoryImpl(swfClient, domain);
        factory.getClient().doSomething(taskListName);
        // block until the workflow delivers its result, or give up after 20 seconds
        WorkflowResult result = workflowResultFuture.get(20, TimeUnit.SECONDS);
        return "Call result note - " + result.getNote();
    }
    public SpringActivityWorker buildReturnResultActivityWorker(String taskListName, List activityImplementations)
            throws Exception {
        return setupActivityWorker(swfClient, domain, taskListName, isRegister, days, activityImplementations,
                terminationTimeoutSeconds, threadPollCount, taskExecutorThreadCount);
    }
}
public class Workflow {
    @Autowired
    private DummyActivityClient dummyActivityClient;
    @Autowired
    private ReturnResultActivityClient returnResultActivityClient;
    @Override
    public void doSomething(final String resultActivityTaskListName) {
        Promise<Void> activityPromise = dummyActivityClient.dummyActivity();
        returnResult(resultActivityTaskListName, activityPromise);
    }
    @Asynchronous
    private void returnResult(final String taskListname, Promise waitFor) {
        ActivitySchedulingOptions schedulingOptions = new ActivitySchedulingOptions();
        schedulingOptions.setTaskList(taskListname);
        WorkflowResult result = new WorkflowResult(true, "All successful");
        returnResultActivityClient.returnResult(result, schedulingOptions);
    }
}
The standard pattern is to host a special activity in the workflow starter process that is used to deliver the result. Use a process-specific task list to make sure that it is routed to the correct instance of the starter. Here are the steps to implement it:
Define an activity to receive the result, for example "returnResultActivity". Make the activity implementation complete the Future passed to its constructor when it executes.
When the workflow is started, it receives "resultActivityTaskList" as an input argument. At the end, the workflow calls this activity with the workflow result. The activity is scheduled on the passed task list.
The workflow starter creates an ActivityWorker and an instance of a Future. Then it creates an instance of "returnResultActivity" with that future as a constructor parameter.
Then it registers the activity instance with the activity worker and configures it to poll on a randomly generated task list name. Then it calls "start workflow execution", passing the generated task list name as an input argument.
Then it waits for the Future to complete. The call to future.get() returns the workflow result.
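The steps above can be sketched as a plain-Java simulation. None of this is the SWF or Flow Framework API: the TASK_LISTS map stands in for SWF routing by task list, runWorkflow stands in for the workflow, and all names are hypothetical.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Plain-Java simulation of the pattern; not the SWF or Flow Framework API.
class ResultDeliverySketch {
    // Stands in for SWF routing: task list name -> activity hosted by a starter process.
    static final Map<String, Consumer<String>> TASK_LISTS = new ConcurrentHashMap<>();

    // Stands in for the workflow: does its work, then calls the result activity
    // on the task list it received as input.
    static void runWorkflow(String resultTaskList) {
        String workflowResult = "All successful";
        TASK_LISTS.get(resultTaskList).accept(workflowResult);
    }

    // Stands in for the workflow starter.
    static String startAndAwait() {
        CompletableFuture<String> future = new CompletableFuture<>();
        // Process-specific task list name, so the result is routed back to this starter.
        String taskList = "resultTaskList-" + UUID.randomUUID();
        // The "activity" simply completes the future it was given.
        TASK_LISTS.put(taskList, future::complete);
        try {
            // "Start workflow execution", passing the task list name as input.
            CompletableFuture.runAsync(() -> runWorkflow(taskList));
            // Wait for the result, failing if it does not arrive within 20 seconds.
            return future.orTimeout(20, TimeUnit.SECONDS).join();
        } finally {
            TASK_LISTS.remove(taskList);
        }
    }

    public static void main(String[] args) {
        System.out.println(startAndAwait());
    }
}
```

The random task list name is what ties a result to one particular starter: two starters running at the same time register different names, so each future is completed only by its own workflow.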
Yes, if you are using the AWS Flow Framework, a timeout exception is thrown when an activity times out. If you are not using the Flow Framework, then you are making your life 100 times harder. BTW, a workflow timeout is thrown into a parent workflow as a timeout exception as well. It is not possible to catch a workflow timeout exception from within the timing-out instance itself. In this case it is recommended not to rely on the workflow timeout, but to create a timer that fires and notifies the workflow logic that some business event has timed out.
Because a single activity execution has multiple events associated with it. It should be pretty easy to write code that converts the history to whatever representation of activities you like. Such code would just match up the events that relate to each activity. Each event always has a reference to its related events, so it is easy to roll them up into a higher-level representation.
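A rough sketch of such a roll-up, using a simplified, hypothetical event model rather than the AWS SDK classes. It assumes each later event carries the id of the scheduled event it refers back to; EventSketch and HistoryRollupSketch are illustrative names only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of a history event; not the AWS SDK classes.
final class EventSketch {
    final long eventId;
    final String eventType;
    // For a scheduled event this is its own id; for later events, the id they refer back to.
    final long scheduledEventId;

    EventSketch(long eventId, String eventType, long scheduledEventId) {
        this.eventId = eventId;
        this.eventType = eventType;
        this.scheduledEventId = scheduledEventId;
    }
}

class HistoryRollupSketch {
    // Groups event types by the scheduled event they belong to, preserving history order.
    static Map<Long, List<String>> rollUp(List<EventSketch> history) {
        Map<Long, List<String>> activities = new LinkedHashMap<>();
        for (EventSketch e : history) {
            activities.computeIfAbsent(e.scheduledEventId, id -> new ArrayList<>()).add(e.eventType);
        }
        return activities;
    }

    public static void main(String[] args) {
        List<EventSketch> history = List.of(
                new EventSketch(5, "ActivityTaskScheduled", 5),
                new EventSketch(6, "ActivityTaskStarted", 5),
                new EventSketch(7, "ActivityTaskCompleted", 5),
                new EventSketch(9, "ActivityTaskScheduled", 9),
                new EventSketch(10, "ActivityTaskTimedOut", 9));
        // One map entry per activity execution, keyed by its scheduled event id.
        System.out.println(rollUp(history));
    }
}
```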
Unfortunately there is no easy answer to this one. Ideally SWF would support restarting a workflow by copying its history up to the failure point, but this is not supported. I personally believe that a workflow should be written so that it never fails, but instead handles failures itself. Obviously that doesn't work for failures due to unexpected conditions. In that case, writing the workflow so that it can be restarted from the beginning is the simplest approach.

breeze: creating inheritance in client-side model

I'm having a weird issue with the configureMetadataStore.
My model:
class SourceMaterial {
    List<Job> Jobs {get; set;}
}
class Job {
    public SourceMaterial SourceMaterial {get; set;}
}
class JobEditing : Job {}
class JobTranslation : Job {}
Module for configuring Job entities:
angular.module('cdt.request.model').factory('jobModel', ['breeze', 'dataService', 'entityService', modelFunc]);
function modelFunc(breeze, dataService, entityService) {
    function Ctor() {
    }
    Ctor.extend = function (modelCtor) {
        modelCtor.prototype = new Ctor();
        modelCtor.prototype.constructor = modelCtor;
    };
    Ctor.prototype._configureMetadataStore = _configureMetadataStore;
    return Ctor;
    // constructor
    function jobCtor() {
        this.isScreenDeleted = null;
    }
    function _configureMetadataStore(entityName, metadataStore) {
        metadataStore.registerEntityTypeCtor(entityName, jobCtor, jobInitializer);
    }
    function jobInitializer(job) { /* do stuff here */ }
}
Module for configuring JobEditing entities:
angular.module('cdt.request.model').factory('jobEditingModel', ['jobModel', modelFunc]);
function modelFunc(jobModel) {
    function Ctor() {
        this.configureMetadataStore = configureMetadataStore;
    }
    jobModel.extend(Ctor);
    return Ctor;
    function configureMetadataStore(metadataStore) {
        return this._configureMetadataStore('JobEditing', metadataStore);
    }
}
Module for configuring JobTranslation entities:
angular.module('cdt.request.model').factory('jobTranslationModel', ['jobModel', modelFunc]);
function modelFunc(jobModel) {
    function Ctor() {
        this.configureMetadataStore = configureMetadataStore;
    }
    jobModel.extend(Ctor);
    return Ctor;
    function configureMetadataStore(metadataStore) {
        return this._configureMetadataStore('JobTranslation', metadataStore);
    }
}
Then Models are configured like this :
JobEditingModel.configureMetadataStore(dataService.manager.metadataStore);
JobTranslationModel.configureMetadataStore(dataService.manager.metadataStore);
Now when I call createEntity for a JobEditing, the instance is created and at some point Breeze calls setNpValue and adds the newly created Job to the SourceMaterial navigation property (np).
That's all fine, except that it is added twice!
It happens when rawAccessorFn(newValue); is called. In fact, it is called twice.
And if I add a new type of job (and hence register a new type with the metadataStore), then the new Job is added three times to the np.
I can't see what I'm doing wrong. Can anyone help?
EDIT
I've noticed that if I change:
metadataStore.registerEntityTypeCtor(entityName, jobCtor, jobInitializer);
to
metadataStore.registerEntityTypeCtor(entityName, null, jobInitializer);
Then everything works fine again! So the problem is registering the same jobCtor function. Should that not be possible?
Our Bad
Let's start with a Breeze bug, recently discovered, in the Breeze "backingStore" model library adapter.
There's a part of that adapter which is responsible for rewriting the data properties of the entity constructor so that they become observable and self-validating, and it kicks in when you register a type with registerEntityTypeCtor.
It tries to keep track of which properties it has rewritten. The bug is that it records the fact of rewrite on the EntityType rather than on the constructor function. Consequently, every time you registered a new type, it failed to realize that it had already rewritten the properties of the base Job type and re-wrapped the property.
This was happening to you. Every derived type that you registered re-wrapped/re-wrote the properties of the base type (and of its base type, etc).
In your example, a base class Job property would be re-written 3 times and its inner logic executed 3 times if you registered three of its sub-types. And the problem disappeared when you stopped registering constructors of sub-types.
We're working on a revised Breeze "backingStore" model library adapter that won't have this problem and, coincidentally, will behave better in test scenarios (that's how we found the bug in the first place).
Your Bad?
Wow that's some hairy code you've got there. Why so complicated? In particular, why are you adding a one-time MetadataStore configuration to the prototypes of entity constructor functions?
I must be missing something. The code to register types is usually much smaller and simpler. I get that you want to put each type in its own file and have it self-register. The cost of that (as you've written it) is enormous bulk and complexity. Please reconsider your approach. Take a look at other Breeze samples, Zza-Node-Mongo for example.
Thanks for reporting the issue. Hang in there with us. A fix should be arriving soon ... I hope in the next release.

How to process an excel file more efficiently?

I have an excel file that enters through my MVC web app that I have to process and do things with. So I receive my file on the controller
public class StripExcelDocument
{
    public DataSet Convert(HttpPostedFileBase file)
    {
        return GetDataFromExcel(file.InputStream);
    }
    private DataSet GetDataFromExcel(Stream target)
    {
        var excelReader = ExcelReaderFactory.CreateOpenXmlReader(target);
        excelReader.IsFirstRowAsColumnNames = true;
        return excelReader.AsDataSet();
    }
}
and I send it through a processor I have created that is just a large conditional statement and then based on the outcome it gets sent to a specific table in a database.
public class Processor {
    public Result Process(string foo, int bar)
    {
        if (FirstCondition(foo, bar)) {
            SetResult(foo, bar);
        }
        if (SecondCondition(foo, bar)) {
            SetResult(foo, bar);
        }
        if (ThirdCondition(foo, bar)) {
            SetResult(foo, bar);
        }
        // etc...
    }
}
Obviously this works great when the user wants to enter a single record but when processing large excel files it either:
A: Times out on the server.
B: Leaves the user staring at a screen for a while.
What is a more effective way to deal with bulk processing large amounts of data from an excel file, where the records will need to be their own entity in the database?
Bulk-import all the records of the Excel sheet into a table. You can use SqlBulkCopy for this.
Create a stored procedure and, based on the conditions, do the inserts/updates in one shot.
Doing this in a stored procedure will be faster than LINQ operations in the code-behind.
Try to keep SqlBulkCopy as a last option, though, because it dates from older versions of .NET and there may be better options available now.
A: Times out on the server. B: Leaves the user staring at a screen for a while.
Do it asynchronously.
Example Code
class ThreadTest
{
    public ActionResult Main()
    {
        Thread t = new Thread(WriteY);
        t.Start();
        return View();
    }
    void WriteY()
    {
    }
}
For the timeout:
Setting sqlcommand.CommandTimeout = 0 removes the limit, so the command waits indefinitely.

Streaming text output for long-running action?

I have a few utility actions that return text output via return Content("my text","text/plain").
Sometimes these methods take a few minutes to run (e.g. log parsing, database maintenance).
I would like to modify my action method so that instead of returning all of the output at once, the text is instead streamed to the client when it is ready.
Here's a contrived example:
public ActionResult SlowText()
{
    var sb = new System.Text.StringBuilder();
    sb.AppendLine("This happens quickly...");
    sb.AppendLine("Starting a slow 10 second process...");
    System.Threading.Thread.Sleep(10000);
    sb.AppendLine("All done with 10 second process!");
    return Content(sb.ToString(), "text/plain");
}
As written, this action will return three lines of text after 10 seconds. What I want is a way to keep the response stream open, and return the first two lines immediately, and then the third line after 10 seconds.
I remember doing this 10+ years ago in Classic ASP 3.0 using the Response object. Is there an official, MVC-friendly way to accomplish this?
--
Update: using Razor .cshtml in the app; but not using any views (just ContentResult) for these actions.
Writing directly to the Response object should work, but only in some simple cases. Many MVC features depend on output writer substitution (e.g. partial views, Razor view engine, and others) and if you write directly to the Response your result will be out of order.
However, if you don't use a view and instead write straight in the controller then you should be fine (assuming your action is not being called as a child action).
I would skip the MVC controller entirely since you are going to break encapsulation anyway. In its place I'd use a bare IHttpHandler implementation, streaming directly to the aforementioned output stream.
You are exposing yourself to a browser timeout if the process takes longer than originally intended. You then have no way to recover what happened, unless you implement a separate method that reports on the long-running process.
Given that you want the other method anyway, you can start a long running process and return immediately. Have the browser check the other method that gives the latest information on the long running process. On the last time I had to do this, I kept it simple and just set the refresh header from the controller before returning the view.
As for starting a long running process, you can do something like this:
// in the controller class
delegate void MyLongProcess();
//...
// in the method that starts the action
MyLongProcess processTask = new MyLongProcess(_someInstance.TheLongRunningImplementation);
processTask.BeginInvoke(new AsyncCallback(EndMyLongProcess), processTask);
//...
public void EndMyLongProcess(IAsyncResult result)
{
    try {
        MyLongProcess processTask = (MyLongProcess)result.AsyncState;
        processTask.EndInvoke(result);
        // anything you needed at the end of the process
    } catch (Exception ex) {
        // an error happened, make sure to log this
        // as it won't hit the global.asax error handler
    }
}
As for where to put the log of the actions that happened, that depends on how long-lived you want it to be. It can be as simple as a static field/class where you add the info about the ongoing process, or you can save it to a data store where it can survive an application recycle.
The above assumes this is all about a long-running process that keeps reporting the actions that have been done. Streaming is a different subject, but the above might still play a role in keeping the operations in your controller, with only the piece responsible for streaming what becomes available to the client in the action result.
You can implement your custom ActionResult like ContentStreamingResult and use HttpContext, HttpRequest and HttpResponse in the ExecuteResult method.
public class ContentStreamingResult : ActionResult
{
    private readonly TextReader _reader;
    public ContentStreamingResult(TextReader reader)
    {
        _reader = reader;
    }
    public override void ExecuteResult(ControllerContext context)
    {
        var httpContext = context.HttpContext;
        // Read text from the reader and write to the response
    }
}
public class YourController : Controller
{
    public ContentStreamingResult DownloadText()
    {
        string text = "text text text";
        return new ContentStreamingResult(new System.IO.StringReader(text));
    }
}
Try calling Response.Flush and setting BufferOutput to false. Note that this would not work with the standard action results; you have to write directly into the response object. You can probably use it in conjunction with AsyncController.

Resources