MVC componentization vs parallel data retrieval

This question describes two approaches to solving a complex architectural problem in ASP.NET MVC. Unfortunately our team is quite new to this technology and we haven't found any solid sources of information on this particular topic (except overviews saying that MVC is more about separation than componentization). So for now we are unsure whether our solution is appropriate or whether there is a different, more obvious way to solve this problem.
We have a requirement to build an ASP.NET MVC-based design with componentization in mind. The Razor view engine is also a requirement for us. The key feature here is that any level of controller’s nesting is expected (obviously through the Html.Action directive within .cshtml). Any controller could potentially obtain its data through a web service call (the final design may change this, as described below).
The issue is that the data must be obtained asynchronously and with maximum parallelism. E.g. if two backend calls within the controllers are independent they must be performed in parallel.
At first glance, async MVC controllers could solve all the problems. But there is a hidden caveat: a nested controller can only be specified within a .cshtml view, and a .cshtml view is processed only after the parent controller has finished its own async execution. So all the async operations within the nested controller are performed in a separate async slot and therefore not in parallel with the parent controller. This is a limitation of the synchronous nature of .cshtml processing.
After a deep investigation we found that two options are available.
1) Have only one parent async controller which retrieves all the data and puts it into a container (a dictionary or whatever). The nested controllers aren't allowed to perform any backend calls. Instead, they hold a reference to the initialized container with the results of all the backend calls. But this way the consumer of the framework must differentiate between parent and child controllers, which is not a brilliant solution.
2) Retrieve all the data from the backends within a special async HttpModule. This module will initialize the same container, which will reside, for instance, within HttpContext. Obviously, in this case no controller will be allowed to perform any backend calls, but all controllers will have a unified internal structure (in comparison with option #1).
For now we think that option #2 is more desirable, but we are more interested in a solid, community-adopted way to solve this problem in real enterprise-level MVC projects.
Literally any links/comments are welcome.
[UPD] The requirement for any level of controller nesting came from our customer, who wants a system built from fully reusable MVC components that can be combined in any sequence with any level of nesting, as is already done in the existing WebForms-based implementation. It is a business rule of the existing app that components can be combined in any way, so we do not intend to break it. For now we think that such a component is a combination of "controller+view+metadata", where the "metadata" part describes the backend calls to be performed in scenario 1 or 2.
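The shape shared by options 1 and 2 can be sketched independently of ASP.NET. What follows is a minimal Python asyncio sketch, not MVC code: the `fetch_*` coroutines, the `COMPONENT_CALLS` metadata table and the `render_*` functions are all hypothetical stand-ins. It assumes each component declares the backend calls it needs as metadata, that all declared calls are started together, and that the synchronous, nested rendering only reads from the prefetched container.

```python
import asyncio

# Hypothetical backend calls; asyncio.sleep stands in for web service latency.
async def fetch_user():
    await asyncio.sleep(0.05)
    return {"name": "alice"}

async def fetch_orders():
    await asyncio.sleep(0.05)
    return [101, 102]

# Hypothetical component metadata: which backend results each component needs.
COMPONENT_CALLS = {
    "header": {"user": fetch_user},
    "order_list": {"orders": fetch_orders},
}

async def load_container(components):
    """Start every declared backend call at once and collect results by key."""
    calls = {}
    for name in components:
        calls.update(COMPONENT_CALLS[name])
    keys = list(calls)
    results = await asyncio.gather(*(calls[key]() for key in keys))
    return dict(zip(keys, results))

def render_header(container):
    return "Hello, " + container["user"]["name"]

def render_order_list(container):
    return "Orders: " + ", ".join(str(order) for order in container["orders"])

def render_page(container):
    # Rendering is synchronous and nested; it only reads prefetched data.
    return render_header(container) + " | " + render_order_list(container)

page = render_page(asyncio.run(load_container(["header", "order_list"])))
```

Whether the container is filled by a parent controller (option 1) or by a module earlier in the pipeline (option 2) only changes who calls the equivalent of `load_container`; the components themselves look the same in both cases.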

Why are you considering async calls here? Keep in mind that if you are using async calls so that the ASP.NET threads don't all get used up while the database takes a while to respond, then new requests will also go to the database as soon as they come in, increasing its workload, so you gain nothing.
To be honest though, I'm having a hard time following exactly what you have in mind here. Nested controllers for...?
"The key feature here is that any level of controller’s nesting is expected"
I think I (we?) need a bit more information on that part here.
However, the warning on async still stands :)

"E.g. if two backend calls within the controllers are independent they must be performed in parallel."
If they are truly independent, you might be able to use asynchronous JavaScript calls from the client and achieve some degree of parallelism that way.

Related

How to pass thread local variable in Project Reactor

I started using Project Reactor. Does anyone know how I can pass thread local variables from one thread to another? I saw some methods in Hooks.java but could not figure out the recommended way of doing this. Can someone point me to some documentation or a code snippet showing how to do it? Thanks.

I have a working example in this github repository based on the spring-cloud-sleuth's implementation: https://github.com/gumartinm/JavaForFun/tree/master/SpringJava/WebReactive/spring-webreactive-reactor-context-enrich
The key classes are: ContextCoreSubscriber.java, SubscriberContext.java, ThreadContextEnrichmentAutoConfiguration.java and UsernameFilter.java
ContextCoreSubscriber.java: enables you to fill the Mapped Diagnostic Context (MDC).
SubscriberContext.java: helper class for inserting data into Reactor's Context.
ThreadContextEnrichmentAutoConfiguration.java: in charge of configuring Reactor's Hooks (Hooks.onEachOperator).
UsernameFilter.java: example where we want to register the username information based on some HTTP header.

Reactor doesn't guarantee that the processing done by a Flux or Mono chain of operators will stick executing on a single thread. On the contrary, it performs work-stealing and lets the user switch execution context.
As such, ThreadLocal is not a good fit for Reactor.
There is currently some work done in 3.1.0 towards providing an equivalent, at least for library authors that use Reactor, but nothing definite in place yet.
Keep your eyes peeled for 3.1.0, that should be the main theme of that release (and will probably be the focus of the second upcoming milestone, M2).
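The underlying problem is not specific to Reactor and can be reproduced in a few lines. This is a minimal sketch in Python rather than Java, so none of it is Reactor API: `threading.local` plays the role of `ThreadLocal`, a one-thread pool stands in for a scheduler that moves work to another thread, and the `context` dictionary is a hypothetical stand-in for data that travels with the work itself.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

request_state = threading.local()  # per-thread storage, comparable to ThreadLocal

def read_thread_local():
    # Runs on a pool thread, which has its own, empty thread-local storage.
    return getattr(request_state, "username", None)

def read_explicit(context):
    # The context travels with the work item, whichever thread runs it.
    return context["username"]

request_state.username = "alice"   # set on the calling thread only
context = {"username": "alice"}    # explicit and thread-independent

with ThreadPoolExecutor(max_workers=1) as pool:
    from_thread_local = pool.submit(read_thread_local).result()
    from_context = pool.submit(read_explicit, context).result()
```

The value stored in the thread local is invisible once the work hops threads, while the explicitly passed context is still readable.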

Is using a Web API as dataprovider for a website efficient?

I was thinking about setting up a project with Web API. Basically build the API first and program the web site using this API.
Although it sounds promising, I was wondering:
If I separate the logic in a nice way, I might end up retrieving data for a web page through multiple API calls, which in turn means multiple connections to the server, with all the overhead that involves.
For example, if I use, let's say, 8 different API calls on one page, I can't imagine it won't have an impact on the page's performance.
So, have I misunderstood something? Is this kind of overhead negligible, or does the need for multiple calls indicate that the design is wrong?
Thanks in advance.

Well, we did it. A Web API server provides REST access to all the data. Independent UI clients consume it as the only access point to the underlying persistence.
The first request takes some time. It is significantly longer. It must initialize all the UI client stuff and get the minimum data needed from the server (menu, user, access rights, metadata, list-view data...).
The point, the real advantage, is hidden in the second, the third... request. A lot of stuff is already there on the UI client. And even if it is requested again, caching (server, client, or both) can be introduced.
So this means more requests (at least during UI client start-up), but it does not imply a slower application.
The maintenance benefit is hidden (maybe it is not hidden, it should be obvious) in the Separation of Concerns. On the server, we are no longer deciding where to place the user data handling, in the base controller or a child controller, or whether there should be a master page, a layout controller...
Solved. We take care of a single, specific thing, published via REST. One method, one business operation. And that's the dream if we'd like to keep that application alive and be the ones who repair and extend it.

One aspect is that you can display the page to the end user very, very fast. Once the page is loaded, use jQuery async calls and any JavaScript template tool (like AngularJS or Mustache.js) to call the Web API simultaneously and build the client page views.
I have used this approach in multiple projects and the user experience is excellent.

Most modern browsers support 6-8 parallel connections to the same site. So you do have to be careful about that. Unless you are connecting to that many separate systems, I would try to reduce the number of connections. Or ensure the calls are called asynchronously by different events to reduce the chance of parallel connections.

Making a series of HTTP calls to obtain data for your page will have an overhead. Only testing will tell you how that might impact in your scenario.
There is little point using Web API just because you can. You should have a legitimate reason for building a RESTful API. Even then, if it is primarily for your own consumption, design it to deliver a ViewModel for each page in one call.

breeze memory management - pattern / practice?

I have an old SL4/ria app, which I am looking to replace with breeze. I have a question about memory use and caching. My app loads lists of Jobs (a typical user would have access to about 1,000 of these jobs). Additionally, there are quite a few lookup entity types. I want to make sure these are cached well client-side, but updated per session. When a user opens a job, it loads many more related entities (anywhere from 200 - 800 additional entities) which compose multiple matrix-style views for the jobs. A user can view the list of jobs, or navigate to view 1 job at a time.
I feel that I should be concerned with memory management, especially not knowing how browsers might deal with this. Originally I felt this should all be 1 EntityManager and I would detachEntities when the user navigates away from a job, but I'm thinking this might benefit from multiple managers, split by intended lifetime. Or perhaps I should create a new dataservice & EntityManager each time the user navigates to a new hash '/#/' area, since comments on clear() seem to indicate that this would be faster? If I did this, I suppose I would be using pub/sub to notify other viewmodels of changes to entities? This seems complex and defeats some of the benefits of breeze as the context.
Any tips or thoughts about this would be greatly appreciated.

I think I understand the question. I think I would use a multi-manager approach:
- Lookups Manager: holds once-per-session reference (lookup) entities
- JobsView Manager: "readonly" list of Jobs in support of the JobsView
- JobEditor Manager: one per edit session
The Lookups Manager maintains the canonical copy of reference entities. You can fill it once with a single call to server (see docs for how). This Lookups Manager will Breeze-export these reference entities to other managers which Breeze-import them as they are created. I am assuming that, while numerous and diverse, the total memory footprint of reference entities is pretty low ... low enough that you can afford to have more than one copy in multiple managers. There are more complicated solutions if that is NOT so. But let that be for now.
The JobsView Manager has the necessary reference entities for its display. If you only displayed a projection of the Jobs, it would not have Jobs in cache. You might have an array and key map instead. Let's keep it simple and assume that it has all the Jobs but not their related entities.
You never save changes with this manager! When editing or creating a Job, your app always fires up a "Job Editor" view with its own VM and JobEditor Manager. Again, you import the reference entities you need and, when editing an existing Job, you import the Job too.
I would take this approach anyway ... not just because of memory concerns. I like isolating my edit sessions into sandboxes. Eases cancellation. Gives me a clean way to store pending changes in browser storage so that the user won't lose his/her work if the app/browser goes down. Opens the door to editing several Jobs at the same time ... without worrying about mutually dependent entities-with-changes. It's a proven pattern that we've used forever in SL apps and should apply as well in JS apps.
When a Job edit succeeds, you have to tell the local client world about it. There are lots of ways to do that. If the ONLY place that needs to know is the JobsView, you can hardcode a backchannel into the app. If you want to be more clever, you can have a central singleton service that raises events specifically about Job saving. The JobsView and each new JobEditor communicate with this service. And if you want to be hip, you use an in-process "Event Aggregator" (your pub/sub) for this purpose. I'd probably be using Durandal for this app anyway and it has an event aggregator in the box.
Honestly, it's not that complicated to use and importing/exporting entities among managers is a ... ahem ... breeze. Well worth it compared to refreshing the Jobs List every time you return to it (although you'll want a "refresh button" too because OTHER users could be adding/changing those Jobs). You retain plenty of Breeze benefits: querying, validation, change-tracking, batch saves, entity navigation (those reference lists work "for free" in Breeze).
As a refinement, I don't know that I would automatically destroy the JobEditor view/viewmodel/manager when I returned to the JobsView. In my experience, people often return to the same Job that they just left. I might hold on to a view so you could go back and forth quickly. But now I'm getting tricky.

When should one use asynchronous controller in asp.net mvc 2?

Thus far I have worked with ASP.NET MVC 1 and have just started with ASP.NET MVC 2. What are good candidates for executing a controller asynchronously? Should I use it for long-running processes or some background processing? What are the pros and cons of choosing an asynchronous controller in ASP.NET MVC 2? Any suggestions?

Only use async if the operation is IO bound. A good example would be aggregating RSS feeds from multiple servers and then displaying them in a webpage.
See:
http://msdn.microsoft.com/en-us/magazine/ee336138.aspx
http://blog.stevensanderson.com/2008/04/05/improve-scalability-in-aspnet-mvc-using-asynchronous-requests/
for a good overview of asynchronous controllers.
And for more in-depth but non-MVC specific info:
http://blogs.msdn.com/tmarq/archive/2010/04/14/performing-asynchronous-work-or-tasks-in-asp-net-applications.aspx

If your controller's method calls some other system, for example querying a DB or an external resource over HTTP, and the operation takes a long time, then you should implement that method as async so that the thread is released instead of waiting for the results and can be used to service other requests. The application gets its threads from the ThreadPool, and their number is limited. Given that limitation, and a situation with a lot of application users and long-running data queries, it is very easy to see how important this is.
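The effect of releasing the thread while waiting on I/O can be illustrated with a small sketch. It uses Python's asyncio rather than ASP.NET, so it shows the principle only and is not MVC's async controller API; `query_backend` is a hypothetical stand-in for a slow database or HTTP call, and the 0.1 second delay is an arbitrary assumption.

```python
import asyncio
import threading
import time

async def query_backend(request_id):
    # Stand-in for a slow I/O-bound call; awaiting it frees the thread.
    await asyncio.sleep(0.1)
    return (request_id, threading.get_ident())

async def serve(request_count):
    # All requests wait on their backend call at the same time.
    return await asyncio.gather(*(query_backend(i) for i in range(request_count)))

started = time.perf_counter()
results = asyncio.run(serve(20))
elapsed = time.perf_counter() - started

# Every request was handled on the same thread.
threads_used = {thread_id for _, thread_id in results}
```

Twenty simulated requests share one thread and finish in roughly the time of a single backend call, instead of twenty calls back to back. The same sketch with CPU-bound work in place of the sleep would gain nothing, which is why the advice is to go async only for I/O-bound operations.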

Erlang gen_server vs stateless module

I've recently finished Joe's book and quite enjoyed it.
I've since started coding a soft real-time application in Erlang and I have to say I am a bit confused about the use of gen_server.
When should I use gen_server instead of a simple stateless module?
I define a stateless module as follows:
- A module that takes its state as a parameter (much like ETS/DETS) as opposed to keeping it internally (like gen_server)
Say for an invoice manager type module, should it initialize and return state which I'd then pass subsequently to it?
State0 = invoice_manager:init(),
State1 = invoice_manager:add_invoice(State0, AnInvoiceFoo).
Suppose I'd need multiple instances of the invoice manager state (say my application manages multiple companies each with their own invoices), should they each have a gen_server with internal state to manage their invoices or would it better fit to simply have the stateless module above?
Where is the line between the two?
(Note that the invoice manager example above is just that, an example to illustrate my question.)

I don't really think you can make that distinction between what you call a stateless module and gen_server. In both cases there is a recursive receive loop which carries state in at least one argument. This main loop handles requests, does work depending on the requests and, when necessary, sends results back to the requesters. The main loop will most likely handle a number of administrative requests as well, which may not be part of the main API/protocol.
The difference is that gen_server abstracts away the main receive loop and allows the user to write only the actual user code. It will also handle many administrative OTP functions for you. The main difference is that the user code is in another module, which means that you see the passed-through state more easily. Unless you actually manage to write your code in one big receive loop and not call other functions to do the work, there is no real difference.
Which method is better depends very much on what you need. Using gen_server will simplify your code and give you added functionality "for free", but it can be more restrictive. Rolling your own will give you more power but also more opportunities to screw things up. It is probably a little faster as well. What do you need?

It strongly depends on your needs and application design. When you need shared state between processes, you have to use a process to keep this state. Then gen_server, gen_fsm or another gen_* behaviour is your friend. You can avoid this design when your application is not concurrent or when the design doesn't bring you any other benefits. For example, breaking your application into processes can lead to a simpler design. In other cases you can choose a single-process design and use "stateless" modules for performance or similar reasons. A "stateless" module is the best choice for very simple stateless (pure functional) tasks. gen_server is often the best choice for things that seem naturally like a "process". You must use it when you want to share something between processes (though using processes can be constrained by scalability or concurrency).

Having used both models, I must say that using the provided gen_server helps me stay structured more easily. I guess this is why it is included in the OTP stack of tools: gen_server is a good way to get the repetitive boiler-plate out of the way.

If you have shared state over multiple processes you should probably go with gen_server and if the state is just local to one process a stateless module will do fine.
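The two styles discussed in this thread can be put side by side in a small sketch. It is written in Python rather than Erlang, so it is an analogy and not OTP: a thread with a message queue stands in for a process with a mailbox, and `InvoiceServer`, `init` and `add_invoice` are hypothetical names. The point it tries to show is that the server is just the same pure functions wrapped in a loop that owns the state.

```python
import queue
import threading

# "Stateless module": pure functions; the caller owns and passes the state.
def init():
    return ()

def add_invoice(state, invoice):
    return state + (invoice,)

# "Server": one thread owns the state; others reach it only through messages.
class InvoiceServer:
    def __init__(self):
        self._inbox = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        state = init()
        while True:
            request, payload, reply = self._inbox.get()
            if request == "add":
                state = add_invoice(state, payload)
                reply.put("ok")
            elif request == "list":
                reply.put(state)
            elif request == "stop":
                reply.put("stopped")
                return

    def call(self, request, payload=None):
        # Synchronous request/reply, similar in spirit to a gen_server call.
        reply = queue.Queue(maxsize=1)
        self._inbox.put((request, payload, reply))
        return reply.get()

# State local to one caller: plain functions are enough.
local_state = add_invoice(init(), "inv-1")

# State shared by several callers: route every change through the server.
server = InvoiceServer()
server.call("add", "inv-1")
server.call("add", "inv-2")
shared_state = server.call("list")
server.call("stop")
```

With one server per company, each company's invoices would be isolated in its own loop, while code that only ever touches one state value can keep using the plain functions.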

I suppose your invoices (or whatever they stand for) should be persistent, so they would end up in an ETS/Mnesia table anyway. If this is so, you should create a stateless module where you put your API for accessing the invoice table.
