In case it matters, here's the reason I want to do this:
My application will have many elements in various places on the same page that each trigger individually tiny http requests for small (less than a kilobyte, sometimes only tens of bytes) pieces of data. To avoid excessive per-request overhead, I want to combine them all into one larger request. I've got the code for combining and handling the combined request and response done, but I'm not sure how to tell when the requests have stopped coming and it's time to send it.
The idea I'm using right now is to use new Future.value().whenComplete() (I'm coding in Dart) to simply wait for the event loop to run, but I don't know whether Angular2's rendering spans multiple iterations of the event loop or not. Is this enough to guarantee Angular2 has invoked every property binding on the page before my http request goes out, and if not how can I get such a guarantee?
I don't think there is a better way.
Instead of
    new Future.value().whenComplete()
you can just use
    new Future(() {
      // delayed code here
    });
or, to delay a bit more,
    new Future.delayed(const Duration(milliseconds: 10), () {
      // delayed code here
    });
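The batching idea from the question (collect every request made during one turn of the event loop, then send them together) can be sketched roughly as follows. This is a minimal Java illustration, not Dart or Angular2 code: RequestBatcher, its eventLoop executor and the sendBatch callback are hypothetical names, and the sketch says nothing about whether Angular2 finishes all bindings within a single turn.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

/** Collects keys requested in one "turn" and sends them as a single batch. */
class RequestBatcher {
    private final Executor eventLoop;               // assumed: runs tasks later, in order
    private final Consumer<List<String>> sendBatch; // assumed: performs the combined request
    private final List<String> pending = new ArrayList<>();
    private boolean flushScheduled = false;

    RequestBatcher(Executor eventLoop, Consumer<List<String>> sendBatch) {
        this.eventLoop = eventLoop;
        this.sendBatch = sendBatch;
    }

    /** Queues a key; the first call in a turn schedules exactly one flush. */
    synchronized void request(String key) {
        pending.add(key);
        if (!flushScheduled) {
            flushScheduled = true;
            eventLoop.execute(this::flush);
        }
    }

    /** Sends everything queued so far and allows a new flush to be scheduled. */
    private void flush() {
        List<String> batch;
        synchronized (this) {
            batch = new ArrayList<>(pending);
            pending.clear();
            flushScheduled = false;
        }
        sendBatch.accept(batch);
    }
}
```

Any request that arrives after the flush has run simply starts a new batch, so nothing is lost if rendering spans more than one turn; it just costs an extra request.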
Problem Context
I am trying to generate a total (linear) order of event items per key from a real-time stream where the order is event time (derived from the event payload).
Approach
I had attempted to implement this using streaming as follows:
1) Set up non-overlapping sequential windows, e.g. with a duration of 5 minutes
2) Establish an allowed lateness - it is fine to discard late events
3) Set accumulation mode to retain all fired panes
4) Use the "AfterWatermark" trigger
5) When handling a triggered pane, only consider the pane if it is the final one
6) Use GroupBy.perKey to ensure all events in this window for this key will be processed as a unit on a single resource
While this approach ensures linear order for each key within a given window, it does not make that guarantee across multiple windows. For example, a later window of events for the key could be processed at the same time as an earlier window; this could easily happen if the first window failed and had to be retried.
I'm considering adapting this approach where the realtime stream can first be processed so that it partitions the events by key and writes them to files named by their window range.
Due to the parallel nature of beam processing, these files will also be generated out of order.
A single process coordinator could then submit these files sequentially to a batch pipeline - only submitting the next one when it has received the previous file and that downstream processing of it has completed successfully.
The problem is that Apache Beam will only fire a pane if there was at least one element in that time window. Thus, if there are gaps in the events, there could be gaps in the files that are generated, i.e. missing files. The problem with missing files is that the coordinating batch processor cannot distinguish between a time window that passed with no data and a failure, in which case it cannot proceed until the file finally arrives.
One way to force the event windows to trigger might be to somehow add dummy events to the stream for each partition and time window. However, this is tricky to do: if there are large gaps in the time sequence and the dummy events are surrounded by events that occur much later, they will be discarded as late.
Are there other approaches to ensuring there is a trigger for every possible event window, even if that results in outputting empty files?
Is generating a total ordering by key from a realtime stream a tractable problem with Apache Beam? Is there another approach I should be considering?
Depending on your definition of tractable, it is certainly possible to totally order a stream per key by event timestamp in Apache Beam.
Here are the considerations behind the design:
Apache Beam does not guarantee in-order transport, so ordering is of no use within a pipeline. I will therefore assume you are doing this so you can write to an external system that can only handle events if they arrive in order.
If an event has timestamp t, you can never be certain no earlier event will arrive unless you wait until t is droppable.
So here's how we'll do it:
We'll write a ParDo that uses state and timers (blog post still under review) in the global window. This makes it a per-key workflow.
We'll buffer elements in state when they arrive. So your allowed lateness affects how efficient of a data structure you need. What you need is a heap to peek and pop the minimum timestamp and element; there's no built-in heap state so I'll just write it as a ValueState.
We'll set an event-time timer to receive a callback when an element's timestamp can no longer be contradicted.
I'm going to assume a custom EventHeap data structure for brevity. In practice, you'd want to break this up into multiple state cells to minimize the data transferred. A heap might be a reasonable addition to the primitive types of state.
I will also assume that all the coders we need are already registered and focus on the state and timers logic.
new DoFn<KV<K, Event>, Void>() {

  @StateId("heap")
  private final StateSpec<ValueState<EventHeap>> heapSpec = StateSpecs.value();

  @TimerId("next")
  private final TimerSpec nextTimerSpec = TimerSpecs.timer(TimeDomain.EVENT_TIME);

  @ProcessElement
  public void process(
      ProcessContext ctx,
      @StateId("heap") ValueState<EventHeap> heapState,
      @TimerId("next") Timer nextTimer) {
    EventHeap heap = firstNonNull(
        heapState.read(),
        EventHeap.createForKey(ctx.element().getKey()));
    heap.add(ctx.element().getValue());
    // Write the modified heap back so the change is persisted
    heapState.write(heap);
    // When the watermark reaches this time, no more elements
    // can show up that have earlier timestamps
    nextTimer.set(heap.nextTimestamp().plus(allowedLateness));
  }

  @OnTimer("next")
  public void onNextTimestamp(
      OnTimerContext ctx,
      @StateId("heap") ValueState<EventHeap> heapState,
      @TimerId("next") Timer nextTimer) {
    EventHeap heap = heapState.read();
    // If the timer at time t was delivered the watermark must
    // be strictly greater than t
    // (isEmpty() is assumed to exist on the hypothetical EventHeap)
    while (!heap.isEmpty() && !heap.nextTimestamp().isAfter(ctx.timestamp())) {
      writeToExternalSystem(heap.pop());
    }
    heapState.write(heap);
    // Re-arm the timer only if elements are still buffered
    if (!heap.isEmpty()) {
      nextTimer.set(heap.nextTimestamp().plus(allowedLateness));
    }
  }
}
This should hopefully get you started on the way towards whatever your underlying use case is.
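For illustration, the buffering rule described above (hold elements in a heap and release one only once the watermark has passed its timestamp plus the allowed lateness) can be sketched in plain Java with no Beam dependency. Everything here is an assumption made for the sketch: Event, OrderedReleaseBuffer and release are hypothetical names, java.time stands in for the Joda types Beam uses, and the watermark is passed in by hand rather than delivered by a timer.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical event: a timestamp derived from the payload, plus the payload. */
final class Event {
    final Instant timestamp;
    final String payload;

    Event(Instant timestamp, String payload) {
        this.timestamp = timestamp;
        this.payload = payload;
    }
}

/** Buffers the events of one key and releases them in event-time order. */
class OrderedReleaseBuffer {
    private final PriorityQueue<Event> heap =
        new PriorityQueue<>(Comparator.comparing((Event e) -> e.timestamp));
    private final Duration allowedLateness;

    OrderedReleaseBuffer(Duration allowedLateness) {
        this.allowedLateness = allowedLateness;
    }

    void add(Event e) {
        heap.add(e);
    }

    /**
     * Pops, earliest first, every event whose timestamp is at or before
     * (watermark - allowedLateness). Anything earlier that arrives after
     * this point would be dropped as late, so the released order is final.
     */
    List<Event> release(Instant watermark) {
        Instant safe = watermark.minus(allowedLateness);
        List<Event> out = new ArrayList<>();
        while (!heap.isEmpty() && !heap.peek().timestamp.isAfter(safe)) {
            out.add(heap.poll());
        }
        return out;
    }
}
```

Events with identical timestamps come out in no particular order here; if that matters, the comparator would need a tie-breaker such as a sequence number.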
I have API code that loads the data my application needs.
It's as simple as:
- (void) getDataForKey:(NSString*) key onSuccess:(id (^)())completionBlock
I cache the data returned from the server, so subsequent calls to that function should not make a network request unless some data is missing for the given key, in which case I need to load it from the server again.
Everything was okay as long as I had one request per screen, but now I have a case where I need to do that for every cell on one screen.
The problem is that my caching doesn't work, because before the response to the first request comes in, 5-6 more are created at the same time.
What could be a solution here to avoid creating multiple network requests and make the other calls wait for the first one?
You can try making a RequestManager class that uses a dictionary to keep track of the requests currently in flight. There are two ways to handle a duplicate:
If the next request is the same type as the first one, don't make a new request; return the first one. If you choose this solution, you need to manage a list of completionBlocks so that you can deliver the result to all requesters.
If the next request is the same type as the first one, wait on another thread until the first one is done, then make the new request; your API will read from the cache automatically. You must make sure your code is thread-safe.
Or you can use operation queues to do this. Some documents:
Apple: Operation Queues
Soheil Azarpour: How To Use NSOperations and NSOperationQueues
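The first option (hand every duplicate caller the request that is already in flight) might look roughly like this. It is a minimal Java sketch rather than Objective-C; RequestManager and its loader function are hypothetical names, and a CompletableFuture plays the part of the completionBlock list, since every caller attached to it is notified when it completes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Shares one in-flight load per key among all callers. */
class RequestManager<K, V> {
    private final ConcurrentMap<K, CompletableFuture<V>> inFlight =
        new ConcurrentHashMap<>();
    private final Function<K, CompletableFuture<V>> loader; // assumed: starts the network call

    RequestManager(Function<K, CompletableFuture<V>> loader) {
        this.loader = loader;
    }

    CompletableFuture<V> get(K key) {
        CompletableFuture<V> fresh = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, fresh);
        if (existing != null) {
            return existing; // join the request that is already running
        }
        try {
            // This caller owns the slot: start the load and forward its outcome.
            loader.apply(key).whenComplete((value, error) -> {
                inFlight.remove(key, fresh); // later calls may start a new load
                if (error != null) {
                    fresh.completeExceptionally(error);
                } else {
                    fresh.complete(value);
                }
            });
        } catch (RuntimeException e) {
            // The loader failed before returning a future; free the slot.
            inFlight.remove(key, fresh);
            fresh.completeExceptionally(e);
        }
        return fresh;
    }
}
```

This only removes duplicate requests while one is pending; the existing cache would still sit in front of it and answer calls made after the response has arrived.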
There may be many time-consuming solutions for this, but I have a trick. Create a BOOL in AppDelegate that defaults to FALSE, and set it to TRUE when you receive the first response. Then, before making a request from another screen, check the BOOL in an if condition: if it is TRUE, the response has been received, so go ahead; otherwise, don't do anything.
I have a simple yet time consuming operation:
when the user clicks a button, it performs a database intensive operation, processing records from an import table into multiple other tables, one import record at a time.
I have a View with a button that triggers the operation and at the end of the operation a report is displayed.
I am looking at ways to notify the user that the operation is being processed. Here is a solution that I liked.
I have been reading up online about asynchronous operations in MVC. I have found a number of links saying that if your process is CPU bound, you should stick to synchronous operations. Is a database-related process considered CPU bound or not?
Also, if I go the asynchronous route, should I use AsyncController as described here, or just use Task as in the example I mentioned and also here? Or are they all the same?
The first thing you need to know is that async doesn't change the HTTP protocol. As I describe on my blog, the HTTP protocol gives you one response for each request. So you can't return once saying it's "in progress" and return again later saying it's "completed".
The easy solution is to only return when it's completed, and just use AJAX to toss up some "in progress..." notification on the client side, updating the page when the request completes. If you want to get more complex, you can use something like SignalR to have the server notify the client when the request is completed.
In particular, an async MVC action does not return "early"; ASP.NET will wait until all the asynchronous actions are complete, and then send the response. async code on the server side is all about scalability, not responsiveness.
That said, I do usually recommend asynchronous code on the server side. The one exception is if you only have a single DB backend (discussed well in this blog post). If your backend is a DB cluster or a distributed/NoSQL/SQL Azure DB, then you should consider making it asynchronous.
If you do decide to make your servers asynchronous, just return Tasks; AsyncController is just around for backwards compatibility these days.
Assuming C# 5.0, I would do something like the following:
// A method to get your intensive dataset
public async Task<IntensiveDataSet> GetIntensiveDataSet()
{
    // In here you'll want to use any of the newer awaitable Async calls
    // available for your operations. This prevents thread blocking.
    var intensiveDataSet = new IntensiveDataSet();
    using (var sqlCommand = new SqlCommand(SqlStatement, sqlConnection))
    {
        using (var sqlDataReader = await sqlCommand.ExecuteReaderAsync())
        {
            while (await sqlDataReader.ReadAsync())
            {
                // Build out your intensive data set.
            }
        }
    }
    return intensiveDataSet;
}
// Then in your controller, some method that uses that:
public async Task<JsonResult> Intense()
{
    // AllowGet is needed because the client below calls this action with a GET
    return Json(await GetIntensiveDataSet(), JsonRequestBehavior.AllowGet);
}
In your JS you'd call it like this (with jQuery):
// .done() replaces the deprecated .success() callback
$.get('/ControllerName/Intense').done(function(data) {
    console.log(data);
});
Honestly, I'd just show some sort of spinner while it was running.
If you do need some sort of feedback to the user, you would have to sprinkle updates to your user's Session throughout your async calls... and in order to do that you'd need to pass a reference to that Session around. Then you'd just add another simple JsonResult action that checked the message in the Session variable and poll it with JQuery on an interval. Seems like overkill though. In most cases a simple "This may take a while" is enough for people.
You should consider implementing the asynchronous behavior with AJAX. You could handle the client-side "... processing" message right in your View, with minimal hassle:
$.ajax({
    url: '@Url.Action("ActionName")',
    data: data
}).done(function(data) {
    alert('Operation Complete!');
});
alert('Operation Started');
// Display processing animation
Handling async calls on the server side can be expensive, complicated and unnecessary.
I have a controller with two actions. One performs a very long computation, and at several steps, stores status in a session container:
public function longAction()
{
    $session = new Container('SessionContainer');
    $session->finished = 0;
    $session->status = "A";
    // do something long
    $session->status = "B";
    // do more long jobs
    $session->status = "C";
    // ...
}
The second action:
public function shortAction()
{
    $session = new Container('SessionContainer');
    return new JsonModel(
        array(
            'status' => $session->status
        )
    );
}
These are both called via AJAX, but I can reproduce the same behavior just using browser tabs. I first call /module/long, which does its thing. While it completes its tasks, calling /module/short (which I thought would just echo JSON) stalls until /module/long is done!
Bringing this up, some ZFers felt this was a valid protection against race conditions; but I can't be the only one with this use case that really doesn't care about the latter.
Any cheap tricks that avoid heading towards queues, databases, or memory caches? Trying to keep it lightweight.
This is the expected behavior. Here is why:
Sessions are identified using a cookie that stores the session id; this allows your browser to pick up the same session on the next request.
As your long process is using sessions, it will not call session_write_close() until the whole process has finished executing, meaning the session is still open while the long process is running.
When you connect from another browser tab, the browser will try to pick up the same session (using the same cookie), which is still open and running the long process.
If you open the link in a different browser, you will see that the page loads fine and does not wait for session_write_close() to be called. This is because it opens a separate session (however, you will not see the text you want, as it's a separate session).
You could try to manually write and close the session (session_write_close()), but that's probably not the best way to go about things.
It's definitely worth looking at something like Gearman for this, there's not that much extra work, and it's designed especially for this kind of async job processing. Even writing status to the database would be better, but that's still not ideal.
I'm trying to implement comet-style features by polling the server for changes in data and holding the connection open until there is something to respond with.
First, I have a static variable on my controller which stores the time that the data was last updated:
public static volatile DateTime lastUpdateTime = 0;
So whenever the data i'm polling changes this variable will be changed.
I then have an Action, which takes the last time that the data was retrieved as a parameter:
public ActionResult Push(DateTime lastViewTime)
{
    while (lastUpdateTime <= lastViewTime)
    {
        System.Threading.Thread.Sleep(10000);
    }
    return Content("testing 1 2 3...");
}
So if lastUpdateTime is less than or equal to lastViewTime, we know that there is no new data, and we simply hold the request there in a loop, keeping the connection open, until there is new information. We can then send that back to the client, which handles the response and makes a new request, so the connection is essentially always open.
This seems to work fine, but I'm concerned about thread safety. Is this OK? Does lastUpdateTime need to be marked as volatile? Is there a better way?
Thanks
Edit: perhaps I should use a lock object when I update the time value:
private static object lastUpdateTimeLock = new object();
..
lock (lastUpdateTimeLock)
{
    lastUpdateTime = DateTime.Now;
}
Regarding your original question, you do have to be careful with DateTimes. DateTime is a 64-bit struct, so it cannot be marked volatile, and reads and writes of it are not guaranteed to be atomic. Only a few data types (e.g. ints, bools) can be accessed natively without locking (assuming you're not using Interlocked). If you want to avoid any issues with DateTimes, you can get the ticks as a long and use the Interlocked class to manage them.
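The "store the time as a long and update it atomically" idea can be sketched as below. Note the swap: this is Java, where AtomicLong plays the role that Interlocked plays in .NET, and LastUpdateClock with its two methods is a hypothetical name invented for the illustration, not anything from the question's code.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Keeps a "last updated" time as epoch milliseconds in one atomic long. */
class LastUpdateClock {
    private final AtomicLong lastUpdateMillis = new AtomicLong(0L);

    /** Records an update atomically; the stored time never moves backwards. */
    void markUpdated(long nowMillis) {
        lastUpdateMillis.accumulateAndGet(nowMillis, Math::max);
    }

    /** True if the data changed after the caller last looked at it. */
    boolean hasChangedSince(long lastViewMillis) {
        return lastUpdateMillis.get() > lastViewMillis;
    }
}
```

A writer would call markUpdated(System.currentTimeMillis()) whenever the data changes, and the polling loop would test hasChangedSince(lastViewMillis) instead of comparing DateTime values directly.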
That said, if you're looking for comet capabilities in a .NET application, you're unfortunately going to have to go a lot further than what you've got here. IIS/ASP.NET won't scale with the approach you've got in place right now; you'll hit limits before you even get to 100 users. Among other things, you will have to switch to using async handlers, and implement a custom bounded thread pool for the incoming requests.
If you really want a tested solution for ASP.NET/IIS, check out WebSync, it's a full comet server designed specifically for that purpose.
Honestly my concern would be with the number of connections kept open and the empty while loop. The connections you're probably fine on, but I'd definitely want to do some load testing to be sure.
The while (lastUpdateTime <= lastViewTime) {} seems like it should have a Thread.Sleep(100) or something in there. Otherwise I'd think it would consume a lot of CPU cycles needlessly.
The lock does not seem necessary to me around lastUpdateTime = DateTime.Now since the previous value does not matter. If it were lastUpdateTime = lastUpdateTime + 1 or something, then maybe it would be.