Is it possible to configure the fine-grained CPU profile to record method invocations for the NashornScriptEngine?
I'm trying to analyse some very slow requests in a web server. I have configured a trigger in Perfino that records "fine-grained CPU data in profiling mode", so I can use JProfiler to inspect this data. But I cannot find how to make Perfino recurse into NashornScriptEngine methods.

In the method sampling configuration you can delete the "jdk." filter to measure these packages.


Load Test Application calling external http service

Thanks for looking at this question. I have an application that reads from a JMS queue, processes the messages, and POSTs each processed message to an external HTTP service. What would be the best way to load test it using Gatling?
I can simulate load on the queue using gatling.jms. How do I verify the POST to the external service?
Load testing with Gatling is fairly complex to get right. I've done it enough to know some of the pitfalls, so here is some insight that may be useful:
You want to test over the network, but with minimal latency, so that network delays don't distort the results and they show how quickly incoming HTTP requests are actually handled. For this reason, if your application is in the cloud in europe-east, say, run your tests from the same location. If your requests came from us-west, routing them from the far side of the US would add a big delay and could introduce big variations in the response times to/from your application.
Remove all other load from your service. If you can't remove load because you're hoping to test against a live application, then you need to make another deployment to test against that has no active load.
Load tests should run for (in my experience) 45 minutes as a minimum to verify your service can handle the load. The reason is that it can take time for an unbearable load to accumulate on the server. You may run at 33 req/s, which is fine for 40 minutes, but when run for 45-60 minutes it's just long enough that the balance between what your application can cope with and what causes catastrophic failure tips towards failure.
You don't need to test to destruction, but the breaking point is sometimes a useful metric to be aware of. I find a binary search strategy works well here to find peak load relatively quickly.
What you should test is that your application can handle the load you expect it to receive in a worst-case scenario. Different organisations have different tolerances for how much load they expect their applications to cope with. At some places I've worked, they used a lot of optimisations to minimise load going directly to their servers, but if those protections failed, the server was expected to handle 10x the usual load. At other places, those optimisations were not in place; instead there were disaster recovery systems available, ready to pick up when the main app failed. In that case the application only needed to handle 2x the peak load (as observed by assessing logs/metrics for the past year).
I work predominantly with garbage-collected languages on the JVM. I'm aware there are now Zero Garbage Collection designs/capabilities which could help minimize the effects of a buildup of GC tasks. There are almost always optimisations you can make, whether in language/memory settings, database indexing, your application itself, or the strategies you employ to perform a task, before you start changing the hardware.
Peak load can be assessed from logs/metrics systems.
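The binary search mentioned above could look roughly like the following Python sketch. Everything in it is illustrative: `sustains` is a hypothetical callback that runs one full-length test at a given request rate and reports whether the service held up, and `fake_load_test` with its 33 req/s threshold is a stand-in, not a measurement. It also assumes that once a rate fails every higher rate fails too, which real systems only approximate.

```python
from typing import Callable


def find_peak_rate(sustains: Callable[[int], bool], low: int, high: int) -> int:
    """Return the highest rate in [low, high] for which sustains(rate) is True.

    Assumes sustains(low) is True and that failures are monotonic:
    if a rate fails, every higher rate fails as well.
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if sustains(mid):
            low = mid        # mid held up: the peak is at mid or above
        else:
            high = mid - 1   # mid failed: the peak is below mid
    return low


# Stand-in for a real test run: pretend the service copes with up to 33 req/s.
def fake_load_test(rate: int) -> bool:
    return rate <= 33


peak = find_peak_rate(fake_load_test, low=1, high=200)
```

With a 45-minute minimum per run, the number of runs matters: searching 1 to 200 req/s this way takes about eight runs rather than dozens.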

Does Serilog.Sinks.Console sink get any benefits from wrapping to Serilog.Sinks.Async sink?

I am using Serilog inside my ASP.NET Core app for logging, and I need to write log events to the console pretty frequently (300-500 events per second). I run my app inside a Docker container and process the console logs using orchestrator tools.
So my question: should I use the Async wrapper for my Console sink, and will I get any benefit from that?
I read the documentation, but can't tell whether it applies to the Console sink or not.
The Async sink takes the already-captured LogEvent items and shifts them from multiple foreground threads to a single background processor using a ConcurrentQueue producer/consumer collection. In general that's a good thing for stable throughput, especially at that rate of events.
Also, if you're sending to more than one sink, shifting to a background thread that is scheduled as necessary and focuses on that workload (i.e., the paths propagating to sinks stay in cache) can be good if you have enough cores available and/or sinks block even momentarily.
Having said that, basing any decision on this information alone is premature optimization.
Whether a Console sink can ingest efficiently without blocking when you don't put an Async in front of it always depends a lot on the environment. For example, hosting environments typically synthesize a stdout that buffers efficiently. When that works well, adding an Async in front of the Console sink is merely going to prolong object lifetimes without much benefit versus letting each thread submit to the Console sink directly.
So, it depends. IME, feeding everything to Async and doing all processing there (e.g. writing to a buffered file, emitting every 0.5 s (perhaps to a sidecar process that forwards to your log store)) can work well. The bottom line is that a good load generator rig is a very useful thing for any high-throughput app. Once you have one, you can experiment - I've seen 30% throughput gains from reorganizing the exact same output and how it's scheduled (admittedly I also switched to Serilog during that transition - you're unlikely to see anything of that order).
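To make the hand-off concrete, here is a minimal Python sketch of the producer/consumer pattern described above. It is not Serilog's implementation: `BackgroundSink`, its `capacity` default and the drop-when-full policy are all assumptions chosen for illustration. Callers enqueue without blocking and a single background thread does the slow writes.

```python
import queue
import threading


class BackgroundSink:
    """Wraps a slow write function behind a bounded queue and one worker thread."""

    def __init__(self, write, capacity=10_000):
        self._write = write
        self._queue = queue.Queue(maxsize=capacity)
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def emit(self, event):
        """Called from request threads; returns False if the event was dropped."""
        try:
            self._queue.put_nowait(event)  # never blocks the caller
            return True
        except queue.Full:
            return False  # buffer full: drop rather than stall the caller

    def _drain(self):
        # Runs on the single background thread and performs the actual writes.
        while True:
            event = self._queue.get()
            if event is None:  # sentinel pushed by close(): stop the worker
                break
            self._write(event)

    def close(self):
        self._queue.put(None)  # waits for space if the queue is full
        self._worker.join()    # returns once everything queued has been written


written = []
sink = BackgroundSink(written.append)
for i in range(500):
    sink.emit(f"event {i}")
sink.close()
```

The trade-off mentioned above is visible here: every event lives in the queue until the worker reaches it, so object lifetimes get longer, and that cost only pays off if the wrapped write can actually block.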

In ASP.NET MVC, what is the correct way to do expensive operations without impacting other users?

I asked this question about 5 years ago about how to "offload" expensive operations that the user doesn't need to wait for (such as auditing, etc.) so they get a response on the front end quicker.
I now have a related but different question. I have built some reporting pages where you can generate Excel reports (I am using EPPlus) and PowerPoint reports (I am using Aspose.Slides). Here is an example controller action:
public ActionResult GenerateExcelReport(FilterParams args)
{
    byte[] results = GenerateLargeExcelReportThatTake30Seconds(args);
    return File(results, @"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", "MyReport.xlsx");
}
The functionality is working great, but I am trying to figure out whether these expensive operations (some reports can take up to 30 seconds to return) are impacting other users. In the previous question, I had an expensive operation that the user DIDN'T have to wait for, but in this case they do have to wait, as it's a synchronous activity (click Generate Report, and the expectation is that the user gets a report when it's finished).
In this case, I don't care that the main user has to wait 30 seconds, but I just want to make sure I am not negatively impacting other users because of this expensive operation, generating files, etc.
Is there any best practice for this use case?
You can try a combination of Hangfire and SignalR. Use Hangfire to kick off a background job and release the HTTP request. Once report generation is complete, use SignalR to send a push notification.
SignalR notification from server to client
An alternative is to implement a polling mechanism on the client side.
Send an AJAX call to enqueue a Hangfire job to generate the report.
Then start polling an API with another AJAX call that provides the status and, as soon as the report is ready, retrieve it. I prefer SignalR to polling.
If the report processing is impacting performance on the web server, offload that processing to another server. You can use messaging (ActiveMQ, RabbitMQ or some other framework of your choice) or a REST API call to kick off report generation on another server, then again use messaging or a REST API call to notify the web server that report generation is complete, and finally SignalR to notify the client. This will let the web server be more responsive.
Regarding your question
Is there any best practice for this use case
You have to monitor your application over time, on both the client side and the server side. There are a few tools you can rely upon, such as New Relic and AppDynamics. I have used New Relic, and it has features to track issues both in the client browser and on the server side. The names of the products are "NewRelic Browser" and "NewRelic Server". I am sure there are other tools that capture similar info.
Analyze the metrics over time, and if you see any anomalies, take appropriate action. If you observe server-side CPU and memory spikes, try capturing metrics on the client side around the same timeframe. If you notice timeouts or connection errors on the client side, that means your users are unable to connect to your app while the server is doing some heavy lifting. Next, try to identify server-side bottlenecks. If there is not enough room to performance-tune the code, go through a server capacity planning exercise and figure out how to scale your hardware further, or move the background jobs off the web servers to reduce load. Just capturing metrics using these tools may not be enough; you may have to instrument your application (log capturing) to capture additional metrics and properly monitor application health.
Here you can find some information about capacity planning for .net application from Microsoft.
These are all great ideas on how to move work out of the request/response cycle. But I think @leora simply wants to know whether a long-running request will adversely impact other users of an application.
The answer is no. ASP.NET is multi-threaded. Each request is handled by a separate worker thread.
In general it is considered good practice to run long-running tasks in the background and give the user some kind of notification when the job is done. As you probably know, web request execution time is limited (the httpRuntime executionTimeout setting, 110 seconds by default since .NET 2.0 and 90 seconds before that), so if your long-running task could exceed this, you have no choice but to run it in some other thread/process. If you are using .NET 4.5.2 you can use HostingEnvironment.QueueBackgroundWorkItem to run long-running tasks in the background and use SignalR to notify the user when the task has finished executing. If you are generating a file, you can store it on the server with some unique ID and send the user a link for downloading it. You can delete this file later (with a Windows service, for example).
As mentioned by others, there are some more advanced background task runners such as Hangfire, Quartz.Net and others, but the general concept is the same: run the task in the background and notify the user when it is done. Here is a nice article about different options for running background tasks.
You need to use C#'s async and await.
From your question I gathered that you are concerned with the request taking more resources than it should, rather than with scalability. If that's the case, make your controller actions async, as well as all the operations you call, as long as they involve calls that block threads. E.g. if your requests go over the wire or perform I/O operations, they will block the thread without async (technically, you will, since you wait for the response before continuing). With async, those threads become available while awaiting the response, so they can potentially serve other users' requests.
I assumed you are not wondering how to scale the requests. If you are, let me know, and I can provide details on that as well (too much to write unless it's needed).
I believe a tool/library such as Hangfire is what you're looking for. First, it allows you to specify a task to run on a background thread (in the same application/process). Combining it with techniques such as SignalR allows for real-time front-end notification.
However, something I set up after using Hangfire for nearly a year was splitting our job processing (and implementation) out to another server using this documentation. I use an internal ASP.NET MVC application to process jobs on a different server. The only performance bottleneck, then, is if both servers use the same data store (e.g. a database). If you're locking the database, the only way around it is to minimize the locking of said resource, regardless of the methodology you use.
I use interfaces to trigger jobs, stored in a common library:
public interface IMyJob
{
    MyJobResult Execute( MyJobSettings settings );
}
And, the trigger, found in the front-end application:
//tell the job to run
var settings = new MyJobSettings();
_backgroundJobClient.Enqueue<IMyJob>( c => c.Execute( settings ) );
Then, on my background server, I write the implementation (and hook it into the Autofac IOC container I'm using):
public class MyJob : IMyJob
{
    // implements IMyJob.Execute so the enqueued call resolves to this class
    public MyJobResult Execute( MyJobSettings settings )
    {
        //do stuff here
    }
}
I haven't messed too much with trying to get SignalR to work across the two servers, as I haven't run into that specific use case yet, but it's theoretically possible, I imagine.
You need to monitor your application to know whether other users are being affected, e.g. by recording response times.
If you find that this is affecting other users, you need to run the task in another process, potentially on another machine. You can use the Hangfire library to achieve this.
Using that answer, you can declare a Task with low priority
lowering priority of Task.Factory.StartNew thread
public ActionResult GenerateExcelReport(FilterParams args)
{
    byte[] result = null;
    // PriorityScheduler is the custom TaskScheduler from the linked answer, not a framework type
    Task.Factory.StartNew(() =>
    {
        result = GenerateLargeExcelReportThatTake30Seconds(args);
    }, CancellationToken.None, TaskCreationOptions.None, PriorityScheduler.BelowNormal)
        .Wait(); // block until the report is ready; otherwise result is still null here
    return File(result, @"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", "MyReport.xlsx");
}
Queue the jobs in a table, and have a background process poll that table to decide which Very Large Job needs to run next. Your web client would then need to poll the server to determine when the job is complete (potentially by checking a flag in the database, but there are other methods.) This guarantees that you won't have more than one (or however many you decide is appropriate) of these expensive processes running at a time.
Hangfire and SignalR can help you here, but a queueing mechanism is really necessary to avoid major disruption when, say, five users request this same process at the same time. The approaches mentioned that fire off new threads or background processes don't appear to provide any mechanism for minimizing processor / memory consumption to avoid disrupting other users due to consuming too many resources.
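As a rough illustration of the queueing idea (not Hangfire, and not tied to any web framework), here is a small Python sketch. The names `enqueue_report`, `job_status` and `fake_report` are made up for the example, and the in-memory `jobs` dictionary stands in for the database table, so this only shows the shape of the approach: a fixed number of workers caps how many expensive reports run at once, and the client polls by job id.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

# One worker means at most one expensive report runs at a time;
# further requests wait in the executor's internal queue.
executor = ThreadPoolExecutor(max_workers=1)
jobs = {}  # job id -> Future; a real application would persist this


def enqueue_report(generate, args):
    """Queue a report and return an id the client can poll with."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = executor.submit(generate, args)
    return job_id


def job_status(job_id):
    """What a polling endpoint would return for this job."""
    future = jobs[job_id]
    if not future.done():
        return "pending"
    return "failed" if future.exception() else "done"


def fake_report(args):  # stand-in for the 30-second report
    return f"report for {args}".encode()


# Five users request a report at the same moment; the jobs run one by one.
ids = [enqueue_report(fake_report, n) for n in range(5)]
results = [jobs[i].result() for i in ids]  # result() blocks until finished
statuses = [job_status(i) for i in ids]
executor.shutdown()
```

Raising `max_workers` is the knob for "however many you decide is appropriate".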

Is it possible to reduce the throughput of my pipeline?

I have a Dataflow job that communicates with external resources. The problem is that these external resources are slower than the Dataflow job, so they are always saturated. I need some way to reduce the number of messages read from PubSub, or otherwise reduce the throughput of the job, in order to reduce the traffic to the external resources.
We currently do not support throttling primitives (such as "make sure this DoFn is throttled to at most X calls per second over the whole job"), however we know it is an important use case and it will most likely be supported sooner or later.
Meanwhile your best bet is, as Ryan said, to limit the number of workers and worker threads: specify --numWorkers (or --maxNumWorkers if you are using autoscaling) and --numberOfWorkerHarnessThreads. However, note that this will lead to creating a backlog of input messages, rather than dropping them. It is hard to tell which is better in your use case.
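To pick values for those two settings, a back-of-the-envelope bound can help. The Python sketch below assumes each worker thread makes one blocking call to the external resource at a time, so concurrency is at most workers times threads; the function name and the example figures are hypothetical, and the real behaviour of the service may differ.

```python
def max_calls_per_second(num_workers: int, threads_per_worker: int,
                         seconds_per_call: float) -> float:
    """Upper bound on external calls per second if every thread blocks on one call."""
    concurrent_calls = num_workers * threads_per_worker
    return concurrent_calls / seconds_per_call


# Hypothetical figures: 2 workers, 4 harness threads each, 200 ms per call.
bound = max_calls_per_second(num_workers=2, threads_per_worker=4, seconds_per_call=0.2)
```

Here the external resource would see at most about 40 calls per second; if it can only absorb 20, halving either setting brings the bound down to that level, at the cost of a growing backlog.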

Load-testing web-app

When load testing a basic web application, what sanity checks do you do other than expected response time?
Is it fair to ask for peak memory usage?
What other checks do you make?
On the server
Requests per second the application can withstand
Requests per second that hit the database (if any, related to the number above, but it's useful to have them as separate figures)
Transferred bandwidth (separated by media type, if possible)
CPU utilization
Memory utilization
On the client
Response time
Weight of the average page
Is the CPU usage high at any time
Run something like YSlow to see what can you optimize on the output to make it quick for users
Stress testing tools usually come with most of these measures (except for memory, CPU and database usage), as do YSlow and Firebug on the client.
We look at a pretty wide variety of metrics when analyzing the results of a load test.
On the server, we start with these main 4 categories:
CPU (% utilization, context switches/sec, process queue length)
Memory (% use, page reads/sec, page writes/sec)
Bandwidth (incoming, outgoing, send & receive errors, # connections, connection failures, segment retransmits/sec)
Disk (Disk I/O Time %, avg service time, queue length, reads and writes/sec)
We also like to look at metrics specific to the web server and application server in use. For example, in IIS we look at IIS connection counts, cache hit rates and turnover frequency, etc. In .NET, we would be looking at ASP.NET Requests/sec, ASP.NET Last Request Execution Time, ASP.NET Current Requests, ASP.NET Queued Requests, ASP.NET Request Wait Time, ASP.NET Errors/sec and many others.
On the client side, we are primarily looking at total load time for the pages, duration and TTFB (time to first byte) for critical transactions, bandwidth usage, average page size and failure rate. We also find two metrics very useful - we call them Waiting Users and Average Wait Time. Not many tools have these - they tell you at each sample period exactly how many simulated users are in the process of retrieving a resource from the server and how long, on average, they have been waiting for the resource to arrive. We find these very useful for
determining when the server has reached its capacity
discovering that the server has stopped responding to certain types of requests (typically for certain resources, such as those requiring a database query)
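If your tool doesn't report Waiting Users and Average Wait Time, they can be approximated from a raw request log. The Python sketch below is one interpretation of the definitions given above, not the tool's own algorithm; the `(start, end)` log format and the `waiting_stats` name are assumptions for the example.

```python
from typing import List, Optional, Tuple

# Each request is (start_time, end_time) in seconds; end_time is None if
# no response had arrived by the end of the test.
Request = Tuple[float, Optional[float]]


def waiting_stats(requests: List[Request], sample_time: float) -> Tuple[int, float]:
    """Return (waiting users, average wait so far) at sample_time."""
    waits = [
        sample_time - start
        for start, end in requests
        if start <= sample_time and (end is None or end > sample_time)
    ]
    if not waits:
        return 0, 0.0
    return len(waits), sum(waits) / len(waits)


log = [(0.0, 1.0), (0.5, 4.0), (1.0, None), (2.5, 3.0)]
users, avg_wait = waiting_stats(log, sample_time=2.0)
```

At the 2.0 s sample two requests are in flight and they have waited 1.25 s on average. A waiting count that keeps climbing from one sample to the next is the capacity signal described above.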
Another good sanity check is to run the tests for at least 24 hours. We do that because one app ran nicely for a few hours and then degraded. We discovered some issues with scheduled tasks as well as DB connection pooling.
There are a number of services online that can do this type of testing for you as well. Of course, one of the downsides to this approach is that it's harder to correlate the data from the service (which is what can be observed externally) with your own internal data about disk I/O, DB ops, etc. If you end up going this route, I would suggest finding a vendor that will give you programmatic access to the raw test result data.
