Publishing to live – Get status and prevent timeouts

I have the following scenario:
Liferay Portal 6.1.20 EE GA2 behind IBM WebSEAL
Staged Sites
A custom portlet which needs to publish its contents from staging to live
The custom portlet publishes its contents with a class that extends BasePortletDataHandler and overrides the following methods:
doExportData
doImportData
doDeleteData
isAlwaysExportable
isPublishToLiveByDefault
isAlwaysStaged
This works quite well in development, where there is no WebSEAL: in the Control Panel you go to "Site Pages" and invoke "Publish to Live".
In production, however, we get WebSEAL timeouts whenever this process takes more than 2 minutes. The process is still running in the background, but the user has no way of telling whether it is done, whether it worked or whether it failed; they get no feedback about it whatsoever.
Is there a way to implement a custom portlet for the control panel which takes care of these problems? How do I get/track the status of the process and how do I keep the session alive?

I don't have any experience with Liferay, but I administer WebSEAL daily, so I can approach your question from that angle. You can increase the timeouts on individual junctions. I have encountered similar scenarios with applications in the past; we have had to go up to a 300-second timeout.
[junction:junction_name]
http-timeout = 300
https-timeout = 300
http://publib.boulder.ibm.com/infocenter/tivihelp/v2r1/index.jsp?topic=%2Fcom.ibm.itame.doc_6.1.1%2Fam611_webseal_admin95.htm
You may also need to increase the server timeouts:
[server]
client-connect-timeout = 300
http://publib.boulder.ibm.com/infocenter/tivihelp/v2r1/topic/com.ibm.itame.doc_6.1.1/am611_webseal_admin94.htm?path=3_10_3_3_1_4_0_6_5#http-https-timeouts
The problem is that the application doesn't send any data over the TCP connection while the publish runs, so WebSEAL times out the connection. Unless you can change the way your application works, you'll have to increase the timeouts. Preferably you would use AJAX or a similar technique to have the client routinely query the server for status once the procedure is kicked off. However, I had a customer that was integrating with us and couldn't change their application code, so I was forced to increase the timeouts for them as well.
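For the status-polling idea, a minimal client-side sketch is below. It assumes the custom portlet exposes some status endpoint that reports whether the publish is still running; the URL and the JSON shape are made up for illustration and are not part of Liferay's API. As a side effect, each short poll sends traffic through the junction, so the user gets feedback without any single request living long enough for WebSEAL to kill it.

// Sketch: poll a hypothetical publish-status endpoint until the background
// publish finishes. The URL and response format are assumptions.
async function pollPublishStatus(statusUrl: string, intervalMs = 10000): Promise<void> {
  for (;;) {
    const response = await fetch(statusUrl, { credentials: "include" });
    const status: { state: "RUNNING" | "SUCCESS" | "FAILED"; message?: string } =
      await response.json();
    if (status.state !== "RUNNING") {
      console.log(`Publish finished: ${status.state}`, status.message ?? "");
      return;
    }
    // Each poll is a short request/response, so WebSEAL never sees one long
    // silent connection; wait a bit before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage (hypothetical): kick off the publish asynchronously on the server, then
// pollPublishStatus("/delegate/publish-status?processId=123");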

Related

Large percent of requests in CLRThreadPoolQueue

We have an ASP.NET MVC application hosted in an Azure App Service. After running the profiler to help diagnose possible slow requests, we were surprised to see this:
An unusually high percentage of slow requests in the CLRThreadPoolQueue. We've now run multiple profiling sessions, each coming back with between 40-80% in the CLRThreadPoolQueue (something we'd never seen before in previous profiles). CPU each time was below 40%, and after checking our metrics we aren't getting sudden spikes in requests.
The majority of the requests listed as slow are super simple API calls. We've added response caching and made them async. The only thing they do is hit a database looking for a single-record result. We've checked the metrics on the database and the average query run time is around 50 ms or less. Looking at Application Insights for these requests confirms this, and shows that the database query doesn't take place until the very end of the request timeline (I assume this is the request sitting in the queue).
Recently we started including SignalR in a portion of our application. It's not fully in use but it is in the code base. We have since switched to using Azure SignalR Service and saw no changes. The addition of SignalR is the only "major" change/addition we've made since encountering this issue.
I understand we can scale up and/or increase the minWorkerThreads. However, this feels like I'm just treating the symptom not the cause.
Things we've tried:
Finding the most frequent requests and making them async (they weren't before)
Response caching to frequent requests
Using Azure SignalR Service rather than hosting it in the same web app
Running memory dumps and contacting Azure support (they found nothing)
Scaling up to an S3
Profiling with and without thread report
-- None of these steps have resolved our issue --
How can we determine what requests and/or code is causing requests to pile up in the CLRThreadPoolQueue?
We encountered a similar problem; I guess SignalR must internally be using up a lot of threads or some other contended resource.
We did three things that helped a lot:
Call ThreadPool.SetMinThreads(400, 1) on app startup to make sure that the threadpool has enough threads to handle all the incoming requests from the start
Create a second App Service with the same code deployed to it, and in the JavaScript set the SignalR URL to point to that second instance (see the client sketch after this list). That way, all the SignalR requests go to one App Service, and all the app's HTTP requests go to the other. Obviously this requires a SignalR backplane to be set up, but assuming your App Service has more than one instance you'll have had to do this anyway
Review the code for any synchronous code paths (e.g. making a non-async call to the database or to an API) and convert them to async code paths
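For the second point, the only client-side change is where the SignalR connection points. A rough sketch with the @microsoft/signalr TypeScript client is below; the host name and hub path are placeholders, and the second App Service still needs the backplane mentioned above.

import * as signalR from "@microsoft/signalr";

// Placeholder host: a second App Service running the same code, dedicated to SignalR.
const SIGNALR_HOST = "https://myapp-signalr.azurewebsites.net";

const connection = new signalR.HubConnectionBuilder()
  .withUrl(`${SIGNALR_HOST}/notificationHub`) // hub path is an assumption
  .withAutomaticReconnect()
  .build();

connection.on("notify", (payload: unknown) => {
  console.log("Server push:", payload);
});

// All long-lived SignalR connections now land on the dedicated instance
// instead of competing with ordinary requests for threadpool threads.
connection.start().catch((err) => console.error("SignalR connect failed", err));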

IIS Web Farm AppPool warm-up

I have multiple servers (2012 R2 with IIS 8.5) that have shared configuration, shared vanity URL (f5 load balanced), and host several different applications. One of the applications (an ASP.NET MVC web app) is rarely used (maybe once or twice a week) but when it needs to be used, it needs to load quickly.
I've set the AppPool to have a Start Mode of "AlwaysRunning" and a Recycling -> Regular Time Interval of 0, but it seems like every time I hit the app it takes forever to load (like 10-20 seconds), while subsequent page requests happen instantly.
Is there another setting that I need to set to keep the app warmed up? The app has Kerberos Authentication and access is limited to one security group (that I'm not even a member of), so I can't use external PowerShell scripts to manually keep it warm.
You can check to see if the application pool is running before you hit the app.
If you click on your server name in IIS and then click on "Worker Processes", you'll see all the process IDs of the different application pools and their state.
This way you can confirm the app pool is running before you access the application. This will help you narrow down where the problem exists.
1) Is the app pool running?
2) Is my app loaded in my app pool?
If 1 checks out, then move on to step 2 and check whether the libraries of that application are loaded in that process ID.
Check your event log for application pool failures.
If you have some asynchronous initialisation/maintenance task which is started in parallel with the request or with some delay and subsequently fails, it can make the request (and some afterward) succeed but kill the application pool shortly after. This would exhibit these exact symptoms.

Azure Websites and ASP.NET, how much inactivity before the app pool is recycled causing a recompilation?

I have an MVC 3, .NET 4.5 ASP.NET web application hosted on Azure Websites.
I am experimenting with the "Free", "Shared" and "Standard" scaling configurations.
I have noticed that after a period of inactivity the compiled code gets dropped from memory, or the app pool gets recycled, forcing a JIT recompile.
My main question is: what is the time period before the compiled code gets dropped, forcing a recompile? I assume this is a result of the application pool recycling? I have come across this on standard shared hosts such as DiscountASP.
My second question is: what is the best approach to minimise this issue, as I would not like my users bumping into this recompilation lag? My initial thoughts are precompilation.
Many thanks in advance.
EDIT:
I have a found a related SO post on this here: App pool timeout for azure web sites
However, it seems that, as with standard shared hosting, one cannot change app pool recycling. One has more flexibility with the "Standard" scale option, since it is dedicated. So the likely options at present are:
1) Precompilation
2) Use of "Keep alive" ping sites.
EDIT2:
1) "Keep Alive" approach seems to be working. I have a 10 minute monitor running.
I believe the inactivity period is 20 minutes by default. I haven't used Web Sites yet so I'm not familiar with the restrictions on changing settings, but one quick way to keep your site active is to use an uptime monitoring service like Pingdom (you can check one site for free at the time of writing); this will ping your site regularly and prevent it from becoming idle.
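If you'd rather not depend on an external service, a keep-alive pinger is only a few lines of script. A sketch, assuming Node 18+ for the built-in fetch and using placeholder values for the URL and interval:

// Minimal keep-alive pinger: request the site every 10 minutes so the app
// never reaches the ~20-minute idle period mentioned above.
const SITE_URL = "https://example-site.azurewebsites.net/"; // placeholder
const INTERVAL_MS = 10 * 60 * 1000;

async function ping(): Promise<void> {
  const started = Date.now();
  try {
    const res = await fetch(SITE_URL);
    console.log(`${new Date().toISOString()} ${res.status} in ${Date.now() - started} ms`);
  } catch (err) {
    console.error("Keep-alive ping failed:", err);
  }
}

ping();
setInterval(ping, INTERVAL_MS);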

Best way to run rails with long delays

I'm writing a Rails web service that interacts with various pieces of hardware scattered throughout the country.
When a call is made to the web service, the Rails app then attempts to contact the appropriate piece of hardware, get the needed information, and reply to the web client. The time between the client's call and the reply may be up to 10 seconds, depending upon lots of factors.
I do not want to split the web service call in two (ask for information, answer immediately with a pending reply, then force another api call to get the actual results).
I basically see two options: either run JRuby and use multithreading, or run several regular Ruby instances and hope that not many people try to use the service at the same time. JRuby seems like the much better solution, but it still doesn't seem to be mainstream or to have out-of-the-box support at Heroku and EngineYard. The multiple-instance solution seems like a total kludge.
1) Am I right about my two options? Is there a better one I'm missing?
2) Is there an easy deployment option for JRuby?
I do not want to split the web service call in two (ask for information, answer immediately with a pending reply, then force another api call to get the actual results).
From an engineering perspective, this seems like it would be the best alternative.
Why don't you want to do it?
There's a third option: If you host your Rails app with Passenger and enable global queueing, you can do this transparently. I have some actions that take several minutes, with no issues (caveat: some browsers may time out, but that may not be a concern for you).
If you're worried about browser timeout, or you cannot control the deployment environment, you may want to process it in the background:
User requests data
You enter request into a queue
Your web service returns a "ticket" identifier to check the progress
A background process processes the jobs in the queue
The user polls back, referencing the "ticket" id
As far as hosting in JRuby, I've deployed a couple of small internal applications using the glassfish gem, but I'm not sure how much I would trust it for customer-facing apps. Just make sure you run config.threadsafe! in production.rb. I've heard good things about Trinidad, too.
You can also run the web service call in a delayed background job so that it's not tying up a web server, and it can even be run on a separate physical box. This is also a much more scalable approach. If you make the web call using AJAX, you can ping the server every second or two to see if your results are ready; that way your client is not held in limbo while the results are being calculated, and the request does not time out.
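If you do go with the ticket-and-poll flow described in the list above, the client side is straightforward. A hedged sketch, assuming hypothetical /jobs endpoints that return a ticket id and then a status:

// Sketch of the ticket workflow: enqueue the hardware request, then poll for
// the result every couple of seconds. The /jobs endpoints and JSON shapes are
// assumptions, not part of any real API.
interface JobStatus {
  state: "queued" | "working" | "done" | "error";
  result?: unknown;
}

async function fetchHardwareData(params: Record<string, string>): Promise<unknown> {
  // 1. Enqueue the job and receive a ticket identifier.
  const enqueue = await fetch("/jobs", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(params),
  });
  const { ticket } = (await enqueue.json()) as { ticket: string };

  // 2. Poll until the background worker finishes.
  for (;;) {
    const res = await fetch(`/jobs/${ticket}`);
    const status = (await res.json()) as JobStatus;
    if (status.state === "done") return status.result;
    if (status.state === "error") throw new Error("Hardware request failed");
    await new Promise((resolve) => setTimeout(resolve, 2000));
  }
}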

Web App Performance Problem

I have a website that is hanging every 5 or 10 requests. When it works, it works fast, but if you leave the browser sit for a couple minutes and then click a link, it just hangs without responding. The user has to push refresh a few times in the browser and then it runs fast again.
I'm running .NET 3.5 and ASP.NET MVC 1.0 on IIS 7.0 (Windows Server 2008). The web app connects to a SQL Server 2005 DB that is running locally on the same instance. The DB has about 300 MB of RAM and the rest is free for web requests, I presume.
It's hosted on GoGrid's cloud servers, and this instance has 1GB of RAM and 1 Core. I realize that's not much, but currently I'm the only one using the site, and I still receive these hangs.
I know it's a difficult thing to troubleshoot, but I was hoping that someone could point me in the right direction as to possible IIS configuration problems, or what the "rough" average hardware requirements would be for these technologies per 1000 users, etc. Maybe for a web server the minimum I should have is 2 cores, so that if it's busy you still get a response. Or maybe the Slashdot people are right and I'm an idiot for using Windows, period, lol. In my experience, though, it's usually MY algorithm/configuration error and not the underlying technology's fault.
Any insights are appreciated.
What diagnostics are available to you? Can you tell what happens when the user first hits the button? Does your application see that request and then take ages to process it, or is there a delay and then your app gets going and works as quickly as ever? Or does that first request just get lost completely?
My guess is that there's some kind of paging going on; I believe Windows tends to have a habit of putting non-recently-used apps out of the way and then paging them back in. Is that happening to your app, or the DB, or both?
As an experiment, what happens if you have a sneaky little "howAreYou" page in your app that does the tiniest possible amount of work, such as getting a use count from the DB and displaying it? Have a little monitor client hit that page every minute or so and measure performance over time. Spikes? Consistency? Does the very presence of activity maintain your application's presence and prevent paging?
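A monitor along those lines can be a few lines of script. A sketch, assuming a hypothetical /howAreYou route; it just records response times so a slow first hit after an idle period stands out:

// Tiny latency monitor for a hypothetical /howAreYou page: one sample per
// minute, with a running average so spikes are easy to spot.
const PAGE_URL = "https://example.com/howAreYou"; // placeholder
const samples: number[] = [];

async function sample(): Promise<void> {
  const started = Date.now();
  try {
    await fetch(PAGE_URL);
    const elapsed = Date.now() - started;
    samples.push(elapsed);
    const avg = samples.reduce((a, b) => a + b, 0) / samples.length;
    console.log(`${new Date().toISOString()} ${elapsed} ms (avg ${avg.toFixed(0)} ms over ${samples.length} samples)`);
  } catch (err) {
    console.error("howAreYou request failed:", err);
  }
}

sample();
setInterval(sample, 60000);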
Another idea: do you rely on any caching? Do you have any kind of aging on that cache?
Your application pool may be shutting down because of inactivity. There is an Idle Time-out setting per pool, in minutes (it's under the pool's Advanced Settings - Process Model). It will take some time for the application to start again once it shuts down.
Of course, it might just be the virtualization like others suggested, but this is worth a shot.
Is the site getting significant traffic? If so I'd look for poorly-optimized queries or queries that are being looped.
Your configuration sounds fine assuming your overall traffic is relatively low.
Too many database connections without being released?
Connecting to some service/component that is causing a timeout?
Bad resource release?
Network traffic?
Looping queries, or loops in code logic?

Resources