Automating Azure VIP Swap - asp.net-mvc

I have an ASP.NET MVC 4 app hosted as an Azure web role. I want to do something that seems like it should be pretty standard: I want to create a function that I can call that initiates a VIP swap and raises an event (or calls a callback) when the VIP swap operation is done.
Just to add some context to the situation: My website implements a workflow that takes about an hour (or less) to complete. If I want to release a new version of the website code, it's convenient (i.e. much less "backward compatibility" code to write) to first let all of the current users complete the workflow so that the new code doesn't need to deal with data created by the previous version of the code. So a management function in my website would first poke a value into the database that disables new workflows; it would then wait until all current workflows are done; it would then call the "VIP Swap" routine; finally, when the VIP Swap routine signals its completion, it would poke the database value to re-enable new workflows.
I found the Microsoft documentation for how to programmatically initiate a VIP swap here:
http://msdn.microsoft.com/en-us/library/ee460814.aspx
The procedure involves POSTing to a magic URL and including some headers in the POST, then periodically performing a GET to a magic URL and checking the response code.
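Here is roughly what I've pieced together so far, as an untested sketch; the URL shapes, the x-ms-version value, and the XML body are just my reading of that page, and the request has to carry a management certificate:
// Untested sketch of the documented Swap Deployment flow.
// subscriptionId, serviceName, the deployment names, and the certificate
// are all things you supply; verify URLs and headers against the MSDN page.
using System;
using System.IO;
using System.Net;
using System.Security.Cryptography.X509Certificates;
using System.Text;
using System.Threading;

static class VipSwap
{
    const string Version = "2012-03-01"; // check the docs for the right value

    public static string Initiate(string subscriptionId, string serviceName,
        string productionDeployment, string stagingDeployment, X509Certificate2 cert)
    {
        var uri = string.Format(
            "https://management.core.windows.net/{0}/services/hostedservices/{1}",
            subscriptionId, serviceName);
        var request = (HttpWebRequest)WebRequest.Create(uri);
        request.Method = "POST";
        request.ClientCertificates.Add(cert);
        request.Headers.Add("x-ms-version", Version);
        request.ContentType = "application/xml";

        var body = "<Swap xmlns=\"http://schemas.microsoft.com/windowsazure\">"
            + "<Production>" + productionDeployment + "</Production>"
            + "<SourceDeployment>" + stagingDeployment + "</SourceDeployment>"
            + "</Swap>";
        var bytes = Encoding.UTF8.GetBytes(body);
        using (var stream = request.GetRequestStream())
            stream.Write(bytes, 0, bytes.Length);

        using (var response = (HttpWebResponse)request.GetResponse())
            return response.Headers["x-ms-request-id"]; // operation id to poll
    }

    public static void WaitForCompletion(string subscriptionId, string requestId,
        X509Certificate2 cert)
    {
        var uri = string.Format(
            "https://management.core.windows.net/{0}/operations/{1}",
            subscriptionId, requestId);
        while (true)
        {
            var request = (HttpWebRequest)WebRequest.Create(uri);
            request.ClientCertificates.Add(cert);
            request.Headers.Add("x-ms-version", Version);
            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                var xml = reader.ReadToEnd();
                // Status is InProgress, Succeeded, or Failed per the docs
                if (!xml.Contains("<Status>InProgress</Status>")) return;
            }
            Thread.Sleep(TimeSpan.FromSeconds(10)); // poll interval
        }
    }
}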
The more I think about this, the more non-trivial it seems. In addition to the basic complexities of wiring up a background timer and completion notification, I don't know what complexities, if any, I might run into trying to do this stuff in the IIS environment. Can I even perform HTTP operations on a background thread? For that matter, will I run into complications just trying to use any of the half dozen or so different "do things in the background" mechanisms baked into .NET?
Any help or guidance will be greatly appreciated. In particular, I'd be ecstatic if someone could point me at a ready-to-go implementation of this function!

I don't think you will find an easy solution to this, as the fabric controller is set up to do some very fancy things without your involvement. Running hour-long workflows in a cloud environment, where an instance can be pulled out from underneath you (with a maximum of 5 minutes from the OnStopping event being called in which to clean up), requires that you do other work anyway to make sure that all of your tasks complete.
The simple question is: what do you do if an instance goes down while workflows are still running? Do you restart them, or are they lost? If they get lost then you don't care anyway, so killing workflows for an upgrade is equally unimportant. If you restart them, then use that same mechanism to decide whether or not a node is due to be shut down, and distribute the jobs accordingly. This pattern is eerily similar to the Hadoop JobTracker. Don't just run the workflows on any ol' instance. Submit them to a (job tracker) service that decides what to do. The (job tracker) service can then use the Service Management API to scale up as many instances as you need running the version that you want, run workflows on the appropriate nodes, and shut them down when they are no longer needed or are outdated.
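To make the shape of that concrete, the tracker's contract might be as small as this (names and types are entirely made up; the scaling call stands in for whatever Service Management API wrapper you use):
using System;

// Hypothetical job-tracker contract; illustration only.
public class WorkflowRequest { public string Payload; }

public interface IJobTracker
{
    // Queue a workflow; the tracker decides which instance, running
    // which code version, picks it up.
    Guid Submit(WorkflowRequest request);

    // Before an upgrade: stop handing new work to instances on the old
    // version, and report true once they have drained.
    bool DrainInstances(string oldVersion);

    // Wraps the Service Management API: add/remove instances as load demands.
    void SetTargetInstanceCount(string version, int count);
}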
Unfortunately this may not be the simple solution that you are looking for, but something in your architecture needs to change, rather than trying to force PaaS to fit your current approach. Decompose your workloads, create loosely coupled services, design for failure, and consider the other standard cloud/distributed-computing practices. There is a reason why Hadoop is built the way it is: it has a reputation for being able to get work done on a bunch of somewhat unreliable commodity hardware.

Related

Should I use a Container/Service Fabric Guest Executable for a scheduled daily workload?

This is a more general question about which types of payloads to host in a container. In our case we will use Service Fabric guest executables. For this post I will use the word Container to refer to both, because they have similar properties and I think more people understand a container than an SF Guest Exe.
WebAPIs/services that need to scale are a good fit for containers, but this question is about what we call a "Batch" job. The nomenclature comes from the old .bat files, but in our case we are using a .NET Framework or .NET Core .exe (console apps).
Currently Windows Task Scheduler kicks off the batch, running under a service account on a VM. We want the processing to happen at a certain time of day or day of the week, and not before or after. There is no real scaling here: there is one instance, which may or may not be multithreaded, and on average they run between 2 and 15 minutes and then stop (some longer, some shorter). I understand there are limitations to this approach, but this is the type of payload I'm discussing here.
As we modernize the technology stack we are looking to use the orchestrator as much as possible. As a technologist I've always tried to understand the different tools in our tool belt and not use a tool just because it's the one I used last, but instead use the correct tool for the task.
We started out by not writing any more .NET console apps. Instead we put the business logic of these "batches" into WebAPIs and had the Task Scheduler call the API when it needed to perform its action. If I put this into Service Fabric and host it, my concern is that system resources are consumed for 23 hours and 45 minutes a day when they are not being used. That seems to be the opposite of what you would expect when using a container.
Now if I could spin up a Service Fabric Guest Exe/Container on demand, and then destroy the instance of the app after it finishes, that could fit the need. Then I could have the benefits of the orchestrator without the detriment of having it consume resources all the time. I would hope to retire the batch server (VM), as its hardware usage is not optimized, and instead add resources to the cluster.
UPDATE
Looking at Vaclav's scalability doc, I think there might be a use case in here? https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-concepts-scalability He uses a "Workload Manager Service" combined with CreateServiceAsync to spin up an instance of the service on demand. I guess I would deploy the app to the image store but not create an instance of the app until needed. Then I need to figure out how to end it: is it as simple as changing the infinite loop in Program.cs? The thing is, it doesn't look like there is a Program.cs in a Guest Executable.
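If I understand the sample, the on-demand piece would look roughly like this from the workload manager (application/service names are placeholders, and I haven't verified the exact signatures against the current SDK):
using System;
using System.Fabric;
using System.Fabric.Description;
using System.Threading.Tasks;

// Rough sketch: create a named service instance on demand, delete it
// when the batch reports completion. The fabric:/ names are placeholders.
static class OnDemandBatch
{
    public static async Task RunAsync()
    {
        var client = new FabricClient();
        var serviceName = new Uri("fabric:/BatchApp/NightlyJob");

        await client.ServiceManager.CreateServiceAsync(new StatelessServiceDescription
        {
            ApplicationName = new Uri("fabric:/BatchApp"),
            ServiceName = serviceName,
            ServiceTypeName = "NightlyJobType",
            InstanceCount = 1,
            PartitionSchemeDescription = new SingletonPartitionSchemeDescription()
        });

        // ... wait for the job to signal completion (queue message, status endpoint, etc.)

        await client.ServiceManager.DeleteServiceAsync(
            new DeleteServiceDescription(serviceName));
    }
}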
This looks like a way to run a package until completion, which was released as part of 7.1. But how do we start a second execution of the service? I want to execute based on a request coming in.
https://learn.microsoft.com/en-us/azure/service-fabric/run-to-completion
Thoughts?

Durable tasks sub orchestration with micro services

I'm attempting to use Azure Durable Tasks to orchestrate some microservices, but am running into a small gap in understanding how task hubs work, as well as how to coordinate the projects correctly.
I'm trying to create a main orchestrator that is in charge of kicking off sub orchestrations to do the actual work. Below is a diagram of what I'm trying to achieve.
The idea is that each .NET project will be able to scale independently of the others, so if .NET project 2 were under quite a bit of load I'd be able to scale that project only and not have to worry about the other two projects. The problem I'm running into is that, from what I understand, the task hub queue is shared by all the services, so there is no way to have each process focus only on its own work: each project can see everything in the queue, and one project may dequeue a message intended for project 2. Is this correct?
From reading the documentation, it doesn't seem clear that I can send project 2 its sub-orchestration messages while also sending project 3 its specific orchestrations.
Am I thinking about this problem incorrectly, is there a different way I might want to approach this?
What you want cannot be achieved.
As of now, Azure Functions only allows orchestrator functions to call activity and sub-orchestrator functions that exist in the same function app. The main reason is a technical one: queues within a task hub are shared across all functions, so there's no way to guarantee that a message intended for FunctionAppA does not get picked up by FunctionAppB.
If cross-project communication is required, the correct method is to use HTTP or queues.
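If you go the HTTP route, Durable Functions 2.x can call an endpoint directly from the orchestrator in a replay-safe way. A minimal sketch; the target URL is a placeholder for whatever starter endpoint project 2 exposes:
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class MainOrchestrator
{
    [FunctionName("MainOrchestrator")]
    public static async Task Run(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // Cross-app call: project 2 exposes an HTTP starter instead of
        // sharing this task hub. The URL below is a placeholder.
        DurableHttpResponse response = await context.CallHttpAsync(
            HttpMethod.Post,
            new Uri("https://project2.example.com/api/orchestrators/DoWork"));

        // ... check response.StatusCode, correlate on the returned instance id, etc.
    }
}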

What makes erlang scalable?

I am working on an article describing the fundamentals of technologies used by scalable systems. I have worked on Erlang before in a self-learning exercise. I have gone through several articles but have not been able to answer the following questions:
What is in the implementation of Erlang that makes it scalable? What makes it able to run concurrent processes more efficiently than technologies like Java?
What is the relation between functional programming and parallelization? With the declarative syntax of Erlang, do we achieve run-time efficiency?
Does process state not make it heavy? If we have thousands of concurrent users and spawn an equal number of processes as gen_server or any other equivalent pattern, each process would maintain its own state. With so many processes, will that not be a drain on the RAM?
If a process has to make DB operations and we spawn multiple instances of that process, eventually the DB will become a bottleneck. This happens even if we use traditional models like Apache-PHP. Almost every business application needs DB access. What then do we gain from using Erlang?
How does process restart help? A process crashes when something is wrong in its logic or in the data. OTP allows you to restart a process. If the logic or data does not change, why would the process not crash again and keep crashing always?
Most articles sing the praises of Erlang, citing its use at Facebook and WhatsApp. I salute Erlang for being scalable, but I also want to technically justify its scalability.
Even if I find answers to these queries on an existing link, that will help.
In short:
It's immutable. You have no mutable variables, only terms such as tuples and atoms. Program execution can be preempted at any point, and the model is fully transactional.
Processes are even more lightweight than .NET threads, and they are isolated from each other.
It's made for communication. Millions of connections? Fully asynchronous? Maximum thread safety? A big cross-platform environment built for one purpose, to scale and communicate? That's Ericsson's language, the first in this sphere.
You can choose imitators like F#, Scala/Akka, or Haskell; they try to copy features from Erlang, but only Erlang was born from, and for, a single purpose: telecom.
Answers to your other questions can be found on erlang.org, and I suggest you read the documentation there. Erlang was built for particular aims, so it's not for every task, and if you're asking about awful things like PHP, Erlang will not be your language.
I'm no Erlang developer (yet), but from what I have read, one of the features that makes it very scalable is that Erlang has its own lightweight processes that use message passing to communicate with each other. Because of this there is no shared state and no locking, which is what you get in, for example, a multithreaded Java application.
Another difference compared to Java is that the Erlang VM garbage-collects each small process independently, so each collection is nearly instantaneous, whereas the JVM collects one large heap for the whole VM.
If you run into bottlenecks with database connections, you could start by using a connection-pooling app in front of, say, a replicated PostgreSQL cluster, or, if you still have bottlenecks, use a replicated NoSQL setup with Mnesia, Riak, or CouchDB.
I think process restarts can be very useful when you are experiencing rare bugs that only appear randomly and only when specific criteria are fulfilled. Bugs that cause the application to crash as soon as you restart the app should optimally be fixed, or handled with a circuit breaker so that the failure does not spread further.
Here is one way process restart helps: by not having to deal with all possible error cases. Say you have a program that divides numbers, and some guy enters a zero to divide by. Instead of checking for that possible error (and tons more), just code the "happy case" and let the process crash when he enters 3/0. It just restarts, and he can figure out what he did wrong.
You can extend this to any number of situations (attempting to read from a non-existent file because the user misspelled it, etc.).
The big reason for process restart being valuable is that not every error happens every time, and checking that it worked is verbose.
Error handling is typically verbose, so writing it interspersed with the logic of doing a task can make the code harder to understand. Moving that logic outside of the task allows you to more clearly distinguish between "doing things" code and "it broke" code. You just let the thing that had a problem fail, and handle it as needed by a supervising party.
Since most errors don't mean that the entire program must stop, only that one particular thing isn't working right, restarting just the part that broke lets you keep operating in a state of degraded functionality, instead of being down, while you repair the problem.
It should also be noted that the failure recovery is bounded. You have to lay out the limits for how much failure in a certain period of time is too much. If you exceed that limit, the failure propagates to another level of supervision. Each restart includes doing any needed process initialization, which is sometimes enough to fix the problem. For example, in dev, I've accidentally deleted a database file associated with a process. The crashes cascaded up to the level where the file was first created, at which point the problem rectified itself, and everything carried on.

Best way to run rails with long delays

I'm writing a Rails web service that interacts with various pieces of hardware scattered throughout the country.
When a call is made to the web service, the Rails app then attempts to contact the appropriate piece of hardware, get the needed information, and reply to the web client. The time between the client's call and the reply may be up to 10 seconds, depending upon lots of factors.
I do not want to split the web service call in two (ask for information, answer immediately with a pending reply, then force another api call to get the actual results).
I basically see two options: either run JRuby and use multithreading, or run several regular Ruby instances and hope that not many people try to use the service at a time. JRuby seems like the much better solution, but it still doesn't seem to be mainstream or to have out-of-the-box support at Heroku and EngineYard. The multiple-instance solution seems like a total kludge.
1) Am I right about my two options? Is there a better one I'm missing?
2) Is there an easy deployment option for JRuby?
I do not want to split the web service call in two (ask for information, answer immediately with a pending reply, then force another api call to get the actual results).
From an engineering perspective, this seems like it would be the best alternative.
Why don't you want to do it?
There's a third option: If you host your Rails app with Passenger and enable global queueing, you can do this transparently. I have some actions that take several minutes, with no issues (caveat: some browsers may time out, but that may not be a concern for you).
If you're worried about browser timeout, or you cannot control the deployment environment, you may want to process it in the background (a minimal sketch follows these steps):
User requests data
You enter request into a queue
Your web service returns a "ticket" identifier to check the progress
A background process processes the jobs in the queue
The user polls back, referencing the "ticket" id
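A bare-bones version of that ticket flow, language aside (sketched in C# only because this thread already uses it; the hardware call is simulated with a delay):
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Steps 2-5 above: enqueue work, hand back a ticket, let the client poll.
public class TicketStore
{
    private readonly ConcurrentDictionary<Guid, string> _results =
        new ConcurrentDictionary<Guid, string>();

    public Guid Enqueue(string deviceId)
    {
        var ticket = Guid.NewGuid();
        Task.Run(async () =>
        {
            // Stand-in for the slow hardware round trip.
            await Task.Delay(TimeSpan.FromSeconds(10));
            _results[ticket] = "data from " + deviceId;
        });
        return ticket;
    }

    // Returns false while the job is still pending.
    public bool TryGetResult(Guid ticket, out string result) =>
        _results.TryGetValue(ticket, out result);
}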
As far as hosting in JRuby, I've deployed a couple of small internal applications using the glassfish gem, but I'm not sure how much I would trust it for customer-facing apps. Just make sure you run config.threadsafe! in production.rb. I've heard good things about Trinidad, too.
You can also run the web service call in a delayed background job so that it's not hogging a web server; it can even run on a separate physical box. This is also a much more scalable approach. If you make the web call using AJAX, you can ping the server every second or two to see if your results are ready; that way your client is not held in limbo while the results are being calculated, and the request does not time out.

windows service vs scheduled task

What are the cons and pros of windows services vs scheduled tasks for running a program repeatedly (e.g. every two minutes)?
Update:
Nearly four years after my original answer, and this answer is very out of date. Since Topshelf came along, Windows Service development has gotten easy. Now you just need to figure out how to support failover...
Original Answer:
I'm really not a fan of Windows Scheduler. The user's password must be provided, as #moodforall points out above, which is fun when someone changes that user's password.
The other major annoyance with Windows Scheduler is that it runs interactively and not as a background process. When 15 MS-DOS windows pop up every 20 minutes during an RDP session, you'll kick yourself for not installing them as Windows Services instead.
Whatever you choose, I certainly recommend you separate your processing code into a component distinct from the console app or Windows Service. Then you have the choice either to call the worker from a console application hooked into Windows Scheduler, or to use a Windows Service.
You'll find that scheduling a Windows Service isn't fun. A fairly common scenario is that you have a long-running process that you want to run periodically. But if you are processing a queue, then you really don't want two instances of the same worker processing the same queue. So you need to manage the timer to make sure that if your long-running process runs longer than the timer interval, it doesn't kick off again until the existing run has finished.
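One way to express that guard, as a sketch: a timer whose callback simply skips a tick when the previous run is still going, rather than stacking up overlapping runs.
using System;
using System.Threading;

// Skip-if-busy timer: a tick that arrives mid-run is dropped.
public class PollingWorker : IDisposable
{
    private readonly Timer _timer;
    private int _running; // 0 = idle, 1 = busy

    public PollingWorker(TimeSpan interval)
    {
        _timer = new Timer(_ => Tick(), null, TimeSpan.Zero, interval);
    }

    private void Tick()
    {
        // Already busy? Then this tick does nothing.
        if (Interlocked.Exchange(ref _running, 1) == 1) return;
        try
        {
            ProcessQueue(); // the long-running work
        }
        finally
        {
            Interlocked.Exchange(ref _running, 0);
        }
    }

    private void ProcessQueue() { /* ... */ }

    public void Dispose() { _timer.Dispose(); }
}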
After you have written all of that, you think: why didn't I just use Thread.Sleep? That lets the current thread keep running until it has finished; then the pause interval kicks in, the thread goes to sleep, and it kicks off again after the required time. Neat!
Then you read all the advice on the internet, with lots of experts telling you it is really bad programming practice:
http://msmvps.com/blogs/peterritchie/archive/2007/04/26/thread-sleep-is-a-sign-of-a-poorly-designed-program.aspx
So you'll scratch your head and think to yourself, WTF, Undo Pending Checkouts -> Yes, I'm sure -> Undo all today's work..... damn, damn, damn....
However, I do like this pattern, even if everyone thinks it is crap:
OnStart method for the single-thread approach.
// Fields assumed by these snippets:
//   private Thread workerThread;
//   private volatile bool serviceStarted; // volatile: touched from two threads
//   private int interval; // pause between runs, in minutes
protected override void OnStart(string[] args) {
// Create worker thread; this will invoke the WorkerFunction
// when we start it.
// Since we use a separate worker thread, the main service
// thread will return quickly, telling Windows that the service has started
ThreadStart st = new ThreadStart(WorkerFunction);
workerThread = new Thread(st);
// set flag to indicate worker thread is active
serviceStarted = true;
// start the thread
workerThread.Start();
}
The code instantiates a separate thread and attaches our worker function to it. Then it starts the thread and lets the OnStart event complete, so that Windows doesn't think the service is hung.
Worker method for the single-thread approach.
/// <summary>
/// This function will do all the work
/// Once it is done with its tasks, it will be suspended for some time;
/// it will continue to repeat this until the service is stopped
/// </summary>
private void WorkerFunction() {
// start an endless loop; loop will abort only when "serviceStarted"
// flag = false
while (serviceStarted) {
// do something
// exception handling omitted here for simplicity
EventLog.WriteEntry("Service working",
System.Diagnostics.EventLogEntryType.Information);
// yield
if (serviceStarted) {
Thread.Sleep(new TimeSpan(0, interval, 0));
}
}
// returning from this method ends the thread cleanly;
// there is no need to call Thread.CurrentThread.Abort()
}
OnStop method for the single-thread approach.
protected override void OnStop() {
// flag to tell the worker process to stop
serviceStarted = false;
// give it a little time to finish any pending work
workerThread.Join(new TimeSpan(0,2,0));
}
Source: http://tutorials.csharp-online.net/Creating_a_.NET_Windows_Service%E2%80%94Alternative_1%3a_Use_a_Separate_Thread (Dead Link)
I've been running lots of Windows Services like this for years and it works for me. I still haven't seen a recommended pattern that people agree on. Just do what works for you.
Some misinformation here. Windows Scheduler is perfectly capable of running tasks in the background without windows popping up and with no password required. Run it under the NT AUTHORITY\SYSTEM account. Use this schtasks switch:
/ru SYSTEM
But yes, for accessing network resources, the best practice is a service account with a separate non-expiring password policy.
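For example, to run a job every two minutes in the background as SYSTEM (task name and path are made up):
schtasks /create /tn "MyJob" /tr "C:\Jobs\MyJob.exe" /sc minute /mo 2 /ru SYSTEM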
EDIT
Depending on your OS and the requirements of the task itself, you may be able to use accounts less privileged than LocalSystem with the /ru option.
From the fine manual,
/RU username
A value that specifies the user context under which the task runs.
For the system account, valid values are "", "NT AUTHORITY\SYSTEM", or "SYSTEM".
For Task Scheduler 2.0 tasks, "NT AUTHORITY\LOCALSERVICE", and
"NT AUTHORITY\NETWORKSERVICE" are also valid values.
Task Scheduler 2.0 is available from Vista and Server 2008 onward.
On XP and Server 2003, SYSTEM is the only option.
In .NET development, I normally start off by developing a console application, which runs with all logging output going to the console window. However, it is only a console application when it is run with the command-line argument /console. When it is run without this parameter, it acts as a Windows Service, which stays running on my own custom-coded schedule timer.
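The skeleton of that dual-mode entry point looks something like this (a sketch; Worker and MyService stand in for your own classes):
using System;
using System.Linq;
using System.ServiceProcess;

static class Program
{
    static void Main(string[] args)
    {
        if (args.Contains("/console", StringComparer.OrdinalIgnoreCase))
        {
            // Console mode: run the same worker loop, log to the console.
            var worker = new Worker();   // your processing component
            worker.Start();
            Console.WriteLine("Running; press Enter to stop.");
            Console.ReadLine();
            worker.Stop();
        }
        else
        {
            // Service mode: the SCM drives OnStart/OnStop, which in turn
            // start and stop the same Worker.
            ServiceBase.Run(new MyService());
        }
    }
}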
Windows Services, in my mind, are normally used to manage other applications rather than being long-running applications themselves, or they are continuously running heavyweight applications like SQL Server, BizTalk, RPC services, or IIS (even though IIS technically offloads work to other processes).
Personally, I favour scheduled tasks over Windows Services for repetitive maintenance tasks and applications such as file copying/synchronisation, bulk email sending, deletion or archiving of files, and data correction (when other workarounds are not available).
For one project I was involved in, we developed 8 or 9 Windows Services, but these sit around in memory, idle, eating 20 MB or more of memory per instance. Scheduled tasks do their business and release the memory immediately.
What's the overhead of starting and quitting the app? Every two minutes is pretty often. A service would probably let the system run more smoothly than executing your application so frequently.
Both solutions can run the program when the user isn't logged in, so no difference there. Writing a service is somewhat more involved than a regular desktop app, though: you may need a separate GUI client that communicates with the service app via TCP/IP, named pipes, etc.
From a user's POV, I wonder which is easier to control. Both services and scheduled tasks are pretty much out of reach for most non-technical users, i.e. they won't even realize they exist and can be configured / stopped / rescheduled and so on.
The word 'serv'ice shares something in common with 'serv'er. It is expected to always be running, and 'serv'e. A task is a task.
Role play. If I'm another operating system, application, or device and I call a service, I expect it to be running and I expect a response. If I (os, app, dev) just need to execute an isolated task, then I will execute a task, but if I expect to communicate, possibly two way communication, I want a service. This has to do with the most effective way for two things to communicate, or a single thing that wants to execute a single task.
Then there's the scheduling aspect. If you want something to run at a specific time, schedule. If you don't know when you're going to need it, or need it "on the fly", service.
My response is more philosophical in nature because this is very similar to how humans interact and work with one another. The more we understand the art of communication, and the more "entities" understand their roles, the easier this decision becomes.
All philosophy aside, when you are "rapidly prototyping", as my IT Dept often does, you do whatever you have to in order to make ends meet. Once the prototyping and proof of concept stuff is out of the way, usually in the early planning and discovering, you have to decide what's more reliable for long term sustainability.
OK, so in conclusion, it's highly dependent on a lot of factors, but hopefully this has provided insight instead of confusion.
A Windows service doesn't need to have anyone logged in, and Windows has facilities for stopping, starting, and logging the service results.
A scheduled task doesn't require you to learn how to write a Windows service.
It's easier to set up and lock down Windows services with the correct permissions.
Services are more "visible", meaning that everyone (i.e., techs) knows where to look.
This is an old question, but I would like to share what I have faced.
Recently I was given a requirement to capture a screenshot of a radar (from a meteorological website) and save it on the server every 10 minutes.
This required me to use WebBrowser.
I usually write Windows Services, so I decided to make this one a service too, but it kept crashing.
This is what I saw in Event Viewer
Faulting module path: C:\Windows\system32\MSHTML.dll
Since the task was urgent and I had very little time to research and experiment, I decided to use a simple console application triggered as a scheduled task, and it executed smoothly.
I really liked the article by Jon Galloway recommended in the accepted answer by Mark Ransom.
Recently, passwords on the servers were changed without notifying me, and all the scheduled tasks failed to execute since they could not log on.
So people in the article comments claim that this is a problem. I think Windows Services can face the same problem (please correct me if I am wrong; I am just a newbie).
Also, regarding the claim that with Task Scheduler a window or console window pops up: I have never faced that. It may pop up, but it is at most very brief.
Why not provide both?
In the past I've put the 'core' bits in a library and wrapped a call to Whatever.GoGoGo() in both a service and a console app.
With something you're firing off every two minutes the odds are decent it's not doing much (e.g. just a "ping" type function). The wrappers shouldn't have to contain much more than a single method call and some logging.
Generally, the core message is, and should be, that the code itself must be executable from each and every "trigger/client", so it should not be rocket science to switch from one approach to the other.
In the past we almost always used Windows Services, but since more and more of our customers are switching to Azure step by step, and the swap from a console app (deployed as a scheduled task) to an Azure WebJob is much easier than from a Windows Service, we focus on scheduled tasks for now. If we run into limitations, we just bring back the Windows Service project and call the same logic from there (as long as customers are working on-prem). :)
Windows Services require more patience: they take longer to get done, they are a bit harder to debug and install, and they have no user interface.
If you need a task that must run every second, minute, or hour, you should choose a Windows Service.
A scheduled task is quick to develop and has a visible face in Task Scheduler.
If you need a daily or weekly task, you can use a scheduled task.
