How can I ensure my Topshelf service stops before SQL Server on a computer restart? - windows-services

I have a Windows service running under Topshelf. This service makes a lot of SQL Server queries. When the host computer is restarted, my service almost always logs errors because SQL Server stops while the service is in the middle of a query. I've been asked to solve this so the logs won't have so many errors, as these computers are restarted frequently.
Topshelf has built-in WhenShutdown logic that runs when the computer is shut down or restarted, but there is still no guarantee that my service will stop before SQL Server, and judging by the error frequency, SQL Server almost always stops first. I have also tried using Topshelf's WhenCustomCommandReceived to listen for the Windows PreShutdown event as shown here, but when I log every custom command received and then reboot my computer, nothing is logged. I also tried adding SQL Server as a dependency of my service, but this still doesn't guarantee that mine will stop before SQL Server.
I have also tried adding in the logic from this solution, but again I never see any logs indicating this code is even being executed on a restart. Any tips on how I can better solve this issue?
tl;dr: how do I ensure my Topshelf service stops before SQL Server on a computer restart/shutdown?
Thanks!

Related

Docker container http requests are somehow not working after some time

I have the following scenario:
docker containers running making requests to my backend (locally and in python)
OS is Windows for local development
containers start and make their requests (port 8080) without any issues
however, after some time (I guess because I have multiple containers running that do the same thing) the HTTP requests stop working. The log just says: Max retries exceeded
I cannot see the requests even getting to my backend, so they have to be blocked before they reach it
I have around 15 containers running at a time making the requests.
What I have tried:
Windows Firewall is disabled
My router has an option to detect potential bot-nets, disabled as well
changing the ip of my backend -> the whole process starts from zero, so the requests are fine and after some time they just stop working
I am 100% sure it is no problem with the code etc. since requests are working initially.
The backend runs on Spring, if that matters, but since I don't even see the requests arriving, I guess the issue does not lie there either.
Since I am pretty new to docker and have no idea about possible causes of this problem I hope anyone here can maybe help me. Thanks in advance!

WinRM Connection Issue

I can't quite explain the problem, because I myself do not understand it. I'd appreciate getting help with defining/locating/dealing with the issue.
The Setup
I have a Win10 VM having tests run on it, and a Jenkins VM (Windows Server 2008) running those tests on it.
I am using a testing app called JSystem. Sadly, it does not support Windows 10 officially, as it uses Telnet to communicate with target SUTs (which was removed from Windows 10), so I had to create a way to use WinRM to communicate with that type of VM.
The Problem
The gist of it is that at some point the test on Jenkins just 'freezes'. The connection is still in the 'established' state, and the VMs (host and client) are still working. It does not happen every time, and it might happen a few minutes after testing started or a couple of hours in. The test that causes it is almost never the same, but naturally it happens when there's some form of communication between the SUT and the testing VM. It can be a file transfer or a simple command like "dir". It can happen while the command is being requested or while the result is being sent back.
More Information
I did gather some more information that might help.
I did not see it happen when I try to run the test from my own development environment (that is, not using Jenkins as a medium) - However, it might've been because I was unlucky and did not try enough. My own environment is a Windows 10 as well, and not a VM.
Looking at the event viewer on the SUT, there was a warning "Time-Service" event ID 50, an NTPClient time sync issue one minute after the freeze happened. However, the Jenkins VM had no events at all. That said, the event repeats itself a lot on the SUT and it does not always freeze the test, but it's possible it causes interference if it happens during a communication attempt between the VMs.
I can still connect to the SUT with WinRM just fine with other sources, and it responds as well.
Rather than frozen, it's more like SUT is waiting for a request from Jenkins, and Jenkins is waiting for a response from the SUT. The weird thing, however, is that normally these tests have a timeout of 30-60 seconds, it should not wait longer than that (unless configured otherwise in the test, of course) before failing the test step.
I can't be sure if this has anything to do with it, but I do have time sync issues between VMs. I've asked in another question about how to solve it, so if that's the issue in your opinion, please let me know, especially if you have a solution.
What is a good way to approach this?

How to know when Neo4j is ready to serve

I've developed an application which connects to Neo4j and creates a bunch of nodes. I've also developed a plugin for Neo4j using Graphaware. Both run in separate Docker containers (one for the code and one for Neo4j with the plugin).
Now, since I start these containers automatically and simultaneously, the code should wait for the Neo4j to completely start before it tries creating the nodes. For that, I'm testing the availability of the Neo4j by trying to connect to it using bolt protocol (Neo4j's driver).
The problem is that Neo4j seems to start accepting incoming connections before it has completely loaded the plugins. As a result, the connection is made before Neo4j is actually prepared, something goes wrong (I don't know what), and the whole code halts (I don't think the details of that are important). I know the early connection is the cause because if I delay the connection manually, everything goes smoothly.
So my question is how to make sure that Neo4j is warmed up (fully) before starting to connect to it? Right now I'm checking the availability of management (http://localhost:7474) but what if there's no management, to begin with?
At the moment you'll find that you can keep the management interface local, but you can't actually turn it off (unless you're working in embedded mode), so waiting for http://localhost:7474 is a good approach. If you want to be more fine-grained, you can check yourinstallation\logs\debug.log for lines like these:
2017-07-27 03:58:53.643+0000 INFO [o.n.k.AvailabilityGuard] Fulfilling of requirement makes database available: Database available
2017-07-27 03:58:53.644+0000 INFO [o.n.k.i.f.GraphDatabaseFacadeFactory] Database is now ready
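The two checks described above could be sketched roughly as follows in Python. This is only an illustration, not part of Neo4j or its driver: the helper names (http_probe, log_says_ready, wait_until_ready), the timeouts, and the choice to treat any HTTP response as "listening" are all assumptions, and the marker string is simply copied from the log excerpt above.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

# Assumption: this marker is copied from the debug.log excerpt above and may
# differ between Neo4j versions.
READY_MARKER = "Database is now ready"


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if something answers HTTP at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is at least listening.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return False


def log_says_ready(log_text: str) -> bool:
    """Return True if the log text contains the readiness marker."""
    return READY_MARKER in log_text


def wait_until_ready(
    probe: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() repeatedly until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

A caller might use it as wait_until_ready(lambda: http_probe("http://localhost:7474")) before opening the bolt connection, or pass a probe that reads debug.log and calls log_says_ready on its contents.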
Hope this helps.
Regards,
Tom

Redis flushall command is randomly being called

I have a Ruby app in production that uses Sidekiq (which uses Redis), and I have discovered that flushall commands are being called, which wipes the database (thus removing all the processed and scheduled jobs).
I don't know or understand what could be causing this.
Does anyone know how I can begin to trace the call to flushall?
Thanks,
It is most likely that your Redis server is open to the public network without any protection - that is just calling for trouble because anyone can connect and do much more damage than just a FLUSHALL. If that is the case, use password authentication at the very least, after burning the compromised server - the attacker may have gained access to your server's operating system and from there who knows where. More information at: http://antirez.com/news/96
If that isn't the case and you have a rogue application somewhere that randomly calls unwanted commands, you can try tracking it by combining the MONITOR and CLIENT LIST.
Lastly, you can consider renaming/disabling the FLUSHALL command, at least temporarily, until you get to the bottom of this.
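As a rough illustration of the MONITOR approach, the Python sketch below scans captured MONITOR output for FLUSHALL and counts which client address issued it. It assumes the usual MONITOR line layout (timestamp, then database number and client address in square brackets, then the quoted command); the function names and the sample addresses in the usage note are made up for the example.

```python
import re
from collections import Counter
from typing import Iterable, Optional, Tuple

# Assumed MONITOR line layout, e.g.:
#   1339518083.107412 [0 127.0.0.1:60866] "keys" "*"
MONITOR_LINE = re.compile(
    r'^(?P<ts>\d+\.\d+) \[(?P<db>\d+) (?P<client>[^\]]+)\] "(?P<cmd>[^"]*)"'
)


def parse_monitor_line(line: str) -> Optional[Tuple[str, str]]:
    """Return (client, COMMAND) for a MONITOR line, or None if it doesn't match."""
    match = MONITOR_LINE.match(line.strip())
    if match is None:
        return None
    return match.group("client"), match.group("cmd").upper()


def flush_callers(
    lines: Iterable[str], commands: Tuple[str, ...] = ("FLUSHALL",)
) -> Counter:
    """Count how often each client issued one of the given commands."""
    counts: Counter = Counter()
    for line in lines:
        parsed = parse_monitor_line(line)
        if parsed is not None and parsed[1] in commands:
            counts[parsed[0]] += 1
    return counts
```

The lines could come from the saved output of redis-cli monitor. Each address the function reports can then be compared with the addr= field in the CLIENT LIST output to identify the connection. Note that MONITOR adds load to the server, so it is probably best run only while investigating.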

Issue with non responding website. How to debug?

We have a website built with ASP.NET MVC 4, running on IIS on Windows Server 2012 and using MSSQL 2012 for data storage. Connections are made through Entity Framework 6. Very standard stuff.
We are not a high volume website (max 3000 users around the world so hitting it in different timezones)
The issue is that sometimes, without warning, the site becomes unresponsive (the browser shows nothing and times out). That alone is nothing special, but here are the strange parts:
The server itself is working fine if you terminal server into it
Restarting IIS does not help, and there are no error logs
SQL Server has around 100 connections from the website, all sleeping (but killing these processes does not make the site recover)
SQL Server at that time shows half of them as waiting tasks, but it is still responsive when executing SQL from SSMS or even remotely from Excel (remote reporting)
Looking at SQL Profiler, the website is still sending SQL requests despite being down, but they are all requests like this: if db_id('dbname') is not null else select... (not something specifically written in the website)
The really strange one: if we restart SQL Server, the website becomes responsive again
I know this is not a lot to go on, but we are very puzzled and don't really know how to proceed. Nothing indicates an error in any kind of log (website, IIS, SQL Server or Windows). I can deduce that the website must think SQL Server cannot give it what it needs because the connection pool or something similar is used up. What really puzzles me is why it is only freed by a complete SQL Server restart and not by just killing the processes, and why the connection pool buildup happens in the first place, since all SQL is handled by Entity Framework.
Any advice on how to debug further is most welcome
