We use Redgate's SQL Monitor to keep an eye on our mission-critical servers. I like the product. It is reasonably user-friendly and reliable, and it helps my team monitor our servers efficiently. I've used other products before and I think SQL Monitor compares favourably. No, I'm not paid by Redgate!
With that in mind, my team had been struggling a bit with an ongoing issue: every Monday morning, one of the SQL Monitor processes was running amok and maxing out the CPU! This was strange, as the product is generally pretty stable. After some investigation, we realised that the security team had started performing regular vulnerability scans over the weekends. Part of these scans involved intentional failed login attempts. SQL Monitor doesn't seem to handle this well, and the web server, specifically the "xsp4" process, goes haywire and thrashes the CPU.
Our small issue (according to management) didn't warrant a reworking of the new security procedures - fair enough, I guess. So it was decided that we should restart the service that controls the xsp4 process, MonitorWebServiceNetwork, every Monday morning. The best way to do this would be a PowerShell script executed through the Windows Task Scheduler. I'm new to PowerShell, so I had to scour the web for help in coming up with a solution.
I knew that I didn't want to restart the service unnecessarily, so I needed a script that established whether CPU usage for the particular process was high or not. I found this great post: Powershell: Get CPU Usage for a Process Using Get-Counter by Kris Powell. In it he shares a script that outputs the CPU usage of a process (or of several processes, since he uses a wildcard in the process name).
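The listing from that post is not reproduced here. As a rough sketch of the approach it describes, assuming the `\Process(*)\% Processor Time` counter and the logical processor count from `Win32_ComputerSystem` (the "CPU %" column name and the two-decimal rounding are illustrative choices), it might look something like this:

```powershell
# Process name to look up; the trailing * in the counter path also matches
# instances such as xsp4#1, xsp4#2 and so on.
$ProcessName = 'xsp4'

# Number of logical processors, used to scale the counter to a 0-100 range.
$CpuCores = (Get-CimInstance Win32_ComputerSystem).NumberOfLogicalProcessors

# One counter sample per matching process instance.
$Samples = (Get-Counter "\Process($ProcessName*)\% Processor Time").CounterSamples

# Print the instance name next to its CPU percentage, rounded to two decimals.
$Samples | Select-Object InstanceName,
    @{ Name = 'CPU %'; Expression = { [decimal]::Round($_.CookedValue / $CpuCores, 2) } }
```

The division by the core count is needed because the raw counter can exceed 100 on a multi-core machine (up to 100 per logical processor).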
This script is great. The only problem for me is that it returns the CPU usage value as a string and I need a decimal value in order to compare it to my threshold value (50% CPU usage). Incidentally, I also didn't need to return the InstanceName. After much digging around, I arrived at a modified version of the script.
Since I didn't need to print any of the counter values, I completely removed the $Samples select statement. Instead, I replaced the $Samples variable with one called $cpu (just to make it more readable) and selected just the CookedValue property using the -ExpandProperty parameter of Select-Object. This still returns a string value, so it is here that I convert it by declaring the variable as a decimal. Finally, I calculate the actual CPU usage by dividing the CookedValue by the number of CPU cores, and I store the result in another variable, $Usage. Kris Powell explains how he gets all this info in his post, so I won't repeat it.
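A minimal sketch of the modification described above might look like the following. It is a reconstruction from the description rather than the exact script, and it assumes the counter path matches a single process instance (no wildcard), because casting several samples to one decimal would fail:

```powershell
$ProcessName = 'xsp4'

# Number of logical processors, used to scale the counter to a 0-100 range.
$CpuCores = (Get-CimInstance Win32_ComputerSystem).NumberOfLogicalProcessors

# Keep only the raw counter value and type the variable as a decimal.
# Assumption: exactly one instance matches, so a single value comes back.
[decimal]$cpu = (Get-Counter "\Process($ProcessName)\% Processor Time").CounterSamples |
    Select-Object -ExpandProperty CookedValue

# Actual CPU usage as a percentage of the whole machine.
$Usage = $cpu / $CpuCores
```

With `$Usage` held as a decimal, it can be compared directly against a numeric threshold using `-gt`.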
Once I have this value, I can use it in an if statement and restart the service if it surpasses my threshold.
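The check itself can be sketched as below. The `$Usage` value here is a hard-coded placeholder so the snippet stands on its own, and `-WhatIf` is added as a safety net so that nothing is actually restarted while experimenting:

```powershell
# Service that controls the xsp4 process (name taken from this post).
$ServiceName = 'MonitorWebServiceNetwork'

# Placeholder value for illustration; in the real script this comes from the counter.
[decimal]$Usage = 72.5

# Restart the service only when usage is above 50%.
if ($Usage -gt 50) {
    # -WhatIf only reports what would happen; remove it to really restart the service.
    Restart-Service -Name $ServiceName -WhatIf
}
```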
At this point I've added one more variable, $ServiceName. This is because the process that is maxing out the CPU is controlled by a service and, in my case, the names are different. So if "xsp4" is using too much CPU, I need to restart the "MonitorWebServiceNetwork" service. And to make it fully flexible I've added one last variable, $threshold. The final script accepts all of these as parameters, so you can run it as a script file and pass the values as arguments.
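Putting the pieces together, a parameterised version could look roughly like this. It is a sketch under the same assumptions as before (a single matching process instance); the parameter names follow the ones used in this post, while the default threshold of 50 is an assumption:

```powershell
param(
    [string]$ProcessName,      # process to measure, e.g. 'xsp4'
    [string]$ServiceName,      # service to restart, e.g. 'MonitorWebServiceNetwork'
    [decimal]$threshold = 50   # CPU percentage above which the service is restarted
)

# Number of logical processors, used to scale the counter to a 0-100 range.
$CpuCores = (Get-CimInstance Win32_ComputerSystem).NumberOfLogicalProcessors

# Raw counter value for the process, typed as a decimal.
[decimal]$cpu = (Get-Counter "\Process($ProcessName)\% Processor Time").CounterSamples |
    Select-Object -ExpandProperty CookedValue

# Actual CPU usage as a percentage of the whole machine.
$Usage = $cpu / $CpuCores

# Restart the controlling service only when the process is over the threshold.
if ($Usage -gt $threshold) {
    Restart-Service -Name $ServiceName
}
```

Saved as CPU_Usage_Restart_Service.ps1, it could then be called with `.\CPU_Usage_Restart_Service.ps1 -ProcessName 'xsp4' -ServiceName 'MonitorWebServiceNetwork' -threshold 50`.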
The last hurdle I ran into was executing this script from Task Scheduler. This was resolved by the following blog post: Schedule PowerShell Scripts that Require Input Values by Ed Wilson, the Microsoft Scripting Guy. There's also a good explanation of scheduling PowerShell scripts here: How to Schedule a PowerShell Script by Dmitry Sotnikov. The Action in my scheduled task is defined as follows:
Action: Start a program
Add arguments (optional): -Command "& 'D:\Red Gate\CPU_Usage_Restart_Service.ps1' -ProcessName 'xsp4' -ServiceName 'MonitorWebServiceNetwork' -threshold 50"
Finally, I needed to schedule the task. Unfortunately, the times at which the vulnerability scan would trigger the problem were variable. The scan is run against a large estate of servers, and there was no guarantee as to how long it would take each time. In the end, I decided to schedule the task to run every 10 minutes over a period of several hours on Monday mornings.
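I set this up through the Task Scheduler interface, but the same schedule can also be sketched with the ScheduledTasks cmdlets. In the example below, the task name, the 06:00 start time, the four-hour window and the SYSTEM account are all hypothetical values chosen for illustration, and `powershell.exe` is assumed as the program to start. Because `New-ScheduledTaskTrigger` only accepts the repetition parameters together with `-Once`, the repetition settings are copied from a temporary trigger onto the weekly one:

```powershell
# Arguments as in the Action above ('' is an escaped single quote inside a single-quoted string).
$arguments = '-Command "& ''D:\Red Gate\CPU_Usage_Restart_Service.ps1'' -ProcessName ''xsp4'' -ServiceName ''MonitorWebServiceNetwork'' -threshold 50"'

# Action: start PowerShell with the arguments above.
$action = New-ScheduledTaskAction -Execute 'powershell.exe' -Argument $arguments

# Weekly trigger on Monday morning (hypothetical start time).
$trigger = New-ScheduledTaskTrigger -Weekly -DaysOfWeek Monday -At '06:00'

# Temporary one-off trigger, used only to build the repetition settings:
# every 10 minutes for 4 hours (hypothetical window).
$repeat = New-ScheduledTaskTrigger -Once -At '06:00' `
    -RepetitionInterval (New-TimeSpan -Minutes 10) `
    -RepetitionDuration (New-TimeSpan -Hours 4)
$trigger.Repetition = $repeat.Repetition

# Register the task (hypothetical name) to run under the SYSTEM account.
Register-ScheduledTask -TaskName 'Restart SQL Monitor web service' `
    -Action $action -Trigger $trigger -User 'SYSTEM' -RunLevel Highest
```

Registering a task this way needs an elevated PowerShell session, and the account it runs under must be allowed to restart the service.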
As always, I hope you've found this post helpful.