SCOM – Windows Service Monitor Management Pack

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

NOTE – If you already have a previous version of this MP imported into SCOM and are updating to a newer version, back up your existing MP and follow the instructions below to copy the service monitoring configurations to the updated MP.

The progression should go:

  • Export your current Windows Service Monitoring MP.
  • Download the updated version, copy it to another location
  • Open the MP you exported from SCOM and copy the old service configurations to the new MP
  • So from this (search for “Dim Arr” to find it):

wsm_updatemp

  • To this:

wsm_update1mp

  • Now import the new MP into SCOM and you’re good to go

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

** Changelog **

Updates 2/25/2016

  • Added console tasks for service starting/stopping/status
  • Added a recovery to the service monitor for automatic service restart (disabled by default). It uses a three-strikes-and-you’re-out rule: by default it will try to start a service 3 times in a 24 hour period and then stop trying. This prevents a constant start/stop loop, and the limit can be configured to your preference
  • Added a timer reset monitor for when the 3 strike limit is reached. It is enabled by default and, by default, closes automatically
  • Fixed a small bug in the “ServiceMPEditor.ps1” tool where it wouldn’t import if you had 0 entries

Blog post begins here (with edits to reflect current updates)

I dislike the default Windows Service Monitoring Template in SCOM. Here are a few reasons why:

  • The templates create a lot of monitoring bloat. Each service template configuration comes with a lot of baggage:
    • Its own class
    • Its own discovery
    • 3 monitors
    • 8 overrides
  • It takes forever to add/edit/remove a large number of services from the console
  • If you override “Alert only if service start type is automatic” and set it to “false”, it will alert for all startup types. This is a bummer when you have some disabled services out there or want to temporarily set one to disabled and not have SCOM alert.
  • No built-in flexibility for days of the week or hours of the day filtering
  • This method will not scale very well
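To put the bloat in numbers, here is a rough Python sketch that tallies the objects listed above (1 class, 1 discovery, 3 monitors, 8 overrides) for a given number of templated services. The function name and the figure of 200 services are made up for illustration only.

```python
# Objects created by one Windows Service template, per the list above.
OBJECTS_PER_TEMPLATE = {"class": 1, "discovery": 1, "monitors": 3, "overrides": 8}


def template_object_count(service_count: int) -> int:
    """Total objects created when every service gets its own template."""
    return service_count * sum(OBJECTS_PER_TEMPLATE.values())


if __name__ == "__main__":
    # Hypothetical environment with 200 templated services.
    print(template_object_count(200))  # 200 services * 13 objects = 2600
```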

So how do we make things better? There’s clearly some room for improvement. I’ve been thinking about and working on this problem for years, and in that time two approaches have always floated to the top:

  1. A scripted monitoring solution
  2. An optimized version of the built-in template method

There are pros and cons to each of these approaches:

A Scripted Monitoring Solution

Pros:

Cons:

  • Performing overrides and alerting at the individual service level is a bit cumbersome

An optimized version of the built-in template method

Pros:

  • You get most of the features of the template monitoring, plus some extra ones that the default template monitoring doesn’t have.
  • Each service has its own alert/monitor.

Cons:

  • Although it will scale better than the built-in solution, it won’t scale up as well as the scripted solution.

An early version of the optimized solution is what I have here for you today. I’ve been working the last couple of weekends on getting this put together, and here’s a high-level overview of the features I have so far:

  • I used the same datasource for service monitoring that SCOM uses so whatever magic they have going on to optimize that process should be included
  • Date and time filtering so you can exclude certain days/times from monitoring on a per service or service object basis
  • Console tasks for service starting/stopping/status
  • Automatic service recovery (disabled by default). Works on a three-strikes-and-you’re-out basis: after 3 failures in a 24 hour period it will stop trying to restart the service. This is overridable to your needs.
  • Timer reset monitor (closes itself after 24 hours) to watch for and alert on the 3 strike out situation. This is enabled by default
  • Monitors all service startup types, but excludes disabled services from alerting
  • Created a custom discovery which discovers and adds all the service objects to one class rather than scattering them about like the templates do
  • Created a Service MP Editor to allow you to add/remove or mass import services to be monitored

Getting started

  • First download the MP and Service MP Editor here: https://gallery.technet.microsoft.com/SCOM-Windows-Service-c9fc3f95
  • Second, this is still really early days for this MP. Don’t go throwing it in production right away without testing. I’m releasing the MP in its current state hoping to get feedback from other people who might be interested in testing it out at this point.
  • Third, the Service MP Editor is really rough and only sort of tested. You’ll want to double check your MP before you import it back into SCOM or just edit the MP directly to add service monitoring configurations (more on how to do those things later)
  • OK, now that we have these things out of the way, let’s get started. You’ll see that you have three files from the download: Readme.txt directs you to this blog, ServiceMPEditor.ps1 is used to add/modify/remove services in the MP, and WindowsServiceMonitor.xml is the management pack.

wsmfiles

  • You can go ahead and import the management pack into SCOM. No services are discovered by default, so out of the box it doesn’t really do much until it’s configured
  • Once you have it imported, go ahead and run the ServiceMPEditor.ps1 script. This will launch the service add/edit/remove PowerShell form

wsm_poshgui1

  • Basically, this form exports the management pack from SCOM and reads the service discovery VBScript contained within the MP. It then lets you edit the XML through the GUI and import it back in after saving the changes.
  • Again, remember to have the MP imported, and then let’s go through adding the first one together

wsm_poshgui2

  1. Management Server – Put in the server name of one of your SCOM management servers.
  2. Management Pack Location – This is where the MP will be exported to and updated. This is also where any service exports/imports will go to/from.
  3. Get MP Config – Select this button after steps 1 and 2 are completed. This will export the MP to the “Management Pack Location”.
  4. New Service – Select this button to start a new service configuration
  5. Service Name – Put the “Service name” in here. Do not put in the “Display Name”, that will not work (unless they are the same).
  6. Confirm Service Edit – Select this after choosing all your service monitoring options. Days that are checked are the days the service is monitored, and the service is monitored between “Monitor Start Time” and “Monitor End Time”.
  7. Save MP Config – Once you’ve added your service(s) to be monitored, select this button to save the service monitoring configuration.
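As a rough illustration of the start/end time setting in step 6, the Python sketch below checks whether a given time of day falls inside a monitoring window. It is not the MP's code. The function name is hypothetical, and the overnight branch is an assumption, since how the MP treats an end time earlier than the start time is not covered here.

```python
from datetime import time


def in_monitoring_window(now: time, start: time, end: time) -> bool:
    """Return True when `now` falls between the start and end times."""
    if start <= end:
        # Ordinary same-day window, for example 08:00 to 17:00.
        return start <= now <= end
    # Assumption: an end time earlier than the start time means the
    # window runs overnight (for example 22:00 to 06:00).
    return now >= start or now <= end


if __name__ == "__main__":
    print(in_monitoring_window(time(9, 0), time(8, 0), time(17, 0)))   # True
    print(in_monitoring_window(time(18, 0), time(8, 0), time(17, 0)))  # False
```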

Next you’ll probably want to take a look at the MP and verify the GUI added the services properly. Here’s a little about what is going on behind the scenes.

The discovery script in the MP’s XML has an array that holds the configuration for each service that will be discovered:

wsm_vbs1

After you select “Save MP Config” in the GUI, anything that’s in the “Services” list will be placed in the discovery script:

wsm_vbs2

The days of the week mask is a sum: add together the numbers for each day you want monitored.

After you see a couple of them in there you’ll get the idea of how it works, and maybe you’ll prefer editing it manually. It’s up to you; the key is you have choices =)
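The day arithmetic can be sketched in Python as below. The per-day values are an assumption: they follow the usual SCOM days-of-week mask convention (Sunday = 1 through Saturday = 64), so check them against the comments in the discovery script before relying on them. The function names are hypothetical.

```python
# Assumed day values (the usual SCOM days-of-week mask convention);
# verify them against the discovery script in the MP.
DAY_VALUES = {
    "Sunday": 1, "Monday": 2, "Tuesday": 4, "Wednesday": 8,
    "Thursday": 16, "Friday": 32, "Saturday": 64,
}


def days_mask(days):
    """Add together the values for each day that should be monitored."""
    return sum(DAY_VALUES[day] for day in set(days))


def is_day_monitored(mask, day):
    """True when the bit for `day` is set in `mask`."""
    return bool(mask & DAY_VALUES[day])


if __name__ == "__main__":
    weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
    print(days_mask(weekdays))  # 2 + 4 + 8 + 16 + 32 = 62
```

Under these assumed values, every day of the week adds up to 127, and a weekend-only service would use 1 + 64 = 65.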

  • OK, back to the main bullet points. Next you’ll want to head over to the console and check that your service(s) are discovered:

wsm_discoveredinv

  • Looks pretty good – Some words about that.
    • The discovery interval is 86400 seconds (once a day). If you don’t want to wait for the discovery interval just restart the monitoring agent and it will run the discovery right away. You can also override the discovery interval if you like (Go to SCOM Console – Choose “Authoring” – “Management Pack Objects” – “Object Discoveries” – Search for: Windows Service Monitor Discovery)
    • You can also enable debug if you like. This will write event ID 101 to the SCOM event log each time the discovery is run, letting you know that it ran and which services the script discovered.
  • Lastly, let’s take a look at the monitor configuration:

wsm_monitor

  • The big thing to mention here is the overrides: you can override the start/end times and the days of the week mask for one service object or a group of them (if you create one) here. Or just stick with your discovery settings. The world is your oyster.

Additional Monitoring Features Explained

  • Console tasks to start/stop and get service status. Pretty self-explanatory; they will show up on the alerts as well.

wsm_consoletasks

  • Next is the recovery task. This is disabled by default. When enabled, it will try to start a service automatically 3 times in a 24 hour period by default. If you would like to use this feature or change the settings, you can override these values in the “Windows Service Monitor (Custom MP)” monitor.

wsm_monitorrecovery

  • Also, there is another monitor that watches for when the service is restarted 3 times in a 24 hour period. It is enabled to alert by default and is called “Excessive automated service restart monitor”. It’s a timer reset monitor, meaning that when it alerts it will stay open and then close after a period of time (24 hours by default).
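The three strikes behavior can be sketched in Python as a small limiter that allows at most 3 restart attempts per 24 hours. This is an illustration of the rule, not the recovery's actual implementation: the class name is hypothetical, and treating the 24 hours as a rolling window is an assumption.

```python
from collections import deque
from datetime import datetime, timedelta


class RestartLimiter:
    """Allow at most `max_restarts` restart attempts per rolling window."""

    def __init__(self, max_restarts=3, window=timedelta(hours=24)):
        self.max_restarts = max_restarts
        self.window = window
        self._attempts = deque()  # timestamps of recent allowed attempts

    def try_restart(self, now: datetime) -> bool:
        """Return True and record the attempt if a restart is still allowed."""
        # Forget attempts that have aged out of the window.
        while self._attempts and now - self._attempts[0] >= self.window:
            self._attempts.popleft()
        if len(self._attempts) >= self.max_restarts:
            # Limit reached: no further restarts until an attempt ages out.
            return False
        self._attempts.append(now)
        return True
```

With the defaults, three calls inside the same day return True and a fourth returns False, which is the point at which the excessive restart monitor would be expected to alert.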

 

I think that’s the basics on how things work. Again this is really almost a POC version of this management pack. I would love it if you would try it out and let me know your feedback! You can reach me via the comments or LinkedIn: www.linkedin.com/pub/andy-leibundgut/ab/23/a72

Honey do list:

  • See about adding performance gathering rules (disabled by default)
  • Add some views
  • Maybe some reporting?
  • Add an alert description that can be configured per service in the discovery (ServiceMPEditor.ps1) so each service can have its own unique “KB” in the alert description.
  • Make ServiceMPEditor.ps1 a little prettier.
    • Maybe add a folder picker.
    • Prefill Management Server/Folder with last location (config file or something)
  • What else? Let me know your ideas!
