
SCOM: Add community power and keep the engine running…

 

Let’s face it: a good program is like a car. You need to maintain it properly to keep it in running condition. Well, this is also the case with SCOM. I visit a lot of clients, and one of the main questions I get is how to make sure SCOM stays healthy and running.

Field_maintenance_on_a_1956_model_Cessna_172

There are some indicators in SCOM itself that point to issues with the install, but unfortunately they are easily missed or overlooked.

So this is where the awesome SCOMunity steps in!

This post should become your one-stop location for some of the leading community management packs you’ll need to keep your SCOM environment going, or at least to pinpoint very easily where there are (potential) issues.

These are management packs I actually install at almost every client I visit:

TAO Yang’s Self Maintenance management pack

Tao has been an active member of the SCOMunity for quite some time now, and his Self Maintenance management pack is already at version 2.4.0. This management pack features a lot of tasks and checks that every SCOM admin should perform, but it’s always nice to have a management pack doing them for you. Before I used Tao’s management pack I had a standard PowerShell toolkit to automate some of these tasks, but now, if the customer approves it (remember it’s still an unsealed MP, so sometimes you need the customer’s approval), I load up the management pack and configure it. Tao really went all in and also included a PDF to assist you in installing and configuring the MP.

image_thumb7

Image (Tao Yang)

Some of the tasks I like the most (this is not a full list but just to highlight the things I personally find handy in there):

  • Automatic scheduled distribution of agents across the management servers, with the possibility to limit the number of agents distributed between the management servers
  • Auto-approve agents in pending management based on an input mask to make sure they are allowed in the MG.
  • Check whether a management server is placed in maintenance mode
  • Find orphaned alerts
  • ….
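
The input-mask idea behind the auto-approve task can be illustrated with a few lines of PowerShell. This is only a sketch of the concept, not the code the management pack itself uses: the function name Select-AllowedAgent, the wildcard mask and the sample server names are all made up for the example. In a real management group the names would come from the pending management list instead of a hard-coded array.

```powershell
# Hypothetical helper (not part of the management pack): keep only the
# agent names that match at least one allowed wildcard mask.
function Select-AllowedAgent {
    param(
        [string[]]$AgentName,
        [string[]]$Mask
    )
    foreach ($name in $AgentName) {
        foreach ($pattern in $Mask) {
            if ($name -like $pattern) {
                $name      # emit the name once
                break      # stop checking further masks for this name
            }
        }
    }
}

# Sample data, invented for the illustration
$pending = 'web01.contoso.local', 'sql01.contoso.local', 'laptop7.fabrikam.local'
Select-AllowedAgent -AgentName $pending -Mask '*.contoso.local'
```

Only the names matching the mask are returned, so anything outside the expected naming convention stays in pending management for a manual decision.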

This is an invaluable management pack for every SCOM admin out there, whether you visit a lot of clients and need a clear view of the health of each management group or have only one client. It will free up a lot of your time and also reduce the chance of problems because early warning systems are built in. More info here:

http://blog.tyang.org/2014/06/30/opsmgr-2012-self-maintenance-management-pack-2-4-0-0/

SCOM Health Check Reports V3 (Oskar Landman + Pete Zerger)

Another hard thing to do is to give the SCOM admin / supervisor a short report showing how SCOM is actually doing and whether all is well in your SCOM environment.

Just recently Oskar Landman and Pete Zerger updated their SCOM Health Check reports to give you a proper status at a glance.

This set of reports gives you an even more in-depth view of how your environment is doing and which key points to work on to further improve it. One of the key benefits is that you can check in detail whether every aspect of your databases, and the data coming into them, is valid and not excessive. This is really helpful when you start noise reduction and want to focus on the big consumers of space and CPU time in your SQL databases.

Make sure you read the manual thoroughly before proceeding as you need to take additional steps prior to installation.

More info here:

image_thumb12

Image (SystemCenterCentral)

Check the article here: http://www.systemcentercentral.com/scom-health-check-reports-v3/

Download here: https://gallery.technet.microsoft.com/SCOM-Health-Check-Reports-c32e8f93

Let’s crank up that download count because this is definitely something you need in your SCOM environment

TAO Yang’s SCOM datawarehouse health script

This one is clean and simple: all the different things you should check on your data warehouse, but probably never did, combined in a PowerShell script.

All the aspects you need to know about your data warehouse are gathered and reported on an HTML page. This is something you should run at every customer site to get an instant view of how the data warehouse and, more importantly, the SCOM environment is set up and performing.

More info can be found here: http://blog.tyang.org/2015/06/11/opsmgr-2012-data-warehouse-health-check-script/

In conclusion

These are just 3 community provided tools which are freely available to help you get more insight in your environment or the environment you need to troubleshoot.

Special thanks go out to Tao Yang, Oskar Landman and Pete Zerger in particular for investing their time in making these solutions available, and of course thanks also to all the other active community members who keep developing new things for SCOM and System Center in general.

If you are just starting with SCOM: this is not an exhaustive list of all the add-ons out there. If you are looking for a one-stop place to start your journey, take a look at my SCOM link overview blog post, which is currently under revision: http://scug.be/dieter/2012/12/30/scom-2012-overview-link-blog/

Microsoft Operations Management Suite: Remove workspace

 

This blog post is part of the Microsoft Operations management Suite Quick start guide which can be found here: http://scug.be/dieter/2015/05/08/microsoft-operations-management-suite-quickstart-guide/

 

One of the things I noticed right away when I first opened the Microsoft Operations Management Suite (OMS) was that I had several workspaces. They were all created in Opinsights because I had added 3 different management groups from their respective SCOM consoles.

No sweat of course. I have now built 1 management group in my lab environment where I configured everything, so I wanted to get rid of the other workspaces.

It turns out there are 2 ways to delete a workspace, which was not clear in the beginning.

How to get to the “close workspace” option

The remove option is well hidden in the menus, probably to avoid accidental deletion, which is actually a good thing, but it’s a little bit too hidden in my humble opinion.

To get to the remove option follow the steps below:

Log on with your account. You will actually get all the different workspaces which are configured and hold data:

printscreen-0310

In this case I would like to remove the DWIT workspace as this is my ancient lab environment.

Select DWIT and open the workspace.

printscreen-0312

Select DWIT in the right upper corner and select the DWIT EUS | administrator wheel:

printscreen-0313

At this point you will have the settings of your workspace and right at the bottom there’s an option to close the workspace.

NOTE: Make no mistake, your workspace will be removed and your data will be erased!

printscreen-0315

Now here is where things can go either way. There are 2 different options here:

  • Workspace connected to an MS account
  • Workspace connected to an Azure subscription

Close a workspace connected to a MS account

This one is actually very simple.

In the screenshot above, just click close workspace…

printscreen-0316

OMS will present you with a nice message box with what’s going to happen and kindly asks you why you want to close.

Note: It’s not required to select an option but please do so to help Microsoft further develop the product to whatever direction you want it to go.

Close a workspace linked to an Azure subscription

When your workspace was created with the Azure management portal you cannot close it from the OMS interface; you need to delete the workspace in Azure itself. You will get the message “This account can only be deleted from the Azure Management Portal”.

printscreen-13-05-2015 0003

Open your Azure management portal and navigate in the bar on the left to Operational Insights (note that this name may have changed by the time you read this article, as MS is aligning all the naming with the OMS brand):

    printscreen-0308

Select the account you want to delete and press the delete button at the bottom of the page

printscreen-13-05-2015 0002

Are you really sure?

printscreen-0309 

At this point the account is deleted and within a couple of minutes it should disappear from the available workspaces.

  printscreen-0318

Note: The accounts that are created outside of the Azure portal will have a GUID like name. This name is generated when you link a workspace to your Azure account.

Microsoft Operations Management Suite: Connect Datasources

 

This blog post is part of the “Microsoft Operations Management Suite: Quickstart guide” which can be found here: http://scug.be/dieter/2015/05/08/microsoft-operations-management-suite-quickstart-guide/

 

After we have successfully created our workspace and installed our Solutions, it’s time to bring in our data to start the magic and witness the insights OMS can bring.

Here you have 3 options:

printscreen-8-05-2015 0000

  • Attach Servers directly (limited to 64 bit): This is used if you want to attach a server which is not monitored by SCOM. A certificate is generated and inserted into a package that you download; it installs the Microsoft Monitoring Agent service on the desired server and connects the server to your OMS workspace.
  • Attach System Center Operations Manager: You can attach various management groups to OMS. If you click connect you will be guided through the onboarding process for connecting a SCOM environment to OMS. More on this later.
  • Attach Azure Storage account: You can add an Azure storage account to facilitate the availability options regarding backup, restore, etc. More on this later in this blog series.

Note: If you receive errors when connecting these servers to your environment, review this troubleshooting article to set up the firewall correctly: http://blogs.technet.com/b/momteam/archive/2014/05/29/advisor-error-3000-unable-to-register-to-the-advisor-service-amp-onboarding-troubleshooting-steps.aspx

Connecting a standalone server to OMS:

If you want to attach several servers which are not monitored by SCOM you can simply download the agent and install it. No need to fiddle with the certificates yourself any more!

Download the agent and install it on a server:

printscreen-8-05-2015 0008

The agent package is around 25 MB and will be downloaded to your local machine. Transfer the package to a machine which is not monitored by SCOM and install the package.

Note: The same restrictions apply as when installing an agent from the console. It’s not possible to onboard a server which has a SCOM component installed, such as a gateway server, management server,… This makes sense, because if you have these servers in place you have a SCOM environment and it’s far easier to onboard the entire management group instead of doing this per server.

Copy the MMASetup-AMD64 package to your server and run as administrator

printscreen-8-05-2015 0009

The standard manual install dialog for the Microsoft Monitoring Agent starts.

printscreen-8-05-2015 0010

Click through the first screens.

printscreen-8-05-2015 0011

printscreen-8-05-2015 0012

The next screen is interesting. Here we need to decide whether we are going to install the Microsoft Monitoring Agent exclusively for OMS or also for the on-prem SCOM. In this scenario we choose to use the agent exclusively for OMS.

printscreen-8-05-2015 0014

Now we need to fill in the GUID keys which are shown on the OMS page right under “connect a server”.

The workspace ID is straightforward: it is the workspace ID noted in the OMS console.

The Workspace key is in fact noted as the “private key” in the OMS console.

Note: Again this probably will be aligned after the SCOM console is aligned with the new OMS system.
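
Because the ID has to be copied from the OMS page into the setup dialog, a quick sanity check can catch a truncated paste before you start the install. The snippet below is a small sketch under the assumption that the workspace ID is formatted as a GUID; the function name Test-WorkspaceId and the sample values are invented for the example and say nothing about whether the ID actually exists in OMS.

```powershell
# Hypothetical helper: returns $true when the pasted value parses as a GUID.
function Test-WorkspaceId {
    param([string]$WorkspaceId)
    $parsed = [guid]::Empty
    # Trim stray spaces or line breaks picked up while copying
    [guid]::TryParse($WorkspaceId.Trim(), [ref]$parsed)
}

# Sample values, invented for the illustration
Test-WorkspaceId '3f2504e0-4f89-11d3-9a0c-0305e82c3301'   # complete GUID
Test-WorkspaceId '3f2504e0-4f89-11d3-9a0c'                # truncated paste
```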

printscreen-8-05-2015 0015

Click next and install

printscreen-8-05-2015 0016

printscreen-8-05-2015 0017 printscreen-8-05-2015 0018

Finish. Wait 5 min and refresh your console:

printscreen-8-05-2015 0019

Note: If you have more than one workspace, make sure you select the workspace you want to connect the server to, as the ID is unique per workspace.

Connecting a System Center operations manager management group:

Open your SCOM environment and navigate to Administration > Operational Insights > Operational Insights Connection

Note: These names will probably change in the next UR or management pack release.

printscreen-8-05-2015 0001

Click configure or Re-configure Operational Insights

printscreen-8-05-2015 0002

printscreen-0301

Select whether you are using a work or Microsoft account. I’m using a Microsoft Account:

The associated workspaces with your account are loaded and selectable

printscreen-0302

Select your workspace and click update or create

printscreen-8-05-2015 0004

Next choose which groups or servers should send data to your OMS workspace. Click add a computer / group in the tasks bar on the right.

printscreen-8-05-2015 0005

Select the servers / groups you want and click add

printscreen-8-05-2015 0006

 

So now all the servers are coming into your Operational Insights Managed view.

printscreen-0305

This management group will show up in your OMS workspace as 1 connected management group:

printscreen-8-05-2015 0007

The name, the number of servers and the time the last data was received are shown to give you a clear view of the status of your management groups.

Configure log collection

A lot of solutions depend on the logs received. As this was one of the first valuable additions that Opinsights brought, it is almost mandatory to have in OMS as well.

Go to the last step of the “wizard” and select which logs need to be gathered on the connected servers:

printscreen-8-05-2015 0021

When configured we’ll get a nice 100% mark and we are ready to go!

printscreen-8-05-2015 0022

Summary

Connecting is a breeze if your servers are able to reach the OMS service on port 443. You can connect individual servers or entire management groups where you decide which servers are actually sending data to the OMS service.
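
If a server refuses to connect, a plain TCP test on port 443 from that server is a quick first check. The function below is a minimal sketch using the .NET TcpClient class; the name Test-TcpPort and the timeout value are my own choices, and the host name in the usage comment is a placeholder, so replace it with the endpoint your firewall rules are supposed to allow.

```powershell
# Hypothetical helper: returns $true when a TCP connection to the given
# host and port can be opened within the timeout, $false otherwise.
function Test-TcpPort {
    param(
        [string]$ComputerName,
        [int]$Port = 443,
        [int]$TimeoutMs = 3000
    )
    $client = New-Object System.Net.Sockets.TcpClient
    try {
        $pending = $client.BeginConnect($ComputerName, $Port, $null, $null)
        # Give up when the connection is not established within the timeout
        if (-not $pending.AsyncWaitHandle.WaitOne($TimeoutMs)) { return $false }
        $client.EndConnect($pending)   # throws when the connection was refused
        return $true
    }
    catch {
        return $false
    }
    finally {
        $client.Close()
    }
}

# Usage (placeholder host name, fill in your own endpoint):
# Test-TcpPort -ComputerName 'your-oms-endpoint.example' -Port 443
```

A result of False only tells you that the TCP connection could not be opened from this machine; proxy authentication or certificate issues need to be checked separately.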

For now the agents for Linux are not available yet, but they will become available very soon.

So now you are all set to start playing with the Solutions you have installed while data is pouring in!

Microsoft Operations Management Suite: Configure Workspaces

This blog post is part of the Microsoft Operations management Suite Quick start guide which can be found here: http://scug.be/dieter/2015/05/08/microsoft-operations-management-suite-quickstart-guide/

 

A workspace is basically the same as your management group in SCOM. It contains all the different Solutions, connected data sources and the Azure account you need to start working. You can have several workspaces with one account, but interaction between different workspaces is not possible.

Create a workspace

In this scenario we are going to build a new workspace. Just choose the name / email and the region and click create

printscreen-4-05-2015 0001

Next up we need to link the Azure subscription we have associated to our Microsoft or corporate account. Note that having an Azure subscription is not a prerequisite for this step (you can just click not now) but it is highly recommended.

printscreen-4-05-2015 0002

To make sure you are the proper owner of the email address (note that it doesn’t have to be the address associated with your account by default), Microsoft sends you a confirmation mail which you need to follow.

Click confirm now and continue.

printscreen-0300

At this point your workspace will be ready and you will have all the standard tiles, but no data is pouring in just yet.

Configure a Workspace

Head over to the Settings tile where you will be guided through connecting your sources to the OMS service. In the past this involved setting up proxy servers and complicated settings, but since the integration with SCOM this has become peanuts. OMS also uses the same entry point that Opinsights used to get connected.

printscreen-4-05-2015 0003

The first step is to add solutions. Formerly known as Intelligence Packs (IPs), these solutions each have their own purpose, so you can tailor OMS to the way you want to use it. Some Solutions are already installed by default, so you can click “connect a data source” to continue.

printscreen-4-05-2015 0004

 

Now that you have your workspace configured it’s time to connect your datasources to get your data in!

 

 

Microsoft Operations Management Suite: Quickstart guide

 

Microsoft Operations Management Suite (OMS) was launched during Ignite 2015 and is awaiting your data to show its power: giving you insight into your environment and letting you manage it without being limited to the boundaries of your own datacenter or your Azure environment. But before we can play with the goodies we need to configure everything correctly.

printscreen-6-05-2015 0000

This guide will grow over time to be your one-stop guide to getting started with, configuring and using Microsoft Operations Management Suite (OMS). Bookmark this post to get regular updates on my journey through OMS and save yourself some time while exploring its possibilities.

Below is a list of topics that can be used to already start your journey:

Microsoft Operations Management Suite: A first glance

This blog post is part of the “Microsoft Operations Management Suite: Quickstart guide” which can be found here: http://scug.be/dieter/2015/05/08/microsoft-operations-management-suite-quickstart-guide/

It has been a while since I was blown away by news about SCOM and monitoring in general. During the recent keynote of Ignite in Chicago, however, Microsoft delivered… I personally was surprised by the vast number of announcements regarding System Center in general and monitoring and management tools in particular. One of the coolest things for me personally was the announcement of Microsoft Operations Management Suite (OMS).

printscreen-6-05-2015 0000

A little bit of history is in order to show you that this product was not born overnight. The first sign that Microsoft was working on a service to monitor and aggregate data in the cloud emerged when System Center Advisor was launched. System Center Advisor was a small tool which gave you a quick overview of the compliance level of your environment and checked how you were doing in installing and configuring System Center. With an update only once a day and not a lot of adoption, this tool was not widely spread. Although it wasn’t heavily used, it paved the road for the Opinsights preview. The Opinsights preview leveraged the power of Azure to give you even more control in finding out how your datacenter was doing, using several free apps to make assessments based on data you sent to the Azure cloud services. The integration was built into SCOM, making it a usable tool and easier to configure. The service was free, so I personally encouraged a lot of customers to start exploring it. The fact that you could also connect machines directly without having SCOM added to the level of adoption.

So what does OMS bring beyond the previous versions?

OMS integrates even more of the different services you need to manage your datacenter, and it integrates even further into your Azure environment to become your one tool for dealing with the different aspects of exploring your datacenter.

The following 4 groups of tools are at this point integrated into OMS:

printscreen-6-05-2015 0001

Log Analytics

Log Analytics was already present in Opinsights but has been fine-tuned. You can now gather all logs of different tools and servers, see which events are actually the most common in your environment and take corrective actions accordingly. This is in my personal opinion a very valuable addition if you would like to find out what the most common problems on your servers are. In SCOM you need to configure what to monitor. Log Analytics, however, uses the power of Azure storage to collect and keep all the events so you can easily query them and find patterns.

Automation

This feature is new and integrates automation across the different components you have in your datacenter. The Automation module will integrate with Websites, Virtual Machines, Storage, SQL Server, and other popular Azure services. Automation runbooks can easily be created through a drag and drop interface, basically giving you the opportunity to create automation in seconds. Tying in to all the different components, you can automate repetitive tasks across your on-prem and cloud services. This will decrease the margin for human error and, as with all automation, if it’s done correctly you will lower downtime and get a better view of your environment.

Availability

Availability is not only about keeping your applications and data online but also about making sure that they can be restored after a breach in service. The availability tools give you the power to synchronize data between different locations to ensure that your data is safe. This availability tab is where the different tools will be placed to make sure you have all you need to keep your environment up and running and restore it as quickly as possible. The availability apps tie in to your Azure backup services such as Azure Backup, Azure Site Recovery,…

Security

Besides getting everything online and keeping it online, a lot of companies are also concerned about keeping everything safe. In the modern world it is a challenge to find the right balance between a workable system and a secure system. The security apps give you the insights you need to identify malware and missing system updates, collect security-related events, and perform forensic, audit and breach analysis.

So how does it work?

If you were already using the Opinsights preview, your account is automatically transferred to a free account in OMS. This free account gives you 7-day retention and a maximum upload volume of 500 MB. This is solely for testing purposes to get you going. The integration remains in the SCOM management group and uploads all the data in CAB files to the OMS cloud service. Your tools will still be there in your dashboard, with the possibility to connect more data sources to the OMS service. For more detailed instructions make sure to check out my series on OMS here on my blog.

Check out the following links for more info:

SCOM: Configure a monitor recovery task for a healthy state

During a recent project a client had a small request to create a monitor and run a command when a device was not accessible anymore. Easy right! But (yep there’s always a but) they wanted to run a command when the monitor was returning back to a healthy state to restart a service when the device came back online… Hmmm and all in 1 monitor.

So the conditions were as follows:

Monitor:

  • Action: Run a PowerShell based monitor to test the connection with the device
  • BAD: Device is down => Run recovery task to remediate
  • GOOD: Device is up again => Run recovery task to restart service

(note: Always do this small matrix of a monitor design to exactly know what the customer wants)

I don’t have the device to simulate but came up with a small example in my lab to show you how to get this working with just 1 monitor. The situation in my lab is very simple. I want to turn on my desk lighting when my pc is on (and I’m working) and turn it off when my pc is not online.

My conditions:

Monitor:

  • Action: Run PowerShell based monitor to test the connection and pass the result to SCOM
  • BAD: PC is offline => turn off my desk lighting
  • GOOD: PC is online => turn on my desk lighting

So first things first we need to test the connection to see whether my pc is running. To check this I’m using this small script:

[xml]

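# $target is the host name or IP address to test
param ([string]$target)

# Create the SCOM scripting API object and an empty property bag
$API = New-Object -ComObject "MOM.ScriptAPI"
$PropertyBag = $API.CreatePropertyBag()

# Test-Connection -Quiet returns $true when the target answers, $false otherwise
$value = Test-Connection $target -quiet

# Store the result under the name "status", which the monitor conditions evaluate
$PropertyBag.AddValue("status", $value)

# Hand the property bag back to SCOM
$PropertyBag
$API.Return($propertybag)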

[/xml]

So I’m testing the connection and sending the response to SCOM. The PowerShell command “Test-Connection $target -Quiet” simply returns true or false depending on whether the target is reachable.
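
Before wrapping the check in a monitor it can help to see what the status value looks like on a machine without the SCOM agent, where the MOM.ScriptAPI COM object is not available. The following is a small sketch for that purpose only; the function name Get-PingStatus and the use of -Count 1 to keep the test short are my own additions and are not part of the monitor script.

```powershell
# Hypothetical test helper: runs the same ping check as the monitor script
# and returns the result as a plain object instead of a SCOM property bag.
function Get-PingStatus {
    param([string]$Target)
    # -Count 1 sends a single echo request; -Quiet turns the result into a Boolean
    $online = Test-Connection -ComputerName $Target -Count 1 -Quiet
    [pscustomobject]@{
        Target = $Target
        Status = $online
    }
}

Get-PingStatus -Target 'localhost'
```

The Status column holds the same Boolean that the monitor script stores in the property bag under “status”.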

Creating the Monitor with Silect MP Author

The creation of this monitor consists of 2 parts:

  • Defining the class where the monitor will be targeted to and therefore the machine which will test the connection to the desktop
  • Passing the status from the machine to SCOM and take action by using a monitor

Defining a class:

To properly target this monitor we need to create a class in SCOM which identifies the servers that need to test the connection. In this case I’ve added a reg key to all servers that need to ping the desktop, so I’m starting with a Registry Target to create my class:

printscreen-0254printscreen-0255

I fill in a server that has the key already in there to make it much easier to browse the registry instead of typing it in with an increased margin for errors.

printscreen-0256

Select the Registry key you want to look for

printscreen-0257

In my case I’ve added a key under HKEY_LOCAL_MACHINE\Software\pingtestwatchernode

printscreen-0258

Select the key and press add and ok

printscreen-0259

Identify your registry target:

printscreen-0260

Identify your discovery for the target

printscreen-0261

In my case I just check whether the key is there. No check on the content.

printscreen-0263

The discovery will run once a day.

printscreen-0264

Review everything and press finish

printscreen-0265

At this point our class is ready to be targeted with our script monitor.

Next up is to create the monitor:

Create a new script monitor:

printscreen-0266

Browse to the PowerShell script and fill in the parameters. In this case I have 1 parameter which is “target” and will hold the IP of the desktop.

printscreen-0267

Define the conditions:

Healthy condition is when the status is True, with type Boolean

printscreen-0268

Critical condition is when the status is False

printscreen-0269

Note: I’m using a “boolean” Type

Configure the script and select the target you have created earlier on and the availability parent monitor

printscreen-0270

Identify your script based monitor

printscreen-0271

Specify a periodic schedule: run every 2 minutes

printscreen-0272

No alert generation necessary.

printscreen-0273

Review all the parameters and create the script based monitor.

printscreen-0274

Load the management pack in your environment and locate the monitor:

printscreen-0278

Check the properties => recovery tasks and create 2 recovery tasks for the Health state “critical”.

Note that the screenshot below already shows the correct healthy state after config of the mp.

printscreen-0279

Export the management pack, open it in an editor and locate the “recoveries” section to find the recovery tasks we just created:

printscreen-0280

Scroll to the right, locate the “ExecuteOnState” parameter and change it from “Error” to “Success” for the recovery task you want to run when the monitor goes back to healthy.

Save the management pack and reload it in your environment.

printscreen-0281
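
If you prefer not to hand-edit the XML, the same change can be scripted. The function below is a sketch under a few assumptions: the exported management pack is an unsealed XML file, the recovery is a Recovery element carrying an ID attribute next to ExecuteOnState, and the function name, file path and recovery ID are placeholders I made up. Work on a copy of the export so you can always fall back to the original.

```powershell
# Hypothetical helper: sets ExecuteOnState on one recovery in an exported MP file.
function Set-RecoveryExecuteOnState {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [Parameter(Mandatory = $true)][string]$RecoveryId,
        [ValidateSet('Success', 'Warning', 'Error')][string]$State = 'Success'
    )
    # XmlDocument.Load needs a full path, so resolve relative paths first
    $fullPath = (Resolve-Path -Path $Path).ProviderPath
    $mp = New-Object System.Xml.XmlDocument
    $mp.PreserveWhitespace = $true          # keep the original layout of the file
    $mp.Load($fullPath)

    # Find the recovery by its ID attribute
    $recovery = $mp.SelectSingleNode("//Recovery[@ID='$RecoveryId']")
    if ($null -eq $recovery) {
        throw "Recovery '$RecoveryId' was not found in $fullPath"
    }

    $recovery.SetAttribute('ExecuteOnState', $State)
    $mp.Save($fullPath)
}

# Usage (placeholder path and ID, take the real ID from your own export):
# Set-RecoveryExecuteOnState -Path 'C:\temp\MyPingMonitor.xml' -RecoveryId 'MyRecovery.TurnLightOn'
```

Only the recovery with the given ID is touched, so the second recovery task keeps running on the “Error” state.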

So all we need to do is test it…

My pc is on: IT-Rambo has his cool backlight:

20141130_230930098_iOS

My pc is off and the light is automatically turned off…

20141130_230904267_iOS

Final note: If you use this method, make sure NOT to save the recovery tasks in the console anymore; otherwise the settings we just changed in our management pack will be overwritten again, as SCOM can’t natively configure a recovery task for a healthy state.

You can use this basically for anything where you want to run 2 conditions on the same monitor or even 3 if you have a 3 state monitor.

SCOM: Monitor the monitor part 1: PowerShell

Recently an engineer asked me during a community event why SCOM didn’t notify him when SCOM itself was down.

My first response was very similar to the response of my favorite captain below: printscreen_surf-0018

But this actually got me thinking, because the engineer made a good point: to have full monitoring you should have another mechanism in place to monitor the monitoring system. Most companies still have a legacy monitoring system in place that can be leveraged to monitor the SCOM servers, but let’s face it: keeping another monitoring system alive just to monitor the SCOM servers only adds complexity to your environment for a small benefit.

That’s why I started building a small independent check with PowerShell. In part 1 of this series I’ll go over how to monitor whether your management servers are still up and running.

To do this we need a watcher node which is able to ping the management servers. This watcher node may be any machine capable of running PowerShell and does not need to have the OperationsManager PowerShell module available. This makes sure we are operating completely independently from SCOM.

Process used

The graph below shows the process used:

monitorthemonitor_servers

In my environment I have 2 management servers which are reachable from the watcher node. The first step is to dynamically determine how many management servers are in my environment. To do this I create an input file which is generated by PowerShell on a management server and updated once a day. This is an automated process because, let’s face it, if we have to remember to change the infile.txt whenever we add or remove a management server, we will forget.

This file will be available on the watcher node to do the ping commands even when the management servers are down.

Configuration on the Management server

(this is action 1 in the graph above)

To generate the infile containing all the management servers which are currently in our environment we need to execute the following PowerShell script on a management server:

[xml]
#=====================================================================================================
# AUTHOR:    Dieter Wijckmans
# DATE:        03/12/2014
# Name:        Readms.PS1
# Version:    1.0
# COMMENT:    This script will read out all the Management servers in a management group and saves it
#           into a txt file which is used to ping the servers from an external watcher node.
#           This script is scheduled on a management server via scheduled tasks.
#           Make sure to fill in your destination (which is your watcher node) in the variable
#
# Usage:    readms.PS1
# Example:
#=====================================================================================================
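# The OperationsManager module provides Get-SCOMManagementServer
Import-Module OperationsManager

$destination = "fill in the destination on the watchernode here"
$ms = Get-SCOMManagementServer

# Write one management server name per line, replacing the previous infile
$ms | ForEach-Object { $_.DisplayName } | Out-File $destination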

[/xml]
Schedule this script on the management server via scheduled tasks and run it once a day.

The program to run is: powershell.exe c:\scripts\readms.ps1

This will generate the infile for the ping command to check the management servers and will place it on the watcher node.
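
The daily schedule can also be created from the command line instead of through the Task Scheduler GUI. The lines below are a sketch using schtasks.exe; the task name, start time and service account are placeholders I picked for the example, and only the script path comes from the text above. The account you use needs read access to the management group and write access to the destination on the watcher node.

```powershell
# Arguments for schtasks.exe; every value except the script path is a placeholder.
$taskArguments = @(
    '/Create'
    '/TN', 'SCOM - Read management servers'   # hypothetical task name
    '/TR', 'powershell.exe -NoProfile -File c:\scripts\readms.ps1'
    '/SC', 'DAILY'
    '/ST', '06:00'                            # hypothetical start time
    '/RU', 'CONTOSO\svc-scomtask'             # hypothetical service account
    '/RP', '*'                                # prompt for the password
)

# Run this line from an elevated prompt on the management server:
# & schtasks.exe $taskArguments
```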

Configuration on the Watcher node

(this is action 2 in the graph above)

Next up is to configure the watcher node to monitor our management servers and alert when they are unreachable. This is done by executing the following PowerShell script on a regular basis through scheduled tasks. I schedule this task every 5 minutes, which means that you get a mail every 5 minutes until the issue is resolved. Better to annoy a little bit more than to send just 1 mail which drowns in the mail volume.

[xml]
#=====================================================================================================
# AUTHOR:    Dieter Wijckmans
# DATE:        03/12/2014
# Name:        Pingtest.PS1
# Version:    1.0
# COMMENT:    This script will ping all the Management servers in a management group according to the
#           input file and escalate when a server is not reachable.
#           Make sure to fill in all the parameters in the parameter section.
#           This script is scheduled on the watcher node via a scheduled task.
#
# Usage:    pingtest.PS1
# Example:
#=====================================================================================================

#parameter section: Fill in all the parameters below
$infile = "Location of file with management servers listed"
$outfile = "Location of file which will keep historical data on the pings"
$smtp = "fill in your smtp config to send mail"
$to = "The destination email address"
$from = "The from email address"

#reading the date when the test is executed for logging in the historical file
$testexecuted = Get-Date
#reading in all the objects listed in the infile
$objects = get-content $infile

#running through all the objects and taking action accordingly
foreach ($object in $objects)
{
    # Test-Connection -Quiet returns $true when at least one echo reply comes back
    if (Test-Connection $object -Quiet)
    {
        $pingresult = "Online"
    }
    else
    {
        $pingresult = "Offline"
        $subject = "SCOM: Management Server " + $object + " is down!"
        $body = "<b><font color=red>ATTENTION SCOM support staff:</font></b> <br>"
        $body += "Management Server: " + $object + " is down! Please check the server!"
        Send-MailMessage -SmtpServer $smtp -To $to -From $from -Subject $subject -Body $body -BodyAsHtml -Priority High
    }
    # Append one line per server to the historical file
    ($object + " :ping result: " + $pingresult + " :" + $testexecuted) | Out-File $outfile -Append
}

#read the length of the inputfile and take the same amount of lines from the end of the outfile to check whether all
#management servers are down.
$numberoflines = (Get-Content $infile | Measure-Object -Line).Lines
$file = Get-Content $outfile -Tail $numberoflines
$wordToFind = "Online"
$containsWord = $file | ForEach-Object { $_ -match $wordToFind }
if ($containsWord -notcontains $True)
{
    $subject = "SCOM: ALL Management Servers are down!"
    $body = "<b><font color=red>ATTENTION SCOM support staff:</font></b> <br>"
    $body += "All Management servers are down. Please take immediate action"
    Send-MailMessage -SmtpServer $smtp -To $to -From $from -Subject $subject -Body $body -BodyAsHtml -Priority High
}
[/xml]
Note: Make sure that you change all the parameters in the parameter section.

This script pings every machine listed in the infile we created earlier and writes the result to the outfile. A mail is sent for every management server that is down. The outfile is then evaluated, and if ALL management servers are down a separate mail is sent to notify that SCOM is completely down.
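The "all servers down" check can be isolated in a small function, which makes it easier to try out against a sample log. This is a sketch only: `Test-AllOffline` is a hypothetical helper name, and the sample log lines (including their date format) are made up to match the line layout that pingtest.ps1 writes.

```powershell
# Sketch: decide from the last lines of the log whether every management server was offline.
# Test-AllOffline is a hypothetical helper; it is not part of the original scripts.
function Test-AllOffline {
    param(
        [string[]]$LogLines,   # content of the outfile
        [int]$ServerCount      # number of servers listed in the infile
    )
    if ($ServerCount -lt 1 -or -not $LogLines) { return $false }
    # Only the last $ServerCount lines belong to the most recent run
    $lastRun = @($LogLines | Select-Object -Last $ServerCount)
    # All servers are down when no line of the last run reports "Online"
    $online = @($lastRun | Where-Object { $_ -match ':ping result: Online ' })
    return ($online.Count -eq 0)
}

# Made-up sample log: one healthy run followed by a run where both servers are down
$log = @(
    "SCOMMS1 :ping result: Online :03/12/2014 21:08:38",
    "SCOMMS2 :ping result: Online :03/12/2014 21:08:38",
    "SCOMMS1 :ping result: Offline :03/12/2014 21:13:38",
    "SCOMMS2 :ping result: Offline :03/12/2014 21:13:38"
)
Test-AllOffline -LogLines $log -ServerCount 2
```

Matching on the full `:ping result: Online` marker instead of just the word "Online" avoids a false match when a server name happens to contain that word.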

You can change the mail appearance in the $body fields in the PowerShell.

The outfile will have the following entries:

printscreen_surf-0020

My servers were offline last night at 21:13:38, so the mailing was triggered. The mail looks like this when SCOMMS2 is down:

printscreen_surf-0019

When all management servers are down it will look like this:

printscreen_surf-0021

So now we get mails that are completely independent of SCOM, telling us there’s an issue with the SCOM management servers.

  • So what if our watcher node is down? Well I’ve installed a SCOM agent on this machine with a special subscription to notify me when it’s down.
  • So what if our management servers are down AND my watcher node is down… Well then you probably have a far greater problem and your phone will probably be already red hot by now…

You can find the PowerShell scripts and the files here on Technet Gallery:

download-button-fertig11

In Part 2 I’ll go over the ability to monitor your SQL connection of the management servers.

SCOM: PowerShell tip: Set Resource Pool Automatic members

 

Today I ran into a situation where I had to test an advanced notification setup to send alerts to another helpdesk system.

The notification channel activated a PowerShell script with parameters out of the alert to send data to the other system. After creating the notification channel there was no way to check whether the server I had already configured was functioning correctly: my 2 management servers were automatically part of the Notifications resource pool, which made it impossible to force my test through the management server I had configured.

These are the steps to troubleshoot the notifications on 1 management server and to restore the situation after testing and configuring both management servers:

These are my resource pools:

printscreen_surf-0008

Notice the difference in icon between an automatically and a manually populated resource pool.

Right-click the Notifications resource pool and select manual membership.

printscreen_surf-0009

A properties dialog will pop up automatically to let you change the membership of this resource pool. Even if you press Cancel at this point, the resource pool will be converted to manual membership:

printscreen_surf-0010

The active members are shown here. I’ve removed my SCOMMS2 server to continue my test on SCOMMS1 for the PowerShell notification channel.

printscreen_surf-0011

 

printscreen_surf-0012

After my tests were successful and I had configured SCOMMS2 as well, I wanted to reset the resource pool back to automatic. The catch is that this is not possible via the GUI.

The following PowerShell oneliner will do the trick however:

Get-SCOMResourcePool -DisplayName "Notifications Resource Pool" | Set-SCOMResourcePool -EnableAutomaticMembership $true

printscreen_surf-0015

After hitting F5 the Notifications resource pool is back to automatic and the 2 management servers are back in the resource pool.

printscreen_surf-0008

printscreen_surf-0016       

Note:

  • If you are executing a PowerShell script on the management server, make sure to have the same version of the script on both management servers in the same location.
  • Always make sure that the Notifications resource pool is set back to automatic so the load is actively divided between all the management servers. Otherwise you will lose the great benefit of resource pools.
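To double check the result from PowerShell instead of the console, something along these lines should work. It is a sketch under assumptions: the `IsDynamic` and `Members` property names are how I know the resource pool object from the SCOM 2012 SDK, so verify them with `Get-Member` in your own environment.

```powershell
# Sketch: check that the pool is back on automatic membership and list its members.
# Assumes the OperationsManager module is available on this machine.
Import-Module OperationsManager

$pool = Get-SCOMResourcePool -DisplayName "Notifications Resource Pool"

# IsDynamic is expected to be True once automatic membership is enabled again (assumption, see above)
$pool | Select-Object DisplayName, IsDynamic

# The management servers currently in the pool
$pool.Members | Select-Object DisplayName
```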

SCOM: Connect management groups between on-prem and Azure

 

During a recent project I explored the benefits of hosting a two-legged SCOM environment for both on-prem and cloud services. Although this is possible with just one management group and a site-to-site VPN to the cloud, the customer opted for a 2 management group approach to keep a clear divider between on-prem and the cloud.

In this blog post (who knows, it could become a series) I’ll show you how to connect the management groups to each other so they can exchange alerts and use 1 console, while still benefiting from the presence of a management group on both platforms.

wall2top_z23gd-129

In this scenario I’m going to use connected management groups, as explained here: http://technet.microsoft.com/en-us/library/hh230698.aspx

Connecting management groups in SCOM 2012 gives you a couple of benefits. The biggest one in my opinion is that you can have multiple management groups with different settings but use 1 console to get all the alerts. The customer wanted the ability to monitor their clients on different thresholds than their own systems. Their own systems were mainly situated on site, while the other systems were at the clients’ sites or in the cloud.

The management group which will have the consolidated view is called the local management group. In my example it is VLAB, which is on-prem. The other management groups are called “connected management groups”, in this case VCLOUD.

They relate to each other in a hierarchical fashion, with connected groups in the bottom tier and the local group in the top tier. The connected groups are in a peer-to-peer relationship with each other. Each connected group has no visibility or interaction with the other connected groups; the visibility is strictly from the local group into the connected group.

So in this scenario it’s a good idea to connect these management groups to see all data in 1 console for both on-prem and client based. In VCLOUD it’s not possible to see the alerts of VLAB but the other way around it’s possible.

So what do we need to do to achieve this (even with different AD domains and firewalls in between)?

First of all prep the VCLOUD in Azure:

Create endpoints on Azure machine

To be able to reach the Azure management group from on-prem we need to make sure that a connection to the VCLOUD management server is possible. This is done through ports 5723 and 5724.

Open the Azure management portal:

My server is called vcloud-ms1

printscreen-0231

Open the endpoints and add 5723 and 5724 to the endpoints. This in fact opens the Azure firewall to your machine. All communication will happen over these 2 ports.

printscreen-0232

Click add and fill in the endpoints as shown below.

printscreen-0233

Next, find the following:

  • The Public Virtual IP address (VIP) and note it down. In my case it’s 23.101.73.xxx
  • The DNS name: in my case vcloud-ms1.cloudapp.net

 

printscreen-0234

Prepare the onsite management server

Now that the management server of our VCLOUD management group is configured we need to configure the management server in our VLAB environment to become the local management group which will receive the alerts.

First we need to make sure that the onsite server can resolve AND reach the server in VCLOUD management group.

This can be done by changing the hosts file on the VLAB management server.

Go to c:\windows\system32\drivers\etc\ and open the hosts file:

printscreen-0235 

Note: I’ve removed the last 3 digits of all the IP addresses above. You need to fill in the full IP address as documented in the Windows Azure console.
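If you prefer to add the entry from PowerShell, a minimal sketch could look like this. `New-HostsEntry` is a hypothetical helper, the IP address is a documentation placeholder (not the real VIP) and the host name is an assumption: use the VIP and the name you noted in the Azure portal.

```powershell
# Sketch: build a hosts file line for the Azure management server.
# The IP below is a documentation placeholder, NOT the real VIP of the server.
function New-HostsEntry {
    param(
        [string]$IpAddress,
        [string]$HostName
    )
    # hosts file format: IP address, whitespace, host name
    return "{0}`t{1}" -f $IpAddress, $HostName
}

$entry = New-HostsEntry -IpAddress "203.0.113.10" -HostName "vcloud-ms1.cloudapp.net"
$entry

# To append the line to the hosts file (requires an elevated prompt), uncomment:
# Add-Content -Path "$env:windir\System32\drivers\etc\hosts" -Value $entry
```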

Let’s check whether this works now from the VLAB management server with the classic connectivity check: ping the hostname:

printscreen-0236

Hmmm, not working. Did we configure something incorrectly? Check, double check. NO.

Well, this makes perfect sense because PING IS DISABLED towards Azure machines. Therefore you will get a “Request timed out” every time you test, no matter what you configure!
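Since ping tells us nothing here, testing the two TCP ports is a more useful check. Below is a minimal sketch using the .NET `TcpClient` class so it also runs on older PowerShell versions; `Test-TcpPort` is a hypothetical helper name and the 3 second timeout is my assumption. As far as I know, `Test-NetConnection -Port` offers a comparable check on Windows Server 2012 R2 and later.

```powershell
# Sketch: test whether a TCP port is reachable, as an alternative to ping.
# Test-TcpPort is a hypothetical helper name; the default timeout is an assumption.
function Test-TcpPort {
    param(
        [string]$ComputerName,
        [int]$Port,
        [int]$TimeoutMs = 3000
    )
    $client = New-Object System.Net.Sockets.TcpClient
    try {
        # Start the connection attempt and wait at most $TimeoutMs for it to complete
        $async = $client.BeginConnect($ComputerName, $Port, $null, $null)
        if (-not $async.AsyncWaitHandle.WaitOne($TimeoutMs)) { return $false }
        # EndConnect throws when the connection was refused or the name could not be resolved
        $client.EndConnect($async)
        return $client.Connected
    }
    catch { return $false }
    finally { $client.Close() }
}
```

Running `Test-TcpPort vcloud-ms1.cloudapp.net 5723` (and again with 5724) from the VLAB management server should return True once the endpoints and the hosts file entry are in place.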

Connecting the management groups

Now that we have both ends configured it’s time to see whether we can connect the management groups. Remember: initiate the connection from the local management group (the one that needs to see all alerts and sits at the top of the hierarchy).

So let’s connect to the management server in VLAB:

Open the Administration pane and select Connected Management Groups:

printscreen-0237

Right click and choose Add Management Group

printscreen-0238

Fill in all the data requested:

  • Management Group Name: The name of the VCLOUD management group
  • Management Server: The name of the management server in VCLOUD (make sure to use the exact name as entered in the hosts file)
  • Account: Because the account we use for the SDK service resides in the VLAB AD and is not known in VCLOUD, we need to use the VCLOUD credentials

printscreen-0239

Note: You need to initiate this from the management server where you changed the hosts file, so make sure there’s a console installed on it.

You will get the message below because it’s not possible to validate the account in the local AD:

printscreen-0240

Just click Next and you should be connected at this point:

printscreen-0241

Success!

So now all we have to do is configure what we want to show on the local management group.

 

I’ll explain this further in the next blog in this series.
