Wednesday, December 29, 2010

Configuring System Center Operations Manager 2007 (SCOM) Monitoring of SQL Server Agent Job Status

As part of the migration from Operations Manager 2005 to System Center Operations Manager 2007, we are configuring centralized monitoring of SQL Server job status.  Previously, SQL Server jobs reported their status using notifications configured within each job.  Configuring the centralized job notification in SCOM is a multi-step process:

  1. In the Authoring section of the SCOM console, select Management Pack Objects > Object Discoveries, then search for “agent jobs”.  There are separate discovery types for SQL 2000, 2005 and 2008, and each needs an override to enable discovery.  Right-click the object and choose “Overrides > Override the Object Discovery > For all objects of class…”.  Check the box for the Enabled parameter and set the override value to True.  Repeat for each version of SQL Server.  You will then need to wait for the discovery to run, which can take up to one day by default.  To verify that the jobs have been discovered, go to the Monitoring section and select Microsoft SQL Server > SQL Agent > SQL Agent Job State.
  2. In the Authoring section of the SCOM console, select Management Pack Objects > Monitors, then search for “Last Run Status”.  Again, there are separate monitors for each version of SQL Server.  Right-click the Last Run Status monitor and choose “Overrides > Override the Monitor > For all objects of class…”.  Check Alert On State and set the Override Value to “The monitor is in a warning state”.  Check Generates Alert and set the Override Value to “True”.
  3. In the Administration section of the SCOM console, select Notifications > Subscriptions, then right-click and choose New subscription.  Provide a subscription name, criteria that include the Windows server you are monitoring (for example, an alert raised by an instance with a specific name or by an instance in a group), a subscriber (generally resolving to an email address) and a channel (generally SMTP for email).
  4. Test the subscription by running a job that alternately succeeds and fails.  For testing purposes, wait 15 minutes between runs.
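
For the test in step 4, one way to get a job that alternately succeeds and fails is to have a job step run a small script that flips its exit code on every run. The following is only a sketch, not what we actually used: it assumes the step is an Operating system (CmdExec) step, that a non-zero exit code marks the step as failed, and that Python is installed on the server. The state file location, the function name and the `--run` flag are all hypothetical.

```python
"""Toy command for a SQL Server Agent job step: succeeds and fails on alternate runs."""
import sys
import tempfile
from pathlib import Path

# Hypothetical location for the run counter; any path that the
# SQL Server Agent service account can write to would do.
STATE_FILE = Path(tempfile.gettempdir()) / "scom_alternating_job.count"


def next_exit_code(state_file: Path = STATE_FILE) -> int:
    """Return 0 (success) on odd-numbered runs and 1 (failure) on even-numbered runs."""
    try:
        runs = int(state_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        runs = 0  # first run, or the counter file is unreadable
    runs += 1
    state_file.write_text(str(runs))
    return 0 if runs % 2 == 1 else 1


# The hypothetical --run flag keeps the script inert unless the job step asks for it.
if __name__ == "__main__" and "--run" in sys.argv:
    sys.exit(next_exit_code())
```

With the script saved on the server, the job step command would be something like `python alternate.py --run` (the file name is made up). The first run succeeds, the second fails, and so on, which gives the Last Run Status monitor a state change to pick up on each run.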

Troubleshooting:

  1. Check whether the SQL job is actually succeeding or failing by viewing the job history in SQL Server Management Studio.
  2. Check that the state is changing by viewing SQL Agent Job State in the Monitoring section of the SCOM console.  Right-click the job and open Health Explorer.  Navigate to Availability > Last Run Status and select the State Change Events tab.  You should see previous state changes for the job.
  3. Remember to refresh the Management console to get current information.
  4. Remember that alerts are not sent immediately.  Each monitor has an interval parameter that specifies the number of seconds between checks of the job status, so a change in job status can go unnoticed for up to one full interval.
  5. Changing the Alert Severity to Critical and Alert On State to “The monitor is in a critical state” does not appear to generate alerts (as of SCOM 2007 R2).
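
To make the point about the interval parameter concrete, here is a rough back-of-the-envelope sketch of how long a finished job can wait before the monitor next looks at it. It is a simplified model, not a description of SCOM internals: it assumes checks happen at exact multiples of the interval, it ignores the time needed to deliver the notification, and the 600 second interval is just an illustrative value, not the management pack default.

```python
import math


def next_check_time(finish_time: float, interval: float) -> float:
    """Time of the first check at or after finish_time, with checks at 0, interval, 2*interval, ..."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    return math.ceil(finish_time / interval) * interval


def detection_delay(finish_time: float, interval: float) -> float:
    """Seconds between the job finishing and the next scheduled check."""
    return next_check_time(finish_time, interval) - finish_time


if __name__ == "__main__":
    # Illustrative values only: a job that finishes 10 seconds after a check
    # waits almost the whole interval before the next one.
    print(detection_delay(finish_time=610, interval=600))
```

In this model the delay is always less than one interval, which is why leaving a generous gap between test runs makes it easier to match each alert to the run that caused it.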
