SharePoint 2010 Logging Database – An administrator's best friend

by Brendan Griffin on 2/19/2010 6:14 AM

Category: Administration; SharePoint 2010

One of the features of SharePoint 2010 that is getting me really excited is the logging database. The purpose of this database (which is named WSS_Logging by default) is to aggregate logging information from the farm into one central location. As somebody who spends a lot of time in front of customers helping them to maintain a healthy and stable farm, this feature will be a killer app in SharePoint 2010 for me – I'm sure it will save me heaps of time.

In a WSS 3.0/MOSS 2007 farm, if I wanted to find out the average CPU utilisation of a server within my farm over the course of a week, for example, I would need to set up Performance Monitor with the relevant counters, leave it running for a week, and then take the data away and analyse it.

Similarly, if I wanted to analyse the ULS logs or event logs I had to manually extract these from each server and merge them together, which could be time-consuming – unless I was using SPDiag: http://blogs.msdn.com/sharepoint/archive/2009/02/05/sharepoint-diagnostics-spdiag-tool-v1-0-for-sharepoint-products-and-technologies.aspx

What a pain! SharePoint 2010 offers so much functionality that allows administrators to be proactive and detect issues before they actually begin to affect end users, thus keeping the farm in tip-top shape.

The logging database by default contains the following information from all servers within the farm, and it is fully supported to query this database directly.

· ULS Logs

· Event Logs

· Selected Performance Monitor Counters:

o % Processor Time

o Memory Available Megabytes

o Avg. Disk Queue Length

o Process Private Bytes (OWSTIMER and all instances of w3wp)

· Blocking SQL Queries

· SQL DMV Queries

· Feature Usage

· A host of information on search crawling and querying

· Inventory of all site collections

· Timer job usage
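Since direct querying is supported, a simple T-SQL sketch against the ULS data might look like the following. Note that the view name (ULSTraceLog) and the column names are my assumptions – verify them against the actual views in your own WSS_Logging database before relying on this.

```sql
-- Hedged sketch: pull the most recent ULS trace entries for the farm.
-- The view name (dbo.ULSTraceLog) and column names are assumptions;
-- check the views that exist in your WSS_Logging database first.
SELECT TOP 100
       [Timestamp],
       MachineName,
       Area,
       Category,
       [Level],
       Message
FROM   dbo.ULSTraceLog
ORDER BY [Timestamp] DESC;
```

Because the data from every server in the farm lands in one place, there is no need to merge per-server log files before running a query like this.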

SharePoint 2010 still writes data to the local event log and ULS logs on each server within the farm; the logging database isn't a replacement for these, it's merely a way to make administrators' lives easier by aggregating and centralising the wealth of information that SharePoint 2010 provides.

Several SQL views are provided to simplify the process of extracting information from the database. The information collected for crawling and querying is surfaced through a number of out-of-the-box reports, which are brilliant – they warrant a Blog post of their own!

The logging database is a fantastic way to get an insight into what is happening within a farm. I've lost count of the number of times customers have complained about random performance problems, but as they didn't have any proactive monitoring in place these were nearly impossible to pinpoint. If I'd had the logging database at my disposal I could simply have queried it directly to find out exactly what was happening during the periods of poor performance, instead of having to manually set up Performance Monitor and then instruct them to grab copies of the ULS logs every time performance within the farm became unacceptable!

Below is a screenshot of the PerformanceCounters view. Don't worry about the CounterID column – there is a table, named PerformanceCountersDefinitions, that contains the mapping between each CounterID and the actual counter itself.
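To put that mapping to work, a query along the lines of the following joins the view to the definitions table so the results show readable counter names rather than raw IDs. The column names other than CounterID are assumptions on my part – check the actual schema in your WSS_Logging database.

```sql
-- Sketch: resolve each CounterID to a human-readable counter name via
-- the PerformanceCountersDefinitions mapping table mentioned above.
-- Column names apart from CounterID (Counter, LogTime, Value) are
-- assumptions; verify them against the real schema before use.
SELECT d.Counter,
       pc.MachineName,
       pc.LogTime,
       pc.Value
FROM   dbo.PerformanceCounters AS pc
JOIN   dbo.PerformanceCountersDefinitions AS d
       ON pc.CounterID = d.CounterID
WHERE  d.Counter LIKE '%Processor%'
ORDER BY pc.LogTime DESC;
```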

Here is a screenshot of the Event Logs view:

Other useful scenarios that I can think of for using this database are:

· Listing all site collections within the farm for reporting/billing purposes

· Baselining and trending the performance of a farm over time – a great way to pre-empt when the farm may need to be scaled up or out to meet performance objectives; you may even find you have deployed too much hardware and can remove a server from the farm

· Determining how widely a feature is actually used

· Understanding why searches are taking a long time to return results

· Identifying SQL queries generated by SharePoint that are expensive in terms of SQL performance – a great way to identify performance bottlenecks

· Providing the ability to go back in time and identify timer jobs that have failed in the past
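As a sketch of the baselining and trending scenario above, something like the following would give a per-server daily average for a processor counter over the last week – exactly the kind of question that previously meant a week of Performance Monitor captures. Again, all object and column names apart from CounterID and PerformanceCountersDefinitions are assumptions to verify against your own logging database:

```sql
-- Hedged sketch: daily average of "% Processor Time" per server over
-- the last seven days, for baselining and trending. Column names
-- (LogTime, Value, Counter) are assumptions; check the real schema.
SELECT CONVERT(date, pc.LogTime) AS [Day],
       pc.MachineName,
       AVG(pc.Value)             AS AvgValue
FROM   dbo.PerformanceCounters AS pc
JOIN   dbo.PerformanceCountersDefinitions AS d
       ON pc.CounterID = d.CounterID
WHERE  d.Counter LIKE '%Processor Time%'
  AND  pc.LogTime > DATEADD(day, -7, GETDATE())
GROUP BY CONVERT(date, pc.LogTime), pc.MachineName
ORDER BY [Day], pc.MachineName;
```

Running a query like this weekly and keeping the results would build up the historical baseline needed to spot when the farm is trending towards needing more (or less) hardware.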

Brendan Griffin