I was recently helping a customer investigate a performance issue in their MOSS 2007 farm, which was running on Windows Server 2003 and using NTLM for authentication. During peak times throughout the day, SharePoint sites became incredibly slow to load. The strange thing about this issue was that it didn’t affect all WFEs at any one time. For example, we could submit the same request to an alternative WFE and sites would load instantly. The issue didn’t always affect the same WFE either, so we couldn’t isolate it to a specific server.
When a WFE experienced this issue, it had an exceptionally high number of current connections compared to WFEs that were not exhibiting the issue. We obtained this information from the PerfMon counter Web Service\Current Connections (http://technet.microsoft.com/en-us/library/cc782186(WS.10).aspx).
We ruled out the load balancer, the network, SQL Server... pretty much everything! The ULS logs didn’t help in this particular case either, so we decided to review the IIS logs. It was here that we discovered something alarming: each and every HTTP request that a user made was being authenticated! This is not the default behaviour of IIS and is generally an indication that the metabase property AuthPersistSingleRequest (http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/b0b4ec5c-74f8-43e9-ac64-d8b852568341.mspx?mfr=true) has been set to true.
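One way to spot this pattern without reading the logs by eye is to measure how many logged requests were answered with a 401. The following is a minimal Python sketch, assuming W3C extended format logs that include the sc-status field. The function name and the sample log lines are made up for illustration, and the assumption that each NTLM handshake shows up as two 401 entries followed by a 200 should be checked against your own logs.

```python
from collections import Counter


def auth_challenge_ratio(log_lines):
    """Return the fraction of logged requests that were answered with HTTP 401."""
    fields = []
    statuses = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The #Fields directive names the columns used by the lines that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#") or not fields:
            continue  # other directives, or data seen before any #Fields line
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed or truncated lines
        record = dict(zip(fields, values))
        statuses[record.get("sc-status", "?")] += 1
    total = sum(statuses.values())
    return statuses["401"] / total if total else 0.0


# Hypothetical sample: two page requests, each going through a full handshake.
SAMPLE_LOG = [
    "#Software: Microsoft Internet Information Services 6.0",
    "#Fields: date time cs-method cs-uri-stem cs-username c-ip sc-status",
    "2009-01-01 09:00:00 GET /default.aspx - 10.0.0.5 401",
    "2009-01-01 09:00:00 GET /default.aspx - 10.0.0.5 401",
    "2009-01-01 09:00:00 GET /default.aspx CONTOSO\\alice 10.0.0.5 200",
    "2009-01-01 09:00:01 GET /style.css - 10.0.0.5 401",
    "2009-01-01 09:00:01 GET /style.css - 10.0.0.5 401",
    "2009-01-01 09:00:01 GET /style.css CONTOSO\\alice 10.0.0.5 200",
]

if __name__ == "__main__":
    print(f"401 ratio: {auth_challenge_ratio(SAMPLE_LOG):.2f}")
```

On a server that persists authentication per connection, you would expect this ratio to be much lower, since only the first request on each connection is challenged.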
I thought we were onto a winner. Unfortunately, this property was not set, so IIS should have been exhibiting its default behaviour of persisting authentication for a user on the same connection.
After taking a closer look at IIS, we discovered that URLScan 2.5 was running on each of the WFEs. Here is an excerpt from http://technet.microsoft.com/en-us/security/cc242650 which provides an overview of the tool:
UrlScan version 2.5 is a security tool that restricts the types of HTTP requests that Internet Information Services (IIS) will process. By blocking specific HTTP requests, the UrlScan security tool helps prevent potentially harmful requests from reaching the server. UrlScan 2.5 will install on servers running IIS 4.0 and later.
As a test, we disabled URLScan on one of the WFEs, and this returned IIS to its default behaviour of not re-authenticating each and every request. Great! But what was the root cause? It turned out to be the URLScan option that removes the server header from responses (the RemoveServerHeader setting). Instead of disabling URLScan (it had been installed for a reason), we simply changed the configuration in URLScan.ini so that it no longer removes the server header from HTTP responses.
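If you need to check this setting across several WFEs, a small script can read it for you. This is a rough Python sketch rather than a definitive tool: it assumes the option lives in the [options] section as RemoveServerHeader, that 1 means the header is removed, and that a missing entry means 0. The sample file contents and the function name are invented for illustration, and the parser settings have only been tried against this sample, not a full production URLScan.ini.

```python
import configparser

# Illustrative fragment only; a real URLScan.ini contains many more entries.
SAMPLE_INI = """\
[options]
UseAllowVerbs=1
RemoveServerHeader=1           ; 1 strips the Server header from responses

[AllowVerbs]
GET
HEAD
POST
"""


def removes_server_header(ini_text):
    """Return True if the [options] section asks URLScan to strip the Server header."""
    parser = configparser.ConfigParser(
        allow_no_value=True,             # sections such as [AllowVerbs] list bare keys
        delimiters=("=",),               # treat only "=" as the key/value separator
        inline_comment_prefixes=(";",),  # allow trailing "; comment" text
        interpolation=None,              # do not treat "%" specially
        strict=False,                    # tolerate duplicate entries
    )
    parser.read_string(ini_text)
    value = parser.get("options", "RemoveServerHeader", fallback="0")
    return (value or "0").strip() == "1"


if __name__ == "__main__":
    print("Server header removed:", removes_server_header(SAMPLE_INI))
```

In our case the fix was to set the value to 0 and leave the rest of the URLScan configuration alone.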
Since this change was made, no further instances of the issue have occurred. My feeling is that the issue was ultimately caused by an authentication bottleneck on the WFE: it couldn’t keep up with end user requests because it was unable to authenticate users quickly enough. The underlying cause could be that the DCs the WFE was communicating with weren’t servicing authentication requests quickly enough, creating a backlog of requests, which would explain why Web Service\Current Connections was so high.
Chris Gideon has a fantastic post on the NTLM authentication process (http://sharepoint.microsoft.com/blogs/cgideon/Lists/Posts/Post.aspx?ID=2) which explains that, by default, a maximum of two concurrent NTLM authentication requests can be made. This can be amended by changing the MaxConcurrentAPI setting, which is another potential resolution to NTLM authentication bottlenecks.
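To get a feel for why a limit of two concurrent authentication requests can turn into a backlog, here is a back-of-the-envelope Python sketch. It is a deliberately simplified model, and all of the figures (time per authentication, arrival rate) are hypothetical rather than measurements from the farm described above.

```python
def max_auths_per_second(max_concurrent_api, seconds_per_auth):
    """Upper bound on authentications per second with a fixed number of slots."""
    return max_concurrent_api / seconds_per_auth


def backlog_after(seconds, arrivals_per_second, max_concurrent_api, seconds_per_auth):
    """Requests left waiting after `seconds` if arrivals outpace capacity."""
    capacity = max_auths_per_second(max_concurrent_api, seconds_per_auth)
    return max(0.0, (arrivals_per_second - capacity) * seconds)


if __name__ == "__main__":
    # Hypothetical figures: 2 slots, 0.25 s per authentication round trip to a DC,
    # and 30 requests per second that each need authenticating.
    print("capacity/s:", max_auths_per_second(2, 0.25))
    print("backlog after 10 s:", backlog_after(10, 30, 2, 0.25))
```

The point of the sketch is that when every HTTP request is re-authenticated, the arrival rate on the authentication path is the full request rate rather than roughly one per connection, so even a modest delay at the DCs can push a WFE past its capacity.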