Don’t DROP Temp Tables in SQL Stored Procs

I’ve seen BizTalk log this obscure error message after calling a stored procedure via the WCF-SQL adapter:

System.Data.SqlClient.SqlException: The current transaction cannot be committed and cannot support operations that write to the log file. Roll back the transaction.

I traced the error back to the use of DROP TABLE #tempTableName statements, particularly inside CATCH blocks.  Removing the DROP TABLE statements from the CATCH blocks surfaced the real, underlying SQL error messages.

It’s bad practice, and simply unnecessary, to explicitly drop temp tables in stored procs.  SQL Server caches and reuses temporary objects such as temp tables, so a DROP TABLE statement doesn’t always drop the object anyway.  It’s better to let SQL Server manage them by itself.

Here are some references on the topic:

“A local temporary table created in a stored procedure is dropped automatically when the stored procedure is finished.” (Reference)

 

“Dropping a temporary table in a procedure does not count as DDL, and neither does TRUNCATE TABLE, nor UPDATE STATISTICS.  None of these things prevent temporary table caching (so it does not matter whether you explicitly drop a temporary table at the end of a procedure or not).” (Reference)

 

Choosing a TFS 2010 Process Template for Scrum

I recently had to select a TFS 2010 process template to support a new development project using Scrum. Besides the obvious need for the template to work well for a Scrum project, I also wanted to keep things simple and flexible for my client. It was critical to choose a fully supported template with a future upgrade path for TFS v.Next.

Some of the considerations included:

  • How well does the template support Scrum?
  • How well rounded are the various TFS artifacts (work items, reports and SharePoint)?
  • Is there any extra tooling, documentation or other benefits?
  • How well supported is the template today and (best guess) into the future?

One of the great debates around TFS best practices is whether to create a separate team project for every development project, or whether to use a single team project to contain multiple dev projects. There are valid arguments for each option, which have largely been discussed elsewhere. To reduce the administrative overhead of managing many team projects and to provide consistency of the process template configuration, I decided that multiple dev projects in one team project was the right choice. If there are ever any custom modifications to the process template, it will only need to be done in one place.

That led to a few more questions:

  • Does any process template tooling assume that the team project contains only one dev project?
  • Do the reports included with the process template support multiple dev projects, or can they be easily modified? This applies to both SSRS and Excel reports.
  • Can the work item queries be easily modified to support multiple dev projects?

All of these requirements narrowed the field to three:

  1. MSF for Agile Software Development V5.0
  2. Visual Studio Scrum 1.0
  3. Scrum for Team System V3.0

I quickly ruled out Visual Studio Scrum due to concerns about its lack of maturity and support. My guess is that this template will morph into an out-of-the-box template as part of TFS v.Next. Today, however, it’s very basic. It has no Excel reports, very few SSRS reports, no document templates and only very basic SharePoint support. It has just seven basic work item types. On the project forum, I found a lot of questions and issues raised with no response from Microsoft, and there haven’t been any posts from Microsoft for months. I got the feeling that this template wasn’t going to receive any more attention until it becomes part of the TFS product (if it ever does).

That narrowed the field to two: MSF vs SfTS. SfTS is in its third release, now under the EMC brand, so it has been refined through real-world experience over many years. Support was a concern right away, because the SfTS forums are pretty dead and there is no official support program. The extra tooling is nice (TFS Workbench), the work item templates make sense and there’s a good selection of SSRS reports. There’s even a Windows service to do time rollups and so on. There are good feature lists and comparisons elsewhere, so I won’t spend time on that here.

In the past, SfTS hasn’t always had a clear upgrade path. There was little documentation about upgrading an SfTS project from TFS 2005 to 2008, for example. Today there’s a migration tool from V2 to V3, which is great. However, it’s anyone’s guess if and when the EMC employees who maintain the template will carry it on to TFS v.Next. That’s a definite concern.

One big sticking point with SfTS is that it has a built-in assumption that one team project holds a single dev project. If you use the project setup wizard in TFS Workbench, it will wipe out and replace the areas and iterations, etc. defined in the project. It also relies on a very specific iteration hierarchy which builds off of a release, not a product. There didn’t appear to be any good way to use this template with multiple dev projects in a single team project, and that’s what finally got it crossed off the list.

The choice was MSF for Agile Software Development V5.0, thanks to its clear support from Microsoft, robust process guidance documentation and document templates, extensive reporting, good SharePoint support, reasonable selection of work item types and ability to be tweaked to support multiple dev projects. Is it the best Scrum template? Maybe not, but I think it will work fine for our needs. We’re not purists about Scrum or any agile process, so we’ll take the good parts and tune it to work best with the client’s culture and the particular skills on the team.

Workaround for infinite “Please wait while the installer finishes determining your disk space requirements” dialog during MSI install

Tonight I was attempting to test the MSI installer for the Deployment Framework for BizTalk when I encountered an infinite dialog box stating “Please wait while the installer finishes determining your disk space requirements.”  This is by no means the first time that I’ve seen this occur, but usually it goes away after restarting the install once or twice.  Unfortunately, this time every single attempt at a GUI install was blocked by this issue.  (As documented elsewhere, using msiexec.exe to start an unattended install worked, but I needed to test the GUI.)

The environment was a virtual machine running Windows Server 2008 on Windows Virtual PC (Windows 7).  As others have reported, this problem appears to happen much more often, if not exclusively, on virtual machines.

I figured that since disk space calculation was behind the issue, the best approach was to eliminate as many disks as possible from my virtual machine until the problem (hopefully) disappeared.  I had one (virtual) floppy drive, one hard drive and one DVD drive.

Using Device Manager, I disabled the floppy controller.  The floppy drive disappeared, but the problem didn’t.  Next, I disabled ATA Channel 1 of the IDE ATA/ATAPI controllers, which controls the DVD drive.  The DVD drive disappeared – and so did the dialog box!  I did some experimentation with a few combinations of drive configurations, and the problem definitely followed the DVD drive.  Leaving the IDE controller enabled in the virtual machine, but setting the DVD drive option in Virtual PC to None, also seemed to work fine.  The common default setting that maps the virtual DVD drive to a physical DVD drive or to an ISO file caused the problem to appear.

So, bottom line, if you’re seeing this happen in either a VMware or Virtual PC VM, try disabling the IDE controller attached to the virtual DVD drive.

Fix for BizTalk ESB Toolkit 2.0 Error 115004 in ALL.Exceptions Send Port

On occasion we have had messages suspend on the ALL.Exceptions send port with the error:

Error 115004: An unexpected error occurred while attempting to retrieve the System.Exception object from the ESB Fault Message.

The source of the error is the pipeline component Microsoft.Practices.ESB.ExceptionHandling.Pipelines.ESBFaultProcessor, part of the ESB Toolkit’s custom pipeline on the ALL.Exceptions port.

Some suspicions and hunting through code using .NET Reflector led to an explanation.

The ExceptionMgmt.CreateFaultMessage() method, which is used to create a fault message in an orchestration exception handler, automatically locates and stores the exception object that was previously thrown.  It stores the exception by binary-serializing it, Base64 encoding it and storing it in a property on the first message part of the fault message.  Later on, the ESBFaultProcessor pipeline component attempts to de-serialize the exception.

The trouble arises when the thrown exception contains a non-serializable inner exception more than one level deep.  The method ExceptionMgmt.IsExceptionSerializable() only checks the root exception and the first InnerException.  If a non-serializable exception happens to be nested further, the code does not detect it.  As a result, the ESBFaultProcessor fails while attempting to de-serialize it.
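To illustrate the kind of check that would catch this case, here is a minimal sketch that walks the entire InnerException chain instead of stopping after the first level.  The class and method names are hypothetical and are not part of the ESB Toolkit.  It assumes that "serializable" means the type is marked [Serializable] and declares the (SerializationInfo, StreamingContext) constructor; an exception could still fail to serialize for other reasons, such as non-serializable field values.

```csharp
using System;
using System.Reflection;
using System.Runtime.Serialization;

public static class ExceptionChainChecker
{
    // Hypothetical helper, not part of the ESB Toolkit.
    // Returns true only if every exception in the InnerException chain
    // is marked [Serializable] and declares the serialization
    // constructor required for de-serialization.
    public static bool IsChainSerializable(Exception ex)
    {
        for (Exception current = ex; current != null;
             current = current.InnerException)
        {
            Type t = current.GetType();

            if (!t.IsSerializable)
            {
                return false;
            }

            // Constructors are not inherited, so this only finds a
            // constructor declared on the exception type itself.
            ConstructorInfo ctor = t.GetConstructor(
                BindingFlags.Instance | BindingFlags.Public |
                    BindingFlags.NonPublic,
                null,
                new Type[] { typeof(SerializationInfo),
                             typeof(StreamingContext) },
                null);

            if (ctor == null)
            {
                return false;
            }
        }

        return true;
    }
}
```

A check like this looks at every level of the chain, so an exception type without a serialization constructor is detected no matter how deeply it is nested.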

In our case, we are pulling a flat file down from a web service and disassembling it inside of an orchestration using the XLANGPipelineManager class.  If there is a problem with the file format, an XLANGPipelineManagerException is thrown.  It contains an InnerException of XmlException, which in turn contains an InnerException of Microsoft.BizTalk.ParsingEngine.AbortException – which has no serialization constructor.

To solve this issue, I wrote a short helper method in C#.  I call it immediately after ExceptionMgmt.CreateFaultMessage() and pass it the newly created fault message and the caught exception’s Message property value.  It checks whether the stored exception can be de-serialized, and if not, replaces the stored exception with a special exception class.  This is the same thing that would have happened had the IsExceptionSerializable() method correctly detected the situation.

I submitted this bug to Microsoft Connect.

To use this code, you’ll need a C# class library with references to:

  • Microsoft.Practices.ESB.ExceptionHandling
  • Microsoft.Practices.ESB.ExceptionHandling.Schemas.Faults
  • Microsoft.XLANGs.BaseTypes

For convenience, I added a couple of using alias directives at the top.

using System;
using Microsoft.XLANGs.BaseTypes;
using ExceptionHandling = Microsoft.Practices.ESB.ExceptionHandling;
using ExceptionHandlingSchemas = Microsoft.Practices.ESB.ExceptionHandling.Schemas.Property;

namespace BizTalkHelpers
{
    public static class OrchestrationHelper
    {
        /// <summary>
        /// Work around a bug in the BizTalk ESB Toolkit 2.0 related to
        /// non-serializable exceptions. When
        /// ExceptionMgmt.CreateFaultMessage() creates a message, it
        /// automatically locates and stores the caught exception. If the
        /// exception contains an InnerException more than one level deep
        /// that is not serializable, the ESBFaultProcessor pipeline
        /// component will later fail when it attempts to deserialize the
        /// exception, resulting in the error:
        /// Error 115004: An unexpected error occurred while attempting to
        /// retrieve the System.Exception object from the ESB Fault Message.
        /// </summary>
        /// <param name="msg">
        /// Message created by ExceptionMgmt.CreateFaultMessage()</param>
        /// <param name="exceptionMsg">
        /// Message property value of the caught exception</param>
        public static void FixNonSerializableExceptionInFaultMsg(
            XLANGMessage msg, string exceptionMsg)
        {
            // Incoming msg must have been created by
            // ExceptionMgmt.CreateFaultMessage()
            XLANGPart p = msg[0];

            if (p == null)
            {
              return;
            }

            // Extract the Base64-encoded string representation of the
            // exception serialized by CreateFaultMessage().
            string str =
              p.GetPartProperty(
                typeof(ExceptionHandlingSchemas.SystemException)) as string;

            if (str == null)
            {
              return;
            }

            try
            {
              ExceptionHandling.Formatter.DeserializeObject<Exception>(str);
            }
            catch (Exception)
            {
              // If an exception is not serializable, the correct behavior
              // is to store a serialized instance of
              // SetExceptionNonSerializableException.
              ExceptionHandling.SetExceptionNonSerializableException ex =
                new ExceptionHandling.SetExceptionNonSerializableException(
                  0x1c13e, new object[] { exceptionMsg });

              p.SetPartProperty(
                typeof(ExceptionHandlingSchemas.SystemException),
                ExceptionHandling.Formatter.SerializeObject<
                ExceptionHandling.SetExceptionNonSerializableException>(ex));
            }
        }
    }
}

BizTalk Adapter for DB2 Error SQLSTATE: HY000, SQLCODE: -270

My client had been using the BizTalk Adapter for DB2 from the BizTalk Adapters for Host Systems for quite some time with no significant issues.  The target system was DB2 on an AS/400 (iSeries/System i).  Late last week, the AS/400’s i5/OS was upgraded from V5R4 to V6R1 and the DB2 team did a database restore.

Somewhere around that time, BizTalk started logging the following errors:

An internal network library error has occurred. The requested command encountered an implementation-specific error condition on the target system. SQLSTATE: HY000, SQLCODE: -270

Searches for this error turned up only one post on a Host Integration Server newsgroup, which didn’t give us any answers.

I started out by looking up the SQLCODE -270 in the IBM SQL Messages and Codes book.  That description had to do with unique indexes or constraints on distributed tables, which didn’t seem relevant to our situation.

I found the actual meaning of -270 in DB2OLEDB.H on the Host Integration Server 2006 CD (MSI\x86\PFiles\SDK\Include).  It’s defined there as DB2OLEDB_DDM_CMDCHKRM, and CMDCHKRM means “command check reply message.”  The problem is that the error code(s) contained in the reply message are not surfaced, so this was still a dead end.

The Microsoft OLE DB Provider for DB2 is the foundation for the DB2 Adapter, so in order to rule out BizTalk and the DB2 Adapter as possible problems, we created a five-line C# command-line app:

OleDbConnection cn =
    new OleDbConnection("Provider=DB2OLEDB;REST_OF_CONNECTION_STRING_HERE");
cn.Open();
OleDbCommand cmd =
    new OleDbCommand("SELECT COUNT(*) FROM A-TABLE-IN-DB2", cn);
int rc = Convert.ToInt32(cmd.ExecuteScalar());
Console.WriteLine("Rows in table: " + rc.ToString());

Sure enough, the test app encountered the same error.  The problem was definitely in the OLE DB Provider for DB2.

The OLE DB Provider requires various “packages” (a DB2 concept) to exist in the DB2 system.  The packages correspond to various transaction isolation levels, so they are named READ UNCOMMITTED, REPEATABLE READ, etc.  We did not enable transactions in the DB2 connection string, nor did we configure isolation level in BizTalk, so we still don’t know which isolation level (and thus which package) is being used.  SERIALIZABLE is a good guess since it is often the BizTalk default.

When a connection is opened, if the DB2 Provider finds that the package associated with the active isolation level does not exist, it is supposed to automatically create it.  The active user account must have sufficient rights in DB2.  If that is not an option, then the Data Access Tool can be used to manually create the packages (the Packages button on the next-to-last page of the New Data Source wizard).

In our case, the user account should have had enough permissions to automatically create a package, but evidently that process failed and resulted in the obscure SQLSTATE: HY000, SQLCODE: -270 error.  As soon as I manually created the packages in the Data Access Tool, the error disappeared and everything began working normally again!

This page of the OLE DB Provider for DB2 documentation is an excellent resource for understanding the DB2 packages, the auto-create process, various error messages that may result and more.

An Optimization for the BizTalk ESB Toolkit 2.0 Portal Faults Page

While debugging the issues described in my previous post, I looked at how the ESB.Exceptions.Service’s GetFaults() method was implemented.  In our case, we had stack traces inside the Description text, so the size of the data returned for each fault was quite large.  Multiplied by thousands of faults, this is why we overflowed the default setting for maxItemsInObjectGraph.

However, this raised an important question: why was the value for Description (and many other fields) being returned from the service when the web pages never show it?

The answer?  The service’s GetFaults() method returns every column from the Fault table, including potentially large values like ExceptionStackTrace, ExceptionMessage and Description.  These fields are never used by the ESB Portal, so this behavior only serves to slow down the queries and cause issues like that described in my last post!

I modified the GetFaults() method’s LINQ query to select only the columns used in the portal:

select new
{
    f.Application,
    f.DateTime,
    f.ErrorType,
    f.FailureCategory,
    f.FaultCode,
    f.FaultGenerator,
    f.FaultID,
    f.FaultSeverity,
    f.InsertMessagesFlag,
    f.MachineName,
    f.Scope,
    f.ServiceName
};

And then created the actual Fault objects just before returning from the method:

List<Fault> faults = new List<Fault>();

foreach (var fault in result)
{
    Fault f = new Fault()
    {
        Application = fault.Application,
        DateTime = fault.DateTime,
        ErrorType = fault.ErrorType,
        FailureCategory = fault.FailureCategory,
        FaultCode = fault.FaultCode,
        FaultGenerator = fault.FaultGenerator,
        FaultID = fault.FaultID,
        FaultSeverity = fault.FaultSeverity,
        InsertMessagesFlag = fault.InsertMessagesFlag,
        MachineName = fault.MachineName,
        Scope = fault.Scope,
        ServiceName = fault.ServiceName
    };
    
    faults.Add(f);
}

return faults;

This avoids the expense of SQL Server selecting many large data values that are never used, and can greatly reduce the amount of data that must be serialized and de-serialized across the service boundary.

This change provided a very noticeable boost in performance on the Faults page when searching, filtering and moving between pages.

BizTalk ESB Toolkit 2.0 Portal Timeouts and (401) Unauthorized Errors

The Problem

During application testing in our recently-built test and newly-built production BizTalk 2009 environments, we started having problems with the ESB Portal throwing a System.TimeoutException or a (401) Unauthorized error.  This was happening with increasing frequency on the portal home page and the Faults page.  On the home page, the problem seemed to be localized to the Faults pane.

When we saw the (401) Unauthorized errors, they contained a detail message like this:

MessageSecurityException: The HTTP request is unauthorized with client authentication scheme ‘Negotiate’. The authentication header received from the server was ‘Negotiate,NTLM’.

De-selecting some of the BizTalk applications in My Settings seemed to decrease but not eliminate the problem.  We had already checked and re-checked virtual directory authentication and application pool settings, etc.  Needless to say, everyone was tired of being unable to reliably view faults through the portal.

Debugging

A couple of issues complicated the debugging process, both related to the portal pulling fault data from a web service – specifically the ESB.Exceptions.Service.

First, the ESB.Exceptions.Service uses the webHttp (in other words, REST) binding introduced in .NET 3.5.  REST is fine for certain applications, but it also lacks many features of SOAP.  The one that stands out in particular here is REST’s lack of a fault communication protocol.  SOAP has a well-defined structure and protocol for faults, so from the client side it’s easy to identify and obtain information about a service call failure.  With REST, you’ll probably end up with a 400 Bad Request error and you’re on your own to guess as to what happened.

In other words, one can’t really trust the error messages arising from calls to the ESB.Exceptions.Service.

Second, the ESB.Exceptions.Service does not have built-in exception logging.  [In another post I’ll have a simple solution for that.]  Combined with REST’s lack of a fault protocol, any exception that occurs inside the service is essentially lost and obscured.

One of our first debugging steps was to run SQL Profiler on the EsbExceptionDb and see which queries were taking so long.  To our great surprise, when we refreshed the Faults page in the portal we saw in Profiler the same query running over and over, dozens or hundreds of times!

Fortunately, I was able to obtain permissions to our test EsbExceptionDb, which had over 10,000 faults in it, and run the portal and WCF services on my development machine.  Sure enough, I kept hitting a breakpoint inside the ESB.Exceptions.Service GetFaults() method over and over until the client timed out.  However, there were no loops in the code to explain that behavior!

Next, I turned on full WCF diagnostics for the ESB.Exceptions.Service, including message logging, using the WCF Service Configuration Editor.  Using the Service Trace Viewer tool, I indeed saw the same service call happening again and again – but the trace also captured an error at the end of each call cycle.

The error was a failure serializing the service method’s response back to XML.  The service call was actually completing successfully (which I had also observed in the debugger).  Once WCF took control again to send the response back to the client, it failed.  Instead of just dying, it continuously re-executed the service method!  This could be a bug in WCF 3.5 SP1.

Problem Solved

The solution to the WCF re-execution problem was increasing the maxItemsInObjectGraph setting.  On the service side, I did this by opening ESB.Exceptions.Service’s web.config, locating the <serviceBehaviors> section, and adding the line <dataContractSerializer maxItemsInObjectGraph="2147483647" /> to the existing “ExceptionServiceBehavior” behavior.

With that simple configuration change, the service call now returned promptly and the portal displayed a matching error about being unable to de-serialize the data.  As with the service, I needed to increase the maxItemsInObjectGraph setting.  I opened the portal’s web.config, located the <endpointBehaviors> section, and added the line <dataContractSerializer maxItemsInObjectGraph="2147483647" /> to the existing “bamservice” behavior.  The error message didn’t change!  I eventually discovered that the <dataContractSerializer> element must be placed before the <webHttp /> element.
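For reference, the two configuration changes would look roughly like the fragments below.  This is a sketch based only on the description above: the behavior names come from the Toolkit’s config files, while any other elements already present in each behavior are not shown and should be left unchanged.

```xml
<!-- ESB.Exceptions.Service web.config -->
<serviceBehaviors>
  <behavior name="ExceptionServiceBehavior">
    <!-- existing elements stay as they are -->
    <dataContractSerializer maxItemsInObjectGraph="2147483647" />
  </behavior>
</serviceBehaviors>

<!-- ESB Portal web.config -->
<endpointBehaviors>
  <behavior name="bamservice">
    <!-- must appear before <webHttp /> -->
    <dataContractSerializer maxItemsInObjectGraph="2147483647" />
    <webHttp />
  </behavior>
</endpointBehaviors>
```

The value 2147483647 is simply the maximum for a 32-bit integer; a smaller limit that still covers your largest result set should work as well.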

The portal now displayed the home page and Faults page properly, and the timeout and unauthorized errors disappeared.

Race Condition in BizTalk ESB Toolkit 2.0 Exception Notification Service

We are using the ESB Exception Notification (aka ESB.AlertService) Windows service in conjunction with the ESB Portal website.  On occasion, we have a problem where the service indefinitely sends out duplicate emails for the same alert.  In the server’s Application Event Log, we see the error: “An exception of type ‘System.Data.StrongTypingException’ occurred and was caught.”  The log entry also includes “The value for column ‘To’ in table ‘AlertEmail’ is DBNull.”

We are allowing the service to pull user email addresses from Active Directory by configuring the LDAP Root under Alert Queue Options to LDAP://DC=company, DC=com.  With Active Directory you don’t need to specify a server name in your LDAP path.  Just point to the domain itself and Windows will figure out which domain controller to contact.

The vast majority of the rows in AlertEmail contain the correct email address in the To column, but every once in a while there is a NULL.  Looking at the service code (QueueGenerator.cs), we can see that the email address in CustomEmail is always used first, if one was provided when the alert subscription was created.  We do not set this value, so the code next attempts to pull the email address from Active Directory using the GetEmailAddress() method (ActiveDirectoryHelper.cs).

In order to reduce the number of AD queries, the code caches email addresses using the Enterprise Library caching block.  The cached entries expire after a configurable interval, which defaults to 1 minute.  If the username is already in the cache, then the corresponding email address is returned.  Otherwise, the code looks up the username in AD, grabs the email address and caches it.  The lookup code throws an exception if it doesn’t get back a valid email address, so it doesn’t explain how we got a NULL email address.

The problematic code is the cache lookup:

if (CacheMgr.Contains(name))
{
  Trace.WriteLine("Reusing email address for " + name + " from cache.");
  return (string)CacheMgr.GetData(name);
}

This is a classic race condition.  The code checks to see if the username is in the cache, then runs a Trace.WriteLine(), then asks for the cached data associated with the username.  In the time between the Contains() and the GetData() calls, the cached data can expire and drop out of the cache, in which case GetData() will return null.  Most of the time it gets lucky and the data is still cached.  This probably explains how we sometimes get NULL values in the database.

The proper code is simple because GetData() simply returns null when the requested data is not in the cache:

string cachedEmail = (string)CacheMgr.GetData(name);

if (!string.IsNullOrEmpty(cachedEmail))
{
    Trace.WriteLine("Reusing email address for " + name + " from cache.");
    return cachedEmail;
}

The new version of the code eliminates the race condition and should prevent us from ever seeing NULL values in the database.

I also created a bug report on Microsoft Connect.

Fix for BizTalk ESB Toolkit 2.0 Portal Message Viewer Error About BizTalkMsgBoxDb.dbo.ProcessHeartbeats

When we recently configured the ESB Portal website, we encountered a number of permissions-related issues.  Our initial experience was the same as that of many others who have discovered that the Portal’s included permissions script is inadequate.  Once we granted additional permissions to the existing database roles the permission errors cleared up – but we couldn’t overcome one last error: Invalid object name ‘BizTalkMsgBoxDb.dbo.ProcessHeartbeats’.

As most of you know, Microsoft decided not to ship the source code for the ESB Toolkit 2.0 aside from the Management Portal “sample”.  In order to diagnose this error, I pulled out Red Gate’s .NET Reflector and started digging through disassembled code.  The source of this particular issue lies in the ESB.BizTalkOperationsService.

In our environment, as in most high-performance BizTalk installations, the message box database is on a different SQL Server instance than the other BizTalk databases.  In a great oversight, the BizTalkOperationsService was hard-coded to expect the message box database to be present on the same server as the management database.  The operations service attempts to run this SQL query on the server that holds the management database: SELECT 1 FROM BizTalkMsgBoxDb.dbo.ProcessHeartbeats with (nolock) where uidProcessID='{0}'.

You’ll note another potential issue here: the message box database name is hard-coded in the query.  That has also caused trouble for people.

To solve this problem, I first used .NET Reflector to re-create Visual Studio 2008 projects for the ESB.BizTalkOperationsService ASMX web service and Microsoft.Practices.ESB.BizTalkOperations.dll class library.  Once the projects were cleaned up and building successfully, I modified the code to query the management database for the primary message box database name and server using the existing stored procedure adm_MessageBox_Enum.  With that information in hand, I updated the code to create a connection string to the message box database and execute the ProcessHeartbeats query there.  I also removed the hard-coded database name.
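The following is a rough sketch of the second half of that change, not the code from the download.  It assumes the message box server and database names have already been read from the management database, that Windows authentication is used, and that uidProcessID is a uniqueidentifier column.  The class and method names are hypothetical.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

public static class MessageBoxHeartbeat
{
    // Hypothetical helper: serverName and databaseName are assumed to
    // have been looked up in the management database beforehand.
    public static bool HasHeartbeat(
        string serverName, string databaseName, Guid processId)
    {
        SqlConnectionStringBuilder builder =
            new SqlConnectionStringBuilder();
        builder.DataSource = serverName;
        builder.InitialCatalog = databaseName;
        builder.IntegratedSecurity = true; // assumption: Windows auth

        // No hard-coded database name: the connection itself targets
        // the message box database.
        const string sql =
            "SELECT 1 FROM dbo.ProcessHeartbeats WITH (NOLOCK) " +
            "WHERE uidProcessID = @processId";

        using (SqlConnection cn =
            new SqlConnection(builder.ConnectionString))
        using (SqlCommand cmd = new SqlCommand(sql, cn))
        {
            cmd.Parameters.Add("@processId",
                SqlDbType.UniqueIdentifier).Value = processId;
            cn.Open();

            // ExecuteScalar() returns null when no row matches.
            return cmd.ExecuteScalar() != null;
        }
    }
}
```

Using a parameter instead of string formatting for the process ID is a design choice of this sketch rather than something taken from the original service code.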

I tested my version of the BizTalkOperationsService using the ESB.BizTalkOperations.Test.Client included with the Toolkit source code and verified that everything still worked as expected.

Since this was a fairly time-consuming issue to fix and it is a problem that should affect a good percentage of the installations out there, I decided to post my updated service and source code (download link at the end of this post).  I cannot make any guarantees about the correctness of the code, so consider it as-is and use at your own risk.  (That said, I believe that it works just fine.)

Let’s hope that Microsoft reconsiders its unfortunate decision not to ship source code.

ESB.BizTalkOperationsService.zip

Performance Tips for the WCF SQL Adapter for BizTalk Server

I’ve recently experienced (and largely solved) some serious performance issues with the BizTalk WCF Adapter for SQL Server (aka WCF-SQL).  This post describes the problem and the solution that I discovered.

The BizTalk application in question has a fairly simple data flow:

  1. Receive a file containing multiple data records (i.e. an interchange) in XML format
  2. Use the standard XML Disassembler pipeline component to split the interchange into multiple messages
  3. Assign a static ESB Toolkit 2.0 itinerary to each message (still in the pipeline)
  4. Execute the itinerary as follows:
       a. Map to canonical format (itinerary step 1 – messaging)
       b. Execute a custom orchestration “service” to send the message to a SQL Server stored procedure (itinerary step 2 – orchestration)
       c. Route the message to an off-ramp

In this application I was doing things “the ESB Toolkit way” so everything was fairly dynamic.  The maps were identified and executed on the fly and the stored procedure was called through a dynamic one-way port configured on the fly by an ESB resolver.  If you’re not using the ESB Toolkit, keep reading – these tips still apply to you.

There’s really not much to the application.  A batch of records comes in, gets split up into individual messages, and each message gets sent to a stored procedure in SQL Server using the WCF SQL adapter.  Except the performance was terrible.  On my (not-so-quick) machine, a batch containing 100 records was taking over one minute to process!

I ruled out stored procedure performance as a factor by simply changing it to immediately return without doing any work.  Surprisingly, that barely increased the speed (a few seconds at most) even though the stored procedure call now returned instantly.

I discovered a couple of things with SQL Profiler that led to the solution.

First, we were sending an XML message to the stored procedure, so the parameter was typed as ‘xml’ (the SQL Server XML data type).  However, BizTalk can’t send messages to SQL Server in that format.  It always sends them as a Unicode string.  SQL Server (or the .NET SQL client that underlies the adapter) was automatically inserting a CONVERT() on each call to turn the Unicode string into a variable of type ‘xml’, then executing the stored procedure.  To avoid this “magic” conversion, we converted the stored proc parameter to NVARCHAR(MAX) and added a CONVERT() inside the stored proc.  That moved the CONVERT() into the stored proc where SQL Server could pre-compile, optimize and cache it along with the rest of the code.

Always type your stored procedure parameter(s) as NVARCHAR(MAX) when sending an XML message to SQL Server.

Second, the major performance loss was related to the fact that this adapter is based on WCF and the fact that we were using a dynamic send port.  I realized that for every call to the stored procedure, there was also a second dynamic SQL call to obtain metadata about the stored proc’s parameters.  This was effectively doubling the number of calls to SQL Server, and running a relatively slow query to boot.

For those of you who have worked with WCF, hopefully you know that creating WCF proxy clients is a relatively expensive operation.  It is always best to cache proxy objects or at least a ChannelFactory, or take advantage of the built-in caching added in .NET 3.0 SP1.  Details on all of that are here.  The important thing is that if BizTalk is not able to cache the WCF proxy objects that it uses to talk to the WCF SQL adapter, then performance is definitely going to be bad.

That’s where the dynamic port comes in.  Since the port is dynamically configured on every call to SQL Server, the proxy objects are not cached.  This explained a lot!  On every call we were taking a hit from creating and setting up a WCF proxy object, then taking a second hit because the WCF adapter has to obtain metadata about the stored procedure before it calls it.

Avoid dynamic ports with the WCF adapters, and in particular the WCF SQL adapter, in favor of static ports with a dynamic Action.

The solution in my case was to create a static WCF-Custom port configured for the SQL adapter, leaving the Action setting blank (because we call multiple stored procedures).  Instead of fully configuring the port on the fly, I now dynamically configure only the Action property.  This produced a 45-50% increase in performance.

The end result of these changes was that processing 100 messages went from over 65 seconds to about 20 seconds.

The final tip is only relevant when you are using a fully dynamic send port with any of the WCF adapters on BizTalk 2009 and is described in this post.  Here’s another post on how to do it with the ESB Toolkit.  Performance can be modestly improved by explicitly setting the EnableTransaction and IsolationLevel context properties.  In my fully dynamic scenario, this improved performance by about 25%.  I am not clear how these settings interact with the SQL binding’s own useAmbientTransaction property.

When using dynamic ports with the BizTalk 2009 (only) WCF adapters, set the EnableTransaction and IsolationLevel context properties.

Our application is now performing at the speed that we expected, and hopefully these tips will give your own apps a nice speed boost too.
