Thursday, 15 August 2013

Performance Boost for XML Stored Procedures

Sometimes SQL Server makes you want to pull your hair out. If you are using XML types in your stored procedures and seeing very poor performance, I may have an answer for you in this post.

XML is working, but maybe a little too hard

Today I was looking at the performance profile of one of our applications and, after a little digging, I noticed that a stored procedure was performing rather poorly. We pass a lot of parameters to some of our stored procedures, and when we do this we use XML. This technique allows you to build parameters using LINQ XElement types and, with LINQ to SQL, simply pass the XElement as a parameter to the stored procedure. It’s a great technique, so it’s even more disappointing to see it perform so badly. And what’s worse, it seems it’s the simple operation of fetching out the data using nodes() that causes all the problems.

To illustrate this, take a look at the following XML:

<Transaction>
<Parameter01>1234</Parameter01>
<Parameter02>14.99</Parameter02>
<Parameter03>2012-01-31T00:00:00Z</Parameter03>
<Parameter04>0</Parameter04>
<Parameter05>14.99</Parameter05>
<Parameter06>false</Parameter06>
<Parameter07>23</Parameter07>
<Parameter08>14.99</Parameter08>
<Parameter09>2012-01-31T21:51:35.0498047Z</Parameter09>
<Parameter10>0</Parameter10>
</Transaction>

Ok, not very exciting, but you get the idea. Here I have 10 parameters in my XML. And here is a simple stored procedure that is passed the XML and fetches out the parameters:

CREATE PROCEDURE usp_XMLTest
    @doc XML
AS
BEGIN
    SET NOCOUNT ON;

    -- One local variable per element in the <Transaction> document
    DECLARE
        @Param01 INT, @Param02 MONEY, @Param03 DATETIME, @Param04 INT, @Param05 MONEY,
        @Param06 BIT, @Param07 INT, @Param08 MONEY, @Param09 DATETIME, @Param10 INT

    -- Each value is fetched by shredding the matching element with nodes()
    SET @Param01 = (SELECT doc.col.value('.', 'INT') FROM @doc.nodes('/Transaction/Parameter01') AS doc(col))
    SET @Param02 = (SELECT doc.col.value('.', 'MONEY') FROM @doc.nodes('/Transaction/Parameter02') AS doc(col))
    SET @Param03 = (SELECT doc.col.value('.', 'DATETIME') FROM @doc.nodes('/Transaction/Parameter03') AS doc(col))
    SET @Param04 = (SELECT doc.col.value('.', 'INT') FROM @doc.nodes('/Transaction/Parameter04') AS doc(col))
    SET @Param05 = (SELECT doc.col.value('.', 'MONEY') FROM @doc.nodes('/Transaction/Parameter05') AS doc(col))
    SET @Param06 = (SELECT doc.col.value('.', 'BIT') FROM @doc.nodes('/Transaction/Parameter06') AS doc(col))
    SET @Param07 = (SELECT doc.col.value('.', 'INT') FROM @doc.nodes('/Transaction/Parameter07') AS doc(col))
    SET @Param08 = (SELECT doc.col.value('.', 'MONEY') FROM @doc.nodes('/Transaction/Parameter08') AS doc(col))
    SET @Param09 = (SELECT doc.col.value('.', 'DATETIME') FROM @doc.nodes('/Transaction/Parameter09') AS doc(col))
    SET @Param10 = (SELECT doc.col.value('.', 'INT') FROM @doc.nodes('/Transaction/Parameter10') AS doc(col))
END

Of course, in the real world the parameters would have meaningful names and you’d want to do more than just set a variable with values from the XML, but even in this simple example the problem rears its ugly head.

Testing the stored procedure

I’ve put together a console application to gather average timings for 100 calls to the stored procedure. I’m using XElement to represent the data and LINQ to SQL to call the stored procedure.
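The console application itself isn't shown here. As a rough stand-in, the same 100 calls can be timed from a query window. This is only a sketch and not the harness used for the numbers in this post: it assumes SQL Server 2008 or later (for inline DECLARE initialisation and +=), and it measures server-side time only, with no client round trip, so its figures are not directly comparable.

```sql
-- Hypothetical server-side timing loop; not the original console application.
DECLARE @doc XML = N'<Transaction>
  <Parameter01>1234</Parameter01><Parameter02>14.99</Parameter02>
  <Parameter03>2012-01-31T00:00:00Z</Parameter03><Parameter04>0</Parameter04>
  <Parameter05>14.99</Parameter05><Parameter06>false</Parameter06>
  <Parameter07>23</Parameter07><Parameter08>14.99</Parameter08>
  <Parameter09>2012-01-31T21:51:35.0498047Z</Parameter09><Parameter10>0</Parameter10>
</Transaction>';

DECLARE @i INT = 0,
        @start DATETIME2 = SYSDATETIME();

-- Call the procedure 100 times with the same document
WHILE @i < 100
BEGIN
    EXEC usp_XMLTest @doc = @doc;
    SET @i += 1;
END

-- Total and average elapsed time in milliseconds
DECLARE @totalMs INT = DATEDIFF(MILLISECOND, @start, SYSDATETIME());
SELECT @totalMs AS TotalExecutionMs,
       @totalMs / 100.0 AS AverageExecutionMs;
```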

Ok, time to light the fuse and see what develops….

The Results

The results are very unexpected and a big cause for concern:

  • TotalExecution = 6573ms
  • AverageExecution = 65ms

The Explanation

Well, that’s going to cause any program to run like a dog. So why so slow?

After some digging into MSDN articles I found http://msdn.microsoft.com/en-US/library/ms345118. This is ‘Performance Optimizations for the XML Data Type in SQL Server 2005’, and deep in the article it discusses problems with untyped XML, which is the root of the problem here. With untyped, element-based XML, it appears that selecting a node selects the value of the node and all the children under that node. Even if there are no children, as in the example above, the operation is costly because there is no XML index to work with, so the XML is parsed to match the XQuery expression.

However, there are ways to improve on this situation:

Solution 1 – Use attributes, not elements
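The listing for this solution isn't reproduced here, so what follows is a minimal sketch of the idea rather than the exact code. It assumes the same values are carried as attributes on the Transaction element and reads them with the @ axis. The procedure name usp_XMLTestAttributes and the three-parameter subset are assumptions for illustration only.

```sql
-- Hypothetical attribute-based variant of usp_XMLTest (the name is illustrative).
-- Expects XML shaped like:
--   <Transaction Parameter01="1234" Parameter02="14.99" Parameter06="false" />
CREATE PROCEDURE usp_XMLTestAttributes
    @doc XML
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @Param01 INT, @Param02 MONEY, @Param06 BIT;

    -- An attribute holds a single value, so there is no subtree of children to consider.
    SELECT
        @Param01 = doc.col.value('@Parameter01', 'INT'),
        @Param02 = doc.col.value('@Parameter02', 'MONEY'),
        @Param06 = doc.col.value('@Parameter06', 'BIT')
    FROM @doc.nodes('/Transaction') AS doc(col);
END
```

On the calling side the only change would be building the XElement with attributes instead of child elements. Whether the gain is worth a schema change is something to measure against your own data.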

Solution 2 – Use the text() accessor

If you can’t use attributes because you don’t own the schema of the XML, use the text() accessor to get the node as text and stop any further parsing of the XML. To achieve this, alter the .nodes() expression to include the accessor, for example /Transaction/Parameter01/text().
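As a minimal sketch of that change, here are the first two assignments from usp_XMLTest with text() appended to each path. It runs as a standalone batch and assumes SQL Server 2008 or later for the inline DECLARE initialisation. The cut-down two-element document is for illustration only.

```sql
-- Sketch: same pattern as usp_XMLTest, with text() appended to each path.
DECLARE @doc XML = N'<Transaction>
  <Parameter01>1234</Parameter01>
  <Parameter02>14.99</Parameter02>
</Transaction>';

DECLARE @Param01 INT, @Param02 MONEY;

-- nodes() now returns the text node itself rather than the whole element
SET @Param01 = (SELECT doc.col.value('.', 'INT')   FROM @doc.nodes('/Transaction/Parameter01/text()') AS doc(col));
SET @Param02 = (SELECT doc.col.value('.', 'MONEY') FROM @doc.nodes('/Transaction/Parameter02/text()') AS doc(col));

SELECT @Param01 AS Param01, @Param02 AS Param02;
```

A related form that skips nodes() altogether is @doc.value('(/Transaction/Parameter01/text())[1]', 'INT'). I haven't timed it for this post, so treat it as an option to test, not as a recommendation.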

Conclusion

XML in SQL is powerful and extremely useful. But you need to check your implementation for performance. I’ve touched on one pitfall but be warned, there are more out there.

Tip of the day - where possible, let SQL Server know as much as possible about your XML.
