Today is yet another T-SQL Tuesday – the monthly blogging party started by Adam Machanic (b|t) to get everyone in the SQL Server blogosphere together and to blog about a topic. Today’s party is hosted by none other than Matt Gordon (b|t), and the topic that he has chosen for us to blog about is “Fixing old problems with Shiny New Toys”. Specifically, he says:
What I’d like to see from the blog responses for this T-SQL Tuesday is how you’ve used a “new” Microsoft data platform toy to fix an old problem.
I’ve decided that I’m going to turn this around. Sometimes, those Shiny New Toys cause problems. Lets talk about one such case.
Recently, I was helping a client with an issue where when a job would run, the CPU on all processors would max out. This in turn would cause other performance issues on the server:
After a bit of investigation, I was able to determine that the statement causing this was a MERGE statement. The MERGE statement was a Shiny New Toy introduced in SQL 2008. This MERGE statement had both a WHEN MATCHED and a WHEN NOT MATCHED BY TARGET clause inside it.
What does this mean?
BOL defines the MERGE statement as:
Performs insert, update, or delete operations on a target table based on the results of a join with a source table.
This is performed with just one pass of the target table.
There are three possible operations that can be used within the MERGE statement. These are:
WHEN MATCHED – this allows you to perform an UPDATE or DELETE against the target table using the source table. There is a row existing in both tables for this match. If you were performing just the individual UPDATE or DELETE statement, you would use an INNER JOIN between the source and target table.
WHEN NOT MATCHED BY TARGET – this allows you to perform an INSERT of rows into the target table from rows in the source table that are not in the target table. If you were performing just the INSERT statement, you would drive it with a SELECT statement that performs a LEFT OUTER JOIN between the source and target tables, checking for a (not-null) column in the target table being NULL (signifying that there wasn’t a matching row).
WHEN NOT MATCHED BY SOURCE – this allows you to perform an UPDATE or DELETE against the target table when the target table has rows that the source table does not. If you were performing just the UPDATE or DELETE statement, you would use a RIGHT OUTER JOIN between the source and target tables… or how most people would write it, a LEFT OUTER JOIN between the target and source tables.
Keep in mind that my examples above are what you would do if you were separating the commands. Individually, these different conditions perform just fine with the MERGE statement. However, when the MERGE statement combines multiple match conditions, then think about what SQL has to do. See the chart at this link for when the different types of join conditions are used in the MERGE statement.
In this case, it’s doing both a WHEN MATCHED and a WHEN NOT MATCHED BY TARGET. In order to process the target table in just one pass, it will frequently end up performing worse. It can also end up using more resources in the process. I frequently find it better to just use the individual statements.
By separating this out into the respective statements, each can be optimized to perform better. Which is what I did… an UPDATE statement followed by an INSERT statement. Not only did the job run faster overall, but this change also stopped the CPU from maxing out:
Ah, much better.
There are other issues with the MERGE statement. It has bugs… some of which can cause database corruption.
Here we have a Shiny New Toy (feature), supposed to make life easier, yet it causes problems. Until it can perform better (and the bugs are eliminated), I just don’t use it.
Beware the Shiny New Toys.