Pages

Wednesday, April 16, 2014

MERGE Statement

One of the common tasks in database management is the maintenance of lookup tables.  Sample lookup tables are Currency Codes , Country Codes, U.S. State Codes, Product Types and so on.  Maintenance of these look-up tables are usually done using stored procedures where the normal task is to add the new record if it does not exist or update the existing record if it already exists based on the identified primary key of the table.  This involves 2 statements, issuing either an INSERT statement or an UPDATE statement depending on the existence of the record.
SQL Server 2008 introduces a new statement called MERGE statement which combines these functionalities into just a single statement.  The MERGE statement performs an INSERT, UPDATE or DELETE operation on a target table based on the results of a join with source table.
To illustrate how the MERGE statement is used, let's look at the process of maintaining a table using a stored procedure that performs either an INSERT statement if the record does not exist or an UPDATE statement if the record already exists in the target table.  Prior to SQL Server 2008, your stored procedure may look like the following:
CREATE TABLE [dbo].[Employee] (
    [EmployeeNumber]        VARCHAR(10),
    [FirstName]             VARCHAR(50),
    [LastName]              VARCHAR(50),
    [Position]              VARCHAR(50)
)
GO

CREATE PROCEDURE [dbo].[usp_ProcessEmployee]
    @EmployeeNumber         VARCHAR(10),
    @FirstName              VARCHAR(50),
    @LastName               VARCHAR(50),
    @Position               VARCHAR(50)
AS

IF NOT EXISTS (SELECT 'X' FROM [dbo].[Employee]
               WHERE [EmployeeNumber] = @EmployeeNumber)
    INSERT INTO [dbo].[Employee] ( [EmployeeNumber], [FirstName], [LastName], [Position] )
    VALUES ( @EmployeeNumber, @FirstName, @LastName, @Position )
ELSE
    UPDATE [dbo].[Employee]
    SET [FirstName] = @FirstName,
        [LastName]  = @LastName,
        [Position]  = @Position
    WHERE [EmployeeNumber] = @EmployeeNumber
GO

EXECUTE [dbo].[usp_ProcessEmployee] 'ABC123', 'John', 'Smith', 'Vice President'

SELECT * FROM [dbo].[Employee]
GO

EmployeeNumber   FirstName    LastName    Position
---------------  -----------  ----------  ------------
ABC123           John         Smith       Vice President
The MERGE Statement Way
With the new MERGE statement, the stored procedure above will now look as follows:
CREATE PROCEDURE [dbo].[usp_MergeEmployee]
    @EmployeeNumber         VARCHAR(10),
    @FirstName              VARCHAR(50),
    @LastName               VARCHAR(50),
    @Position               VARCHAR(50)
AS

MERGE [dbo].[Employee] AS [Target]
USING (SELECT @EmployeeNumber, @FirstName, @LastName, @Position)
   AS [Source] ( [EmployeeNumber], [FirstName], [LastName], [Position] )
ON [Target].[EmployeeNumber] = [Source].[EmployeeNumber]
WHEN MATCHED THEN
    UPDATE SET [FirstName] = [Source][FirstName],
               [LastName]  = [Source].[LastName],
               [Position]  = [Source].[Position]
WHEN NOT MATCHED THEN
    INSERT ( [EmployeeNumber], [FirstName], [LastName], [Position] )
    VALUES ( [Source].[EmployeeNumber], [Source].[FirstName], 
             [Source].[LastName], [Source].[Position] );
GO
Here are the results when calling the stored procedure containing the MERGE statement:
DELETE FROM [dbo].[Employee]

EXECUTE [dbo].[usp_MergeEmployee] 'ABC123', 'John', 'Smith', 'Vice President'
SELECT * FROM [dbo].[Employee]
GO

EmployeeNumber   FirstName    LastName    Position
---------------  -----------  ----------  ------------
ABC123           John         Smith       Vice President



EXECUTE [dbo].[usp_MergeEmployee] 'ABC123', 'John', 'Smith', 'President'
SELECT * FROM [dbo].[Employee]
GO

EmployeeNumber   FirstName    LastName    Position
---------------  -----------  ----------  ------------
ABC123           John         Smith       President

Dissecting the MERGE Statement
Now let's look at the syntax of the MERGE statement.  The basic syntax of the first part of the MERGE statement is as follows:
MERGE INTO <target_table> AS <alias>
USING <table_source>
   ON <merge_search_condition>
This syntax looks totally different from the SELECT, UPDATE, INSERT or DELETE statement that you may have been accustomed to.  The first line specifies the target table where the succeeding INSERT, UPDATE or DELETE statement will be performed.  The "AS <alias>" part of the syntax is optional.  The INTO clause also is optional.
The source table to which to get the records to merge with the target table is specified after the USING clause.  The source table can be a table or view, a function that returns a rowset, a derived table, joined tables or even the OPENXML statement.
The join condition "<merge_search_condition>" that joins the target table with the source table is specified after the ON clause.  The syntax of the MERGE statement can be compared to a SELECT statement with an INNER JOIN clause where the <target_table> is inner joined with the <table_source> using the <merge_search_condition> specified.
Extracting from the stored procedure above, the first part of the MERGE statement used is as follows:
MERGE [dbo].[Employee] AS [Target]
USING (SELECT @EmployeeNumber, @FirstName, @LastName, @Position)
   AS [Source] ( [EmployeeNumber], [FirstName], [LastName], [Position] )
ON [Target].[EmployeeNumber] = [Source].[EmployeeNumber]
In this case, the [dbo].[Employee] is the target table while [Target] is the alias associated with this table.  As for the source table, instead of using a table or view, a derived table is used using the parameters passed to the stored procedure.  The alias [Source] was assigned to this derived table followed by the column alias list corresponding to each column specified in the SELECT statement.  Make sure that the number of columns specified in the SELECT statement matches the number of columns in the column alias list.  If the number of columns in the SELECT statement is more than the number of columns in the columns list, you will get the following error message:
Msg 8158, Level 16, State 1, Line 6
'Source' has more columns than were specified in the column list.
Similarly, if the number of columns in the SELECT statement is less than the number of columns in the columns list, you will get the following error message:
Msg 8159, Level 16, State 1, Line 5
'Source' has fewer columns than were specified in the column list.
The second part of the MERGE statement tells what action will be performed for those records that match between the source table and the target table.  It also tells what action will be performed for those records that are in the source but are not in the target.  In our scenario, what we want to happen is that for those records that exist in both the target table and source table, we would want to update the data in the target table with the values in the source table.  On the other hand, for those records in the source table but are not in the target table, we would want to insert these records to the target table.  The basic syntax for the second part of the MERGE statement is as follows:
WHEN MATCHED THEN <do_either_update_or_delete>
WHEN NOT MATCHED <do_insert>
Extracting this part from the stored procedure above:
WHEN MATCHED THEN
    UPDATE SET [FirstName] = [Source][FirstName],
               [LastName]  = [Source].[LastName],
               [Position]  = [Source].[Position]
WHEN NOT MATCHED THEN
    INSERT ( [EmployeeNumber], [FirstName], [LastName], [Position] )
    VALUES ( [Source].[EmployeeNumber], [Source].[FirstName], 
             [Source].[LastName], [Source].[Position] );
From this script, you can see that for those records that matched (WHEN MATCHED THEN), the [FirstName], [LastName] and [Position] columns of the target table are updated with the values from the source table.  The name of the target table is not specifed after the UPDATE statement because only the target table can be updated by the UPDATE statement.  On the other hand, for those records that are in the source table but are not in the target table (WHEN NOT MATCHED THEN), an INSERT statement is executed inserting the records from the source table into the target table for those that don't exist.
Using Row Constructors as Source
In the article Row Constructors (or Table-Valued Constructors) as Derived Table, it discusses the use of row constructors as derived tables and it shows an example of how to use it with the MERGE statement.  The stored procedure above can be modified to use row constructors as the source for the MERGE statement.  Here's how the stored procedure will look like, with the row constructor highlighted in red.
CREATE PROCEDURE [dbo].[usp_MergeEmployee]
    @EmployeeNumber   VARCHAR(10),
    @FirstName    VARCHAR(50),
    @LastName    VARCHAR(50),
    @Position    VARCHAR(50)
AS

MERGE [dbo].[Employee] AS [Target]
USING (VALUES ( @EmployeeNumber, @FirstName, @LastName, @Position ))
   AS [Source] ( [EmployeeNumber], [FirstName], [LastName], [Position] )
ON [Target].[EmployeeNumber] = [Source].[EmployeeNumber]
WHEN MATCHED THEN
    UPDATE SET [FirstName] = [Source].[FirstName],
               [LastName]  = [Source].[LastName],
               [Position]  = [Source].[Position]
WHEN NOT MATCHED THEN
    INSERT ( [EmployeeNumber], [FirstName], [LastName], [Position] )
    VALUES ( [Source].[EmployeeNumber], [Source].[FirstName],
             [Source].[LastName], [Source].[Position] );
GO

No comments:

Post a Comment