Pages

Thursday, May 8, 2014

“IN”, “EXISTS” clause and their performance

Introduction
To improve the performance of the Query, the general guideline is not to prefer the "IN" Clause. The guideline of the MS SQL query performance says that if we needs "IN" clause, instead of using "IN" clause we must use the "EXISTS" clause.   As the "EXISTS" clause improve the performance of the query.
This article is related to "IN", "EXISTS" clause and their performance factors.

Is "IN" and "EXISTS" clause are same
IN Clause
Returns true if a specified value matches any value in a sub query or a list.
EXISTS Clause
Returns true if a sub query contains any rows.
So we see that the "IN" and the "EXISTS" cluse are not same. To support the above definition lets takes an example.
-- Base Tabe
IF OBJECT_ID('emp_DtlTbl') IS NOT NULL
   BEGIN
     DROP TABLE emp_DtlTbl;
   END
GO
IF OBJECT_ID('emp_GradeTbl') IS NOT NULL
   BEGIN
     DROP TABLE emp_GradeTbl;
   END
GO     

CREATE TABLE emp_DtlTbl
       (EMPID   INT        NOT NULL IDENTITY PRIMARY KEY, 
        EMPNAME VARCHAR(50)NOT NULL);
GO

CREATE TABLE emp_GradeTbl
       (EMPID   INT        NOT NULL IDENTITY PRIMARY KEY, 
        GRADE   VARCHAR(1) NOT NULL);
GO

-- Insert Records
INSERT INTO  emp_DtlTbl
       (EMPNAME)
VALUES ('Joydeep Das'), ('Sukamal Jana'), ('Sudip Das');
GO

INSERT INTO  emp_GradeTbl
       (GRADE)
VALUES ('B'), ('B'), ('A');
GO

-- [ IN ] Clause Example-1
SELECT *
FROM   emp_DtlTbl;
WHERE  EMPID IN(SELECT EMPID FROM  emp_DtlTbl);           


-- [ IN ] Clause Example-2
SELECT *
FROM   emp_DtlTbl
WHERE  EMPID IN(1,2,3);


-- [ EXISTS ] Clause Example
SELECT a.*
FROM   emp_DtlTbl a
WHERE  EXISTS(SELECT b.*
              FROM   emp_DtlTbl b
              WHERE  b.EMPID = a.EMPID);
  

Performance Factors
To understand the performance factors let see the actual execution plan for "IN" and "EXISTS" clauses.
Take this example:
-- [ IN ] Clause Exampl
SELECT *
FROM   emp_DtlTbl
WHERE  EMPID =(SELECT EMPID
               FROM   emp_DtlTbl
               WHERE  EMPID = 2);           

-- [ EXISTS ] Clause Example
SELECT a.*
FROM   emp_DtlTbl a
WHERE  EXISTS(SELECT b.*
              FROM   emp_DtlTbl b
              WHERE  b.EMPID = 2
                     AND b.EMPID = a.EMPID);

If we compare the total query costs of the both MS SQL query, we see that the IN clause query cost is higher than the EXISTS clause query costs.

Special notes
Please note that: Here the data of the table is limited, so we cannot measure the performance factors.

No comments:

Post a Comment