Pages

Wednesday, July 2, 2014

SSIS Architecture – SQL Server Integration Services

SSIS stands for SQL Server Integration Services. It is a platform for Data integration and Work flow applications. It can perform operations like Data Migration and ETL (Extract,  Transform and Load).
  • E – Merging of  data from heterogeneous data stores (i.e. it may be a text file, spreadsheets, mainframes, Oracle, etc.).This process is known as EXTRACTION.
  • T – Refreshing data in the data warehouses and data marts. Also used to cleanse data before loading to remove errors. This process is known as TRANSFORMATION.
  • L - High-speed load of data into Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) databases. This process is known as LOADING.
Tools used for the development of SSIS projects  are -
  1. BIDS (Business Intelligence Development Studio).
  2. SSMS (SQL Server Management Studio).
Note: - Prior to SSIS, the same task was performed with DTS (Data Transformation Services) in SQL Server 2000 but with fewer features.
Difference between DTS and SSIS is as follows:-
  • DTS :-
  1. Limited error Handling.
  2. Message Boxes in ActiveX Scripts.
  3. No deployment wizard and BI functionality.
  • SSIS :-
  1. Complex and powerful error handling.
  2. Message Boxes in .NET Scripting.
  3. Interactive deployment wizard and Complete BI functionality.
To develop your SSIS package, you need to install Business Intelligence Development Studio (BIDS) which will be available as client tool after  installing SQL Server Management Studio (SSMS).
  • BIDS: – It is a tool which is used to develop the SSIS packages. It is available with SQL Server as an interface which provides the developers to work on the control flow of the package step by step.
  • SSMS: - It provides different options to make a SSIS package such as Import Export wizard. With this wizard, we can create a structure on how the data flow should happen. Created package can bedeployed further as per the requirement.
Now, you must  be hitting your head to know about Data flow and Control flow. So, Data flow means  extracting data into the server’s memory, transform it and write it out to an alternative destination whereas Control flow means a set of instructions which specify the Program Executor on how to execute tasks and containers within the SSIS Packages. All these concepts are explained in SSIS Architecture.
SSIS Architecture :-
  1. Packages – A package is a collection of tasks framed together with precedence constraints to manage and execute tasks in an order. It is compiled in a XML structured file with .dtsx extension.
  2. Control Flow - It acts as the brain of a package. It consists of one or more tasks and containers that executes when package runs. Control flow orchestrates the order of execution for all its components.
  3. Tasks - A task can best be explained as an individual unit of work.
  4. Precedence Constraints - These are the arrows in a Control flow of a package that connect the tasks together and manage the order in which the tasks will execute. In Data flow, these arrows are known as Service paths.
  5. Containers - Core units in the SSIS architecture for grouping tasks together logically into units of work are known as Containers.
  6. Connection Managers - Connection managers are used to centralize connection strings to data sources and to abstract them from the SSIS packages. Multiple tasks can share the same Connection manager.
  7. Data Flow - The core strength of SSIS is its capability to extract data into the server’s memory (Extraction), transform it (Transformation) and write it out to an alternative destination (Loading).
  8. Sources - A source is a component that you add to the Data Flow design surface to specify the location of the source data.
  9. Transformations - Transformations are key components within the Data Flow that allow changes to the data within the data pipeline.
  10. Destinations - Inside the Data Flow, destinations consume the data after the data pipe leaves the last transformation components.
  11. Variables - Variables can be set to evaluate to an expression at runtime.
  12. Parameters - Parameters behave much like variables but with a few main exceptions.
  13. Event Handlers – The event handlers that run in response to the run-time events that packages, tasks, and containers raise.
  14. Log Providers – Logging of package run-time information such as the start time and the stop time of the package and its tasks and containers.
  15. Package Configurations – After development your package and before deploying the package in production environment from UAT you need to perform certain package configurations as per production Server.
This completes the basics of SSIS and its architecture. Now, we will learn the components of SSIS architecture in depth and will implement its features like import/export wizard, data viewers, etc. in our future tutorials.

No comments:

Post a Comment