What Is SQL Server Integration Services?
Depending on whom you ask, you might get different answers to that question ranging from descriptions such as a data import/export wizard, to an ETL tool, to a control flow engine, to an application platform, or to a high-performance data transformation pipeline. All are correct because Integration Services is a set of utilities, applications, designers, components, and services all wrapped up into one powerful software application suite. SQL Server Integration Services SSIS is many things to many people.
Data Import/Export Wizard
One of the most popular features of Integration Services is the Import/Export Wizard, which makes it easy to move data from a source location such as a flat file or database table to a flat file, table, or other destination. The Import/Export Wizard was the first utility developed back in the SQL Server 7.0 time frame and continues today as an important utility in the database administrator’s (DBA) toolbox.
ETL is an acronym for Extract, Transform, and Load and describes the processes that take place in data warehousing environments for extracting data from source transaction systems; transforming, cleaning, deduplicating, and conforming the data; and finally loading it into cubes or other analysis destinations. Although Data Transformation Services (DTS), Integration Services’ predecessor application, was considered a valuable tool for doing ETL, Integration Services is where true Enterprise ETL became available in SQL Server.
Control Flow Engine
The processes involved in moving data from location to location and transforming it along the way are not restricted to only processing data. Integration Services provides a control flow for performing work that is tangentially related to the actual processing that happens in data flow, including downloading and renaming files, dropping and creating tables, rebuilding indexes, performing backups, and any other number of tasks. Integration Services provides a full-featured control flow to support such activities.
Developers can create applications that use Integration Services as a platform, embedding the engines within their application using the provided object models. As a developer, you can embed the Integration Services engines and components within your applications using the object models.
High Performance Data Transformation Data Pipeline
That’s a mouthful and really incorporates two ideas: high performance and data pipelining. The Data Flow Task is a high-performance tool because you can use it to perform complex data transformations on very large datasets for incredibly performant processing. The pipeline concept means that you can process data from multiple heterogeneous data sources, through multiple parallel sequential transformations, into multiple heterogeneous data destinations, making it possible to process data found in differing formats and on differing media in one common “sandbox location.
A Short Integration Services History
Integration Services is the successor to Data Transformation Services (DTS). DTS had humble beginnings. It was started on a shoestring budget with very few resources. Its first incarnation was a proof-of-concept transformation, which later became known as the data pump. The proof of concept caught the attention of some folks around Microsoft, and it was given some funding.
The first release of DTS shipped with SQL Server 7.0 to receptive users. The alternatives at the time were either difficult to work with, expensive, or both. Many DBAs were forced to write custom transformation software, which was inflexible and difficult to maintain. Some tools had limitations, such as the need for source and destination schemas to match exactly, direct dependence on a particular database product, and/or no transformation capabilities. Many wrote custom parsing and transformation applications. For example, many companies are only now converting from hand-coded flat file parsers, SQL scripts, and transformation code to a standard platform such as Integration Services.
The first release of DTS addressed several of these issues and simplified life for a lot of people. By using OLEDB for its data access layer, DTS could access various data sources with little or no custom coding. DTS was also affordable because it shipped “in the box with SQL Server. Users had access to all the power of more expensive products, yet incurred no additional cost for their ETL tools. This was obviously a benefit to IT shops trying to stretch their budgets. DTS was a flexible product that was easy to use. There were also a number of standard tasks in the box, including the Transform Data, Execute Process, Active X Script, Execute SQL, and Bulk Insert Tasks.
SQL Server 8.0 added even more functionality by adding more tasks. The Execute Package, FTP, and MSMQ Tasks added incremental improvements across the product. However, users experienced some frustration with DTS when attempting to work with large datasets and some of the other limitations inherent in a script-based tool. The time was ripe to create a truly enterprise-ready integration tool.
In 2000, SQL Server decided to make a substantial investment in the ETL and Integration space and brought together some talented folks who formulated the ideas behind the Data Flow Task and the next version of Integration Services. Over a period of five years, the development time frame for SQL Server 2005, the DTS team completely redesigned and rewrote DTS to become Integration Services 2005.
Integration Services 2008 is the next incarnation of that release. While not as revolutionary a release as in 2005, Integration Services 2008 brings some new capabilities and incremental improvements across the product.
SQL Server Integration Services is a complex, flexible, and powerful product with many uses. Many people use Integration Services for enterprise extract, transform, and load purposes. Others use it for an IT management tool for doing things like backing up databases. Others use it as an integration tool for synchronizing multiple systems. However you use it, Integration Services can be frustrating and difficult to learn. Hopefully, this book will enable you to take advantage of all that power and flexibility in as little time as possible.