SQL Server DBA

Beginning SSIS

I have been thinking about the various specialisations and opportunities within the world of data, and there are two areas that really appeal to me: data analysis and ETL. As part of our KPI package, we are all allowed to choose a personal KPI. The personal KPI should be in an area of interest to us, that will give us the opportunity to develop our skills in this area. For my personal KPI, I have chosen to explore ETL and SSIS.


For the following 6 months, I get to set my own personal KPI. Our personal KPIs are designed to help us learn, grow and develop in some professional capacity. I am fascinated by the processes required to “wrangle” data, the challenges of extracting, filtering and transforming raw data into some usable structure. Broadly this is know as ETL, Extract, Transform and Load.

In the SQL Server world, SQL Server Integration Services (SSIS) is the major tool for ETL processes. SSIS crafts the control and data flow aspects of ETL into a cohesive process. Because SSIS is focused on data integration, it provides convenient APIs to connect to a range of data sources, including OLAP, OLEDB, ODBC and flat files. A wide range of filtering and transformation tools makes these tasks as simple as drag-and-drop, in the familiar environment of Visual Studio. I have decided to explore SSIS and see what makes it so exceptional.

This morning I made my first, very simple SSIS package using SQL Server Data Tools. My first package is very simple, it extracts data from a csv file, divides the data alphabetically and inserts it into two different database tables. The SSIS data flow diagram is shown below:

PackageOne: A simple data flow diagram for the extraction of employee information. The data is filtered by LastName and imported into two different database tables

PackageOne: A simple data flow diagram for the extraction of employee information. The data is filtered by LastName and imported into two different database tables

I was impressed with how easy it was to jump into SQL Server Data Tools and just get going. Visual Studio is a superb IDE, it is consistent across all platforms (web, desktop, SSMS, SQL Server Data Tools) and therefore piecing this package together was just a case of  ‘following my nose’. I was pleased, the package worked first time and with a minimum of hassle. SQL Server Central have an excellent series of tutorials, Stairway to Integration Services. In particular, Level 2 gives a great overview of the Data Flow Pipeline and how to configure a simple Data Flow pipeline.

That said, I have some niggling doubts. This package is so simple, that it would have been quicker to write a simple script in Python, or just as simple to bcp the data into a staging table and split it from there. In this case the use-friendly, convenient drag-and-drop did not simplify the task. So I am left wondering whether the full IDE-experience will be a help or a hindrance in a significantly more complex project? We shall have to wait and see…

 

Advertisements

One thought on “Beginning SSIS

  1. Pingback: The Role of General Purpose Data Integration Tools | :: NickBurns

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s